Google Search Console API Query Builder
Build GSC API requests with dimensions, filters, and regex. Learn the 5K row bug, searchAppearance limitations, and undocumented quirks.
The GSC API lets you filter and group search performance data by dimensions like page, query, country, and device. Understanding how to build queries, and their undocumented bugs, is critical for reliable data extraction.
Implementation Examples
const response = await fetch(
  `https://searchconsole.googleapis.com/webmasters/v3/sites/${encodeURIComponent(siteUrl)}/searchAnalytics/query`,
  {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${accessToken}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      startDate: '2025-01-01',
      endDate: '2025-01-31',
      dimensions: ['query', 'page'],
      dimensionFilterGroups: [{
        filters: [{
          dimension: 'country',
          operator: 'equals',
          expression: 'usa'
        }]
      }],
      rowLimit: 25000,
      startRow: 0
    })
  }
)
import requests
from urllib.parse import quote
from datetime import date, timedelta

def query_gsc(site_url, access_token, dimensions=('query',), start_days_ago=7):
    """Fetch GSC data for the last N days."""
    end_date = date.today() - timedelta(days=3)  # Account for 2-3 day lag
    start_date = end_date - timedelta(days=start_days_ago)
    # Property IDs like sc-domain:example.com must be percent-encoded in the URL
    url = f'https://searchconsole.googleapis.com/webmasters/v3/sites/{quote(site_url, safe="")}/searchAnalytics/query'
    payload = {
        'startDate': start_date.isoformat(),
        'endDate': end_date.isoformat(),
        'dimensions': list(dimensions),
        'rowLimit': 25000
    }
    headers = {
        'Authorization': f'Bearer {access_token}',
        'Content-Type': 'application/json'
    }
    response = requests.post(url, json=payload, headers=headers)
    response.raise_for_status()
    return response.json()

# Usage
data = query_gsc(
    site_url='sc-domain:example.com',
    access_token='ya29.a0Ae...',
    dimensions=['page', 'query'],
    start_days_ago=28
)
Key fields:
- startDate (YYYY-MM-DD) - Start date for data collection.
- endDate (YYYY-MM-DD) - End date (inclusive).
- dimensions - Array of grouping keys (e.g., query, page, date).
- dimensionFilterGroups - Nested filters for narrowing results (AND/OR logic).
- rowLimit - Max rows per response (up to 25,000).
- startRow - Zero-based index for pagination.
Dimensions Explained
Available Dimensions
| Dimension | Description | Example Values |
|---|---|---|
| date | Daily breakdown | 2025-01-27 |
| query | Search keywords | "gsc api query" |
| page | Landing page URL | "https://example.com/article" |
| country | User country (ISO 3166-1 alpha-3) | "usa", "gbr", "jpn" |
| device | Device type | "DESKTOP", "MOBILE", "TABLET" |
| searchAppearance | SERP feature type | "VIDEO", "RICH_RESULT" |
Dimension Combinations
You can request up to 7 dimensions in a single query, but some combinations have hidden costs:
Safe combinations (no data loss):
- date only
- date + country
- date + device
- query only
- page only
Lossy combinations (Google drops data):
- page + query - ~66% impression loss on large sites (Google's documented behavior)
- date + query - Triggers the 5K row bug (see below)
- Any combination with searchAppearance - it must be the ONLY dimension
The 5K Row Bug
When querying with date + query dimensions, the API returns only 5,000 rows despite setting rowLimit: 25000.
Affected query:
{
dimensions: ['date', 'query'],
rowLimit: 25000 // Ignored! Returns 5,000 max
}
Workaround: Query by query alone, then make separate date-range queries per keyword:
// Step 1: Get top queries (works fine)
const queries = await queryGSC({
  dimensions: ['query'],
  rowLimit: 25000
})

// Step 2: Loop queries, get daily breakdown
for (const q of queries.rows) {
  const dailyData = await queryGSC({
    dimensions: ['date'],
    dimensionFilterGroups: [{
      filters: [{
        dimension: 'query',
        operator: 'equals',
        expression: q.keys[0]
      }]
    }]
  })
}
Source: Developer forums and API users consistently report this truncation issue when multiple dimensions are present. Google recommends BigQuery for datasets of this scale.
Filters Deep Dive
Filter Operators
| Operator | Behavior | Use Case |
|---|---|---|
| equals | Exact match | country = "usa" |
| notEquals | Exclude exact match | device != "TABLET" |
| contains | Substring match | page contains "/blog/" |
| notContains | Exclude substring | query not contains "brand" |
| includingRegex | RE2 regex match | page matches "\/2025\/" |
| excludingRegex | Exclude regex match | query excludes "^brand" |
Regex Filtering
Google officially added RE2 regex support to Search Console in April 2021, and to the API in October 2021. Syntax follows the RE2 spec, which notably lacks lookahead, lookbehind, and backreferences.
Example: Match blog posts from 2025:
{
dimensionFilterGroups: [{
filters: [{
dimension: 'page',
operator: 'includingRegex',
expression: '\\/blog\\/2025\\/[0-9]{2}\\/'
}]
}],
dimensions: ['page']
}
Example: Exclude branded queries:
{
'dimensionFilterGroups': [{
'filters': [{
'dimension': 'query',
'operator': 'excludingRegex',
'expression': '^(brand|company|product)'
}]
}],
'dimensions': ['query']
}
Regex is powerful, but patterns are not anchored by default: a pattern matches anywhere in the string unless you add ^ (start) and $ (end) explicitly.
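The anchoring difference is easy to check locally; a minimal sketch using Python's re module, whose behavior matches RE2 for simple patterns like these:

```python
import re

# Python's re engine accepts the simple RE2-compatible patterns used here.
queries = ["brand shoes", "best brand shoes", "brand"]

# Unanchored: "brand" matches anywhere in the string.
unanchored = [q for q in queries if re.search(r"brand", q)]

# Anchored: only queries that START with "brand" match.
anchored = [q for q in queries if re.search(r"^brand", q)]
```

Here the unanchored pattern matches all three queries, while the anchored one drops "best brand shoes".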
searchAppearance Filter Bug
The searchAppearance dimension has two critical bugs that remain unresolved as of March 2026:
Bug 1: Must be ONLY dimension
This fails:
{
dimensions: ['searchAppearance', 'page'], // ERROR
}
searchAppearance cannot be combined with page, query, country, or device. Query it alone, then filter the other dimensions in separate requests.
Bug 2: notContains/notEquals return OPPOSITE results
When filtering searchAppearance with notContains or notEquals, the API returns the opposite of what you requested (returning only the rows you tried to exclude).
// Request: Exclude VIDEO results
{
dimensionFilterGroups: [{
filters: [{
dimension: 'searchAppearance',
operator: 'notEquals',
expression: 'VIDEO'
}]
}]
}
// Bug: Returns ONLY VIDEO results instead
Workaround: Request the data unfiltered and exclude the unwanted rows client-side. Google acknowledged this bug in early 2025, but Search Engine Roundtable confirmed it remains "under investigation" with no fix date.
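The client-side workaround is a one-line filter; a sketch (the exclude_appearance name and the sample rows are my own, assuming searchAppearance was queried alone so keys[0] holds the value):

```python
def exclude_appearance(rows, excluded):
    """Client-side replacement for the broken notEquals filter.

    Assumes searchAppearance was queried as the only dimension,
    so row["keys"][0] holds the appearance value.
    """
    return [row for row in rows if row["keys"][0] != excluded]

# Hypothetical API rows
rows = [
    {"keys": ["VIDEO"], "clicks": 120, "impressions": 4000},
    {"keys": ["RICH_RESULT"], "clicks": 45, "impressions": 900},
]
non_video = exclude_appearance(rows, "VIDEO")
```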
Advanced Filter Patterns
Multiple Filters (AND Logic)
Filters in the same group are ANDed:
{
dimensionFilterGroups: [{
filters: [
{ dimension: 'country', operator: 'equals', expression: 'usa' },
{ dimension: 'device', operator: 'equals', expression: 'MOBILE' }
]
}]
}
// Returns: USA AND Mobile traffic only
Multiple Filter Groups (OR Logic)
Separate groups are ORed:
{
dimensionFilterGroups: [
{
filters: [
{ dimension: 'country', operator: 'equals', expression: 'usa' }
]
},
{
filters: [
{ dimension: 'country', operator: 'equals', expression: 'gbr' }
]
}
]
}
// Returns: USA OR UK traffic
Combining AND + OR
{
'dimensionFilterGroups': [
{
'filters': [
{'dimension': 'country', 'operator': 'equals', 'expression': 'usa'},
{'dimension': 'device', 'operator': 'equals', 'expression': 'MOBILE'}
]
},
{
'filters': [
{'dimension': 'country', 'operator': 'equals', 'expression': 'gbr'},
{'dimension': 'device', 'operator': 'equals', 'expression': 'DESKTOP'}
]
}
]
}
# Returns: (USA AND Mobile) OR (UK AND Desktop)
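A small helper makes the AND/OR nesting harder to get wrong; a sketch (the filter_group name and tuple format are my own, not part of the API):

```python
def filter_group(*filters):
    """Build one dimensionFilterGroup; filters inside a group are ANDed."""
    return {
        "filters": [
            {"dimension": d, "operator": op, "expression": expr}
            for d, op, expr in filters
        ]
    }

# (USA AND Mobile) OR (UK AND Desktop): separate groups are ORed.
payload = {
    "dimensions": ["page"],
    "dimensionFilterGroups": [
        filter_group(("country", "equals", "usa"), ("device", "equals", "MOBILE")),
        filter_group(("country", "equals", "gbr"), ("device", "equals", "DESKTOP")),
    ],
}
```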
Data Loss Warning
Google's deep dive blog states:
"When you group by page and/or query, the system may drop some data to reduce cardinality."
Translation: Large sites lose ~66% of impression data when querying page + query together.
Why? Google pre-aggregates data to reduce storage. When you cross-reference page × query, the result set explodes (millions of combinations). Google drops low-traffic combinations.
Impact example:
// Query 1: Pages alone
{ dimensions: ['page'] }
// Returns: 10M total impressions
// Query 2: Pages + Queries
{ dimensions: ['page', 'query'] }
// Returns: 3.4M total impressions (66% data loss)
Workaround: Query dimensions separately when precision matters:
- Get top pages:
dimensions: ['page'] - Get top queries:
dimensions: ['query'] - For specific page, get queries:
dimensionFilterGroups: [{ filters: [{ dimension: 'page', ... }] }]
This avoids cross-dimensional data loss.
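To quantify how much a lossy combination costs on your own property, compare the impression totals of a lossless query and a lossy query over the same date range; a sketch with hypothetical totals matching the example above:

```python
def impression_loss(lossless_rows, lossy_rows):
    """Fraction of impressions dropped by the cross-dimensional query."""
    total = sum(r["impressions"] for r in lossless_rows)
    crossed = sum(r["impressions"] for r in lossy_rows)
    return (total - crossed) / total if total else 0.0

# Hypothetical totals: 10M impressions from ['page'], 3.4M from ['page', 'query']
pages_only = [{"impressions": 10_000_000}]
pages_and_queries = [{"impressions": 3_400_000}]
loss = impression_loss(pages_only, pages_and_queries)  # 0.66
```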
Sorting and Limits
No Sort Parameter
The API does not support a sort parameter; results are always sorted by clicks descending. You cannot sort by impressions, CTR, or position server-side, so sort client-side after fetching.
Source: AnalyticsEdge documentation on GSC API limitations
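Client-side re-sorting is a one-liner; a sketch (the sort_rows helper and sample rows are illustrative):

```python
def sort_rows(rows, metric="impressions", descending=True):
    """Re-sort API rows client-side; the API itself only sorts by clicks."""
    return sorted(rows, key=lambda r: r[metric], reverse=descending)

# Hypothetical rows: "a" wins on clicks, "b" wins on impressions
rows = [
    {"keys": ["a"], "clicks": 90, "impressions": 1000},
    {"keys": ["b"], "clicks": 10, "impressions": 5000},
]
by_impressions = sort_rows(rows)
```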
Pagination
Max 25,000 rows per request. For larger datasets, use startRow:
// Page 1
{ rowLimit: 25000, startRow: 0 }
// Page 2
{ rowLimit: 25000, startRow: 25000 }
// Page 3
{ rowLimit: 25000, startRow: 50000 }
Daily limit: 50,000 rows per property, so two full 25,000-row requests exhaust the quota.
Anonymized Queries
GSC hides queries searched by only a small number of users over a 2-3 month window (the exact threshold is undocumented).
These "anonymized queries" are:
- Included in totals (clicks/impressions aggregated)
- Excluded from query dimension (missing from results)
Example:
{ dimensions: ['query'] }
// Returns: 500 queries, 10K clicks
// But totals show:
// 12K clicks (2K from anonymized queries)
You cannot retrieve anonymized queries via API. They exist only in aggregate metrics.
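Since a request with no dimensions returns a single aggregate row, you can estimate the anonymized share by subtraction; a sketch (the anonymized_clicks helper and the numbers are illustrative, matching the example above):

```python
def anonymized_clicks(aggregate_row, query_rows):
    """Estimate clicks hidden in anonymized queries.

    aggregate_row: the single row returned when no dimensions are requested.
    query_rows: rows from the same date range grouped by ['query'].
    """
    visible = sum(r["clicks"] for r in query_rows)
    return aggregate_row["clicks"] - visible

# 12K aggregate clicks, 10K attributed to visible queries -> 2K anonymized
hidden = anonymized_clicks({"clicks": 12000}, [{"clicks": 6000}, {"clicks": 4000}])
```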
gscdump Query Builder
gscdump stores GSC data in SQLite (D1) and exposes Drizzle-style query syntax:
// Native GSC API (complex)
await fetch('https://searchconsole.googleapis.com/...', {
body: JSON.stringify({
startDate: '2025-01-01',
endDate: '2025-01-31',
dimensions: ['query'],
dimensionFilterGroups: [{
filters: [{
dimension: 'query',
operator: 'includingRegex',
expression: '^mcp'
}]
}]
})
})
-- gscdump MCP (simple)
SELECT query, SUM(clicks) as clicks
FROM gsc_keywords
WHERE date BETWEEN '2025-01-01' AND '2025-01-31'
AND query LIKE 'mcp%'
GROUP BY query
ORDER BY clicks DESC
gscdump removes:
- 25k row limit (query unlimited historical data)
- 5K row bug (dimensions work correctly)
- searchAppearance bugs (data stored correctly)
- Data loss (no pre-aggregation)
- Rate limits (query your own DB)
Common Query Patterns
Top Keywords by Clicks
{
startDate: '2025-01-01',
endDate: '2025-01-31',
dimensions: ['query'],
rowLimit: 100
}
Pages Losing Traffic (Month-over-Month)
# Assumes a query_gsc variant that accepts explicit start_date/end_date
# instead of start_days_ago
current = query_gsc(dimensions=['page'], start_date='2025-01-01', end_date='2025-01-31')
previous = query_gsc(dimensions=['page'], start_date='2024-12-01', end_date='2024-12-31')

def find_page_clicks(data, page_url):
    """Look up a page's clicks in a result set (0 if the page is absent)."""
    for row in data.get('rows', []):
        if row['keys'][0] == page_url:
            return row['clicks']
    return 0

# Compare client-side
for page in current['rows']:
    prev_clicks = find_page_clicks(previous, page['keys'][0])
    diff = page['clicks'] - prev_clicks
    if diff < -100:
        print(f"Declining: {page['keys'][0]} ({diff} clicks)")
Mobile vs Desktop Performance
// Mobile
{
dimensions: ['page'],
dimensionFilterGroups: [{
filters: [{ dimension: 'device', operator: 'equals', expression: 'MOBILE' }]
}]
}
// Desktop (separate query)
{
dimensions: ['page'],
dimensionFilterGroups: [{
filters: [{ dimension: 'device', operator: 'equals', expression: 'DESKTOP' }]
}]
}
Striking Distance Keywords (Position 4-15)
GSC API doesn't filter by position directly. Fetch all, filter client-side:
data = query_gsc(dimensions=['query'])
striking_distance = [
    row for row in data.get('rows', [])
    if 4 <= row['position'] <= 15 and row['impressions'] > 100
]
# Sort by impressions (opportunity size)
striking_distance.sort(key=lambda x: x['impressions'], reverse=True)
Brand vs Non-Brand Traffic
// Brand queries
{
dimensions: ['query'],
dimensionFilterGroups: [{
filters: [{
dimension: 'query',
operator: 'includingRegex',
expression: '(brand|company|product)'
}]
}]
}
// Non-brand (query separately, subtract)
{
dimensions: ['query'],
dimensionFilterGroups: [{
filters: [{
dimension: 'query',
operator: 'excludingRegex',
expression: '(brand|company|product)'
}]
}]
}
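Instead of two filtered API calls, you can also split a single unfiltered ['query'] result client-side; a sketch (the brand terms and split_brand helper are placeholders, and this only sees non-anonymized queries):

```python
import re

BRAND_RE = re.compile(r"(brand|company|product)")  # placeholder brand terms

def split_brand(rows):
    """Partition query rows into brand and non-brand buckets client-side."""
    brand, nonbrand = [], []
    for row in rows:
        (brand if BRAND_RE.search(row["keys"][0]) else nonbrand).append(row)
    return brand, nonbrand

# Hypothetical rows from a single unfiltered ['query'] request
rows = [
    {"keys": ["brand shoes"], "clicks": 50},
    {"keys": ["running tips"], "clicks": 30},
]
brand_rows, nonbrand_rows = split_brand(rows)
```

This halves the API calls and guarantees the two buckets partition the same result set.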
Best Practices
1. Account for 2-3 day lag
Don't query today's date. Always subtract 3 days:
const endDate = new Date()
endDate.setDate(endDate.getDate() - 3)
2. Avoid lossy combinations
Never query page + query for accurate totals. Query separately.
3. Use regex to consolidate filters
A single includingRegex filter can match several patterns at once via alternation, replacing a chain of separate contains filters:
// Instead of separate contains filters per site section
{ operator: 'includingRegex', expression: '/blog/|/news/' }
4. Paginate large results
Don't assume <25k rows. Always implement pagination:
# Assumes a query_gsc variant exposing row_limit and start_row
all_rows = []
start_row = 0
while True:
    data = query_gsc(row_limit=25000, start_row=start_row)
    rows = data.get('rows', [])
    if not rows:
        break
    all_rows.extend(rows)
    start_row += 25000
    if len(rows) < 25000:  # Last page
        break
5. Cache aggressively
GSC data updates once daily. Cache responses for 24 hours:
const cacheKey = `gsc:${siteUrl}:${hash(query)}`
const cached = await cache.get(cacheKey)
if (cached) return cached
const data = await queryGSC(...)
await cache.set(cacheKey, data, { ttl: 86400 }) // 24h
Limitations Summary
| Issue | Impact | Workaround |
|---|---|---|
| 5K row bug with date+query | Returns 5k instead of 25k | Query dimensions separately |
| searchAppearance bugs | notContains returns opposite | Filter client-side |
| Data loss with page+query | 66% impressions missing | Query dimensions separately |
| No sort parameter | Always sorted by clicks | Sort client-side |
| 25k row limit | Large sites need pagination | Use startRow + loop |
| Anonymized queries | Missing from results | Accept data gap |
Next Steps
- Rate Limits - Understand API quotas and 429 errors
- Authentication - OAuth setup for API access
- MCP Server - Query GSC data with AI
Why gscdump Exists
The GSC API's bugs, limits, and data loss make reliable querying difficult. gscdump syncs your full dataset daily, stores it without limits, and fixes API quirks:
- No 5K row bug (query any dimension combination)
- No 25K row limit (query millions of rows)
- No data loss (raw data stored before aggregation)
- No rate limits (query your own database)
Try gscdump free: gscdump.com