Google Search Console API Query Builder

Build GSC API requests with dimensions, filters, and regex. Learn the 5K row bug, searchAppearance limitations, and undocumented quirks.

Harlan Wilton
1 min

The GSC API lets you filter and group search performance data by dimensions like page, query, country, and device. Knowing how to build queries, and where the undocumented bugs are, is critical for reliable data extraction.

Implementation Examples

const response = await fetch(
  `https://searchconsole.googleapis.com/webmasters/v3/sites/${encodeURIComponent(siteUrl)}/searchAnalytics/query`,
  {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${accessToken}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      startDate: '2025-01-01',
      endDate: '2025-01-31',
      dimensions: ['query', 'page'],
      dimensionFilterGroups: [{
        filters: [{
          dimension: 'country',
          operator: 'equals',
          expression: 'usa'
        }]
      }],
      rowLimit: 25000,
      startRow: 0
    })
  }
)

Key fields:

  • startDate (YYYY-MM-DD) - Start date for data collection.
  • endDate (YYYY-MM-DD) - End date (inclusive).
  • dimensions - Array of grouping keys (e.g., query, page, date).
  • dimensionFilterGroups - Filter groups for narrowing results (filters within a group are ANDed).
  • rowLimit - Max rows per response (1 to 25,000; defaults to 1,000).
  • startRow - Zero-based index for pagination.
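
The fields above map directly onto a JSON request body. A minimal Python sketch of a builder that validates them before anything is sent; `build_query_body` and its argument names are hypothetical (not part of any Google client library), and the checks only encode the limits stated in this article:

```python
from datetime import date

MAX_ROW_LIMIT = 25_000  # per-request ceiling described in this article


def build_query_body(start_date, end_date, dimensions, filters=None,
                     row_limit=MAX_ROW_LIMIT, start_row=0):
    """Return a dict ready to be JSON-encoded as a searchAnalytics/query body."""
    # date.fromisoformat raises ValueError on malformed dates
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if start > end:
        raise ValueError("start_date must not be after end_date")
    if not 1 <= row_limit <= MAX_ROW_LIMIT:
        raise ValueError("row_limit must be between 1 and 25,000")
    if start_row < 0:
        raise ValueError("start_row is zero-based and cannot be negative")

    body = {
        "startDate": start_date,
        "endDate": end_date,
        "dimensions": list(dimensions),
        "rowLimit": row_limit,
        "startRow": start_row,
    }
    if filters:
        # Filters placed in one group are ANDed together
        body["dimensionFilterGroups"] = [{"filters": list(filters)}]
    return body
```

Failing early on a bad date or row limit is cheaper than spending an API call to learn the same thing.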

Dimensions Explained

Available Dimensions

Dimension | Description | Example Values
date | Daily breakdown | 2025-01-27
query | Search keywords | "gsc api query"
page | Landing page URL | "https://example.com/article"
country | User country (ISO 3166-1 alpha-3) | "usa", "gbr", "jpn"
device | Device type | "DESKTOP", "MOBILE", "TABLET"
searchAppearance | SERP feature type | "VIDEO", "RICH_RESULT"

Dimension Combinations

You can request up to 7 dimensions in a single query, but some combinations have hidden costs:

Safe combinations (no data loss):

  • date only
  • date + country
  • date + device
  • query only
  • page only

Lossy combinations (Google drops data):

  • page + query - ~66% impression loss on large sites (Google's documented behavior)
  • date + query - Triggers 5K row bug (see below)
  • Any combination with searchAppearance - Must be ONLY dimension
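
These rules are easy to forget when requests are built dynamically. A small sketch of a pre-flight check that encodes only the combinations listed above; `dimension_warnings` is a hypothetical helper and the warning strings are illustrative:

```python
def dimension_warnings(dimensions):
    """Return warnings for dimension combinations this article lists as lossy."""
    dims = set(dimensions)
    warnings = []
    if "searchAppearance" in dims and len(dims) > 1:
        warnings.append("searchAppearance must be the only dimension")
    if {"page", "query"} <= dims:
        warnings.append("page + query may drop low-traffic rows")
    if {"date", "query"} <= dims:
        warnings.append("date + query may be truncated at 5,000 rows")
    return warnings
```

An empty list means the combination is not one of the known problem cases; it is not a guarantee that the data is complete.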

The 5K Row Bug

When querying with date + query dimensions, the API returns only 5,000 rows despite setting rowLimit: 25000.

Affected query:

{
  dimensions: ['date', 'query'],
  rowLimit: 25000  // Ignored! Returns 5,000 max
}

Workaround: Request the query dimension alone, then make a separate date-dimension request per keyword:

// Step 1: Get top queries (works fine)
const queries = await queryGSC({
  dimensions: ['query'],
  rowLimit: 25000
})

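// Step 2: Loop queries, get a daily breakdown per keyword
const dailyByQuery = new Map()
for (const q of queries.rows) {
  const dailyData = await queryGSC({
    dimensions: ['date'],
    dimensionFilterGroups: [{
      filters: [{
        dimension: 'query',
        operator: 'equals',
        expression: q.keys[0]
      }]
    }]
  })
  // Keep each keyword's rows; a response with no data has no rows field
  dailyByQuery.set(q.keys[0], dailyData.rows ?? [])
}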

Source: Developer forums and API users consistently report this truncation issue when multiple dimensions are present. Google recommends BigQuery for datasets of this scale.
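
Because the truncation is silent, it helps to flag responses that stop at exactly the suspicious size. A minimal sketch, assuming the 5,000-row figure reported above; `looks_truncated` is a hypothetical helper, and a genuine result of exactly 5,000 rows would trigger a false positive:

```python
SUSPECT_ROW_COUNT = 5_000  # truncation size reported in this article


def looks_truncated(rows, requested_limit):
    """True when a response stops at exactly 5,000 rows although more were requested."""
    return requested_limit > SUSPECT_ROW_COUNT and len(rows) == SUSPECT_ROW_COUNT
```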

Filters Deep Dive

Filter Operators

Operator | Behavior | Use Case
equals | Exact match | country = "usa"
notEquals | Exclude exact match | device != "TABLET"
contains | Substring match | page contains "/blog/"
notContains | Exclude substring | query not contains "brand"
includingRegex | RE2 regex match | page matches "\/2025\/"
excludingRegex | Exclude regex match | query excludes "^brand"

Regex Filtering

Google added RE2 regex filters to the Search Console UI in April 2021 and to the API in October 2021. The syntax follows the RE2 specification.

Example: Match blog posts from 2025:

{
  dimensionFilterGroups: [{
    filters: [{
      dimension: 'page',
      operator: 'includingRegex',
      expression: '\\/blog\\/2025\\/[0-9]{2}\\/'
    }]
  }],
  dimensions: ['page']
}

Example: Exclude branded queries:

{
    'dimensionFilterGroups': [{
        'filters': [{
            'dimension': 'query',
            'operator': 'excludingRegex',
            'expression': '^(brand|company|product)'
        }]
    }],
    'dimensions': ['query']
}

Regex filters are not anchored by default: a pattern matches anywhere in the value, so use ^ (start) and $ (end) explicitly.
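
The effect of anchoring can be checked locally before a pattern goes into a request. The sketch below uses Python's `re` module, which is not RE2 but is assumed to behave the same for simple patterns like these; the sample queries are made up:

```python
import re

queries = ["brand shoes", "best brand shoes", "shoes"]

unanchored = re.compile(r"brand")      # matches anywhere in the string
anchored = re.compile(r"^brand")       # matches only at the start
exact = re.compile(r"^brand shoes$")   # must match the whole string

matches_anywhere = [q for q in queries if unanchored.search(q)]
matches_at_start = [q for q in queries if anchored.search(q)]
matches_exactly = [q for q in queries if exact.search(q)]
```

Here the unanchored pattern also catches "best brand shoes", which is usually not what a "starts with brand" filter intends.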

searchAppearance Filter Bug

The searchAppearance dimension has two critical bugs that remain unresolved as of March 2026:

Bug 1: Must be ONLY dimension

This fails:

{
  dimensions: ['searchAppearance', 'page'],  // ERROR
}

searchAppearance cannot be combined with page, query, country, or device. Query it alone, then narrow by the other dimensions in separate requests.

Bug 2: notContains/notEquals return OPPOSITE results

When filtering searchAppearance with notContains or notEquals, the API returns the opposite of what you requested (returning only the rows you tried to exclude).

// Request: Exclude VIDEO results
{
  dimensionFilterGroups: [{
    filters: [{
      dimension: 'searchAppearance',
      operator: 'notEquals',
      expression: 'VIDEO'
    }]
  }]
}
// Bug: Returns ONLY VIDEO results instead

Workaround: Manually filter results client-side. Google acknowledged this bug in early 2025 but Search Engine Roundtable confirmed it remains "under investigation" with no fix date.
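
A minimal sketch of that client-side filter, assuming the request used dimensions: ['searchAppearance'] with no filter, so each row's first key is the appearance type. `exclude_appearance` is a hypothetical helper and the sample rows are illustrative, not real data:

```python
def exclude_appearance(rows, excluded):
    """Drop rows whose first key (the searchAppearance value) is in `excluded`."""
    excluded = set(excluded)
    return [row for row in rows if row["keys"][0] not in excluded]


# Illustrative rows shaped like an API response
rows = [
    {"keys": ["VIDEO"], "clicks": 40},
    {"keys": ["RICH_RESULT"], "clicks": 25},
]
kept = exclude_appearance(rows, ["VIDEO"])
```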

Advanced Filter Patterns

Multiple Filters (AND Logic)

Filters in the same group are ANDed:

{
  dimensionFilterGroups: [{
    filters: [
      { dimension: 'country', operator: 'equals', expression: 'usa' },
      { dimension: 'device', operator: 'equals', expression: 'MOBILE' }
    ]
  }]
}
// Returns: USA AND Mobile traffic only

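Multiple Filter Groups (Also AND Logic)

Separate groups are not ORed. Google's API reference states that all filter groups must match for a row to be returned, and "and" is the only groupType it lists as supported. Splitting filters across groups therefore behaves like one larger AND:

{
  dimensionFilterGroups: [
    {
      filters: [
        { dimension: 'country', operator: 'equals', expression: 'usa' }
      ]
    },
    {
      filters: [
        { dimension: 'country', operator: 'equals', expression: 'gbr' }
      ]
    }
  ]
}
// Matches nothing: no row is both USA AND UK

Emulating OR

For OR on page or query, put the alternatives in one regex filter:

{
  dimensionFilterGroups: [{
    filters: [
      { dimension: 'query', operator: 'includingRegex', expression: 'pricing|cost' }
    ]
  }]
}
// Returns: queries containing "pricing" OR "cost"

For OR across AND groups, such as (USA AND Mobile) OR (UK AND Desktop), send one request per group and merge the rows client-side:

// Request 1: USA AND Mobile
{
  dimensionFilterGroups: [{
    filters: [
      { dimension: 'country', operator: 'equals', expression: 'usa' },
      { dimension: 'device', operator: 'equals', expression: 'MOBILE' }
    ]
  }]
}

// Request 2: UK AND Desktop
{
  dimensionFilterGroups: [{
    filters: [
      { dimension: 'country', operator: 'equals', expression: 'gbr' },
      { dimension: 'device', operator: 'equals', expression: 'DESKTOP' }
    ]
  }]
}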
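
When rows from several filtered requests have to be combined client-side (for example one request per country/device pair), they can be merged on their keys. A minimal sketch; `merge_rows` is a hypothetical helper, and the impression-weighted position is an approximation, not a documented GSC formula:

```python
def merge_rows(*responses):
    """Merge rows from several responses, summing metrics for identical keys."""
    merged = {}
    for rows in responses:
        for row in rows:
            key = tuple(row["keys"])
            acc = merged.setdefault(
                key, {"keys": list(key), "clicks": 0, "impressions": 0, "_pos": 0.0}
            )
            acc["clicks"] += row["clicks"]
            acc["impressions"] += row["impressions"]
            # Weight position by impressions so busy rows dominate the average
            acc["_pos"] += row["position"] * row["impressions"]

    result = []
    for acc in merged.values():
        weighted = acc.pop("_pos")
        imps = acc["impressions"]
        acc["ctr"] = acc["clicks"] / imps if imps else 0.0
        acc["position"] = weighted / imps if imps else 0.0
        result.append(acc)
    return result
```

CTR is recomputed from the summed clicks and impressions rather than averaged, since averaging per-row ratios would overweight low-volume rows.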

Data Loss Warning

Google's deep dive blog states:

"When you group by page and/or query, the system may drop some data to reduce cardinality."

Translation: Large sites lose ~66% of impression data when querying page + query together.

Why? Google pre-aggregates data to reduce storage. When you cross-reference page × query, the result set explodes (millions of combinations). Google drops low-traffic combinations.

Impact example:

// Query 1: Pages alone
{ dimensions: ['page'] }
// Returns: 10M total impressions

// Query 2: Pages + Queries
{ dimensions: ['page', 'query'] }
// Returns: 3.4M total impressions (66% data loss)
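
The loss can be measured on your own property by comparing the two totals. A minimal sketch; `impression_loss` is a hypothetical helper that takes the rows arrays from a page-only request and a page + query request over the same date range:

```python
def impression_loss(page_rows, page_query_rows):
    """Fraction of page-level impressions missing from the page + query result."""
    page_total = sum(row["impressions"] for row in page_rows)
    combined_total = sum(row["impressions"] for row in page_query_rows)
    if page_total == 0:
        return 0.0
    return 1 - combined_total / page_total
```

With the figures above, 1 - 3.4M / 10M gives 0.66, the 66% quoted.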

Workaround: Query dimensions separately when precision matters:

  1. Get top pages: dimensions: ['page']
  2. Get top queries: dimensions: ['query']
  3. For specific page, get queries: dimensionFilterGroups: [{ filters: [{ dimension: 'page', ... }] }]

This avoids cross-dimensional data loss.

Sorting and Limits

No Sort Parameter

The API does not support custom sorting. Results are sorted by clicks descending, except when you group by date, in which case they are sorted by date ascending.

You cannot sort by impressions, CTR, or position in the request; sort client-side after fetching.

Source: AnalyticsEdge documentation on GSC API limitations
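
Client-side sorting is a one-liner once the rows are in memory. A minimal sketch, assuming each row carries the clicks, impressions, ctr, and position fields the API returns; `sort_rows` is a hypothetical helper:

```python
def sort_rows(rows, metric, descending=True):
    """Return rows sorted by one of the metric fields in each API row."""
    if metric not in {"clicks", "impressions", "ctr", "position"}:
        raise ValueError(f"unknown metric: {metric}")
    return sorted(rows, key=lambda row: row[metric], reverse=descending)
```

For position, pass descending=False so the best-ranking rows (lowest numbers) come first.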

Pagination

Max 25,000 rows per request. For larger datasets, use startRow:

// Page 1
{ rowLimit: 25000, startRow: 0 }

// Page 2
{ rowLimit: 25000, startRow: 25000 }

// Page 3
{ rowLimit: 25000, startRow: 50000 }

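Daily data limit: Google exposes at most 50,000 rows per day of data, per search type, per property (sorted by clicks). For a single-day date range, that means two full 25,000-row pages at most.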

Anonymized Queries

GSC hides queries issued by fewer than a few dozen users over a 2-3 month period (the exact threshold is undocumented).

These "anonymized queries" are:

  • Included in totals (clicks/impressions aggregated)
  • Excluded from query dimension (missing from results)

Example:

{ dimensions: ['query'] }
// Returns: 500 queries, 10K clicks

// But totals show:
// 12K clicks (2K from anonymized queries)

You cannot retrieve anonymized queries via API. They exist only in aggregate metrics.
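
The size of the gap can still be estimated by subtraction. A minimal sketch, assuming the total comes from a request over the same date range without the query dimension; `anonymized_clicks` is a hypothetical helper:

```python
def anonymized_clicks(total_clicks, query_rows):
    """Clicks attributed to anonymized queries: the total minus what the rows account for."""
    visible = sum(row["clicks"] for row in query_rows)
    # Clamp at zero in case the two requests disagree slightly
    return max(total_clicks - visible, 0)
```

For the example above, a 12K total minus 10K visible clicks leaves 2K attributed to anonymized queries.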

gscdump Query Builder

gscdump stores GSC data in SQLite (D1) and exposes Drizzle-style query syntax:

// Native GSC API (complex)
await fetch('https://searchconsole.googleapis.com/...', {
  body: JSON.stringify({
    startDate: '2025-01-01',
    endDate: '2025-01-31',
    dimensions: ['query'],
    dimensionFilterGroups: [{
      filters: [{
        dimension: 'query',
        operator: 'includingRegex',
        expression: '^mcp'
      }]
    }]
  })
})

-- gscdump MCP (simple)
SELECT query, SUM(clicks) as clicks
FROM gsc_keywords
WHERE date BETWEEN '2025-01-01' AND '2025-01-31'
  AND query LIKE 'mcp%'
GROUP BY query
ORDER BY clicks DESC

gscdump removes:

  • 25k row limit (query unlimited historical data)
  • 5K row bug (dimensions work correctly)
  • searchAppearance bugs (data stored correctly)
  • Data loss (no pre-aggregation)
  • Rate limits (query your own DB)

Common Query Patterns

Top Keywords by Clicks

{
  startDate: '2025-01-01',
  endDate: '2025-01-31',
  dimensions: ['query'],
  rowLimit: 100
}

Pages Losing Traffic (Month-over-Month)

# Get current month
current = query_gsc(dimensions=['page'], start_date='2025-01-01', end_date='2025-01-31')

# Get previous month
previous = query_gsc(dimensions=['page'], start_date='2024-12-01', end_date='2024-12-31')

# Compare client-side
for page in current['rows']:
    prev_clicks = find_page_clicks(previous, page['keys'][0])
    diff = page['clicks'] - prev_clicks
    if diff < -100:
        print(f"Declining: {page['keys'][0]} ({diff} clicks)")
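
The comparison step can be made self-contained with a dictionary lookup in place of a per-page search. A minimal sketch; `declining_pages` is a hypothetical helper that takes the two rows arrays, and pages missing from the previous month are treated as having 0 clicks:

```python
def declining_pages(current_rows, previous_rows, threshold=-100):
    """Return (page, diff) pairs whose click change is below `threshold`."""
    previous_clicks = {row["keys"][0]: row["clicks"] for row in previous_rows}
    declines = []
    for row in current_rows:
        page = row["keys"][0]
        diff = row["clicks"] - previous_clicks.get(page, 0)
        if diff < threshold:
            declines.append((page, diff))
    # Largest drop first
    return sorted(declines, key=lambda item: item[1])
```

Note that pages which disappeared entirely from the current month are not reported, because the loop only walks the current rows.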

Mobile vs Desktop Performance

// Mobile
{
  dimensions: ['page'],
  dimensionFilterGroups: [{
    filters: [{ dimension: 'device', operator: 'equals', expression: 'MOBILE' }]
  }]
}

// Desktop (separate query)
{
  dimensions: ['page'],
  dimensionFilterGroups: [{
    filters: [{ dimension: 'device', operator: 'equals', expression: 'DESKTOP' }]
  }]
}

Striking Distance Keywords (Position 4-15)

GSC API doesn't filter by position directly. Fetch all, filter client-side:

data = query_gsc(dimensions=['query'])

striking_distance = [
    row for row in data.get('rows', [])
    if 4 <= row['position'] <= 15 and row['impressions'] > 100
]

# Sort by impressions (opportunity size)
striking_distance.sort(key=lambda x: x['impressions'], reverse=True)

Brand vs Non-Brand Traffic

// Brand queries
{
  dimensions: ['query'],
  dimensionFilterGroups: [{
    filters: [{
      dimension: 'query',
      operator: 'includingRegex',
      expression: '(brand|company|product)'
    }]
  }]
}

// Non-brand (separate request with the inverse filter)
{
  dimensions: ['query'],
  dimensionFilterGroups: [{
    filters: [{
      dimension: 'query',
      operator: 'excludingRegex',
      expression: '(brand|company|product)'
    }]
  }]
}

Best Practices

1. Account for 2-3 day lag

Don't query today's date. Always subtract 3 days:

const endDate = new Date()
endDate.setDate(endDate.getDate() - 3)
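
The same idea in Python, extended to produce both ends of the range in the YYYY-MM-DD format the API expects. A minimal sketch; `lagged_range` is a hypothetical helper and the 3-day default simply mirrors the rule above:

```python
from datetime import date, timedelta


def lagged_range(days, today=None, lag_days=3):
    """Return (start, end) ISO dates for a `days`-long window ending `lag_days` ago."""
    today = today or date.today()
    end = today - timedelta(days=lag_days)
    start = end - timedelta(days=days - 1)  # endDate is inclusive
    return start.isoformat(), end.isoformat()
```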

2. Avoid lossy combinations

Never query page + query for accurate totals. Query separately.

3. Use regex for complex filters

contains is slow on large datasets. Regex is optimized:

// Slow
{ operator: 'contains', expression: '/blog/' }

// Fast
{ operator: 'includingRegex', expression: '\\/blog\\/' }

4. Paginate large results

Don't assume <25k rows. Always implement pagination:

all_rows = []
start_row = 0

while True:
    data = query_gsc(row_limit=25000, start_row=start_row)
    rows = data.get('rows', [])

    if not rows:
        break

    all_rows.extend(rows)
    start_row += 25000

    if len(rows) < 25000:  # Last page
        break

5. Cache aggressively

GSC data updates once daily. Cache responses for 24 hours:

const cacheKey = `gsc:${siteUrl}:${hash(query)}`
const cached = await cache.get(cacheKey)

if (cached) return cached

const data = await queryGSC(...)
await cache.set(cacheKey, data, { ttl: 86400 })  // 24h
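
A cache key is only useful if the same request always hashes to the same value. A minimal Python sketch of one way to derive it, assuming the request body is a plain JSON-serializable dict; `cache_key` is a hypothetical helper and SHA-256 is just one reasonable choice of hash:

```python
import hashlib
import json


def cache_key(site_url, query):
    """Build a stable cache key from the site URL and the request body."""
    # sort_keys makes the JSON independent of dict insertion order
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"gsc:{site_url}:{digest}"
```

Without the key sorting, two logically identical requests built in a different field order would miss each other in the cache.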

Limitations Summary

Issue | Impact | Workaround
5K row bug with date+query | Returns 5k instead of 25k | Query dimensions separately
searchAppearance bugs | notContains returns opposite | Filter client-side
Data loss with page+query | 66% impressions missing | Query dimensions separately
No sort parameter | Always sorted by clicks | Sort client-side
25k row limit | Large sites need pagination | Use startRow + loop
Anonymized queries | Missing from results | Accept data gap

Next Steps

Why gscdump Exists

The GSC API's bugs, limits, and data loss make reliable querying difficult. gscdump syncs your full dataset daily, stores it without limits, and fixes API quirks:

  • No 5K row bug (query any dimension combination)
  • No 25K row limit (query millions of rows)
  • No data loss (raw data stored before aggregation)
  • No rate limits (query your own database)

Try gscdump free: gscdump.com
