Skip to content

[FEATURE] Wire OData + GraphQL query-option push-down through the SQL planner #659

@jeffreyaven

Description

@jeffreyaven

Summary

stackql currently has zero query-option push-down working end-to-end for OData providers, and zero SQL LIMIT N push-down for GraphQL providers. SQL LIMIT, WHERE, ORDER BY, and SELECT projections are all applied client-side: the upstream API computes a large result set and stackql truncates after the fact. The any-sdk side has scaffolding for most of this already (QueryParamPushdown config tree, inheritance walker, getters), but the stackql planner never reads it.

The user-facing consequence: real OData providers (Microsoft Graph / Entra ID being the immediate motivator) cannot abstract OData query semantics into SQL — users must write protocol-leaky WHERE-clause string literals like WHERE "$filter" = 'startswith(displayName,''A'')' instead of natural SQL. Similarly, GraphQL providers (Cloudflare, Trevorblades) force users to write WHERE limit = N instead of LIMIT N.

This issue proposes the comprehensive consumer-side wiring: OData v1 surface ($top, $select, $orderby, $count, $filter, $skip, and @odata.nextLink pagination) plus GraphQL LIMIT N push-down. Trippin fixture, Flask mock server, and robot tests for everything.

Dependencies

Blocking: any-sdk issue Complete the OData scaffolding: add SkipPushdown + named odata_next_link pagination algorithm must be merged and published first. That issue adds:

  • SkipPushdown struct (needed for Phase 6 below).
  • Named PaginationAlgorithmODataNextLink (needed for Phase 7 below).

Once that lands and stackql/go.mod is bumped to the published any-sdk version, this issue is unblocked.

The remaining four any-sdk pushdowns (SelectPushdown, FilterPushdown, OrderByPushdown, TopPushdown, CountPushdown) are already shipped, so Phases 1-5 of this issue have no any-sdk dependency.

Inventory

Status of OData query options before this issue:

OData option any-sdk stackql Tests End-to-end
$select shipped none unit only NO
$filter shipped none unit only NO
$orderby shipped none unit only NO
$top shipped none unit only NO
$count shipped none unit only NO
$skip shipped after dependency PR none unit after dep NO
@odata.nextLink shipped after dependency PR generic semantic consumer exists at mono_valent_execution.go:318, 350; no algorithm dispatch entry none NO
$expand greenfield n/a n/a NO (deferred)
$search greenfield n/a n/a NO (deferred)

GraphQL LIMIT push-down has no precedent at all — no consumer for queryParamPushdown.top in the GraphQL acquire path (graphql_single_select_acquire.go).

Scope

Eight phases, each independently shippable. The first five are OData-only consumer-side wiring with no any-sdk dependency. Phases 6-7 depend on the any-sdk PR. Phase 8 is the GraphQL analogue. Each phase ships with at least one robot test against the Trippin (OData) or Trevorblades (GraphQL) fixture.

Phase 1 — $top

The simplest case. Foundation for all subsequent OData push-down.

stackql changes:

  • Implement the body of the existing empty case *sqlparser.Limit: stub at internal/stackql/astvisit/query_rewriting.go:1037. Type-assert node.Rowcount to *sqlparser.SQLVal with Type == IntVal, parse the bytes.
  • In the planner, look up op.GetQueryParamPushdown().GetTop(). If present, attach the int as a push-down hint on the plan node.
  • In internal/stackql/primitivebuilder/single_select_acquire.go, when the hint is present, set the query parameter named by TopPushdown.GetParamName() ($top for dialect: odata).
  • Belt-and-braces: keep the existing client-side LIMIT truncation as a safety net in case the upstream ignores $top.

Robot test: LIMIT 3 On Trippin People Sends $top=3 (see Test Cases).

Phase 2 — $select

stackql changes:

  • At plan time, inspect the SELECT projection list. If every projection is a bare column reference (no expressions / function calls / aliases of expressions), and the operation declares queryParamPushdown.select with dialect: odata, attach a push-down hint with the comma-separated field list.
  • Dispatch site sets $select=field1,field2,... query param.

Constraint: push-down only fires when projections are bare columns. SELECT JSON_EXTRACT(meta, '$.foo') AS bar keeps meta server-side and the JSON extraction client-side.

Robot test: SELECT user_name, first_name FROM trippin.people.people LIMIT 5 asserts upstream received ?$select=user_name,first_name&$top=5.

Phase 3 — $orderby

stackql changes:

  • Walk Select.OrderBy at plan time. For each Order element, push down only if Expr is a bare *ColName; if it's an expression, abort push-down.
  • Render as field [asc|desc][, field [asc|desc]]....
  • Dispatch site sets $orderby= query param.

Robot test: ORDER BY user_name DESC LIMIT 3 asserts ?$orderby=user_name desc&$top=3.

Phase 4 — $count

stackql changes:

  • Detect SELECT COUNT(*) FROM ... with no other projections and no GROUP BY at plan time.
  • If queryParamPushdown.count is declared, set $count=true AND $top=0 so the response is just the OData envelope with @odata.count.
  • On response, project the count value into a single-row single-column result. Bypass the normal row-iteration projection.

Robot test: SELECT COUNT(*) FROM trippin.people.people returns one row, mock asserts $count=true was sent.

Phase 5 — $filter (highest user value, largest scope)

stackql changes:

  • Implement a WHERE-clause AST walker that translates a subset of predicates to OData filter expression syntax. The walker must be partial-pushdown-safe: predicates it can translate get spliced into $filter=; predicates it cannot stay client-side. Result equivalence is preserved because the upstream returns a superset.

  • Subset supported in v1:

    • Comparison: = -> eq, <> -> ne, < -> lt, <= -> le, > -> gt, >= -> ge
    • String pattern: LIKE 'foo%' -> startswith(col,'foo'); LIKE '%foo' -> endswith(col,'foo'); LIKE '%foo%' -> contains(col,'foo')
    • Logical: AND, OR, NOT, parenthesised groups
    • Column-against-literal only (one side *ColName, other side *SQLVal)
  • Excluded from v1, stay client-side:

    • Expressions on both sides (a + b > 10)
    • Function calls other than the LIKE patterns above
    • Subqueries, JOINs
    • References to columns not in the operation's projection
    • IN lists (defer to v2)
  • supportedOperators allowlist enforcement: FilterPushdown.supportedOperators already lets the provider author further restrict the set. If the spec declares supportedOperators: [eq, ne, contains], the translator won't push gt/ge/etc. even if it could.

Robot tests: equality, range, LIKE prefix, AND composition, mixed pushable + non-pushable (asserting partial push-down).

Phase 6 — $skip (depends on any-sdk PR)

stackql changes:

  • Extract Limit.Offset from Select.Limit (parser already populates it).
  • If op.GetQueryParamPushdown().GetSkip() returns non-nil, attach an int hint.
  • Dispatch site sets $skip= query param.

Constraint: skip push-down conflicts with @odata.nextLink pagination (server cursors don't compose with client-driven offsets). Define the precedence: if the operation also configures pagination.algorithm: odata_next_link, skip push-down is silently ignored and OFFSET applies client-side. Log a debug-level note when this happens.

Robot test: LIMIT 3 OFFSET 6 against a non-paginated Trippin endpoint asserts ?$top=3&$skip=6.

Phase 7 — @odata.nextLink pagination (depends on any-sdk PR)

stackql changes:

  • Generic pagination consumer at mono_valent_execution.go:318, 350 should dispatch to the new odata_next_link algorithm by name. The any-sdk PR provides the handler implementation; stackql just needs to recognise the algorithm string.

Trippin fixture changes: configure serviceConfig.pagination.algorithm: odata_next_link. Mock server emits @odata.nextLink until rows exhausted.

Robot test: SELECT user_name FROM trippin.people.people (no LIMIT) iterates through all mock pages, terminates on absent @odata.nextLink, asserts the row count equals the total mocked population AND the mock observed N requests (where N is the number of pages).

Phase 8 — GraphQL LIMIT push-down (no OData dependency)

The OData phases cover REST. GraphQL providers (Cloudflare, Trevorblades, future SCIM-like providers) have the same UX wart with a different mechanism: SQL LIMIT N should render into the GraphQL query template's {{ .limit }} (or whatever the operation declares via TopPushdown.ParamName).

stackql changes:

  • In internal/stackql/primitivebuilder/graphql_single_select_acquire.go, around the existing paramMap construction (~line 117), if op.GetQueryParamPushdown().GetTop() is non-nil and the plan node carries a LIMIT hint, inject the value into paramMap under the configured ParamName. The any-sdk GraphQL reader will then render {{ .<paramName> }} with that value.

Provider-side declaration (in Trevorblades fixture):

x-stackQL-config:
  queryParamPushdown:
    top:
      paramName: limit

And the GraphQL query template uses {{ .limit }} (or {{ if .limit }}{{ .limit }}{{ else }}<default>{{ end }} for graceful absence) — providers already do this in workaround form, so the template change is zero work.

Robot tests:

  • Live: SELECT code FROM trevorblades.geo.continents LIMIT 3 against the real countries.trevorblades.com endpoint. Trevorblades has 7 continents; should return exactly 3 rows. Without push-down, all 7 come back and stackql client-caps to 3.
  • Mocked: same query against a mock that records the rendered GraphQL query body. Assert limit: 3 appears in the rendered query.

Test cases

Trippin fixture

Copy any-sdk/test/registry/unsigned-src/odata_trippin/v00.00.00000/ into stackql/test/registry/src/odata_trippin/v0.1.0/. Extend with the consolidated push-down config:

# services/main.yaml (excerpt)
x-stackQL-config:
  queryParamPushdown:
    select:
      dialect: odata
    filter:
      dialect: odata
      supportedOperators: [eq, ne, gt, ge, lt, le, contains, startswith, endswith]
    orderBy:
      dialect: odata
    top:
      dialect: odata
      maxValue: 1000
    count:
      dialect: odata
    skip:
      dialect: odata          # added in Phase 6
serviceConfig:
  pagination:
    algorithm: odata_next_link   # added in Phase 7

Trevorblades fixture extension

Add a top push-down declaration to the existing test/registry/src/trevorblades/v0.1.0/services/geo.yaml. Splice {{ .limit }} into the GraphQL continents(...) field args. (Trevorblades doesn't natively support a limit arg on continents, so for the test we either pick a different field that does, or use a small mock.) The mocked variant is the more reliable assertion target.

Mock HTTP servers

  • OData Trippin: new test/python/stackql_test_tooling/flask/trippin/app.py. Reads $top, $skip, $select, $filter, $orderby, $count from request query string. Applies OData semantics over a hardcoded 20-person collection. Emits @odata.nextLink when more pages remain. Records every request to an in-memory log accessible via a Mock Should Have Received keyword.
  • GraphQL Trevorblades (mocked variant): extend the existing Flask GraphQL helper (or add one if absent) to record the raw POST body so robot tests can assert what stackql sent.

Robot tests

*** Test Cases ***
LIMIT 3 On Trippin People Sends $top=3
    [Tags]    odata    pushdown    top
    ${output} =    StackQL Exec Inline    SELECT user_name FROM odata_trippin.people.people LIMIT 3;
    Should Match Pipe Table Row Count    ${output}    3
    Mock Should Have Received    GET    /People    $top=3
    Mock Call Count Should Be    1

SELECT Specific Columns Sends $select With Those Columns
    [Tags]    odata    pushdown    select
    ${output} =    StackQL Exec Inline    SELECT user_name, first_name FROM odata_trippin.people.people LIMIT 5;
    Mock Should Have Received    GET    /People    $select=user_name,first_name    $top=5

ORDER BY Sends $orderby
    [Tags]    odata    pushdown    orderby
    ${output} =    StackQL Exec Inline    SELECT user_name FROM odata_trippin.people.people ORDER BY user_name DESC LIMIT 3;
    Mock Should Have Received    GET    /People    $orderby=user_name desc    $top=3

COUNT(*) Sends $count=true And Returns Single Row
    [Tags]    odata    pushdown    count
    ${output} =    StackQL Exec Inline    SELECT COUNT(*) FROM odata_trippin.people.people;
    Should Match Pipe Table Row Count    ${output}    1
    Mock Should Have Received    GET    /People    $count=true

WHERE Equality Pushes Down As eq Filter
    [Tags]    odata    pushdown    filter
    ${output} =    StackQL Exec Inline    SELECT user_name FROM odata_trippin.people.people WHERE last_name = 'Smith';
    Mock Should Have Received    GET    /People    $filter=last_name eq 'Smith'

WHERE LIKE Prefix Pushes Down As startswith
    [Tags]    odata    pushdown    filter
    ${output} =    StackQL Exec Inline    SELECT user_name FROM odata_trippin.people.people WHERE first_name LIKE 'Russ%';
    Mock Should Have Received    GET    /People    $filter=startswith(first_name,'Russ')

WHERE AND Composition Pushes Both Predicates Down
    [Tags]    odata    pushdown    filter
    ${output} =    StackQL Exec Inline    SELECT user_name FROM odata_trippin.people.people WHERE last_name = 'Smith' AND age > 30;
    Mock Should Have Received    GET    /People    $filter=last_name eq 'Smith' and age gt 30

WHERE Mixed Pushable + Non-Pushable Pushes Subset Server-Side
    [Tags]    odata    pushdown    filter    partial
    ${output} =    StackQL Exec Inline    SELECT user_name FROM odata_trippin.people.people WHERE last_name = 'Smith' AND JSON_EXTRACT(metadata, '$.role') = 'admin';
    # last_name = 'Smith' pushes down; JSON_EXTRACT stays client-side
    Mock Should Have Received    GET    /People    $filter=last_name eq 'Smith'

OFFSET With LIMIT Sends $skip And $top
    [Tags]    odata    pushdown    skip
    ${output} =    StackQL Exec Inline    SELECT user_name FROM odata_trippin.people.people LIMIT 3 OFFSET 6;
    Mock Should Have Received    GET    /People    $top=3    $skip=6

@odata.nextLink Pagination Walks All Pages
    [Tags]    odata    pagination    next_link
    # Mock serves 5 people per page across 4 pages (20 total).
    ${output} =    StackQL Exec Inline    SELECT user_name FROM odata_trippin.people.people;
    Should Match Pipe Table Row Count    ${output}    20
    Mock Call Count Should Be    4

LIMIT 3 On Trevorblades GraphQL Renders limit: 3 In Query Body
    [Tags]    graphql    pushdown    top    mocked
    ${output} =    StackQL Exec Inline    SELECT code FROM trevorblades.geo.continents LIMIT 3;
    Last GraphQL Request Body Should Contain    limit: 3
    Should Match Pipe Table Row Count    ${output}    3

LIMIT 3 On Live Trevorblades Continents Returns Exactly 3 Rows
    [Tags]    graphql    pushdown    top    live
    ${output} =    StackQL Exec Inline    SELECT code FROM trevorblades.geo.continents LIMIT 3;
    # Trevorblades has 7 continents. With push-down, upstream returns 3.
    # Without push-down, upstream returns 7 and stackql client-caps.
    # Either way, the visible result is 3 rows; the test value is that
    # this test passes against the live endpoint at all.
    Should Match Pipe Table Row Count    ${output}    3

Back-compat smoke

Every existing provider that does NOT declare x-stackQL-config.queryParamPushdown.* must continue to behave identically (client-side LIMIT cap, no upstream param injection, no plan-time WHERE rewriting). The current azuread testdata fixtures (test/registry/src/azuread/) are the canonical "no push-down declared, uses WHERE-clause workaround" baseline; all existing tests against them must pass unchanged.

The existing GraphQL Trevorblades response-transform robot test (stackql_traffic_light_integration_from_cmd_line.robot:92) is the canonical "no push-down declared" GraphQL baseline; it must pass unchanged with no limit: N added to the rendered query.

Why this matters

  • Microsoft Graph / Entra ID provider is blocked behind Phases 1-5. Without OData push-down, every query forces the upstream to compute large result sets, then truncates client-side. Rate-limit budget burns, response times balloon, users have to learn OData syntax to write SQL. With Phases 1-5 landed, a real Entra ID provider becomes a first-class citizen.
  • Cloudflare provider's UX wart goes away with Phase 8. Today the 10 analytics resources expose limit as a WHERE-clause parameter; with GraphQL LIMIT push-down, that workaround is removed.
  • stackql's protocol abstraction story becomes complete. WHERE, ORDER BY, projection, COUNT, and LIMIT all push down for both REST (OData) and GraphQL. The last "protocol leak" — having to know whether the resource is REST or GraphQL to write the right query — is closed.
  • Trippin is the canonical OData reference. Hosting it in stackql/test/registry/src/ unblocks robot test development for any future OData provider.

Implementation notes

  • stackql-parser status: Phase 1 ($top) requires no parser changes — Select.Limit.Rowcount already populated. Phases 2-7 also no parser changes — Select.SelectExprs, Select.OrderBy, Select.Where.Expr, Select.Limit.Offset, and AST function-call detection all already exposed. An optional Limit.AsInt() (int64, bool) helper would clean up call sites but is not blocking.
  • Vendor bump: bump stackql/go.mod's pin on stackql-parser only if the optional helper lands; bump any-sdk to the version that includes the prerequisite PR.
  • Test infrastructure: the existing test/python/stackql_test_tooling/flask/ pattern (see github/app.py) is the model for the new trippin/app.py. Robot test keyword library extensions for Mock Should Have Received and Last GraphQL Request Body Should Contain may be needed.
  • Phase ordering: Phases 1-5 can ship in any order; Phase 6 needs the any-sdk PR; Phase 7 needs the any-sdk PR; Phase 8 is independent. The most user-impactful ordering is: 1 ($top) — proves the wiring — then 5 ($filter) — biggest payoff — then 2, 3, 4 — finishing the OData v1 surface — then 6-7 once the any-sdk PR lands — then 8 for GraphQL.

Out of scope (this issue)

  • $expand, $search, $value, $batch. Deferred per the any-sdk issue's "out of scope" section.
  • OData function-call push-down beyond startswith / contains / endswith. Useful but a large surface area.
  • WHERE-clause push-down for non-OData dialects (raw URL query params for REST-but-not-OData APIs). Defer to a separate issue.
  • GraphQL $select analog (projection push-down via GraphQL field selection). Doable but Cloudflare's existing GraphQL transforms make this less urgent. Defer.
  • GraphQL $filter analog. Each GraphQL API has bespoke filter syntax; can't be generalised the same way OData can.

File references

stackql (the gap)

  • internal/stackql/astvisit/query_rewriting.go line 1037 — existing empty case *sqlparser.Limit: stub (Phase 1 + 6 wire-in).
  • internal/stackql/astvisit/query_rewriting.go lines 335-336 — visitor descent already in place.
  • internal/stackql/planbuilder/ — push-down hint attachment for every phase.
  • internal/stackql/primitivebuilder/single_select_acquire.go — REST dispatch site (Phases 1-6).
  • internal/stackql/primitivebuilder/graphql_single_select_acquire.go — GraphQL dispatch site (Phase 8).
  • internal/stackql/execution/mono_valent_execution.go lines 318, 350 — existing pagination consumer; add odata_next_link to the algorithm dispatch (Phase 7).
  • test/registry/src/odata_trippin/v0.1.0/ — new fixture location (copied from any-sdk testdata).
  • test/registry/src/trevorblades/v0.1.0/services/geo.yaml — extend with top push-down declaration (Phase 8).
  • test/python/stackql_test_tooling/flask/trippin/app.py — new mock server.
  • test/robot/functional/stackql_mocked_from_cmd_line.robot — new test cases for Phases 1-7.
  • test/robot/integration-traffic-lights/stackql_traffic_light_integration_from_cmd_line.robot — new test cases for Phase 8 (live Trevorblades).

any-sdk (consumed by this issue; no new changes needed beyond the prerequisite PR)

  • internal/anysdk/query_param_pushdown.goSelectPushdown, FilterPushdown, OrderByPushdown, TopPushdown, CountPushdown (existing); SkipPushdown (added in prerequisite PR).
  • internal/anysdk/operation_store.go line 447 — inheritance walker.
  • pkg/streaming/pagination.go line 22 — algorithm constants, including PaginationAlgorithmODataNextLink (added in prerequisite PR).
  • pkg/graphql/graphql.goStandardGQLReader.iterativeInput is the substitution map Phase 8 writes into.

stackql-parser (no changes required)

  • go/vt/sqlparser/ast.go lines 72 (Select.Limit), 985-988 (Limit{Offset, Rowcount Expr}), 747 (SQLVal{Type, Val}). All needed AST nodes already exposed.

Related

  • Prerequisite: any-sdk issue Complete the OData scaffolding. Must be merged + published before Phases 6, 7 of this issue can land.
  • Cloudflare provider: currently uses WHERE limit = N workaround on all 10 GraphQL analytics resources. Phase 8 of this issue removes that workaround.
  • Future: a real Microsoft Graph / Entra ID provider in stackql-provider-registry depends on at least Phases 1-5 of this issue landing.

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions