docs(workflow): document queue and database ownership by service#213

Merged: behinddwalls merged 4 commits into main from preetam/request-log-ownership-doc on Jun 8, 2026

Conversation

behinddwalls (Collaborator) commented on Jun 5, 2026

Summary

Why?

Issue #211 (follow-up from PR #205) asks for a single place that records
the submitqueue topology at a high level: which service owns which data and
how the two services communicate. The workflow RFC already covers the
cross-queue flow, so ownership belongs alongside it.

What?

Append an "Ownership by service" section to doc/rfc/submitqueue/workflow.md,
described at a conceptual level rather than enumerating individual tables
and topics:

  • Gateway — RPC entry point and owner of the request log; the only service
    that reads or writes that record.
  • Orchestrator — runs the pipeline and owns its working state (requests,
    batches, builds); the only service that writes it.
  • Messaging queue — the shared, pluggable infrastructure the two services
    communicate through, kept in its own database separate from application
    data.

A closing "Request-log ownership invariant" section captures the rule: the
orchestrator only emits log events; the gateway is the sole consumer of those
events and the only writer of the request log.
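
The ownership split described above can be illustrated with a small in-memory model. This is only a sketch: `LogEvent`, `Queue`, `Orchestrator`, `Gateway`, and the `batched` status are hypothetical names chosen for illustration, and a buffered channel stands in for the messaging queue. None of this is the actual submitqueue code.

```go
package main

import (
	"fmt"
	"sync"
)

// LogEvent is a hypothetical request-log event emitted by the orchestrator.
type LogEvent struct {
	RequestID string
	Status    string
}

// Queue stands in for the shared messaging queue (a buffered channel here).
type Queue chan LogEvent

// Orchestrator can only publish; it holds no reference to the request log.
type Orchestrator struct{ log Queue }

// Emit publishes a log event to the queue.
func (o Orchestrator) Emit(id, status string) {
	o.log <- LogEvent{RequestID: id, Status: status}
}

// RequestLog is the gateway-owned record; only Gateway methods touch it.
type RequestLog struct {
	mu      sync.Mutex
	entries map[string][]string
}

// Gateway is the sole consumer of log events and sole writer of the log.
type Gateway struct {
	log   Queue
	store *RequestLog
}

func NewGateway(q Queue) *Gateway {
	return &Gateway{log: q, store: &RequestLog{entries: map[string][]string{}}}
}

// Consume persists every event until the queue is closed.
func (g *Gateway) Consume() {
	for ev := range g.log {
		g.store.mu.Lock()
		g.store.entries[ev.RequestID] = append(g.store.entries[ev.RequestID], ev.Status)
		g.store.mu.Unlock()
	}
}

// Status returns a copy of the persisted history for a request.
func (g *Gateway) Status(id string) []string {
	g.store.mu.Lock()
	defer g.store.mu.Unlock()
	return append([]string(nil), g.store.entries[id]...)
}

func main() {
	q := make(Queue, 8)
	gw := NewGateway(q)
	orch := Orchestrator{log: q}

	orch.Emit("req-1", "started")
	orch.Emit("req-1", "batched") // hypothetical status, for illustration
	close(q)
	gw.Consume()

	fmt.Println(gw.Status("req-1"))
}
```

Because `Orchestrator` has no field that reaches `RequestLog`, the type layout itself enforces the invariant in this toy model.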

Documentation only; no code, schema, or proto changes.

Test Plan

  • make lint (clean tree)

Issue

Closes #211

Issues

Stack

  1. refactor(request-log): gateway is sole writer of request log #205
  2. @ docs(workflow): document queue and database ownership by service #213
  3. docs(rfc): extension contract — identity in, resolve internally #214

behinddwalls changed the title from "docs(workflow): document queue/db ownership, tiering, request-log invariant" to "docs(workflow): document queue ownership and database ownership model" on Jun 6, 2026
behinddwalls changed the title from "docs(workflow): document queue ownership and database ownership model" to "docs(workflow): document queue and database ownership by service" on Jun 6, 2026
behinddwalls force-pushed the preetam/request-log-ownership-doc branch 3 times, most recently from a297f25 to 4ab6db2, on June 6, 2026 14:26
behinddwalls marked this pull request as ready for review on June 8, 2026 16:18
behinddwalls requested review from a team and sbalabanov as code owners on June 8, 2026 16:18

The request log had two persistence paths: the gateway wrote some entries
directly, while the orchestrator ran the `log`-topic consumer that wrote all
downstream entries to storage. Having the orchestrator persist the request log
blurs ownership — the orchestrator is a pipeline that should only emit events,
not own the request-log table. Concentrating all writes in the gateway gives a
single owner for the request log and keeps the orchestrator free of request-log
storage writes.

The request-log persistence consumer moves from the orchestrator to the gateway:

- Move `submitqueue/orchestrator/controller/log/` → `submitqueue/gateway/controller/log/`
  (importpath, doc comment, and default consumer group `orchestrator-log` →
  `gateway-log`). Logic is unchanged.
- Orchestrator: `TopicKeyLog` becomes publish-only (subscription dropped), the
  log controller registration and import are removed, controller count 11 → 10.
  It still publishes via `submitqueue/core/request.PublishLog`.
- Gateway: builds a consumer (generic + mysql classifiers), registers the moved
  log controller on `TopicKeyLog` with a subscription (group `gateway-log`),
  starts it, and drains it with `Stop(30000)` on shutdown — preserving the
  128+SIGTERM graceful-exit contract.
- Add `HOSTNAME=gateway-dev` to both gateway compose files for a stable
  subscriber name; update the workflow RFC and gateway README.
- Tests: add a gateway integration test that publishes to the log topic (as the
  orchestrator does) and asserts the gateway consumer persists it, and an e2e
  test that lands a request and asserts Status advances to `started` —
  exercising the publish→consume→persist path across both services.

The gateway keeps its two synchronous direct writes (`accepted` on Land,
`cancelling` on Cancel) for read-your-write visibility at RPC return. Both are
gateway writes, so the invariant holds: only the gateway persists the request
log; the orchestrator only publishes. This works because gateway and
orchestrator already share the same queue and app databases.
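
To make the read-your-write point concrete, here is a hedged, single-goroutine sketch of the two write paths. The `gateway` and `logEvent` types and the `consumeOne` helper are hypothetical; only the `accepted`, `cancelling`, and `started` statuses come from the description above.

```go
package main

import "fmt"

// logEvent is a hypothetical event published by the orchestrator.
type logEvent struct{ RequestID, Status string }

// gateway owns the request log. This sketch is single-goroutine,
// so no locking is shown.
type gateway struct {
	log    map[string][]string // request log, written only by the gateway
	events chan logEvent       // stands in for the log topic
}

func newGateway() *gateway {
	return &gateway{log: map[string][]string{}, events: make(chan logEvent, 8)}
}

// Land writes "accepted" synchronously, before the RPC would return.
func (g *gateway) Land(id string) { g.log[id] = append(g.log[id], "accepted") }

// Cancel writes "cancelling" synchronously, before the RPC would return.
func (g *gateway) Cancel(id string) { g.log[id] = append(g.log[id], "cancelling") }

// consumeOne persists one queued event, if any (the asynchronous path).
func (g *gateway) consumeOne() bool {
	select {
	case ev := <-g.events:
		g.log[ev.RequestID] = append(g.log[ev.RequestID], ev.Status)
		return true
	default:
		return false
	}
}

// Status returns the latest persisted status, or "" if none exists.
func (g *gateway) Status(id string) string {
	h := g.log[id]
	if len(h) == 0 {
		return ""
	}
	return h[len(h)-1]
}

func main() {
	g := newGateway()

	g.Land("req-1")
	fmt.Println(g.Status("req-1")) // visible immediately after Land

	g.events <- logEvent{"req-1", "started"} // orchestrator publishes
	g.consumeOne()                           // gateway persists
	fmt.Println(g.Status("req-1"))
}
```

Both paths end in a gateway write, so the single-writer invariant holds in the model as it does in the description.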

- ✅ `bazel build` of both servers + the moved package
- ✅ `make test` — unit tests pass (incl. the moved log_test)
- ✅ `make check-gazelle`, `make check-tidy`, `make lint` (fmt + license)
- ✅ `make integration-test-submitqueue-gateway` — new `TestRequestLogConsumer`
  verifies the gateway consumer persists a log entry published to the log topic
- ✅ `make e2e-test` — new `TestLandRequest_PersistsStartedLogViaGatewayConsumer`
  verifies an orchestrator-published `started` log is persisted by the gateway
  and readable via Status; both services still exit 128+SIGTERM on shutdown
behinddwalls force-pushed the request-log-gateway-consumer branch from 43670be to 561f86e on June 8, 2026 16:55
behinddwalls force-pushed the preetam/request-log-ownership-doc branch from 4ab6db2 to aca149d on June 8, 2026 16:55
behinddwalls and others added 2 commits on June 8, 2026 10:03
- Make the log-consumer subscriber name unique per instance (hostname+PID)
  so co-located gateway processes don't contend for the same partition lease.
- Report the gRPC server error first in the shutdown errors.Join (it is the
  primary failure; consumer-stop is secondary cleanup).
- Clarify in the README that the gateway is the sole owner (writer and reader)
  of the request log; Status/Cancel read directly, orchestrator only publishes.
- Extract named poll constants (persistTimeout/persistPollInterval) in the
  gateway integration and e2e suites with a comment explaining that the
  in-container consumer is observed black-box via Status, so a bounded poll is
  used in lieu of an in-process channel/HookSignal wait.

Follow-ups split out: design doc (#211) and DLQ PublishLog() (#212).

Co-Authored-By: Oz <oz-agent@warp.dev>
behinddwalls force-pushed the request-log-gateway-consumer branch from 561f86e to 39e8fa0 on June 8, 2026 17:04
behinddwalls force-pushed the preetam/request-log-ownership-doc branch from aca149d to 1a51248 on June 8, 2026 17:04
behinddwalls added a commit that referenced this pull request on Jun 8, 2026
Base automatically changed from request-log-gateway-consumer to main on June 8, 2026 17:08
behinddwalls enabled auto-merge on June 8, 2026 17:08
behinddwalls disabled auto-merge on June 8, 2026 17:08
behinddwalls merged commit 98f76d5 into main on Jun 8, 2026
14 checks passed
behinddwalls deleted the preetam/request-log-ownership-doc branch on June 8, 2026 17:09
behinddwalls deployed to stack-rebase on June 8, 2026 17:09 with GitHub Actions (Active)

Development

Successfully merging this pull request may close these issues.

Design doc: queue/database ownership, queue tiering, and cross-queue message flow

2 participants