docs(workflow): document queue and database ownership by service #213
Merged
albertywu
approved these changes
Jun 8, 2026
The request log had two persistence paths: the gateway wrote some entries directly, while the orchestrator ran the `log`-topic consumer that wrote all downstream entries to storage. Having the orchestrator persist the request log blurs ownership: the orchestrator is a pipeline that should only emit events, not own the request-log table. Concentrating all writes in the gateway gives a single owner for the request log and keeps the orchestrator free of request-log storage writes.

The request-log persistence consumer moves from the orchestrator to the gateway:

- Move `submitqueue/orchestrator/controller/log/` → `submitqueue/gateway/controller/log/` (importpath, doc comment, and default consumer group `orchestrator-log` → `gateway-log`). Logic is unchanged.
- Orchestrator: `TopicKeyLog` becomes publish-only (subscription dropped), the log controller registration and import are removed, controller count 11 → 10. It still publishes via `submitqueue/core/request.PublishLog`.
- Gateway: builds a consumer (generic + mysql classifiers), registers the moved log controller on `TopicKeyLog` with a subscription (group `gateway-log`), starts it, and drains it with `Stop(30000)` on shutdown, preserving the 128+SIGTERM graceful-exit contract.
- Add `HOSTNAME=gateway-dev` to both gateway compose files for a stable subscriber name; update the workflow RFC and gateway README.
- Tests: add a gateway integration test that publishes to the log topic (as the orchestrator does) and asserts the gateway consumer persists it, and an e2e test that lands a request and asserts Status advances to `started`, exercising the publish→consume→persist path across both services.

The gateway keeps its two synchronous direct writes (`accepted` on Land, `cancelling` on Cancel) for read-your-write visibility at RPC return. Both are gateway writes, so the invariant holds: only the gateway persists the request log; the orchestrator only publishes.

This works because gateway and orchestrator already share the same queue and app databases.

- ✅ `bazel build` of both servers + the moved package
- ✅ `make test`: unit tests pass (incl. the moved log_test)
- ✅ `make check-gazelle`, `make check-tidy`, `make lint` (fmt + license)
- ✅ `make integration-test-submitqueue-gateway`: new `TestRequestLogConsumer` verifies the gateway consumer persists a log entry published to the log topic
- ✅ `make e2e-test`: new `TestLandRequest_PersistsStartedLogViaGatewayConsumer` verifies an orchestrator-published `started` log is persisted by the gateway and readable via Status; both services still exit 128+SIGTERM on shutdown
- Make the log-consumer subscriber name unique per instance (hostname+PID) so co-located gateway processes don't contend for the same partition lease.
- Report the gRPC server error first in the shutdown errors.Join (it is the primary failure; consumer-stop is secondary cleanup).
- Clarify in the README that the gateway is the sole owner (writer and reader) of the request log; Status/Cancel read directly, orchestrator only publishes.
- Extract named poll constants (persistTimeout/persistPollInterval) in the gateway integration and e2e suites with a comment explaining that the in-container consumer is observed black-box via Status, so a bounded poll is used in lieu of an in-process channel/HookSignal wait.

Follow-ups split out: design doc (#211) and DLQ PublishLog() (#212).

Co-Authored-By: Oz <oz-agent@warp.dev>
Issue #211 (follow-up from PR #205) asks for a single place that records the submitqueue topology at a high level: which service owns its data and how the two services communicate. The workflow RFC already covers the cross-queue flow, so ownership belongs alongside it.

Append an "Ownership by service" section to doc/rfc/submitqueue/workflow.md, described at a conceptual level rather than enumerating individual tables and topics:

- Gateway: RPC entry point and owner of the request log; the only service that reads or writes that record.
- Orchestrator: runs the pipeline and owns its working state (requests, batches, builds); the only service that writes it.
- Messaging queue: the shared, pluggable infrastructure the two services communicate through, kept in its own database separate from application data.

A closing "Request-log ownership invariant" section captures the rule: the orchestrator only emits log events, the gateway is the sole consumer and the only writer of the request log.

Documentation only; no code, schema, or proto changes.

- ✅ `make lint` (clean tree)

Closes #211
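The ownership invariant described above can be modelled in a few lines. This is a toy sketch, not the submitqueue code: a Go channel stands in for the messaging queue, and every type and method name here (`LogEvent`, `requestLog`, `orchestrator`, `gateway`, `land`, `drain`, `status`) is hypothetical. The point it illustrates is that the orchestrator holds only a send-only handle on the log topic, while the gateway alone holds the store.

```go
package main

import "fmt"

// LogEvent is a hypothetical request-log entry carried on the log topic.
type LogEvent struct {
	RequestID string
	Status    string
}

// requestLog is the gateway-owned store; nothing else holds a reference to it.
type requestLog struct {
	entries map[string][]string
}

func (l *requestLog) write(ev LogEvent) {
	l.entries[ev.RequestID] = append(l.entries[ev.RequestID], ev.Status)
}

// orchestrator can only publish: its handle on the topic is send-only.
type orchestrator struct {
	logTopic chan<- LogEvent
}

func (o orchestrator) publishLog(ev LogEvent) { o.logTopic <- ev }

// gateway is the sole consumer of the topic and the sole writer of the log.
type gateway struct {
	logTopic <-chan LogEvent
	log      *requestLog
}

func newGateway(topic <-chan LogEvent) *gateway {
	return &gateway{logTopic: topic, log: &requestLog{entries: map[string][]string{}}}
}

// land performs the synchronous direct write for read-your-write visibility.
func (g *gateway) land(id string) {
	g.log.write(LogEvent{RequestID: id, Status: "accepted"})
}

// drain persists every event published until the topic is closed.
func (g *gateway) drain() {
	for ev := range g.logTopic {
		g.log.write(ev)
	}
}

// status returns the latest persisted status, or "" if none exists.
func (g *gateway) status(id string) string {
	history := g.log.entries[id]
	if len(history) == 0 {
		return ""
	}
	return history[len(history)-1]
}

func main() {
	topic := make(chan LogEvent, 8)
	gw := newGateway(topic)
	orch := orchestrator{logTopic: topic}

	gw.land("req-1")
	orch.publishLog(LogEvent{RequestID: "req-1", Status: "started"})
	close(topic)
	gw.drain()
	fmt.Println(gw.status("req-1"))
}
```

Using directional channel types makes the invariant a compile-time property in this toy model; in the real system it is a convention documented in the RFC.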
behinddwalls
added a commit
that referenced
this pull request
Jun 8, 2026
## Summary

### Why?

The request log had two persistence paths: the gateway wrote some entries directly, while the orchestrator ran the `log`-topic consumer that wrote all downstream entries to storage. Having the orchestrator persist the request log blurs ownership: the orchestrator is a pipeline that should only emit events, not own the request-log table. Concentrating all writes in the gateway gives a single owner for the request log and keeps the orchestrator free of request-log storage writes.

### What?

The request-log persistence consumer moves from the orchestrator to the gateway:

- Move `submitqueue/orchestrator/controller/log/` → `submitqueue/gateway/controller/log/` (importpath, doc comment, and default consumer group `orchestrator-log` → `gateway-log`). Logic is unchanged.
- Orchestrator: `TopicKeyLog` becomes publish-only (subscription dropped), the log controller registration and import are removed, controller count 11 → 10. It still publishes via `submitqueue/core/request.PublishLog`.
- Gateway: builds a consumer (generic + mysql classifiers), registers the moved log controller on `TopicKeyLog` with a subscription (group `gateway-log`), starts it, and drains it with `Stop(30000)` on shutdown, preserving the 128+SIGTERM graceful-exit contract.
- Add `HOSTNAME=gateway-dev` to both gateway compose files for a stable subscriber name; update the workflow RFC and gateway README.
- Tests: add a gateway integration test that publishes to the log topic (as the orchestrator does) and asserts the gateway consumer persists it, and an e2e test that lands a request and asserts Status advances to `started`, exercising the publish→consume→persist path across both services.

The gateway keeps its two synchronous direct writes (`accepted` on Land, `cancelling` on Cancel) for read-your-write visibility at RPC return. Both are gateway writes, so the invariant holds: only the gateway persists the request log; the orchestrator only publishes.

This works because gateway and orchestrator already share the same queue and app databases.

## Test Plan

- ✅ `bazel build` of both servers + the moved package
- ✅ `make test`: unit tests pass (incl. the moved log_test)
- ✅ `make check-gazelle`, `make check-tidy`, `make lint` (fmt + license)
- ✅ `make integration-test-submitqueue-gateway`: new `TestRequestLogConsumer` verifies the gateway consumer persists a log entry published to the log topic
- ✅ `make e2e-test`: new `TestLandRequest_PersistsStartedLogViaGatewayConsumer` verifies an orchestrator-published `started` log is persisted by the gateway and readable via Status; both services still exit 128+SIGTERM on shutdown

## Issues

## Stack

1. @ #205
1. #213
1. #214

---------

Co-authored-by: Oz <oz-agent@warp.dev>
Summary
Why?
Issue #211 (follow-up from PR #205) asks for a single place that records
the submitqueue topology at a high level: which service owns its data and
how the two services communicate. The workflow RFC already covers the
cross-queue flow, so ownership belongs alongside it.
What?
Append an "Ownership by service" section to doc/rfc/submitqueue/workflow.md,
described at a conceptual level rather than enumerating individual tables
and topics:
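- Gateway: RPC entry point and owner of the request log; the only service
  that reads or writes that record.
- Orchestrator: runs the pipeline and owns its working state (requests,
  batches, builds); the only service that writes it.
- Messaging queue: the shared, pluggable infrastructure the two services
  communicate through, kept in its own database separate from application
  data.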
A closing "Request-log ownership invariant" section captures the rule: the
orchestrator only emits log events, the gateway is the sole consumer and
the only writer of the request log.
Documentation only; no code, schema, or proto changes.
Test Plan
- ✅ `make lint` (clean tree)
Issue
Closes #211
Issues
Stack