refactor(workers): make the queue scale to zero when idle by jaredLunde · Pull Request #7 · beyondoss/queue

jaredLunde · 2026-06-27T18:46:43Z

What

Makes the queue server's delivery and schedule workers event-driven instead of fixed-interval pollers, so an idle queue generates zero Postgres traffic and the VM (and Postgres) can scale to zero.

Why

The delivery and schedule workers polled Postgres every 1s, and a 15s depth scrape ran unconditionally. That continuous traffic kept the queue VM's TAP device busy — and, because the queries hit Postgres, kept the Postgres VM busy too — so neither ever reached instd's idle threshold. Of an app's 5 primitive VMs, queue and postgres never scaled to zero even for a completely idle app. (SCHEDULES.md previously documented this as intentional; that stance is reversed here.)

Changes

Delivery worker (src/ops/delivery.rs): drains due rows, then sleeps until the earliest pending next_attempt_at (or parks if empty), woken in-process when a publish inserts deliveries.
Schedule worker (src/ops/schedule_worker.rs): fires what's due, then sleeps until the earliest active next_fire_at, capped at KEEPALIVE_CAP (240s, under the 300s light-sleep window so the VM stays awake and fires on time while any schedule is active), woken in-process by /schedules mutations.
Depth scrape: background loop removed; computed lazily in the /metrics handler.
Wiring: route handlers poke a tokio::sync::Notify. Primitives run single-instance (max=1), so an in-process signal suffices — no LISTEN/NOTIFY, no extension change, no Postgres republish.
Wake resilience (src/db.rs): connect and first-query retry with backoff, plus a 30s acquire timeout (mirrors auth's connect_with_retry), so the first query after Postgres deep-sleeps waits while it restores from S3 instead of failing at 5s.
Docs (SCHEDULES.md, ARCHITECTURE.md) updated to the event-driven model.
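
The sleep decisions described for the two workers can be sketched as pure functions. This is a minimal illustration, not the code in src/ops/delivery.rs or src/ops/schedule_worker.rs: the `Wait` enum and both function names are hypothetical, the `until_next` argument stands in for "time remaining until the earliest `next_attempt_at` / `next_fire_at`", and parking the schedule worker when no schedule is active is an assumption inferred from the idle-queue claim.

```rust
use std::time::Duration;

/// From the PR text: schedule sleeps are capped at 240s,
/// under the 300s light-sleep window.
const KEEPALIVE_CAP: Duration = Duration::from_secs(240);

/// What a worker does after it has drained everything that is due.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum Wait {
    /// Nothing pending: block until an in-process wake-up arrives.
    Park,
    /// Sleep this long, or until woken earlier.
    Sleep(Duration),
}

/// Delivery worker: sleep until the earliest pending attempt,
/// or park when no deliveries are pending.
fn delivery_wait(until_next: Option<Duration>) -> Wait {
    match until_next {
        Some(d) => Wait::Sleep(d),
        None => Wait::Park,
    }
}

/// Schedule worker: same idea, but never sleep longer than
/// KEEPALIVE_CAP while a schedule is active.
/// Assumption: with no active schedule the worker parks.
fn schedule_wait(until_next: Option<Duration>) -> Wait {
    match until_next {
        Some(d) => Wait::Sleep(d.min(KEEPALIVE_CAP)),
        None => Wait::Park,
    }
}
```

Under these assumptions, `schedule_wait(Some(Duration::from_secs(3600)))` yields a 240s sleep, while `delivery_wait(None)` parks until a publish pokes the worker. In the real server the wake-up would come from the `tokio::sync::Notify` mentioned above; that wiring is omitted here to keep the sketch dependency-free.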

Verification

33 unit + 80 integration tests pass against real Postgres.
Reworked unsubscribe_cancels_pending to assert the real guarantee (CASCADE-delete of a pending delivery) rather than relying on poll latency — delivery is now immediate.
Statement-level trace: an idle queue issues 0 Postgres queries over a 12s window (was ~2/s).
cargo clippy -D warnings and dprint clean; .sqlx cache regenerated.

🤖 Generated with Claude Code

The delivery and schedule workers polled Postgres on a fixed 1s cadence and a 15s queue-depth scrape ran unconditionally. That continuous traffic kept the queue VM's TAP device busy (and, because the queries hit Postgres, kept the Postgres VM busy too), so neither could ever reach instd's idle threshold. Of an app's 5 primitive VMs, queue and postgres never scaled to zero.

Make the workers event-driven instead of polling:

- Delivery worker drains due rows, then sleeps until the earliest pending `next_attempt_at` (or parks if the table is empty), woken in-process when a publish inserts new deliveries.
- Schedule worker fires what's due, then sleeps until the earliest active `next_fire_at` capped at KEEPALIVE_CAP (240s, under the 300s light-sleep window so the VM stays awake and fires on time while any schedule is active), woken in-process by `/schedules` mutations.
- Depth-scrape background loop removed; computed lazily in the `/metrics` handler so an unscraped (sleeping) VM emits nothing.

Route handlers (publish, schedule create/upsert/patch/run/delete) poke the relevant `tokio::sync::Notify`. Primitives run single-instance (max=1), so an in-process signal is sufficient: no LISTEN/NOTIFY, no extension change.

DB connect/first-query now retry with backoff and a longer acquire timeout (mirrors auth's connect_with_retry) so the first query after an idle Postgres deep-sleeps holds while it restores from S3 instead of failing at 5s.

Result: an idle app generates zero queue→Postgres traffic, so queue and postgres both sleep; verified by statement-level tracing (0 queries when idle) and the full integration suite. Apps with active schedules keep both awake (correct) at ~250x less idle DB traffic.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
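
The connect/first-query retry mentioned in the commit message follows the general retry-with-backoff pattern, which can be sketched as below. This is an illustration only, not the contents of src/db.rs or auth's connect_with_retry: the function names, the doubling policy, and every parameter are assumptions, and the sketch is synchronous and std-only whereas the real code is presumably async.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base` doubled
/// `attempt` times, capped at `max`. Saturating arithmetic keeps
/// large attempt counts from overflowing.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(max)
}

/// Calls `op` until it succeeds or `max_attempts` calls have failed,
/// sleeping with exponential backoff between failures.
/// `op` is always called at least once.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(e) if attempt + 1 >= max_attempts => return Err(e),
            Err(_) => {
                sleep(backoff_delay(attempt, base, max));
                attempt += 1;
            }
        }
    }
}
```

The design intent is that a database still restoring from deep sleep looks like a slow start rather than a hard failure: early retries are cheap and quick, and later ones back off up to the cap.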

jaredLunde force-pushed the queue-scale-to-zero branch 2 times, most recently from 905cd8a to c60b002 Compare June 27, 2026 19:11

jaredLunde force-pushed the queue-scale-to-zero branch from c60b002 to afab29c Compare June 27, 2026 19:24

jaredLunde merged commit 731d603 into main Jun 27, 2026
6 checks passed

jaredLunde deleted the queue-scale-to-zero branch June 27, 2026 19:31

jaredLunde merged 1 commit into main from queue-scale-to-zero
