tensorzero-http: expose extra root certs for trust anchor injection#7507

Open
AntoineToussaint wants to merge 2 commits into main from feat/expose-trust-anchors

@AntoineToussaint AntoineToussaint commented May 21, 2026

Draft. Small, additive change to TensorzeroHttpClient. Opening for visibility while the consumer (nanogateway overhead bench) is being built.

What

New constructor on TensorzeroHttpClient:

pub fn new_with_extra_root_certs(
    global_outbound_http_timeout: Duration,
    global_outbound_http_intra_stream_read_timeout: Option<Duration>,
    proxy_env_var: impl Into<Cow<'static, str>>,
    extra_root_certs: Vec<reqwest::Certificate>,
) -> Result<Self, HttpClientError>

Threads the supplied certs through to each lazily-built inner reqwest::Client via ClientBuilder::add_root_certificate. Existing constructors (new, new_with_proxy_env_var, new_testing) keep their behavior unchanged — they delegate to a new private new_inner with extra_root_certs = Vec::new().

Hostname verification stays on; only the trust chain is extended.
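
The delegation and lazy-build flow described above can be modeled with the standard library alone. This is a minimal sketch, not the crate's code: `Certificate`, `InnerClient`, `ModelHttpClient`, and the four-slot size are hypothetical stand-ins for `reqwest::Certificate`, `reqwest::Client`, `TensorzeroHttpClient`, and the real waterfall, and the timeout and proxy parameters are left out for brevity.

```rust
use std::sync::OnceLock;

/// Stand-in for `reqwest::Certificate` (hypothetical: just holds PEM bytes).
#[derive(Clone, Debug, PartialEq)]
pub struct Certificate(pub Vec<u8>);

/// Stand-in for an inner `reqwest::Client`; it only records the extra roots
/// it was built with, so the effect of each constructor is observable.
#[derive(Debug)]
pub struct InnerClient {
    pub extra_roots: Vec<Certificate>,
}

/// Plays the role of `build_client`: apply every extra root while building.
fn build_client(extra_root_certs: &[Certificate]) -> InnerClient {
    let mut extra_roots = Vec::with_capacity(extra_root_certs.len());
    for cert in extra_root_certs {
        // The real code would call `builder.add_root_certificate(cert.clone())` here.
        extra_roots.push(cert.clone());
    }
    InnerClient { extra_roots }
}

/// Hypothetical model of the wrapper: a fixed set of lazily built clients.
pub struct ModelHttpClient {
    extra_root_certs: Vec<Certificate>,
    slots: Vec<OnceLock<InnerClient>>,
}

impl ModelHttpClient {
    /// Existing-style constructor: no extra roots.
    pub fn new() -> Self {
        Self::new_inner(Vec::new())
    }

    /// Opt-in constructor: the caller supplies additional trust anchors.
    pub fn new_with_extra_root_certs(extra_root_certs: Vec<Certificate>) -> Self {
        Self::new_inner(extra_root_certs)
    }

    /// Single private construction path shared by both constructors.
    fn new_inner(extra_root_certs: Vec<Certificate>) -> Self {
        // Four slots is an arbitrary size chosen for the sketch.
        let slots = (0..4).map(|_| OnceLock::new()).collect();
        Self { extra_root_certs, slots }
    }

    /// Build the client at `index` on first use; later calls reuse it.
    pub fn client(&self, index: usize) -> &InnerClient {
        self.slots[index].get_or_init(|| build_client(&self.extra_root_certs))
    }
}
```

In this model, `ModelHttpClient::new()` yields clients with no extra roots, while every client built from `new_with_extra_root_certs` sees the same supplied list, whichever slot is initialized first.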

Why

Building the gateway overhead bench in tensorzero/nanogateway#317. The bench puts the real gateway in the request path against an in-process upstream mock, over a transport as close to production as possible: HTTPS, so that ALPN negotiates HTTP/2 and the gateway's 1024×100 client waterfall (which is sized for h2 multiplexing) behaves as it does against Anthropic / OpenAI / Vertex.

That requires the bench-side mock to serve TLS with a self-signed cert (rcgen-generated at startup), which means the gateway-side reqwest::Client needs to trust that cert. TensorzeroHttpClient currently builds its inner clients with default TLS and exposes no knob to extend the trust store, so there's no way to do this from the outside.

The alternatives we considered and dropped:

  • http2_prior_knowledge() + plaintext h2c — works, but diverges from production's TLS+ALPN path. We'd be measuring a transport we don't actually run.
  • Install the cert in the system trust store before each bench run — intrusive, lingering side effects on the host.
  • Trim the bench to lower concurrency — sidesteps the issue but loses the 10K-concurrency numbers we want.

Trust-anchor injection is the smallest change that lets the bench match production transport exactly.

Scope and risk

  • Production callers are unaffected — they go through the unchanged new / new_with_proxy_env_var constructors and pass no extra certs.
  • The new constructor is opt-in: callers explicitly construct Vec<reqwest::Certificate> and pass it. Empty Vec is equivalent to current behavior.
  • No new dependency.

Tests

  • Existing tensorzero-http lib tests (test_read_timeout_fires_on_stalled_stream, test_concurrent_requests_helper) pass.
  • New test_new_with_extra_root_certs_constructs smoke test exercises both the empty-vec path and a real PEM cert (ISRG Root X1 as a public test fixture).
  • Actual cert-trust-on-the-wire is exercised end-to-end by the nanogateway bench (tensorzero/nanogateway#317).

running 3 tests
test tests::test_new_with_extra_root_certs_constructs ... ok
test tests::test_read_timeout_fires_on_stalled_stream ... ok
test tests::test_concurrent_requests_helper ... ok

Open questions

Worth thinking about in a follow-up, but out of scope here: TensorzeroHttpClient is largely unconfigurable from the outside. The 1024×100 client waterfall, the per-client pool sizing, and the absence of an http2_prior_knowledge() knob are not tunable today. That has been fine because the architecture assumes h2 upstreams and the defaults are sized for them. But as the gateway is pointed at more kinds of upstreams (vLLM, self-hosted TGI with higher max_concurrent_streams, plaintext sidecars), a more general options-struct refactor might be worthwhile. Not for this PR; just flagging.
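
For illustration only, an options struct along those lines might look like the sketch below. Everything in it is an assumption: `HttpClientOptions`, `bench_options`, the field set, and the default values are hypothetical and do not exist in tensorzero-http, and the certificates are kept as raw PEM bytes so the sketch needs only the standard library.

```rust
use std::borrow::Cow;
use std::time::Duration;

/// Hypothetical options struct; neither the name nor the fields exist in tensorzero-http.
#[derive(Clone, Debug)]
pub struct HttpClientOptions {
    pub global_outbound_http_timeout: Duration,
    pub global_outbound_http_intra_stream_read_timeout: Option<Duration>,
    pub proxy_env_var: Cow<'static, str>,
    /// PEM-encoded extra roots (stand-in for `Vec<reqwest::Certificate>`).
    pub extra_root_certs_pem: Vec<Vec<u8>>,
    /// Would map to `http2_prior_knowledge()` for plaintext h2c upstreams.
    pub http2_prior_knowledge: bool,
}

impl Default for HttpClientOptions {
    fn default() -> Self {
        Self {
            // Placeholder values, not the crate's real defaults.
            global_outbound_http_timeout: Duration::from_secs(300),
            global_outbound_http_intra_stream_read_timeout: None,
            proxy_env_var: Cow::Borrowed("HTTPS_PROXY"),
            extra_root_certs_pem: Vec::new(),
            http2_prior_knowledge: false,
        }
    }
}

/// Example override: trust one extra root and keep every other default.
pub fn bench_options(pem: Vec<u8>) -> HttpClientOptions {
    HttpClientOptions {
        extra_root_certs_pem: vec![pem],
        ..HttpClientOptions::default()
    }
}
```

The appeal of this shape is that new knobs become new fields with defaults rather than new constructors, so existing callers keep compiling when a knob is added.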

Test plan

  • cargo test -p tensorzero-http --lib
  • cargo clippy -p tensorzero-http --all-targets -- -D warnings
  • cargo fmt -p tensorzero-http --check

🤖 Generated with Claude Code


Note

Medium Risk
Adds an opt-in way to extend the TLS trust store for all lazily-built reqwest::Clients, which can affect outbound TLS verification behavior when used (though default callers remain unchanged). The change is small and covered by a construction smoke test.

Overview
Adds an opt-in TensorzeroHttpClient::new_with_extra_root_certs constructor that lets callers provide additional trusted TLS root certificates, threaded through the waterfall so every lazily-created reqwest::Client is built with add_root_certificate.

Refactors existing constructors to delegate to a new new_inner (defaulting to an empty cert list to preserve current behavior), updates build_client to accept/apply the extra certs, and adds a smoke test that verifies construction with both an empty list and a parsed PEM certificate.

Reviewed by Cursor Bugbot for commit 062a985.

Add a new constructor, `TensorzeroHttpClient::new_with_extra_root_certs`,
that threads a `Vec<reqwest::Certificate>` through to each lazily-built
inner `reqwest::Client` via `ClientBuilder::add_root_certificate`.
Existing constructors keep their behavior (no extra certs); the only
behavior change is the new entry point.

Motivation: benchmark and integration setups that need to talk to a
self-signed TLS upstream — e.g. a local h2-over-TLS mock used by the
nanogateway overhead bench — currently can't, because the inner reqwest
client builds with default TLS and there is no knob to extend its trust
store from the outside.

Hostname verification stays on; this only extends the trust chain.
Production code paths are unaffected: only callers that explicitly opt
in via the new constructor get the additional roots.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@AntoineToussaint AntoineToussaint marked this pull request as ready for review May 21, 2026 20:45

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Comment thread crates/tensorzero-http/src/lib.rs Outdated
Cursor bugbot flagged the `use` statements inside
`test_new_with_extra_root_certs_constructs`'s body — project Rust
style requires `use` at module scope only, not in function bodies
(per AGENTS.md).

Moved `std::borrow::Cow` and the parent-module items
(`DEFAULT_HTTP_CLIENT_TIMEOUT`, `DEFAULT_PROXY_ENV_VAR`,
`TensorzeroHttpClient`) up to the tests module's `use` block, and
dropped the `super::` qualifiers from the call sites now that the
items are directly in scope. No behavior change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@AntoineToussaint AntoineToussaint added this pull request to the merge queue May 22, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 22, 2026
2 participants