Releases: tensorzero/tensorzero

2026.6.0

04 Jun 15:15
Immutable release. Only release title and notes can be modified.
62eb8f6

Caution

Security Advisory

This release fixed a high-risk vulnerability affecting the TensorZero Gateway.

Please refer to the security advisory for more details: GHSA-824w-x939-6cmc

2026.5.2

20 May 13:48
c2a4c6f

New Features

  • Accept either a string or an array of strings for stop in the OpenAI-compatible inference endpoint (thanks @pragnyanramtha).
  • Emit additional OpenInference attributes for Arize compatibility.

2026.5.1

15 May 19:09
c79fb0a

Bug Fixes

  • Treat SSE body decoding errors as fatal.

2026.5.0

08 May 20:45
6e20b54

Caution

Breaking Changes

  • The UI now requires authentication whenever the gateway requires authentication. Previously, the UI only required authentication for gateway usage.

New Features

  • Improve error handling (e.g. status code propagation) and logging for complex streaming inferences (e.g. fallbacks).

& multiple under-the-hood and UI improvements (thanks @arisp)

2026.4.1

24 Apr 17:46
5a6ffa1

Caution

Breaking Changes

  • The gateway now defaults to async observability writes to reduce tail latency: inferences are sent to the client before they are persisted in the database. To restore the previous behavior, set observability.async_writes = false. [docs]
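
As a minimal sketch, opting out might look like this in the gateway configuration file. The release note only names the field as observability.async_writes; placing it under a [gateway] table is an assumption, so check the linked docs for the exact location.

```toml
# Assumption: observability settings live under the [gateway] table.
[gateway.observability]
# Restore the previous behavior described above: no asynchronous observability writes.
async_writes = false
```

Leaving the field unset keeps the new default, which favors lower tail latency over persisting each inference before the response is sent.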

Warning

Deprecations

  • Removed the TensorZero Autopilot "Sessions" page from the UI. We recently added a TensorZero MCP that integrates nicely with coding agents, and we'll re-introduce advanced TensorZero Autopilot workflows in a platform-agnostic format soon.

Bug Fixes

  • Return HTTP code 429 for rate limiting errors.
  • Fixed a bug affecting ClickHouse database names with hyphens. (thanks @ianliuy!)

New Features

  • Added TypeScript evaluators (for inference evaluations).
  • Added support for vLLM's new reasoning field.
  • Added aggregated variant usage data (tokens, cost, etc.) to the UI.
  • Added inference cost data to exported OpenTelemetry traces. (thanks @kimsehwan96!)
  • Added export.otlp.traces.include_content (default false) configuration field to include inference content (e.g. prompts, messages) in exported OpenTelemetry GenAI traces.
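
A hedged sketch of enabling content in exported traces is shown below. The field name and its default come from the note above; the [gateway] prefix and the enabled switch are assumptions about the surrounding configuration, not something this release note states.

```toml
# Assumption: OTLP export settings live under the [gateway] table.
[gateway.export.otlp.traces]
enabled = true           # assumed switch that turns on OTLP trace export
include_content = true   # defaults to false; adds prompts and messages to GenAI traces
```

Because prompts and messages can contain sensitive data, it is worth confirming that the trace backend is an appropriate place to store them before turning this on.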

& multiple under-the-hood and UI improvements

2026.4.0

02 Apr 17:00
e1b8b74

New Features

  • Add an MCP server to the gateway that exposes its API at /mcp.
  • Report provider prompt caching statistics via API and UI.
  • Report usage statistics (e.g. tokens, latency, cost) for inference evaluations via CLI tool, API, and UI.
  • Add the Prometheus metrics tensorzero_input_tokens_total and tensorzero_output_tokens_total.
  • Add configuration field content_type_overrides to handle file inputs for long-tail providers.

& multiple under-the-hood and UI improvements

2026.3.4

26 Mar 14:27
896a0f9

Warning

Planned Deprecations

  • The configuration for inference evaluations should be nested under the relevant functions moving forward [docs]. You can run evaluations by providing a function name and a list of evaluators. The legacy format will be removed in a future release.
    [functions.write_haiku.evaluators.exact_match]
    type = "exact_match"
    
  • The legacy implementation of GEPA (launch_optimization with GEPAConfig) will be removed in a future release. Please use t0.optimization.gepa.launch instead. [docs]

Bug Fixes

  • Fixed a UI bug where a custom gateway base_path was not handled correctly in certain routes. (thanks @wangfenjin!)

New Features

  • Started including embeddings requests in the Prometheus metrics tensorzero_requests_total and tensorzero_inferences_total.
  • Added the configuration field observability.batch_writes.write_queue_capacity to enable backpressure for observability data in the gateway.
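
For illustration, a bounded write queue might be configured as follows. Only the field name observability.batch_writes.write_queue_capacity comes from the note above; the [gateway] prefix, the enabled switch, and the capacity value are assumptions chosen for the example.

```toml
# Assumption: observability settings live under the [gateway] table.
[gateway.observability.batch_writes]
enabled = true                 # assumed switch for batched observability writes
write_queue_capacity = 10000   # illustrative value, not a recommended default
```

A smaller capacity applies backpressure sooner when the database falls behind, while a larger one absorbs longer bursts at the cost of memory.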

& multiple under-the-hood and UI improvements (thanks @majiayu000)!


Important

🆕 TensorZero Autopilot

TensorZero Autopilot is an automated AI engineer powered by TensorZero that analyzes LLM observability data, sets up evals, optimizes prompts and models, and runs A/B tests.

It dramatically improves the performance of LLM agents across diverse tasks:

[Figure: bar chart showing baseline vs. optimized scores across diverse LLM tasks]

2026.3.3

18 Mar 16:23
dfa8364

Bug Fixes

  • Fixed two edge cases affecting batch inference.
  • Fixed a UI bug affecting "Try with..." with inputs that include base64 files.
  • Removed assistant message prefill for JSON functions + Anthropic (deprecated by Anthropic).

New Features

  • Added an implementation of GEPA (automated prompt engineering) based on durable workflows.
  • Allow users to specify duplicate tool calls in all_of tool evaluators to evaluate parallel tool calling.
  • Allow users to specify an expiration date for API keys in the UI. (thanks @eibrahim95)
  • Allow users to specify object_storage.endpoint = "env::MY_ENV_VAR" in addition to static values. (thanks @Meredith2328)
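
A sketch of the environment-variable form of the endpoint setting is shown below. The endpoint line is taken from the note above; the type and bucket_name fields are assumptions about the rest of the object_storage table, and the bucket name is hypothetical.

```toml
[object_storage]
type = "s3_compatible"         # assumed backend type
bucket_name = "my-bucket"      # hypothetical bucket name
endpoint = "env::MY_ENV_VAR"   # read from the MY_ENV_VAR environment variable
```

This keeps environment-specific endpoints out of the configuration file, so the same file can be shared across deployments.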

& multiple under-the-hood and UI improvements (thanks @majiayu000)!

2026.3.2

13 Mar 16:09
1893a22

Bug Fixes

  • Fixed a UI issue that prevented certain pages from rendering when they depended on historical configuration.

New Features

  • Added Postgres as an alternative observability backend to ClickHouse. Postgres is the simplest way to get started; we recommend ClickHouse if you're handling >100 RPS.
  • Added the openrouter::xxx short-hand for embedding models.
  • Added support for per-session API keys in the browser (instead of a global environment variable) when auth is enabled.

& multiple under-the-hood and UI improvements!

2026.3.1

05 Mar 22:13
7c39a1a

Warning

Completed Deprecations

  • Removed the deprecated model_provider_name filter for extra_body and extra_headers. Please use model_name and provider_name instead.
  • Removed the legacy experimental list_inferences endpoint and method. Please use the new endpoint instead. [docs]
  • Removed several long-deprecated types and methods from the TensorZero Python SDK.

Warning

Planned Deprecations

  • The embedded gateway in the TensorZero Python SDK will be removed in a future release (2026.6+). patch_openai_client and build_embedded are deprecated. Please deploy a standalone TensorZero Gateway instead (usage: base_url for OpenAI SDK; build_http for TensorZero SDK).
  • The variant configuration field weight will be removed in a future release (2026.6+). Please use the new experimentation configuration semantics. [docs]

Bug Fixes

  • Fixed a compatibility bug in Valkey-based caching that only affected deployments using Redis.

New Features

  • Added support for launching optimization workflows with dataset_name (instead of an inference query) in launch_optimization_workflow.

& multiple under-the-hood and UI improvements!