Releases: tensorzero/tensorzero
2026.6.0
Caution
Security Advisory
This release fixed a high-risk vulnerability affecting the TensorZero Gateway.
Please refer to the security advisory for more details: GHSA-824w-x939-6cmc
2026.5.2
New Features
- Accept both strings and array of strings for
stopin the OpenAI-compatible inference endpoint (thanks @pragnyanramtha). - Emit additional OpenInference attributes for Arize compatibility.
2026.5.1
Bug Fixes
- Treat SSE body decoding errors as fatal.
2026.5.0
Caution
Breaking Changes
- The UI will now require authentication when the gateway requires authentication. Previously, the UI only required authentication for gateway usage.
New Features
- Improve error handling (e.g. status code propagation) and logging for complex streaming inferences (e.g. fallbacks).
& multiple under-the-hood and UI improvements (thanks @arisp)
2026.4.1
Caution
Breaking Changes
- The gateway now defaults to async observability writes to reduce tail latency: inferences are sent to the client before they are persisted in the database. To restore the previous behavior, set
observability.async_writes = false. [docs]
Warning
Deprecations
- Removed the TensorZero Autopilot "Sessions" page from the UI. We recently added a TensorZero MCP that integrates nicely with coding agents, and we'll re-introduce advanced TensorZero Autopilot workflows in a platform-agnostic format soon.
Bug Fixes
- Return HTTP code 429 for rate limiting errors.
- Fixed a bug affecting ClickHouse database names with hyphens. (thanks @ianliuy!)
New Features
- Added TypeScript evaluators (for inference evaluations).
- Added support for vLLM's new
reasoningfield. - Added aggregated variant usage data (tokens, cost, etc.) to the UI.
- Added inference cost data to exported OpenTelemetry traces. (thanks @kimsehwan96!)
- Added
export.otlp.traces.include_content(default false) configuration field to include inference content (e.g. prompts, messages) in exported OpenTelemetry GenAI traces.
& multiple under-the-hood and UI improvements
2026.4.0
New Features
- Add an MCP server to the gateway exposing its API in
/mcp. - Report provider prompt caching statistics via API and UI.
- Report usage statistics (e.g. tokens, latency, cost) for inference evaluations via CLI tool, API, and UI.
- Add the Prometheus metrics
tensorzero_input_tokens_totalandtensorzero_output_tokens_total. - Add configuration field
content_type_overridesto handle file inputs for long-tail providers.
& multiple under-the-hood and UI improvements
2026.3.4
Warning
Planned Deprecations
- The configuration for inference evaluations should be nested under the relevant functions moving forward [docs]. You can run evaluations by providing a function name and a list of evaluators. The legacy format will be removed in a future release.
[functions.write_haiku.evaluators.exact_match] type = "exact_match" - The legacy implementation of GEPA (
launch_optimizationwithGEPAConfig) will be removed in a future release. Please uset0.optimization.gepa.launchinstead. [docs]
Bug Fixes
- Fixed a UI bug where a custom gateway
base_pathwas not handled correctly in certain routes. (thanks @wangfenjin!)
New Features
- Started including embeddings requests in the Prometheus metrics
tensorzero_requests_totalandtensorzero_inferences_total. - Added the configuration field
observability.batch_writes.write_queue_capacityto enable backpressure for observability data in the gateway.
& multiple under-the-hood and UI improvements (thanks @majiayu000)!
2026.3.3
Bug Fixes
- Fixed two edge cases affecting batch inference.
- Fixed a UI bug affecting "Try with..." with inputs that include base64 files.
- Removed assistant message prefill for JSON functions + Anthropic (deprecated by Anthropic).
New Features
- Added an implementation of GEPA (automated prompt engineering) based on durable workflows.
- Allow users to specify duplicate tool calls in
all_oftool evaluators to evaluate parallel tool calling. - Allow users to specify an expiration date for API keys in the UI. (thanks @eibrahim95)
- Allow users to specify
object_storage.endpoint = "env::MY_ENV_VAR"in addition to static values. (thanks @Meredith2328)
& multiple under-the-hood and UI improvements (thanks @majiayu000)!
2026.3.2
Bug Fixes
- Fixed an UI issue that prevented certain pages from rendering when depending on historical configuration.
New Features
- Added Postgres as an alternative observability backend to ClickHouse. Postgres is the simplest way to get started; we recommend ClickHouse if you're handling >100 RPS.
- Added the
openrouter::xxxshort-hand for embedding models. - Added support for per-session API keys in the browser (instead of a global environment variable) when auth is enabled.
& multiple under-the-hood and UI improvements!
2026.3.1
Warning
Completed Deprecations
- Removed the deprecated
model_provider_namefilter forextra_bodyandextra_headers. Please usemodel_nameandprovider_nameinstead. - Removed the legacy experimental
list_inferencesendpoint and method. Please use the new endpoint instead. [docs] - Removed several long-deprecated types and methods from the TensorZero Python SDK.
Warning
Planned Deprecations
- The embedded gateway in the TensorZero Python SDK will be removed in a future release (2026.6+).
patch_openai_clientandbuild_embeddedare deprecated. Please deploy a standalone TensorZero Gateway instead (usage:base_urlfor OpenAI SDK;build_httpfor TensorZero SDK). - The variant configuration field
weightwill be removed in a future release (2026.6+). Please use the new experimentation configuration semantics. [docs]
Bug Fixes
- Fixed a compatibility bug with Valkey-based caching that only affected Redis.
New Features
- Added support for launching optimization workflows with
dataset_name(instead of an inference query) inlaunch_optimization_workflow.
& multiple under-the-hood and UI improvements!
