Force exact_match cutoff to always fail in e2e test by Aaron1011 · Pull Request #7522 · tensorzero/tensorzero

Aaron1011 · 2026-05-28T16:16:10Z

Note

Low Risk
Test-only change in evaluations e2e; no production evaluation or cutoff logic is modified.

Overview
Stabilizes the run_llm_judge_evaluation_json_pretty e2e test by raising the exact_match cutoff from 0.6 to 1.01 and documenting that the threshold is intentionally above 1.0 so cutoff failure is guaranteed despite live-model score variance.

The error assertion is updated to expect the message fragment "exact_match (cutoff: 1.01, got:" instead of the previous 0.60 cutoff. The behavior under test (pretty output and the failed-cutoffs error path) is unchanged; only the flaky threshold is fixed.
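A minimal sketch of why a cutoff of 1.01 can never be met, assuming the exact_match mean score always lies in [0.0, 1.0]. The function name `check_cutoff` and everything after "got:" in the message are hypothetical illustrations, not TensorZero's actual API or output; only the "exact_match (cutoff: 1.01, got:" prefix comes from the description above.

```rust
/// Hypothetical helper (not TensorZero's real API): returns an error message
/// when the mean score is below the configured cutoff.
fn check_cutoff(evaluator: &str, mean_score: f64, cutoff: f64) -> Result<(), String> {
    if mean_score < cutoff {
        // The prefix mirrors the fragment asserted in the e2e test; the
        // formatting of the score after "got:" is an assumption.
        Err(format!(
            "{} (cutoff: {:.2}, got: {:.2})",
            evaluator, cutoff, mean_score
        ))
    } else {
        Ok(())
    }
}

/// Assumption: a mean of per-datapoint 0/1 results is bounded by 1.0, so any
/// cutoff strictly above 1.0 fails regardless of what the live model returns.
fn cutoff_always_fails(cutoff: f64) -> bool {
    cutoff > 1.0
}
```

With the old cutoff of 0.6, a run whose mean happened to reach 0.6 or higher would pass and the expected error would not appear, which is the flakiness this change removes.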

Reviewed by Cursor Bugbot for commit fd2e853.

Force exact_match cutoff to always fail in e2e test

fd2e853

Aaron1011 assigned AntoineToussaint May 28, 2026

Aaron1011 enabled auto-merge May 28, 2026 16:56

Aaron1011 wants to merge 1 commit into main from aaron/exact-match
