Force exact_match cutoff to always fail in e2e test #7522

Open
Aaron1011 wants to merge 1 commit into main from aaron/exact-match

Aaron1011 (Member) commented May 28, 2026

Note

Low Risk
Test-only change in evaluations e2e; no production evaluation or cutoff logic is modified.

Overview
Stabilizes the run_llm_judge_evaluation_json_pretty e2e test by raising the exact_match cutoff from 0.6 to 1.01 and documenting that the threshold is intentionally above 1.0 so cutoff failure is guaranteed despite live-model score variance.

The error assertion is updated to expect the message prefix "exact_match (cutoff: 1.01, got:" in place of the earlier prefix that used a cutoff of 0.60. The behavior under test (pretty output and the failed-cutoffs error path) is unchanged; only the flaky threshold is fixed.
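The reasoning behind the 1.01 threshold can be illustrated with a minimal sketch. It is not the project's implementation: `failed_cutoffs` and `format_failures` are hypothetical helpers, and the sketch assumes that the exact_match score is a mean in the range 0 to 1 where higher is better, and that a cutoff fails when the score is below it. Only the message prefix "exact_match (cutoff: 1.01, got:" comes from the description above.

```python
def failed_cutoffs(mean_scores, cutoffs):
    """Return (name, cutoff, score) for every evaluator scoring below its cutoff.

    Hypothetical helper: assumes higher scores are better, so a cutoff
    fails when the mean score is strictly less than the cutoff.
    """
    return [
        (name, cutoff, mean_scores[name])
        for name, cutoff in cutoffs.items()
        if mean_scores[name] < cutoff
    ]


def format_failures(failures):
    """Render failures using the message shape quoted in the PR description."""
    return ", ".join(
        f"{name} (cutoff: {cutoff:.2f}, got: {score:.2f})"
        for name, cutoff, score in failures
    )


# With a cutoff of 0.6, a live model that happens to score 0.7 passes,
# so a test that expects a cutoff failure becomes flaky.
flaky = failed_cutoffs({"exact_match": 0.7}, {"exact_match": 0.6})

# With a cutoff of 1.01, even a perfect score of 1.0 is below the cutoff,
# so the failure path is taken on every run.
forced = failed_cutoffs({"exact_match": 1.0}, {"exact_match": 1.01})
message = format_failures(forced)
```

Because no score in the assumed range can reach 1.01, the test can assert on the cutoff portion of the message while leaving the value after "got:" unconstrained.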

Reviewed by Cursor Bugbot for commit fd2e853.

