Add YC Bench baseline rollout data for can-ai-agents-improve-ai-agents starter by anndvision · Pull Request #1 · anndvision/data

anndvision · 2026-04-27T02:05:18Z

Summary

Hosts the baseline rollout traces consumed by the can-ai-agents-improve-ai-agents starter, the companion code for the TensorZero blog post "Can AI Agents Improve AI Agents?".

The starter ships in tensorzero/tensorzero, but the inference traces (98 MiB) are too large to live there: that repo's pre-commit check blocks files over 1 MiB, and the repo doesn't use Git LFS. The starter's baseline_data/fetch.sh downloads these JSONL files from this repo's LFS-backed media URLs.

Both files are tracked through Git LFS: a *.jsonl filter=lfs ... rule is added to .gitattributes, next to the existing *.csv LFS rule.

Files

| File | Rows | Size | SHA-256 |
| --- | --- | --- | --- |
| `can-ai-agents-improve-ai-agents/baseline_data/inferences.jsonl` | 1,380 | 98 MiB | `9bac777bcedd790146ed082252ad77d41496f19f1beaecc8acac12cffe55d176` |
| `can-ai-agents-improve-ai-agents/baseline_data/feedback.jsonl` | 320 | 36 KiB | `e59685147aea4679d6e617e39e12cd8474e052649f0eb48ccca9f6b2a6fe319d` |
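
A local copy of the two files can be checked against the row counts and hashes listed above. The sketch below is one possible way to do that, not part of this PR: it assumes "Rows" means one JSON object per non-blank line, and the names `sha256_of`, `count_rows`, and `check` are illustrative.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_rows(path: Path) -> int:
    """Count non-blank lines, parsing each as JSON to catch truncated rows."""
    rows = 0
    with path.open("r", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                json.loads(line)  # raises ValueError on a malformed row
                rows += 1
    return rows


# Expected (rows, sha256) pairs copied from the Files table.
EXPECTED = {
    "inferences.jsonl": (
        1380,
        "9bac777bcedd790146ed082252ad77d41496f19f1beaecc8acac12cffe55d176",
    ),
    "feedback.jsonl": (
        320,
        "e59685147aea4679d6e617e39e12cd8474e052649f0eb48ccca9f6b2a6fe319d",
    ),
}


def check(directory: Path) -> dict[str, bool]:
    """Map each file name to True when both its row count and hash match."""
    results = {}
    for name, (rows, digest) in EXPECTED.items():
        path = directory / name
        results[name] = (
            path.is_file()
            and count_rows(path) == rows
            and sha256_of(path) == digest
        )
    return results
```

Calling `check(Path("baseline_data"))` after a download returns a per-file pass/fail map. A `False` entry means the file is missing or differs from the published artifact.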

Provenance

Real baseline rollout of yc_bench_tutorial_v0::yc_bench_act against the initial variant on openai::gpt-5.4-mini: 80 unique train tasks + 20 unique test tasks (Codex YC Bench seed 0, 2026-04-23). Matches the artifacts the autopilot-evals harness dumps to <run_dir>/claude_code/baseline_data/ before invoking the optimizer agent.

Raw URLs (post-merge)

https://media.githubusercontent.com/media/anndvision/data/main/can-ai-agents-improve-ai-agents/baseline_data/inferences.jsonl
https://media.githubusercontent.com/media/anndvision/data/main/can-ai-agents-improve-ai-agents/baseline_data/feedback.jsonl

(LFS-backed files use media.githubusercontent.com/media/... to fetch actual content rather than the LFS pointer that raw.githubusercontent.com returns.)
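
The difference between the two hosts can be made concrete with a small sketch. The helper names `media_url` and `looks_like_lfs_pointer` are hypothetical. The pointer check relies on the Git LFS pointer format, whose first line is `version https://git-lfs.github.com/spec/v1`.

```python
# First line of every Git LFS pointer file, per the Git LFS specification.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def media_url(owner: str, repo: str, ref: str, path: str) -> str:
    """Build the media.githubusercontent.com URL for an LFS-tracked file."""
    return f"https://media.githubusercontent.com/media/{owner}/{repo}/{ref}/{path}"


def looks_like_lfs_pointer(head: bytes) -> bool:
    """True when the first bytes of a download are an LFS pointer, not content."""
    return head.startswith(LFS_POINTER_PREFIX)


url = media_url(
    "anndvision",
    "data",
    "main",
    "can-ai-agents-improve-ai-agents/baseline_data/feedback.jsonl",
)
```

A downloader could run `looks_like_lfs_pointer` on the first bytes it receives to spot a fetch that accidentally went through raw.githubusercontent.com.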

Test plan

Both URLs resolve post-merge and serve the correct bytes (SHA-256 matches the table above).
Starter's bash baseline_data/fetch.sh downloads + verifies both files.
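
The download-and-verify step in the test plan could look roughly like the following. This is a Python sketch of the idea only. It is not the contents of fetch.sh, which is a bash script that lives in the starter, and `fetch_and_verify` is a hypothetical name.

```python
import hashlib
import urllib.request
from pathlib import Path


def fetch_and_verify(url: str, dest: Path, expected_sha256: str) -> None:
    """Stream url to dest, hashing as it goes; delete dest on a mismatch."""
    digest = hashlib.sha256()
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, dest.open("wb") as out:
        # Read 1 MiB at a time so the 98 MiB file never sits fully in memory.
        for chunk in iter(lambda: response.read(1 << 20), b""):
            digest.update(chunk)
            out.write(chunk)
    actual = digest.hexdigest()
    if actual != expected_sha256:
        dest.unlink()  # do not leave a corrupt or partial file behind
        raise ValueError(f"SHA-256 mismatch for {dest.name}: got {actual}")
```

Hashing while streaming avoids a second pass over the file. Removing the file on a mismatch means a later run cannot mistake a bad download for a valid one.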

Commit ab7f4c1: Add YC Bench baseline rollout data for can-ai-agents-improve-ai-agents starter

anndvision merged commit f692405 into main Apr 27, 2026

anndvision deleted the add-can-ai-agents-baseline-data branch April 27, 2026 02:06

anndvision mentioned this pull request Apr 28, 2026

examples/blog: can-ai-agents-improve-ai-agents starter tensorzero/tensorzero#7404

Open

4 tasks

anndvision merged 1 commit into main from add-can-ai-agents-baseline-data
