Executable-code benchmark and verifier-guided post-training sandbox in the L20 project family.
code-generation reproducibility verifier post-training l20 qwen llm-evaluation coding-agents rlvr livecodebench evalplus
Updated Jun 29, 2026 - Python