- afterquery (yc w25) - building agent benchmarks + training a reward model on 10k+ preference annotations. shipped a data pipeline behind a BIG(lol) deal. ppo, rlhf, the works.
- cimez - pdf intelligence for pe due diligence. 4 firms use it, ~100 cims/month. faiss + rag + a lot of pymupdf suffering.
- query-adaptive token allocation - tiny mlp/decision-tree controller that picks input + output budgets per query for black-box llm apis. cutting token spend ~30% at <2% quality loss so far.
- litellm pr #25208 - caught a silent data-loss bug in their responses-api translation layer that was eating assistant text when tool calls were attached. multi-turn agents were quietly broken. wrote round-trip tests, sent the fix. 42k-star repo, used by 100k+ devs.
- distributed parameter server (c++ / grpc / protobuf) - multi-worker training coordination with sync gradient agg, version control, fault recovery. 3.8× throughput over single-node. wrote the ml engine from scratch: manual forward/backward, softmax, cross-entropy, no torch.
- splashbi - multi-agent nlp insight engine over enterprise kpis. role-aware retrieval, feedback loop, +28% relevance from a/b tests.
- princeton (puchalla lab) - built a $500 tirf microscope. commercial ones cost $50k. currently writing the 3d reconstruction software.
- gtri - 94.7% accurate cf prediction model + clinician gui for biomarker analysis.
- techease - nonprofit i started, ~60k seniors served via ymca + county partnerships.
- ivyacademy - $154k+ college consulting startup; piloting a free tier for kids who can't pay.
python c/c++ go ts pytorch langchain faiss fastapi docker k8s grpc postgres aws/gcp - the usual suspects.
- coca-cola scholar (150 / 110k+) · regeneron isef finalist · 36 act · amc12b top 1% · y combinator startup school '26
