Comparing changes

New approach: instead of compressing source model info into a single virtual token (rosetta), distribute the signal as additive logit bias during target generation. Source model's vocabulary distribution is mapped through vocab overlap to target vocabulary.

Implementation:
- New: rosetta/logit_guided.py - CrossModelLogitBias processor + bias computation
- Modified: huggingface.py - cross_model_method="logit_guided" option in generate()
- Modified: easy.py - pass through cross_model_method and logit_bias_alpha
- New: pipeline_logit_guided.py - benchmark pipeline for GSM8K 2-agent
- Modified: run_gsm8k_2agent.py - --mode logit_guided support
- Modified: shared/generation.py - logits_processor kwarg support
- New: test_logit_guided.py - 11 unit tests (bias shape, zero-mean, gating, scaling)

Key features:
- Confidence gating: skip bias when target is already confident (>0.8 max prob)
- Zero-mean bias: doesn't shift distribution center, only nudges relative prefs
- Alpha scaling: default 0.5 (conservative for cross-vocab mapping)
- Falls back gracefully for RIDGE/PROCRUSTES (no token-level mapping)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
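A minimal sketch of the bias computation described above, in plain Python rather than the actual CrossModelLogitBias processor. The function names, the `overlap` dict (source token id to target token id), and the choice to take the zero mean over the whole target vocabulary are assumptions made for illustration; only the 0.8 gate and the 0.5 alpha default come from the commit message.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def compute_bias(source_probs, overlap, target_vocab_size, alpha=0.5):
    """Map source probabilities onto target ids, then zero-mean and scale.

    `overlap` is a hypothetical {source_id: target_id} map of shared tokens.
    """
    raw = [0.0] * target_vocab_size
    for src_id, tgt_id in overlap.items():
        raw[tgt_id] += source_probs[src_id]
    mean = sum(raw) / target_vocab_size
    # Subtracting the mean makes the bias sum to zero, so it only changes
    # relative preferences between tokens.
    return [alpha * (r - mean) for r in raw]


def apply_bias(target_logits, bias, gate_threshold=0.8):
    """Add the bias unless the target is already confident."""
    if max(softmax(target_logits)) > gate_threshold:
        return list(target_logits)  # confidence gate: leave logits untouched
    return [t + b for t, b in zip(target_logits, bias)]
```

With a small `alpha` the bias nudges the ranking of tokens the source model favours without overriding a target model that is already sure of its next token.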

Smart routing: Enhanced quality gate with task-type classification (math/code vs comprehension) using lexical features. Zero latency overhead. Backward compatible with existing assess_transfer() API.

Mid-layer injection: Inject projected hidden states at ~75% depth via forward hook instead of layer-0 KV-cache priming. Based on Ramesh & Li (2501.14082) cross-model injection research.

Both features available as cross_model_method options in HuggingFaceConnector.generate() and as benchmark pipeline modes. 47 new tests (31 smart routing + 16 mid-layer), all passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
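The two ideas above can be sketched roughly as follows. The regex features, the label strings, and both function names are hypothetical; the commit does not list the actual lexical features or how the ~75% depth is rounded to a layer index.

```python
import re

# Hypothetical lexical cues for math/code prompts; the real feature set is
# not described in the commit message.
MATH_CODE_PATTERN = re.compile(
    r"\d+\s*[-+*/=]\s*\d+|\bdef\b|\breturn\b|\bsolve\b|\bhow many\b",
    re.IGNORECASE,
)


def classify_task(prompt):
    """Return 'math_code' or 'comprehension' from surface features only."""
    return "math_code" if MATH_CODE_PATTERN.search(prompt) else "comprehension"


def injection_layer(num_layers, depth_fraction=0.75):
    """Pick the layer index at roughly `depth_fraction` of the stack."""
    return min(num_layers - 1, int(num_layers * depth_fraction))
```

Because the classifier is a single regex search over the prompt, it adds no model calls, which is consistent with the "zero latency overhead" claim.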

Per-layer linear projections with learned sigmoid gates for cross-model latent transfer. Both source and target models frozen; only the lightweight projector trains. Inference via per-layer forward hooks that additively inject projected hidden states during prefill.

New files:
- rosetta/train.py: LayerProjector, TrainConfig, train_projector()
- rosetta/trained_hooks.py: trained_multi_layer_hook context manager
- pipeline_trained.py: GSM8K benchmark pipeline for trained mode
- test_trained_projector.py: 19 tests (projector, hooks, registry, enum)

Modified:
- types.py: ProjectionMethod.TRAINED enum
- calibrate.py: layer_weights/biases/gates fields on AVPMap
- registry.py: save/load trained projection fields
- huggingface.py: cross_model_method="trained" branch + _prepare_trained_injection()
- run_gsm8k_2agent.py: "trained" mode with inline training

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
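For a single layer and a single position, the gated additive injection described above amounts to something like the sketch below. It uses plain lists instead of tensors, and `project` and `inject` are illustrative names, not the LayerProjector API.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def project(source_hidden, weight, bias):
    """Linear map from source width to target width.

    `weight` is a list of rows, one row per target dimension.
    """
    return [
        sum(w * h for w, h in zip(row, source_hidden)) + b
        for row, b in zip(weight, bias)
    ]


def inject(target_hidden, source_hidden, weight, bias, gate_logit):
    """Add the gated projection of the source state to the target state."""
    gate = sigmoid(gate_logit)  # learned scalar gate for this layer
    projected = project(source_hidden, weight, bias)
    return [t + gate * p for t, p in zip(target_hidden, projected)]
```

A strongly negative gate logit leaves the target hidden state almost unchanged, so a layer can effectively opt out of the injection.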

The forward() method was calling .item() on sigmoid gates, detaching them from the computation graph. This meant gate logits only received gradients from L1 regularization, not from the MSE loss, so gates couldn't learn which layers are important from the training signal.

Fix: add return_gate_tensors parameter. Training uses True (tensor gates for gradient flow), inference uses False (float gates for speed).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Was passing ModelIdentity object instead of config dict. Now uses model.config.to_dict() like all other call sites.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Research shows MSE-only training optimizes geometric alignment but not downstream generation quality. This adds cross-entropy (NTP) loss through the hooked target model as the primary loss, with MSE as auxiliary (0.1 weight).

Also fixes MSE to use unhooked reference hidden states (avoiding circular reference), lowers gate_init to -5.0 for less initial corruption.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
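The loss combination above can be written out for a single token as a small worked example. The function names and the argument `projected_hidden` are assumptions; the commit states only that cross-entropy is primary, MSE is weighted 0.1, and the MSE target comes from an unhooked reference pass.

```python
import math


def cross_entropy(logits, target_id):
    """Next-token cross-entropy for one position (log-sum-exp form)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target_id]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def total_loss(hooked_logits, target_id, projected_hidden, reference_hidden,
               mse_weight=0.1):
    """Primary NTP loss from the hooked pass plus a weighted auxiliary MSE.

    `reference_hidden` stands for hidden states from the unhooked pass.
    """
    ntp = cross_entropy(hooked_logits, target_id)
    aux = mse(projected_hidden, reference_hidden)
    return ntp + mse_weight * aux
```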

The hooked forward pass only needs logits for NTP loss. Hidden states for MSE auxiliary come from the separate unhooked reference pass. Setting output_hidden_states=False saves activation memory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…nfig

Exp 3 (NTP loss) failed due to cold-start gate collapse: gate_init=-5.0 combined with L1 regularization pushed all 28 gates to zero. Now exposing these hyperparameters so experiments can test warm-gate NTP configurations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…=False)

4 experiments showed MSE-only with gate_init=-3.0 matches NTP loss at 76% GSM8K cross-family accuracy (+6pp over rosetta) while requiring half the training compute. NTP with cold gates (-5.0) causes gate collapse to 0/28.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Extract top-K tokens by attention weight from source model's forward pass, decode to text, re-tokenize on target side, prepend as embeddings before the projected latent vector. Controlled by hybrid_k parameter (0=disabled).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
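The selection step above might look like the following sketch, assuming a per-token attention score has already been computed. `top_k_token_ids` is a hypothetical helper, and keeping the selected tokens in their original order is a design assumption, not something the commit specifies.

```python
def top_k_token_ids(token_ids, attention_scores, k):
    """Return the k token ids with the highest scores, in source order.

    k <= 0 disables extraction, mirroring hybrid_k=0.
    """
    if k <= 0:
        return []
    ranked = sorted(
        range(len(token_ids)),
        key=lambda i: attention_scores[i],
        reverse=True,
    )[:k]
    # Restore source order so the decoded text stays readable.
    return [token_ids[i] for i in sorted(ranked)]
```

The returned ids would then be decoded with the source tokenizer and re-tokenized on the target side, as the commit describes.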

…beds

inputs_embeds injection had zero effect (model ignores raw embeddings). Now inject key tokens as "Key Context" in the answerer's prompt via input_ids, so they are processed through the normal embedding + positional encoding path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

SDPA silently ignores output_attentions=True, so attention weights were never returned. key_text was always None, meaning hybrid mode was effectively running pure rosetta. Temporarily switch to eager attention for the dummy forward pass when hybrid_k > 0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

SDPA attention silently ignores output_attentions=True, so all hybrid experiment runs had no attention weights and key_text was always None. The previous runtime _attn_implementation override didn't work because HuggingFace selects the attention module class at from_pretrained time.

Fix: pass attn_implementation="eager" to from_pretrained when hybrid_k > 0. Remove the broken runtime override from pipeline_rosetta.py.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
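One way to express the load-time fix is a small helper that builds the keyword arguments for from_pretrained. `model_load_kwargs` is a hypothetical name; `attn_implementation` is the keyword named in the commit.

```python
def model_load_kwargs(hybrid_k):
    """Build from_pretrained kwargs; force eager attention for hybrid mode."""
    kwargs = {}
    if hybrid_k > 0:
        # Eager attention returns attention weights when
        # output_attentions=True, which hybrid extraction relies on.
        kwargs["attn_implementation"] = "eager"
    return kwargs
```

Usage would be along the lines of `AutoModelForCausalLM.from_pretrained(model_name, **model_load_kwargs(hybrid_k))`, so non-hybrid runs keep the default attention implementation.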

Attention-based extraction was picking up system prompt and instruction tokens instead of paragraph content. Added find_content_start() to locate the "## Paragraphs:" marker and zero out template token scores.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
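A rough sketch of the masking described above, assuming each token comes with (start, end) character offsets into the prompt. The signature of find_content_start() shown here is a guess, and `mask_template_scores` is a hypothetical helper; only the function name and the "## Paragraphs:" marker come from the commit.

```python
def find_content_start(text, offsets, marker="## Paragraphs:"):
    """Return the index of the first token that starts after the marker.

    `offsets` is a list of (start, end) character spans, one per token.
    Returns 0 (mask nothing) if the marker is absent.
    """
    pos = text.find(marker)
    if pos < 0:
        return 0
    marker_end = pos + len(marker)
    for i, (start, _end) in enumerate(offsets):
        if start >= marker_end:
            return i
    return len(offsets)


def mask_template_scores(scores, content_start):
    """Zero the scores of all tokens before the content start."""
    return [0.0 if i < content_start else s for i, s in enumerate(scores)]
```

With template scores zeroed, a top-K selection over the remaining scores can only pick tokens from the paragraph content.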


Commits on Mar 14, 2026
