Tags: VectorArc/avp-python
Fix transformers 5.4 compat: remove cache_position from generate()

transformers 5.4.0 validates model_kwargs and rejects cache_position in model.generate() for models whose prepare_inputs_for_generation doesn't return it (e.g., GPT2). In transformers >=5.0, generate() manages cache positions internally when past_key_values is provided.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
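A minimal sketch of the kind of change this commit describes, assuming the caller builds a dict of keyword arguments before calling generate(). The helper name `strip_cache_position` is hypothetical and is not taken from the avp-python source; it only shows how to drop the rejected key without mutating the caller's dict.

```python
from typing import Any, Dict


def strip_cache_position(generate_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the kwargs without the 'cache_position' key.

    Hypothetical helper (an assumption, not the project's actual code):
    per the commit message, newer transformers releases reject this key
    for some models and track cache positions internally instead.
    """
    return {k: v for k, v in generate_kwargs.items() if k != "cache_position"}


# Example: the key is removed, everything else is passed through unchanged.
kwargs = {"max_new_tokens": 4, "cache_position": [0, 1, 2]}
cleaned = strip_cache_position(kwargs)
```

The cleaned dict could then be unpacked into the generate() call in place of the original kwargs.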
Release v0.4.1

API stability release. 33 issues found and fixed across Easy API, wire format, connector ABC, and type system. Result objects, CRC32 checksum, simplified connector ABC. 500 tests pass, cloud validated on A100.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
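The release notes mention a CRC32 checksum on the wire format without giving its layout. The sketch below illustrates the general technique only: it assumes a 4-byte little-endian CRC32 trailer appended to the payload, which is an illustrative choice and not the actual AVP framing. The function names are hypothetical.

```python
import struct
import zlib


def append_crc32(payload: bytes) -> bytes:
    """Append a 4-byte little-endian CRC32 trailer (assumed layout)."""
    checksum = zlib.crc32(payload) & 0xFFFFFFFF
    return payload + struct.pack("<I", checksum)


def verify_crc32(frame: bytes) -> bytes:
    """Check the trailer and return the payload, or raise ValueError."""
    if len(frame) < 4:
        raise ValueError("frame too short to contain a CRC32 trailer")
    payload, trailer = frame[:-4], frame[-4:]
    (expected,) = struct.unpack("<I", trailer)
    if zlib.crc32(payload) & 0xFFFFFFFF != expected:
        raise ValueError("CRC32 mismatch: payload corrupted in transit")
    return payload


# Round trip: the payload comes back unchanged when the frame is intact.
frame = append_crc32(b"latent-bytes")
payload = verify_crc32(frame)
```

A CRC32 detects accidental corruption; it is not a cryptographic integrity check.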
Add CHANGELOG entry for v0.4.0

168 commits since v0.3.2. Major changes: 4 new engine connectors, 3 framework integrations, torch made optional (numpy projection), deprecated API removed, dead code cleaned, docs rewritten.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Release v0.3.1: fix protobuf compatibility for Colab/older environments

Remove gencode version check from avp_pb2.py that required protobuf >=6.31.1 at runtime. Now works with protobuf >=4.21 as declared in dependencies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mark vLLM integration as experimental with runtime warnings

KV connector plugin (AVPKVConnectorV1Dynamic) has known issues with PagedAttention format conversion, CUDA graph compatibility, and concurrent request isolation. Not validated end-to-end with real vLLM. VLLMConnector text generation works; latent transfer does not.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
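One common way to flag a feature as experimental at runtime is Python's standard `warnings` module. The sketch below shows that pattern only; the `ExperimentalWarning` class and `warn_experimental` function are hypothetical names, and the actual warning text and category used by avp-python are not given in the commit message.

```python
import warnings


class ExperimentalWarning(UserWarning):
    """Hypothetical warning category for features not validated end-to-end."""


def warn_experimental(feature: str) -> None:
    """Emit a runtime warning that the named feature is experimental."""
    warnings.warn(
        f"{feature} is experimental and may not work as expected.",
        ExperimentalWarning,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )


# Example: a connector constructor could call this once on creation.
warn_experimental("vLLM KV connector latent transfer")
```

Using a dedicated category lets users silence or escalate these warnings with `warnings.filterwarnings` without affecting other warnings.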