feat: add LiteLLM as unified LLM provider #339

Open
RheagalFire wants to merge 1 commit into InternLM:main from RheagalFire:feat/add-litellm-provider

@RheagalFire

Summary

  • Adds LiteLLM as a unified LLM provider, giving lagent users access to 100+ LLM providers (OpenAI, Anthropic, Google Gemini, Azure, Bedrock, Ollama, etc.) through litellm.completion() as an SDK dependency.
  • Provides both a sync class (LiteLLMAPI) and an async class (AsyncLiteLLMAPI), following the same pattern as GPTAPI/AsyncGPTAPI.

Motivation

lagent currently has dedicated providers for OpenAI (GPTAPI) and Anthropic (ClaudeAPI). Users who want to use Google Gemini, Azure OpenAI, AWS Bedrock, Ollama, or any other provider need to write a new provider class from scratch. LiteLLM translates completion()-style calls to any of 100+ providers, so a single LiteLLMAPI class covers all of them with the same interface lagent already uses.

Changes

  • lagent/llms/litellm_llm.py - LiteLLMAPI (sync) and AsyncLiteLLMAPI (async) extending BaseAPILLM/AsyncBaseAPILLM. Implements chat(), _chat(), stream_chat(), _stream_chat(). Uses litellm.completion() / litellm.acompletion() with drop_params=True for cross-provider compatibility. Lazy-imports litellm so the base install is unaffected.
  • lagent/llms/__init__.py - registered LiteLLMAPI and AsyncLiteLLMAPI in imports and __all__
  • requirements/optional.txt - added litellm>=1.80,<1.87
  • tests/test_litellm.py - 17 tests covering dispatch, credentials, gen_params translation, null responses, exception propagation, import guard, registration, async init, and live E2E
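
The lazy import mentioned for lagent/llms/litellm_llm.py could look roughly like the sketch below. It is not the PR's actual code: the helper name lazy_import and the wording of the error message are hypothetical, and the only library call used is the standard importlib.import_module.

```python
import importlib


def lazy_import(module_name: str, install_hint: str):
    """Import a module on first use instead of at package import time.

    Hypothetical helper for illustration; the PR may structure this differently.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with an actionable message so users know what to install.
        raise ImportError(
            f"{module_name} is not installed; {install_hint}"
        ) from exc


# A provider class would call this inside __init__ or _chat(), for example:
#   litellm = lazy_import("litellm", "install it with `pip install litellm`")
# so that `import lagent` keeps working when litellm is absent.
```

Deferring the import to first use is what keeps the base install unaffected: the ImportError only surfaces when someone actually instantiates the LiteLLM-backed class.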

Integration details

  • drop_params=True silently drops provider-unsupported kwargs so the same config works across OpenAI, Anthropic, Gemini, etc.
  • Strips top_p, top_k, repetition_penalty from gen_params before forwarding since these cause conflicts on certain providers (e.g. Anthropic rejects temperature + top_p together). Users can override via **gen_params.
  • Retries only on transient errors (RateLimitError, APIConnectionError, Timeout, InternalServerError, ServiceUnavailableError) using qualname-based matching so the module imports cleanly even without litellm installed.
  • Credentials: when key='ENV' (default), no api_key is passed and litellm reads provider-specific env vars (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.) directly.
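
The integration details above can be sketched as three small helpers. This is a minimal illustration under stated assumptions, not the code in the PR: the names build_completion_kwargs, is_transient, and call_with_retry are hypothetical, walking the exception's MRO is an assumption about how the name matching behaves for subclasses, and backoff between attempts as well as gen_params translation are left out for brevity.

```python
# Error class names treated as transient, matched by name so that this
# module never has to import litellm's exception classes.
TRANSIENT_ERROR_NAMES = frozenset({
    "RateLimitError",
    "APIConnectionError",
    "Timeout",
    "InternalServerError",
    "ServiceUnavailableError",
})

# Sampling parameters removed before forwarding, per the PR description.
STRIPPED_GEN_PARAMS = ("top_p", "top_k", "repetition_penalty")


def build_completion_kwargs(model, messages, gen_params, key="ENV", api_base=None):
    """Assemble keyword arguments for a litellm.completion()-style call."""
    params = {k: v for k, v in gen_params.items() if k not in STRIPPED_GEN_PARAMS}
    kwargs = {"model": model, "messages": messages, "drop_params": True, **params}
    if key != "ENV":
        # Only pass an explicit key; otherwise provider env vars are used.
        kwargs["api_key"] = key
    if api_base:
        kwargs["api_base"] = api_base
    return kwargs


def is_transient(exc: BaseException) -> bool:
    """Return True if the exception (or a base class) has a transient name."""
    return any(
        cls.__qualname__ in TRANSIENT_ERROR_NAMES for cls in type(exc).__mro__
    )


def call_with_retry(fn, max_attempts=3):
    """Call fn(), retrying only when the raised error looks transient."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if not is_transient(exc) or attempt == max_attempts:
                raise
```

Matching on class names rather than on the classes themselves is what lets the module import cleanly without litellm installed. The trade-off is that an unrelated exception that happens to share one of these names would also be retried.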

Tests

Unit tests (17/17 pass):

TestLiteLLMAPI::test_chat_dispatches_correctly PASSED
TestLiteLLMAPI::test_chat_batch PASSED
TestLiteLLMAPI::test_null_response_returns_empty_string PASSED
TestLiteLLMAPI::test_response_stripped PASSED
TestLiteLLMAPI::test_drop_params_always_set PASSED
TestLiteLLMAPI::test_api_key_forwarded_when_set PASSED
TestLiteLLMAPI::test_api_key_omitted_when_env PASSED
TestLiteLLMAPI::test_api_base_forwarded PASSED
TestLiteLLMAPI::test_json_mode PASSED
TestLiteLLMAPI::test_gen_params_translated PASSED
TestLiteLLMAPI::test_zero_max_tokens_returns_none PASSED
TestLiteLLMAPI::test_exception_propagates PASSED
TestLiteLLMAPI::test_import_error PASSED
TestLiteLLMAPI::test_registered_in_init PASSED
TestAsyncLiteLLMAPI::test_init PASSED
TestAsyncLiteLLMAPI::test_completion_kwargs PASSED
TestLiveE2E::test_live_chat PASSED
======================== 17 passed in 3.48s ========================

Live E2E (Anthropic claude-sonnet-4-6 via Azure Foundry):

Live E2E response: 'OK'

Risk / Compatibility

  • Additive only. Existing GPTAPI, ClaudeAPI, and all other providers are untouched.
  • litellm is in requirements/optional.txt - base install unaffected.
  • Null/empty responses return '' instead of crashing.
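
The null-response behaviour could be implemented along the lines of the sketch below. It assumes an OpenAI-style response object with choices[0].message.content, which is the shape LiteLLM returns. The helper name extract_text is hypothetical, and SimpleNamespace stands in for a real response.

```python
from types import SimpleNamespace


def extract_text(response) -> str:
    """Return the stripped message content, or '' when anything is missing."""
    choices = getattr(response, "choices", None) or []
    if not choices:
        return ""
    message = getattr(choices[0], "message", None)
    content = getattr(message, "content", None)
    # content may be None (e.g. a filtered or tool-only reply); treat as empty.
    return (content or "").strip()


# Stand-in objects for illustration only; real responses come from litellm.
empty = SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(content=None))])
normal = SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(content="  OK \n"))])
print(repr(extract_text(empty)))
print(repr(extract_text(normal)))
```

Returning an empty string keeps the return type stable for callers that chain string operations on the result.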

Example usage

from lagent.llms import LiteLLMAPI, AsyncLiteLLMAPI

# Any provider via LiteLLM model format
model = LiteLLMAPI(model_type='anthropic/claude-sonnet-4-20250514')
# export ANTHROPIC_API_KEY=...

result = model.chat([{'role': 'user', 'content': 'Hello!'}])

# Google Gemini
model = LiteLLMAPI(model_type='gemini/gemini-2.5-flash')

# Azure OpenAI
model = LiteLLMAPI(
    model_type='azure/gpt-4o',
    api_base='https://my-resource.openai.azure.com',
    key='my-azure-key',
)

# Streaming
for status, text, _ in model.stream_chat([{'role': 'user', 'content': 'Hi'}]):
    print(text)

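# Async (await must run inside a coroutine, so wrap the call and use asyncio.run)
import asyncio

async def main():
    async_model = AsyncLiteLLMAPI(model_type='gpt-4o-mini')
    return await async_model.chat([{'role': 'user', 'content': 'Hi'}])

result = asyncio.run(main())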

@RheagalFire
Author

cc @Harold-lkk @braisedpork1964
