
The passthrough routes make it easier to pass requests straight through to the underlying LLM provider APIs.

E.g., route to vLLM's /classify endpoint:

SDK (Basic)

```python
import litellm

response = litellm.llm_passthrough_route(
    model="hosted_vllm/papluca/xlm-roberta-base-language-detection",
    method="POST",
    endpoint="classify",
    api_base="http://localhost:8090",
    api_key=None,
    json={
        # placeholder; litellm swaps this for the resolved model name
        "model": "swapped-for-litellm-model",
        "input": "Hello, world!",
    },
)

print(response)
```
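
The call returns the provider's raw response. Assuming it behaves like an httpx Response (an assumption for illustration, not a documented guarantee), it can be inspected like this:

```python
# Sketch: inspecting the passthrough result, assuming an httpx-style
# Response object (.status_code / .json() are assumptions, not guarantees).
if response.status_code == 200:
    print(response.json())  # classification payload returned by vLLM
else:
    print(response.status_code, response.text)
```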

SDK (Router)

```python
import asyncio

from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "roberta-base-language-detection",
            "litellm_params": {
                "model": "hosted_vllm/papluca/xlm-roberta-base-language-detection",
                "api_base": "http://localhost:8090",
            },
        }
    ]
)

request_data = {
    "model": "roberta-base-language-detection",  # router deployment name
    "method": "POST",
    "endpoint": "classify",
    "api_base": "http://localhost:8090",
    "api_key": None,
    "json": {
        "model": "roberta-base-language-detection",
        "input": "Hello, world!",
    },
}

async def main():
    response = await router.allm_passthrough_route(**request_data)
    print(response)

if __name__ == "__main__":
    asyncio.run(main())
```
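
Since allm_passthrough_route is a coroutine, several requests can be dispatched concurrently with asyncio.gather. A minimal sketch reusing the router defined above (the helper name and input texts are illustrative):

```python
import asyncio

async def classify_many(texts):
    # Fire one passthrough request per input text, concurrently.
    tasks = [
        router.allm_passthrough_route(
            model="roberta-base-language-detection",
            method="POST",
            endpoint="classify",
            api_base="http://localhost:8090",
            api_key=None,
            json={"model": "roberta-base-language-detection", "input": text},
        )
        for text in texts
    ]
    # One response per input, in order.
    return await asyncio.gather(*tasks)

# responses = asyncio.run(classify_many(["Hello, world!", "Hola, mundo!"]))
```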

PROXY

1. Setup config.yaml:

```yaml
model_list:
  - model_name: roberta-base-language-detection
    litellm_params:
      model: hosted_vllm/papluca/xlm-roberta-base-language-detection
      api_base: http://localhost:8090
```

2. Run the proxy:

```shell
litellm proxy --config config.yaml

# RUNNING on http://localhost:4000
```

3. Use the proxy (an equivalent Python call is sketched after this list):

```shell
curl -X POST http://localhost:4000/vllm/classify \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-api-key>" \
  -d '{"model": "roberta-base-language-detection", "input": "Hello, world!"}'
```
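
The same call issued from Python with the requests library (any HTTP client works; fill in your own proxy key):

```python
# Sketch: the curl call above, issued from Python via requests.
import requests

resp = requests.post(
    "http://localhost:4000/vllm/classify",
    headers={"Authorization": "Bearer <your-api-key>"},  # your proxy key
    json={"model": "roberta-base-language-detection", "input": "Hello, world!"},
)
print(resp.status_code, resp.json())
```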

How to add a provider for passthrough

See VLLMModelInfo for an example.

1. Inherit from BaseLLMModelInfo:

```python
from litellm.llms.base_llm.base_utils import BaseLLMModelInfo

class VLLMModelInfo(BaseLLMModelInfo):
    pass
```

2. Register the provider in ProviderConfigManager.get_provider_model_info, then verify that the lookup returns your config:

```python
from litellm.types.utils import LlmProviders
from litellm.utils import ProviderConfigManager

provider_config = ProviderConfigManager.get_provider_model_info(
    model="my-test-model", provider=LlmProviders.VLLM
)

print(provider_config)
```
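
For reference, the registration in step 2 amounts to adding a branch for your provider inside ProviderConfigManager.get_provider_model_info. A simplified, hypothetical sketch of that dispatch; the real method handles many more providers and its exact structure may differ:

```python
# Hypothetical sketch of the dispatch inside
# ProviderConfigManager.get_provider_model_info (the real method covers
# every provider; only the new vLLM branch is shown here).
from typing import Optional

from litellm.llms.base_llm.base_utils import BaseLLMModelInfo
from litellm.types.utils import LlmProviders

def get_provider_model_info(
    model: str, provider: LlmProviders
) -> Optional[BaseLLMModelInfo]:
    if provider == LlmProviders.VLLM:
        return VLLMModelInfo()  # the class defined in step 1
    # ... branches for other providers ...
    return None
```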