
The passthrough routes make it easier to pass requests straight through to the underlying LLM provider APIs.

E.g., route to vLLM's /classify endpoint:

SDK (Basic)

```python
import litellm

response = litellm.llm_passthrough_route(
    model="hosted_vllm/papluca/xlm-roberta-base-language-detection",
    method="POST",
    endpoint="classify",
    api_base="http://localhost:8090",
    api_key=None,
    json={
        # placeholder; litellm swaps this for the resolved model name
        "model": "swapped-for-litellm-model",
        "input": "Hello, world!",
    },
)

print(response)
```
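
The call returns the provider's raw response. Assuming it behaves like an httpx Response (an assumption for illustration, not a documented guarantee), it can be inspected like this:

```python
# Sketch: inspecting the passthrough result, assuming an httpx-style
# Response object (.status_code / .json() are assumptions, not guarantees).
if response.status_code == 200:
    print(response.json())  # classification payload returned by vLLM
else:
    print(response.status_code, response.text)
```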

SDK (Router)

```python
import asyncio

from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "roberta-base-language-detection",
            "litellm_params": {
                "model": "hosted_vllm/papluca/xlm-roberta-base-language-detection",
                "api_base": "http://localhost:8090",
            },
        }
    ]
)

request_data = {
    "model": "roberta-base-language-detection",  # router deployment name
    "method": "POST",
    "endpoint": "classify",
    "api_base": "http://localhost:8090",
    "api_key": None,
    "json": {
        "model": "roberta-base-language-detection",
        "input": "Hello, world!",
    },
}

async def main():
    response = await router.allm_passthrough_route(**request_data)
    print(response)

if __name__ == "__main__":
    asyncio.run(main())
```
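
Since allm_passthrough_route is a coroutine, several requests can be dispatched concurrently with asyncio.gather. A minimal sketch reusing the router defined above (the helper name and input texts are illustrative):

```python
import asyncio

async def classify_many(texts):
    # Fire one passthrough request per input text, concurrently.
    tasks = [
        router.allm_passthrough_route(
            model="roberta-base-language-detection",
            method="POST",
            endpoint="classify",
            api_base="http://localhost:8090",
            api_key=None,
            json={"model": "roberta-base-language-detection", "input": text},
        )
        for text in texts
    ]
    # One response per input, in order.
    return await asyncio.gather(*tasks)

# responses = asyncio.run(classify_many(["Hello, world!", "Hola, mundo!"]))
```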

PROXY

1. Setup config.yaml:

```yaml
model_list:
  - model_name: roberta-base-language-detection
    litellm_params:
      model: hosted_vllm/papluca/xlm-roberta-base-language-detection
      api_base: http://localhost:8090
```

2. Run the proxy:

```shell
litellm proxy --config config.yaml

# RUNNING on http://localhost:4000
```

3. Use the proxy (an equivalent Python call is sketched after this list):

```shell
curl -X POST http://localhost:4000/vllm/classify \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-api-key>" \
  -d '{"model": "roberta-base-language-detection", "input": "Hello, world!"}'
```
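
The same call issued from Python with the requests library (any HTTP client works; fill in your own proxy key):

```python
# Sketch: the curl call above, issued from Python via requests.
import requests

resp = requests.post(
    "http://localhost:4000/vllm/classify",
    headers={"Authorization": "Bearer <your-api-key>"},  # your proxy key
    json={"model": "roberta-base-language-detection", "input": "Hello, world!"},
)
print(resp.status_code, resp.json())
```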

How to add a provider for passthrough

See VLLMModelInfo for an example.

1. Inherit from BaseLLMModelInfo:

```python
from litellm.llms.base_llm.base_utils import BaseLLMModelInfo

class VLLMModelInfo(BaseLLMModelInfo):
    pass
```

2. Register the provider in ProviderConfigManager.get_provider_model_info, then verify that the lookup returns your config:

```python
from litellm.types.utils import LlmProviders
from litellm.utils import ProviderConfigManager

provider_config = ProviderConfigManager.get_provider_model_info(
    model="my-test-model", provider=LlmProviders.VLLM
)

print(provider_config)
```
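
For reference, the registration in step 2 amounts to adding a branch for your provider inside ProviderConfigManager.get_provider_model_info. A simplified, hypothetical sketch of that dispatch; the real method handles many more providers and its exact structure may differ:

```python
# Hypothetical sketch of the dispatch inside
# ProviderConfigManager.get_provider_model_info (the real method covers
# every provider; only the new vLLM branch is shown here).
from typing import Optional

from litellm.llms.base_llm.base_utils import BaseLLMModelInfo
from litellm.types.utils import LlmProviders

def get_provider_model_info(
    model: str, provider: LlmProviders
) -> Optional[BaseLLMModelInfo]:
    if provider == LlmProviders.VLLM:
        return VLLMModelInfo()  # the class defined in step 1
    # ... branches for other providers ...
    return None
```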