MCP Overview
LiteLLM Proxy provides an MCP Gateway that gives you a fixed endpoint for all MCP tools and lets you control MCP access by Key, Team, or Organization.
LiteLLM MCP Architecture: Use MCP tools with all LiteLLM-supported models
Overview​
| Feature | Description |
|---|---|
| MCP Operations | • List Tools • Call Tools |
| Supported MCP Transports | • Streamable HTTP • SSE • Standard Input/Output (stdio) |
| LiteLLM Permission Management | • By Key • By Team • By Organization |
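For example, once a server is onboarded, any MCP client can list tools through the gateway's fixed endpoint. A minimal sketch, assuming a proxy at http://localhost:4000 and a virtual key sk-1234:

```bash
# List tools via the MCP Gateway's fixed endpoint (JSON-RPC over HTTP).
# The proxy URL and key below are placeholder assumptions.
curl --location 'http://localhost:4000/mcp' \
  --header 'Content-Type: application/json' \
  --header 'x-litellm-api-key: Bearer sk-1234' \
  --data '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list"
  }'
```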
Adding your MCP​
Prerequisites​
To store MCP servers in the database, you need to enable database storage:
Environment Variable:

```bash
export STORE_MODEL_IN_DB=True
```

OR in config.yaml:

```yaml
general_settings:
  store_model_in_db: true
```
Fine-grained Database Storage Control​
By default, when store_model_in_db is true, all object types (models, MCPs, guardrails, vector stores, etc.) are stored in the database. If you want to store only specific object types, use the supported_db_objects setting.
Example: Store only MCP servers in the database
```yaml
general_settings:
  store_model_in_db: true
  supported_db_objects: ["mcp"]  # Only store MCP servers in DB

model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: sk-xxxxxxx
```
See all available object types: Config Settings - supported_db_objects
If supported_db_objects is not set, all object types are loaded from the database (default behavior).
- LiteLLM UI
- config.yaml
On the LiteLLM UI, navigate to "MCP Servers" and click "Add New MCP Server".
On this form, you should enter your MCP Server URL and the transport you want to use.
LiteLLM supports the following MCP transports:
- Streamable HTTP
- SSE (Server-Sent Events)
- Standard Input/Output (stdio)
Add HTTP MCP Server​
This video walks through adding and using an HTTP MCP server on LiteLLM UI and using it in Cursor IDE.
Add SSE MCP Server​
This video walks through adding and using an SSE MCP server on LiteLLM UI and using it in Cursor IDE.
Add STDIO MCP Server​
For stdio MCP servers, select "Standard Input/Output (stdio)" as the transport type and provide the stdio configuration in JSON format:
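For example, a configuration mirroring the CircleCI stdio server from the config.yaml examples below (the token value is a placeholder):

```json
{
  "command": "npx",
  "args": ["-y", "@circleci/mcp-server-circleci"],
  "env": {
    "CIRCLECI_TOKEN": "your-circleci-token",
    "CIRCLECI_BASE_URL": "https://circleci.com"
  }
}
```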
Static Headers​
Sometimes your MCP server needs specific headers on every request: an API key, or a custom header the server expects. Instead of configuring auth, you can set them directly.
These headers are sent with every request to the server. That's it.
When to use this:
- Your server needs custom headers that don't fit the standard auth patterns
- You want full control over exactly what headers are sent
- You're debugging and need to quickly add headers without changing auth configuration
Add your MCP servers directly in your config.yaml file:
```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: sk-xxxxxxx

litellm_settings:
  # MCP Aliases - Map aliases to server names for easier tool access
  mcp_aliases:
    "github": "github_mcp_server"
    "zapier": "zapier_mcp_server"
    "deepwiki": "deepwiki_mcp_server"

mcp_servers:
  # HTTP Streamable Server
  deepwiki_mcp:
    url: "https://mcp.deepwiki.com/mcp"

  # SSE Server
  zapier_mcp:
    url: "https://actions.zapier.com/mcp/sk-akxxxxx/sse"

  # Standard Input/Output (stdio) Server - CircleCI Example
  circleci_mcp:
    transport: "stdio"
    command: "npx"
    args: ["-y", "@circleci/mcp-server-circleci"]
    env:
      CIRCLECI_TOKEN: "your-circleci-token"
      CIRCLECI_BASE_URL: "https://circleci.com"

  # Full configuration with all optional fields
  my_http_server:
    url: "https://my-mcp-server.com/mcp"
    transport: "http"
    description: "My custom MCP server"
    auth_type: "api_key"
    auth_value: "abc123"
```
Configuration Options:

- **Server Name**: Use any descriptive name for your MCP server (e.g., `zapier_mcp`, `deepwiki_mcp`, `circleci_mcp`)
- **Alias**: Prefilled from the server name with `_` replacing spaces; edit it to change the prefix used in tool names
- **URL**: The endpoint URL for your MCP server (required for HTTP/SSE transports)
- **Transport**: Optional transport type (defaults to `sse`):
  - `sse` - SSE (Server-Sent Events) transport
  - `http` - Streamable HTTP transport
  - `stdio` - Standard Input/Output transport
- **Command**: The command to execute for stdio transport (required for stdio)
- **Args**: Array of arguments to pass to the command (optional for stdio)
- **Env**: Environment variables to set for the stdio process (optional for stdio)
- **Description**: Optional description for the server
- **Auth Type**: Optional authentication type. Supported values:

| Value | Header sent |
|---|---|
| `api_key` | `X-API-Key: <auth_value>` |
| `bearer_token` | `Authorization: Bearer <auth_value>` |
| `basic` | `Authorization: Basic <auth_value>` |
| `authorization` | `Authorization: <auth_value>` |

- **Extra Headers**: Optional list of additional header names to forward from the client to the MCP server
- **Static Headers**: Optional map of header key/value pairs to include on every request to the MCP server
- **Spec Version**: Optional MCP specification version (defaults to `2025-06-18`)
Examples for each auth type:
```yaml
mcp_servers:
  api_key_example:
    url: "https://my-mcp-server.com/mcp"
    auth_type: "api_key"
    auth_value: "abc123"  # headers={"X-API-Key": "abc123"}

  # NEW – OAuth 2.0 Client Credentials (v1.77.5)
  oauth2_example:
    url: "https://my-mcp-server.com/mcp"
    auth_type: "oauth2"  # 👈 KEY CHANGE
    authorization_url: "https://my-mcp-server.com/oauth/authorize"  # optional override
    token_url: "https://my-mcp-server.com/oauth/token"  # optional override
    registration_url: "https://my-mcp-server.com/oauth/register"  # optional override
    client_id: os.environ/OAUTH_CLIENT_ID
    client_secret: os.environ/OAUTH_CLIENT_SECRET
    scopes: ["tool.read", "tool.write"]  # optional override

  bearer_example:
    url: "https://my-mcp-server.com/mcp"
    auth_type: "bearer_token"
    auth_value: "abc123"  # headers={"Authorization": "Bearer abc123"}

  basic_example:
    url: "https://my-mcp-server.com/mcp"
    auth_type: "basic"
    auth_value: "dXNlcjpwYXNz"  # headers={"Authorization": "Basic dXNlcjpwYXNz"}

  custom_auth_example:
    url: "https://my-mcp-server.com/mcp"
    auth_type: "authorization"
    auth_value: "Token example123"  # headers={"Authorization": "Token example123"}

  # Example with extra headers forwarding
  github_mcp:
    url: "https://api.githubcopilot.com/mcp"
    auth_type: "bearer_token"
    auth_value: "ghp_example_token"
    extra_headers: ["custom_key", "x-custom-header"]  # These headers will be forwarded from the client

  # Example with static headers
  my_mcp_server:
    url: "https://my-mcp-server.com/mcp"
    static_headers:  # These headers are sent with every request to the MCP server
      X-API-Key: "abc123"
      X-Custom-Header: "some-value"
```
MCP Aliases​
You can define aliases for your MCP servers in the litellm_settings section. This allows you to:

- **Map friendly names to server names**: Use shorter, more memorable aliases
- **Override server aliases**: If a server doesn't have an alias defined, the system will use the first matching alias from `mcp_aliases`
- **Ensure uniqueness**: Only the first alias for each server is used, preventing conflicts
Example:
```yaml
litellm_settings:
  mcp_aliases:
    "github": "github_mcp_server"      # Maps "github" alias to "github_mcp_server"
    "zapier": "zapier_mcp_server"      # Maps "zapier" alias to "zapier_mcp_server"
    "docs": "deepwiki_mcp_server"      # Maps "docs" alias to "deepwiki_mcp_server"
    "github_alt": "github_mcp_server"  # Ignored, since "github" already maps to this server
```
Benefits:

- **Simplified tool access**: Use `github_create_issue` instead of `github_mcp_server_create_issue` (see the sketch below)
- **Consistent naming**: Standardize alias patterns across your organization
- **Easy migration**: Change server names without breaking existing tool references
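As an illustrative sketch, the snippet below calls a tool through the gateway using its alias-prefixed name. The proxy URL, key, and the `create_issue` tool are assumptions for the example:

```python
from fastmcp import Client
import asyncio

# Connect to the LiteLLM MCP gateway; assumes a proxy at localhost:4000
# and the "github" -> "github_mcp_server" alias configured above.
config = {
    "mcpServers": {
        "litellm": {
            "url": "http://localhost:4000/mcp",
            "headers": {"x-litellm-api-key": "Bearer sk-1234"}
        }
    }
}

client = Client(config)

async def main():
    async with client:
        # With the alias in place, tools are exposed as "github_<tool>"
        # instead of "github_mcp_server_<tool>". "create_issue" is a
        # hypothetical tool name used for illustration.
        await client.call_tool(
            name="github_create_issue",
            arguments={"title": "Bug report", "body": "Details here"}
        )

if __name__ == "__main__":
    asyncio.run(main())
```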
Converting OpenAPI Specs to MCP Servers​
LiteLLM can automatically convert OpenAPI specifications into MCP servers, allowing you to expose any REST API as MCP tools. This is useful when you have existing APIs with OpenAPI/Swagger documentation and want to make them available as MCP tools.
Benefits:
- Rapid Integration: Convert existing APIs to MCP tools without writing custom MCP server code
- Automatic Tool Generation: LiteLLM automatically generates MCP tools from your OpenAPI spec
- Unified Interface: Use the same MCP interface for both native MCP servers and OpenAPI-based APIs
- Easy Testing: Test and iterate on API integrations quickly
Configuration:
Add your OpenAPI-based MCP server to your config.yaml:
```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: sk-xxxxxxx

mcp_servers:
  # OpenAPI Spec Example - Petstore API
  petstore_mcp:
    url: "https://petstore.swagger.io/v2"
    spec_path: "/path/to/openapi.json"
    auth_type: "none"

  # OpenAPI Spec with API Key Authentication
  my_api_mcp:
    url: "http://0.0.0.0:8090"
    spec_path: "/path/to/openapi.json"
    auth_type: "api_key"
    auth_value: "your-api-key-here"

  # OpenAPI Spec with Bearer Token
  secured_api_mcp:
    url: "https://api.example.com"
    spec_path: "/path/to/openapi.json"
    auth_type: "bearer_token"
    auth_value: "your-bearer-token"
```
Configuration Parameters:
| Parameter | Required | Description |
|---|---|---|
| `url` | Yes | The base URL of your API endpoint |
| `spec_path` | Yes | Path or URL to your OpenAPI specification file (JSON or YAML) |
| `auth_type` | No | Authentication type: `none`, `api_key`, `bearer_token`, `basic`, `authorization`, `oauth2` |
| `auth_value` | No | Authentication value (required if `auth_type` is set) |
| `authorization_url` | No | For `auth_type: oauth2`. Optional override; if omitted, LiteLLM auto-discovers it. |
| `token_url` | No | For `auth_type: oauth2`. Optional override; if omitted, LiteLLM auto-discovers it. |
| `registration_url` | No | For `auth_type: oauth2`. Optional override; if omitted, LiteLLM auto-discovers it. |
| `scopes` | No | For `auth_type: oauth2`. Optional override; if omitted, LiteLLM uses the scopes advertised by the server. |
| `description` | No | Optional description for the MCP server |
| `allowed_tools` | No | List of specific tools to allow (see MCP Tool Filtering) |
| `disallowed_tools` | No | List of specific tools to block (see MCP Tool Filtering) |
Usage Example​
Once configured, you can use the OpenAPI-based MCP server just like any other MCP server:
- Python FastMCP
- Cursor IDE
- OpenAI Responses API
```python
from fastmcp import Client
import asyncio

# Standard MCP configuration
config = {
    "mcpServers": {
        "petstore": {
            "url": "http://localhost:4000/petstore_mcp/mcp",
            "headers": {
                "x-litellm-api-key": "Bearer sk-1234"
            }
        }
    }
}

# Create a client that connects to the server
client = Client(config)

async def main():
    async with client:
        # List available tools generated from OpenAPI spec
        tools = await client.list_tools()
        print(f"Available tools: {[tool.name for tool in tools]}")

        # Example: Get a pet by ID (from Petstore API)
        response = await client.call_tool(
            name="getpetbyid",
            arguments={"petId": "1"}
        )
        print(f"Response:\n{response}\n")

        # Example: Find pets by status
        response = await client.call_tool(
            name="findpetsbystatus",
            arguments={"status": "available"}
        )
        print(f"Response:\n{response}\n")

if __name__ == "__main__":
    asyncio.run(main())
```
```json
{
  "mcpServers": {
    "Petstore": {
      "url": "http://localhost:4000/petstore_mcp/mcp",
      "headers": {
        "x-litellm-api-key": "Bearer $LITELLM_API_KEY"
      }
    }
  }
}
```
```bash
curl --location 'https://api.openai.com/v1/responses' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  --data '{
    "model": "gpt-4o",
    "tools": [
      {
        "type": "mcp",
        "server_label": "petstore",
        "server_url": "http://localhost:4000/petstore_mcp/mcp",
        "require_approval": "never",
        "headers": {
          "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY"
        }
      }
    ],
    "input": "Find all available pets in the petstore",
    "tool_choice": "required"
  }'
```
How It Works

1. **Spec Loading**: LiteLLM loads your OpenAPI specification from the provided `spec_path`
2. **Tool Generation**: Each API endpoint in the spec becomes an MCP tool
3. **Parameter Mapping**: OpenAPI parameters are automatically mapped to MCP tool parameters
4. **Request Handling**: When a tool is called, LiteLLM converts the MCP request to the appropriate HTTP request
5. **Response Translation**: API responses are converted back to MCP format

OpenAPI Spec Requirements

Your OpenAPI specification should follow standard OpenAPI/Swagger conventions:

- **Supported versions**: OpenAPI 3.0.x, OpenAPI 3.1.x, Swagger 2.0
- **Required fields**: `paths` and `info` sections should be properly defined
- **Operation IDs**: Each operation should have a unique `operationId` (this becomes the tool name; see the minimal spec sketch below)
- **Parameters**: Request parameters should be properly documented with types and descriptions
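As a minimal illustrative sketch, the hypothetical spec fragment below (trimmed to one operation) would surface a single MCP tool named `getpetbyid`:

```yaml
# Minimal illustrative OpenAPI 3.0 spec: the one operation below becomes
# an MCP tool named "getpetbyid" (taken from its operationId).
openapi: "3.0.0"
info:
  title: "Petstore"
  version: "1.0.0"
paths:
  /pet/{petId}:
    get:
      operationId: getpetbyid
      summary: "Find pet by ID"
      parameters:
        - name: petId
          in: path
          required: true
          schema:
            type: string
      responses:
        "200":
          description: "successful operation"
```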
MCP OAuth
LiteLLM v1.77.6 added support for OAuth 2.0 Client Credentials for MCP servers.
This configuration is currently available via config.yaml, with UI support coming soon.
```yaml
mcp_servers:
  github_mcp:
    url: "https://api.githubcopilot.com/mcp"
    auth_type: oauth2
    client_id: os.environ/GITHUB_OAUTH_CLIENT_ID
    client_secret: os.environ/GITHUB_OAUTH_CLIENT_SECRET
```
How It Works​
Participants
- Client – The MCP-capable AI agent (e.g., Claude Code, Cursor, or another IDE/agent) that initiates OAuth discovery, authorization, and tool invocations on behalf of the user.
- LiteLLM Proxy – Mediates all OAuth discovery, registration, token exchange, and MCP traffic while protecting stored credentials.
- Authorization Server – Issues OAuth 2.0 tokens via dynamic client registration, PKCE authorization, and token endpoints.
- MCP Server (Resource Server) – The protected MCP endpoint that receives LiteLLM’s authenticated JSON-RPC requests.
- User-Agent (Browser) – Temporarily involved so the end user can grant consent during the authorization step.
Flow Steps

1. **Resource Discovery**: The client fetches MCP resource metadata from LiteLLM's `.well-known/oauth-protected-resource` endpoint to understand scopes and capabilities.
2. **Authorization Server Discovery**: The client retrieves the OAuth server metadata (token endpoint, authorization endpoint, supported PKCE methods) through LiteLLM's `.well-known/oauth-authorization-server` endpoint.
3. **Dynamic Client Registration**: The client registers through LiteLLM, which forwards the request to the authorization server (RFC 7591). If the provider doesn't support dynamic registration, you can pre-store `client_id`/`client_secret` in LiteLLM (e.g., GitHub MCP) and the flow proceeds the same way.
4. **User Authorization**: The client launches a browser session (with code challenge and resource hints). The user approves access, and the authorization server sends the code through LiteLLM back to the client.
5. **Token Exchange**: The client calls LiteLLM with the authorization code, code verifier, and resource. LiteLLM exchanges them with the authorization server and returns the issued access/refresh tokens.
6. **MCP Invocation**: With a valid token, the client sends the MCP JSON-RPC request (plus LiteLLM API key) to LiteLLM, which forwards it to the MCP server and relays the tool response.
See the official MCP Authorization Flow for additional reference.
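For reference, the two discovery steps boil down to plain HTTP GETs against LiteLLM's well-known endpoints. A sketch, assuming a proxy at http://localhost:4000:

```bash
# Step 1: MCP resource metadata (scopes, capabilities)
curl http://localhost:4000/.well-known/oauth-protected-resource

# Step 2: OAuth server metadata (authorization/token endpoints, PKCE methods)
curl http://localhost:4000/.well-known/oauth-authorization-server
```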
Forwarding Custom Headers to MCP Servers​
LiteLLM supports forwarding additional custom headers from MCP clients to backend MCP servers using the extra_headers configuration parameter. This allows you to pass custom authentication tokens, API keys, or other headers that your MCP server requires.
Configuration
- config.yaml
- Dynamically on Client Side
Configure extra_headers in your MCP server configuration to specify which header names should be forwarded:
```yaml
mcp_servers:
  github_mcp:
    url: "https://api.githubcopilot.com/mcp"
    auth_type: "bearer_token"
    auth_value: "ghp_default_token"
    extra_headers: ["custom_key", "x-custom-header", "Authorization"]
    description: "GitHub MCP server with custom header forwarding"
```
Use this when giving users access to a group of MCP servers.
Format: `x-mcp-{server_alias}-{header_name}: value`

This allows you to use different authentication for different MCP servers.

Examples:

- `x-mcp-github-authorization: Bearer ghp_xxxxxxxxx` - GitHub MCP server with Bearer token
- `x-mcp-zapier-x-api-key: sk-xxxxxxxxx` - Zapier MCP server with API key
- `x-mcp-deepwiki-authorization: Basic base64_encoded_creds` - DeepWiki MCP server with Basic auth
```python
from fastmcp import Client
import asyncio

# Standard MCP configuration with multiple servers
config = {
    "mcpServers": {
        "mcp_group": {
            "url": "http://localhost:4000/mcp",
            "headers": {
                "x-mcp-servers": "dev_group",  # assume this gives access to github, zapier and deepwiki
                "x-litellm-api-key": "Bearer sk-1234",
                "x-mcp-github-authorization": "Bearer gho_token",
                "x-mcp-zapier-x-api-key": "sk-xxxxxxxxx",
                "x-mcp-deepwiki-authorization": "Basic base64_encoded_creds",
                "custom_key": "value"
            }
        }
    }
}

# Create a client that connects to all servers
client = Client(config)

async def main():
    async with client:
        tools = await client.list_tools()
        print(f"Available tools: {tools}")

        # call mcp
        await client.call_tool(
            name="github_mcp-search_issues",
            arguments={"query": "created:>2024-01-01", "sort": "created", "order": "desc", "perPage": 30}
        )

if __name__ == "__main__":
    asyncio.run(main())
```
Benefits:
- Server-specific authentication: Each MCP server can use different auth methods
- Better security: No need to share the same auth token across all servers
- Flexible header names: Support for different auth header types (authorization, x-api-key, etc.)
- Clean separation: Each server's auth is clearly identified
Client Usage​
When connecting from MCP clients, include the custom headers that match the extra_headers configuration:
- Python FastMCP
- Cursor IDE
- HTTP Client
```python
from fastmcp import Client
import asyncio

# MCP client configuration with custom headers
config = {
    "mcpServers": {
        "github": {
            "url": "http://localhost:4000/github_mcp/mcp",
            "headers": {
                "x-litellm-api-key": "Bearer sk-1234",
                "Authorization": "Bearer gho_token",
                "custom_key": "custom_value",
                "x-custom-header": "additional_data"
            }
        }
    }
}

# Create a client that connects to the server
client = Client(config)

async def main():
    async with client:
        # List available tools
        tools = await client.list_tools()
        print(f"Available tools: {tools}")

        # Call a tool if available
        if tools:
            result = await client.call_tool(tools[0].name, {})
            print(f"Tool result: {result}")

# Run the client
asyncio.run(main())
```
```json
{
  "mcpServers": {
    "GitHub": {
      "url": "http://localhost:4000/github_mcp/mcp",
      "headers": {
        "x-litellm-api-key": "Bearer $LITELLM_API_KEY",
        "Authorization": "Bearer $GITHUB_TOKEN",
        "custom_key": "custom_value",
        "x-custom-header": "additional_data"
      }
    }
  }
}
```
```bash
curl --location 'http://localhost:4000/github_mcp/mcp' \
  --header 'Content-Type: application/json' \
  --header 'x-litellm-api-key: Bearer sk-1234' \
  --header 'Authorization: Bearer gho_token' \
  --header 'custom_key: custom_value' \
  --header 'x-custom-header: additional_data' \
  --data '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list"
  }'
```
How It Works

1. **Configuration**: Define `extra_headers` in your MCP server config with the header names you want to forward
2. **Client Headers**: Include the corresponding headers in your MCP client requests
3. **Header Forwarding**: LiteLLM automatically forwards matching headers to the backend MCP server
4. **Authentication**: The backend MCP server receives both the configured auth headers and the custom headers
Using your MCP with client side credentials​
Use this if you want to pass a client-side authentication token to LiteLLM, which then forwards it to your MCP server for authentication.
New Server-Specific Auth Headers (Recommended)​
You can specify MCP auth tokens using server-specific headers in the format `x-mcp-{server_alias}-{header_name}`. This allows you to use different authentication for different MCP servers.
Benefits:
- Server-specific authentication: Each MCP server can use different auth methods
- Better security: No need to share the same auth token across all servers
- Flexible header names: Support for different auth header types (authorization, x-api-key, etc.)
- Clean separation: Each server's auth is clearly identified
Legacy Auth Header (Deprecated)​
You can also specify your MCP auth token using the header `x-mcp-auth`. This will be forwarded to all MCP servers and is deprecated in favor of server-specific headers.
- OpenAI API
- LiteLLM Proxy
- Cursor IDE
- Streamable HTTP
- Python FastMCP
Connect via OpenAI Responses API with Server-Specific Auth​
Use the OpenAI Responses API and include server-specific auth headers:
```bash
curl --location 'https://api.openai.com/v1/responses' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  --data '{
    "model": "gpt-4o",
    "tools": [
      {
        "type": "mcp",
        "server_label": "litellm",
        "server_url": "litellm_proxy",
        "require_approval": "never",
        "headers": {
          "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
          "x-mcp-github-authorization": "Bearer YOUR_GITHUB_TOKEN",
          "x-mcp-zapier-x-api-key": "YOUR_ZAPIER_API_KEY"
        }
      }
    ],
    "input": "Run available tools",
    "tool_choice": "required"
  }'
```
Connect via OpenAI Responses API with Legacy Auth​
Use the OpenAI Responses API and include the x-mcp-auth header for your MCP server authentication:
```bash
curl --location 'https://api.openai.com/v1/responses' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  --data '{
    "model": "gpt-4o",
    "tools": [
      {
        "type": "mcp",
        "server_label": "litellm",
        "server_url": "litellm_proxy",
        "require_approval": "never",
        "headers": {
          "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
          "x-mcp-auth": "YOUR_MCP_AUTH_TOKEN"
        }
      }
    ],
    "input": "Run available tools",
    "tool_choice": "required"
  }'
```
Connect via LiteLLM Proxy Responses API with Server-Specific Auth​
Use this when calling LiteLLM Proxy for LLM API requests to /v1/responses endpoint with server-specific authentication:
```bash
curl --location '<your-litellm-proxy-base-url>/v1/responses' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_API_KEY" \
  --data '{
    "model": "gpt-4o",
    "tools": [
      {
        "type": "mcp",
        "server_label": "litellm",
        "server_url": "litellm_proxy",
        "require_approval": "never",
        "headers": {
          "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
          "x-mcp-github-authorization": "Bearer YOUR_GITHUB_TOKEN",
          "x-mcp-zapier-x-api-key": "YOUR_ZAPIER_API_KEY"
        }
      }
    ],
    "input": "Run available tools",
    "tool_choice": "required"
  }'
```
Connect via LiteLLM Proxy Responses API with Legacy Auth​
Use this when calling LiteLLM Proxy for LLM API requests to /v1/responses endpoint with MCP authentication:
```bash
curl --location '<your-litellm-proxy-base-url>/v1/responses' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_API_KEY" \
  --data '{
    "model": "gpt-4o",
    "tools": [
      {
        "type": "mcp",
        "server_label": "litellm",
        "server_url": "litellm_proxy",
        "require_approval": "never",
        "headers": {
          "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
          "x-mcp-auth": "YOUR_MCP_AUTH_TOKEN"
        }
      }
    ],
    "input": "Run available tools",
    "tool_choice": "required"
  }'
```
Connect via Cursor IDE with Server-Specific Auth​
Use tools directly from Cursor IDE with LiteLLM MCP and include server-specific authentication:
Setup Instructions:

1. **Open Cursor Settings**: Use `⇧+⌘+J` (Mac) or `Ctrl+Shift+J` (Windows/Linux)
2. **Navigate to MCP Tools**: Go to the "MCP Tools" tab and click "New MCP Server"
3. **Add Configuration**: Copy and paste the JSON configuration below, then save with `Cmd+S` or `Ctrl+S`
```json
{
  "mcpServers": {
    "LiteLLM": {
      "url": "litellm_proxy",
      "headers": {
        "x-litellm-api-key": "Bearer $LITELLM_API_KEY",
        "x-mcp-github-authorization": "Bearer $GITHUB_TOKEN",
        "x-mcp-zapier-x-api-key": "$ZAPIER_API_KEY"
      }
    }
  }
}
```
Connect via Cursor IDE with Legacy Auth​
Use tools directly from Cursor IDE with LiteLLM MCP and include your MCP authentication token:
Setup Instructions:

1. **Open Cursor Settings**: Use `⇧+⌘+J` (Mac) or `Ctrl+Shift+J` (Windows/Linux)
2. **Navigate to MCP Tools**: Go to the "MCP Tools" tab and click "New MCP Server"
3. **Add Configuration**: Copy and paste the JSON configuration below, then save with `Cmd+S` or `Ctrl+S`
```json
{
  "mcpServers": {
    "LiteLLM": {
      "url": "litellm_proxy",
      "headers": {
        "x-litellm-api-key": "Bearer $LITELLM_API_KEY",
        "x-mcp-auth": "$MCP_AUTH_TOKEN"
      }
    }
  }
}
```
Connect via Streamable HTTP Transport with Server-Specific Auth​
Connect to LiteLLM MCP using HTTP transport with server-specific authentication:
Server URL:

```
litellm_proxy
```

Headers:

```
x-litellm-api-key: Bearer YOUR_LITELLM_API_KEY
x-mcp-github-authorization: Bearer YOUR_GITHUB_TOKEN
x-mcp-zapier-x-api-key: YOUR_ZAPIER_API_KEY
```
Connect via Streamable HTTP Transport with Legacy Auth​
Connect to LiteLLM MCP using HTTP transport with MCP authentication:
Server URL:

```
litellm_proxy
```

Headers:

```
x-litellm-api-key: Bearer YOUR_LITELLM_API_KEY
x-mcp-auth: Bearer YOUR_MCP_AUTH_TOKEN
```
This URL can be used with any MCP client that supports HTTP transport; a curl sketch follows below. The `x-mcp-auth` header will be forwarded to your MCP server for authentication.
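A minimal sketch of that request, assuming the gateway's streamable HTTP endpoint is at `<your-litellm-proxy-base-url>/mcp`:

```bash
# List tools through the gateway with the legacy x-mcp-auth header.
# The endpoint path is an assumption; substitute your proxy's MCP URL.
curl --location '<your-litellm-proxy-base-url>/mcp' \
  --header 'Content-Type: application/json' \
  --header 'x-litellm-api-key: Bearer YOUR_LITELLM_API_KEY' \
  --header 'x-mcp-auth: Bearer YOUR_MCP_AUTH_TOKEN' \
  --data '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list"
  }'
```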
Connect via Python FastMCP Client with Server-Specific Auth​
Use the Python FastMCP client to connect to your LiteLLM MCP server with server-specific authentication:
```python
import asyncio
import json
from fastmcp import Client
from fastmcp.client.transports import StreamableHttpTransport

# Create the transport with your LiteLLM MCP server URL and server-specific auth headers
server_url = "litellm_proxy"
transport = StreamableHttpTransport(
    server_url,
    headers={
        "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
        "x-mcp-github-authorization": "Bearer YOUR_GITHUB_TOKEN",
        "x-mcp-zapier-x-api-key": "YOUR_ZAPIER_API_KEY"
    }
)

# Initialize the client with the transport
client = Client(transport=transport)

async def main():
    # Connection is established here
    print("Connecting to LiteLLM MCP server with server-specific authentication...")
    async with client:
        print(f"Client connected: {client.is_connected()}")

        # Make MCP calls within the context
        print("Fetching available tools...")
        tools = await client.list_tools()
        print(f"Available tools: {json.dumps([t.name for t in tools], indent=2)}")

        # Example: Call a tool (replace 'tool_name' with an actual tool name)
        if tools:
            tool_name = tools[0].name
            print(f"Calling tool: {tool_name}")

            # Call the tool with appropriate arguments
            result = await client.call_tool(tool_name, arguments={})
            print(f"Tool result: {result}")

# Run the example
if __name__ == "__main__":
    asyncio.run(main())
```
Connect via Python FastMCP Client with Legacy Auth​
Use the Python FastMCP client to connect to your LiteLLM MCP server with MCP authentication:
```python
import asyncio
import json
from fastmcp import Client
from fastmcp.client.transports import StreamableHttpTransport

# Create the transport with your LiteLLM MCP server URL and auth headers
server_url = "litellm_proxy"
transport = StreamableHttpTransport(
    server_url,
    headers={
        "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
        "x-mcp-auth": "Bearer YOUR_MCP_AUTH_TOKEN"
    }
)

# Initialize the client with the transport
client = Client(transport=transport)

async def main():
    # Connection is established here
    print("Connecting to LiteLLM MCP server with authentication...")
    async with client:
        print(f"Client connected: {client.is_connected()}")

        # Make MCP calls within the context
        print("Fetching available tools...")
        tools = await client.list_tools()
        print(f"Available tools: {json.dumps([t.name for t in tools], indent=2)}")

        # Example: Call a tool (replace 'tool_name' with an actual tool name)
        if tools:
            tool_name = tools[0].name
            print(f"Calling tool: {tool_name}")

            # Call the tool with appropriate arguments
            result = await client.call_tool(tool_name, arguments={})
            print(f"Tool result: {result}")

# Run the example
if __name__ == "__main__":
    asyncio.run(main())
```
Customize the MCP Auth Header Name​
By default, LiteLLM uses `x-mcp-auth` to pass your credentials to MCP servers. You can change this header name in one of the following ways:

1. Set the `LITELLM_MCP_CLIENT_SIDE_AUTH_HEADER_NAME` environment variable:

```bash
export LITELLM_MCP_CLIENT_SIDE_AUTH_HEADER_NAME="authorization"
```

2. Set `mcp_client_side_auth_header_name` in the general settings of the config.yaml file:
```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: sk-xxxxxxx

general_settings:
  mcp_client_side_auth_header_name: "authorization"
```
Using the authorization header​
In this example, the `authorization` header will be passed to the MCP server for authentication.
```bash
curl --location '<your-litellm-proxy-base-url>/v1/responses' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_API_KEY" \
  --data '{
    "model": "gpt-4o",
    "tools": [
      {
        "type": "mcp",
        "server_label": "litellm",
        "server_url": "litellm_proxy",
        "require_approval": "never",
        "headers": {
          "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
          "authorization": "Bearer sk-zapier-token-123"
        }
      }
    ],
    "input": "Run available tools",
    "tool_choice": "required"
  }'
```
LiteLLM Proxy - Walk through MCP Gateway​
LiteLLM exposes an MCP Gateway for admins to add all their MCP servers to LiteLLM. The key benefits of using LiteLLM Proxy with MCP are:
- Use a fixed endpoint for all MCP tools
- MCP Permission management by Key, Team, or User
This video demonstrates how you can onboard an MCP server to LiteLLM Proxy, use it and set access controls.
LiteLLM Python SDK MCP Bridge​
The LiteLLM Python SDK acts as an MCP bridge, letting you use MCP tools with all LiteLLM-supported models. LiteLLM offers the following features for using MCP:

- **List Available MCP Tools**: OpenAI clients can view all available MCP tools; use `litellm.experimental_mcp_client.load_mcp_tools` to list them
- **Call MCP Tools**: OpenAI clients can call MCP tools; use `litellm.experimental_mcp_client.call_openai_tool` to call an OpenAI tool on an MCP server
1. List Available MCP Tools​
In this example we'll use litellm.experimental_mcp_client.load_mcp_tools to list all available MCP tools on any MCP server. This method can be used in two ways:

- `format="mcp"` (default): Return MCP tools
  - Returns: `mcp.types.Tool`
- `format="openai"`: Return MCP tools converted to OpenAI API-compatible tools, for use with OpenAI endpoints
  - Returns: `openai.types.chat.ChatCompletionToolParam`
- LiteLLM Python SDK
- OpenAI SDK + LiteLLM Proxy
```python
# Create server parameters for stdio connection
import asyncio
import json
import os

import litellm
from litellm import experimental_mcp_client
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(
    command="python3",
    # Make sure to update to the full absolute path to your mcp_server.py file
    args=["./mcp_server.py"],
)

async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize the connection
            await session.initialize()

            # Get tools
            tools = await experimental_mcp_client.load_mcp_tools(session=session, format="openai")
            print("MCP TOOLS: ", tools)

            messages = [{"role": "user", "content": "what's (3 + 5)"}]
            llm_response = await litellm.acompletion(
                model="gpt-4o",
                api_key=os.getenv("OPENAI_API_KEY"),
                messages=messages,
                tools=tools,
            )
            print("LLM RESPONSE: ", json.dumps(llm_response, indent=4, default=str))

asyncio.run(main())
```
In this example we'll walk through how you can use the OpenAI SDK pointed at the LiteLLM proxy to call MCP tools. The key difference is that we use the OpenAI SDK to make the LLM API request.
```python
# Create server parameters for stdio connection
import asyncio

from litellm import experimental_mcp_client
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from openai import OpenAI

server_params = StdioServerParameters(
    command="python3",
    # Make sure to update to the full absolute path to your mcp_server.py file
    args=["./mcp_server.py"],
)

async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize the connection
            await session.initialize()

            # Get tools using litellm mcp client
            tools = await experimental_mcp_client.load_mcp_tools(session=session, format="openai")
            print("MCP TOOLS: ", tools)

            # Use OpenAI SDK pointed to LiteLLM proxy
            client = OpenAI(
                api_key="your-api-key",  # Your LiteLLM proxy API key
                base_url="http://localhost:4000"  # Your LiteLLM proxy URL
            )

            messages = [{"role": "user", "content": "what's (3 + 5)"}]
            llm_response = client.chat.completions.create(
                model="gpt-4",
                messages=messages,
                tools=tools
            )
            print("LLM RESPONSE: ", llm_response)

asyncio.run(main())
```
2. List and Call MCP Tools​
In this example we'll use:

- `litellm.experimental_mcp_client.load_mcp_tools` to list all available MCP tools on any MCP server
- `litellm.experimental_mcp_client.call_openai_tool` to call an OpenAI tool on an MCP server

The first LLM response returns a list of OpenAI tools. We take the first tool call from the LLM response and pass it to `litellm.experimental_mcp_client.call_openai_tool` to call the tool on the MCP server.
How litellm.experimental_mcp_client.call_openai_tool works​
- Accepts an OpenAI Tool Call from the LLM response
- Converts the OpenAI Tool Call to an MCP Tool
- Calls the MCP Tool on the MCP server
- Returns the result of the MCP Tool call
- LiteLLM Python SDK
- OpenAI SDK + LiteLLM Proxy
```python
# Create server parameters for stdio connection
import asyncio
import json
import os

import litellm
from litellm import experimental_mcp_client
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(
    command="python3",
    # Make sure to update to the full absolute path to your mcp_server.py file
    args=["./mcp_server.py"],
)

async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize the connection
            await session.initialize()

            # Get tools
            tools = await experimental_mcp_client.load_mcp_tools(session=session, format="openai")
            print("MCP TOOLS: ", tools)

            messages = [{"role": "user", "content": "what's (3 + 5)"}]
            llm_response = await litellm.acompletion(
                model="gpt-4o",
                api_key=os.getenv("OPENAI_API_KEY"),
                messages=messages,
                tools=tools,
            )
            print("LLM RESPONSE: ", json.dumps(llm_response, indent=4, default=str))

            openai_tool = llm_response["choices"][0]["message"]["tool_calls"][0]

            # Call the tool using MCP client
            call_result = await experimental_mcp_client.call_openai_tool(
                session=session,
                openai_tool=openai_tool,
            )
            print("MCP TOOL CALL RESULT: ", call_result)

            # send the tool result to the LLM
            messages.append(llm_response["choices"][0]["message"])
            messages.append(
                {
                    "role": "tool",
                    "content": str(call_result.content[0].text),
                    "tool_call_id": openai_tool["id"],
                }
            )
            print("final messages with tool result: ", messages)

            llm_response = await litellm.acompletion(
                model="gpt-4o",
                api_key=os.getenv("OPENAI_API_KEY"),
                messages=messages,
                tools=tools,
            )
            print(
                "FINAL LLM RESPONSE: ", json.dumps(llm_response, indent=4, default=str)
            )

asyncio.run(main())
```
In this example we'll walk through how you can use the OpenAI SDK pointed at the LiteLLM proxy to call MCP tools. The key difference is that we use the OpenAI SDK to make the LLM API request.
```python
# Create server parameters for stdio connection
import asyncio

from litellm import experimental_mcp_client
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from openai import OpenAI

server_params = StdioServerParameters(
    command="python3",
    # Make sure to update to the full absolute path to your mcp_server.py file
    args=["./mcp_server.py"],
)

async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize the connection
            await session.initialize()

            # Get tools using litellm mcp client
            tools = await experimental_mcp_client.load_mcp_tools(session=session, format="openai")
            print("MCP TOOLS: ", tools)

            # Use OpenAI SDK pointed to LiteLLM proxy
            client = OpenAI(
                api_key="your-api-key",  # Your LiteLLM proxy API key
                base_url="http://localhost:4000"  # Your LiteLLM proxy URL
            )

            messages = [{"role": "user", "content": "what's (3 + 5)"}]
            llm_response = client.chat.completions.create(
                model="gpt-4",
                messages=messages,
                tools=tools
            )
            print("LLM RESPONSE: ", llm_response)

            # Get the first tool call
            tool_call = llm_response.choices[0].message.tool_calls[0]

            # Call the tool using MCP client
            call_result = await experimental_mcp_client.call_openai_tool(
                session=session,
                openai_tool=tool_call.model_dump(),
            )
            print("MCP TOOL CALL RESULT: ", call_result)

            # Send the tool result back to the LLM
            messages.append(llm_response.choices[0].message.model_dump())
            messages.append({
                "role": "tool",
                "content": str(call_result.content[0].text),
                "tool_call_id": tool_call.id,
            })

            final_response = client.chat.completions.create(
                model="gpt-4",
                messages=messages,
                tools=tools
            )
            print("FINAL RESPONSE: ", final_response)

asyncio.run(main())
```