Metadata-Version: 2.4
Name: mcpbridge-ai
Version: 0.2.0
Summary: Universal MCP-to-LLM bridge. Connect any LLM to any MCP server.
Author-email: Aditya Jangam <adjangam9@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/AJ-Playground/mcpbridge
Project-URL: Repository, https://github.com/AJ-Playground/mcpbridge
Keywords: mcp,llm,bridge,tool-calling,model-context-protocol,agents
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: httpx>=0.27
Requires-Dist: httpx-sse>=0.4
Requires-Dist: websockets>=12.0
Requires-Dist: jsonschema>=4.0
Requires-Dist: pydantic>=2.0
Requires-Dist: anyio>=4.0
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.40; extra == "anthropic"
Provides-Extra: openai
Requires-Dist: openai>=1.50; extra == "openai"
Provides-Extra: gemini
Requires-Dist: google-generativeai>=0.8; extra == "gemini"
Provides-Extra: mistral
Requires-Dist: mistralai>=1.0; extra == "mistral"
Provides-Extra: cohere
Requires-Dist: cohere>=5.0; extra == "cohere"
Provides-Extra: groq
Requires-Dist: groq>=0.11; extra == "groq"
Provides-Extra: ollama
Requires-Dist: ollama>=0.3; extra == "ollama"
Provides-Extra: together
Requires-Dist: together>=1.2; extra == "together"
Provides-Extra: bedrock
Requires-Dist: boto3>=1.34; extra == "bedrock"
Provides-Extra: azure
Requires-Dist: openai>=1.50; extra == "azure"
Provides-Extra: tiktoken
Requires-Dist: tiktoken>=0.7; extra == "tiktoken"
Provides-Extra: all
Requires-Dist: anthropic>=0.40; extra == "all"
Requires-Dist: openai>=1.50; extra == "all"
Requires-Dist: google-generativeai>=0.8; extra == "all"
Requires-Dist: mistralai>=1.0; extra == "all"
Requires-Dist: cohere>=5.0; extra == "all"
Requires-Dist: groq>=0.11; extra == "all"
Requires-Dist: ollama>=0.3; extra == "all"
Requires-Dist: together>=1.2; extra == "all"
Requires-Dist: boto3>=1.34; extra == "all"
Requires-Dist: tiktoken>=0.7; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: pytest-mock>=3.14; extra == "dev"

# mcpbridge

A production-grade Python package that connects any LLM provider to any MCP server.
One config dict, one `await bridge.run()`, and the model can call MCP tools
autonomously until it has an answer.

mcpbridge handles the entire lifecycle: transport negotiation, MCP protocol
handshake, tool discovery, schema conversion, multi-turn tool calling, argument
validation, history management, and graceful error recovery -- so your
application code does not have to.

---

## Table of Contents

- [Why mcpbridge](#why-mcpbridge)
- [Installation](#installation)
- [Quickstart](#quickstart)
- [How It Works](#how-it-works)
- [Configuration Reference](#configuration-reference)
- [Supported LLM Providers](#supported-llm-providers)
- [MCP Transport Types](#mcp-transport-types)
- [MCP Server Authentication](#mcp-server-authentication)
- [Multi-Server Setup](#multi-server-setup)
- [System Prompt and Context Injection](#system-prompt-and-context-injection)
- [Session Manager (Multi-User / FastAPI)](#session-manager)
- [Callback Hooks](#callback-hooks)
- [Discovery Helpers](#discovery-helpers)
- [Custom LLM Adapter](#custom-llm-adapter)
- [Error Handling](#error-handling)
- [Best Practices](#best-practices)
- [LoopResult Reference](#loopresult-reference)
- [Model Aliases](#model-aliases)
- [Limitations and Roadmap](#limitations-and-roadmap)
- [Running Tests](#running-tests)
- [Related Links](#related-links)
- [License](#license)

---

## Why mcpbridge

The Model Context Protocol (MCP) defines how tools, resources, and prompts are
exposed over JSON-RPC. Every LLM provider has a different format for tool
definitions, tool call extraction, and tool result insertion. Writing the glue
between one provider and one MCP server is tedious. Writing it for eleven
providers and four transport types is a maintenance problem.

mcpbridge solves this by sitting in the middle:

```
Your app  -->  MCPBridge  -->  LLM provider  (tool definitions, chat calls)
                   |
                   +--------->  MCP server   (tool discovery, tool execution)
```

You describe the LLM and the MCP server in a config dict, and mcpbridge does
the rest: discovers tools, converts schemas, runs the agentic loop, validates
arguments, retries on failure, trims history, and returns a clean result.

---

## Installation

Install the base package (no LLM provider SDKs):

```bash
pip install mcpbridge
```

Install with a specific provider:

```bash
pip install mcpbridge[openai]
pip install mcpbridge[anthropic]
pip install mcpbridge[gemini]
pip install mcpbridge[groq]
pip install mcpbridge[mistral]
pip install mcpbridge[cohere]
pip install mcpbridge[ollama]
pip install mcpbridge[together]
pip install mcpbridge[bedrock]
pip install mcpbridge[azure]
```

Install everything:

```bash
pip install mcpbridge[all]
```

Requires Python 3.10 or later.

---

## Quickstart

```python
import asyncio
from mcpbridge import MCPBridge


async def main():
    bridge = await MCPBridge(
        {
            "llm": {
                "provider": "openai",
                "model": "gpt-4o",
                "api_key": "sk-...",
            },
            "mcp": {
                "url": "http://localhost:3000",
                "transport": "streamable_http",
            },
            "prompt": {
                "system": "You are a helpful assistant.",
            },
            "loop": {
                "max_iterations": 10,
            },
        }
    ).connect()

    result = await bridge.run("What is the weather in Tokyo?")
    print(result.text)
    print(result.finish_reason)   # "done" or "max_iterations"
    print(result.tool_calls_made) # list of tool calls with results

    await bridge.close()


if __name__ == "__main__":
    asyncio.run(main())
```

`MCPBridge` also works as an async context manager:

```python
async with MCPBridge(config) as bridge:
    result = await bridge.run("Hello")
```

---

## How It Works

1. **Connect** -- `MCPBridge.connect()` opens the MCP transport, performs the
   JSON-RPC `initialize` / `initialized` handshake, and discovers all tools,
   resources, and prompts exposed by the server.

2. **Build prompt** -- The `PromptBuilder` assembles a system prompt from
   internal instructions, the user-defined system prompt (with `{var}`
   interpolation), and auto-generated tool descriptions.

3. **Call the LLM** -- The adapter converts discovered MCP tools into the
   provider-specific schema format and sends them alongside the message history.

4. **Extract tool calls** -- If the LLM response contains tool calls, the
   adapter normalizes them into `ToolCall(id, name, arguments)` objects.

5. **Execute tools** -- Each tool call is validated against the JSON schema,
   dispatched to the correct MCP server transport, and timed out if it takes
   too long. Failed tool calls can be retried once.

6. **Append results** -- Tool results are inserted back into the message
   history in the provider-specific format, and the loop returns to step 3.

7. **Return** -- When the LLM produces a final answer (no tool calls), or when
   `max_iterations` is reached, the loop returns a `LoopResult` with the
   assistant text, the full tool call log, token estimates, and a
   `finish_reason`.

---

## Configuration Reference

The config is a plain Python dict with five top-level keys.

### `llm`

| Field | Type | Default | Description |
|---|---|---|---|
| `provider` | str | (required) | `anthropic`, `openai`, `gemini`, `mistral`, `cohere`, `groq`, `ollama`, `together`, `bedrock`, `azure_openai`, `openai_compatible` |
| `model` | str | `""` | Model name or short alias (see [Model Aliases](#model-aliases)) |
| `api_key` | str or null | null | Falls back to provider env var if not set |
| `base_url` | str or null | null | Override the provider API endpoint |
| `temperature` | float | `0.7` | Sampling temperature |
| `max_tokens` | int | `4096` | Maximum output tokens |
| `top_p` | float or null | null | Nucleus sampling |
| `top_k` | int or null | null | Top-k sampling (providers that support it) |
| `stop_sequences` | list | `[]` | Stop sequences |
| `stream` | bool | `false` | Accepted for forward-compatibility; falls back to non-streaming in this version |
| `timeout` | int | `60` | HTTP timeout in seconds for the LLM call |
| `extra_params` | dict | `{}` | Provider-specific parameters passed through verbatim |
| `thinking` | object | `{"enabled": false, "budget_tokens": 1024}` | Anthropic extended thinking |
| `azure_deployment` | str | `""` | Azure OpenAI deployment name |
| `azure_api_version` | str | `"2024-02-01"` | Azure API version |
| `aws_region` | str | `"us-east-1"` | Bedrock region |
| `aws_profile` | str | `""` | Bedrock named profile |

### `mcp`

| Field | Type | Default | Description |
|---|---|---|---|
| `url` | str or null | null | Remote MCP server URL |
| `transport` | str | `"auto"` | `auto`, `http`, `sse`, `streamable_http`, `ws`, `stdio` |
| `headers` | dict | `{}` | HTTP headers for authentication and other purposes |
| `timeout` | int | `30` | MCP transport timeout in seconds |
| `command` | str or null | null | Subprocess command for stdio transport |
| `args` | list | `[]` | Subprocess arguments for stdio transport |
| `env` | dict | `{}` | Environment variables passed to stdio subprocess |
| `servers` | list or null | null | Multi-server config; overrides top-level `url`/`command` |
| `namespace_strategy` | str | `"prefix"` | `prefix`, `error`, `last_wins` |

### `prompt`

| Field | Type | Default | Description |
|---|---|---|---|
| `system` | str | `""` | System prompt text |
| `interpolate` | bool | `true` | Enable `{variable}` interpolation in the system prompt |
| `context_vars` | dict | `{}` | Initial interpolation variables |
| `inject_tool_descriptions` | bool | `true` | Append tool descriptions to the system prompt |
| `inject_internal_instructions` | bool | `true` | Prepend internal tool-use rules to the system prompt |
| `user_prefix` | str | `""` | Prepended to every user query |
| `user_suffix` | str | `""` | Appended to every user query |

### `loop`

| Field | Type | Default | Description |
|---|---|---|---|
| `max_iterations` | int | `10` | Maximum tool-calling iterations before returning |
| `max_tokens_total` | int | `32000` | Best-effort total token budget for the loop |
| `tool_timeout` | int | `30` | Timeout in seconds for each MCP tool call |
| `parallel_tool_calls` | bool | `true` | Execute multiple tool calls concurrently |
| `on_tool_call` | callable or null | null | `async def on_tool_call(name, args)` |
| `on_tool_result` | callable or null | null | `async def on_tool_result(name, tool_result)` |
| `on_iteration` | callable or null | null | `async def on_iteration(iteration, messages)` |
| `retry_on_tool_error` | bool | `true` | Retry a failed tool call once |
| `error_strategy` | str | `"return_error_to_llm"` | `raise`, `return_error_to_llm`, `skip` |

### `session`

| Field | Type | Default | Description |
|---|---|---|---|
| `persist_history` | bool | `true` | Keep conversation history across runs |
| `max_history_tokens` | int | `32000` | Trim history when it exceeds this token count |
| `history_trim_strategy` | str | `"oldest_first"` | `oldest_first` or `summarize` |

---

## Supported LLM Providers

| Provider | Pip extra | Env var | Tool format |
|---|---|---|---|
| Anthropic Claude | `mcpbridge[anthropic]` | `ANTHROPIC_API_KEY` | `input_schema` content blocks |
| OpenAI | `mcpbridge[openai]` | `OPENAI_API_KEY` | OpenAI `tool_calls` |
| Google Gemini | `mcpbridge[gemini]` | `GOOGLE_API_KEY` | `function_declarations` |
| Mistral | `mcpbridge[mistral]` | `MISTRAL_API_KEY` | OpenAI-compatible |
| Cohere | `mcpbridge[cohere]` | `COHERE_API_KEY` | Flat `parameter_definitions` |
| Groq | `mcpbridge[groq]` | `GROQ_API_KEY` | OpenAI-compatible |
| Ollama | `mcpbridge[ollama]` | (none) | OpenAI-compatible, local |
| Together AI | `mcpbridge[together]` | `TOGETHER_API_KEY` | OpenAI-compatible |
| AWS Bedrock | `mcpbridge[bedrock]` | AWS credential chain | Converse API `toolSpec` |
| Azure OpenAI | `mcpbridge[azure]` | `AZURE_OPENAI_API_KEY` | OpenAI-compatible |
| OpenAI-compatible | (none) | `OPENAI_API_KEY` (optional) | Any `/chat/completions` endpoint |

If a provider SDK is not installed, mcpbridge raises an `ImportError` at adapter
init time with the exact `pip install` command needed.

---

## MCP Transport Types

| Transport | When to use | Config fields |
|---|---|---|
| `stdio` | Local MCP server as a subprocess | `command`, `args`, `env` |
| `streamable_http` | Preferred for remote MCP servers | `url`, `headers` |
| `http_sse` | Legacy HTTP + SSE servers | `url`, `headers` |
| `ws` / `websocket` | WebSocket MCP servers | `url`, `headers` |

**Auto-detection rules** (when `transport` is `"auto"`):

1. `command` is set -- stdio
2. URL starts with `ws://` or `wss://` -- websocket
3. URL ends with `/sse` -- http_sse
4. Otherwise -- streamable_http

---

## MCP Server Authentication

MCP servers may require authentication. mcpbridge handles this at the transport
layer, not inside JSON-RPC payloads.

**HTTP / SSE / Streamable HTTP / WebSocket** -- pass credentials via `mcp.headers`:

```python
"mcp": {
    "url": "https://secure-mcp-server.example.com",
    "transport": "streamable_http",
    "headers": {
        "Authorization": "Bearer YOUR_TOKEN",
    },
}
```

**stdio** -- pass credentials via environment variables to the subprocess:

```python
"mcp": {
    "command": "npx",
    "args": ["@some/mcp-server"],
    "env": {
        "MCP_API_KEY": "YOUR_TOKEN",
    },
}
```

For multi-server setups, each server entry supports its own `headers` and `env`.
Top-level `mcp.headers` are merged into every server's headers automatically.

---

## Multi-Server Setup

Connect to multiple MCP servers and let the LLM choose tools from any of them.
Tool names are automatically namespaced to prevent collisions.

```python
config = {
    "llm": {"provider": "groq", "model": "llama-3.3-70b-versatile"},
    "mcp": {
        "servers": [
            {
                "name": "files",
                "command": "npx",
                "args": ["@modelcontextprotocol/server-filesystem"],
                "transport": "stdio",
            },
            {
                "name": "db",
                "url": "http://localhost:8080",
                "transport": "streamable_http",
                "headers": {"Authorization": "Bearer DB_TOKEN"},
            },
        ],
        "namespace_strategy": "prefix",
    },
    "prompt": {"system": "Use tools when needed."},
    "loop": {"max_iterations": 10},
}
```

Namespace strategies:

- `prefix` (default) -- tool names become `servername__toolname`
- `error` -- raise `ConfigValidationError` on name collision
- `last_wins` -- later server overwrites earlier tools with the same name

You can also add and remove servers at runtime:

```python
await bridge.add_server("analytics", {"url": "http://localhost:9090"})
await bridge.remove_server("analytics")
```

---

## System Prompt and Context Injection

The system prompt supports `{variable}` interpolation. Variables can be set at
config time or at runtime.

```python
config = {
    "llm": {"provider": "openai", "model": "gpt-4o", "api_key": "sk-..."},
    "mcp": {"url": "http://localhost:3000", "transport": "streamable_http"},
    "prompt": {
        "system": "You are a translator. Translate to {language}.",
        "context_vars": {"language": "Hindi"},
    },
    "loop": {"max_iterations": 10},
}
```

At runtime:

```python
bridge.set_context(language="Japanese")
result = await bridge.run("Translate: Good morning")
```

---

## Session Manager

For multi-user applications (web servers, APIs), `SessionManager` pools one
`MCPBridge` instance per session and handles idle cleanup.

```python
from fastapi import FastAPI
from mcpbridge import SessionManager

app = FastAPI()

base_config = {
    "llm": {"provider": "groq", "model": "llama-3.3-70b-versatile"},
    "mcp": {"url": "http://localhost:3000", "transport": "streamable_http"},
    "prompt": {"system": "You are a helpful assistant."},
    "loop": {"max_iterations": 10},
    "session": {"persist_history": True, "max_history_tokens": 32000},
}
manager = SessionManager(base_config, max_sessions=1000)

@app.on_event("startup")
async def startup():
    await manager.start_cleanup_task(ttl_seconds=3600)

@app.post("/chat")
async def chat(session_id: str, message: str):
    bridge = await manager.get_or_create(session_id)
    result = await bridge.run(message)
    return {"text": result.text, "finish_reason": result.finish_reason}
```

Key methods:

- `get_or_create(session_id)` -- returns an existing or new connected bridge
- `destroy(session_id)` -- closes and removes a session
- `destroy_all()` -- shuts down all sessions
- `active_count()` -- number of active sessions
- `start_cleanup_task(ttl_seconds, interval_seconds)` -- background cleanup

---

## Callback Hooks

Monitor tool calls and loop iterations in real time.

```python
async def on_tool_call(name, args):
    print(f"Calling tool: {name} with {args}")

async def on_tool_result(name, result):
    print(f"Tool result: {name} -> {result.content} (error={result.is_error})")

async def on_iteration(iteration, messages):
    print(f"Loop iteration {iteration}")

config = {
    "llm": {"provider": "openai", "model": "gpt-4o", "api_key": "sk-..."},
    "mcp": {"url": "http://localhost:3000", "transport": "streamable_http"},
    "prompt": {"system": "Use tools when needed."},
    "loop": {
        "max_iterations": 10,
        "on_tool_call": on_tool_call,
        "on_tool_result": on_tool_result,
        "on_iteration": on_iteration,
    },
}
```

Callbacks can be synchronous or asynchronous. mcpbridge will await coroutines
automatically.

---

## Discovery Helpers

After connecting, you can inspect MCP server capabilities directly.

```python
async with MCPBridge(config) as bridge:
    tools = bridge.list_tools()
    for t in tools:
        print(t.namespaced_name, t.description)

    resources = await bridge.list_resources()
    content = await bridge.read_resource("file:///path/to/resource")

    prompts = await bridge.list_prompts()
    rendered = await bridge.get_prompt("prompt_name", {"arg": "value"})
```

---

## Custom LLM Adapter

To add a provider that mcpbridge does not support out of the box, subclass
`BaseLLMAdapter` and implement the required methods.

```python
from mcpbridge.adapters.base import BaseLLMAdapter, ToolCall, ToolResult


class MyAdapter(BaseLLMAdapter):
    @property
    def provider_name(self) -> str:
        return "my_provider"

    @property
    def supports_parallel_tool_calls(self) -> bool:
        return True

    @property
    def supports_system_prompt(self) -> bool:
        return True

    async def chat(self, messages, tools, system=None, **kwargs):
        # Call your provider and return the raw response.
        ...

    def extract_tool_calls(self, response) -> list[ToolCall]:
        # Parse tool calls from the raw response.
        ...

    def extract_text(self, response) -> str:
        # Extract the final assistant text.
        ...

    def is_done(self, response) -> bool:
        # True when there are no pending tool calls.
        ...

    def append_tool_results(self, messages, response, tool_results):
        # Insert tool results into the message history.
        ...

    def count_tokens(self, messages, system) -> int:
        # Best-effort token estimate for history trimming.
        ...
```

The adapter contract is simple: `append_tool_results()` must produce a message
history that `chat()` can accept on the next call. Everything else follows.

---

## Error Handling

mcpbridge defines a structured exception hierarchy. Every exception inherits
from `MCPBridgeError`, so you can catch broadly or narrowly.

**Configuration**

| Exception | When |
|---|---|
| `ConfigValidationError` | Invalid config dict |
| `ProviderNotFoundError` | Unknown `llm.provider` value |
| `APIKeyMissingError` | No API key found (config or env var) |

**Transport**

| Exception | When |
|---|---|
| `TransportConnectionError` | Cannot connect to MCP server |
| `TransportDisconnectedError` | Connection dropped mid-session |
| `TransportTimeoutError` | MCP request timed out |

**MCP Protocol**

| Exception | When |
|---|---|
| `MCPProtocolError` | Malformed JSON-RPC or unexpected server response |
| `MCPToolNotFoundError` | Tool name not in discovered registry |
| `MCPToolCallError` | MCP server returned an error for `tools/call` |
| `MCPSchemaValidationError` | Tool arguments failed JSON Schema validation |

**LLM**

| Exception | When |
|---|---|
| `LLMRateLimitError` | Provider returned 429 |
| `LLMAuthError` | Provider authentication failed |
| `LLMContextLengthError` | Context window exceeded |
| `ToolsNotSupportedError` | Model does not support tool calling |

**Loop**

| Exception | When |
|---|---|
| `LoopTimeoutError` | Total token budget exceeded |
| `BridgeInUseError` | Concurrent `run()` on the same bridge instance |
| `SessionLimitError` | `SessionManager` exceeded `max_sessions` |

Note: `max_iterations` no longer raises an exception. When the loop reaches the
configured limit, it returns a `LoopResult` with `finish_reason="max_iterations"`
and best-effort text (extracted from the last LLM response or the most recent
tool result). This prevents service crashes when a model gets stuck in a
tool-calling loop.

---

## Best Practices

**Keep `max_iterations` reasonable.** A value between 5 and 15 covers most
real-world tool-calling workflows. If the model consistently hits the limit,
the system prompt likely needs to be more explicit about when to stop calling
tools.

**Use `error_strategy: "return_error_to_llm"` in production.** This is the
default. When a tool call fails (schema mismatch, timeout, server error), the
error message is returned to the LLM as a tool result so it can decide how to
recover. The `"raise"` strategy is better for development and debugging.

**Set `persist_history: false` for stateless endpoints.** If each request is
independent, disable history to avoid unbounded memory growth. For
conversational use cases, set `max_history_tokens` to a value that fits within
your model's context window.

**Use `namespace_strategy: "prefix"` for multi-server setups.** This is the
default and prevents tool name collisions between servers. The LLM sees tool
names like `serverA__search` and `serverB__search` and can distinguish them.

**Authenticate MCP servers via `mcp.headers` or `mcp.env`.** Do not embed
credentials in URLs. For HTTP transports, use the `Authorization` header. For
stdio transports, pass tokens through environment variables.

**Use the `on_tool_call` and `on_tool_result` callbacks for logging and
observability.** They fire synchronously during the loop and give you full
visibility into what the model is doing without modifying the loop behavior.

**Check `result.finish_reason` after every `run()` call.** A value of `"done"`
means the model produced a final answer. A value of `"max_iterations"` means
the loop was capped before the model finished. Your application should handle
both cases.

**Prefer short model aliases for readability.** Instead of writing
`"claude-opus-4-20250514"`, write `"opus"`. mcpbridge resolves aliases
automatically (see [Model Aliases](#model-aliases)).

**Do not share a single `MCPBridge` instance across concurrent requests.**
Each `run()` call acquires an internal lock. For concurrent users, use
`SessionManager` which creates one bridge per session.

---

## LoopResult Reference

`bridge.run()` returns a `LoopResult` dataclass:

| Field | Type | Description |
|---|---|---|
| `text` | str | Final assistant text (or best-effort fallback) |
| `tool_calls_made` | list[dict] | Each entry: `{id, name, arguments, result, is_error}` |
| `iterations` | int | Number of tool-calling iterations completed |
| `total_input_tokens` | int | Best-effort input token estimate |
| `total_output_tokens` | int | Best-effort output token estimate |
| `finish_reason` | str | `"done"` or `"max_iterations"` |

---

## Model Aliases

Short aliases can be used in place of full model identifiers.

| Provider | Alias | Resolves to |
|---|---|---|
| anthropic | `opus` | `claude-opus-4-20250514` |
| anthropic | `sonnet` | `claude-sonnet-4-20250514` |
| anthropic | `haiku` | `claude-haiku-4-5-20251001` |
| openai | `gpt4o` | `gpt-4o` |
| openai | `gpt4o-mini` | `gpt-4o-mini` |
| openai | `o3` | `o3` |
| openai | `o3-mini` | `o3-mini` |
| openai | `o4-mini` | `o4-mini` |
| gemini | `flash` | `gemini-1.5-flash` |
| gemini | `flash2` | `gemini-2.0-flash` |
| gemini | `pro` | `gemini-1.5-pro` |
| gemini | `pro2` | `gemini-2.0-pro` |
| mistral | `large` | `mistral-large-latest` |
| mistral | `small` | `mistral-small-latest` |
| groq | `llama3` | `llama-3.3-70b-versatile` |
| groq | `llama3-small` | `llama-3.1-8b-instant` |
| cohere | `command` | `command-r-plus` |
| together | `llama3` | `meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo` |

If the model string is not recognized as an alias, it is passed through as-is.

---

## Limitations and Roadmap

- **Token streaming is not implemented.** The `stream` parameter is accepted
  and will not raise an error, but the loop runs in non-streaming mode. A
  future version will add incremental token delivery via async generators.

- **LLM-level retry/backoff is not built in.** If the LLM provider returns
  429 or 5xx, the adapter raises immediately. Rate-limit retry logic should be
  handled at the application level or in a custom adapter.

- **History trimming is best-effort.** Token estimates use `tiktoken` when
  available (OpenAI family) and fall back to a rough `chars / 4` heuristic for
  other providers.

---

## Running Tests

```bash
pip install mcpbridge[dev]
pytest -q
```

The test suite uses fakes and mocks for all external dependencies (LLM
providers, MCP servers). No API keys or running servers are needed.

---

## Related Links

**MCP Specification**

- Architecture: https://modelcontextprotocol.io/docs/concepts/architecture
- Transports: https://modelcontextprotocol.io/docs/concepts/transports
- Tools: https://modelcontextprotocol.io/docs/concepts/tools
- Resources: https://modelcontextprotocol.io/docs/concepts/resources
- Prompts: https://modelcontextprotocol.io/docs/concepts/prompts
- Full spec: https://spec.modelcontextprotocol.io/specification/2024-11-05/

**LLM Provider Documentation**

- Anthropic Messages API: https://docs.anthropic.com/en/api/messages
- Anthropic Tool Use: https://docs.anthropic.com/en/docs/build-with-claude/tool-use
- OpenAI Chat Completions: https://platform.openai.com/docs/api-reference/chat/create
- OpenAI Function Calling: https://platform.openai.com/docs/guides/function-calling
- Google Gemini: https://ai.google.dev/api/generate-content
- Gemini Function Calling: https://ai.google.dev/gemini-api/docs/function-calling
- Mistral Chat: https://docs.mistral.ai/api/#tag/chat
- Mistral Function Calling: https://docs.mistral.ai/capabilities/function_calling/
- Cohere Chat: https://docs.cohere.com/reference/chat
- Cohere Tool Use: https://docs.cohere.com/docs/tool-use
- Groq: https://console.groq.com/docs/openai
- Groq Tool Use: https://console.groq.com/docs/tool-use
- Ollama API: https://github.com/ollama/ollama/blob/main/docs/api.md
- Together AI: https://docs.together.ai/docs/chat-overview
- AWS Bedrock Converse: https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html
- Azure OpenAI: https://learn.microsoft.com/en-us/azure/ai-services/openai/reference

---

## License

MIT
