Metadata-Version: 2.4
Name: chuk-llm
Version: 0.12
Summary: A unified, production-ready Python library for Large Language Model (LLM) providers with real-time streaming, function calling, middleware support, automatic session tracking, dynamic model discovery, and intelligent system prompt generation.
Author-email: Chris Hay <chrishayuk@somejunkmailbox.com>
Maintainer-email: Chris Hay <chrishayuk@somejunkmailbox.com>
License: MIT
Project-URL: Homepage, https://github.com/chrishayuk/chuk-llm
Project-URL: Documentation, https://github.com/chrishayuk/chuk-llm#readme
Project-URL: Repository, https://github.com/chrishayuk/chuk-llm.git
Project-URL: Issues, https://github.com/chrishayuk/chuk-llm/issues
Project-URL: Changelog, https://github.com/chrishayuk/chuk-llm/releases
Keywords: llm,ai,openai,anthropic,claude,gpt,gemini,ollama,streaming,async,machine-learning
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: aiohttp>=3.12.15
Requires-Dist: anthropic>=0.62.0
Requires-Dist: asyncio>=4.0.0
Requires-Dist: chuk-ai-session-manager>=0.7
Requires-Dist: google-genai>=1.29.0
Requires-Dist: groq>=0.25.0
Requires-Dist: httpx>=0.28.1
Requires-Dist: ibm-watsonx-ai>=1.3.30
Requires-Dist: jinja2>=3.1.6
Requires-Dist: mistralai>=1.9.3
Requires-Dist: ollama>=0.5.3
Requires-Dist: openai>=1.79.0
Requires-Dist: python-dotenv>=1.1.0
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: tiktoken>=0.11.0
Requires-Dist: transformers>=4.53.2
Provides-Extra: redis
Requires-Dist: chuk-ai-session-manager[redis]>=0.7; extra == "redis"
Provides-Extra: watsonx
Requires-Dist: ibm-watsonx-ai; extra == "watsonx"
Requires-Dist: jinja2>=3.1.6; extra == "watsonx"
Requires-Dist: transformers>=4.53.2; extra == "watsonx"
Provides-Extra: cli
Requires-Dist: rich>=14.0.0; extra == "cli"
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.26.0; extra == "dev"
Requires-Dist: pytest>=8.3.5; extra == "dev"
Requires-Dist: pytest-cov>=6.1.1; extra == "dev"
Requires-Dist: rich>=14.0.0; extra == "dev"
Provides-Extra: all
Requires-Dist: chuk-ai-session-manager[redis]>=0.7; extra == "all"
Requires-Dist: rich>=14.0.0; extra == "all"
Dynamic: license-file

# chuk-llm

**One library, all LLMs.** Production-ready Python library with automatic model discovery, real-time streaming, and zero-config session tracking.

```python
from chuk_llm import quick_question
print(quick_question("What is 2+2?"))  # "2 + 2 equals 4."
```

## Why chuk-llm?

- **🚀 Instant Setup**: Works out of the box with any LLM provider
- **🔍 Auto-Discovery**: Detects new models automatically (especially Ollama)
- **⚡ 5-7x Faster**: Groq achieves 526 tokens/sec vs OpenAI's 68 tokens/sec
- **📊 Built-in Analytics**: Automatic cost and usage tracking
- **🎯 Developer-First**: Clean API, great CLI, sensible defaults

## Quick Start

### Installation

```bash
# Core functionality
pip install chuk_llm

# Or with extras
pip install chuk_llm[redis]  # Persistent sessions
pip install chuk_llm[cli]    # Enhanced CLI experience
pip install chuk_llm[all]    # Everything
```

### Basic Usage

```python
# Simplest approach - auto-detects available providers
from chuk_llm import quick_question
answer = quick_question("Explain quantum computing in one sentence")

# Provider-specific (auto-generated functions!)
from chuk_llm import ask_openai_sync, ask_claude_sync, ask_ollama_llama3_2_sync

response = ask_openai_sync("Tell me a joke")
response = ask_claude_sync("Write a haiku")
response = ask_ollama_llama3_2_sync("Explain Python")  # Auto-discovered!
```

### Async & Streaming

```python
import asyncio
from chuk_llm import ask, stream

async def main():
    # Async call
    response = await ask("What's the capital of France?")
    
    # Real-time streaming
    async for chunk in stream("Write a story"):
        print(chunk, end="", flush=True)

asyncio.run(main())
```

### CLI Usage

```bash
# Quick commands with global aliases
chuk-llm ask_gpt "What is Python?"
chuk-llm ask_claude "Explain quantum computing"

# Auto-discovered Ollama models work instantly
chuk-llm ask_ollama_gemma3 "Hello world"
chuk-llm stream_ollama_mistral "Write a long story"

# Discover new models
chuk-llm discover ollama
```

## Key Features

### 🔍 Automatic Model Discovery

Pull new Ollama models and use them immediately - no configuration needed:

```bash
# Terminal 1: Pull a new model
ollama pull llama3.2
ollama pull mistral-small:latest

# Terminal 2: Use immediately in Python
from chuk_llm import ask_ollama_llama3_2_sync, ask_ollama_mistral_small_latest_sync
response = ask_ollama_llama3_2_sync("Hello!")

# Or via CLI
chuk-llm ask_ollama_mistral_small_latest "Tell me a joke"
```

### 📊 Automatic Session Tracking

Every call is automatically tracked for analytics:

```python
from chuk_llm import ask_sync, get_session_stats

ask_sync("What's the capital of France?")
ask_sync("What's 2+2?")

stats = get_session_stats()
print(f"Total cost: ${stats['estimated_cost']:.6f}")
print(f"Total tokens: {stats['total_tokens']}")
```

### 🎭 Stateful Conversations

Build conversational AI with memory:

```python
from chuk_llm import conversation

async with conversation() as chat:
    await chat.ask("My name is Alice")
    response = await chat.ask("What's my name?")
    # AI responds: "Your name is Alice"
```

### ⚡ Concurrent Execution

Run multiple queries in parallel for massive speedups:

```python
import asyncio
from chuk_llm import ask

# 3-7x faster than sequential!
responses = await asyncio.gather(
    ask("What is AI?"),
    ask("Capital of Japan?"),
    ask("Meaning of life?")
)
```

## Supported Providers

| Provider | Models | Special Features |
|----------|--------|-----------------|
| **OpenAI** | GPT-4o, GPT-4o-mini, GPT-3.5 | Industry standard |
| **Azure OpenAI** | GPT-4o, GPT-3.5 (Enterprise) | SOC2, HIPAA compliant, VNet |
| **Anthropic** | Claude 3.5 Sonnet, Haiku | Advanced reasoning |
| **Google** | Gemini 2.0 Flash, 1.5 Pro | Multimodal support |
| **Groq** | Llama 3.3, Mixtral | Ultra-fast (526 tokens/sec) |
| **Ollama** | Any local model | Auto-discovery, offline |
| **IBM watsonx** | Granite 3.3, Llama 4 | Enterprise features |
| **Perplexity** | Sonar models | Real-time web search |
| **Mistral** | Large, Medium, Small | European sovereignty |

## Configuration

### Environment Variables

```bash
# API Keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export AZURE_OPENAI_API_KEY="..."
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com"

# Session Storage (optional)
export SESSION_PROVIDER=redis  # Default: memory
export SESSION_REDIS_URL=redis://localhost:6379/0

# Discovery Settings
export CHUK_LLM_AUTO_DISCOVER=true  # Auto-discover new models
```

### Python Configuration

```python
from chuk_llm import configure

configure(
    provider="azure_openai",
    model="gpt-4o-mini",
    temperature=0.7
)

# All subsequent calls use these settings
response = ask_sync("Hello!")
```

## Advanced Features

<details>
<summary><b>🛠️ Function Calling</b></summary>

```python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            }
        }
    }
}]

response = ask_sync("What's the weather in Paris?", tools=tools)
```
</details>

<details>
<summary><b>🌳 Conversation Branching</b></summary>

```python
async with conversation() as chat:
    await chat.ask("Planning a vacation")
    
    # Explore different options
    async with chat.branch() as japan_branch:
        await japan_branch.ask("Tell me about Japan")
    
    async with chat.branch() as italy_branch:
        await italy_branch.ask("Tell me about Italy")
    
    # Main conversation unaffected by branches
    await chat.ask("I'll go with Japan!")
```
</details>

<details>
<summary><b>📈 Provider Comparison</b></summary>

```python
from chuk_llm import compare_providers

results = compare_providers(
    "Explain quantum computing",
    ["openai", "anthropic", "groq", "ollama"]
)

for provider, response in results.items():
    print(f"{provider}: {response[:100]}...")
```
</details>

<details>
<summary><b>🎯 Intelligent System Prompts</b></summary>

ChukLLM automatically generates optimized system prompts based on provider capabilities:

```python
# Each provider gets optimized prompts
response = ask_claude_sync("Help me code", tools=tools)
# Claude gets: "You are Claude, an AI assistant created by Anthropic..."

response = ask_openai_sync("Help me code", tools=tools)  
# OpenAI gets: "You are a helpful assistant with function calling..."
```
</details>

## CLI Commands

```bash
# Quick access to any model
chuk-llm ask_gpt "Your question"
chuk-llm ask_claude "Your question"
chuk-llm ask_ollama_llama3_2 "Your question"

# Discover and test
chuk-llm discover ollama        # Find new models
chuk-llm test azure_openai      # Test connection
chuk-llm providers              # List all providers
chuk-llm models ollama          # Show available models
chuk-llm functions              # List all generated functions

# Advanced usage
chuk-llm ask "Question" --provider azure_openai --model gpt-4o-mini --json
chuk-llm ask "Question" --stream --verbose

# Zero-install with uvx
uvx chuk-llm ask_claude "Hello world"
```

## Performance

ChukLLM is designed for production use with:

- **Connection pooling** for efficient HTTP management
- **Automatic retries** with exponential backoff
- **Concurrent execution** for parallel processing
- **Smart caching** for discovered models
- **Zero-overhead** session tracking (can be disabled)

### Real Benchmark Results

```bash
# Run benchmarks yourself
uv run benchmarks/llm_benchmark.py

# Results show:
# - Groq: 526 tokens/sec, 0.15s first token
# - OpenAI: 68 tokens/sec, 0.58s first token
```

## Documentation

- 📚 [Full Documentation](https://github.com/chrishayuk/chuk-llm/wiki)
- 🎯 [Examples](https://github.com/chrishayuk/chuk-llm/tree/main/examples)
- 🔄 [Migration Guide](https://github.com/chrishayuk/chuk-llm/wiki/migration)
- 📊 [Benchmarks](https://github.com/chrishayuk/chuk-llm/wiki/benchmarks)
- 🤝 [Contributing](https://github.com/chrishayuk/chuk-llm/blob/main/CONTRIBUTING.md)

## Quick Comparison

| Feature | chuk-llm | LangChain | LiteLLM | OpenAI SDK |
|---------|----------|-----------|---------|------------|
| Auto-discovery | ✅ | ❌ | ❌ | ❌ |
| Native streaming | ✅ | ⚠️ | ✅ | ✅ |
| Session tracking | ✅ Built-in | ⚠️ Manual | ❌ | ❌ |
| CLI included | ✅ | ❌ | ⚠️ Basic | ❌ |
| Provider functions | ✅ Auto-generated | ❌ | ❌ | ❌ |
| Conversations | ✅ | ✅ | ❌ | ⚠️ Manual |
| Setup complexity | Simple | Complex | Simple | Simple |
| Dependencies | Minimal | Heavy | Moderate | Minimal |

## Installation Options

| Command | Features | Use Case |
|---------|----------|----------|
| `pip install chuk_llm` | Core + Session tracking | Development |
| `pip install chuk_llm[redis]` | + Redis persistence | Production |
| `pip install chuk_llm[cli]` | + Rich CLI formatting | CLI tools |
| `pip install chuk_llm[all]` | Everything | Full features |

## License

MIT License - see [LICENSE](LICENSE) file for details.

## Support

- 🐛 [Issues](https://github.com/chrishayuk/chuk-llm/issues)
- 💬 [Discussions](https://github.com/chrishayuk/chuk-llm/discussions)
- 📧 [Email](mailto:chrishayuk@somejunkmailbox.com)

---

**Built with ❤️ for developers who just want their LLMs to work.**
