Metadata-Version: 2.4
Name: flux-llm-kai
Version: 0.1.0
Summary: A functional programming LLM client with tool calling
Author-email: kai4avaya <kai4avaya@example.com>
Project-URL: Homepage, https://github.com/kai4avaya/flux--llm
Project-URL: Repository, https://github.com/kai4avaya/flux--llm
Project-URL: Issues, https://github.com/kai4avaya/flux--llm/issues
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: httpx
Requires-Dist: pydantic
Requires-Dist: asyncio
Requires-Dist: typing

# LLM Client - Enhanced Functional Programming Architecture

A functional-programming approach to LLM orchestration with tool calling, provider routing, and automatic failover.

## Architecture Overview

The LLM Client follows a **functional programming architecture** with clear separation of concerns:

```
┌─────────────────────────────────────────────────────────────┐
│                    LLM Client Architecture                  │
├─────────────────────────────────────────────────────────────┤
│  Core Layer (orchestrator_v2.py)                           │
│  ├── Data Structures (LLMRequest, LLMResponse, etc.)       │
│  ├── Tool Functions (execute_tool_call, get_available_tools)│
│  ├── Router Service (ProviderRouter, RoutingStrategy)      │
│  └── Core Orchestrator Functions (4 main functions)        │
├─────────────────────────────────────────────────────────────┤
│  Integration Layer (__init__.py)                           │
│  ├── Low-level Functions (requests.py, serialization.py)   │
│  ├── Tool System (tools/)                                  │
│  └── Public API (6 core functions)                         │
├─────────────────────────────────────────────────────────────┤
│  Provider Layer (requests.py, serialization.py)            │
│  ├── HTTP Request Building                                 │
│  ├── Response Parsing (OpenAI, Google formats)             │
│  └── Streaming Support                                     │
└─────────────────────────────────────────────────────────────┘
```

## Key Design Principles

### 1. **Functional Programming**
- **Pure Functions**: No side effects, predictable behavior
- **Composability**: Functions can be easily combined and pipelined
- **Immutability**: Data structures are immutable, no hidden state
- **Stateless**: No instance variables to manage
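
The principles above can be illustrated with a small sketch. It is not the library's code: `Request`, `with_system_prompt`, and `with_temperature` are hypothetical names, and the request is declared `frozen=True` here purely to demonstrate immutability. Each helper returns a new value instead of modifying its input, so helpers can be chained freely.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Request:
    """Hypothetical immutable request, not the library's LLMRequest."""
    messages: Tuple[Tuple[str, str], ...]  # (role, content) pairs
    temperature: float = 0.7
    max_tokens: Optional[int] = None


def with_system_prompt(request: Request, prompt: str) -> Request:
    """Return a new request with a system message prepended."""
    return replace(request, messages=(("system", prompt),) + request.messages)


def with_temperature(request: Request, temperature: float) -> Request:
    """Return a new request with a different temperature."""
    return replace(request, temperature=temperature)


base = Request(messages=(("user", "Hello!"),))
# Compose two pure transformations; `base` is left untouched.
derived = with_temperature(with_system_prompt(base, "Be brief."), 0.2)
```

Because `base` never changes, the same request can safely be reused across providers or retries.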

### 2. **Automatic Tool Detection**
- Tools are automatically detected from `ToolMeta.registry`
- No manual tool management required
- Smart inclusion based on `tools_enabled` flag
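
One way such a registry could work is sketched below. This is an assumption about the mechanism, not the actual `ToolMeta` implementation: `REGISTRY`, `tool`, and `available_tools` are hypothetical names, and the description format is invented for illustration.

```python
import inspect
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for ToolMeta.registry: tool name -> callable.
REGISTRY: Dict[str, Callable[..., Any]] = {}


def tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name and return it unchanged."""
    REGISTRY[func.__name__] = func
    return func


def available_tools(tools_enabled: bool) -> List[Dict[str, Any]]:
    """Describe every registered tool, or nothing when tools are disabled."""
    if not tools_enabled:
        return []
    return [
        {
            "name": name,
            "description": inspect.getdoc(func) or "",
            "parameters": list(inspect.signature(func).parameters),
        }
        for name, func in REGISTRY.items()
    ]


@tool
def get_time(timezone: str) -> str:
    """Return the current time in a timezone (stubbed)."""
    return f"12:00 in {timezone}"
```

Decorating a function is the only registration step; the `tools_enabled` flag then decides whether the descriptions are attached to a request.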

### 3. **Multiple Routing Strategies**
- **PRIORITY**: Try providers in priority order (default)
- **RANDOM**: Start with a random provider, then fall back to priority order
- **CYCLE**: Cycle through providers continuously
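
The following sketch shows how the three strategies might translate into an attempt order. `Strategy` and `attempt_order` are hypothetical names, and treating `max_cycles` as a repeat count is an assumption about how cycling is bounded.

```python
import random
from enum import Enum
from typing import List, Optional


class Strategy(Enum):
    """Hypothetical mirror of the three strategies named above."""
    PRIORITY = "priority"
    RANDOM = "random"
    CYCLE = "cycle"


def attempt_order(
    providers: List[str],
    strategy: Strategy,
    max_cycles: int = 1,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return the providers to try, in order; `providers` is in priority order."""
    if strategy is Strategy.PRIORITY:
        return list(providers)
    if strategy is Strategy.RANDOM:
        first = (rng or random).choice(providers)
        # Random first pick, then the remaining providers in priority order.
        return [first] + [p for p in providers if p != first]
    # CYCLE: repeat the priority order up to max_cycles times.
    return list(providers) * max_cycles
```

Passing a seeded `random.Random` as `rng` makes the RANDOM strategy reproducible in tests.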

### 4. **Proper Tool Call Parsing**
- **OpenAI Format**: Parses `tool_calls` with `function.name` and `function.arguments`
- **Google Format**: Parses `function_call` with `name` and `args`
- **Automatic Detection**: The parsing format is selected automatically based on the provider
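
A minimal sketch of normalizing both formats into one shape follows. The function names and the `(name, arguments)` tuple are hypothetical, and choosing the parser by a `google` name prefix is an assumption. The sketch relies only on the fields named above: OpenAI-style `function.arguments` is a JSON-encoded string, while Google-style `args` is already an object.

```python
import json
from typing import Any, Dict, List, Tuple

ToolCall = Tuple[str, Dict[str, Any]]  # (tool name, parsed arguments)


def parse_openai(message: Dict[str, Any]) -> List[ToolCall]:
    """OpenAI-style: arguments arrive as a JSON-encoded string."""
    calls = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        calls.append((function["name"], json.loads(function["arguments"] or "{}")))
    return calls


def parse_google(part: Dict[str, Any]) -> List[ToolCall]:
    """Google-style: arguments arrive as an already-parsed object."""
    call = part.get("function_call")
    return [(call["name"], dict(call.get("args") or {}))] if call else []


def parse_tool_calls(provider: str, payload: Dict[str, Any]) -> List[ToolCall]:
    """Pick the parser from the provider name (assumed naming convention)."""
    parser = parse_google if provider.startswith("google") else parse_openai
    return parser(payload)
```

Downstream code can then execute tool calls without knowing which provider produced them.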

## Core Components

### Data Structures

```python
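from dataclasses import dataclass
from typing import Any, Dict, List, Optional

@dataclass
class LLMRequest:
    messages: List[Dict[str, str]]
    model: Optional[str] = None
    temperature: float = 0.7
    max_tokens: Optional[int] = None
    tools_enabled: bool = True
    metadata: Optional[Dict[str, Any]] = None  # None means no metadata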

@dataclass
class ProviderConfig:
    name: str
    priority: int
    status: ProviderStatus = ProviderStatus.AVAILABLE
    retry_count: int = 0
    max_retries: int = 3
    backoff_seconds: int = 60
```

### Core Functions (Reduced Set)

#### 1. **stream_llm_response()**
Stream from a single provider with proper tool call parsing.

```python
async for chunk_type, content in stream_llm_response(request, "openai"):
    if chunk_type == "t":  # text
        print(content)
    elif chunk_type == "f":  # function call
        print(f"Function: {content}")
```
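
To show how the `(chunk_type, content)` pairs can be consumed, here is a self-contained sketch that uses a fake stream in place of a real provider. `fake_stream` and `collect` are hypothetical helpers, and the dictionary shape of the function-call chunk is an assumption.

```python
import asyncio
from typing import Any, AsyncIterator, List, Tuple


async def fake_stream() -> AsyncIterator[Tuple[str, Any]]:
    """Stand-in for stream_llm_response(); yields (chunk_type, content) pairs."""
    yield "t", "The answer "
    yield "f", {"name": "calculator", "args": {"expression": "5 * 7"}}
    yield "t", "is 35."


async def collect(stream: AsyncIterator[Tuple[str, Any]]) -> Tuple[str, List[Any]]:
    """Join text chunks and gather function-call chunks separately."""
    text_parts: List[str] = []
    calls: List[Any] = []
    async for chunk_type, content in stream:
        if chunk_type == "t":
            text_parts.append(content)
        elif chunk_type == "f":
            calls.append(content)
    return "".join(text_parts), calls


text, calls = asyncio.run(collect(fake_stream()))
```

The same `collect` helper should work with any async iterator that yields two-element tuples in this form.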

#### 2. **stream_with_router()**
Stream with automatic failover and routing strategies.

```python
router = create_router(["openai", "google_gemini"], strategy=RoutingStrategy.PRIORITY)
async for chunk_type, content, provider in stream_with_router(request, router):
    print(f"[{provider}]: {content}")
```

#### 3. **chat_with_tools()**
Single provider chat with automatic tool execution.

```python
async for chunk_type, content in chat_with_tools(request, "openai", max_iterations=5):
    print(content)
```

#### 4. **chat_with_tools_and_router()**
Multi-provider chat with tools and failover.

```python
async for chunk_type, content, provider in chat_with_tools_and_router(request, router):
    print(f"[{provider}]: {content}")
```

### Convenience Functions

#### 1. **quick_chat()**
Simple single-provider chat.

```python
response = await quick_chat("Hello!", "openai", tools_enabled=True)
```

#### 2. **quick_chat_with_router()**
Simple chat with routing strategies.

```python
response, provider = await quick_chat_with_router(
    "Hello!", 
    ["openai", "google_gemini"],
    strategy=RoutingStrategy.RANDOM
)
```

## Routing Strategies

### PRIORITY Strategy (Default)
```python
# Try providers in priority order: openai -> google_gemini -> anthropic
router = create_router(
    ["openai", "google_gemini", "anthropic"],
    strategy=RoutingStrategy.PRIORITY
)
```

### RANDOM Strategy
```python
# Start with random provider, then follow priority order
router = create_router(
    ["openai", "google_gemini", "anthropic"],
    strategy=RoutingStrategy.RANDOM
)
```

### CYCLE Strategy
```python
# Cycle through providers continuously
router = create_router(
    ["openai", "google_gemini", "anthropic"],
    strategy=RoutingStrategy.CYCLE
)

# Use with max_cycles parameter
async for chunk_type, content, provider in stream_with_router(request, router, max_cycles=3):
    print(f"[{provider}]: {content}")
```

## Tool System Integration

### Tool Registration
```python
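import ast
import operator
from utils.llm_client import Tool

# Whitelisted binary operators (+ - * /); any other syntax is rejected.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def _safe_eval(node):
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_safe_eval(node.left), _safe_eval(node.right))
    raise ValueError("unsupported expression")

@Tool
def calculator(expression: str) -> str:
    """Calculate a basic arithmetic expression without calling eval()."""
    try:
        return f"Result: {_safe_eval(ast.parse(expression, mode='eval').body)}"
    except Exception as e:
        return f"Error: {e}"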

### Automatic Tool Detection
```python
# Tools are automatically detected and included
request = LLMRequest(
    messages=[{"role": "user", "content": "What's 5 * 7?"}],
    tools_enabled=True  # Automatically includes available tools
)
```

### Tool Call Execution
```python
# Tool calls are automatically executed and results fed back to LLM
async for chunk_type, content in chat_with_tools(request, "openai"):
    print(content)  # Includes both LLM response and tool results
```
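
The execute-and-feed-back cycle can be sketched as a simple loop. This is a synchronous illustration under stated assumptions, not the library's `chat_with_tools`: `run_tool_loop`, `scripted_model`, and the `"tool"` message role are hypothetical, and a scripted function stands in for the LLM.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Message = Dict[str, str]
# A model step returns (text, tool_call); tool_call is None when the model is done.
ModelStep = Callable[[List[Message]], Tuple[str, Optional[Tuple[str, Dict[str, Any]]]]]


def run_tool_loop(
    messages: List[Message],
    model: ModelStep,
    tools: Dict[str, Callable[..., str]],
    max_iterations: int = 5,
) -> List[Message]:
    """Call the model, run any requested tool, feed the result back, repeat."""
    history = list(messages)  # copy so the caller's list is not mutated
    for _ in range(max_iterations):
        text, tool_call = model(history)
        if tool_call is None:
            history.append({"role": "assistant", "content": text})
            break
        name, args = tool_call
        history.append({"role": "tool", "content": tools[name](**args)})
    return history


def scripted_model(history: List[Message]):
    """Fake model: asks for the tool once, then answers from its result."""
    if history[-1]["role"] == "tool":
        return f"The answer is {history[-1]['content']}.", None
    return "", ("multiply", {"a": 5, "b": 7})


result = run_tool_loop(
    [{"role": "user", "content": "What's 5 * 7?"}],
    scripted_model,
    {"multiply": lambda a, b: str(a * b)},
)
```

The `max_iterations` bound keeps a model that requests tools indefinitely from looping forever.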

## Provider Support

### Supported Providers
- **OpenAI** (GPT models)
- **Google Gemini** (Gemini models)
- **Anthropic** (Claude models)
- **OpenRouter** (Multiple models)

### Provider Configuration
```python
# Custom provider configurations
providers = [
    ProviderConfig(
        name="openai",
        priority=0,
        status=ProviderStatus.AVAILABLE,
        max_retries=2,
        backoff_seconds=30
    ),
    ProviderConfig(
        name="google_gemini",
        priority=1,
        status=ProviderStatus.AVAILABLE,
        max_retries=3,
        backoff_seconds=60
    )
]

router = ProviderRouter(providers, RoutingStrategy.PRIORITY)
```

## Error Handling & Resilience

### Automatic Failover
- If a provider fails, automatically try the next available provider
- Configurable retry limits and backoff periods
- Status tracking for each provider
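
The failover behaviour described above might look roughly like the sketch below. `Provider`, `call_with_failover`, and `flaky_send` are hypothetical, and tracking availability with a single `unavailable_until` timestamp is an assumption, not the library's `ProviderStatus` mechanism.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Provider:
    """Hypothetical provider state, loosely modelled on ProviderConfig."""
    name: str
    priority: int
    backoff_seconds: float = 60
    unavailable_until: float = 0.0  # monotonic timestamp; 0 means available


def call_with_failover(
    providers: List[Provider],
    send: Callable[[str], str],
    now: Callable[[], float] = time.monotonic,
) -> Tuple[Optional[str], Optional[str]]:
    """Try providers in priority order, skipping any still in backoff."""
    for provider in sorted(providers, key=lambda p: p.priority):
        if now() < provider.unavailable_until:
            continue  # still cooling down after an earlier failure
        try:
            return send(provider.name), provider.name
        except Exception:
            # Mark the provider unavailable and move on to the next one.
            provider.unavailable_until = now() + provider.backoff_seconds
    return None, None


def flaky_send(name: str) -> str:
    """Fake transport: the top-priority provider always fails."""
    if name == "openai":
        raise RuntimeError("rate limited")
    return f"hello from {name}"


providers = [Provider("openai", priority=0), Provider("google_gemini", priority=1)]
response, used = call_with_failover(providers, flaky_send)
```

After the failure, `openai` is skipped until its backoff window has elapsed, so later calls go straight to the next provider.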

### Tool Call Error Handling
- Graceful handling of malformed tool calls
- Error messages returned to LLM for context
- Exception handling for tool execution failures
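
A sketch of this kind of defensive execution is shown below. `execute_tool_safely` is a hypothetical helper, not the library's `execute_tool_call`, and the error message wording is invented. Every failure mode becomes a string, so the result can always be passed back to the model.

```python
import json
from typing import Any, Callable, Dict


def execute_tool_safely(
    tools: Dict[str, Callable[..., Any]], name: str, raw_arguments: str
) -> str:
    """Run a tool call and always return a string the model can read."""
    if name not in tools:
        return f"Error: unknown tool '{name}'"
    try:
        arguments = json.loads(raw_arguments or "{}")
    except json.JSONDecodeError as exc:
        return f"Error: malformed arguments ({exc.msg})"
    if not isinstance(arguments, dict):
        return "Error: arguments must be a JSON object"
    try:
        return str(tools[name](**arguments))
    except Exception as exc:  # a failing tool must not crash the chat loop
        return f"Error: {type(exc).__name__}: {exc}"
```

Returning the error text instead of raising gives the model a chance to correct its arguments on the next turn.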

### Retry Logic
```python
# Custom retry configuration
provider = ProviderConfig(
    name="openai",
    priority=0,  # required field: ProviderConfig defines no default priority
    max_retries=3,
    backoff_seconds=60  # Wait 60 seconds before retry
)
```

## Usage Examples

### Basic Usage
```python
import asyncio

from utils.llm_client import quick_chat, quick_chat_with_router, RoutingStrategy

async def main():
    # Simple chat
    response = await quick_chat("Hello!", "openai")
    print(response)

    # Chat with failover
    response, provider = await quick_chat_with_router(
        "Hello!",
        ["openai", "google_gemini"]
    )
    print(f"[{provider}]: {response}")

asyncio.run(main())

### Advanced Usage
```python
import asyncio

from utils.llm_client import (
    LLMRequest, create_router, stream_with_router,
    RoutingStrategy, Tool
)

@Tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"Weather in {city}: Sunny, 72°F"

# Create request
request = LLMRequest(
    messages=[{"role": "user", "content": "What's the weather in NYC?"}],
    tools_enabled=True
)

# Create router
router = create_router(
    ["openai", "google_gemini"],
    strategy=RoutingStrategy.RANDOM
)

async def main():
    # Stream with failover
    async for chunk_type, content, provider in stream_with_router(request, router):
        if chunk_type == "t":
            print(f"[{provider}]: {content}")

asyncio.run(main())

### Custom Pipeline
```python
def process_chunk(content):
    """Placeholder for your own per-chunk processing; returns content unchanged."""
    return content

# Create custom pipeline with middleware
async def my_pipeline(request):
    router = create_router(["openai", "google_gemini"])

    async for chunk_type, content, provider in stream_with_router(request, router):
        # Custom processing
        processed_content = process_chunk(content)
        yield chunk_type, processed_content, provider

# Use the pipeline (from inside a coroutine)
async for chunk_type, content, provider in my_pipeline(request):
    print(f"[{provider}]: {content}")

## Benefits

1. **Simplified API**: Only 6 public functions (4 core, 2 convenience)
2. **Automatic Detection**: Tools are automatically detected and included
3. **Flexible Routing**: Multiple routing strategies for different use cases
4. **Proper Parsing**: Correct tool call parsing for different providers
5. **Cycle Support**: Can cycle through providers for load balancing
6. **Error Resilience**: Automatic failover and retry logic
7. **Functional Composition**: Easy to compose and extend
8. **Provider Agnostic**: Works with multiple LLM providers
9. **Tool Integration**: Seamless tool calling with automatic execution
10. **Performance**: Efficient streaming and concurrent tool execution

## Migration from OOP Approach

### Before (OOP)
```python
client = LLMClient("openai")
client.enable_tools()
async for chunk in client.chat_with_tools(messages):
    print(chunk)
```

### After (Functional)
```python
request = LLMRequest(messages=messages, tools_enabled=True)
async for chunk_type, content in chat_with_tools(request, "openai"):
    if chunk_type == "t":
        print(content)
```

The functional approach is more flexible and composable, and it provides better separation of concerns while keeping the API simple.
