Metadata-Version: 2.4
Name: scalexi_llm
Version: 0.1.1
Summary: A comprehensive multi-provider LLM proxy library with unified interface
Author: scalex_innovation
Keywords: llm,ai,openai,anthropic,gemini,groq,deepseek,grok,qwen,exa,proxy,api
Description-Content-Type: text/markdown
Requires-Dist: openai
Requires-Dist: anthropic
Requires-Dist: google-genai
Requires-Dist: groq
Requires-Dist: pymupdf
Requires-Dist: xai-sdk
Requires-Dist: python-docx
Requires-Dist: pydantic
Requires-Dist: python-dotenv
Requires-Dist: exa_py
Dynamic: author
Dynamic: description
Dynamic: description-content-type
Dynamic: keywords
Dynamic: requires-dist
Dynamic: summary

# ScaleXI LLM

A comprehensive multi-provider LLM proxy library that provides a unified interface for interacting with various AI models from different providers.

## 🚀 Features

- **Multi-Provider Support**: OpenAI, Anthropic (Claude), Google (Gemini), Groq, DeepSeek, Qwen, Grok
- **60+ Model Configurations**: Detailed pricing, context lengths, and capabilities for each model
- **Structured Output**: Pydantic schema support with intelligent fallbacks
- **Vision Capabilities**: Image analysis with automatic fallback for non-vision models
- **File Processing**: PDF, DOCX, TXT, and JSON file analysis
- **Web Search Integration**: Built-in web search via Exa API
- **Cost Tracking**: Automatic token usage and cost calculation
- **Robust Fallbacks**: Multi-level fallback mechanisms for reliability
- **Comprehensive Testing**: Built-in benchmarking across all providers

## 📦 Installation

```bash
# Clone the repository
git clone <repository-url>
cd scalexi_llm

# Install dependencies
pip install -r requirements.txt
```

### Dependencies

```
openai
anthropic
google-genai
groq
pymupdf
xai-sdk
python-docx
pydantic
python-dotenv
exa-py
```

## ⚙️ Configuration

Create a `.env` file in the project root with your API keys:

```env
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
GEMINI_API_KEY=your_gemini_key
GROQ_API_KEY=your_groq_key
DEEPSEEK_API_KEY=your_deepseek_key
QWEN_API_KEY=your_qwen_key
GROK_API_KEY=your_grok_key
EXA_API_KEY=your_exa_key
```

You can just include the API keys for providers you want to use. Note that some providers fallback to Gemini in certain cases. (see documentation for more details)

## 🔧 Quick Start

```python
from scalexi_llm.scalexi_llm import LLMProxy

# Initialize the proxy
llm = LLMProxy()

# Basic usage
response, execution_time, token_usage, cost = llm.ask_llm(
    model_name="gemini-2.5-flash",
    system_prompt="You are a helpful assistant.",
    user_prompt="Explain quantum computing in simple terms"
)

print(f"Response: {response}")
print(f"Cost: ${cost:.6f}")
print(f"Tokens used: {token_usage['total_tokens']}")
```

## 💡 Usage Examples

### Structured Output

```python
from pydantic import BaseModel, Field
from typing import List

class Recipe(BaseModel):
    name: str = Field(description="Recipe name")
    ingredients: List[str] = Field(description="Required ingredients")
    steps: List[str] = Field(description="Cooking steps")
    cooking_time: int = Field(description="Cooking time in minutes")

response, _, _, _ = llm.ask_llm(
    model_name="gpt-4o-latest",
    user_prompt="Create a recipe for chocolate chip cookies",
    schema=Recipe
)
```

### Image Analysis

```python
response, _, _, _ = llm.ask_llm(
    model_name="moonshotai/kimi-k2-instruct",
    user_prompt="Analyze this image and describe what you see",
    image_path="photo.jpg"
)
```

### File Processing

```python
response, _, _, _ = llm.ask_llm(
    model_name="claude-3-5-sonnet-latest",
    user_prompt="Summarize this document",
    file_path="document.pdf"
)
```

### Combined Features

```python
response, _, _, _ = llm.ask_llm(
    model_name="grok-4-latest",
    system_prompt="Analyze the provided content comprehensively",
    user_prompt="Analyze this resume and find career recommendations from web.",
    file_path="resume.pdf",
    image_path="certifications.png",
    websearch=True,
    schema=AnalysisSchema
)
```

## 🔄 Fallback Mechanisms

The library implements intelligent fallback systems:

1. **Vision Fallback**: Non-vision models automatically use vision-capable models from the same provider to describe images
2. **Structured Output Fallback**: Falls back to better models from the same provider when schema validation fails

Models from certain providers fallback to Gemini in certain cases.
Check documentation for more details on fallbacks.

## 🧪 Testing

Run the comprehensive test suite:

```bash
cd testing
python combined_test.py
```

This will:
- Test all available models
- Benchmark performance across providers
- Generate detailed analysis reports
- Test combined features (file + image + web search + structured output)

Feel free to use your own files and images for testing.

## 📊 Cost Tracking

Every request returns detailed cost and token usage information:

```python
response, execution_time, token_usage, cost = llm.ask_llm(...)

# Token usage breakdown
print(token_usage)
# {
#     "prompt_tokens": 150,
#     "completion_tokens": 200,
#     "total_tokens": 350
# }

# Total cost in USD
print(f"Cost: ${cost:.6f}")
```

## 🛠️ API Reference

### `ask_llm()` Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model_name` | str | `"gpt-4o-mini"` | Model to use for generation |
| `system_prompt` | str | `""` | System prompt for the model |
| `user_prompt` | str | `""` | User prompt/message |
| `temperature` | float | `None` | Sampling temperature (0.0-2.0) |
| `schema` | Pydantic Model | `None` | Structured output schema |
| `image_path` | str | `None` | Path to image file |
| `file_path` | str | `None` | Path to file for analysis |
| `websearch` | bool | `False` | Enable web search |
| `max_tokens` | int | `None` | Maximum tokens to generate |
| `retry_limit` | int | `1` | Number of retry attempts |
| `fallback_to_provider_best_model` | bool | `True` | Enable provider fallback |
| `fallback_to_standard_model` | bool | `True` | Enable cross-provider fallback |

### Return Values

Returns a tuple: `(response, execution_time, token_usage, cost)`

- **response**: The model's response (string or JSON)
- **execution_time**: Time taken in seconds (float)
- **token_usage**: Dictionary with token counts
- **cost**: Total cost in USD (float)

---

**ScaleXI LLM** - Unified AI model access with enterprise-grade reliability and comprehensive feature support.
