Metadata-Version: 2.4
Name: leanprompt
Version: 0.4.1
Summary: A FastAPI-based LLM integration framework for engineering-centric AI development.
Home-page: https://github.com/yjkwon_wm2m/leanprompt
Author: Youngjune Kwon
Author-email: yjkwon@winm2m.com
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Framework :: FastAPI
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi
Requires-Dist: uvicorn
Requires-Dist: pydantic
Requires-Dist: httpx
Requires-Dist: pyyaml
Requires-Dist: jinja2
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# LeanPrompt (Backend)

**LeanPrompt** is an engineering-centric LLM integration framework based on FastAPI. It helps you use LLMs as reliable and predictable software components, not just text generators.

## ✨ Key Features

* **FastAPI Native:** Integrates instantly into existing FastAPI apps as a plugin.
* **Markdown-Driven Prompts:** Manage prompts as `.md` files, separated from code. Filenames become API paths.
* **Session-Based Context Caching:** Saves token costs by sending prompts only at the start of a session and then sending only input deltas.
* **Output Guardrails:** Built-in output validation and automatic retry logic via Pydantic models.
* **WebSocket First:** Highly optimized WebSocket support for real-time streaming feedback.

## 🚀 Quick Start

### Installation

```bash
pip install leanprompt
```

### Basic Usage

```python
from fastapi import FastAPI
from leanprompt import LeanPrompt, Guard
from pydantic import BaseModel
import os

app = FastAPI()

# Initialize LeanPrompt with your preferred provider
# Configure via environment variable: LEANPROMPT_LLM_PROVIDER="provider|api_key"
provider_env = os.getenv("LEANPROMPT_LLM_PROVIDER", "openai|dummy_key")
provider_name, api_key = provider_env.split("|", 1)

lp = LeanPrompt(app, provider=provider_name, prompt_dir="prompts", api_key=api_key)

# Define output model for validation
class CalculationResult(BaseModel):
    result: int

# Create a calculator endpoint
@lp.route("/calc/add", prompt_file="add.md")
@Guard.validate(CalculationResult)
async def add(user_input: str):
    """Performs addition based on user input."""
    pass  # LeanPrompt handles the logic
```

### API Prefix and WebSocket Path

You can apply a shared prefix to all LeanPrompt routes and the WebSocket endpoint:

```python
app = FastAPI()

lp = LeanPrompt(
    app,
    provider=provider_name,
    prompt_dir="prompts",
    api_key=api_key,
    api_prefix="/api",
    ws_path="ws",  # relative -> /api/ws/{client_id}
)

@lp.route("/calc/add", prompt_file="add.md")
async def add(user_input: str):
    pass
```

Clients can keep using the same LeanPrompt path value (`/calc/add`) while connecting to
`ws://localhost:8000/api/ws/{client_id}`.

Using an absolute `ws_path` (e.g., `"/ws"`) keeps the WebSocket route outside the
`api_prefix`. Avoid `ws_path="/"` to prevent route collisions.

If you already configure a FastAPI router prefix, LeanPrompt can attach to it directly:

```python
app = FastAPI()
api = FastAPI()
app.mount("/api", api)

lp = LeanPrompt(
    api,
    provider=provider_name,
    prompt_dir="prompts",
    api_key=api_key,
    ws_path="/ws",  # -> /api/ws/{client_id}
)
```

### JWT Annotation Example

LeanPrompt routes can reuse a JWT validator annotation for HTTP requests:

```python
from fastapi import Request
from leanprompt import Guard

def require_jwt(request: Request) -> bool:
    # Example only. Insecure for production; validate signature, expiry, and claims.
    # Example: jwt.decode(token, key, algorithms=["HS256"])
    return bool(request.headers.get("authorization"))

@lp.route("/secure/add", prompt_file="add.md")
@Guard.auth(require_jwt)
@Guard.validate(CalculationResult)
async def secure_add(user_input: str):
    pass
```

For WebSocket authentication, pass a validation hook when you construct `LeanPrompt`:

```python
from fastapi import WebSocket

def require_ws_jwt(websocket: WebSocket) -> bool:
    # Example only. Insecure for production; validate signature, expiry, and claims.
    # Example: jwt.decode(token, key, algorithms=["HS256"])
    return bool(websocket.headers.get("authorization"))

lp = LeanPrompt(
    app,
    provider=provider_name,
    prompt_dir="prompts",
    api_key=api_key,
    ws_auth=require_ws_jwt,
)
```

### WebSocket Interceptors

You can intercept inbound/outbound WebSocket messages for metering, auditing, or billing.
If the request interceptor returns `False` or `{"error": "..."}`, the request is blocked and
the error payload is returned immediately.

Interceptor signature:

```python
def interceptor(websocket: WebSocket, event: dict):
    ...
```

Event payload shape:

```json
{
  "direction": "inbound" | "outbound",
  "client_id": "...",
  "path": "/route",
  "payload": { "path": "/route", "message": "..." } | { "response": "...", "path": "/route" },
  "raw": "{...}",
  "byte_length": 123
}
```

Return behavior:

- Request interceptor (`ws_request_interceptor`)
  - Return `None` / no return: request continues to normal processing.
  - Return `False`: request is blocked and `{ "error": "WebSocket request rejected" }` is sent.
  - Return `{ "error": "..." }`: request is blocked and the dict is sent as-is (path is added if missing).
  - Raise an exception: treated as blocked and `{ "error": "<exception message>" }` is sent.
- Response interceptor (`ws_response_interceptor`)
  - Return value is ignored; it never blocks the response.
  - Exceptions are logged and the response still proceeds.

```python
from fastapi import WebSocket

billing_state = {
    "credits": 10_000,  # bytes
    "usage": 0,
}

def ws_billing(websocket: WebSocket, event: dict):
    # event keys: direction, client_id, path, payload, raw, byte_length
    if event["direction"] == "inbound":
        projected = billing_state["usage"] + event["byte_length"]
        if projected > billing_state["credits"]:
            return {"error": "Billing failed: insufficient credits", "code": "billing_failed"}
        billing_state["usage"] = projected
    else:
        billing_state["usage"] += event["byte_length"]

lp = LeanPrompt(
    app,
    provider=provider_name,
    prompt_dir="prompts",
    api_key=api_key,
    ws_request_interceptor=ws_billing,
    ws_response_interceptor=ws_billing,
)
```

### Complete Example Server

Here's a full example with multiple endpoints:

```python
from fastapi import FastAPI
from leanprompt import LeanPrompt, Guard
from pydantic import BaseModel
import os

# Define output models
class MoodJson(BaseModel):
    current_mood: str
    confidence: float
    reason: str

class CalculationResult(BaseModel):
    result: int

app = FastAPI()

# Initialize LeanPrompt
provider_env = os.getenv("LEANPROMPT_LLM_PROVIDER", "openai|dummy_key")
provider_name, api_key = provider_env.split("|", 1)
lp = LeanPrompt(app, provider=provider_name, prompt_dir="examples/prompts", api_key=api_key)

@lp.route("/calc/add", prompt_file="add.md")
@Guard.validate(CalculationResult)
async def add(user_input: str):
    """Performs addition based on user input."""
    pass

@lp.route("/calc/multiply", prompt_file="multiply.md")
@Guard.validate(CalculationResult)
async def multiply(user_input: str):
    """Performs multiplication based on user input."""
    pass

@lp.route("/mood/json", prompt_file="mood_json.md")
@Guard.validate(MoodJson)
async def get_mood_json(user_input: str):
    """Returns the mood analysis in JSON format."""
    pass

# Custom validation for markdown content
def validate_markdown_content(text: str):
    if "##" not in text and "**" not in text:
        raise ValueError("Response does not look like Markdown")
    if "Meanings" not in text:
        raise ValueError("Missing required section: 'Meanings'")
    return {"raw_markdown": text}

@lp.route("/linguist", prompt_file="word_relationships.md")
@Guard.custom(validate_markdown_content)
async def analyze_words(user_input: str):
    """Analyzes word relationships and returns markdown."""
    pass
```

### Using Local LLM (Ollama)

You can use local LLMs like Qwen 2.5 Coder or DeepSeek-Coder-V2 via [Ollama](https://ollama.com).

1.  Install and run Ollama:
    ```bash
    ollama run qwen2.5-coder
    ```

2.  Initialize LeanPrompt with `ollama` provider:
    ```python
    lp = LeanPrompt(
        app, 
        provider="ollama", 
        base_url="http://localhost:11434", # Optional, defaults to this
        model="qwen2.5-coder" # Specify the model name here or in prompt frontmatter
    )
    ```

### Supported Providers

LeanPrompt supports multiple LLM providers:

- **OpenAI**: `provider="openai"`
- **DeepSeek**: `provider="deepseek"`
- **Google Gemini**: `provider="google"`
- **Ollama (Local)**: `provider="ollama"`

## 📂 Project Structure

```
leanprompt/
├── leanprompt/          # Main library code
│   ├── core.py          # Core logic (FastAPI integration)
│   ├── guard.py         # Validation logic
│   └── providers/       # LLM provider implementations
├── examples/            # Usage examples
│   ├── main.py          # Example FastAPI app
│   └── prompts/         # Example prompt files
├── tests/               # Unit tests
├── setup.py             # Package installation script
└── requirements.txt     # Dependencies
```

## 🏃 Running the Example

1.  **Install Dependencies:**
    ```bash
    pip install -r requirements.txt
    ```

2.  **Set Environment Variable:**
    ```bash
    # Format: provider|api_key
    export LEANPROMPT_LLM_PROVIDER="openai|your_openai_api_key"
    
    # Or for DeepSeek:
    export LEANPROMPT_LLM_PROVIDER="deepseek|your_deepseek_api_key"
    ```

3.  **Run the Example Server:**
    ```bash
    # Run from the root directory
    export PYTHONPATH=$PYTHONPATH:$(pwd)
    python examples/main.py
    ```

## 📡 API Examples

### HTTP Endpoints

**Calculation (Add):**
```bash
curl -X POST "http://localhost:8000/calc/add" \
     -H "Content-Type: application/json" \
     -d '{"message": "50 + 50"}'
# Response: {"result": 100}
```

**Calculation (Multiply):**
```bash
curl -X POST "http://localhost:8000/calc/multiply" \
     -H "Content-Type: application/json" \
     -d '{"message": "10 * 5"}'
# Response: {"result": 50}
```

**Mood Analysis (JSON):**
```bash
curl -X POST "http://localhost:8000/mood/json" \
     -H "Content-Type: application/json" \
     -d '{"message": "I am feeling great today!"}'
# Response: {"current_mood": "Happy", "confidence": 0.9, "reason": "Positive language used"}
```

**Word Relationship Analysis:**
```bash
curl -X POST "http://localhost:8000/linguist" \
     -H "Content-Type: application/json" \
     -d '{"message": "apple, banana, cherry"}'
# Response: Markdown formatted analysis with meanings and relationships
```

### WebSocket Interface

LeanPrompt provides a WebSocket interface for real-time streaming and context management:

```python
import websocket
import json

def on_message(ws, message):
    response = json.loads(message)
    print(f"Path: {response.get('path')}")
    print(f"Response: {response['response']}")

ws = websocket.WebSocketApp(
    "ws://localhost:8000/ws/test_client",
    on_message=on_message
)

# Send different requests to test routing and context
ws.send(json.dumps({"path": "/add", "message": "10 + 20"}))
ws.send(json.dumps({"path": "/multiply", "message": "5 * 5"}))
ws.send(json.dumps({"path": "/linguist", "message": "apple, banana, cherry"}))
ws.send(json.dumps({"path": "/linguist", "message": "What color are they?"}))
```

### Context Chaining Example

The WebSocket interface maintains separate conversation contexts for each path:

```python
# First message to /linguist path
ws.send(json.dumps({
    "path": "/linguist", 
    "message": "apple, banana, cherry"
}))

# Follow-up message - AI remembers the previous context
ws.send(json.dumps({
    "path": "/linguist", 
    "message": "What color are they?"
}))
# Response will mention red, yellow, etc. showing context memory
```

## 📝 Prompt Templates

LeanPrompt uses markdown files with frontmatter for prompt templates:

**Example: `add.md`**
```markdown
---
model: deepseek-chat
temperature: 0.1
---
You are a calculator.
Perform the addition requested by the user.
Return the result in valid JSON format matching this schema:
{"result": integer}

Example:
User: 1 + 1
AI: {"result": 2}

Only return the JSON object.
```

**Example: `word_relationships.md`**
```markdown
---
model: deepseek-chat
---
You are a helpful linguist.
The user will provide three English words.
Please provide the meaning of each word and explain the relationships between them.
Return the response in Markdown format.
Use headers like "## Meanings" and "## Relationships" to structure your response.
```

## 🛡️ Output Validation

LeanPrompt provides built-in output validation using Pydantic models:

```python
from pydantic import BaseModel
from leanprompt import Guard

class MoodResponse(BaseModel):
    mood: str
    intensity: int  # 1-10
    notes: str

@lp.route("/mood", prompt_file="mood.md")
@Guard.validate(MoodResponse)
async def analyze_mood(user_input: str):
    pass  # Automatically validates and converts LLM response
```

For custom validation logic:
```python
def validate_markdown(text: str):
    if "##" not in text:
        raise ValueError("Invalid markdown format")
    return text

@lp.route("/custom", prompt_file="custom.md")
@Guard.custom(validate_markdown)
async def custom_endpoint(user_input: str):
    pass
```
