Metadata-Version: 2.4
Name: llmpromptvault
Version: 0.1.1
Summary: Version, compare, and manage your LLM prompts. No API keys required. Bring your own model.
Author-email: Ankur Srivastav <ankursrivastava98@gmail.com>
License: MIT
Keywords: llm,prompts,prompt-engineering,prompt-management,versioning,mlops,ai,comparison,testing
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Software Development :: Version Control
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: license-file

# LLMPromptVault 🔐

> **Version and compare your LLM prompts. No API key required. Bring your own model.**

[![PyPI version](https://badge.fury.io/py/llmpromptvault.svg)](https://pypi.org/project/llmpromptvault/)
[![Python](https://img.shields.io/badge/python-3.9%2B-blue.svg)](https://www.python.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

---

## Why LLMPromptVault Exists

Prompt engineering is iterative.

You tweak wording.  
You test outputs.  
You switch models.  
You measure cost.  
You forget what worked.

LLMPromptVault gives structure to that chaos.

It helps you:

- Track prompt versions
- Log LLM responses with metrics
- Compare prompt variants side-by-side
- Analyze performance over time

---

## What LLMPromptVault Does

LLMPromptVault is a **prompt lifecycle management library** — not an LLM client.

It does three core things:

1. **Version your prompts** — track every change like Git  
2. **Log responses** — store outputs with latency and token usage  
3. **Compare prompts** — see side-by-side differences with metrics  

⚠️ **LLMPromptVault never calls any LLM.**

You call your own model — OpenAI, Claude, Gemini, Ollama, local models — and pass the response to LLMPromptVault.

---

## Installation

```bash
pip install llmpromptvault
```

Only one dependency: `pyyaml`.  
Everything else uses Python’s standard library.

---

## Quick Start

```python
from llmpromptvault import Prompt, Compare

# 1️⃣ Define two versions of a prompt
v1 = Prompt("summarize", template="Summarize this: {text}", version="v1")
v2 = Prompt("summarize", template="Summarize in 3 bullet points: {text}", version="v2")

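# 2️⃣ YOU call your LLM (any model, any way you like).
# `your_llm` is a placeholder for your own function: it takes the rendered
# prompt string and returns the model's text response.
r1 = your_llm(v1.render(text="Some article content..."))
r2 = your_llm(v2.render(text="Some article content..."))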

# 3️⃣ Compare and log results
cmp = Compare(v1, v2)
cmp.log(r1, r2)
cmp.show()
```

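Example output (the latency and token rows assume those values were passed to `cmp.log`, as shown under `Compare` below):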

```
────────────────────────────────────────────────────────────
  LLMPROMPTVAULT COMPARISON
────────────────────────────────────────────────────────────
  Prompt A                       summarize (v1)
  Prompt B                       summarize (v2)
────────────────────────────────────────────────────────────

  ── Response A ──
  Here is a summary of the article...

  ── Response B ──
  • Key point one
  • Key point two
  • Key point three

────────────────────────────────────────────────────────────
  Metric                           Prompt A     Prompt B
  ──────────────────────────── ──────────── ────────────
  Word count                             12           18
  Char count                             68          112
  Latency (ms)                        820.0        950.0
  Tokens                                 45           62
────────────────────────────────────────────────────────────
```

---

# Core API

## `Prompt` — Define and Version Prompts

```python
from llmpromptvault import Prompt

p = Prompt(
    name="classify",
    template="Classify this text as positive or negative: {text}",
    version="v1",
    description="Sentiment classifier",
    tags=["classify", "sentiment"],
)

# Variables required by template
p.variables()      # ['text']

# Render prompt (no LLM call happens here)
rendered = p.render(text="I love this product!")

# You call your LLM
response = your_llm(rendered)

# Log run metadata
p.log(
    rendered_prompt=rendered,
    response=response,
    model="gpt-4o-mini",
    latency_ms=820,
    tokens=45,
)

# Aggregate statistics
p.stats()

# Raw run history
p.runs(last_n=10)
```
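
To illustrate what `variables()` and `render()` do conceptually, here is a minimal standalone sketch. It assumes templates use `str.format`-style placeholders with simple names such as `{text}`. The helpers `template_variables` and `render` are hypothetical names for this example, not LLMPromptVault's actual implementation.

```python
from string import Formatter
from typing import List


def template_variables(template: str) -> List[str]:
    """Return the unique placeholder names in a str.format-style template, in order."""
    names: List[str] = []
    for _, field_name, _, _ in Formatter().parse(template):
        # field_name is None for trailing literal text and "" for a bare "{}"
        if field_name and field_name not in names:
            names.append(field_name)
    return names


def render(template: str, **values: str) -> str:
    """Fill the template, failing early if a required value is missing."""
    missing = [name for name in template_variables(template) if name not in values]
    if missing:
        raise KeyError(f"missing template variables: {missing}")
    return template.format(**values)


template = "Classify this text as positive or negative: {text}"
print(template_variables(template))
print(render(template, text="I love this product!"))
```

Checking for missing variables before formatting gives a clearer error than the bare `KeyError` raised by `str.format`.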

---

## Versioning

```python
# Create a new version — v1 automatically preserved
v2 = p.update(
    new_template="You are a sentiment expert. Classify as positive/negative/neutral: {text}"
)

# View full history
p.history()
```

Versioning is explicit: a new version is created only when you call `update()`, and the previous version is kept in the history.

---

## Save & Load YAML

Export prompts as human-readable YAML:

```python
p.save("prompts/classify.yaml")
```

Load anywhere:

```python
p = Prompt.load("prompts/classify.yaml")
```

Example YAML:

```yaml
name: classify
version: v1
description: Sentiment classifier
template: "Classify this text as positive or negative: {text}"
tags:
  - classify
  - sentiment
```

---

## `Compare` — Side-by-Side Prompt Evaluation

```python
from llmpromptvault import Compare

cmp = Compare(v1, v2)

cmp.log(
    response_a=response_v1,
    response_b=response_v2,
    model="gpt-4o",
    latency_ms_a=820,
    latency_ms_b=950,
    tokens_a=45,
    tokens_b=62,
)

cmp.show()
cmp.diff()
cmp.summary()
```

You can:

- Compare output length
- Compare token usage
- Compare latency
- Aggregate results across multiple runs
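
For intuition about the length metrics and the diff, the following sketch computes comparable numbers with the standard library only. It is an illustration under assumptions, not how `Compare` is implemented: `text_metrics` and `line_diff` are hypothetical helpers, and word count is assumed to mean whitespace-separated tokens.

```python
import difflib
from typing import Dict, List


def text_metrics(response: str) -> Dict[str, int]:
    """Word count (whitespace-separated) and character count for one response."""
    return {"word_count": len(response.split()), "char_count": len(response)}


def line_diff(response_a: str, response_b: str) -> List[str]:
    """Unified diff of two responses, compared line by line."""
    return list(
        difflib.unified_diff(
            response_a.splitlines(),
            response_b.splitlines(),
            fromfile="response_a",
            tofile="response_b",
            lineterm="",
        )
    )


a = "Here is a summary of the article."
b = "Key point one\nKey point two"
print(text_metrics(a))
print(text_metrics(b))
print("\n".join(line_diff(a, b)))
```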

---

## `Registry` — Share Prompts Across Projects

```python
from llmpromptvault import Registry

reg = Registry("./shared_prompts")

# v1 and v2 are two versions of the "classify" prompt
reg.push(v1)
reg.push(v2)

reg.list()
reg.versions("classify")

latest = reg.pull("classify")
specific = reg.pull("classify", "v1")

reg.delete("classify", "v1")
```

Registry is useful for:

- Team collaboration
- Shared prompt libraries
- Reproducible experiments

---

# Works With Any LLM

Because LLMPromptVault never calls an LLM itself, it works with anything.

### OpenAI

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": rendered}]
)

p.log(rendered, response.choices[0].message.content, model="gpt-4o")
```

### Anthropic

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=1024,
    messages=[{"role": "user", "content": rendered}]
)

p.log(rendered, response.content[0].text, model="claude-haiku-4-5-20251001")
```

### Ollama (Local)

```python
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": rendered, "stream": False},
    timeout=120,
)
response.raise_for_status()  # fail loudly if Ollama returns an error status

p.log(rendered, response.json()["response"], model="llama3")
```
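
The provider examples above log only the response text. If you also want to record latency, one option is to time the call yourself, whichever provider you use. This is a minimal sketch using only the standard library; `timed_call` and `fake_model` are hypothetical names for this example, and `fake_model` stands in for any real client call.

```python
import time
from typing import Callable, Tuple


def timed_call(call_model: Callable[[str], str], prompt: str) -> Tuple[str, float]:
    """Run call_model(prompt) and return (response, latency in milliseconds)."""
    start = time.perf_counter()
    response = call_model(prompt)
    latency_ms = (time.perf_counter() - start) * 1000.0
    return response, latency_ms


def fake_model(prompt: str) -> str:
    """Stand-in for a real client call (hypothetical, for illustration only)."""
    return f"echo: {prompt}"


response, latency_ms = timed_call(fake_model, "Summarize this: ...")
print(response, round(latency_ms, 3))

# With a real Prompt object you would then pass the measurement along, e.g.:
# p.log(rendered, response, model="llama3", latency_ms=latency_ms)
```

`time.perf_counter()` is used rather than `time.time()` because it is monotonic and intended for measuring short durations.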

---

# Typical Project Structure

```
your_project/
├── prompts/
│   └── classify.yaml
├── .promptvault/
│   ├── history.json
│   └── runs.db
└── main.py
```

Add `.promptvault/` to `.gitignore` to keep run logs local,
or commit it to share analytics with your team.


---

# License

MIT © LLMPromptVault Contributors
