Metadata-Version: 2.4
Name: verd
Version: 0.1.1
Summary: Multi-LLM debate CLI for confident answers
Author: Manas Karra
License: MIT
Keywords: llm,debate,verdict,ai,multi-model,code-review
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Software Development :: Quality Assurance
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: openai>=1.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: mcp>=1.0.0
Provides-Extra: slack
Requires-Dist: slack-bolt>=1.18.0; extra == "slack"
Requires-Dist: httpx>=0.27.0; extra == "slack"

# verd

Multi-LLM debate for confident answers. verd takes any content plus a question, runs them through multiple AI models in a structured multi-round debate, and returns a confidence-weighted verdict with strengths, issues, and fixes.

Instead of asking one AI "are you sure?", verd spawns multiple models from different families, has them challenge each other across rounds, and then has a stronger judge model synthesize the final verdict.

## Install

```bash
pip install verd
```

## Setup

verd works with any OpenAI-compatible API. Pick one:

### Option 1: OpenRouter (easiest, all models, one key)

Sign up at [openrouter.ai](https://openrouter.ai), get an API key, then:

```bash
export OPENAI_API_KEY=sk-or-...
export OPENAI_BASE_URL=https://openrouter.ai/api/v1
```

### Option 2: Direct OpenAI

```bash
export OPENAI_API_KEY=sk-...
export OPENAI_BASE_URL=https://api.openai.com/v1
```

Note: with this option only OpenAI models are reachable, so edit `verd/models.py` to list OpenAI models only.

### Option 3: LiteLLM proxy (use native keys from any provider)

If you have API keys from multiple providers (Anthropic, Google, OpenAI, etc.):

```bash
pip install litellm
litellm --config litellm_config.yaml  # starts local proxy on port 4000
```

Example `litellm_config.yaml`:
```yaml
model_list:
  - model_name: claude-sonnet-4-6
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: sk-ant-...
  - model_name: gpt-5-mini
    litellm_params:
      model: openai/gpt-5-mini
      api_key: sk-...
  - model_name: gemini-2.5-flash
    litellm_params:
      model: gemini/gemini-2.5-flash
      api_key: AIza...
```

Then point verd at it:
```bash
export OPENAI_API_KEY=sk-anything
export OPENAI_BASE_URL=http://localhost:4000/v1
```

### Option 4: Any OpenAI-compatible provider

Azure OpenAI, Together, Groq, Fireworks, and similar providers work the same way: set `OPENAI_BASE_URL` to the provider's endpoint and `OPENAI_API_KEY` to your key.

### Save to .env

Instead of exporting the variables in your shell, you can keep them in a `.env` file in your working directory:
```bash
cp .env.example .env
# edit with your keys
```
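
Whichever option you pick, the setup comes down to the two variables used above. A minimal pre-flight check is sketched below, assuming you want both `OPENAI_API_KEY` and `OPENAI_BASE_URL` set explicitly. The helper name `missing_settings` is hypothetical and not part of verd.

```python
import os

# The two settings used throughout the Setup section.
REQUIRED = ("OPENAI_API_KEY", "OPENAI_BASE_URL")


def missing_settings(env=None):
    """Return the names of required settings that are unset or blank.

    Hypothetical helper for illustration; verd does not ship this function.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("Endpoint configured:", os.environ["OPENAI_BASE_URL"])
```

This only inspects the process environment. Values stored in a `.env` file are not visible to it until something loads that file.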

### Custom models

Edit `verd/models.py` to match whatever models your provider supports. The default config uses models available through LiteLLM proxies and OpenRouter.

## Usage

```bash
# Auto-scan current directory
cd backend && verd "is this production-ready?"

# Single file
verd "is this JWT implementation secure?" -f auth.py

# Multiple files
verd "any issues?" -f auth.py middleware.py routes.py

# Directory
verd "is this codebase sound?" -d src/ --ext .py

# Inline question
verdl "is O(n^2) acceptable for n=1000?"

# Git diffs
verd "are these changes safe?" -g              # unstaged
verd "ready to commit?" -gs                    # staged
verdh "should we merge this?" -gb main         # branch diff

# Pipe
cat auth.py | verd "is this secure?"

# Quiet mode (verdict only, no transcript)
verd "any bugs?" -f app.py -q

# JSON output
verd "any bugs?" -f app.py --json
```
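
The same invocations can be driven from Python. The sketch below uses only flags listed in this README and assumes that `--json` prints a single JSON document to stdout. The output schema is not documented here, so the code does not assume any keys. `build_verd_command` and `run_verd` are hypothetical names, not part of verd.

```python
import json
import subprocess


def build_verd_command(question, files=(), quiet=False, json_output=True, command="verd"):
    """Assemble a verd invocation from flags documented in this README."""
    args = [command, question]
    if files:
        args += ["-f", *files]
    if quiet:
        args.append("-q")
    if json_output:
        args.append("--json")
    return args


def run_verd(question, files=()):
    """Run verd and return (exit_code, parsed_json_or_None)."""
    proc = subprocess.run(
        build_verd_command(question, files), capture_output=True, text=True
    )
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        payload = None  # assumption did not hold: stdout was not one JSON document
    return proc.returncode, payload
```

Passing `command="verdl"` or `command="verdh"` to `build_verd_command` selects one of the other modes.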

## Modes

| Command | Models | Rounds | Speed | Cost |
|---------|--------|--------|-------|------|
| `verdl` | 2 | 1 | ~10s | ~$0.01 |
| `verd` | 4 | 2 | ~30s | ~$0.05 |
| `verdh` | 5 + web search | 3 | ~70s | ~$0.30 |

## Flags

```
claim                  the question to evaluate (required)

Content input (pick one, or auto-scans current dir):
  -c, --context TEXT     inline content string
  -f FILE [FILE ...]     one or more files
  -d [DIR]               directory (default: current dir)
  -g, --git              unstaged git diff
  -gs, --git-staged      staged git diff
  -gb, --git-branch REF  git diff REF...HEAD

Directory filters:
  --ext EXT [EXT ...]    filter by extension (.py .ts)
  --exclude PATTERN      glob pattern to exclude (test_*)

Output:
  -q, --quiet            hide debate transcript, show only verdict
  --json                 raw JSON output
  --timeout SECONDS      override timeout per model call
  --version              show version
```

## Exit Codes

- `0` — PASS
- `1` — FAIL
- `2` — UNCERTAIN

Useful for scripting: `verd "are tests passing?" -f test.py && deploy`
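
With `&&`, both FAIL and UNCERTAIN stop the chain, because any non-zero exit code does. When a script needs to treat UNCERTAIN differently, it can read the exit code directly. The sketch below maps the three documented codes to labels; the names `verdict_from_exit_code` and `gate` are hypothetical, and any other exit code is reported as `UNKNOWN`.

```python
import subprocess
import sys

# Exit codes as documented above.
VERDICTS = {0: "PASS", 1: "FAIL", 2: "UNCERTAIN"}


def verdict_from_exit_code(code):
    """Translate a verd exit code into a verdict label."""
    return VERDICTS.get(code, "UNKNOWN")


def gate(argv, allow_uncertain=False):
    """Run a command and return True if the next step should go ahead."""
    code = subprocess.run(argv).returncode
    verdict = verdict_from_exit_code(code)
    print(f"verd verdict: {verdict}", file=sys.stderr)
    return verdict == "PASS" or (allow_uncertain and verdict == "UNCERTAIN")
```

For example, `gate(["verd", "ready to commit?", "-gs", "-q"])` returns `True` only on PASS, while `allow_uncertain=True` also lets UNCERTAIN through.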

## MCP — Claude Code / Cursor

Add to `~/.claude.json` or `~/.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "verd": {
      "command": "verd-mcp",
      "env": {
        "OPENAI_API_KEY": "your-key",
        "OPENAI_BASE_URL": "https://openrouter.ai/api/v1"
      }
    }
  }
}
```

Then use `verd`, `verdl`, or `verdh` as tools directly in chat.
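
If the config file already lists other MCP servers, the `verd` entry has to be merged in rather than pasted over them. A minimal sketch of that merge on an in-memory dict follows. `add_verd_server` is a hypothetical helper, and `other-mcp` is a made-up existing entry used only for illustration.

```python
import copy
import json


def add_verd_server(config, api_key, base_url="https://openrouter.ai/api/v1"):
    """Return a copy of an MCP config dict with a "verd" server entry added."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    servers = updated.setdefault("mcpServers", {})
    servers["verd"] = {
        "command": "verd-mcp",
        "env": {"OPENAI_API_KEY": api_key, "OPENAI_BASE_URL": base_url},
    }
    return updated


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-mcp"}}}
print(json.dumps(add_verd_server(existing, "your-key"), indent=2))
```

Reading the file with `json.load` and writing the result back with `json.dump` completes the round trip; back up the original file first.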

## Slack

Install with Slack dependencies:

```bash
pip install "verd[slack]"  # quotes keep shells such as zsh from expanding the brackets
```

Create a Slack app with Socket Mode, add bot scopes (`app_mentions:read`, `channels:history`, `chat:write`, `reactions:write`, `im:history`, `im:write`, `users:read`), then:

```bash
export SLACK_BOT_TOKEN=xoxb-...
export SLACK_APP_TOKEN=xapp-...
export SLACK_SIGNING_SECRET=...
verd-slack
```

Usage in Slack:
- `@verd what do you think?` — reads thread/channel context, debates, replies
- `@verd deep is this secure?` — uses verdh (5 models + web search)
- `@verd quick is this right?` — uses verdl (fast)
- `@verd last 50 what's the consensus?` — reads last 50 messages
- `/verd should we use Kafka?` — slash command with progress updates
- `/verdl is this correct?` — quick slash command
- `/verdh any security issues?` — deep slash command
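
The mention syntax above can be read as an optional mode keyword, an optional `last N` prefix, and then the question. The sketch below shows one way to parse that shape. It is a hypothetical illustration, not verd's actual parser, and `parse_mention` is an invented name.

```python
# Keywords from the usage list above, mapped to the mode they select.
MODE_KEYWORDS = {"deep": "verdh", "quick": "verdl"}


def parse_mention(text):
    """Split '@verd deep is this secure?' into (mode, last_n, question)."""
    words = text.split()
    # Slack delivers mentions as '<@U...>'; also accept a plain '@verd'.
    if words and words[0].startswith(("@", "<@")):
        words = words[1:]
    mode, last_n = "verd", None
    if words and words[0].lower() in MODE_KEYWORDS:
        mode = MODE_KEYWORDS[words.pop(0).lower()]
    if len(words) >= 2 and words[0].lower() == "last" and words[1].isdecimal():
        last_n = int(words[1])
        words = words[2:]
    return mode, last_n, " ".join(words)


print(parse_mention("@verd last 50 what's the consensus?"))
```

A question that genuinely begins with "quick" or "deep" would be misread as a mode keyword under this scheme, so a real parser needs a rule for that case.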

## How it works

1. Your question and content are sent to multiple AI models in parallel
2. Each model gives its independent assessment (PASS/FAIL/UNCERTAIN)
3. Models see each other's responses and cross-examine for 1-3 rounds
4. A stronger judge model synthesizes the debate into a final verdict
5. You get: verdict, confidence %, strengths, issues, and actionable fixes

The key insight: different models catch different things. Claude may spot a security issue that GPT misses, and Gemini may catch a logic error that DeepSeek overlooks. The debate format forces them to challenge each other rather than simply agree.
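
The steps above can be sketched as a small async pipeline. This is an illustration of the flow, not verd's implementation. The debaters and judge are hypothetical stubs standing in for real model calls, and `rounds` is assumed to count cross-examination rounds after the initial independent assessment.

```python
import asyncio
from collections import Counter


async def debate(question, content, debaters, judge, rounds=2):
    """Independent assessments, then cross-examination rounds, then a judge."""
    # Steps 1-2: every debater answers independently and in parallel.
    answers = await asyncio.gather(*(d(question, content, []) for d in debaters))
    transcript = [list(answers)]
    # Step 3: each debater sees the previous round's answers and may revise.
    for _ in range(rounds):
        previous = transcript[-1]
        answers = await asyncio.gather(
            *(d(question, content, previous) for d in debaters)
        )
        transcript.append(list(answers))
    # Step 4: the judge reads the whole transcript and returns the verdict.
    return await judge(question, content, transcript)


def stub_debater(initial):
    """Hypothetical model stub: adopts FAIL once any peer has raised an issue."""
    async def respond(question, content, previous):
        return "FAIL" if "FAIL" in previous else initial
    return respond


async def stub_judge(question, content, transcript):
    """Hypothetical judge: majority of the final round plus its vote share."""
    final = transcript[-1]
    verdict, votes = Counter(final).most_common(1)[0]
    return {"verdict": verdict, "confidence": votes / len(final)}


result = asyncio.run(
    debate(
        "is this secure?",
        "def f(): ...",
        [stub_debater("PASS"), stub_debater("PASS"), stub_debater("FAIL")],
        stub_judge,
        rounds=1,
    )
)
print(result)
```

In this toy run a single dissenting FAIL in the first round is seen by the other two debaters in the next round, which is the kind of cross-examination the debate format is meant to enable.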
