Metadata-Version: 2.4
Name: clawtrace
Version: 0.1.0
Summary: Review, score, and curate your coding agent conversation traces locally
Author: kaiaiagent
License-Expression: MIT
Keywords: claude-code,codex,gemini-cli,opencode,openclaw,dataset,conversations
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Dynamic: license-file

# ClawTrace

Review, score, and curate your coding agent conversation traces locally. ClawTrace indexes session logs from Claude Code, Codex, Gemini CLI, OpenCode, OpenClaw, Kimi CLI, and Cline, redacts secrets and PII, and gives you a browser workbench to triage and export the results.

Requires Python 3.10+.

## Review & score your traces

After `pip install`, three commands index your local sessions, auto-score them for quality, and open the browser workbench:

```bash
pip install clawtrace
clawtrace scan                          # Index all local sessions
clawtrace score --batch --auto-triage   # AI-score sessions, auto-approve 4-5, auto-block 1-2
clawtrace serve                         # Open workbench at localhost:8384
```

`score --batch` sends each session to Claude, which rates its quality from 1 to 5. `--auto-triage` then approves sessions scored 4-5 and blocks those scored 1-2, leaving score-3 sessions for your manual review. The workbench is a browser UI for reading transcripts, adjusting scores, searching across sessions, and bundling approved traces for export.
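
The triage rule can be written down in a few lines. This is a minimal sketch of the rule as described here, not ClawTrace's actual implementation: the function name `triage` and the labels `blocked` and `needs_review` are assumptions made for illustration (only `approved` appears as a status value in the commands in this README).

```python
def triage(score: int) -> str:
    """Map a 1-5 quality score to a triage decision (illustrative only)."""
    if not 1 <= score <= 5:
        raise ValueError(f"score must be between 1 and 5, got {score}")
    if score >= 4:
        return "approved"      # 4-5: auto-approved
    if score <= 2:
        return "blocked"       # 1-2: auto-blocked (label is an assumption)
    return "needs_review"      # 3: left for manual review (label is an assumption)


for s in range(1, 6):
    print(s, triage(s))
```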

## Give this to your agent

Paste this into Claude Code, Codex, Gemini CLI, OpenCode, OpenClaw, or any coding agent:

```
Help me review and curate my coding agent traces using ClawTrace.
Install it, set up the skill, then walk me through the process.

STEP 1 — INSTALL
  pip install clawtrace
  If that fails, ask the user where the source is.

STEP 2 — INSTALL SKILL
  clawtrace update-skill claude

STEP 3 — SCAN & SCORE
  clawtrace scan                              # Index sessions into local DB
  clawtrace score --batch --auto-triage       # AI-score + auto-approve/block

STEP 4 — REVIEW
  clawtrace serve                             # Open workbench at localhost:8384

Tell the user: "Your workbench is open at localhost:8384. The workbench and its session index run locally.
Use the Inbox to triage traces, Search to find sessions, and Bundles to assemble exports."
```

<details>
<summary><b>Terminal workflow (works on remote VMs — no browser needed)</b></summary>

The entire review-and-share workflow runs in your terminal. Your coding agent drives these commands for you.

```bash
# 1. Scan — discover and index sessions
clawtrace scan

# 2. Review — browse and triage
clawtrace inbox --json --limit 20
clawtrace search "refactor auth" --json

# 3. Triage — approve or block
clawtrace approve <session_id> --reason "clean trace"
clawtrace block <session_id> --reason "proprietary code"

# 4. Score (optional) — AI-assisted quality scoring
clawtrace score --batch --auto-triage

# 5. Preview — review what will be shared (shows summaries + risk flags)
clawtrace share --status approved --preview

# 6. Share — after user confirms
clawtrace share --status approved --note "week 12 traces"
```

For a visual review experience: `clawtrace serve` (local) or `clawtrace serve --remote` (prints SSH tunnel command for remote VMs).

</details>

<details>
<summary><b>Manual usage (without an agent)</b></summary>

### Quick start

```bash
pip install clawtrace

# Scan and score
clawtrace scan
clawtrace score --batch --auto-triage

# Open the workbench
clawtrace serve

# Or triage from the terminal
clawtrace inbox --json --limit 20
clawtrace approve <session_id> --reason "good trace"
clawtrace block <session_id> --reason "low quality"

# Configure redactions and exclusions
clawtrace config --exclude "personal-stuff,scratch"
clawtrace config --redact-usernames "my_github_handle,my_discord_name"
clawtrace config --redact "my-domain.com,my-secret-project"

# Export locally
clawtrace export --output /tmp/clawtrace_export.jsonl

# Optional: generate structured PII findings (hybrid = rules + Claude)
clawtrace export --output /tmp/clawtrace_export.jsonl --pii-review --pii-provider hybrid

# Optional: also produce a sanitized JSONL automatically
clawtrace export --output /tmp/clawtrace_export.jsonl --pii-review --pii-apply --pii-provider hybrid
```

### Commands

| Command | Description |
|---------|-------------|
| `clawtrace scan` | Index local sessions into workbench DB |
| `clawtrace score --batch --auto-triage` | AI-score all unscored sessions, auto-approve 4-5 and block 1-2 |
| `clawtrace score --batch --limit 20` | AI-score up to 20 sessions without triage |
| `clawtrace serve` | Open workbench UI at localhost:8384 |
| `clawtrace serve --remote` | Print SSH tunnel command for remote VM access |
| `clawtrace inbox --json --limit 20` | List sessions as JSON (for agent parsing) |
| `clawtrace search <query> --json` | Full-text search across sessions |
| `clawtrace approve <id> [id ...]` | Approve sessions by ID |
| `clawtrace block <id> [id ...]` | Block sessions by ID |
| `clawtrace shortlist <id> [id ...]` | Shortlist sessions for review |
| `clawtrace bundle-create --status approved` | Create bundle from all approved sessions |
| `clawtrace bundle-list` | List all bundles |
| `clawtrace bundle-view <bundle_id>` | View bundle details and sessions |
| `clawtrace bundle-export <bundle_id>` | Export bundle to disk (JSONL + manifest) |
| `clawtrace bundle-share <bundle_id>` | Upload bundle to ClawTrace ingest service |
| `clawtrace share --status approved` | One-step: bundle + export + share |
| `clawtrace export` | Export to local JSONL |
| `clawtrace export --no-thinking` | Exclude extended thinking blocks |
| `clawtrace export --pii-review --pii-apply` | Export, generate findings, and produce sanitized JSONL |
| `clawtrace config --source all` | Select source scope (`claude`, `codex`, `gemini`, `opencode`, `openclaw`, `kimi`, or `all`) |
| `clawtrace config --exclude "a,b"` | Add excluded projects (appends) |
| `clawtrace config --redact "str1,str2"` | Add strings to always redact (appends) |
| `clawtrace config --redact-usernames "u1,u2"` | Add usernames to anonymize (appends) |
| `clawtrace list` | List all projects with exclusion status |
| `clawtrace status` | Show current stage and next steps (JSON) |
| `clawtrace update-skill claude` | Install/update the clawtrace skill for Claude Code |

</details>

<details>
<summary><b>What gets exported</b></summary>

| Data | Included | Notes |
|------|----------|-------|
| User messages | Yes | Full text (including voice transcripts) |
| Assistant responses | Yes | Full text output |
| Extended thinking | Yes | Claude's reasoning (opt out with `--no-thinking`) |
| Tool calls | Yes | Tool name + inputs + outputs |
| Token usage | Yes | Input/output tokens per session |
| Model & metadata | Yes | Model name, git branch, timestamps |

### Privacy & Redaction

ClawTrace applies multiple layers of protection:

1. **Path anonymization** — File paths stripped to project-relative
2. **Username hashing** — Your macOS username + any configured usernames replaced with stable hashes
3. **Secret detection** — Regex patterns catch JWT tokens, API keys (Anthropic, OpenAI, GitHub, AWS, etc.), database passwords, private keys, Discord webhooks, and more
4. **Entropy analysis** — Long high-entropy strings in quotes are flagged as potential secrets
5. **Email redaction** — Personal email addresses removed
6. **Custom redaction** — You can configure additional strings and usernames to redact
7. **Tool call redaction** — Secrets in tool inputs and outputs are redacted

**This is NOT foolproof.** Always review your exported data before sharing.
Automated redaction cannot catch everything — especially service-specific
identifiers, third-party PII, or secrets in unusual formats.
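
To make two of these layers concrete, here is a minimal sketch of stable username hashing and entropy-based flagging. It illustrates the general techniques only and is not ClawTrace's code: the function names, the `user_` prefix, the 20-character minimum, and the 4.0 bits-per-character threshold are all assumptions chosen for the example.

```python
import hashlib
import math
import re
from collections import Counter

# Quoted runs of 20+ characters with no whitespace or quotes (assumed cutoff).
QUOTED = re.compile(r"""(["'])([^"'\s]{20,})\1""")


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def flag_high_entropy(text: str, threshold: float = 4.0) -> list[str]:
    """Return quoted strings whose entropy meets the (assumed) threshold."""
    return [
        m.group(2)
        for m in QUOTED.finditer(text)
        if shannon_entropy(m.group(2)) >= threshold
    ]


def stable_hash(username: str, salt: str = "") -> str:
    """Replace a username with a short hash that is identical on every run."""
    digest = hashlib.sha256((salt + username).encode("utf-8")).hexdigest()
    return f"user_{digest[:8]}"


line = 'token = "abcdefghijklmnopqrstuvwxyz012345"; pad = "aaaaaaaaaaaaaaaaaaaaaaaa"'
print(flag_high_entropy(line))   # only the varied string is flagged
print(stable_hash("alice") == stable_hash("alice"))
```

A repetitive string such as the padding value scores near zero bits per character, so only strings that look random are flagged. Real secrets with low entropy or unusual formats slip past this kind of check, which is one reason the manual review matters.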

To help improve redaction, report issues: https://github.com/kaiaiagent/clawtrace/issues

</details>

<details>
<summary><b>Data schema</b></summary>

Each line in the exported JSONL file (shown here as `conversations.jsonl`) is one session:

```json
{
  "session_id": "abc-123",
  "project": "my-project",
  "model": "claude-opus-4-6",
  "git_branch": "main",
  "start_time": "2025-06-15T10:00:00+00:00",
  "end_time": "2025-06-15T10:30:00+00:00",
  "messages": [
    {"role": "user", "content": "Fix the login bug", "timestamp": "..."},
    {
      "role": "assistant",
      "content": "I'll investigate the login flow.",
      "thinking": "The user wants me to look at...",
      "tool_uses": [
        {
          "tool": "bash",
          "input": {"command": "grep -r 'login' src/"},
          "output": {"text": "src/auth.py:42: def login(user, password):"},
          "status": "success"
        }
      ],
      "timestamp": "..."
    }
  ],
  "stats": {
    "user_messages": 5, "assistant_messages": 8,
    "tool_uses": 20, "input_tokens": 50000, "output_tokens": 3000
  }
}
```
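
Because each line is a standalone JSON object, the export can be processed one session at a time. The following sketch assumes only the field names shown in the schema above. The `summarize` helper is hypothetical, and the sample record is made up for the demonstration.

```python
import io
import json
from collections import Counter


def summarize(lines):
    """Yield (session_id, message_count, tool-name counts) for each JSONL line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        session = json.loads(line)
        messages = session.get("messages", [])
        tools = Counter(
            use["tool"]
            for msg in messages
            for use in msg.get("tool_uses", [])  # user messages carry no tool_uses
        )
        yield session["session_id"], len(messages), tools


# Made-up record following the schema above. For a real export, pass an open
# file instead, e.g. `with open("conversations.jsonl") as f: summarize(f)`.
record = {
    "session_id": "abc-123",
    "messages": [
        {"role": "user", "content": "Fix the login bug"},
        {"role": "assistant", "content": "I'll investigate.",
         "tool_uses": [{"tool": "bash", "input": {"command": "ls"}}]},
    ],
}
for sid, count, tools in summarize(io.StringIO(json.dumps(record) + "\n")):
    print(sid, count, dict(tools))
```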

</details>


<details>
<summary><b>Gotchas</b></summary>

- **`--exclude`, `--redact`, `--redact-usernames` APPEND**: they never overwrite. Safe to call repeatedly.
- **Source selection is REQUIRED before export**: run `clawtrace config --source <name>`, where `<name>` is `claude`, `codex`, `gemini`, `opencode`, `openclaw`, `kimi`, or `all`.
- **PII audit is critical**: automated redaction is not foolproof, so review the exported file before sharing it.
- **Large exports take time**: 500+ sessions may take 1-3 minutes.
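
For the PII audit, a quick pass over the exported file can surface leftovers before you read it by hand. This is a rough sketch under stated assumptions, not a ClawTrace feature: the email pattern is deliberately simple, and `audit_line` and `extra_terms` are hypothetical names. It supplements the manual review and does not replace it.

```python
import json
import re

# Simple email pattern; it will miss unusual addresses.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def iter_strings(value):
    """Yield every string value nested anywhere inside a parsed JSON value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def audit_line(line: str, extra_terms=()):
    """Return (session_id, sorted unique hits) for one exported JSONL line."""
    session = json.loads(line)
    hits = set()
    for text in iter_strings(session):
        hits.update(EMAIL.findall(text))
        hits.update(term for term in extra_terms if term in text)
    return session.get("session_id"), sorted(hits)


sample = json.dumps({
    "session_id": "abc-123",
    "messages": [{"role": "user", "content": "Mail jane@example.com about my-secret-project"}],
})
print(audit_line(sample, extra_terms=["my-secret-project"]))
```

Passing the same strings you gave to `clawtrace config --redact` as `extra_terms` is a cheap way to confirm none of them survived in the output.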

</details>

## License

MIT
