Metadata-Version: 2.4
Name: corpis
Version: 0.1.2
Summary: Codebase knowledge layer for AI coding agents
License-Expression: MIT
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: chromadb>=0.4
Requires-Dist: openai>=1.0
Requires-Dist: python-dotenv>=1.0
Dynamic: license-file

# Corpis

A codebase knowledge layer for AI coding agents. Corpis analyses every source file in your project, distils each one into a structured logic record (purpose, mechanics, decision history), and makes the whole thing queryable in plain English.

Every AI agent working in your repo — Claude, Gemini, GPT — can query Corpis before reading source files, getting instant grounded answers without burning context.

## Install

```bash
pip install corpis
```

## Quickstart

```bash
cd my-project
corpis init        # scaffold corpis/, create .env, write agent instruction files
# add your OPENAI_API_KEY to corpis/.env
corpis genesis     # ingest the codebase (run once)
corpis query       # start asking questions
```

## Commands

| Command | Description |
|---------|-------------|
| `corpis init` | Scaffold Corpis in the current project |
| `corpis genesis [--reset]` | Ingest all source files into the knowledge corpus |
| `corpis query` | Interactive TUI chat interface (or pipe: `echo "question" \| corpis query`) |
| `corpis update <file> "<why>"` | Update a file's record after a code change |
| `corpis delete <file>` | Remove a file's record from the corpus |
| `corpis cache` | Browse the semantic answer cache |

## How it works

1. **Genesis** — walks your repo, sends each file to GPT-4o, and writes a structured logic record: what the code does, why it exists, and how it works. Records are stored in `corpis/docs/` organised by business domain.

2. **Query** — two-stage agentic retrieval: scout (semantic search across all records) → AI selects the most relevant ones → synthesise a grounded answer. Answers are cached semantically so identical or paraphrased questions return instantly.

3. **Update** — after every code change, run `corpis update <file> "<why>"`. The record gets a new dated changelog entry and any stale cached answers are invalidated automatically.

## Agent integration

`corpis init` writes instruction files for every major AI coding agent:

- `CLAUDE.md` — imperative instructions for Claude Code
- `AGENTS.md` — instructions for OpenAI Codex / GPT agents
- `GEMINI.md` — instructions for Gemini CLI

Agents are instructed to query Corpis before reading source files and to run `corpis update` after every change — keeping the knowledge corpus current without any manual effort.

## Requirements

- Python 3.9+
- OpenAI API key (used for embeddings and synthesis)
