Metadata-Version: 2.4
Name: checkinnit
Version: 0.2.0
Summary: Git-native AI code intelligence layer - semantic understanding, risk detection, and memory for your commits
Author: AI Code Reviewer Team
License: MIT
Project-URL: Homepage, https://github.com/ManeeshProg/checkinnit
Project-URL: Documentation, https://github.com/ManeeshProg/checkinnit#readme
Project-URL: Repository, https://github.com/ManeeshProg/checkinnit
Project-URL: Issues, https://github.com/ManeeshProg/checkinnit/issues
Keywords: git,code-review,ai,llm,security,static-analysis
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Version Control :: Git
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: gitpython>=3.1.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: requests>=2.28.0
Provides-Extra: memory
Requires-Dist: faiss-cpu>=1.7.0; extra == "memory"
Requires-Dist: sentence-transformers>=2.2.0; extra == "memory"
Provides-Extra: full
Requires-Dist: faiss-cpu>=1.7.0; extra == "full"
Requires-Dist: sentence-transformers>=2.2.0; extra == "full"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Dynamic: license-file

# AI Code Reviewer

A multi-phase code review system that analyzes GitHub pull requests using local LLMs, semantic diff parsing, and deterministic policy gates.

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           AI CODE REVIEWER                                   │
│                                                                             │
│   "Policy > AI  •  Determinism > Cleverness  •  Human Control > Autonomy"  │
└─────────────────────────────────────────────────────────────────────────────┘
```

## Features

- **Local-First**: Runs entirely on your machine using Ollama (no cloud API required)
- **Semantic Understanding**: Parses diffs into meaningful code chunks with risk detection
- **Historical Memory**: RAG-based retrieval of similar past changes and regressions
- **AI-Powered Analysis**: LLM reasoning with structured JSON output
- **GitHub Integration**: Fetches PRs and posts review comments automatically
- **Deterministic Decisions**: Policy gates that PASS/WARN/FAIL with explainable reasons
- **Human Override**: Label or comment-based override for blocking decisions

## Architecture Overview

```
┌──────────────────────────────────────────────────────────────────────────────┐
│                              GITHUB PR                                        │
│                         (Pull Request URL)                                    │
└─────────────────────────────────┬────────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                           PHASE 4: GITHUB                                     │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐   │
│  │   Client    │───▶│ Diff Fetcher│───▶│ Orchestrator│───▶│  Formatter  │   │
│  │  (API/Auth) │    │  (PR Files) │    │ (Workflow)  │    │ (Comments)  │   │
│  └─────────────┘    └─────────────┘    └──────┬──────┘    └─────────────┘   │
└──────────────────────────────────────────────┼───────────────────────────────┘
                                               │
                    ┌──────────────────────────┼──────────────────────────┐
                    │                          │                          │
                    ▼                          ▼                          ▼
┌─────────────────────────┐  ┌─────────────────────────┐  ┌─────────────────────────┐
│   PHASE 1/1.5: DIFF     │  │   PHASE 2: MEMORY       │  │   PHASE 5: POLICY       │
│                         │  │                         │  │                         │
│  ┌───────────────────┐  │  │  ┌───────────────────┐  │  │  ┌───────────────────┐  │
│  │   Git Extractor   │  │  │  │   RAG Store       │  │  │  │  Confidence       │  │
│  │   (Commits/Diffs) │  │  │  │ (FAISS+LlamaIndex)│  │  │  │  Calibration      │  │
│  └─────────┬─────────┘  │  │  └─────────┬─────────┘  │  │  └─────────┬─────────┘  │
│            │            │  │            │            │  │            │            │
│            ▼            │  │            ▼            │  │            ▼            │
│  ┌───────────────────┐  │  │  ┌───────────────────┐  │  │  ┌───────────────────┐  │
│  │   Diff Parser     │  │  │  │   Chunk Ingester  │  │  │  │   Merge Gates     │  │
│  │   (Semantic)      │  │  │  │   (Documents)     │  │  │  │   (PASS/WARN/FAIL)│  │
│  └─────────┬─────────┘  │  │  └─────────┬─────────┘  │  │  └─────────┬─────────┘  │
│            │            │  │            │            │  │            │            │
│            ▼            │  │            ▼            │  │            ▼            │
│  ┌───────────────────┐  │  │  ┌───────────────────┐  │  │  ┌───────────────────┐  │
│  │   Semantic Chunks │  │  │  │   Chunk Retriever │  │  │  │   Human Overrides │  │
│  │   (ID, Risk Tags) │  │  │  │   (Similarity)    │  │  │  │  (Labels/Comments)│  │
│  └───────────────────┘  │  │  └───────────────────┘  │  │  └───────────────────┘  │
└─────────────────────────┘  └─────────────────────────┘  └─────────────────────────┘
            │                            │                            │
            └────────────────┬───────────┴────────────────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                           PHASE 3: REVIEWER                                   │
│                                                                              │
│  ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐          │
│  │  Prompt Builder │───▶│  Review Engine  │───▶│  ReviewResult   │          │
│  │  (System/User)  │    │  (Ollama/OpenAI)│    │  (Structured)   │          │
│  └─────────────────┘    └─────────────────┘    └─────────────────┘          │
│                                                                              │
│  Providers: Ollama (local) │ OpenAI │ Gemini                                │
└──────────────────────────────────────────────────────────────────────────────┘
```

## Data Flow

```
PR URL ──▶ Fetch Diff ──▶ Parse Chunks ──▶ Retrieve History ──▶ LLM Review ──▶ Policy Gate ──▶ Post Comment
              │                │                  │                 │              │
              ▼                ▼                  ▼                 ▼              ▼
         [GitHub API]    [SemanticChunk]    [FAISS Vector]    [ReviewResult]  [GateResult]
                              │                  │                 │              │
                              │                  ▼                 │              │
                              │            [Historical            │              │
                              │             Context]              │              │
                              │                  │                 │              │
                              └──────────────────┴─────────────────┘              │
                                                 │                                │
                                                 ▼                                ▼
                                          [LLM Prompt]                    [PASS/WARN/FAIL]
```

## Project Structure

```
ai-code-reviewer/
├── diff_analyzer/           # Phase 1/1.5: Git diff extraction
│   ├── __init__.py
│   ├── get_diff.py          # CommitDiffExtractor (GitPython)
│   └── parser.py            # DiffParser, SemanticChunk, risk detection
│
├── memory/                  # Phase 2: RAG-based commit memory
│   ├── __init__.py
│   ├── rag_store.py         # CommitMemoryStore (FAISS + LlamaIndex)
│   ├── ingest.py            # ChunkIngester (Document conversion)
│   └── retrieve.py          # ChunkRetriever (Similarity search)
│
├── reviewer/                # Phase 3: LLM code review engine
│   ├── __init__.py
│   ├── engine.py            # ReviewEngine (Ollama/OpenAI/Gemini)
│   └── prompt.py            # PromptBuilder, system prompts
│
├── github/                  # Phase 4: GitHub PR integration
│   ├── __init__.py
│   ├── client.py            # GitHubClient (API, auth, retry)
│   ├── diff_fetcher.py      # PRDiffFetcher (PR files, patches)
│   ├── comment_formatter.py # CommentFormatter (Markdown)
│   └── orchestrator.py      # PRReviewOrchestrator (main workflow)
│
├── policy/                  # Phase 5: Deterministic trust gates
│   ├── __init__.py
│   ├── types.py             # MergeDecision, GateResult, BlockingIssue
│   ├── confidence.py        # ConfidenceCalibrator (evidence quality)
│   ├── gates.py             # MergeGate (PASS/WARN/FAIL logic)
│   └── overrides.py         # OverrideDetector (labels, comments)
│
├── examples/                # Test scripts
│   ├── test_github.py       # Phase 4 tests
│   └── test_policy.py       # Phase 5 tests (10 deterministic tests)
│
├── index/                   # FAISS vector index (generated)
├── .env                     # API keys (GITHUB_TOKEN, etc.)
├── requirements.txt         # Python dependencies
└── README.md                # This file
```

## Installation

### Prerequisites

- Python 3.10+
- [Ollama](https://ollama.com/) installed and running
- Git

### Setup

```bash
# Clone the repository
git clone https://github.com/ManeeshProg/checkinnit.git
cd checkinnit

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
# or
.\venv\Scripts\activate   # Windows

# Install dependencies
pip install -r requirements.txt

# Pull Ollama model
ollama pull qwen2.5:7b-instruct

# Configure environment
cp .env.example .env
# Edit .env and add your GITHUB_TOKEN
```

### Environment Variables

```env
# Required for GitHub integration
GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxx

# Optional: Use OpenAI instead of Ollama
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxx

# Optional: Use Gemini
GEMINI_API_KEY=AIzaxxxxxxxxxxxxxxxxxxxxxxx
```

## Usage

### Review a GitHub PR

```python
from github import review_pr_url

# Review with comment posting
result = review_pr_url("https://github.com/owner/repo/pull/123")

print(f"Success: {result['success']}")
print(f"Decision: {result['gate_result'].decision}")  # PASS/WARN/FAIL
print(f"Issues: {len(result['review_result'].issues)}")
```

### CLI Usage

```bash
# Review a PR
python -m github.orchestrator https://github.com/owner/repo/pull/123

# Review without posting comment
python -m github.orchestrator https://github.com/owner/repo/pull/123 --no-comment

# Use OpenAI instead of Ollama
python -m github.orchestrator https://github.com/owner/repo/pull/123 --provider openai
```

### Programmatic Usage

```python
from github import PRReviewOrchestrator, ReviewConfig

# Custom configuration
config = ReviewConfig(
    llm_provider="ollama",           # or "openai", "gemini"
    llm_model="qwen2.5:7b-instruct", # model name
    temperature=0.0,                  # deterministic output
    use_memory=True,                  # use historical context
    enable_policy_gates=True,         # apply PASS/WARN/FAIL logic
    post_comment=True,                # post to GitHub
)

orchestrator = PRReviewOrchestrator(config=config)
result = orchestrator.review_pr("owner", "repo", 123)
```
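
`review_pr_url` takes a URL while `review_pr` takes an owner, a repository, and a PR number, so some URL parsing has to happen between the two. A minimal sketch of that step, assuming a hypothetical `parse_pr_url` helper and `PRRef` tuple (neither name is taken from the project's code):

```python
import re
from typing import NamedTuple


class PRRef(NamedTuple):
    """Hypothetical container for the three values review_pr() expects."""
    owner: str
    repo: str
    number: int


# Matches https://github.com/<owner>/<repo>/pull/<number>, optionally followed
# by a sub-path, query string, or fragment (for example ".../pull/123/files").
_PR_URL = re.compile(
    r"^https?://github\.com/(?P<owner>[^/\s]+)/(?P<repo>[^/\s]+)"
    r"/pull/(?P<number>\d+)(?:[/?#].*)?$"
)


def parse_pr_url(url: str) -> PRRef:
    """Split a GitHub PR URL into owner, repo, and PR number."""
    match = _PR_URL.match(url.strip())
    if match is None:
        raise ValueError(f"Not a GitHub pull request URL: {url!r}")
    return PRRef(match["owner"], match["repo"], int(match["number"]))


ref = parse_pr_url("https://github.com/owner/repo/pull/123")
print(ref)  # PRRef(owner='owner', repo='repo', number=123)
```

Raising `ValueError` on anything that is not a PR URL keeps bad input from reaching the GitHub API call.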

## Policy Gate Logic

The policy module makes **deterministic** decisions based on explicit rules:

### Decision Matrix

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           MERGE DECISION LOGIC                               │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                          ❌ FAIL                                     │   │
│  │                                                                     │   │
│  │  Triggers when ANY of:                                              │   │
│  │  • severity == "high" AND issue_type == "security"                  │   │
│  │    AND evidence.similarity >= 0.6 AND confidence >= 0.7            │   │
│  │  • Repeated regression pattern detected                            │   │
│  │                                                                     │   │
│  │  Result: Merge BLOCKED (unless override applied)                   │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                          ⚠️ WARN                                     │   │
│  │                                                                     │   │
│  │  Triggers when ANY of:                                              │   │
│  │  • overall_risk == "medium"                                         │   │
│  │  • calibrated_confidence < 0.7                                      │   │
│  │  • Non-security high severity issue                                 │   │
│  │                                                                     │   │
│  │  Result: Human review recommended                                   │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                    │                                        │
│                                    ▼                                        │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │                          ✅ PASS                                     │   │
│  │                                                                     │   │
│  │  Requires ALL of:                                                   │   │
│  │  • overall_risk == "low"                                            │   │
│  │  • No high severity issues                                          │   │
│  │  • calibrated_confidence >= 0.8                                     │   │
│  │                                                                     │   │
│  │  Result: Safe to merge                                              │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
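
The matrix above can be read as a small pure function. The sketch below is one way to express it, not the project's `MergeGate` implementation: the `Issue` and `Review` classes and the `decide` function are hypothetical names. It assumes that the `confidence` in the FAIL rule means the calibrated confidence, and that any case the matrix does not list (for example, low risk with confidence between 0.7 and 0.8) falls back to WARN.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Issue:
    severity: str                      # "low" | "medium" | "high"
    issue_type: str                    # e.g. "security", "performance"
    similarity: float = 0.0            # evidence.similarity from the review
    repeated_regression: bool = False  # assumed flag for a repeated regression


@dataclass
class Review:
    overall_risk: str
    calibrated_confidence: float
    issues: List[Issue] = field(default_factory=list)


def decide(review: Review) -> str:
    """Return "FAIL", "WARN", or "PASS" using the thresholds in the matrix."""
    # FAIL: any single trigger is enough.
    for issue in review.issues:
        blocking_security = (
            issue.severity == "high"
            and issue.issue_type == "security"
            and issue.similarity >= 0.6
            and review.calibrated_confidence >= 0.7
        )
        if blocking_security or issue.repeated_regression:
            return "FAIL"

    # PASS: every condition must hold.
    has_high = any(issue.severity == "high" for issue in review.issues)
    if (
        review.overall_risk == "low"
        and not has_high
        and review.calibrated_confidence >= 0.8
    ):
        return "PASS"

    # WARN: medium risk, low confidence, non-security high severity,
    # plus (as an assumption) anything the matrix leaves unspecified.
    return "WARN"


print(decide(Review("low", 0.9)))  # PASS
print(decide(Review("medium", 0.85, [Issue("high", "security", 0.85)])))  # FAIL
```

Because the function reads only its argument and has no randomness or I/O, the same review always yields the same decision.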

### Confidence Calibration

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                        CONFIDENCE CALIBRATION                                │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  Base Confidence (from LLM)                                                 │
│         │                                                                   │
│         ▼                                                                   │
│  ┌─────────────────┐                                                        │
│  │  Line Confidence │ ──▶ "medium" = 0.9x penalty                          │
│  │  Factor          │                                                       │
│  └────────┬────────┘                                                        │
│           │                                                                 │
│           ▼                                                                 │
│  ┌─────────────────┐                                                        │
│  │  Similarity     │ ──▶ < 0.6 = 0.85x penalty                             │
│  │  Factor         │ ──▶ >= 0.8 = 1.05x boost                              │
│  └────────┬────────┘                                                        │
│           │                                                                 │
│           ▼                                                                 │
│  ┌─────────────────┐                                                        │
│  │  Evidence       │ ──▶ Missing historical = 0.8x penalty                 │
│  │  Factor         │ ──▶ Good evidence = 1.05x boost                       │
│  └────────┬────────┘                                                        │
│           │                                                                 │
│           ▼                                                                 │
│  ┌─────────────────┐                                                        │
│  │  Regression     │ ──▶ Repeated = 1.15x boost                            │
│  │  Factor         │                                                       │
│  └────────┬────────┘                                                        │
│           │                                                                 │
│           ▼                                                                 │
│  Calibrated Confidence (used for decisions)                                │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
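
The calibration chain is a sequence of multipliers applied to the LLM's base confidence. The sketch below uses the factors from the diagram. The function name, its parameters, the split of the evidence factor into `has_history` and `good_evidence`, and the final clamp to the range 0 to 1 are assumptions for illustration rather than the behavior of `ConfidenceCalibrator`.

```python
from typing import Optional


def calibrate(
    base: float,
    line_confidence: str,
    similarity: Optional[float],
    has_history: bool,
    good_evidence: bool,
    repeated_regression: bool,
) -> float:
    """Apply the diagram's multipliers in order and clamp the result."""
    score = base

    # Line confidence factor: "medium" is penalized.
    if line_confidence == "medium":
        score *= 0.9

    # Similarity factor: weak matches are penalized, strong ones boosted.
    if similarity is not None:
        if similarity < 0.6:
            score *= 0.85
        elif similarity >= 0.8:
            score *= 1.05

    # Evidence factor: missing history is penalized, good evidence boosted.
    if not has_history:
        score *= 0.8
    elif good_evidence:
        score *= 1.05

    # Regression factor: a repeated regression raises confidence.
    if repeated_regression:
        score *= 1.15

    # Assumption: boosts must not push the score outside [0, 1].
    return max(0.0, min(1.0, score))


weak = calibrate(0.8, "medium", 0.5, has_history=False,
                 good_evidence=False, repeated_regression=False)
print(round(weak, 4))  # 0.4896
```

In this example a base confidence of 0.8 drops below the 0.7 threshold once the three penalties stack (0.8 × 0.9 × 0.85 × 0.8), which would move the review into WARN territory.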

### Human Overrides

Two override mechanisms allow a blocked PR to be merged:

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                          HUMAN OVERRIDES                                     │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  Method 1: PR Label                                                         │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │  Add label: "ai-override-approved"                                   │   │
│  │                                                                     │   │
│  │  Effect: FAIL → WARN (merge unblocked)                              │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│  Method 2: PR Comment                                                       │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │  Comment: /ai-approve                                               │   │
│  │           /ai-approve false positive                                │   │
│  │           /ai-approve accepted risk                                 │   │
│  │                                                                     │   │
│  │  Effect: FAIL → WARN (merge unblocked)                              │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│  Override Info Captured:                                                    │
│  • Reason (label_approved, comment_approved, false_positive, etc.)         │
│  • Approved by (username)                                                   │
│  • Timestamp                                                                │
│  • Comment text (if applicable)                                             │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
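
Both override methods reduce to a check over the PR's labels and comments. The sketch below is a hypothetical version of that check, not the project's `OverrideDetector`: the `Override` class, the function names, the `labeled_by` parameter, and the rule that turns the text after `/ai-approve` into a snake_case reason are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional, Tuple

OVERRIDE_LABEL = "ai-override-approved"
OVERRIDE_COMMAND = "/ai-approve"


@dataclass
class Override:
    reason: str                        # e.g. "label_approved", "false_positive"
    approved_by: str
    timestamp: datetime
    comment_text: Optional[str] = None


def detect_override(
    labels: Iterable[str],
    comments: Iterable[Tuple[str, str]],  # (author, body) pairs
    labeled_by: str = "unknown",
    now: Optional[datetime] = None,
) -> Optional[Override]:
    """Return the first override found, checking the label before comments."""
    now = now or datetime.now(timezone.utc)
    if OVERRIDE_LABEL in set(labels):
        return Override("label_approved", labeled_by, now)
    for author, body in comments:
        text = body.strip()
        # Require an exact command or a command followed by a space, so that
        # a word such as "/ai-approved" is not treated as an override.
        if text == OVERRIDE_COMMAND or text.startswith(OVERRIDE_COMMAND + " "):
            detail = text[len(OVERRIDE_COMMAND):].strip()
            reason = "_".join(detail.lower().split()) or "comment_approved"
            return Override(reason, author, now, comment_text=text)
    return None


def apply_override(decision: str, override: Optional[Override]) -> str:
    """Downgrade FAIL to WARN when an override exists; leave others as-is."""
    return "WARN" if decision == "FAIL" and override is not None else decision


found = detect_override([], [("alice", "/ai-approve false positive")])
print(found.reason, found.approved_by)  # false_positive alice
print(apply_override("FAIL", found))    # WARN
```

An override only downgrades FAIL to WARN and never produces PASS, so an overridden PR still carries a visible warning.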

## Review Output Format

### JSON Structure (Phase 3)

```json
{
  "overall_risk": "medium",
  "confidence": 0.85,
  "issues": [
    {
      "chunk_id": "93ef8ccbfd005a9e",
      "file_path": "auth/login.py",
      "severity": "high",
      "issue_type": "security",
      "summary": "SQL injection vulnerability",
      "suggestion": "Use parameterized queries",
      "evidence": {
        "similarity": 0.85,
        "previous_commit": "abc123def"
      }
    }
  ],
  "positive_changes": [
    {
      "file_path": "tests/test_auth.py",
      "summary": "Added comprehensive unit tests"
    }
  ]
}
```
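
LLM output is not guaranteed to be valid JSON, so the structure above needs defensive parsing before the policy gates can use it. The sketch below shows one approach using only the standard library. `ParsedIssue`, `ParsedReview`, and `parse_review` are hypothetical names, and the fallback of medium risk with zero confidence on malformed input is an assumption chosen so that a bad response leads to a human look rather than a crash.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List

ALLOWED_LEVELS = {"low", "medium", "high"}


@dataclass
class ParsedIssue:
    chunk_id: str
    file_path: str
    severity: str
    issue_type: str
    summary: str
    suggestion: str
    evidence: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ParsedReview:
    overall_risk: str = "medium"   # assumed fallback for unusable input
    confidence: float = 0.0
    issues: List[ParsedIssue] = field(default_factory=list)
    positive_changes: List[Dict[str, Any]] = field(default_factory=list)


def parse_review(raw: str) -> ParsedReview:
    """Parse the LLM's JSON reply, tolerating malformed or partial output."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return ParsedReview()
    if not isinstance(data, dict):
        return ParsedReview()

    risk = data.get("overall_risk")
    if risk not in ALLOWED_LEVELS:
        risk = "medium"
    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0
    confidence = max(0.0, min(1.0, confidence))

    raw_issues = data.get("issues")
    issues = []
    for item in raw_issues if isinstance(raw_issues, list) else []:
        if not isinstance(item, dict):
            continue  # skip entries that are not objects
        evidence = item.get("evidence")
        issues.append(ParsedIssue(
            chunk_id=str(item.get("chunk_id", "")),
            file_path=str(item.get("file_path", "")),
            severity=str(item.get("severity", "low")),
            issue_type=str(item.get("issue_type", "")),
            summary=str(item.get("summary", "")),
            suggestion=str(item.get("suggestion", "")),
            evidence=evidence if isinstance(evidence, dict) else {},
        ))

    raw_positive = data.get("positive_changes")
    positives = [p for p in raw_positive if isinstance(p, dict)] \
        if isinstance(raw_positive, list) else []
    return ParsedReview(risk, confidence, issues, positives)


print(parse_review("this is not JSON"))
```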

### PR Comment Example

The system posts formatted Markdown comments to GitHub PRs:

```markdown
## 🤖 AI Code Review

🟡 **MEDIUM RISK**

### Summary

| Metric | Value |
|--------|-------|
| ❌ High Severity | 1 |
| ⚠️ Medium Severity | 0 |
| 🔵 Low Severity | 0 |
| Chunks Reviewed | 3 |
| Confidence | 85% |

### Issues Found

#### ❌ HIGH Severity

<details>
<summary>🔒 <b>SQL injection vulnerability</b></summary>

**File:** `auth/login.py`
**Type:** security

**Suggestion:**
> Use parameterized queries

**Evidence:**
- Similarity to past issue: 85%
- Previous commit: `abc123def`

</details>

---

### ⚠️ Merge Decision: **WARN**

**Reason:** Review recommended: high severity security issue
**Confidence:** 85%

---
*Generated by AI Code Reviewer using `qwen2.5:7b-instruct`*
```
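
The summary table in the comment is a direct rendering of severity counts plus two review-level values. A minimal sketch of how such a table could be produced, assuming a hypothetical `render_summary` function (the project's `CommentFormatter` may be organized differently):

```python
from collections import Counter
from typing import Iterable


def render_summary(
    severities: Iterable[str], chunks_reviewed: int, confidence: float
) -> str:
    """Build the Markdown summary table from per-issue severity strings."""
    counts = Counter(severities)  # missing keys count as 0
    rows = [
        ("❌ High Severity", counts["high"]),
        ("⚠️ Medium Severity", counts["medium"]),
        ("🔵 Low Severity", counts["low"]),
        ("Chunks Reviewed", chunks_reviewed),
        ("Confidence", f"{confidence:.0%}"),  # 0.85 -> "85%"
    ]
    lines = ["| Metric | Value |", "|--------|-------|"]
    lines += [f"| {name} | {value} |" for name, value in rows]
    return "\n".join(lines)


print(render_summary(["high"], chunks_reviewed=3, confidence=0.85))
```

Called with one high-severity issue, three chunks, and a confidence of 0.85, this reproduces the table rows shown in the example comment.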

## Testing

### Run All Tests

```bash
# Phase 4: GitHub integration tests
python examples/test_github.py --mock

# Phase 5: Policy logic tests (deterministic)
python examples/test_policy.py
```

### Test Coverage

| Test | Description | Status |
|------|-------------|--------|
| URL Parsing | Parse GitHub PR URLs | ✅ |
| Comment Formatter | Generate markdown comments | ✅ |
| Error Formatting | Graceful error messages | ✅ |
| Security Regression → FAIL | High severity + security + evidence | ✅ |
| Low Confidence → WARN | Calibrated confidence < 0.7 | ✅ |
| Clean PR → PASS | Low risk, high confidence | ✅ |
| Override FAIL → WARN | Label/comment override | ✅ |
| Override Detection | Detect labels and commands | ✅ |
| Non-Security High → WARN | Performance issues don't block | ✅ |
| Determinism | Same input = same output | ✅ |

## Design Principles

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                         DESIGN PRINCIPLES                                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  1. POLICY > AI                                                             │
│     └─▶ LLM provides insight, policy makes decisions                       │
│                                                                             │
│  2. DETERMINISM > CLEVERNESS                                                │
│     └─▶ Same input always produces same output                             │
│                                                                             │
│  3. HUMAN CONTROL > AUTONOMY                                                │
│     └─▶ Humans can always override AI decisions                            │
│                                                                             │
│  4. EXPLICIT RULES > HIDDEN HEURISTICS                                      │
│     └─▶ All decision logic is visible and testable                         │
│                                                                             │
│  5. LOCAL-FIRST > CLOUD DEPENDENCY                                          │
│     └─▶ Works offline with Ollama                                          │
│                                                                             │
│  6. GRACEFUL FAILURE > CRASH                                                │
│     └─▶ Never crashes on malformed input                                   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

## Supported LLM Providers

| Provider | Model | Default | API Key Required |
|----------|-------|---------|------------------|
| Ollama | qwen2.5:7b-instruct | ✅ | No |
| OpenAI | gpt-4o-mini | | Yes |
| Gemini | gemini-1.5-flash | | Yes |
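
The table maps naturally onto a lookup that also checks whether the required API key is present. The sketch below is illustrative only: `PROVIDERS` and `resolve_provider` are hypothetical names, while the model names and environment variable names come from this README.

```python
import os
from typing import Dict, Mapping, Optional

# Default model and required environment variable for each provider.
PROVIDERS: Dict[str, Dict[str, Optional[str]]] = {
    "ollama": {"model": "qwen2.5:7b-instruct", "env_key": None},
    "openai": {"model": "gpt-4o-mini", "env_key": "OPENAI_API_KEY"},
    "gemini": {"model": "gemini-1.5-flash", "env_key": "GEMINI_API_KEY"},
}


def resolve_provider(
    name: str = "ollama", env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return provider settings, failing early if an API key is missing."""
    env = os.environ if env is None else env
    key = name.lower()
    if key not in PROVIDERS:
        raise ValueError(f"Unknown provider {name!r}; expected one of {sorted(PROVIDERS)}")
    spec = PROVIDERS[key]
    env_key = spec["env_key"]
    if env_key and not env.get(env_key):
        raise RuntimeError(f"{env_key} must be set to use provider {key!r}")
    return {"llm_provider": key, "llm_model": str(spec["model"])}


print(resolve_provider("ollama", env={}))
# {'llm_provider': 'ollama', 'llm_model': 'qwen2.5:7b-instruct'}
```

The returned keys match the `llm_provider` and `llm_model` fields of `ReviewConfig` shown earlier, so the dictionary could plausibly be passed into that constructor.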

## Local Commit Analysis

To analyze local Git commits without GitHub:

```python
from diff_analyzer import CommitDiffExtractor, DiffParser

# Extract diff from a commit
extractor = CommitDiffExtractor("/path/to/repo")
commit_diff = extractor.get_commit_diff("abc123")

# Parse into semantic chunks
parser = DiffParser()
for file_diff in commit_diff.files:
    parsed = parser.parse(
        file_path=file_diff.file_path,
        change_type=file_diff.change_type,
        diff_text=file_diff.diff
    )

    for chunk in parsed.chunks:
        print(f"Chunk ID: {chunk.chunk_id}")
        print(f"Type: {chunk.chunk_type.value}")
        print(f"Risk Tags: {chunk.risk_tags}")
        print(f"Content: {chunk.content}")
```
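
This README does not document how `chunk_id` and `risk_tags` are computed. As a rough illustration of the idea only, the sketch below derives a stable 16-character ID from a content hash and tags a chunk using a few regular expressions. The hashing scheme, the tag names, and the patterns are all hypothetical and are not taken from `DiffParser`.

```python
import hashlib
import re
from typing import List

# Hypothetical risk patterns; a real rule set would be broader and tuned.
RISK_PATTERNS = {
    "sql": re.compile(r"\b(SELECT|INSERT|UPDATE|DELETE)\b", re.IGNORECASE),
    "secrets": re.compile(r"(password|secret|api[_-]?key|token)", re.IGNORECASE),
    "subprocess": re.compile(r"\b(subprocess|os\.system|eval|exec)\b"),
}


def make_chunk_id(file_path: str, content: str) -> str:
    """Derive a stable 16-hex-character ID from the file path and content."""
    digest = hashlib.sha256(f"{file_path}\n{content}".encode("utf-8")).hexdigest()
    return digest[:16]


def detect_risk_tags(content: str) -> List[str]:
    """Return the sorted names of all patterns that match the chunk."""
    return sorted(
        tag for tag, pattern in RISK_PATTERNS.items() if pattern.search(content)
    )


snippet = 'cursor.execute("SELECT * FROM users WHERE name = " + name)'
print(detect_risk_tags(snippet))  # ['sql']
print(len(make_chunk_id("auth/login.py", snippet)))  # 16
```

Hashing the content, rather than using a counter or timestamp, means the same chunk gets the same ID on every run, which fits the determinism principle.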

## Roadmap

- [x] Phase 1: Git diff extraction
- [x] Phase 1.5: Semantic chunks with IDs and risk tags
- [x] Phase 2: RAG memory with FAISS
- [x] Phase 3: LLM review engine
- [x] Phase 4: GitHub PR integration
- [x] Phase 5: Policy gates and overrides
- [ ] Phase 6: GitHub Actions integration
- [ ] Phase 7: Web dashboard
- [ ] Phase 8: Multi-repo support

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests: `python examples/test_policy.py`
5. Submit a pull request

## License

MIT License. See the LICENSE file for details.

## Acknowledgments

- [LlamaIndex](https://www.llamaindex.ai/) for RAG infrastructure
- [FAISS](https://github.com/facebookresearch/faiss) for vector search
- [Ollama](https://ollama.com/) for local LLM inference
- [GitPython](https://gitpython.readthedocs.io/) for Git operations
