Metadata-Version: 2.4
Name: sanicode
Version: 0.1.0
Summary: AI-assisted code sanitization scanner with OWASP ASVS, NIST 800-53, and ASD STIG compliance mapping.
Project-URL: Homepage, https://github.com/rdwj/sanicode
Project-URL: Repository, https://github.com/rdwj/sanicode
Project-URL: Issues, https://github.com/rdwj/sanicode/issues
Author: Sanicode Contributors
License: Apache-2.0
Keywords: compliance,llm,owasp,sast,security,stig
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Security
Requires-Python: >=3.10
Requires-Dist: fastapi>=0.100
Requires-Dist: litellm>=1.0
Requires-Dist: networkx>=3.0
Requires-Dist: prometheus-client>=0.17
Requires-Dist: rich>=13.0
Requires-Dist: tomli>=2.0; python_version < '3.11'
Requires-Dist: typer>=0.9.0
Requires-Dist: uvicorn[standard]>=0.20
Provides-Extra: dev
Requires-Dist: build>=1.0; extra == 'dev'
Requires-Dist: httpx>=0.24; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21; extra == 'dev'
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Requires-Dist: twine>=5.0; extra == 'dev'
Description-Content-Type: text/markdown

# Sanicode

Sanicode scans Python codebases for input validation and sanitization gaps, builds a knowledge graph of data flow (entry points, sanitizers, sinks), and maps every finding to OWASP ASVS 5.0, NIST 800-53, and ASD STIG v4r11 controls. Output formats include SARIF (for GitHub Code Scanning integration), JSON, and Markdown.

Unlike pattern-only tools such as Bandit or Semgrep, Sanicode constructs a data flow graph, so each finding carries context about *how* tainted data reaches a sink and *whether* sanitization exists along the path.
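
To make the idea concrete, here is a minimal sketch of a path query over a data flow graph. It is an illustration only, not Sanicode's implementation: the node names, the `EDGES` mapping, and the `unsanitized_paths` helper are all hypothetical.

```python
from collections import deque

# Hypothetical data flow graph: edges point in the direction data moves.
# Node names are illustrative and do not reflect sanicode's graph schema.
EDGES = {
    "request.args['q']": ["clean_query", "build_sql"],
    "clean_query": ["build_sql"],
    "build_sql": ["cursor.execute"],
    "cursor.execute": [],
}
SANITIZERS = {"clean_query"}


def unsanitized_paths(edges, source, sink, sanitizers):
    """Return every simple path from source to sink that passes no sanitizer."""
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == sink:
            paths.append(path)
            continue
        for nxt in edges.get(node, []):
            # Skip branches that go through a sanitizer, and avoid cycles.
            if nxt in sanitizers or nxt in path:
                continue
            queue.append(path + [nxt])
    return paths


tainted = unsanitized_paths(
    EDGES, "request.args['q']", "cursor.execute", SANITIZERS
)
for path in tainted:
    print(" -> ".join(path))
```

In this toy graph the entry point reaches the sink both through `clean_query` and directly through `build_sql`, so one unsanitized path is reported. A pattern-only check would see the `cursor.execute` call but not which route the data took.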

## Install

```
pip install sanicode
```

Requires Python 3.10+.

## Quick start

Scan a codebase and generate a Markdown report:

```
sanicode scan .
```

Generate SARIF output for CI integration:

```
sanicode scan . -f sarif
```

Reports are written to `sanicode-reports/` by default.
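
SARIF is plain JSON, so a report can be post-processed with a few lines of Python. The sketch below counts results per rule ID using the standard SARIF layout (`runs[].results[].ruleId`). The inline `SAMPLE` document is a hand-written stand-in, not actual Sanicode output, and `count_by_rule` is a hypothetical helper.

```python
import json
from collections import Counter


def count_by_rule(sarif: dict) -> Counter:
    """Count SARIF results per ruleId across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "unknown")] += 1
    return counts


# Hand-written sample standing in for a real report file (assumption).
SAMPLE = json.loads("""
{
  "version": "2.1.0",
  "runs": [
    {
      "tool": {"driver": {"name": "sanicode"}},
      "results": [
        {"ruleId": "SC001", "message": {"text": "eval() call"}},
        {"ruleId": "SC006", "message": {"text": "SQL string formatting"}},
        {"ruleId": "SC001", "message": {"text": "eval() call"}}
      ]
    }
  ]
}
""")

summary = count_by_rule(SAMPLE)
for rule_id, count in sorted(summary.items()):
    print(f"{rule_id}: {count}")
```

To use it on a real report, load the SARIF file from `sanicode-reports/` with `json.load()` and pass the resulting dict to `count_by_rule`.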

## API server

Start the FastAPI server for remote or hybrid scan mode:

```
sanicode serve
```

The server listens on port 8080 and exposes Prometheus metrics at `/metrics`.

### Endpoints

```
POST /api/v1/scan                Submit a scan (async)
GET  /api/v1/scan/{id}           Poll scan status
GET  /api/v1/scan/{id}/findings  Retrieve findings (JSON or ?format=sarif)
GET  /api/v1/scan/{id}/graph     Retrieve knowledge graph
POST /api/v1/analyze             Instant snippet analysis
GET  /api/v1/compliance/map      Compliance framework lookup
GET  /api/v1/health              Liveness check
GET  /metrics                    Prometheus metrics
```
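
A minimal client sketch for the read-only endpoints, using only the standard library. The paths and the `?format=sarif` query come from the list above; the host, the helper names, and the assumption that the health endpoint returns JSON are illustrative. Request and response bodies are not documented here, so the sketch does not attempt to submit a scan.

```python
import json
import urllib.parse
import urllib.request

# Port 8080 comes from this README; the host is an assumption.
BASE_URL = "http://localhost:8080"


def findings_url(scan_id: str, sarif: bool = False, base: str = BASE_URL) -> str:
    """Build the findings URL for a scan, optionally requesting SARIF."""
    quoted = urllib.parse.quote(scan_id, safe="")
    url = f"{base}/api/v1/scan/{quoted}/findings"
    return url + "?format=sarif" if sarif else url


def get_json(url: str, timeout: float = 10.0) -> dict:
    """GET a URL and decode the body as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def health(base: str = BASE_URL) -> dict:
    """Call the liveness endpoint; the response shape is not assumed."""
    return get_json(f"{base}/api/v1/health")


if __name__ == "__main__":
    # Requires a running `sanicode serve` instance.
    print(health())
```

Once a scan ID is known, `get_json(findings_url(scan_id, sarif=True))` would fetch the findings as SARIF.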

## CLI commands

```
sanicode scan .                            # Scan codebase, generate reports
sanicode scan . -f sarif                   # SARIF output
sanicode scan . -f json -f sarif           # Multiple formats
sanicode serve                             # Start API server on :8080
sanicode report scan-result.json           # Re-generate reports from saved results
sanicode report scan-result.json -s high   # Filter by severity
sanicode report scan-result.json --cwe 89  # Filter by CWE
sanicode config --show                     # Show resolved configuration
sanicode config --init                     # Create starter sanicode.toml
sanicode graph . --export graph.json       # Export knowledge graph
```

## Detection rules

| Rule   | Description                      | CWE     |
|--------|----------------------------------|---------|
| SC001  | `eval()`                         | CWE-78  |
| SC002  | `exec()`                         | CWE-78  |
| SC003  | `os.system()`                    | CWE-78  |
| SC004  | `subprocess` with `shell=True`   | CWE-78  |
| SC005  | `pickle.loads()`                 | CWE-502 |
| SC006  | SQL string formatting            | CWE-89  |
| SC007  | `__import__()`                   | CWE-94  |
| SC008  | `yaml.load()` without `Loader`   | CWE-502 |

Each finding is enriched with CWE metadata and mapped to the active compliance profiles.
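
As an illustration of what AST pattern matching for call-based rules can look like, the sketch below flags a few of the calls from the table using Python's `ast` module. The rule IDs and CWE labels are copied from the table; the matching logic, `CALL_RULES`, `dotted_name`, and `find_calls` are hypothetical and not Sanicode's actual rule engine.

```python
import ast

# Rule IDs and CWE labels copied from the table above; everything else
# in this sketch is illustrative.
CALL_RULES = {
    "eval": ("SC001", "CWE-78"),
    "exec": ("SC002", "CWE-78"),
    "os.system": ("SC003", "CWE-78"),
    "pickle.loads": ("SC005", "CWE-502"),
    "__import__": ("SC007", "CWE-94"),
}


def dotted_name(node):
    """Turn a Name/Attribute chain such as os.system into 'os.system'."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        parent = dotted_name(node.value)
        return f"{parent}.{node.attr}" if parent else None
    return None


def find_calls(source: str):
    """Return sorted (line, rule, cwe, call) tuples for matching calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in CALL_RULES:
                rule, cwe = CALL_RULES[name]
                findings.append((node.lineno, rule, cwe, name))
    return sorted(findings)


SAMPLE = "import os\nvalue = eval(user_input)\nos.system('ls ' + value)\n"
findings = find_calls(SAMPLE)
for line, rule, cwe, call in findings:
    print(f"line {line}: {rule} ({cwe}) {call}()")
```

This sketch matches on the literal call name only, so it would miss aliased imports such as `from os import system`.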

## Compliance frameworks

Sanicode maps findings to three frameworks out of the box:

- **OWASP ASVS 5.0** -- V1: Encoding and Sanitization requirements (L1/L2/L3)
- **NIST 800-53** -- SI-10 (Information Input Validation), SI-15 (Information Output Filtering), and related controls
- **ASD STIG v4r11** -- APSC-DV-002510 (CAT I), APSC-DV-002520 (CAT II), APSC-DV-002530 (CAT II), and related checks

## Configuration

Create a config file:

```
sanicode config --init
```

This writes a `sanicode.toml` to the current directory. Configuration is loaded from the following locations, in order of precedence:

1. `--config` flag
2. `sanicode.toml` in the current directory
3. `~/.config/sanicode/config.toml`
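
The lookup order above can be sketched as a small resolver. This assumes that the first location found wins and that files are not merged, which the list implies but does not state; `resolve_config` and its parameters are hypothetical names.

```python
from pathlib import Path
from typing import Optional


def resolve_config(
    explicit: Optional[str] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Pick a config path following the documented lookup order.

    Assumption: the first match wins and files are not merged.
    """
    cwd = cwd or Path.cwd()
    home = home or Path.home()

    # 1. A path passed via --config takes priority.
    if explicit is not None:
        return Path(explicit)

    # 2. sanicode.toml in the current directory, then
    # 3. the per-user config file.
    candidates = [
        cwd / "sanicode.toml",
        home / ".config" / "sanicode" / "config.toml",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate

    # No file found: fall back to built-in defaults.
    return None


print(resolve_config())
```

Returning `None` when nothing is found matches the statement that the tool works without any configuration file.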

Sanicode runs without any configuration. LLM tiers are optional: without them, the tool runs in degraded mode, which covers AST pattern matching, knowledge graph construction, and compliance lookups. LLM integration is intended to add context-aware reasoning on top of these (see "Current status").

### LLM tiers (optional)

The config supports three tiers for different task complexities, each pointing at any OpenAI-compatible endpoint (Ollama, vLLM, OpenShift AI):

| Tier        | Purpose                           | Recommended model       |
|-------------|-----------------------------------|-------------------------|
| `fast`      | Classification, severity scoring  | Granite Nano, Mistral 7B |
| `analysis`  | Data flow reasoning               | Granite Code 8B         |
| `reasoning` | Compliance mapping, reports       | Llama 3.1 70B           |

## Current status

Phase 1 MVP: Python-only scanning, 8 detection rules, local and API server modes. LLM integration is planned but not yet wired in; for now the tool operates in degraded mode with AST patterns and compliance mapping.

## License

Apache-2.0
