Metadata-Version: 2.4
Name: reconkit
Version: 0.1.0
Summary: A small, safe-by-default toolkit for scraping, OSINT, and file triage.
Author: reconkit
License: MIT
License-File: LICENSE
Keywords: forensics,osint,scraping,triage
Requires-Python: >=3.10
Requires-Dist: beautifulsoup4>=4.12
Requires-Dist: dnspython>=2.6
Requires-Dist: exifread>=3.0
Requires-Dist: filetype>=1.2
Requires-Dist: httpx>=0.27
Requires-Dist: ipwhois>=1.2
Requires-Dist: lxml>=5.1
Requires-Dist: python-whois>=0.9
Requires-Dist: rich>=13.7
Requires-Dist: typer>=0.12
Provides-Extra: dev
Requires-Dist: build>=1.2; extra == 'dev'
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Requires-Dist: twine>=5.1; extra == 'dev'
Description-Content-Type: text/markdown

# reconkit (Judicium surface)

`reconkit` is a **conservative, safe-by-default** toolkit that exposes a Noir Stack / Hexarch / Judicium-flavored public API.

It is designed to produce **audit-grade structured outputs** that explicitly state:
- what was observed (evidence)
- how it was observed (provenance)
- what is and is not being asserted (verification limits)

This project is intended for lawful use on systems/data you own or have permission to assess.

## Noir Stack primitives

Public primitives:
- `EvidenceArtifact`: immutable bytes + provenance
- `EvidenceBundle`: a set of artifacts for joint assessment
- `ProvenanceRecord`: acquisition/derivation metadata + trust boundary
- `VerificationResult`: structured decision output

All structured outputs include:
- `decision_basis`
- `verification_status`
- `provenance_confidence`

Because this is Judicium, not divination.
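
To make the shape of these primitives concrete, here is a minimal sketch using frozen dataclasses. It is not reconkit's actual implementation: the class names and the three output fields come from the lists above, but every other field (`source`, `method`), the `sha256` property, the `to_json` helper, and the example values are assumptions made for illustration.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    # Hypothetical fields; the real record may carry different metadata.
    source: str
    method: str
    trust_boundary: str


@dataclass(frozen=True)
class EvidenceArtifact:
    # frozen=True blocks attribute reassignment, and bytes are immutable.
    payload: bytes
    provenance: ProvenanceRecord

    @property
    def sha256(self) -> str:
        # Same payload bytes always yield the same digest.
        return hashlib.sha256(self.payload).hexdigest()


@dataclass(frozen=True)
class VerificationResult:
    # The three fields every structured output is documented to include.
    decision_basis: str
    verification_status: str
    provenance_confidence: str

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


artifact = EvidenceArtifact(
    payload=b"hello",
    provenance=ProvenanceRecord(
        source="https://example.com",
        method="acquire-web",
        trust_boundary="public-internet",
    ),
)
result = VerificationResult(
    decision_basis=f"sha256={artifact.sha256}",
    verification_status="informational",  # assumed value, not a documented enum
    provenance_confidence="low",  # assumed value, not a documented enum
)
print(result.to_json())
```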

## Guarantees

What the system will reliably do:
- Preserve evidence payload bytes exactly as acquired or derived, and compute a stable SHA-256 digest for each artifact.
- Emit structured, JSON-serializable outputs suitable for logging and later review.
- Respect declared acquisition constraints (e.g., `max_bytes`) and treat them as policy boundaries.
- Respect `robots.txt` by default for web acquisition.
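
The guarantees above can be approximated with the standard library alone. The sketch below shows one plausible reading of them, not how reconkit implements them: the function names are hypothetical, and raising an error when a payload exceeds `max_bytes` (rather than truncating it) is an assumption.

```python
import hashlib
from urllib.robotparser import RobotFileParser


def robots_allows(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def enforce_max_bytes(payload: bytes, max_bytes: int) -> bytes:
    """Treat max_bytes as a hard boundary: refuse instead of truncating."""
    if len(payload) > max_bytes:
        raise ValueError(
            f"payload is {len(payload)} bytes, limit is {max_bytes}"
        )
    return payload


def sha256_hex(payload: bytes) -> str:
    """Digest of the untouched payload; identical bytes give identical output."""
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    print(robots_allows(rules, "sketch-agent", "https://example.com/private/x"))
    print(robots_allows(rules, "sketch-agent", "https://example.com/public"))
    print(sha256_hex(enforce_max_bytes(b"hello", max_bytes=1024)))
```

Refusing an oversized payload keeps the stored bytes and their digest in agreement with what was actually observed. A truncating policy would need to record the truncation in provenance.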

## Assumptions

What the system assumes unless you override it:
- `trust_boundary` is declared explicitly for acquisitions. The tool will not invent one.
- Web acquisition is transport-level observation, not identity/authenticity proof.
- Public-internet boundary data (DNS/WHOIS/RDAP) is administrative metadata.

## Non-goals

What the system explicitly does not do:
- No anti-bot bypassing, fingerprint evasion, proxy-rotation orchestration, or credentialed scraping.
- No claims of attribution, intent, or identity. (You can do that in your report; the tool will not do it for you.)
- No “automatic truth.” The system is already busy being correct.

## Install

```bash
python -m pip install -e ".[dev]"
```

## CLI quickstart (Noir terminology)

All commands emit audit-grade JSON.

```bash
# Evidence acquisition / derivation
reconkit evidence acquire-web "https://example.com" --trust-boundary public-internet
reconkit evidence derive-links "https://example.com"
reconkit evidence derive-text "https://example.com"

# Boundary observations (OSINT-ish, but we say the quiet part out loud)
reconkit boundary dns example.com
reconkit boundary whois example.com
reconkit boundary ip 8.8.8.8

# Verification (informational)
reconkit verify file-hash ./somefile.bin --algo sha256
reconkit verify file-type ./somefile.bin
reconkit verify strings ./somefile.bin --min-len 6
reconkit verify exif ./photo.jpg
```
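
Since every structured output is documented to carry `decision_basis`, `verification_status`, and `provenance_confidence`, a downstream consumer can check for them before trusting a record. The sketch below is an assumption-heavy illustration: the helper name is hypothetical, the sample record is made up, and it assumes those fields sit at the top level of a JSON object, which this README does not confirm.

```python
import json

# Field names taken from the "All structured outputs include" list.
REQUIRED_FIELDS = (
    "decision_basis",
    "verification_status",
    "provenance_confidence",
)


def missing_audit_fields(raw: str) -> list[str]:
    """Return the required field names absent from a JSON record."""
    record = json.loads(raw)
    if not isinstance(record, dict):
        # Assumption: a valid record is a JSON object, not a list or scalar.
        return list(REQUIRED_FIELDS)
    return [name for name in REQUIRED_FIELDS if name not in record]


# Made-up sample; real reconkit output will contain other keys and values.
sample = json.dumps(
    {
        "decision_basis": "dns-lookup",
        "verification_status": "informational",
    }
)
print(missing_audit_fields(sample))
```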

## Notes

- If acquisition is blocked by `robots.txt`, that is not a “bug.” It is a boundary condition.
- If you need features that reduce accountability, you are in the wrong repository. Convenient.
