Metadata-Version: 2.4
Name: novyx-core
Version: 1.0.0
Summary: Durable, persistent knowledge graph for AI agents - local-first, schema-driven, cryptographically verified
Author-email: Novyx Labs <hello@novyx.ai>
Maintainer-email: Novyx Labs <hello@novyx.ai>
License: MIT
Project-URL: Homepage, https://github.com/novyxlabs/novyx-core
Project-URL: Documentation, https://github.com/novyxlabs/novyx-core/blob/main/README.md
Project-URL: Repository, https://github.com/novyxlabs/novyx-core
Project-URL: Issues, https://github.com/novyxlabs/novyx-core/issues
Project-URL: Changelog, https://github.com/novyxlabs/novyx-core/blob/main/CHANGELOG.md
Keywords: ai,knowledge-graph,agent,persistence,semantic,json-ld,linked-data,rdf,memory,llm,rag,vector-search,embeddings,multi-tenant,federation
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Database
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Provides-Extra: semantic
Requires-Dist: sentence-transformers>=2.2.0; extra == "semantic"
Requires-Dist: torch>=2.0.0; extra == "semantic"
Provides-Extra: dashboard
Requires-Dist: streamlit>=1.28.0; extra == "dashboard"
Requires-Dist: pandas>=2.0.0; extra == "dashboard"
Provides-Extra: api
Requires-Dist: fastapi>=0.104.0; extra == "api"
Requires-Dist: uvicorn[standard]>=0.24.0; extra == "api"
Requires-Dist: slowapi>=0.1.9; extra == "api"
Requires-Dist: python-multipart>=0.0.6; extra == "api"
Requires-Dist: strawberry-graphql[fastapi]>=0.216.0; extra == "api"
Requires-Dist: graphql-core>=3.2.0; extra == "api"
Requires-Dist: pyjwt>=2.8.0; extra == "api"
Requires-Dist: passlib[bcrypt]>=1.7.4; extra == "api"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: httpx>=0.25.0; extra == "dev"
Requires-Dist: structlog>=23.0.0; extra == "dev"
Requires-Dist: strawberry-graphql[fastapi]>=0.216.0; extra == "dev"
Requires-Dist: graphql-core>=3.2.0; extra == "dev"
Requires-Dist: pyjwt>=2.8.0; extra == "dev"
Requires-Dist: passlib[bcrypt]>=1.7.4; extra == "dev"
Requires-Dist: ijson>=3.2.0; extra == "dev"
Requires-Dist: rdflib>=7.0.0; extra == "dev"
Requires-Dist: neo4j>=5.14.0; extra == "dev"
Requires-Dist: requests>=2.31.0; extra == "dev"

# Novyx Core

> **Code that endures. Intelligence that persists.**

A production-ready persistent knowledge graph for AI agents with deep semantic understanding and cryptographic verification. Built for agents that think across time horizons—days, weeks, or months—without losing context.

[![PyPI](https://img.shields.io/pypi/v/novyx-core?color=blue)](https://pypi.org/project/novyx-core/)
[![Python](https://img.shields.io/badge/Python-3.10+-green)](https://python.org)
[![Schema](https://img.shields.io/badge/Schema-v1.2.0-blue)](CORE_SCHEMA.jsonld)
[![License](https://img.shields.io/badge/License-MIT-yellow)](LICENSE)
[![AI Visibility](https://img.shields.io/badge/AI%20Visibility-100%2F100-brightgreen)](organization.jsonld)

---

## 🎯 Mission

Novyx Core is a **persistent knowledge graph system** designed for AI-first workflows. Unlike ephemeral chatbots or session-based systems, Novyx maintains **cryptographically verified state** across time, enabling:

- **Long-horizon intelligence**: Projects that span weeks without context loss
- **Semantic memory**: 384-dimensional embeddings for every artifact
- **AI discoverability**: Machine-readable metadata for 2026+ AI search engines
- **Verifiable integrity**: SHA-256 hashing and automated auditing
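
The integrity idea above can be illustrated with a minimal sketch. It assumes the hash is taken over a canonical JSON form of the artifact with the `novyx:integrityHash` field excluded; the canonicalization Novyx actually uses is not described here, so treat `integrity_hash` and `verify` as hypothetical helpers.

```python
import hashlib
import json


def integrity_hash(artifact: dict) -> str:
    """Return a SHA-256 hex digest over a canonical JSON form of the artifact.

    Assumption: the hash field itself is excluded, and keys are sorted so the
    digest does not depend on key order.
    """
    payload = {k: v for k, v in artifact.items() if k != "novyx:integrityHash"}
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(artifact: dict) -> bool:
    """Compare the stored hash with a freshly computed one."""
    return artifact.get("novyx:integrityHash") == integrity_hash(artifact)
```

Any edit to the artifact after hashing makes `verify` return `False`, which is the property an automated audit relies on.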

---

## ⚙️ Core Tools

| Tool                        | Purpose                                          | Key Features                                                        |
| --------------------------- | ------------------------------------------------ | ------------------------------------------------------------------- |
| **🧠 Semantic Pulse**        | Ingest raw text and generate embeddings          | `all-MiniLM-L6-v2` model, auto-linking, external authority support  |
| **🛡️ Sentinel**              | Verify data integrity and schema compliance      | Backward-compatible v1.0/v1.1.0, hash validation, link verification |
| **🔍 Query Engine**          | Semantic search and knowledge graph traversal    | Vector similarity search, authority filtering, graph statistics     |
| **🌐 AI Visibility Manager** | Generate Schema.org entities for discoverability | Organization profile, external identity linking, trust signals      |
| **🎨 Interactive Dashboard** | Web UI for all core tools                        | File upload, semantic search, integrity checks, graph visualization |

---

## 🚀 Quick Start

### Prerequisites

- Python 3.10+ required
- No cloud dependencies - runs entirely on your machine

### Installation

**Option 1: Install from PyPI (Recommended)**

```bash
# Install core package
pip install novyx-core

# With semantic search support
pip install novyx-core[semantic]

# With API + GraphQL support
pip install novyx-core[api]

# Full installation (all features)
pip install novyx-core[semantic,api,dashboard]
```

**Option 2: Install from source**

```bash
# Clone the repository
git clone https://github.com/novyxlabs/novyx-core.git
cd novyx-core

# Install in development mode
pip install -e ".[dev]"
```

**Option 3: Docker**

```bash
# Using docker-compose (see docker-compose.yml)
docker-compose up -d

# Or run directly
docker run -p 8000:8000 -v "$(pwd)/memory:/app/memory" novyxlabs/novyx-core
```

### Basic Usage

**1. Ingest a new artifact**
```bash
# Simple ingestion
python tools/ingestor.py --file inbox/note.txt --category research

# With external authority linking
python tools/ingestor.py \
  --file inbox/insight.txt \
  --category decisions \
  --link https://github.com/novyxlabs
```

**2. Search the knowledge graph**
```bash
# Semantic search (finds by meaning, not keywords)
python tools/query.py --search "deep learning and cognitive science"

# Authority search (filter by external links)
python tools/query.py --authority "github"

# Show statistics
python tools/query.py --stats
```

**3. Verify integrity**
```bash
# Run before every commit
python tools/sentinel.py

# Verbose mode
python tools/sentinel.py --verbose
```

**4. Generate AI visibility report**
```bash
python tools/entity_generator.py generate-report
```

**5. Launch interactive dashboard**
```bash
streamlit run tools/dashboard.py
```
*Open http://localhost:8501 in your browser to use the web UI.*

---

## 🛡️ Enforcement Automation (Layer 5)

### Enable Commit Gate in 10 Seconds

Novyx Core includes automated integrity checks that can run before every commit:

```bash
# Install git hooks (one-time setup)
bash scripts/install_git_hooks.sh

# Or use the CLI
python3 -m tools.cli install-hooks
```

This creates a `.git/hooks/pre-commit` hook that:
- ✅ Runs Sentinel integrity validation
- ✅ Blocks commits if validation fails
- ✅ (Optional) Enforces graph integrity with `NOVYX_ENFORCE_GRAPH=1`

**Enable graph enforcement:**
```bash
export NOVYX_ENFORCE_GRAPH=1  # Add to ~/.bashrc for persistence
```
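
For readers who want to see the gate logic, here is a rough Python sketch of what such a hook might do. It is not the installed hook: `run_gate` and `build_checks` are hypothetical names, and using `tools.cli graph` as the graph check is an assumption.

```python
import subprocess
import sys


def run_gate(checks: list[list[str]]) -> int:
    """Run each check in order; return the first non-zero exit code, else 0."""
    for cmd in checks:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode  # a non-zero exit blocks the commit
    return 0


def build_checks(env: dict[str, str]) -> list[list[str]]:
    """Always run Sentinel; add a graph check only if NOVYX_ENFORCE_GRAPH=1."""
    checks = [[sys.executable, "tools/sentinel.py"]]
    if env.get("NOVYX_ENFORCE_GRAPH") == "1":
        # Assumption: the real hook may enforce graph integrity differently.
        checks.append(
            [sys.executable, "-m", "tools.cli", "graph", "--directory", "memory"]
        )
    return checks
```

A hook script would call `sys.exit(run_gate(build_checks(dict(os.environ))))` after importing `os`, so that Git sees the failing exit code.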

### Operator Reports

Generate daily snapshot reports for monitoring:

```bash
# Via CLI (recommended)
python3 -m tools.cli report --directory memory --out reports/

# Or directly
python3 -m tools.operator_report --directory memory --out reports/
```

Reports include:
- 📊 Sentinel integrity summary (passed/failed counts)
- 🔗 Graph statistics (artifacts, links, orphans)
- 📅 New artifacts since last report
- ⚠️ Risk notes (orphaned links, parse errors)

### CLI Commands

**Graph Analysis:**
```bash
# Show graph statistics
python3 -m tools.cli graph --directory memory

# Explain an artifact
python3 -m tools.cli explain --id urn:uuid:... --directory memory
python3 -m tools.cli explain --file memory/decisions/my_decision.jsonld

# Repair orphaned links
python3 -m tools.cli repair --directory memory --dry-run
python3 -m tools.cli repair --directory memory  # Actually fix
python3 -m tools.cli repair --directory memory --create-stubs  # Create stub artifacts
```

**Semantic Search (Phase 8):**
```bash
# Basic semantic search (requires: pip install -e ".[semantic]")
python3 -m tools.cli query --directory memory --search "database selection"

# With custom similarity threshold (0.0-1.0)
python3 -m tools.cli query --directory memory --search "API authentication" --min-score 0.5

# Limit number of results
python3 -m tools.cli query --directory memory --search "GraphQL vs REST" --top-k 5

# JSON output for programmatic use
python3 -m tools.cli query --directory memory --search "microservices" --json

# Statistics only (no search)
python3 -m tools.cli query --directory memory
```
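
To make `--min-score` and `--top-k` concrete, the sketch below ranks artifacts by cosine similarity, drops those under the threshold, and keeps the best `top_k`. Cosine similarity is an assumption (the README only says "vector similarity"), and `rank` is a hypothetical helper, not the query engine's API.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(
    query_vec: list[float],
    corpus: dict[str, list[float]],
    min_score: float = 0.0,
    top_k: int = 10,
) -> list[tuple[str, float]]:
    """Score every artifact, filter by min_score, return the top_k best."""
    scored = [(aid, cosine(query_vec, vec)) for aid, vec in corpus.items()]
    kept = [(aid, score) for aid, score in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]
```

In practice the vectors would be the 384-dimensional embeddings mentioned earlier; the two-dimensional case behaves the same way.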

**Direct Query Engine:**
```bash
# Using query.py directly
python3 -m tools.query --search "machine learning embeddings"
python3 -m tools.query --search "vector embeddings" --min-score 0.7 --top-k 3
python3 -m tools.query --search "semantic similarity" --json
python3 -m tools.query --stats  # Show graph statistics
```

### CI/CD Integration

GitHub Actions workflow included at `.github/workflows/ci.yml`:
- ✅ Compile checks
- ✅ Unit tests
- ✅ Sentinel validation
- ✅ Graph enforcement

---

## 🌐 API Deployment (Phase 6)

### RESTful API Server

Novyx Core exposes a production-ready FastAPI server for programmatic access: a public health check plus eight authenticated endpoints, listed under Endpoints below.

**Start API Server:**
```bash
# Development (with auto-reload)
uvicorn tools.api:app --reload --host 0.0.0.0 --port 8000

# Production
uvicorn tools.api:app --host 0.0.0.0 --port 8000 --workers 4
```

**Environment Variables:**
```bash
export NOVYX_API_KEY="your-secret-key-here"
export NOVYX_RATE_LIMIT="100/minute"
```

### Production Security

Novyx Core includes a startup security check that validates configuration before the API starts.

**Required Environment Variables for Production:**

| Variable           | Description            | Requirement                      |
| ------------------ | ---------------------- | -------------------------------- |
| `NOVYX_API_KEY`    | API authentication key | Min 16 chars, no default markers |
| `NOVYX_JWT_SECRET` | JWT signing secret     | Min 32 chars, no default markers |

**Optional Security Variables:**

| Variable                | Description                               | Default |
| ----------------------- | ----------------------------------------- | ------- |
| `NOVYX_ENVIRONMENT`     | Set to `development` to suppress warnings | (none)  |
| `NOVYX_STRICT_SECURITY` | Set to `1` to exit on insecure config     | `0`     |

**Behavior:**
- **Development** (`NOVYX_ENVIRONMENT=development`): Prints gentle warnings to stdout
- **Non-development**: Prints loud warnings to stderr
- **Strict mode** (`NOVYX_STRICT_SECURITY=1`): Exits with code 1 if insecure (non-development only)
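
The rules above can be sketched as a small function. This is an illustration only: the list of "default markers" is an assumption inferred from the example key used in this README, and `check_security` is a hypothetical name.

```python
def check_security(env: dict[str, str]) -> tuple[list[str], bool]:
    """Return (problems, should_exit) for the given environment mapping."""
    # Assumption: "default markers" are placeholder substrings such as these.
    markers = ("change-in-production", "your-secret", "changeme")
    minimums = {"NOVYX_API_KEY": 16, "NOVYX_JWT_SECRET": 32}

    problems = []
    for name, min_len in minimums.items():
        value = env.get(name, "")
        if len(value) < min_len:
            problems.append(f"{name} must be at least {min_len} characters")
        elif any(marker in value.lower() for marker in markers):
            problems.append(f"{name} looks like a default value")

    is_dev = env.get("NOVYX_ENVIRONMENT") == "development"
    strict = env.get("NOVYX_STRICT_SECURITY", "0") == "1"
    # Strict mode only terminates the process outside development.
    should_exit = bool(problems) and strict and not is_dev
    return problems, should_exit
```

A caller would print the problems to stdout in development or to stderr otherwise, then exit with code 1 when `should_exit` is true.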

**CORS Configuration (Phase 12):**

| Variable            | Description                               | Default                                       |
| ------------------- | ----------------------------------------- | --------------------------------------------- |
| `NOVYX_CORS_ORIGINS` | Comma-separated list of allowed origins  | `http://localhost:3000,http://localhost:8080` |

```bash
# Production CORS - only allow your domains
export NOVYX_CORS_ORIGINS="https://app.example.com,https://admin.example.com"
```
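
As a rough illustration of how a comma-separated origin list is typically consumed, the sketch below splits the value and trims whitespace. `parse_cors_origins` is a hypothetical helper; the server's actual parsing may differ.

```python
DEFAULT_ORIGINS = "http://localhost:3000,http://localhost:8080"


def parse_cors_origins(env: dict[str, str]) -> list[str]:
    """Split NOVYX_CORS_ORIGINS on commas, ignoring blanks and whitespace."""
    raw = env.get("NOVYX_CORS_ORIGINS", DEFAULT_ORIGINS)
    return [origin.strip() for origin in raw.split(",") if origin.strip()]
```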

**Input Validation (Phase 12):**

All API endpoints validate:
- **artifact_id**: Must be a valid UUID (`urn:uuid:...` or bare UUID)
- **tenant_id**: Must be 3-64 characters, using only lowercase letters, digits, hyphens, and underscores
- Path traversal attacks (`../`, `..\\`) are blocked with HTTP 400
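
These checks might look roughly like the following. The regular expression and function names are assumptions derived from the rules listed above, not the validators shipped with the API.

```python
import re
import uuid

# Assumption: 3-64 characters drawn from lowercase letters, digits, "-" and "_".
TENANT_RE = re.compile(r"[a-z0-9_-]{3,64}")


def is_valid_artifact_id(value: str) -> bool:
    """Accept `urn:uuid:<uuid>` or a bare, hyphenated UUID."""
    bare = value[len("urn:uuid:"):] if value.startswith("urn:uuid:") else value
    try:
        # Round-trip comparison rejects non-canonical spellings.
        return str(uuid.UUID(bare)) == bare.lower()
    except ValueError:
        return False


def is_valid_tenant_id(value: str) -> bool:
    return TENANT_RE.fullmatch(value) is not None


def has_path_traversal(value: str) -> bool:
    """Detect the `../` and `..\\` sequences that the API rejects."""
    return "../" in value or "..\\" in value
```

A request handler would respond with HTTP 400 whenever one of these checks fails.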

**Example Production Configuration:**
```bash
export NOVYX_API_KEY="$(openssl rand -hex 32)"
export NOVYX_JWT_SECRET="$(openssl rand -hex 32)"
export NOVYX_CORS_ORIGINS="https://app.example.com"
export NOVYX_STRICT_SECURITY="1"
```

**API Documentation:**
- Swagger UI: http://localhost:8000/api/docs
- ReDoc: http://localhost:8000/api/redoc
- Health Check: http://localhost:8000/health

### Endpoints

| Method | Endpoint                  | Description           | Auth Required |
| ------ | ------------------------- | --------------------- | ------------- |
| GET    | `/health`                 | Health check          | ❌             |
| POST   | `/api/v1/create/{type}`   | Create artifact       | ✅             |
| POST   | `/api/v1/ingest/text`     | Ingest raw text       | ✅             |
| POST   | `/api/v1/ingest/decision` | Create Decision       | ✅             |
| GET    | `/api/v1/query/search`    | Semantic search       | ✅             |
| GET    | `/api/v1/query/stats`     | Graph statistics      | ✅             |
| GET    | `/api/v1/graph`           | Graph analysis        | ✅             |
| POST   | `/api/v1/validate`        | Run Sentinel checks   | ✅             |
| POST   | `/api/v1/repair`          | Repair orphaned links | ✅             |

### cURL Examples

**1. Health Check:**
```bash
curl http://localhost:8000/health
```

**2. Create a Decision Artifact:**
```bash
curl -X POST http://localhost:8000/api/v1/ingest/decision \
  -H "X-API-Key: novyx-dev-key-change-in-production" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Choose Database",
    "context": "Need to select a database for new service",
    "options_considered": ["PostgreSQL", "MongoDB", "SQLite"],
    "chosen_option": "PostgreSQL",
    "reasoning": "Relational data with strong ACID guarantees",
    "constraints": ["Budget under $100/month", "Must scale to 10k users"],
    "assumptions": ["Data is relational", "Team knows SQL"],
    "confidence": 0.85,
    "expected_outcome": "Reliable data storage with good query performance",
    "category": "architecture"
  }'
```

**3. Search the Knowledge Graph:**
```bash
curl "http://localhost:8000/api/v1/query/search?search_term=database&limit=5" \
  -H "X-API-Key: novyx-dev-key-change-in-production"
```

**4. Get Graph Statistics:**
```bash
curl "http://localhost:8000/api/v1/query/stats?directory=memory" \
  -H "X-API-Key: novyx-dev-key-change-in-production"
```

**5. Analyze Graph (with orphan detection):**
```bash
curl "http://localhost:8000/api/v1/graph?directory=memory&include_orphans=true" \
  -H "X-API-Key: novyx-dev-key-change-in-production"
```

**6. Validate Artifacts:**
```bash
curl -X POST http://localhost:8000/api/v1/validate \
  -H "X-API-Key: novyx-dev-key-change-in-production" \
  -H "Content-Type: application/json" \
  -d '{
    "directory": "memory",
    "enforce_graph": false,
    "verbose": true
  }'
```

**7. Repair Graph (Dry Run):**
```bash
curl -X POST http://localhost:8000/api/v1/repair \
  -H "X-API-Key: novyx-dev-key-change-in-production" \
  -H "Content-Type: application/json" \
  -d '{
    "directory": "memory",
    "dry_run": true,
    "create_stubs": false
  }'
```

**8. Create Any Artifact Type:**
```bash
curl -X POST http://localhost:8000/api/v1/create/apiendpoint \
  -H "X-API-Key: novyx-dev-key-change-in-production" \
  -H "Content-Type: application/json" \
  -d '{
    "payload": {
      "title": "New Endpoint",
      "method": "GET",
      "path": "/api/v1/example",
      "description": "Example endpoint",
      "parameters": [],
      "response_schema": "{\"status\": \"success\"}"
    },
    "external_links": ["https://github.com/novyxlabs"]
  }'
```
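
The same calls can be made from Python. The sketch below uses only the standard library and mirrors the search example; `build_search_request` and `search` are hypothetical helper names, and the response is assumed to be JSON.

```python
import json
import urllib.parse
import urllib.request


def build_search_request(
    base_url: str, api_key: str, search_term: str, limit: int = 5
) -> urllib.request.Request:
    """Build a GET request for /api/v1/query/search with the API key header."""
    query = urllib.parse.urlencode({"search_term": search_term, "limit": limit})
    return urllib.request.Request(
        f"{base_url}/api/v1/query/search?{query}",
        headers={"X-API-Key": api_key},
        method="GET",
    )


def search(base_url: str, api_key: str, search_term: str, limit: int = 5):
    """Send the request and decode the body (requires a running server)."""
    req = build_search_request(base_url, api_key, search_term, limit)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

Building the request separately from sending it keeps URL encoding testable without a live server.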

### Docker Deployment

**Build & Run:**
```bash
# Build image
docker build -t novyx-core-api .

# Run container
docker run -d \
  -p 8000:8000 \
  -e NOVYX_API_KEY="your-production-key" \
  -e NOVYX_RATE_LIMIT="200/minute" \
  -v "$(pwd)/memory:/app/memory" \
  --name novyx-api \
  novyx-core-api

# View logs
docker logs -f novyx-api

# Stop container
docker stop novyx-api
```

### Vercel Deployment

Novyx Core API can be deployed to Vercel with minimal configuration:

**1. Create `vercel.json`:**
```json
{
  "builds": [
    {
      "src": "tools/api.py",
      "use": "@vercel/python"
    }
  ],
  "routes": [
    {
      "src": "/(.*)",
      "dest": "tools/api.py"
    }
  ],
  "env": {
    "NOVYX_API_KEY": "@novyx-api-key",
    "NOVYX_RATE_LIMIT": "100/minute"
  }
}
```

**2. Deploy:**
```bash
# Install Vercel CLI
npm install -g vercel

# Deploy
vercel --prod

# Set environment variable
vercel env add NOVYX_API_KEY production
```

**3. Access:**
```
https://your-project.vercel.app/health
https://your-project.vercel.app/api/docs
```

**Note:** On Vercel, the `memory/` directory is ephemeral, so artifacts written there do not persist. Consider using:
- External storage (S3, Google Cloud Storage)
- Database backend (PostgreSQL with JSON columns)
- Git-based persistence (commit artifacts on change)

### Security Best Practices

1. **Change Default API Key:**
   ```bash
   export NOVYX_API_KEY="$(openssl rand -hex 32)"
   ```

2. **Use HTTPS in Production:**
   - Deploy behind reverse proxy (nginx, Caddy)
   - Or use platform SSL (Vercel, Railway, Fly.io)

3. **Configure Rate Limiting:**
   ```bash
   export NOVYX_RATE_LIMIT="50/minute"  # Adjust per environment
   ```

4. **Monitor API Usage:**
   ```bash
   python3 -m tools.cli report --directory memory --out reports/
   # Check API usage metrics in report
   ```

---

## 🔷 GraphQL API (Phase 7)

**New in v1.3.0:** Novyx Core now provides a GraphQL interface alongside the REST API, offering flexible querying with a single endpoint.

### GraphQL Playground

Access the interactive GraphQL Playground at:
```
http://localhost:8000/graphql
```

**Features:**
- 🎯 Single endpoint for all operations
- 📖 Auto-generated documentation & schema introspection
- 🔍 5 Query operations (read data)
- ✏️ 4 Mutation operations (modify data)
- 🔐 API key authentication (same as REST API)
- 📊 Real-time query execution with GraphiQL interface

### GraphQL Operations

**Queries (Read Operations):**
1. `hello` - Health check
2. `artifactTypes` - List all registered artifact types
3. `searchArtifacts` - Text-based search across artifacts
4. `graphStats` - Knowledge graph statistics
5. `validateGraph` - Run Sentinel validation

**Mutations (Write Operations):**
1. `createDecision` - Create Decision artifact
2. `ingestText` - Ingest raw text as DigitalDocument
3. `createArtifact` - Create generic artifact
4. `repairGraph` - Repair orphaned links

### GraphQL Examples

**1. Hello Query (Health Check):**
```graphql
query {
  hello
}
```

cURL equivalent:
```bash
curl -X POST http://localhost:8000/graphql \
  -H "X-API-Key: novyx-dev-key-change-in-production" \
  -H "Content-Type: application/json" \
  -d '{"query": "{ hello }"}'
```

**2. List Artifact Types:**
```graphql
query {
  artifactTypes
}
```

Response:
```json
{
  "data": {
    "artifactTypes": ["decision", "apiendpoint", "graphqloperation"]
  }
}
```

**3. Search Artifacts:**
```graphql
query SearchArtifacts($searchTerm: String!, $threshold: Float) {
  searchArtifacts(searchTerm: $searchTerm, threshold: $threshold, directory: "memory") {
    artifactId
    title
    similarityScore
    category
    createdAt
    snippet
  }
}
```

Variables:
```json
{
  "searchTerm": "database decision",
  "threshold": 0.3
}
```

cURL:
```bash
curl -X POST http://localhost:8000/graphql \
  -H "X-API-Key: novyx-dev-key-change-in-production" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "query SearchArtifacts($searchTerm: String!) { searchArtifacts(searchTerm: $searchTerm) { artifactId title similarityScore } }",
    "variables": {"searchTerm": "database"}
  }'
```

**4. Get Graph Statistics:**
```graphql
query {
  graphStats(directory: "memory") {
    totalArtifacts
    totalIds
    totalLinks
    orphanedLinks
    artifactsByType
    artifactsByCategory
    latestTimestamp
  }
}
```

**5. Validate Graph:**
```graphql
query {
  validateGraph(directory: "memory", enforceGraph: true) {
    totalArtifacts
    passed
    failed
    uniqueIds
    success
    errors
  }
}
```

**6. Create Decision (Mutation):**
```graphql
mutation CreateDecision($input: CreateDecisionInput!) {
  createDecision(input: $input) {
    success
    artifactId
    filePath
    message
  }
}
```

Variables:
```json
{
  "input": {
    "title": "Choose GraphQL vs REST",
    "context": "Evaluating API design approach",
    "optionsConsidered": ["GraphQL only", "REST only", "Hybrid GraphQL + REST"],
    "chosenOption": "Hybrid GraphQL + REST",
    "reasoning": "GraphQL provides flexible querying, REST offers simplicity. Hybrid gives best of both.",
    "confidence": 0.9,
    "expectedOutcome": "Flexible API with wide compatibility",
    "category": "architecture",
    "constraints": [],
    "assumptions": [],
    "externalLinks": []
  }
}
```

**7. Ingest Text (Mutation):**
```graphql
mutation IngestText($input: IngestTextInput!) {
  ingestText(input: $input) {
    success
    artifactId
    message
  }
}
```

Variables:
```json
{
  "input": {
    "text": "GraphQL provides a strongly-typed schema and efficient data fetching with a single endpoint.",
    "category": "research",
    "externalLinks": ["https://graphql.org"]
  }
}
```

**8. Create Generic Artifact (Mutation):**
```graphql
mutation CreateArtifact($input: CreateArtifactInput!) {
  createArtifact(input: $input) {
    success
    artifactId
    filePath
  }
}
```

Variables:
```json
{
  "input": {
    "artifactType": "apiendpoint",
    "payload": "{\"title\":\"New Endpoint\",\"method\":\"GET\",\"path\":\"/api/v2/data\",\"description\":\"Fetch data\",\"parameters\":[],\"response_schema\":\"{}\"}",
    "externalLinks": []
  }
}
```
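
Note that `payload` in the variables above is a JSON string rather than a nested object, so it has to be encoded twice. A small sketch of building that request body in Python follows; `build_create_artifact_body` is a hypothetical helper.

```python
import json

CREATE_ARTIFACT = """
mutation CreateArtifact($input: CreateArtifactInput!) {
  createArtifact(input: $input) { success artifactId filePath }
}
"""


def build_create_artifact_body(
    artifact_type: str, payload: dict, external_links=None
) -> str:
    """Return the JSON body to POST to /graphql for createArtifact."""
    variables = {
        "input": {
            "artifactType": artifact_type,
            # First encoding: the payload field is a JSON string, not an object.
            "payload": json.dumps(payload),
            "externalLinks": external_links or [],
        }
    }
    # Second encoding: the whole GraphQL request body.
    return json.dumps({"query": CREATE_ARTIFACT, "variables": variables})
```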

**9. Repair Graph (Mutation):**
```graphql
mutation RepairGraph($input: RepairInput!) {
  repairGraph(input: $input) {
    orphansFound
    linksRemoved
    stubsCreated
    filesModified
    success
    message
  }
}
```

Variables:
```json
{
  "input": {
    "directory": "memory",
    "dryRun": true,
    "createStubs": false
  }
}
```

### GraphQL vs REST: When to Use Each

| Feature            | GraphQL                                 | REST                                |
| ------------------ | --------------------------------------- | ----------------------------------- |
| **Best For**       | Complex queries, flexible data fetching | Simple CRUD, caching, standard HTTP |
| **Endpoint**       | Single (`/graphql`)                     | Multiple (`/api/v1/*`)              |
| **Data Fetching**  | Request exactly what you need           | Fixed response structure            |
| **Documentation**  | Auto-generated schema introspection     | Swagger/OpenAPI                     |
| **Learning Curve** | Steeper (query language)                | Easier (HTTP verbs)                 |

**Recommendation:** Use both! GraphQL for complex querying, REST for simple operations.

### Testing GraphQL

Run the GraphQL test suite:
```bash
python3 -m pytest tests/test_graphql.py -v
# 22 tests covering all operations
```

---

## 🏢 Multi-Tenant Architecture (Phase 9)

Novyx Core supports **multi-tenant deployments** with isolated data, JWT authentication, and role-based access control.

### Key Features

- 🔐 **JWT Authentication**: Secure token-based authentication with role hierarchy
- 🏢 **Data Isolation**: Tenant-specific subdirectories ensure complete data separation
- 👥 **Role-Based Access**: Admin, Editor, and Viewer roles with hierarchical permissions
- 🔍 **Tenant Filtering**: CLI and API support for tenant-scoped operations
- 📊 **Tenant Metrics**: Per-tenant analytics and usage tracking

### Tenant Structure

Each tenant is defined by a `Tenant` artifact in `memory/tenants/`:

```json
{
  "tenant_id": "acme-corp",
  "name": "Acme Corporation",
  "admin_email": "admin@acme-corp.com",
  "plan_type": "enterprise",
  "status": "active",
  "max_artifacts": 10000,
  "features": ["semantic_search", "graphql", "api_access", "multi_user"],
  "users": [
    {
      "email": "admin@acme.com",
      "role": "admin",
      "name": "Admin User"
    },
    {
      "email": "editor@acme.com",
      "role": "editor",
      "name": "Editor User"
    }
  ]
}
```

### Authentication Flow

**1. Login to get JWT token:**
```bash
curl -X POST http://localhost:8000/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "acme-corp",
    "email": "admin@acme.com",
    "password": "your-password"
  }'
```

**Response:**
```json
{
  "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "token_type": "bearer",
  "tenant_id": "acme-corp",
  "email": "admin@acme.com",
  "role": "admin",
  "expires_in": 3600
}
```

**2. Use JWT token for authenticated requests:**
```bash
curl "http://localhost:8000/api/v1/query/search?search_term=database" \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
```

**3. Check current user context:**
```bash
curl http://localhost:8000/api/v1/auth/me \
  -H "Authorization: Bearer your-jwt-token"
```
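
To show what an HS256 token like the one above contains, here is a standard-library sketch of issuing and verifying one. It is illustrative only: the server presumably relies on PyJWT (listed in the `api` extra), and the claim names are assumed from the login response.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(
    claims: dict, secret: str, expires_in: int = 3600, now: Optional[float] = None
) -> str:
    """Sign header.payload with HMAC-SHA256 and append the signature."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {**claims, "iat": issued, "exp": issued + expires_in}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    sig = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(sig)}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Return the claims, or raise ValueError on a bad or expired token."""
    head_b64, payload_b64, sig_b64 = token.split(".")
    expected = hmac.new(
        secret.encode(), f"{head_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return payload
```

The default `expires_in` of 3600 seconds matches the `expires_in` value in the login response shown earlier.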

### Role-Based Permissions

| Role       | Permissions                                                    |
| ---------- | -------------------------------------------------------------- |
| **Viewer** | Read artifacts, search, query statistics                       |
| **Editor** | All viewer permissions + create/modify artifacts               |
| **Admin**  | All editor permissions + user management, tenant configuration |

**Permission hierarchy**: Admin > Editor > Viewer
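
A hierarchy like this is commonly implemented by ranking roles and comparing levels. The sketch below assumes that approach; `ROLE_LEVEL` and `has_role` are hypothetical names.

```python
# Higher numbers include every permission of the lower ones.
ROLE_LEVEL = {"viewer": 1, "editor": 2, "admin": 3}


def has_role(user_role: str, required_role: str) -> bool:
    """True if user_role is at least as privileged as required_role."""
    try:
        return ROLE_LEVEL[user_role.lower()] >= ROLE_LEVEL[required_role.lower()]
    except KeyError:
        # Unknown roles are denied rather than guessed at.
        return False
```

For example, an endpoint that creates artifacts would require `has_role(role, "editor")`, which admins also satisfy.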

### CLI Tenant Operations

**Query tenant-specific artifacts:**
```bash
python3 -m tools.cli query --directory memory/decisions --tenant acme-corp
# Shows only artifacts for acme-corp tenant
```

**Validate tenant artifacts:**
```bash
python3 -m tools.cli validate --directory memory/decisions --tenant acme-corp
# Validates only acme-corp's artifacts
```

**Repair tenant graph:**
```bash
python3 -m tools.cli repair --directory memory/decisions --tenant beta-testing --dry-run
# Repairs graph for beta-testing tenant only
```

### Data Isolation

Tenant artifacts are automatically stored in isolated subdirectories:

```
memory/
├── decisions/
│   ├── acme-corp/          # Isolated from other tenants
│   │   ├── decision_1.jsonld
│   │   └── decision_2.jsonld
│   └── beta-testing/       # Completely separate
│       ├── decision_1.jsonld
│       └── decision_2.jsonld
├── tenants/
│   ├── acme-corp/
│   │   └── acme_corporation.jsonld
│   └── beta-testing/
│       └── beta_testing.jsonld
```

**Security guarantee**: Filesystem isolation prevents tenants from accessing each other's data.
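
One way to keep tenant paths inside their base directory is to resolve the candidate path and confirm it is still below the base. This is a sketch under that assumption; `tenant_directory` is a hypothetical helper, not part of `tools.factory`.

```python
from pathlib import Path
from typing import Optional


def tenant_directory(base: Path, tenant_id: Optional[str]) -> Path:
    """Return the directory for a tenant, refusing paths that escape `base`."""
    base_resolved = base.resolve()
    if tenant_id is None:
        # Legacy single-tenant artifacts live directly in the base directory.
        return base_resolved
    candidate = (base_resolved / tenant_id).resolve()
    if candidate == base_resolved or not candidate.is_relative_to(base_resolved):
        raise ValueError(f"tenant_id escapes base directory: {tenant_id!r}")
    return candidate
```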

### Creating Tenant-Scoped Artifacts

Using the factory with `tenant_id`:

```python
from tools.factory import create_artifact, persist_artifact
from pathlib import Path

# Create artifact for specific tenant
artifact = create_artifact(
    "decision",
    {
        "title": "Technical Decision",
        "context": "Choosing database",
        "options_considered": ["PostgreSQL", "MongoDB"],
        "chosen_option": "PostgreSQL",
        "reasoning": "Better consistency guarantees",
        "constraints": ["Budget", "Timeline"],
        "assumptions": ["Relational data model"],
        "confidence": 0.9,
        "expected_outcome": "Stable database layer",
        "category": "architecture"
    },
    tenant_id="acme-corp"  # Tenant isolation
)

# Automatically persists to memory/decisions/acme-corp/
path = persist_artifact(artifact, Path("memory/decisions"))
```

### Environment Variables

Configure JWT authentication:

```bash
# JWT secret key (REQUIRED in production)
export NOVYX_JWT_SECRET="your-secret-key-min-32-chars"

# Token expiration (default: 60 minutes)
export NOVYX_JWT_EXPIRE_MINUTES="120"

# API key (backward compatibility)
export NOVYX_API_KEY="your-api-key"
```

### Testing Multi-Tenant Features

Run the multi-tenant test suite:

```bash
python3 -m pytest tests/test_multi_tenant.py -v
# Tests: JWT auth, roles, isolation, cross-tenant security
```

### Migration from Single-Tenant

Existing artifacts without `tenant_id` continue to work:
- Stored in base directory (e.g., `memory/decisions/`)
- Accessible via global API key authentication
- Can be migrated by adding `tenant_id` field and moving to tenant subdirectory
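
A migration along those lines could be scripted as below. This is a sketch, not a supported tool: `migrate_artifact` is hypothetical, and if the integrity hash covers the `tenant_id` field it would have to be recomputed before Sentinel accepts the file.

```python
import json
from pathlib import Path


def migrate_artifact(path: Path, tenant_id: str) -> Path:
    """Add tenant_id to an artifact and move it into the tenant subdirectory."""
    data = json.loads(path.read_text(encoding="utf-8"))
    data["tenant_id"] = tenant_id

    target_dir = path.parent / tenant_id
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / path.name

    # Write the new copy first, then remove the original.
    target.write_text(json.dumps(data, indent=2), encoding="utf-8")
    path.unlink()
    return target
```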

---

## 📦 Artifact Versioning & History (Phase 10)

Novyx Core provides **comprehensive version control** for all artifacts, enabling full audit trails, diff-based storage, and rollback capabilities.

### Key Features

- 📜 **Complete History**: Every artifact update creates a version record with diff
- 🔄 **Rollback**: Restore artifacts to any previous version
- 💾 **Diff-Based Storage**: Space-efficient versioning using unified diffs
- 🔐 **Integrity Verified**: Each version has its own cryptographic hash
- 👥 **Tenant-Aware**: Version histories are isolated per tenant
- 🎯 **Automated**: Versioning happens automatically on artifact updates

### How It Works

When an artifact is updated via `persist_artifact()`, Novyx automatically:
1. Loads the previous version of the artifact
2. Computes a unified diff between old and new content
3. Creates a `Version` artifact with the diff, timestamp, and change metadata
4. Stores the version in `memory/versions/{artifact_id}/`
5. Updates the main artifact with new content and hash
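
Steps 2 and 3 can be sketched with the standard library's `difflib`. The field names follow the `Version` schema documented later in this README, but `make_version_record` itself is a hypothetical helper and the serialization used for diffing is an assumption.

```python
import difflib
import json
from datetime import datetime, timezone


def make_version_record(
    artifact_id: str, old: dict, new: dict, version_number: int, changed_by: str
) -> dict:
    """Build a version record holding a unified diff between old and new."""
    # Assumption: artifacts are diffed as pretty-printed, key-sorted JSON.
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines()
    diff = "\n".join(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="old", tofile="new", lineterm=""
        )
    )
    return {
        "artifact_id": artifact_id,
        "version_number": version_number,
        "diff": diff,
        "changed_by": changed_by,
        "change_type": "update",
        "createdAt": datetime.now(timezone.utc).isoformat(),
    }
```

Sorting keys before diffing keeps the diff limited to real changes rather than key reordering.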

### CLI Version Management

**List all versions for an artifact:**
```bash
python3 -m tools.cli version --list --artifact-id urn:uuid:abc123
```

**Output:**
```
📋 Version History for urn:uuid:abc123
   Total versions: 5

   v1: 2026-01-10T14:30:00Z
      UUID: urn:uuid:version-001
   v2: 2026-01-11T09:15:00Z
      UUID: urn:uuid:version-002
   v3: 2026-01-12T16:45:00Z
      UUID: urn:uuid:version-003
```

**Rollback to a specific version:**
```bash
python3 -m tools.cli version --rollback --artifact-id urn:uuid:abc123 --version 2
```

**Output:**
```
🔄 Rolling back artifact urn:uuid:abc123 to version 2...
✅ Rollback successful!
   Updated: memory/decisions/my_decision_a1b2c3d4.jsonld
```

**Tenant-scoped version operations:**
```bash
python3 -m tools.cli version --list --artifact-id urn:uuid:xyz789 --tenant acme-corp
python3 -m tools.cli version --rollback --artifact-id urn:uuid:xyz789 --version 1 --tenant acme-corp
```

### API Version Endpoints

**GET /api/v1/version/{artifact_id}/list** - List versions

```bash
curl -X GET "http://localhost:8000/api/v1/version/abc123/list" \
  -H "X-API-Key: your-api-key"
```

**Response:**
```json
{
  "status": "success",
  "message": "Found 5 version(s) for artifact urn:uuid:abc123",
  "data": {
    "artifact_id": "urn:uuid:abc123",
    "version_count": 5,
    "versions": [
      {
        "version_number": 1,
        "uuid": "urn:uuid:version-001",
        "timestamp": "2026-01-10T14:30:00+00:00",
        "change_summary": "Initial decision created",
        "changed_by": "user@example.com"
      },
      {
        "version_number": 2,
        "uuid": "urn:uuid:version-002",
        "timestamp": "2026-01-11T09:15:00+00:00",
        "change_summary": "Enhanced reasoning",
        "changed_by": "user@example.com"
      }
    ]
  }
}
```

**POST /api/v1/version/{artifact_id}/rollback** - Rollback to version

```bash
curl -X POST "http://localhost:8000/api/v1/version/abc123/rollback" \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"version_number": 2}'
```

**Response:**
```json
{
  "status": "success",
  "message": "Artifact rolled back to version 2",
  "data": {
    "artifact_id": "urn:uuid:abc123",
    "version_number": 2,
    "updated_path": "memory/decisions/my_decision_a1b2c3d4.jsonld",
    "updated_at": "2026-01-13T10:30:00+00:00"
  }
}
```

### Version Schema

Each `Version` artifact follows the `novyx:Version` schema:

```json
{
  "@type": ["novyx:Version", "schema:CreativeWork"],
  "uuid": "urn:uuid:version-002",
  "artifact_id": "urn:uuid:abc123",
  "version_number": 2,
  "parent_version": "urn:uuid:version-001",
  "diff": "--- old\n+++ new\n@@ -1,3 +1,3 @@\n...",
  "change_summary": "Enhanced reasoning with more details",
  "changed_by": "user@example.com",
  "change_type": "update",
  "artifact_snapshot": null,
  "createdAt": "2026-01-11T09:15:00+00:00",
  "novyx:integrityHash": "7f83b1657ff1fc53b92dc18148a1d65dfc2d4b1fa3d677284addd200126d9069"
}
```

**Note**: Version 1 includes `artifact_snapshot` with the full initial state. Subsequent versions store only diffs.
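
To illustrate what a stored `diff` might contain, here is a minimal sketch that produces a unified diff between two artifact states with Python's `difflib`. The `make_diff` helper and the choice to serialize with sorted keys are assumptions for illustration; the actual diff logic in `tools.versioning` may differ.

```python
import difflib
import json


def make_diff(old: dict, new: dict) -> str:
    """Return a unified diff between two artifact states (hypothetical helper)."""
    # Serialize deterministically so the diff reflects content changes only.
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines(keepends=True)
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old_lines, new_lines, fromfile="old", tofile="new")
    )


v1 = {"title": "My decision", "reasoning": "Short"}
v2 = {"title": "My decision", "reasoning": "Enhanced reasoning with more details"}
diff_text = make_diff(v1, v2)
print(diff_text)
```

An unchanged artifact yields an empty string, which a caller could use to skip creating a new version.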

### Programmatic Usage

```python
from tools.versioning import list_versions, rollback_to_version
from tools.factory import persist_artifact
from pathlib import Path

# List versions
versions = list_versions("urn:uuid:abc123")
print(f"Found {len(versions)} versions")

# Rollback to version 2
restored_artifact = rollback_to_version("urn:uuid:abc123", 2)

# Persist the restored artifact (skip_versioning to avoid circular versioning)
output_dir = Path("memory/decisions")
path = persist_artifact(restored_artifact, output_dir, skip_versioning=True)
```

### Testing Versioning

Run the versioning test suite (16 tests):

```bash
python3 -m pytest tests/test_versioning.py -v
# Tests: version creation, diffs, listing, rollback, multi-tenant, integrity
```

### Version Storage Structure

```
memory/
├── versions/
│   ├── abc123-def4-5678-9012-345678901234/  # Artifact-specific directory
│   │   ├── version_001.jsonld  # v1 with full snapshot
│   │   ├── version_002.jsonld  # v2 with diff
│   │   └── version_003.jsonld  # v3 with diff
│   └── acme-corp/  # Tenant-specific versions
│       └── xyz789-abc1-2345-6789-012345678901/
│           ├── version_001.jsonld
│           └── version_002.jsonld
```
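
The layout above can be expressed as a small path helper. This is a sketch under stated assumptions: `version_file_path` is hypothetical, and it assumes the artifact directory name is the artifact ID without its `urn:uuid:` prefix, which the tree suggests but does not state.

```python
from pathlib import Path
from typing import Optional


def version_file_path(
    root: Path,
    artifact_id: str,
    version_number: int,
    tenant_id: Optional[str] = None,
) -> Path:
    """Map an artifact ID and version number to the layout shown above."""
    # Assumption: the directory name drops the "urn:uuid:" prefix.
    bare_id = artifact_id.removeprefix("urn:uuid:")
    base = root / "versions"
    if tenant_id:
        base = base / tenant_id  # tenant-specific versions get their own subtree
    return base / bare_id / f"version_{version_number:03d}.jsonld"


print(version_file_path(Path("memory"), "urn:uuid:abc123-def4-5678-9012-345678901234", 2))
```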

### Use Cases

- **Audit Trails**: Track who changed what and when
- **Compliance**: Maintain immutable history for regulatory requirements
- **Experimentation**: Try changes and roll back if needed
- **Collaboration**: Review version history to understand decision evolution
- **Recovery**: Restore from accidental modifications

---

## 📦 Streaming & Batch Operations (Phase 13)

Novyx Core supports **memory-efficient streaming** and **batch operations** for large artifact collections, enabling scalable deployments.

### Key Features

- 🌊 **Streaming Iteration**: Generator-based artifact loading with O(1) memory per item
- 📦 **Batch Create**: Create up to 100 artifacts in a single operation
- ✅ **Batch Validation**: Stream-validate artifacts without loading entire dataset
- 🏢 **Multi-Tenant Aware**: All streaming operations support tenant filtering
- 🔍 **Streaming Semantic Search**: Search large datasets with bounded memory
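
The idea behind generator-based streaming can be sketched with the standard library alone. `iter_artifact_batches` below is a hypothetical stand-in, not the `tools.batch` implementation: it walks `*.jsonld` files lazily and keeps only one batch in memory at a time.

```python
import json
import tempfile
from itertools import islice
from pathlib import Path
from typing import Iterator


def iter_artifact_batches(directory: Path, batch_size: int = 100) -> Iterator[list[dict]]:
    """Yield lists of parsed artifacts, at most batch_size per list."""
    paths = directory.rglob("*.jsonld")  # lazy: paths are produced on demand
    while True:
        chunk = list(islice(paths, batch_size))
        if not chunk:
            return
        yield [json.loads(p.read_text(encoding="utf-8")) for p in chunk]


# Demo on a throwaway directory holding five tiny artifacts
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for i in range(5):
        (root / f"artifact_{i}.jsonld").write_text(json.dumps({"n": i}), encoding="utf-8")
    sizes = [len(batch) for batch in iter_artifact_batches(root, batch_size=2)]

print(sizes)  # [2, 2, 1]
```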

### CLI Batch Commands

**Count artifacts (fast, no content loading):**
```bash
python3 -m tools.cli batch --count --directory memory
python3 -m tools.cli batch --count --directory memory --tenant acme-corp
```

**Test streaming performance:**
```bash
python3 -m tools.cli batch --stream-test --directory memory --batch-size 50
```

**Batch create from JSON file:**
```bash
# Create artifacts.json with array of artifact payloads
python3 -m tools.cli batch --create artifacts.json --directory memory

# With tenant isolation
python3 -m tools.cli batch --create artifacts.json --directory memory --tenant acme-corp

# Skip versioning for bulk imports
python3 -m tools.cli batch --create artifacts.json --directory memory --skip-versioning
```

**Stream-validate artifacts:**
```bash
python3 -m tools.cli batch --validate --directory memory --batch-size 100
```

### API Batch Endpoints

**POST /api/v1/batch/create** - Create multiple artifacts

```bash
curl -X POST http://localhost:8000/api/v1/batch/create \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "artifacts": [
      {
        "@type": "novyx:Decision",
        "title": "Decision 1",
        "context": "Context",
        "options_considered": ["A", "B"],
        "chosen_option": "A",
        "reasoning": "Reasoning",
        "constraints": [],
        "assumptions": [],
        "confidence": 0.9,
        "expected_outcome": "Success",
        "category": "architecture"
      },
      {
        "@type": "novyx:Decision",
        "title": "Decision 2",
        "context": "Another context",
        "options_considered": ["X", "Y"],
        "chosen_option": "X",
        "reasoning": "Another reasoning",
        "constraints": [],
        "assumptions": [],
        "confidence": 0.85,
        "expected_outcome": "Success",
        "category": "architecture"
      }
    ],
    "tenant_id": "acme-corp",
    "skip_versioning": false
  }'
```

**Response:**
```json
{
  "status": "success",
  "message": "Batch create completed: 2 created, 0 failed",
  "data": {
    "created": 2,
    "failed": 0,
    "results": [
      {"index": 0, "artifact_id": "urn:uuid:...", "path": "...", "type": "decision"},
      {"index": 1, "artifact_id": "urn:uuid:...", "path": "...", "type": "decision"}
    ],
    "errors": []
  }
}
```

**GET /api/v1/batch/count** - Count artifacts

```bash
curl "http://localhost:8000/api/v1/batch/count?directory=memory&tenant_id=acme-corp" \
  -H "X-API-Key: your-api-key"
```

**POST /api/v1/batch/validate** - Stream-validate artifacts

```bash
curl -X POST http://localhost:8000/api/v1/batch/validate \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"directory": "memory", "batch_size": 50}'
```

### Programmatic Usage

```python
from tools.batch import (
    stream_artifacts,
    batch_create,
    batch_validate,
    count_artifacts
)
from pathlib import Path

# Stream artifacts in memory-efficient batches
for batch in stream_artifacts(Path("memory"), batch_size=100):
    for artifact in batch:
        process(artifact)  # process() stands in for your own handler

# Stream with tenant filter
for batch in stream_artifacts(Path("memory"), tenant_id="acme-corp"):
    print(f"Processing {len(batch)} artifacts")

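# Batch create artifacts (the remaining Decision fields, as in the
# API example above, are elided here)
artifacts = [
    {"@type": "novyx:Decision", "title": "Decision 1"},  # plus remaining fields
    {"@type": "novyx:Decision", "title": "Decision 2"},  # plus remaining fields
]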
result = batch_create(artifacts, tenant_id="acme-corp")
print(f"Created: {result['created']}, Failed: {result['failed']}")

# Stream-validate with progress tracking
for validation_result in batch_validate(Path("memory"), batch_size=50):
    print(f"Batch: {validation_result['passed']}/{validation_result['batch_size']} passed")

# Fast artifact count (no content loading)
count = count_artifacts(Path("memory"), tenant_id="acme-corp")
```

### Streaming Semantic Search

For large datasets, use streaming semantic search to maintain bounded memory:

```python
from tools.query import QueryEngine

engine = QueryEngine()

# Streaming search across large datasets
results = engine.search_semantic_streaming(
    "deployment strategy",
    top_k=10,
    min_score=0.3,
    batch_size=100,
    tenant_id="acme-corp"
)
```
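
One common way to keep memory bounded while searching in batches is a fixed-size min-heap of the best results seen so far. The sketch below illustrates that technique only; `top_k_streaming` is a hypothetical helper, the scores are made up, and it is not necessarily how `search_semantic_streaming` works internally.

```python
import heapq
from typing import Iterable


def top_k_streaming(
    scored_batches: Iterable[list[tuple[float, str]]],
    top_k: int = 10,
    min_score: float = 0.0,
) -> list[tuple[float, str]]:
    """Keep only the top_k (score, artifact_id) pairs across all batches."""
    heap: list[tuple[float, str]] = []  # min-heap: heap[0] is the weakest kept result
    for batch in scored_batches:
        for score, artifact_id in batch:
            if score < min_score:
                continue
            if len(heap) < top_k:
                heapq.heappush(heap, (score, artifact_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, artifact_id))
    return sorted(heap, reverse=True)  # best first


batches = [
    [(0.91, "a"), (0.12, "b")],
    [(0.55, "c"), (0.78, "d")],
]
best = top_k_streaming(batches, top_k=2, min_score=0.3)
print(best)  # [(0.91, 'a'), (0.78, 'd')]
```

Memory use depends on `top_k` and the batch size, not on the total number of artifacts.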

### Batch Limits

| Operation     | Limit | Notes                              |
| ------------- | ----- | ---------------------------------- |
| Batch Create  | 100   | Maximum artifacts per request      |
| Stream Batch  | 100   | Maximum batch size (configurable)  |
| Validation    | 100   | Maximum validation batch size      |
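
When an import exceeds the batch-create limit, the payload list can be split client-side before each request. A minimal sketch, assuming a hypothetical `chunk_payloads` helper and placeholder payloads:

```python
from typing import Iterator

MAX_BATCH_CREATE = 100  # limit taken from the table above


def chunk_payloads(payloads: list[dict], size: int = MAX_BATCH_CREATE) -> Iterator[list[dict]]:
    """Split a payload list into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(payloads), size):
        yield payloads[start:start + size]


payloads = [{"title": f"Decision {i}"} for i in range(250)]  # placeholder payloads
chunk_sizes = [len(chunk) for chunk in chunk_payloads(payloads)]
print(chunk_sizes)  # [100, 100, 50]
```

Each chunk could then be passed to `batch_create` or posted to `/api/v1/batch/create` in turn.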

### Testing Batch Operations

```bash
python3 -m pytest tests/test_batch.py -v
# 19 tests covering streaming, batch create, validation, multi-tenant
```

---

## 📤 Export/Import Formats (Phase 11)

Novyx Core supports **export and import** to multiple formats for migrations, integrations, and interoperability with external systems.

### Supported Formats

| Format       | Export | Import | Description                              |
| ------------ | ------ | ------ | ---------------------------------------- |
| JSON-LD      | ✅     | ✅     | Native format, consolidated graph export |
| RDF/Turtle   | ✅     | ✅     | Semantic web standard (requires rdflib)  |
| Neo4j Cypher | ✅     | ❌     | Graph database import scripts            |

### CLI Export Commands

**Export to JSON-LD:**
```bash
python3 -m tools.cli export --format jsonld --output artifacts.jsonld --directory memory
```

**Export to RDF/Turtle:**
```bash
python3 -m tools.cli export --format rdf --output graph.ttl --directory memory
```

**Export to Neo4j Cypher:**
```bash
python3 -m tools.cli export --format neo4j --output import.cypher --directory memory
```

**Export specific tenant:**
```bash
python3 -m tools.cli export --format jsonld --output tenant_data.jsonld --directory memory --tenant acme-corp
```

### CLI Import Commands

**Import from JSON-LD:**
```bash
python3 -m tools.cli import --input artifacts.jsonld --format jsonld --output memory
```

**Import from RDF/Turtle:**
```bash
python3 -m tools.cli import --input graph.ttl --format rdf --output memory
```

**Import with tenant assignment:**
```bash
python3 -m tools.cli import --input external.jsonld --format jsonld --output memory --tenant imported-org
```

**Preserve original hashes (no regeneration):**
```bash
python3 -m tools.cli import --input backup.jsonld --format jsonld --output memory --no-regenerate-hashes
```

### API Export/Import Endpoints

**POST /api/v1/export** - Export artifacts

```bash
curl -X POST http://localhost:8000/api/v1/export \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "directory": "memory",
    "format": "jsonld",
    "tenant_id": "acme-corp",
    "include_embeddings": false
  }'
```

**Response:**
```json
{
  "status": "success",
  "message": "Exported 42 artifact(s) to JSONLD",
  "data": {
    "total_artifacts": 42,
    "exported": 42,
    "failed": 0,
    "format": "jsonld",
    "content_base64": "eyJAY29udGV4dCI6Li4u..."
  }
}
```
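
The `content_base64` field has to be decoded before the export can be written to disk or parsed. The sketch below assumes the field holds standard Base64 of the exported file bytes; `decode_export` is a hypothetical helper and the sample response is constructed locally rather than taken from a real server.

```python
import base64
import json


def decode_export(response: dict) -> bytes:
    """Return the raw exported bytes from an export API response."""
    return base64.b64decode(response["data"]["content_base64"])


# Locally constructed stand-in for a server response (illustrative data only)
sample_doc = {"@context": "https://schema.org", "@graph": []}
encoded = base64.b64encode(json.dumps(sample_doc).encode("utf-8")).decode("ascii")
response = {"status": "success", "data": {"format": "jsonld", "content_base64": encoded}}

restored = json.loads(decode_export(response))
print(restored)
```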

**POST /api/v1/import** - Import artifacts (multipart/form-data)

```bash
curl -X POST http://localhost:8000/api/v1/import \
  -H "X-API-Key: your-api-key" \
  -F "file=@artifacts.jsonld" \
  -F "format=jsonld" \
  -F "output_dir=memory" \
  -F "tenant_id=imported-org"
```

### Programmatic Usage

```python
from tools.export import export_artifacts, ExportFormat
from tools.import_data import import_artifacts, ImportFormat
from pathlib import Path

# Export to JSON-LD
stats = export_artifacts(
    Path("memory"),
    ExportFormat.JSONLD,
    Path("export.jsonld"),
    tenant_id="acme-corp"
)
print(f"Exported {stats['exported']} artifacts")

# Export to RDF/Turtle (requires rdflib)
stats = export_artifacts(
    Path("memory"),
    ExportFormat.RDF,
    Path("graph.ttl"),
    rdf_format="turtle"
)
print(f"Created {stats['triples']} RDF triples")

# Export to Neo4j Cypher
stats = export_artifacts(
    Path("memory"),
    ExportFormat.NEO4J,
    Path("import.cypher")
)
print(f"Generated {stats['nodes']} Neo4j nodes")

# Import from JSON-LD
stats = import_artifacts(
    Path("export.jsonld"),
    ImportFormat.JSONLD,
    Path("memory/restored"),
    tenant_id="restored-tenant",
    regenerate_hashes=True
)
print(f"Imported {stats['imported']} artifacts")
```

### Neo4j Import Example

After exporting to Cypher, import into Neo4j:

```bash
# Export from Novyx
python3 -m tools.cli export --format neo4j --output novyx_import.cypher --directory memory

# Import to Neo4j (using cypher-shell)
cypher-shell -u neo4j -p password < novyx_import.cypher
```

The generated Cypher includes:
- Uniqueness constraints on `uuid`
- CREATE statements for each artifact
- Labels based on artifact type (`:Artifact`, `:Decision`, `:APIEndpoint`, etc.)
- Relationship creation for `sameAs` links

### Round-Trip Integrity

An export followed by a re-import preserves all core data:

```python
from pathlib import Path

from tools.export import export_artifacts, ExportFormat
from tools.import_data import import_artifacts, ImportFormat
from tools.sentinel import NovyxSentinel

# Export
export_artifacts(Path("memory"), ExportFormat.JSONLD, Path("backup.jsonld"))

# Import (with hash regeneration)
import_artifacts(Path("backup.jsonld"), ImportFormat.JSONLD, Path("memory/restored"))

# Validate restored artifacts
sentinel = NovyxSentinel()
results = sentinel.audit(Path("memory/restored"))
print(f"Validation: {results['passed']}/{results['total']} passed")
```

### Testing Export/Import

```bash
python3 -m pytest tests/test_export_import.py -v
# 17 tests covering round-trip, integrity, multi-tenant
```

---

### Federation: Distributed Sync (Phase 14)

Novyx Core supports **federation** - synchronizing artifacts between multiple Novyx instances.

#### RemoteRef: Tracking Remote Artifacts

```python
from tools.factory import create_remote_ref, persist_artifact
from pathlib import Path

# Create a reference to a remote artifact
ref = create_remote_ref(
    remote_url="https://partner.novyx.ai/api/artifacts/urn:uuid:dec-123",
    remote_artifact_id="urn:uuid:dec-123",
    remote_instance="https://partner.novyx.ai",
    sync_status="pending",
    direction="pull",
    metadata={"name": "Partner Decision", "type": "Decision"}
)

# Persist the reference
persist_artifact(ref, Path("memory/federation"))
```

#### CLI: Federation Commands

```bash
# Check federation status
novyx sync status

# List all remote references
novyx sync list

# List pending refs only
novyx sync list --status pending

# Sync all pending refs
novyx sync run

# Sync with conflict resolution override
novyx sync run --conflict-resolution remote_wins

# Pull specific artifact from remote (requires requests)
novyx sync pull --remote https://partner.novyx.ai --artifact-id urn:uuid:dec-123
```

#### API: Federation Endpoints

```bash
# Get federation status
curl -X GET "http://localhost:8000/api/v1/federation/status" \
  -H "Authorization: Bearer YOUR_API_KEY"

# List remote references
curl -X GET "http://localhost:8000/api/v1/federation/refs?status=pending" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Pull artifact from remote
curl -X POST "http://localhost:8000/api/v1/federation/pull?remote_instance=https://partner.novyx.ai&artifact_id=urn:uuid:dec-123" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Sync pending refs
curl -X POST "http://localhost:8000/api/v1/federation/sync" \
  -H "Authorization: Bearer YOUR_API_KEY"
```

#### Conflict Resolution Strategies

| Strategy | Behavior |
|----------|----------|
| `remote_wins` | Remote changes overwrite local (default) |
| `local_wins` | Local changes preserved, remote ignored |
| `manual` | Mark as conflict, require manual resolution |
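
The three strategies in the table can be read as a simple dispatch. This is an illustrative sketch only: `resolve_conflict` is a hypothetical function, and returning `None` for `manual` is an assumed convention for "needs human review", not the federation module's actual behavior.

```python
from typing import Optional


def resolve_conflict(local: dict, remote: dict, strategy: str = "remote_wins") -> Optional[dict]:
    """Pick the artifact to keep, or None when manual resolution is required."""
    if strategy == "remote_wins":
        return remote  # remote changes overwrite local
    if strategy == "local_wins":
        return local  # local changes preserved, remote ignored
    if strategy == "manual":
        return None  # caller marks the ref as conflicted
    raise ValueError(f"unknown conflict resolution strategy: {strategy!r}")


local = {"uuid": "urn:uuid:dec-123", "title": "Local title"}
remote = {"uuid": "urn:uuid:dec-123", "title": "Remote title"}
print(resolve_conflict(local, remote)["title"])  # Remote title
```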

#### Testing Federation

```bash
python3 -m pytest tests/test_federation.py -v
# 10+ tests covering sync, conflicts, remote refs
```

---

## 🤖 AI Visibility: 100/100 Discoverability Score

Novyx Core is designed to be **machine-discoverable** by next-generation AI search agents. Our `organization.jsonld` file provides:

- ✅ **Schema.org compliance** (`Organization`, `SoftwareApplication`)
- ✅ **External identity verification** (`sameAs` links to GitHub, LinkedIn)
- ✅ **Semantic metadata** (product descriptions, trust signals)
- ✅ **Verifiable credentials** (cryptographic integrity hashing)

**For AI agents**: Start at [`organization.jsonld`](organization.jsonld) to understand what Novyx Labs builds.

---

## 🏗️ Architecture

### Data Model: JSON-LD 1.1

All artifacts are stored as **JSON-LD** (Linked Data) documents, ensuring:
- **Interoperability**: Standard RDF vocabularies (Schema.org, custom `novyx:` namespace)
- **Semantic richness**: Type information, relationships, and context preserved
- **Future-proof**: Data remains readable even as tools evolve

### Schema Versions

| Version    | ID Format       | Hash Field            | Timestamp Field    | Status      |
| ---------- | --------------- | --------------------- | ------------------ | ----------- |
| **v1.0**   | `Persistent_ID` | `Integrity_Hash`      | `Temporal_Marker`  | ✅ Supported |
| **v1.1.0** | `uuid`          | `novyx:integrityHash` | `novyx:ingestedAt` | ✅ Current   |

**Backward compatibility**: All tools support both schemas seamlessly.
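
One way to handle both schemas in calling code is to map v1.0 field names onto their v1.1.0 equivalents at read time. The sketch below uses only the field names from the table; `normalize_fields` is a hypothetical helper, not the mechanism the Novyx tools use.

```python
# v1.0 field name -> v1.1.0 field name, as listed in the table above
LEGACY_FIELDS = {
    "Persistent_ID": "uuid",
    "Integrity_Hash": "novyx:integrityHash",
    "Temporal_Marker": "novyx:ingestedAt",
}


def normalize_fields(artifact: dict) -> dict:
    """Return a copy of the artifact with v1.0 keys renamed to v1.1.0 keys."""
    return {LEGACY_FIELDS.get(key, key): value for key, value in artifact.items()}


legacy = {
    "Persistent_ID": "urn:uuid:abc123",
    "Temporal_Marker": "2026-01-10T14:30:00+00:00",
    "title": "Example",
}
normalized = normalize_fields(legacy)
print(normalized["uuid"])  # urn:uuid:abc123
```

Because renaming keys changes the serialized content, a view like this is presumably best kept in memory rather than written back over the original file, where it could invalidate the stored hash.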

### Local-First Philosophy

Novyx Core runs **entirely on your machine**:
- ❌ No cloud dependencies
- ❌ No third-party API keys required
- ❌ No telemetry or tracking
- ✅ Full control over your data
- ✅ Works offline
- ✅ Git-friendly (plain JSON files)

---

## 📊 Example: Knowledge Graph in Action

```bash
$ python tools/query.py --stats

📊 Knowledge Graph Statistics
------------------------------
Total Artifacts: 6
  • research: 1
  • health: 3
  • decisions: 2
------------------------------
🧠 Semantic Connections: 3
🌐 Authority Links:      1
🛡️  Integrity Status:     Active
```

```bash
$ python tools/query.py --search "AI agents"

🔍 Semantic Results for: 'AI agents'
--------------------------------------------------
📄 Vision (Score: 0.8432)
   📂 Category: decisions
   💡 Excerpt: Novyx Core is designed to be discoverable by AI agents in 2026.
   🌐 Authority: https://github.com/novyxlabs
```

---

## 🔒 Integrity Guarantees

Every artifact includes:

1. **Cryptographic Hash**: SHA-256 of content (excluding the hash field itself)
2. **Temporal Marker**: ISO 8601 timestamp
3. **Persistent ID**: URN:UUID that never changes
4. **Context Links**: Traceable relationships to other artifacts
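
The hashing rule in item 1 can be sketched as follows. The canonical form used here (sorted keys, compact separators, UTF-8) is an assumption made for illustration, so hashes from this sketch will not necessarily match those written by Novyx Core; `compute_integrity_hash` and `verify` are hypothetical helpers.

```python
import hashlib
import json

HASH_FIELD = "novyx:integrityHash"


def compute_integrity_hash(artifact: dict) -> str:
    """SHA-256 over the artifact content, excluding the hash field itself."""
    content = {k: v for k, v in artifact.items() if k != HASH_FIELD}
    # Assumed canonical form: sorted keys, no extra whitespace, UTF-8 bytes.
    canonical = json.dumps(content, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(artifact: dict) -> bool:
    """Check that the stored hash matches the recomputed one."""
    return artifact.get(HASH_FIELD) == compute_integrity_hash(artifact)


artifact = {"uuid": "urn:uuid:abc123", "title": "Example"}
artifact[HASH_FIELD] = compute_integrity_hash(artifact)
print(verify(artifact))  # True

tampered = dict(artifact, title="Changed")
print(verify(tampered))  # False
```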

**Pre-commit enforcement**: `.cursorrules` mandates running `sentinel.py` before every commit, ensuring:
- ✅ All hashes are valid
- ✅ No orphaned links
- ✅ Schema compliance
- ✅ No data corruption

---

## 📚 Documentation

- **[Agent Manifesto](agents/MANIFESTO.md)**: Constitutional law for AI agents
- **[Core Schema](CORE_SCHEMA.jsonld)**: Data structure standards
- **[Cursor Rules](.cursorrules)**: Operational guidelines

---

## 🛠️ Development

### Running Tests

```bash
# Validate all artifacts
python tools/sentinel.py --verbose

# Generate AI visibility report
python tools/entity_generator.py generate-report
```

### Adding New Artifacts

```bash
# 1. Create a text file in inbox/
echo "Your insight here" > inbox/new_idea.txt

# 2. Ingest with semantic pulse
python tools/ingestor.py --file inbox/new_idea.txt --category research

# 3. Verify integrity
python tools/sentinel.py

# 4. Commit if passed
git add memory/
git commit -m "Add new research artifact"
```

---

## 🌟 Why Novyx Core?

| Problem            | Traditional AI           | Novyx Core                            |
| ------------------ | ------------------------ | ------------------------------------- |
| **Context Loss**   | Forgets between sessions | Persistent memory with embeddings     |
| **Data Integrity** | No verification          | SHA-256 + automated auditing          |
| **AI Visibility**  | Invisible to search      | Schema.org + external authority links |
| **Vendor Lock-in** | Cloud-dependent          | Local-first, open standards           |
| **Ephemeral**      | Session-based            | Designed for weeks/months             |

---

## 🤝 Contributing

Novyx Core follows strict durability and legibility standards. Before contributing:

1. Read [agents/MANIFESTO.md](agents/MANIFESTO.md)
2. Follow [.cursorrules](.cursorrules) protocols
3. Ensure `python tools/sentinel.py` passes
4. Write clear commit messages explaining "why," not "what"

---

## 📜 License

MIT License - See [LICENSE](LICENSE) for details.

---

## 🔗 Links

- **GitHub**: [github.com/novyxlabs/novyx-core](https://github.com/novyxlabs/novyx-core)
- **Organization Profile**: [organization.jsonld](organization.jsonld)
- **Schema**: [CORE_SCHEMA.jsonld](CORE_SCHEMA.jsonld)

---

**Novyx Labs** — Building AI that remembers, learns, and persists.

*"The best way to predict the future is to build systems that endure."*
