Metadata-Version: 2.4
Name: neurobrix
Version: 0.1.0a13
Summary: Universal Deep Learning Inference Engine — execute any AI model without model-specific code
Project-URL: Homepage, https://neurobrix.es
Project-URL: Repository, https://github.com/NeuroBrix/neurobrix
Project-URL: Issues, https://github.com/NeuroBrix/neurobrix/issues
Project-URL: Documentation, https://github.com/NeuroBrix/neurobrix#readme
Project-URL: Model Hub, https://neurobrix.es/models
Project-URL: Changelog, https://github.com/NeuroBrix/neurobrix/releases
Project-URL: Contributing Guide, https://github.com/NeuroBrix/neurobrix/blob/main/CONTRIBUTING.md
Author-email: Neural Networks Holding LTD <contact@neurobrix.es>
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: deep-learning,diffusion,gpu,inference,llm,model-serving,neural-networks,onnx-alternative,pytorch,triton
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Requires-Dist: jinja2>=3.0.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: pillow>=10.0.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: requests>=2.28.0
Requires-Dist: safetensors>=0.4.0
Requires-Dist: sentencepiece>=0.1.99
Requires-Dist: soundfile>=0.12.0
Requires-Dist: tokenizers>=0.14.0
Requires-Dist: torch>=2.1.0
Requires-Dist: tqdm>=4.65.0
Provides-Extra: audio
Requires-Dist: librosa>=0.10.0; extra == 'audio'
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Provides-Extra: full
Requires-Dist: librosa>=0.10.0; extra == 'full'
Requires-Dist: mistral-common>=1.0.0; extra == 'full'
Requires-Dist: tiktoken>=0.5.0; extra == 'full'
Requires-Dist: transformers>=4.30.0; extra == 'full'
Requires-Dist: triton>=2.1.0; extra == 'full'
Provides-Extra: mistral
Requires-Dist: mistral-common>=1.0.0; extra == 'mistral'
Provides-Extra: tiktoken
Requires-Dist: tiktoken>=0.5.0; extra == 'tiktoken'
Provides-Extra: triton
Requires-Dist: triton>=2.1.0; extra == 'triton'
Description-Content-Type: text/markdown

<p align="center">
  <img src="https://raw.githubusercontent.com/NeuroBrix/neurobrix/main/assets/logo.svg" alt="NeuroBrix Logo" width="300"/>
</p>

<h1 align="center">NeuroBrix</h1>

<p align="center">
  <strong>Universal Deep Learning Inference Engine</strong><br/>
  One engine. Any model. Any modality. Zero model-specific code.
</p>

<p align="center">
  <a href="https://pypi.org/project/neurobrix/"><img src="https://img.shields.io/pypi/v/neurobrix?include_prereleases&color=blue" alt="PyPI"/></a>
  <a href="https://pypi.org/project/neurobrix/"><img src="https://img.shields.io/pypi/pyversions/neurobrix?include_prereleases" alt="Python 3.10 | 3.11 | 3.12"/></a>
  <a href="https://github.com/NeuroBrix/neurobrix/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-Apache%202.0-green" alt="License"/></a>
  <a href="https://github.com/NeuroBrix/neurobrix/stargazers"><img src="https://img.shields.io/github/stars/NeuroBrix/neurobrix?style=social" alt="GitHub Stars"/></a>
  <a href="https://neurobrix.es/models"><img src="https://img.shields.io/badge/hub-neurobrix.es-orange" alt="NeuroBrix Hub"/></a>
</p>

<p align="center">
  <a href="https://neurobrix.es/models">Hub</a> &nbsp;|&nbsp;
  <a href="https://neurobrix.es/docs">Docs</a> &nbsp;|&nbsp;
  <a href="https://pypi.org/project/neurobrix/">PyPI</a> &nbsp;|&nbsp;
  <a href="#roadmap">Roadmap</a> &nbsp;|&nbsp;
  <a href="https://github.com/NeuroBrix/neurobrix/blob/main/CONTRIBUTING.md">Contributing</a>
</p>

---

## The Problem

The AI inference landscape is fragmented. Every model family requires its own stack, its own pipeline code, its own deployment tooling. Want to run a diffusion model? Learn ComfyUI or write custom diffusers pipelines. Need an LLM? Pick between Ollama, vLLM, llama.cpp — each with its own limitations. Audio? Video? Start from scratch.

**NeuroBrix eliminates this fragmentation entirely.**

One engine. One CLI. One container format. Import a model, run it. The runtime doesn't know or care whether it's executing a diffusion transformer, a mixture-of-experts LLM, a speech recognizer, or a video generator. It sees tensors, graphs, and execution plans — nothing else.

---

## Why NeuroBrix?

| Capability | Ollama | llama.cpp | vLLM | ComfyUI | **NeuroBrix** |
|:-----------|:------:|:---------:|:----:|:-------:|:-------------:|
| LLMs | Yes | Yes | Yes | -- | **Yes** |
| Image generation | -- | -- | -- | Yes | **Yes** |
| Video generation | -- | -- | -- | -- | **Yes** |
| Audio (STT + TTS) | -- | -- | -- | -- | **Yes** |
| Multimodal (understand + generate) | -- | -- | -- | -- | **Yes** |
| Mixture-of-Experts | -- | -- | Yes | -- | **Yes** |
| Multi-GPU auto-allocation | -- | -- | Yes | -- | **Yes** |
| Cross-platform (Linux, Windows, macOS) | Yes | Yes | -- | -- | **Yes** |
| Universal model format | -- | GGUF (LLM only) | -- | -- | **NBX (any model)** |
| No model-specific code | -- | -- | -- | -- | **Yes** |

Other tools solve one piece of the puzzle. NeuroBrix solves the whole puzzle.

---

## Installation

### Step 1: Install PyTorch with CUDA

```bash
# For CUDA 12.4 (RTX 30xx, 40xx, A100, H100)
pip install torch --index-url https://download.pytorch.org/whl/cu124

# For CUDA 12.1
pip install torch --index-url https://download.pytorch.org/whl/cu121

# For CUDA 11.8 (older GPUs like V100)
pip install torch --index-url https://download.pytorch.org/whl/cu118
```

Verify CUDA is available:
```bash
python -c "import torch; print(torch.cuda.is_available())"  # Should print: True
```

### Step 2: Install NeuroBrix

```bash
pip install neurobrix
```

### Platform Support

| Platform | GPU Support | Notes |
|----------|-------------|-------|
| **Linux** | CUDA, Triton kernels | Full support, recommended for production |
| **Windows** | CUDA | Fully supported, except that Triton kernels are not available on Windows |
| **macOS** | CPU only | MPS/Metal support planned |

**Requirements:** Python 3.10+ and PyTorch 2.1+. GPU inference additionally needs a CUDA build of PyTorch and an NVIDIA GPU; macOS currently runs on CPU only.

---

## Quick Start

```bash
# Import a model from the hub
neurobrix import Vendor/Model_Name --no-keep

# Generate an image (hardware auto-detected)
neurobrix run --model Model_Name \
    --prompt "A sunset over mountains" --steps 20

# Or serve for instant repeat inference
neurobrix serve --model Model_Name
neurobrix run --prompt "A robot painting on canvas" --output robot.png
neurobrix stop
```

### Serve Mode (Hot Run Mode, Recommended)

Loads weights into VRAM once and keeps the model warm. Every subsequent request runs with zero startup overhead.

```bash
neurobrix serve --model Model_Name

# Image generation (instant — model already loaded)
neurobrix run --prompt "A cat in a hat" --output cat.png

# LLM interactive chat
neurobrix chat --temperature 0.7

# Stop and free VRAM
neurobrix stop
```

---

## NeuroBrix Hub & Model Management

Models are hosted on the **[NeuroBrix Hub](https://neurobrix.es/models)** and managed locally through a two-tier storage system:

- **Store** (`~/.neurobrix/store/`) — downloaded `.nbx` archives (compressed)
- **Cache** (`~/.neurobrix/cache/`) — extracted models ready for inference
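
To see how much disk space each tier uses without going through the CLI, a short script can walk the two directories named above. This is a minimal sketch: the helper names `dir_size_bytes` and `storage_report` are hypothetical and not part of NeuroBrix, and the only thing taken from this README is the pair of directory names.

```python
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Total size of all regular files under root (0 if root does not exist)."""
    if not root.is_dir():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def storage_report(base: Path) -> dict[str, int]:
    """Size in bytes of the two tiers, 'store' and 'cache', under base."""
    return {tier: dir_size_bytes(base / tier) for tier in ("store", "cache")}


if __name__ == "__main__":
    base = Path.home() / ".neurobrix"
    for tier, size in storage_report(base).items():
        print(f"{tier}: {size / 1e9:.2f} GB")
```

`neurobrix info --models` remains the supported way to get this information; the script is only meant to make the two-tier layout concrete.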

### Browse & Import

```bash
# Browse the full hub catalog
neurobrix hub

# Filter by family
neurobrix hub --category IMAGE
neurobrix hub --category LLM
neurobrix hub --category AUDIO
neurobrix hub --category VIDEO

# Search by name
neurobrix hub --search sana

# Import a model (downloads .nbx → extracts to cache)
neurobrix import Vendor/Model_Name

# Import and delete the .nbx archive to save disk space
neurobrix import pixart/sigma-xl-1024 --no-keep

# Force re-import (overwrites existing)
neurobrix import Vendor/Model_Name --force
```

### List & Manage

```bash
# List installed models in cache (ready to run)
neurobrix list

# List downloaded .nbx archives in store
neurobrix list --store

# Show system info: installed models, hardware, disk usage
neurobrix info --models

# Remove a model from cache
neurobrix remove Model_Name

# Remove from both store and cache
neurobrix remove Model_Name --all

# Clean everything — free all disk space
neurobrix clean --all -y
```

### How It Works

```
neurobrix import Vendor/Model_Name --no-keep
  │
  ├─ 1. Download .nbx from neurobrix.es → ~/.neurobrix/store/
  ├─ 2. Extract to ~/.neurobrix/cache/Model_Name/
  ├─ 3. Validate manifest, components, weights
  └─ 4. Delete .nbx from store (--no-keep)

neurobrix run --model Model_Name --prompt "..."
  │
  └─ Reads directly from cache — zero extraction overhead
```

---

## Supported Models

NeuroBrix is a **runtime engine** — it executes models but does **not train or create** them. All models listed below are the work of their respective authors and are subject to their original licenses. **Users must review and accept each model's license before use.**

### Image Generation

| Model | Author | License | Size |
|-------|--------|---------|-----:|
| [Sana 1600M 4K](https://huggingface.co/Efficient-Large-Model/Sana_1600M_4Kpx_BF16) | NVIDIA / MIT | [Apache 2.0](https://huggingface.co/Efficient-Large-Model/Sana_1600M_4Kpx_BF16/blob/main/LICENSE) | 12 GB |
| [PixArt-Sigma-XL-2-1024-MS](https://huggingface.co/PixArt-alpha/PixArt-Sigma-XL-2-1024-MS) | PixArt | [OpenRAIL++](https://huggingface.co/PixArt-alpha/PixArt-Sigma-XL-2-1024-MS/blob/main/LICENSE) | 20 GB |
| [PixArt-XL-2-1024-MS](https://huggingface.co/PixArt-alpha/PixArt-XL-2-1024-MS) | PixArt | [OpenRAIL++](https://huggingface.co/PixArt-alpha/PixArt-XL-2-1024-MS/blob/main/LICENSE) | 20 GB |
| [Flex.1-alpha](https://huggingface.co/ostris/Flex.1-alpha) | Ostris | [Apache 2.0](https://huggingface.co/ostris/Flex.1-alpha/blob/main/LICENSE) | 24 GB |
| [Janus-Pro-7B](https://huggingface.co/deepseek-ai/Janus-Pro-7B) | DeepSeek | [MIT](https://huggingface.co/deepseek-ai/Janus-Pro-7B/blob/main/LICENSE) | 14 GB |

### Video Generation

| Model | Author | License | Size |
|-------|--------|---------|-----:|
| [SANA-Video 2B 720p](https://huggingface.co/Efficient-Large-Model/SANA-Video_2B_720p) | NVIDIA / MIT | [Apache 2.0](https://huggingface.co/Efficient-Large-Model/SANA-Video_2B_720p/blob/main/LICENSE) | 17 GB |

### Audio (Speech-to-Text + Text-to-Speech)

| Model | Author | License | Size | Type |
|-------|--------|---------|-----:|------|
| [Whisper Large](https://huggingface.co/openai/whisper-large) | OpenAI | [MIT](https://huggingface.co/openai/whisper-large/blob/main/LICENSE) | 6 GB | STT |
| [Whisper Large V3 Turbo](https://huggingface.co/openai/whisper-large-v3-turbo) | OpenAI | [MIT](https://huggingface.co/openai/whisper-large-v3-turbo/blob/main/LICENSE) | 3 GB | STT |
| [Parakeet TDT 1.1B](https://huggingface.co/nvidia/parakeet-tdt-1.1b) | NVIDIA | [CC-BY-4.0](https://huggingface.co/nvidia/parakeet-tdt-1.1b) | 4 GB | STT |
| [Canary-Qwen 2.5B](https://huggingface.co/nvidia/canary-qwen-2.5b) | NVIDIA | [CC-BY-4.0](https://huggingface.co/nvidia/canary-qwen-2.5b) | 10 GB | STT |
| [Voxtral Mini 3B](https://huggingface.co/mistralai/Voxtral-Mini-3B-2507) | Mistral AI | [Apache 2.0](https://huggingface.co/mistralai/Voxtral-Mini-3B-2507/blob/main/LICENSE) | 7 GB | STT |
| [Orpheus 3B](https://huggingface.co/canopylabs/orpheus-3b-0.1-ft) | Canopy Labs | [Apache 2.0](https://huggingface.co/canopylabs/orpheus-3b-0.1-ft) | 7 GB | TTS |
| [Kokoro 82M](https://huggingface.co/hexgrad/Kokoro-82M) | Hexgrad | [Apache 2.0](https://huggingface.co/hexgrad/Kokoro-82M) | 0.3 GB | TTS |
| [VibeVoice 1.5B](https://huggingface.co/WillHeld/VibeVoice-1.5B) | Will Held | [Apache 2.0](https://huggingface.co/WillHeld/VibeVoice-1.5B) | 6 GB | TTS |
| [OpenAudio S1 Mini](https://huggingface.co/FishAudio/OpenAudio-S1-Mini) | Fish Audio | [CC-BY-NC-SA-4.0](https://huggingface.co/FishAudio/OpenAudio-S1-Mini) | 2 GB | TTS |
| [Chatterbox](https://huggingface.co/resemble-ai/chatterbox) | Resemble AI | [MIT](https://huggingface.co/resemble-ai/chatterbox) | 1 GB | TTS |

### Large Language Models

| Model | Author | License | Size |
|-------|--------|---------|-----:|
| [DeepSeek-MoE-16B](https://huggingface.co/deepseek-ai/deepseek-moe-16b-chat) | DeepSeek | [MIT](https://huggingface.co/deepseek-ai/deepseek-moe-16b-chat/blob/main/LICENSE) | 31 GB |
| [Qwen3-30B-A3B-Thinking](https://huggingface.co/Qwen/Qwen3-30B-A3B) | Alibaba / Qwen | [Apache 2.0](https://huggingface.co/Qwen/Qwen3-30B-A3B/blob/main/LICENSE) | 57 GB |
| [TinyLlama 1.1B](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) | TinyLlama | [Apache 2.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0/blob/main/LICENSE) | 4 GB |

> **Non-commercial models:** OpenAudio S1 Mini uses [CC-BY-NC-SA-4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) — non-commercial use only. Check each model's license before commercial deployment.

---

## The NBX Format

NeuroBrix introduces `.nbx` — a **universal container format for AI models**. Where GGUF is limited to LLMs and ONNX struggles with dynamic architectures, NBX captures any computation graph with full fidelity.

```
model.nbx (self-contained archive)
  ├── graph.json         Complete computation graph (TensorDAG)
  ├── topology.json      Execution flow and component connections
  ├── manifest.json      Component metadata
  ├── defaults.json      Runtime parameters
  └── weights/           Parameters in safetensors format
```
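
The layout above lends itself to a quick presence check. The sketch below assumes that an extracted model directory in the cache mirrors the archive layout one to one, which this README does not state explicitly. The function name `missing_entries` is hypothetical, and the check is far shallower than `neurobrix validate`.

```python
from pathlib import Path

# Entry names taken from the layout listing above.
EXPECTED_FILES = ("graph.json", "topology.json", "manifest.json", "defaults.json")
EXPECTED_DIRS = ("weights",)


def missing_entries(model_dir: Path) -> list[str]:
    """Return the expected files and directories that are absent from model_dir."""
    missing = [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (model_dir / name).is_dir()]
    return missing


if __name__ == "__main__":
    # Assumed location of an extracted model; Model_Name is a placeholder.
    model_dir = Path.home() / ".neurobrix" / "cache" / "Model_Name"
    print(missing_entries(model_dir) or "layout looks complete")
```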

**What makes NBX different:**

- **Framework-independent** — the graph is interpreted at runtime without depending on PyTorch, TensorFlow, or any other framework's model code
- **Self-describing** — the container carries everything needed to execute
- **Modality-agnostic** — the same format works for diffusion, LLMs, MoE, audio, video, and any future architecture
- **Deterministic** — the execution graph is fully resolved at build time

---

## Prism: Automatic Hardware Allocation

You describe your hardware, or let NeuroBrix detect it, and NeuroBrix figures out the rest. Because hardware is auto-detected, the `--hardware` flag is optional.

| Strategy | Description |
|----------|-------------|
| `single_gpu` | Model fits entirely in one GPU |
| `single_gpu_lifecycle` | Components loaded/unloaded sequentially |
| `pipeline_parallel` | Per-layer sequential fill across GPUs |
| `block_scatter` | Block-level distribution across GPUs |
| `weight_sharding` | Weight-file distribution across GPUs |
| `lazy_sequential` | Stream components through limited VRAM |
| `zero3` | CPU offload with GPU compute |

**GPU support:** NVIDIA, AMD, Intel, Apple (planned), plus Tenstorrent, Moore Threads, Biren, Iluvatar, Hygon DCU, Cambricon detection.
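
To give a feel for how a strategy from the table might be chosen, here is a deliberately simplified decision rule. It is a toy under stated assumptions, not the Prism solver: this README does not describe Prism's actual logic, the function `pick_strategy` is hypothetical, it only covers four of the seven strategies, and it ignores activation memory, dtype, and interconnect bandwidth.

```python
def pick_strategy(component_gb: list[float], gpu_vram_gb: list[float]) -> str:
    """Toy rule mapping model and GPU sizes to a strategy name (NOT Prism itself)."""
    if not gpu_vram_gb:
        raise ValueError("at least one GPU must be described")
    total = sum(component_gb)
    largest_gpu = max(gpu_vram_gb)
    if total <= largest_gpu:
        return "single_gpu"            # whole model fits on one GPU
    if len(gpu_vram_gb) > 1 and total <= sum(gpu_vram_gb):
        return "pipeline_parallel"     # fill GPUs one after another
    if max(component_gb) <= largest_gpu:
        return "single_gpu_lifecycle"  # load and unload components in turn
    return "zero3"                     # weights stay on CPU, compute on GPU


print(pick_strategy([12.0], [24.0]))
print(pick_strategy([20.0, 20.0], [24.0, 24.0]))
print(pick_strategy([20.0, 20.0], [24.0]))
print(pick_strategy([60.0], [24.0]))
```

The four calls walk through the four branches in order, from a model that fits on one card to a single component that is larger than any available GPU.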

---

## Architecture

```
.nbx Container ──> Prism Solver ──> Execution Plan ──> CompiledSequence ──> Output
                   (hardware)       (strategy)         (zero-overhead)
```

The runtime compiles the entire execution graph at load time into a **CompiledSequence**: a zero-overhead execution path with pre-resolved tensor slots, automatic mixed precision, direct SDPA calls, and an integer-indexed memory arena. There are no dict lookups per step and no interpretation overhead.
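
The idea of pre-resolved slots can be illustrated in a few lines. The sketch below is not NeuroBrix's CompiledSequence: the node format and the name `compile_sequence` are assumptions made for illustration, and plain Python numbers stand in for tensors. It shows only the core technique, which is to resolve every name to an integer index once so that the execution loop touches nothing but a flat list.

```python
import operator
from typing import Callable, Sequence

# Hypothetical node format: (output name, function, input names).
Node = tuple[str, Callable, Sequence[str]]


def compile_sequence(nodes: Sequence[Node], inputs: Sequence[str]) -> Callable:
    """Resolve all names to integer slots once, then return a fast runner."""
    slot_of = {name: i for i, name in enumerate(inputs)}
    steps = []
    for out, fn, args in nodes:
        if out in slot_of:
            raise ValueError(f"name {out!r} is defined twice")
        arg_slots = tuple(slot_of[a] for a in args)  # KeyError if undefined
        slot_of[out] = len(slot_of)
        steps.append((fn, arg_slots, slot_of[out]))
    n_slots, n_inputs = len(slot_of), len(inputs)

    def run(*values):
        if len(values) != n_inputs:
            raise TypeError(f"expected {n_inputs} inputs, got {len(values)}")
        arena = [None] * n_slots  # flat, integer-indexed storage
        arena[:n_inputs] = values
        for fn, arg_slots, out_slot in steps:  # no name lookups in this loop
            arena[out_slot] = fn(*(arena[i] for i in arg_slots))
        return arena[-1]  # output of the last node

    return run


run = compile_sequence(
    [("sum", operator.add, ("x", "y")), ("out", operator.mul, ("sum", "x"))],
    inputs=("x", "y"),
)
print(run(2, 3))  # (2 + 3) * 2 = 10
```

All dictionary work happens inside `compile_sequence`; `run` only indexes into `arena`, which is the property the paragraph above describes.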

### ZERO Principles

| Principle | What It Means |
|-----------|---------------|
| **ZERO HARDCODE** | All values derived from the NBX container. Nothing hardcoded in the engine. |
| **ZERO FALLBACK** | System crashes explicitly if data is missing. No silent defaults. |
| **ZERO SEMANTIC** | Runtime has no domain knowledge. Only tensors and execution plans. |
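
As a small illustration of the ZERO FALLBACK row, the sketch below reads a required field and raises instead of substituting a default. The helper `require` and the field name `steps` are hypothetical; the actual contents of `defaults.json` are not documented in this README.

```python
import json


def require(config: dict, key: str):
    """Return config[key], or fail loudly instead of substituting a default."""
    if key not in config:
        raise KeyError(f"required field {key!r} is missing; no default is assumed")
    return config[key]


# Hypothetical defaults.json content; the field name "steps" is illustrative.
defaults = json.loads('{"steps": 20}')
print(require(defaults, "steps"))
# require(defaults, "cfg") would raise KeyError rather than pick a value.
```

The contrast is with `config.get(key, some_default)`, which would hide a malformed container until the output looked wrong.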

---

## Roadmap

### Done

- [x] **CompiledSequence** — zero-overhead graph execution engine
- [x] **Prism solver** — automatic multi-GPU hardware allocation (7 strategies)
- [x] **Image family** — 6 diffusion models (PixArt, Sana, Flex, Janus)
- [x] **LLM family** — MoE (DeepSeek, Qwen3), dense (TinyLlama)
- [x] **Audio family** — 11 models, 5 flow handlers (STT + TTS)
- [x] **Video family** — SANA-Video 720p (first of 10 planned)
- [x] **Cross-platform** — Linux, Windows, macOS support
- [x] **Hardware auto-detection** — 10 GPU vendors, CPU-only fallback
- [x] **Persistent serving** — warm daemon with chat interface
- [x] **DtypeEngine** — automatic mixed precision (AMP)
- [x] **TilingEngine** — universal spatial tiling for large inputs
- [x] **NBX Hub** — model registry at neurobrix.es

### Next

- [ ] **Video family expansion** — remaining 9 models (Wan2.1, CogVideoX, Allegro, Mochi, Open-Sora)
- [ ] **Vision-Language Models** — multimodal understanding at scale
- [ ] **Quantization** — INT8/INT4 with NBX-native support
- [ ] **Apple Silicon** — Metal/MPS backend
- [ ] **Upscalers** — super-resolution models
- [ ] **3D generation** — mesh and NeRF models
- [ ] **Embeddings** — text and image embedding models
- [ ] **NeuroBrix Studio** — desktop GUI for model management

---

## CLI Reference

```bash
# Serving (recommended) — hardware auto-detected
neurobrix serve --model <name>
neurobrix chat [--temperature T] [--max-tokens N]
neurobrix run --prompt <text> [--output file] [--steps N] [--cfg F] [--seed N]
neurobrix stop

# Single-shot — hardware auto-detected
neurobrix run --model <name> --prompt <text> [options]

# Model management
neurobrix hub [--category IMAGE|LLM|AUDIO|VIDEO]
neurobrix import <org/name> [--no-keep] [--force]
neurobrix list [--store]
neurobrix remove <name> [--store|--all]
neurobrix clean [--store|--cache|--all] [-y]

# Inspection
neurobrix info [--models] [--hardware] [--system]
neurobrix inspect <model.nbx> [--topology] [--weights]
neurobrix validate <model.nbx> [--level deep] [--strict]
```

---

## Contributing

NeuroBrix is open source under the Apache 2.0 license. Contributions are welcome.

See **[CONTRIBUTING.md](https://github.com/NeuroBrix/neurobrix/blob/main/CONTRIBUTING.md)** for guidelines.

---

## Model Licenses & Responsible Use

**NeuroBrix is an inference engine — it does not create, train, or own any AI model.**

All models listed in this repository are the intellectual property of their respective authors. NeuroBrix converts published model weights into the `.nbx` container format for efficient execution. The original model licenses remain in full effect.

**User responsibilities:**

- **Review the license** of each model before downloading or using it
- **Non-commercial models** (e.g., CC-BY-NC-SA-4.0) may not be used for commercial purposes
- **Gated models** on Hugging Face require explicit license acceptance before access
- **Redistribution** of model weights is governed by each model's license, not by NeuroBrix's license
- **You are solely responsible** for ensuring your use complies with the applicable model license

**NeuroBrix Hub (neurobrix.es):**

The NeuroBrix Hub hosts pre-built `.nbx` packages for convenience. These packages contain model weights in their original precision, repackaged in the NBX container format. All models on the hub are sourced from publicly available releases with permissive or open licenses. If you are a model author and believe your work is hosted in violation of your license terms, please contact us at legal@neurobrix.es for immediate removal.

---

## License

**NeuroBrix Engine** — Apache License 2.0

Copyright 2025-2026 Hocine Benkelaya

NeuroBrix is developed by [**WizWorks OÜ**](https://wizworks.io), a property of [**Neural Networks Holding LTD**](https://neuralnetworkholding.com).

The Apache 2.0 license covers the NeuroBrix engine, CLI, runtime, and NBX format tooling. **It does not cover the model weights** executed by the engine — those are governed by their respective licenses as listed in the [Supported Models](#supported-models) section.

See [LICENSE](LICENSE) for the full text.
