Metadata-Version: 2.4
Name: gguf2oom
Version: 0.1.0
Summary: Convert GGUF models to OomLlama's compact OOM format - 2x smaller
Project-URL: Homepage, https://humotica.com
Project-URL: Repository, https://github.com/jaspertvdm/oomllama
Project-URL: Documentation, https://humotica.com/docs/oomllama
Project-URL: Bug Tracker, https://github.com/jaspertvdm/oomllama/issues
Project-URL: Downloads, https://brein.jaspervandemeent.nl/downloads/
Author-email: Humotica AI Lab <ai@humotica.nl>, Jasper van de Meent <jasper@humotica.com>
Maintainer-email: "Root AI (Claude)" <root_idd@humotica.nl>
License: MIT
Keywords: converter,gguf,humotica,llama,llm,oom,oomllama,quantization,qwen
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Rust
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.8
Description-Content-Type: text/markdown

# gguf2oom

> Convert GGUF models to OomLlama's compact OOM format, roughly 2x smaller than GGUF Q4_K

[![PyPI](https://img.shields.io/pypi/v/gguf2oom.svg)](https://pypi.org/project/gguf2oom/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Quick Start

```bash
pip install gguf2oom

# Convert any GGUF to OOM Q2
gguf2oom model.gguf model.oom

# Show GGUF file info
gguf2oom --info model.gguf
```
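To give a feel for the kind of information a GGUF file carries up front, here is a minimal sketch that reads the fixed-size GGUF header (magic, version, tensor count, metadata key/value count) as laid out in GGUF version 2 and later. It is not the implementation behind `gguf2oom --info`, and it does not reproduce that command's output; the function name `read_gguf_header` is hypothetical.

```python
import io
import struct


def read_gguf_header(stream):
    """Read the fixed GGUF header fields from a binary stream (GGUF v2+ layout)."""
    magic = stream.read(4)
    if magic != b"GGUF":
        raise ValueError(f"not a GGUF file (magic={magic!r})")
    # Little-endian: uint32 version, uint64 tensor count, uint64 metadata KV count
    (version,) = struct.unpack("<I", stream.read(4))
    tensor_count, kv_count = struct.unpack("<QQ", stream.read(16))
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


# Demo on an in-memory header; for a real file use open("model.gguf", "rb")
fake = io.BytesIO(b"GGUF" + struct.pack("<IQQ", 3, 291, 24))
header = read_gguf_header(fake)
print(header)
```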

## Why Convert to OOM?

| Format | 32B Model | 70B Model |
|--------|-----------|-----------|
| GGUF Q4_K | ~20 GB | ~40 GB |
| **OOM Q2** | **~10 GB** | **~20 GB** |

The OOM format uses Q2 quantization (2-bit weights) with per-block scale and min values, which gives roughly 2x compression compared with GGUF Q4_K. The sizes in the table are approximate.
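The table figures can be sanity-checked with simple arithmetic. The sketch below estimates file size from parameter count and bits per weight, with an optional per-block metadata term. The block size of 64 and the 4 bytes of scale/min metadata per block are illustrative assumptions only; the actual OOM block layout is not documented here.

```python
def estimate_size_gb(n_params, bits_per_weight, block_size=64, overhead_bytes_per_block=0):
    """Rough model size in GB (10^9 bytes) for a block-quantized format."""
    weight_bytes = n_params * bits_per_weight / 8
    n_blocks = n_params / block_size
    return (weight_bytes + n_blocks * overhead_bytes_per_block) / 1e9


params_32b = 32_000_000_000

# Pure 2-bit weights, no block metadata
print(estimate_size_gb(params_32b, 2))  # 8.0

# Assumed layout: 64 weights per block plus 4 bytes (scale + min) per block
print(estimate_size_gb(params_32b, 2, block_size=64, overhead_bytes_per_block=4))  # 10.0
```

Under these assumptions the per-block metadata accounts for the gap between the raw 8 GB of 2-bit weights and the ~10 GB listed for a 32B model.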

## Usage

```bash
# Basic conversion
gguf2oom input.gguf output.oom

# Show model info without converting
gguf2oom --info input.gguf

# Help
gguf2oom --help
```
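For batch jobs, the CLI can be driven from Python. This is a minimal sketch that relies only on the `gguf2oom input.gguf output.oom` form shown above; the helper names `build_command` and `convert_all` are hypothetical and not part of the package.

```python
import subprocess
from pathlib import Path


def build_command(src, dst=None):
    """Build the argv list for one conversion; dst defaults to src with a .oom suffix."""
    src = Path(src)
    dst = Path(dst) if dst is not None else src.with_suffix(".oom")
    return ["gguf2oom", str(src), str(dst)]


def convert_all(directory):
    """Convert every .gguf file in a directory, stopping on the first failure."""
    for src in sorted(Path(directory).glob("*.gguf")):
        # check=True raises CalledProcessError if gguf2oom exits non-zero
        subprocess.run(build_command(src), check=True)


print(build_command("model.gguf"))  # ['gguf2oom', 'model.gguf', 'model.oom']
```

Calling `convert_all("models")` would then write an `.oom` file next to each `.gguf` file in that directory.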

## How It Works

1. Reads the GGUF file (any quantization: Q4_K, Q8_0, F16, etc.)
2. Dequantizes each tensor to FP32
3. Requantizes each tensor to OOM Q2 format (2 bits per weight)
4. Writes a compact `.oom` file with the `OOML` magic header
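The requantization step can be illustrated with a generic 2-bit scheme that stores a scale and a min per block. The sketch below is an assumption-laden illustration of that idea, not the converter's actual code: the rounding rule, the bit-packing order (four codes per byte, lowest bits first), and the function names are all hypothetical.

```python
def quantize_block_q2(values):
    """Quantize one block of floats to 2-bit codes plus a per-block scale and min."""
    lo, hi = min(values), max(values)
    # 2 bits give 4 levels (codes 0..3), so the range is split into 3 steps
    scale = (hi - lo) / 3 if hi > lo else 1.0
    codes = [min(3, max(0, round((v - lo) / scale))) for v in values]
    # Pack four 2-bit codes into each byte (assumed order: lowest bits first)
    packed = bytearray()
    for i in range(0, len(codes), 4):
        byte = 0
        for j, code in enumerate(codes[i:i + 4]):
            byte |= code << (2 * j)
        packed.append(byte)
    return scale, lo, bytes(packed)


def dequantize_block_q2(scale, lo, packed, n):
    """Recover n approximate floats from packed 2-bit codes."""
    codes = [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n)]
    return [lo + scale * code for code in codes]


block = [0.0, 1.0, 2.0, 3.0, 0.2, 2.9, 1.1, 1.9]
scale, lo, packed = quantize_block_q2(block)
restored = dequantize_block_q2(scale, lo, packed, len(block))
print(len(packed), restored)
```

Eight weights fit in two bytes here, and each restored value lands on the nearest of the four levels between the block's min and max.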

## Use with OomLlama

```bash
# Install both
pip install gguf2oom oomllama

# Convert
gguf2oom humotica-32b.gguf humotica-32b.oom

# Run inference
oomllama generate --model humotica-32b.oom "Hello!"
```

## Platform Support

The converter automatically downloads the prebuilt binary that matches your platform. Current platform status:

- Linux x86_64
- Linux aarch64 (coming soon)
- macOS x86_64 (coming soon)
- macOS arm64 (coming soon)

Binaries are cached in `~/.cache/gguf2oom/`.
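A wrapper like this typically has to identify the host platform before it picks a binary. The following is a hedged sketch of how that detection might look using only the standard library; it is not the package's actual logic. The `SUPPORTED` set mirrors the list above, and `platform_key`, `is_supported`, and `cache_dir` are hypothetical names.

```python
import platform
from pathlib import Path

# Mirrors the platform list above: only Linux x86_64 is available today
SUPPORTED = {("linux", "x86_64")}


def platform_key(system=None, machine=None):
    """Normalize OS and CPU names to a (system, machine) pair."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    system = {"darwin": "macos"}.get(system, system)
    machine = {"amd64": "x86_64"}.get(machine, machine)
    return system, machine


def is_supported(key):
    """Return True if a prebuilt binary is listed for this platform."""
    return key in SUPPORTED


def cache_dir():
    """Cache location stated in this README."""
    return Path.home() / ".cache" / "gguf2oom"


key = platform_key()
print(key, "supported" if is_supported(key) else "not yet supported")
print(cache_dir())
```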

## Links

- [OomLlama](https://pypi.org/project/oomllama/) - Run OOM models
- [GitHub](https://github.com/jaspertvdm/oomllama)
- [HuggingFace Models](https://huggingface.co/jaspervandemeent)

## Credits

- **Converter**: Humotica AI Lab
- **OOM Format**: Gemini IDD & Root AI
- **GGUF Reader**: Inspired by llama.cpp

---

**One Love, One fAmIly** 🦙

Built by Humotica AI Lab
