Metadata-Version: 2.4
Name: aptt
Version: 1.0.1
Summary: Comprehensive deep learning framework with state-of-the-art implementations: GPT, DeepSeek-V3 with MLA/MoE, YOLO, CenterNet, and specialized audio/vision models built on PyTorch Lightning
Author-email: anton feldmann <anton.feldmann@gmail.com>
License: MIT License
        
        Copyright (c) 2025 Anton Feldmann
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Requires-Python: >=3.11
Requires-Dist: complexpytorch>=0.4
Requires-Dist: loguru>=0.7.3
Requires-Dist: matplotlib>=3.10.3
Requires-Dist: mlflow>=2.22.0
Requires-Dist: motmetrics>=1.4.0
Requires-Dist: numpy>=2.3.0
Requires-Dist: opencv-python-headless>=4.11.0.86
Requires-Dist: plotly>=6.1.2
Requires-Dist: psutil>=7.0.0
Requires-Dist: py-cpuinfo>=9.0.0
Requires-Dist: pytorch-lightning>=2.5.1.post0
Requires-Dist: ray[tune]>=2.47.1
Requires-Dist: toml>=0.10.2
Requires-Dist: torchaudio>=2.6.0
Requires-Dist: torchinfo>=1.8.0
Requires-Dist: torchsummary>=1.5.1
Requires-Dist: torchviz>=0.0.3
Requires-Dist: zarr>=3.0.8
Provides-Extra: cpu
Requires-Dist: torch<3.0.0,>=2.6.0; extra == 'cpu'
Requires-Dist: torchvision<0.22.0,>=0.21.0; extra == 'cpu'
Provides-Extra: cu124
Requires-Dist: torch<3.0.0,>=2.6.0; extra == 'cu124'
Requires-Dist: torchvision<0.22.0,>=0.21.0; extra == 'cu124'
Provides-Extra: dev
Requires-Dist: black>=25.1.0; extra == 'dev'
Requires-Dist: mypy>=1.15.0; extra == 'dev'
Requires-Dist: pytest>=8.3.5; extra == 'dev'
Requires-Dist: ruff>=0.11.11; extra == 'dev'
Requires-Dist: sphinx>=8.2.3; extra == 'dev'
Requires-Dist: types-toml>=0.10.8.20240310; extra == 'dev'
Provides-Extra: doc
Requires-Dist: autodoc-pydantic>=2.2.0; extra == 'doc'
Requires-Dist: furo>=2024.8.6; extra == 'doc'
Requires-Dist: myst-parser>=4.0.1; extra == 'doc'
Requires-Dist: pygraphviz>=1.14; extra == 'doc'
Requires-Dist: recommonmark>=0.7.1; extra == 'doc'
Requires-Dist: sphinx-autodoc-typehints>=3.2.0; extra == 'doc'
Requires-Dist: sphinx>=8.2.3; extra == 'doc'
Description-Content-Type: text/markdown

# APTT – Antons PyTorch Tools

**APTT** (Antons PyTorch Tools) is a comprehensive deep learning framework built on [PyTorch Lightning](https://www.pytorchlightning.ai/)
that provides production-ready implementations of state-of-the-art architectures including transformer language models (GPT, DeepSeek-V3),
object detection (YOLO, CenterNet), and specialized neural networks for vision and audio tasks.

## 🚀 Features

### Language Models & NLP

- ✅ **GPT-2/GPT-3 Architecture**: Full transformer implementation with configurable layers
- ✅ **DeepSeek-V3**: State-of-the-art LLM with Multi-Head Latent Attention (MLA) and Mixture-of-Experts (MoE)
  - Multi-Head Latent Attention with KV-Compression
  - Auxiliary-Loss-Free Load Balancing
  - Multi-Token Prediction (MTP)
  - Rotary Position Embeddings (RoPE)
- ✅ **Text Dataset Loaders**: Support for .txt, .jsonl, pre-tokenized data with sliding window
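
The sliding-window loading mentioned in the last bullet can be illustrated with a few lines of plain Python. This is a minimal sketch of the general technique, not APTT's loader: the function name `sliding_windows` and the `stride` parameter are hypothetical and chosen only for illustration.

```python
def sliding_windows(token_ids, max_seq_len, stride):
    """Yield (input, target) pairs cut from one long token sequence.

    Each target is its input shifted by one token (next-token prediction).
    Hypothetical helper for illustration; not part of the APTT API.
    """
    # The last usable start index must leave room for max_seq_len + 1 tokens.
    last_start = len(token_ids) - max_seq_len - 1
    for start in range(0, last_start + 1, stride):
        chunk = token_ids[start:start + max_seq_len + 1]
        yield chunk[:-1], chunk[1:]


pairs = list(sliding_windows(list(range(10)), max_seq_len=4, stride=2))
for inputs, targets in pairs:
    print(inputs, "->", targets)
```

A stride smaller than `max_seq_len` produces overlapping windows, which yields more training samples from the same text at the cost of repeated tokens.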

### Computer Vision

- ✅ **Object Detection**: YOLO (v3/v4/v5), CenterNet, EfficientDet
- ✅ **Feature Extractors**: ResNet, DarkNet, EfficientNet, MobileNet, FPN
- ✅ **Tracking**: RNN-based object tracking with ReID

### Audio Processing

- ✅ **Beamforming**: Multi-channel audio processing
- ✅ **Direction of Arrival (DOA)**: Acoustic source localization
- ✅ **Feature Networks**: WaveNet, Complex-valued networks

### Training & Optimization

- 🧠 **Continual Learning**: Built-in knowledge distillation and LwF (Learning without Forgetting)
- 🧩 **Pluggable Callbacks**: TorchScript export, TensorRT optimization, t-SNE visualization
- ⚙️ **Modular Design**: Composable heads, losses, layers, and metrics
- 📊 **Visualization Tools**: Embedding analysis, training metrics, model profiling
- 🗂️ **Flexible Dataset Loaders**: Image, audio, text with augmentation support
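
For readers new to the continual-learning bullet above: knowledge distillation and LwF both penalize the new model for drifting away from the outputs of an older one. The sketch below shows one common formulation (KL divergence on temperature-softened distributions) in plain Python. The function names, the `lam` weight, and the default temperature are assumptions made for illustration; APTT's built-in losses may be defined differently.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def lwf_loss(task_loss, student_old_logits, teacher_old_logits, lam=1.0):
    """LwF-style objective: new-task loss plus a penalty for forgetting."""
    return task_loss + lam * distillation_loss(student_old_logits, teacher_old_logits)


print(distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # identical outputs give zero penalty
print(lwf_loss(0.5, [0.2, 1.5, 0.1], [1.0, 2.0, 3.0]))
```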

## 🛠️ Installation

```bash
# Clone the repository
git clone https://github.com/afeldman/aptt.git
cd aptt

# Create virtual environment (recommended)
python -m venv .venv
source .venv/bin/activate  # or .venv\Scripts\activate on Windows

# Install dependencies: pick ONE of the two hardware extras
uv sync --extra cpu --extra dev     # CPU only
uv sync --extra cu124 --extra dev   # CUDA 12.4

# System Graphviz libraries, needed by pygraphviz when building the docs
sudo apt-get install libgraphviz-dev  # Debian/Ubuntu
brew install graphviz                 # macOS
```

## 🎯 Quick Start

### Language Model Training (DeepSeek-V3)

```python
import pytorch_lightning as pl
from aptt.modules.deepseek import DeepSeekModule
from aptt.lightning_base.dataset import TextDataLoader

# Prepare dataset (`tokenizer` must be created beforehand; it is not defined here)
datamodule = TextDataLoader(
    train_data_path="data/train.txt",
    val_data_path="data/val.txt",
    tokenizer=tokenizer,
    max_seq_len=512,
    batch_size=32,
    return_mtp=True,  # Enable Multi-Token Prediction
)

# Create model
model = DeepSeekModule(
    vocab_size=50000,
    d_model=2048,
    n_layers=24,
    n_heads=16,
    use_moe=True,           # Enable Mixture-of-Experts
    use_mtp=True,           # Enable Multi-Token Prediction
    n_routed_experts=256,
    n_expert_per_token=8,
)

# Train
trainer = pl.Trainer(max_steps=100000, accelerator="gpu")
trainer.fit(model, datamodule)
```

### Object Detection (YOLO)

```python
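import pytorch_lightning as pl
from aptt.modules.yolo import YOLOModule

model = YOLOModule(
    num_classes=80,
    model_size="yolov5s",
    pretrained=True,
)

# `datamodule` is your own LightningDataModule for detection data (not shown)
trainer = pl.Trainer(max_epochs=100, accelerator="gpu")
trainer.fit(model, datamodule)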
```

## 📚 Documentation

### Core Modules

#### Language Models

- **[LLM Modules](docs/llm_modules.md)**: GPT and DeepSeek-V3 architecture documentation
- **[LLM Loss & Heads](docs/llm_loss_head.md)**: Language modeling losses and output heads
- **[Mixture-of-Experts](docs/moe.md)**: DeepSeek-V3 MoE implementation
- **[Text Datasets](docs/text_dataset.md)**: Text data loading and preprocessing

#### Computer Vision

- Detection models (YOLO, CenterNet, EfficientDet)
- Feature extractors (ResNet, DarkNet, EfficientNet, FPN)
- Object tracking systems

#### Audio Processing

- Beamforming algorithms
- DOA estimation
- Complex-valued neural networks

### Examples

```bash
# Language Models
python examples/llm_modules_example.py      # GPT & DeepSeek-V3
python examples/llm_loss_head_example.py    # Loss functions & heads
python examples/moe_example.py              # Mixture-of-Experts
python examples/text_dataset_simple.py      # Text data loading

# View all examples
ls examples/
```

### Build Documentation Locally

```bash
cd docs
make html
# Open docs/_build/html/index.html
```

## 🏗️ Project Structure

```text
aptt/
├── src/aptt/                      # Core source code
│   ├── callbacks/                 # Training callbacks (TensorRT, t-SNE, etc.)
│   ├── heads/                     # Output heads (classification, detection, LM)
│   ├── layers/                    # Neural network layers
│   │   ├── attention/             # Attention mechanisms (MLA, RoPE, KV-Compression)
│   │   └── moe.py                 # Mixture-of-Experts
│   ├── lightning_base/            # Lightning modules and utilities
│   │   └── dataset/               # Dataset loaders (image, audio, text)
│   ├── loss/                      # Loss functions
│   ├── metric/                    # Evaluation metrics
│   ├── model/                     # Model architectures
│   │   ├── beamforming/           # Audio beamforming
│   │   └── detection/             # Object detection
│   ├── modules/                   # Lightning modules
│   │   ├── deepseek.py            # DeepSeek-V3 module
│   │   ├── gpt.py                 # GPT module
│   │   ├── yolo.py                # YOLO module
│   │   └── ...
│   └── utils/                     # Utility functions
├── examples/                      # Usage examples
│   ├── llm_modules_example.py     # Language model examples
│   ├── moe_example.py             # MoE examples
│   └── text_dataset_simple.py     # Dataset examples
├── tests/                         # Unit tests
├── docs/                          # Sphinx documentation
│   ├── llm_modules.md             # LLM documentation
│   ├── moe.md                     # MoE documentation
│   └── text_dataset.md            # Dataset documentation
├── pyproject.toml                 # Project configuration
├── README.md                      # This file
└── LICENSE                        # MIT License
```

## 🎓 Key Concepts

### Multi-Head Latent Attention (MLA)

DeepSeek-V3's efficient attention mechanism with low-rank KV-compression:

```python
from aptt.layers.attention.mla import MultiHeadLatentAttention

attention = MultiHeadLatentAttention(
    d=2048,     # Model dimension
    n_h=16,     # Number of heads
    d_h_c=256,  # Compressed KV dimension
    d_h_r=64,   # Per-head RoPE dimension
)
```
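
To see why the compression matters, the per-token KV-cache size can be estimated with simple arithmetic. The sketch below follows the cache accounting described in the DeepSeek-V2/V3 papers, where only the compressed KV vector and one RoPE key shared across heads are cached. It assumes a head dimension of `d / n_h`, which is not stated above, and the helper name is hypothetical; APTT's actual cache layout may differ.

```python
def kv_cache_elements_per_token(d, n_h, d_h_c, d_h_r):
    """Return (standard MHA, latent attention) cache sizes per token and layer.

    Assumes head dimension d // n_h and a RoPE key shared across heads.
    Hypothetical helper for illustration; not part of the APTT API.
    """
    d_head = d // n_h
    standard = 2 * n_h * d_head  # full keys and values for every head
    latent = d_h_c + d_h_r       # one compressed KV vector + one shared RoPE key
    return standard, latent


standard, latent = kv_cache_elements_per_token(d=2048, n_h=16, d_h_c=256, d_h_r=64)
print(standard, latent, standard / latent)  # 4096 320 12.8
```

Under these assumptions the configuration above caches roughly 13 times fewer values per token than standard multi-head attention.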

### Mixture-of-Experts (MoE)

Sparse expert activation with auxiliary-loss-free load balancing:

```python
from aptt.layers.moe import DeepSeekMoE

moe = DeepSeekMoE(
    d_model=2048,
    n_shared_experts=1,    # Always active
    n_routed_experts=256,  # Selectively activated
    n_expert_per_token=8,  # Top-K experts per token
)
```
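
The idea behind auxiliary-loss-free load balancing, as described in the DeepSeek-V3 technical report, is to add a per-expert bias to the routing scores when selecting the top-K experts, while computing the gate weights from the unbiased scores. After each step the bias of overloaded experts is lowered and that of underloaded experts is raised. The plain-Python sketch below illustrates this for a single token; the function names and the `gamma` step size are hypothetical, and `DeepSeekMoE` may implement the details differently.

```python
import math


def route_token(logits, bias, k):
    """Select top-k experts by (score + bias); weight them by the unbiased score."""
    scores = [1.0 / (1.0 + math.exp(-z)) for z in logits]  # sigmoid affinities
    ranked = sorted(range(len(scores)), key=lambda i: scores[i] + bias[i], reverse=True)
    chosen = ranked[:k]
    total = sum(scores[i] for i in chosen)
    return {i: scores[i] / total for i in chosen}  # gate weights sum to 1


def update_bias(bias, load, gamma=0.001):
    """Lower the bias of overloaded experts and raise it for underloaded ones."""
    mean_load = sum(load) / len(load)
    new_bias = []
    for b, n in zip(bias, load):
        if n > mean_load:
            new_bias.append(b - gamma)
        elif n < mean_load:
            new_bias.append(b + gamma)
        else:
            new_bias.append(b)
    return new_bias


logits = [2.0, 1.0, 0.5, -1.0]
print(sorted(route_token(logits, [0.0, 0.0, 0.0, 0.0], k=2)))   # experts 0 and 1
print(sorted(route_token(logits, [-1.0, 0.0, 0.0, 0.0], k=2)))  # bias pushes expert 0 out
```

Because the bias only affects which experts are selected, no extra loss term competes with the language-modeling objective.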

### Multi-Token Prediction (MTP)

Predict multiple future tokens simultaneously:

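```python
from aptt.modules.deepseek import DeepSeekModule
# TextDataset is assumed to be importable from the same package as TextDataLoader
from aptt.lightning_base.dataset import TextDataset

# Dataset with MTP targets (`tokenizer` must be created beforehand)
dataset = TextDataset(
    data_path="train.txt",
    tokenizer=tokenizer,
    return_mtp=True,
    mtp_depth=3,  # Predict 1, 2, 3 tokens ahead
)

# Model with MTP loss
model = DeepSeekModule(
    vocab_size=50000,
    use_mtp=True,
    mtp_lambda=0.3,  # MTP loss weight
)
```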
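
What "predicting 1, 2, 3 tokens ahead" means for the training targets can be shown without any framework. The sketch below builds one shifted target sequence per prediction depth and combines the losses as a main loss plus a weighted average of the MTP losses, following the formulation in the DeepSeek-V3 technical report. The helper names are hypothetical, and the way APTT's dataset and module arrange these tensors internally is an assumption.

```python
def mtp_targets(tokens, depth):
    """Return inputs plus one target list per depth k, shifted k tokens ahead."""
    n = len(tokens) - depth  # positions that have all `depth` future tokens
    inputs = tokens[:n]
    targets = [tokens[k:k + n] for k in range(1, depth + 1)]
    return inputs, targets


def total_loss(main_loss, mtp_losses, mtp_lambda):
    """Main next-token loss plus lambda times the mean of the MTP losses."""
    return main_loss + mtp_lambda * sum(mtp_losses) / len(mtp_losses)


inputs, targets = mtp_targets([10, 11, 12, 13, 14, 15], depth=3)
print(inputs)   # [10, 11, 12]
print(targets)  # [[11, 12, 13], [12, 13, 14], [13, 14, 15]]
print(total_loss(2.0, [3.0, 4.0, 5.0], mtp_lambda=0.3))
```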

## 📊 Model Zoo

### Language Models

| Model          | Parameters | Config                                            | Notes          |
| -------------- | ---------- | ------------------------------------------------- | -------------- |
| GPT-Small      | 124M       | `d_model=768, n_layers=12`                        | GPT-2 baseline |
| DeepSeek-Small | 51M        | `d_model=512, n_layers=4, use_moe=True`           | Demo config    |
| DeepSeek-Base  | 1.3B       | `d_model=2048, n_layers=24, n_routed_experts=256` | Production     |
| DeepSeek-V3    | 685B       | `d_model=7168, n_layers=60, n_routed_experts=256` | Full scale     |
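
The 124M figure for GPT-Small can be sanity-checked by counting parameters by hand. The sketch below assumes the standard GPT-2 layout (4x MLP expansion, biases on all linear layers, learned position embeddings, output projection tied to the token embedding) with a vocabulary of 50,257 tokens and a context length of 1,024. None of these values is stated in the table, so treat them as assumptions; APTT's GPT module may differ.

```python
def gpt2_param_count(d_model, n_layers, vocab_size=50257, n_ctx=1024):
    """Count parameters of a GPT-2 style decoder with tied output embeddings."""
    # Attention: fused QKV projection plus output projection (weights + biases)
    attn = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # MLP: expand to 4 * d_model, then project back (weights + biases)
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a shift vector
    norms = 2 * 2 * d_model
    per_layer = attn + mlp + norms
    embeddings = vocab_size * d_model + n_ctx * d_model
    final_norm = 2 * d_model
    return n_layers * per_layer + embeddings + final_norm


print(gpt2_param_count(d_model=768, n_layers=12))  # 124439808
```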

### Object Detection

| Model     | Backbone   | mAP  | FPS |
| --------- | ---------- | ---- | --- |
| YOLOv5s   | CSPDarknet | 37.4 | 140 |
| YOLOv5m   | CSPDarknet | 45.4 | 100 |
| CenterNet | ResNet-50  | 42.1 | 45  |

## 🧪 Testing

```bash
# Run all tests
pytest

# Run specific test
pytest tests/test_tensor_rt_export_callback.py

# With coverage
pytest --cov=aptt
```

## 🛠️ Development

### Code Quality

```bash
# Format code
ruff format .

# Lint
ruff check .

# Type checking
mypy src/aptt
```

### Pre-commit Hooks

```bash
# Install pre-commit
pip install pre-commit

# Setup hooks
pre-commit install

# Run manually
pre-commit run --all-files
```

## 📖 Citation

If you use APTT in your research, please cite:

```bibtex
@software{aptt2025,
  title  = {APTT: Antons PyTorch Tools},
  author = {Anton Feldmann},
  year   = {2025},
  url    = {https://github.com/afeldman/aptt}
}
```

For DeepSeek-V3:

```bibtex
@article{deepseekai2024deepseekv3,
  title   = {DeepSeek-V3 Technical Report},
  author  = {DeepSeek-AI},
  journal = {arXiv preprint arXiv:2412.19437},
  year    = {2024}
}
```

## 🤝 Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

Please ensure:

- Code follows the style guide (Ruff + MyPy)
- Tests pass (`pytest`)
- Documentation is updated

## 📝 License

This project is licensed under the MIT License – see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- [PyTorch Lightning](https://www.pytorchlightning.ai/) for the training framework
- [DeepSeek-AI](https://github.com/deepseek-ai) for the DeepSeek-V3 architecture
- [Ultralytics](https://github.com/ultralytics/yolov5) for YOLO implementations
- The open-source community for various model implementations

## 📧 Contact

Anton Feldmann - anton.feldmann@gmail.com

Project Link: [https://github.com/afeldman/aptt](https://github.com/afeldman/aptt)

---

**Version:** 1.0.1 | **Python:** >=3.11 | **PyTorch:** >=2.6.0 | **Lightning:** >=2.5.1
