Metadata-Version: 2.4
Name: badas
Version: 1.0.0
Summary: Video Collision Anticipation Inference
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: torch>=2.0.0
Requires-Dist: transformers>=4.30.0
Requires-Dist: opencv-python>=4.8.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: PyYAML>=6.0
Requires-Dist: huggingface_hub>=0.16.0
Provides-Extra: training
Requires-Dist: pytorch-lightning>=2.0.0; extra == "training"
Requires-Dist: torchmetrics>=1.0.0; extra == "training"
Requires-Dist: albumentations>=1.3.0; extra == "training"
Requires-Dist: pandas>=2.0.0; extra == "training"
Requires-Dist: wandb>=0.15.0; extra == "training"
Requires-Dist: tqdm>=4.65.0; extra == "training"
Requires-Dist: boto3>=1.28.0; extra == "training"
Requires-Dist: scikit-learn>=1.3.0; extra == "training"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.0.280; extra == "dev"

# BADAS - Video Collision Anticipation Framework

[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![PyTorch 2.0+](https://img.shields.io/badge/pytorch-2.0+-ee4c2c.svg)](https://pytorch.org/)
[![License](https://img.shields.io/badge/license-proprietary-red.svg)](LICENSE)

**BADAS** (Batch Anomaly Detection and Anticipation System) is a production-ready deep learning framework for predicting collision likelihood in dashcam video sequences. It is built on PyTorch, uses PyTorch Lightning for training only (the inference package has no Lightning dependency), and supports multiple vision transformer backbones and deployment formats optimized for real-time inference.

## Key Features

- **Multiple Backbone Support**: V-JEPA2 (ViT-L/H/G) and VideoMAE-v2
- **Production Deployment**: PyTorch, ONNX, and TensorRT backends
- **Knowledge Distillation**: Train smaller, faster models from larger teachers
- **Real-time Streaming**: Causal inference with temporal smoothing
- **GPU-Optimized**: FP16 inference, torch.compile, incremental preprocessing

## Performance

| Metric | Value |
|--------|-------|
| Inference Latency | ~42ms per prediction (TensorRT) |
| Model Size | ~624 MB (FP16) |
| Memory Usage | ~1.3 GB peak (FP16) |

## Installation

```bash
# Clone the repository
git clone https://github.com/getnexar/badas.git
cd badas

# Create conda environment
conda create -n badas python=3.10
conda activate badas

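# Install package (inference dependencies only)
pip install -e .

# Optional: Install training or development extras
pip install -e ".[training]"
pip install -e ".[dev]"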

# Optional: Install inference backends
pip install onnxruntime-gpu  # ONNX support
pip install tensorrt         # TensorRT support (requires CUDA)
pip install decord           # Fast video decoding
```

## Quick Start

### Inference

```python
from badas.inference import BADAS, BADASConfig, SmoothingConfig

# Load model (auto-detects backend from extension)
predictor = BADAS("path/to/model.ckpt")

# Batch prediction on video
results = predictor.predict_video("dashcam.mp4", stride=4, verbose=True)
for r in results:
    print(f"[{r['timestamp']:.2f}s] Collision probability: {r['probability']:.3f}")

# Streaming inference (real-time compatible)
for pred in predictor.predict_stream("dashcam.mp4", stride=4):
    if pred['probability'] > 0.7:
        print(f"WARNING: High collision risk at {pred['timestamp']:.2f}s")
```

### Training

```bash
# Standard training
python -m badas.training.lightning_training --config configs/badas-1.5-training.yml

# Knowledge distillation (V-JEPA2 -> VideoMAE-v2 Base)
python -m badas.training.distillation --config configs/distillation_videomaev2.yml
```

### Data Preprocessing

```bash
# Convert raw videos to training clips
python badas/training/video_process.py \
    --csv-path /path/to/metadata.csv \
    --output-dir /path/to/clips \
    --target-fps 8 \
    --window-size 16 \
    --img-width 256 \
    --img-height 256 \
    --num-workers 8
```

## Project Structure

```
badas/
├── badas/
│   ├── core/                    # Shared model components
│   │   ├── base.py              # Abstract video classifier base
│   │   ├── vjepa2.py            # V-JEPA2 backbone implementation
│   │   ├── videomae.py          # VideoMAE-v2 backbone
│   │   ├── modules.py           # Heads & temporal processors
│   │   ├── preprocessing.py     # Shared preprocessing pipeline
│   │   └── registry.py          # Model factory & registration
│   │
│   ├── training/                # PyTorch Lightning training
│   │   ├── lightning_training.py    # Main training entry point
│   │   ├── distillation.py          # Knowledge distillation
│   │   ├── lightning_module.py      # Training logic & losses
│   │   ├── lightning_data_module.py # Data loading
│   │   ├── dataset.py               # Video dataset class
│   │   ├── config.py                # Configuration system
│   │   └── video_process.py         # Data preprocessing
│   │
│   └── inference/               # Production inference (no Lightning)
│       ├── inference.py         # Main BADAS predictor class
│       ├── backends.py          # PyTorch/ONNX/TensorRT backends
│       ├── smoothing.py         # Temporal smoothing
│       ├── preprocessors.py     # GPU preprocessing
│       └── frame_sampling.py    # Video frame extraction
│
├── configs/                     # Training configurations
├── scripts/                     # Utility scripts
└── pyproject.toml
```

## Supported Models

### V-JEPA2 (Recommended)

| Variant | Hidden Dim | Layers | Parameters |
|---------|------------|--------|------------|
| ViT-L   | 1024       | 24     | ~300M      |
| ViT-H   | 1280       | 32     | ~700M      |
| ViT-G   | 1408       | 40     | ~1B        |

```yaml
model:
  model_name: "facebook/vjepa2-vitl-fpc16-256-ssv2"
  use_future_prediction: true  # Leverage V-JEPA2's predictor
```

### VideoMAE-v2

Lightweight alternative (~86M parameters) for resource-constrained deployment.

```yaml
model:
  model_name: "MCG-NJU/videomae-base"
```

## Configuration

### Training Configuration

```yaml
# configs/example.yml
model:
  model_name: "facebook/vjepa2-vitl-fpc16-256-ssv2"
  num_classes: 2
  freeze_backbone: false
  temporal_method: "attention"  # mean, max, lstm, attention, probe
  head_type: "mlp"              # linear, mlp, attention

data:
  data_root: "/path/to/dataset"
  csv_path: "/path/to/metadata.csv"
  frame_count: 16
  img_size: 256
  dataset_fps: 8
  oversample: true
  oversample_ratio: 1.0  # 1:1 class balance

training:
  batch_size: 24
  learning_rate: 1.01e-5
  weight_decay: 1.0e-4
  max_steps: 100000
  loss_type: "bce"  # bce, focal, negative_margin
  early_stopping_metric: "val_ap"
  early_stopping_patience: 10

system:
  precision: "16-mixed"
  devices: -1  # All available GPUs

logging:
  use_wandb: true
  wandb_project: "badas"
```

### Distillation Configuration

Distill knowledge from a V-JEPA2 teacher into a VideoMAE-v2 Base student:

| Teacher | Student | Compression |
|---------|---------|-------------|
| V-JEPA2 ViT-L (~300M) | VideoMAE-v2 Base (~86M) | ~3.5x smaller |
| V-JEPA2 ViT-H (~700M) | VideoMAE-v2 Base (~86M) | ~8x smaller |
| V-JEPA2 ViT-G (~1B) | VideoMAE-v2 Base (~86M) | ~12x smaller |

```yaml
distillation:
  teacher_model_name: "facebook/vjepa2-vitl-fpc16-256-ssv2"  # ~300M params
  student_model_name: "MCG-NJU/videomae-base"  # ~86M params
  temperature: 4.0
  alpha_hard: 0.5      # Ground truth loss weight
  alpha_logit: 0.3     # Soft label distillation
  alpha_feature: 0.15  # Feature matching
  alpha_attention: 0.05  # Attention transfer
```
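
The four `alpha_*` weights and `temperature` above suggest a weighted sum of loss terms. The following is a minimal, framework-free sketch of how such a combination is commonly computed. The function names are hypothetical, and the `T^2` scaling of the soft-label term follows the usual knowledge-distillation convention; whether BADAS applies it is an assumption.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2 (assumed)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return kl * temperature ** 2


def total_loss(hard, logit, feature, attention,
               alpha_hard=0.5, alpha_logit=0.3,
               alpha_feature=0.15, alpha_attention=0.05):
    """Weighted sum of the four loss terms named in the config above."""
    return (alpha_hard * hard + alpha_logit * logit
            + alpha_feature * feature + alpha_attention * attention)


# Hypothetical per-batch values for the hard, feature, and attention terms.
logit_term = soft_label_loss([0.2, 1.5], [-0.5, 2.0])
print(total_loss(hard=0.40, logit=logit_term, feature=0.10, attention=0.05))
```

The default weights in the config sum to 1.0, so the total is a convex combination of the four terms.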

## Dataset Format

BADAS uses a flat directory structure with CSV metadata:

```
dataset/
├── clips/
│   ├── video1_win000_l0.mp4
│   ├── video1_win001_l1.mp4
│   └── ...
└── clips_metadata.csv
```

### CSV Format

```csv
path,label,split,source
clips/video1_win000_l0.mp4,0,train,video1
clips/video1_win001_l1.mp4,1,train,video1
clips/video2_win000_l0.mp4,0,val,video2
```

| Column | Description |
|--------|-------------|
| `path` | Relative path to clip file |
| `label` | 0 (safe) or 1 (collision) |
| `split` | train, val, or test |
| `source` | Original video identifier (optional) |
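
Before training, it can help to check that a metadata file matches the columns described above. This is a minimal sketch using only the standard library; `load_metadata` is a hypothetical helper, not part of the BADAS package.

```python
import csv
import io
from collections import Counter

REQUIRED_COLUMNS = {"path", "label", "split"}  # `source` is optional
VALID_SPLITS = {"train", "val", "test"}


def load_metadata(csv_file):
    """Parse clip metadata rows and validate the label and split values."""
    reader = csv.DictReader(csv_file)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    rows = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        if row["label"] not in {"0", "1"}:
            raise ValueError(f"Line {line_no}: label must be 0 or 1")
        if row["split"] not in VALID_SPLITS:
            raise ValueError(f"Line {line_no}: unknown split {row['split']!r}")
        rows.append({**row, "label": int(row["label"])})
    return rows


sample = io.StringIO(
    "path,label,split,source\n"
    "clips/video1_win000_l0.mp4,0,train,video1\n"
    "clips/video1_win001_l1.mp4,1,train,video1\n"
    "clips/video2_win000_l0.mp4,0,val,video2\n"
)
rows = load_metadata(sample)
counts = Counter((r["split"], r["label"]) for r in rows)
print(counts)  # number of clips per (split, label) pair
```

For a real dataset, pass an open file handle (`open(path, newline="")`) instead of the in-memory sample.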

## Inference Backends

### PyTorch (Default)

```python
from badas.inference import BADAS
predictor = BADAS("model.ckpt")  # Full model with config
```

- Supports `torch.compile` optimization
- Full precision or FP16 inference
- Largest file size but most flexible

### ONNX

```python
from badas.inference import BADAS
predictor = BADAS("model.onnx")  # Cross-platform
```

- CPU and GPU support via ONNX Runtime
- Smaller file size
- Cross-platform compatibility

### TensorRT

```python
from badas.inference import BADAS
predictor = BADAS("model.trt")  # Maximum performance
```

- Lowest latency (~42ms per prediction)
- NVIDIA GPU required
- Best for production deployment
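
The predictor picks a backend from the model file's extension. A minimal sketch of that kind of dispatch is shown here, assuming only the three extensions used in this README; `detect_backend` and the mapping are hypothetical and do not reflect the actual code in `backends.py`.

```python
from pathlib import Path

# Hypothetical mapping, based only on the extensions used in this README.
BACKEND_BY_SUFFIX = {
    ".ckpt": "pytorch",
    ".onnx": "onnx",
    ".trt": "tensorrt",
}


def detect_backend(model_path: str) -> str:
    """Return a backend name for a model file, based on its extension."""
    suffix = Path(model_path).suffix.lower()
    try:
        return BACKEND_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"Unsupported model extension: {suffix!r}") from None


for name in ("model.ckpt", "model.onnx", "model.trt"):
    print(name, "->", detect_backend(name))
```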

## Temporal Smoothing

Configure smoothing for stable real-time predictions:

```python
from badas.inference import BADASConfig, SmoothingConfig

config = BADASConfig(
    smoothing=SmoothingConfig(
        enabled=True,
        alpha_rise=0.7,      # Smoothing when increasing
        alpha_fall=0.3,      # Smoothing when decreasing
        spike_threshold=0.3, # Spike detection threshold
        spike_dampening=0.15 # Reduce spike influence
    )
)
```

Presets available:
- `SmoothingConfig.fast_response()` - Quick reaction to changes
- `SmoothingConfig.stable()` - Conservative, fewer false alarms
- `SmoothingConfig.disabled()` - Raw predictions
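
To illustrate what the `alpha_rise`, `alpha_fall`, and spike parameters might control, here is a sketch of asymmetric exponential smoothing. It is one plausible reading of the parameter names, not the algorithm in `smoothing.py`: the class name, the update rule, and in particular the way `spike_dampening` reduces the weight of large jumps are all assumptions.

```python
from typing import Optional


class AsymmetricSmoother:
    """Exponential smoothing with separate rise and fall rates (illustrative only)."""

    def __init__(self, alpha_rise=0.7, alpha_fall=0.3,
                 spike_threshold=0.3, spike_dampening=0.15):
        self.alpha_rise = alpha_rise
        self.alpha_fall = alpha_fall
        self.spike_threshold = spike_threshold
        self.spike_dampening = spike_dampening
        self.state: Optional[float] = None

    def update(self, raw: float) -> float:
        """Blend a new raw probability into the smoothed state and return it."""
        if self.state is None:
            self.state = raw  # first prediction passes through unchanged
            return raw
        delta = raw - self.state
        # React faster to rising risk than to falling risk.
        alpha = self.alpha_rise if delta > 0 else self.alpha_fall
        if abs(delta) > self.spike_threshold:
            # ASSUMPTION: a jump above the threshold gets its weight reduced.
            alpha *= 1.0 - self.spike_dampening
        self.state += alpha * delta
        return self.state


smoother = AsymmetricSmoother()
for raw in [0.2, 0.4, 0.9, 0.5]:
    print(f"raw={raw:.2f} smoothed={smoother.update(raw):.3f}")
```

With `alpha_rise` larger than `alpha_fall`, the smoothed value climbs quickly when risk increases and decays slowly afterwards, which favors early warnings over quick all-clear signals.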

## Training Features

### Loss Functions

- **BCE Loss**: Standard binary cross-entropy
- **Focal Loss**: Down-weights easy examples (gamma=2.0, alpha=0.25)
- **Negative Margin Loss**: Penalizes false positives above margin
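
For reference, the standard focal loss formulation with the listed defaults can be written for a single binary prediction as follows. This is a plain-Python sketch of the published formula, not the code in `lightning_module.py`; the function name and the `eps` clamp are illustrative.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss for predicted probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)            # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t) ** gamma factor shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)  # confident and correct
hard = focal_loss(0.3, 1)  # wrong side of 0.5
print(f"easy={easy:.6f} hard={hard:.6f}")
```

Setting `gamma=0` removes the modulating factor, leaving an alpha-weighted binary cross-entropy.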

### Data Augmentation

Built-in Albumentations pipeline:
- Random brightness/contrast
- Hue/saturation adjustment
- Gaussian blur
- Motion blur
- Weather effects (rain, snow)
- Random shadows

### Class Balancing

Control class distribution via `oversample_ratio`:
- `1.0` - Balanced (1:1 negative:positive)
- `2.0` - More negatives (2:1)
- `0.5` - More positives (1:2)
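
The arithmetic behind these ratios can be sketched as follows, assuming that positives are the minority class and are replicated while negatives are left untouched. `target_positive_count` is a hypothetical helper; the actual sampling logic in the data module may differ.

```python
import math


def target_positive_count(num_negative, num_positive, oversample_ratio=1.0):
    """Positives needed so that negative:positive equals oversample_ratio."""
    if oversample_ratio <= 0:
        raise ValueError("oversample_ratio must be positive")
    target = math.ceil(num_negative / oversample_ratio)
    # ASSUMPTION: existing positives are never dropped, only replicated.
    return max(target, num_positive)


# Hypothetical dataset with 900 negative and 100 positive clips.
for ratio in (1.0, 2.0, 0.5):
    print(ratio, target_positive_count(900, 100, ratio))
```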

## Scripts

### Model Comparison

```bash
python scripts/compare_models.py \
    --models model1.ckpt model2.ckpt \
    --videos test1.mp4 test2.mp4 \
    --output comparison.png
```

### Kaggle Benchmark

```bash
python scripts/benchmark_kaggle.py \
    --model best_model.ckpt \
    --data-dir /path/to/kaggle/test \
    --output submission.csv
```

### Data Migration

```bash
# Convert old nested structure to flat CSV-based format
python scripts/migrate_to_flat_structure.py \
    --data-root /path/to/old_dataset \
    --output-dir /path/to/new_dataset \
    --splits train val test
```

## API Reference

### BADAS Class

```python
from typing import Dict, Generator, List, Optional

import numpy as np


class BADAS:
    def __init__(
        self,
        checkpoint_path: str,
        config: Optional[BADASConfig] = None
    ):
        """Initialize predictor with model checkpoint."""

    def predict_video(
        self,
        video_path: str,
        stride: int = 1,
        verbose: bool = False
    ) -> List[Dict]:
        """Batch prediction on entire video."""

    def predict_stream(
        self,
        video_path: str,
        stride: int = 1
    ) -> Generator[Dict, None, None]:
        """Streaming prediction (real-time compatible)."""

    def predict_frames(
        self,
        frames: List[np.ndarray]
    ) -> Dict:
        """Predict on preprocessed frames."""

    def warmup(self):
        """Compile model and run warmup inference."""
```

### Prediction Output

```python
{
    'frame_index': 100,
    'timestamp': 4.17,
    'probability': 0.823,
    'risk_level': 'high',  # low, medium, high
    'smoothed': True
}
```

## Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `BADAS_DEVICE` | Force device (cuda/cpu) | Auto-detect |
| `BADAS_PRECISION` | Force precision (fp16/fp32) | fp16 |
| `WANDB_API_KEY` | Weights & Biases API key | - |
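
A small sketch of how the two `BADAS_*` overrides could be read and validated is given here. The function name and the `"auto"` placeholder for auto-detection are hypothetical; this does not show how the package itself resolves these variables.

```python
import os

VALID_DEVICES = {"auto", "cuda", "cpu"}  # "auto" is a placeholder for auto-detect
VALID_PRECISIONS = {"fp16", "fp32"}


def resolve_runtime_settings(env=None):
    """Return (device, precision) from the environment, applying the table's defaults."""
    env = os.environ if env is None else env
    device = env.get("BADAS_DEVICE", "auto").lower()
    precision = env.get("BADAS_PRECISION", "fp16").lower()
    if device not in VALID_DEVICES:
        raise ValueError(f"Unsupported BADAS_DEVICE: {device!r}")
    if precision not in VALID_PRECISIONS:
        raise ValueError(f"Unsupported BADAS_PRECISION: {precision!r}")
    return device, precision


print(resolve_runtime_settings({"BADAS_DEVICE": "cpu", "BADAS_PRECISION": "fp32"}))
```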

## Citation

If you use BADAS in your research, please cite:

```bibtex
@software{badas2024,
  title={BADAS: Video Collision Anticipation Framework},
  author={Nexar},
  year={2024},
  url={https://github.com/getnexar/badas}
}
```

## License

Proprietary - Nexar Ltd. All rights reserved.

## Contributing

Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

## Support

For issues and feature requests, please use the [GitHub Issues](https://github.com/getnexar/badas/issues) page.
