Metadata-Version: 2.4
Name: selgis
Version: 0.2.0
Summary: Universal Training Framework for PyTorch and HuggingFace Transformers
Author: Selgis ML
License: Apache 2.0
Keywords: pytorch,transformers,training,lora,peft,deep-learning,llm
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: NOTICE
Requires-Dist: torch>=2.0
Requires-Dist: numpy>=1.20
Requires-Dist: tqdm
Provides-Extra: transformers
Requires-Dist: transformers>=4.30; extra == "transformers"
Requires-Dist: datasets; extra == "transformers"
Requires-Dist: accelerate>=0.21.0; extra == "transformers"
Provides-Extra: peft
Requires-Dist: peft>=0.5.0; extra == "peft"
Provides-Extra: llm
Requires-Dist: bitsandbytes>=0.41.0; extra == "llm"
Requires-Dist: accelerate>=0.21.0; extra == "llm"
Requires-Dist: scipy; extra == "llm"
Provides-Extra: tracking
Requires-Dist: wandb; extra == "tracking"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Provides-Extra: all
Requires-Dist: transformers>=4.30; extra == "all"
Requires-Dist: datasets; extra == "all"
Requires-Dist: accelerate>=0.21.0; extra == "all"
Requires-Dist: peft>=0.5.0; extra == "all"
Requires-Dist: bitsandbytes>=0.41.0; extra == "all"
Requires-Dist: scipy; extra == "all"
Requires-Dist: wandb; extra == "all"
Dynamic: license-file

# 🛡️ Selgis ML

**Autonomous Self-Healing Training Framework for PyTorch & Transformers.**

[![PyPI](https://img.shields.io/pypi/v/selgis?color=blue)](https://pypi.org/project/selgis/)
[![License](https://img.shields.io/badge/license-Apache%202.0-green)](LICENSE)
[![Python](https://img.shields.io/badge/python-3.10%2B-blue)](https://pypi.org/project/selgis/)

**Selgis** (Self-Guided Intelligent Stability) is a library that turns unstable neural network training into a reliable, predictable process. It automatically detects **Loss Spikes**, **NaN/Inf values**, and **plateaus**, applying dynamic weight **Rollback** mechanisms and Learning Rate **Surges** to recover the run.

Selgis is especially effective for **LoRA/QLoRA finetuning of LLMs** (Llama, Qwen, Mistral) on consumer hardware, where standard trainers often crash with `OutOfMemory` errors or degrade due to fp16 instability.

---

## 🔥 Why Selgis?

Have you ever woken up to find that your overnight run crashed with `Loss: NaN` at 80%? Or that the model "forgot" everything it had learned because of one bad batch? Selgis is built to catch these failures and recover from them automatically.

*   **🛡️ Self-Healing Loop:** Automatic rollback to the last stable state upon detecting anomalies (loss spikes / NaN).
*   **🧠 Memory-Safe Architecture:** State preservation logic tracks *only* trainable parameters (`trainable-only`). This allows training **Qwen-4B / Llama-7B** on cards with **8-12 GB VRAM** without OOM during checkpoints.
*   **⚡ Final Surge:** If the model gets stuck on a plateau, Selgis can automatically boost the LR by 5-10x to break through local minima ("defibrillator effect").
*   **📉 Smart Defaults:** Built-in LR Finder and adaptive scheduler presets.
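
The decision logic behind the features above can be illustrated with a small, dependency-free sketch. This is not Selgis's internal code: the class name `StabilityMonitor`, its parameters (`window`, `patience`, `min_delta`), and the returned action strings are all hypothetical, chosen only to show how NaN/Inf checks, spike detection, and plateau detection might map to a rollback or a surge.

```python
import math
from collections import deque


class StabilityMonitor:
    """Toy anomaly detector (hypothetical, not the Selgis implementation)."""

    def __init__(self, spike_threshold=3.0, window=5, patience=3, min_delta=1e-3):
        self.spike_threshold = spike_threshold  # spike = loss > threshold * baseline
        self.recent = deque(maxlen=window)      # recent healthy losses
        self.patience = patience                # stale steps tolerated before a surge
        self.min_delta = min_delta              # minimum improvement that counts
        self.best = math.inf
        self.stale_steps = 0

    def check(self, loss):
        """Return 'rollback', 'surge', or 'ok' for the latest loss value."""
        if not math.isfinite(loss):
            return "rollback"  # NaN or Inf: restore the last stable state
        if self.recent:
            baseline = sum(self.recent) / len(self.recent)
            if loss > self.spike_threshold * baseline:
                return "rollback"  # loss spike relative to the recent average
        self.recent.append(loss)
        if loss < self.best - self.min_delta:
            self.best = loss
            self.stale_steps = 0
            return "ok"
        self.stale_steps += 1
        if self.stale_steps >= self.patience:
            self.stale_steps = 0
            return "surge"  # plateau: temporarily boost the learning rate
        return "ok"


monitor = StabilityMonitor()
losses = [1.0, 0.8, 0.7, 5.0, float("nan"), 0.7, 0.7, 0.7]
actions = [monitor.check(x) for x in losses]
print(actions)
```

In a real loop, a `"rollback"` action would restore the last saved weights and a `"surge"` action would multiply the optimizer's learning rate for a few steps; anomalous losses are deliberately kept out of the baseline window so that one spike does not mask the next.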

---

## 📊 Benchmarks

We tested Selgis on real hardware (NVIDIA Tesla T4, 16 GB) in scenarios that are prone to instability. Here are the results:

| Task | Model | Problem | Selgis Solution | Result |
| :--- | :--- | :--- | :--- | :--- |
| **LLM Finetuning** | **Qwen-2.5-4B** (QLoRA) | OOM on 12GB cards + Loss Spike | Trainable-only state + Rollback | **Memory: 8.2 GB**, Loss < 0.001 |
| **Seq2Seq** | LSTM (1.4M) | Catastrophic Spike (Acc 52% → 44%) | Rollback + Surge | **+7% Accuracy** (Recovered to 59.04%) |
| **NLP** | BERT-base | Instability on small batch (16) | Stable LR Finder | **100.0% Accuracy** (in 3 epochs) |
| **CV** | CNN (MNIST) | Overfitting & micro-spikes | Micro-rollbacks | **99.09%** (Held at generalization peak) |

> *"Selgis doesn't just prevent explosions. It returns training to a productive track."*

---

## 🚀 Installation

```bash
# Base version (PyTorch only)
pip install selgis

# Full version (with Transformers, LoRA, quantization, and WandB support)
pip install "selgis[all]"
```

---

## 🛠️ Quick Start

### 1. Robust LLM Training (Llama / Qwen)

Selgis handles protection while you use the familiar Transformers API. Now with native **BitsAndBytes** quantization support.

```python
from selgis import TransformerTrainer, TransformerConfig

# Configuration with native 4-bit quantization and protection
config = TransformerConfig(
    model_name_or_path="Qwen/Qwen2.5-3B",
    
    # --- Native Quantization (New in v0.2.0) ---
    quantization_type="4bit", 
    bnb_4bit_compute_dtype="bfloat16",
    bnb_4bit_use_double_quant=True,
    
    # --- PEFT / LoRA ---
    use_peft=True,
    peft_config={
        "r": 16, 
        "target_modules": ["q_proj", "v_proj", "k_proj", "o_proj"]
    },
    
    # --- Selgis protection ---
    nan_recovery=True,      # Auto-rollback on NaN/Spike
    state_storage="disk"    # Save RAM (store state on disk)
)

# Start training (Trainer handles model loading and quantization automatically)
trainer = TransformerTrainer(model_or_path=config.model_name_or_path, config=config)
trainer.train() 
# You can go to sleep now. If the loss spikes, Selgis fixes it.
```
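
To make the idea of a trainable-only state stored on disk more concrete, here is a minimal plain-PyTorch sketch. It is an illustration under stated assumptions, not what `state_storage="disk"` does internally: the helpers `trainable_state` and `restore_trainable` are hypothetical names, and a tiny `nn.Sequential` with a frozen first layer stands in for a quantized base model with LoRA adapters.

```python
import os
import tempfile

import torch
from torch import nn


def trainable_state(model):
    """Collect detached CPU copies of only the parameters that require gradients."""
    return {
        name: p.detach().cpu().clone()
        for name, p in model.named_parameters()
        if p.requires_grad
    }


def restore_trainable(model, state):
    """Copy saved values back into the matching parameters, in place."""
    params = dict(model.named_parameters())
    with torch.no_grad():
        for name, tensor in state.items():
            params[name].copy_(tensor)


# Stand-in for a frozen base model (layer 0) with a small trainable adapter (layer 1).
model = nn.Sequential(nn.Linear(64, 64), nn.Linear(64, 4))
for p in model[0].parameters():
    p.requires_grad_(False)

# Snapshot only the trainable weights and keep the copy on disk.
path = os.path.join(tempfile.mkdtemp(), "stable_state.pt")
torch.save(trainable_state(model), path)

# Simulate a bad update that poisons the trainable weights with NaN.
with torch.no_grad():
    model[1].weight.fill_(float("nan"))

# Roll back to the last stable snapshot.
restore_trainable(model, torch.load(path))
saved_keys = sorted(torch.load(path).keys())
print(saved_keys)
```

Because the frozen layer never changes, leaving it out of the snapshot loses nothing, and the file on disk scales with the adapter size rather than with the full model.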

### 2. Standard PyTorch (Any Model)

```python
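import torch
from torch.utils.data import DataLoader, TensorDataset

from selgis import Trainer, SelgisConfig

# Your model
model = torch.nn.Sequential(
    torch.nn.Linear(10, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 2),
)

# Your data (synthetic here: 256 samples, 10 features, 2 classes)
dataset = TensorDataset(torch.randn(256, 10), torch.randint(0, 2, (256,)))
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Config
config = SelgisConfig(
    max_epochs=10,
    lr_finder_enabled=True,  # Auto-find optimal LR before start
    spike_threshold=3.0      # Rollback if loss jumps 3x
)

trainer = Trainer(
    model=model,
    config=config,
    train_dataloader=loader,
    criterion=torch.nn.CrossEntropyLoss()
)
trainer.train()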
```
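
For readers curious about what an automatic LR search can look like, the following is a generic learning-rate range test written in plain PyTorch. It is a sketch of the general technique, not the algorithm behind `lr_finder_enabled`: the function names `lr_range_test` and `suggest_lr`, the sweep bounds, and the "lowest loss divided by 10" heuristic are all assumptions made for illustration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def lr_range_test(model, loader, criterion, lr_min=1e-5, lr_max=1.0, steps=30):
    """Sweep the LR exponentially from lr_min to lr_max, recording the loss."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr_min)
    gamma = (lr_max / lr_min) ** (1 / (steps - 1))  # per-step LR multiplier
    lrs, losses = [], []
    batches = iter(loader)
    for step in range(steps):
        try:
            x, y = next(batches)
        except StopIteration:  # restart the loader when it runs out
            batches = iter(loader)
            x, y = next(batches)
        lr = lr_min * gamma**step
        for group in optimizer.param_groups:
            group["lr"] = lr
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        if not torch.isfinite(loss):
            break  # diverged: stop the sweep
        loss.backward()
        optimizer.step()
        lrs.append(lr)
        losses.append(loss.item())
    return lrs, losses


def suggest_lr(lrs, losses):
    """Heuristic: take the LR at the lowest recorded loss and back off by 10x."""
    best = min(range(len(losses)), key=losses.__getitem__)
    return lrs[best] / 10


torch.manual_seed(0)
data = TensorDataset(torch.randn(256, 10), torch.randint(0, 2, (256,)))
probe_loader = DataLoader(data, batch_size=32, shuffle=True)
probe_model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))

lrs, losses = lr_range_test(probe_model, probe_loader, nn.CrossEntropyLoss())
suggested = suggest_lr(lrs, losses)
print(f"suggested lr: {suggested:.2e}")
```

The sweep modifies the model's weights, so a practical implementation would run it on a copy of the model or restore the initial weights afterwards.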

---

## 💻 CLI (Command Line Interface)

Selgis ships with a handy CLI for diagnostics and quick execution.

| Command | Description |
| :--- | :--- |
| `selgis device` | Check GPU/CUDA/MPS availability and print device info. |
| `selgis train` | Run a minimal demo training on synthetic data (Smoke Test). |
| `selgis train --config <path>` | Run training using a config file (**YAML/JSON supported**). |
| `selgis version` | Print the current library version. |

Example environment check:
```bash
$ selgis device
🚀 Device: cuda
   GPU: NVIDIA Tesla T4
   Memory: 14.75 GB
```
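
A similar check can be done directly from Python with standard PyTorch calls. The sketch below is an approximation of what a device report might gather, not the source of `selgis device`; the function name `describe_device` and the shape of the returned dictionary are hypothetical.

```python
import torch


def describe_device():
    """Pick the best available backend: CUDA first, then Apple MPS, then CPU."""
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        return {
            "device": "cuda",
            "name": props.name,
            "memory_gb": round(props.total_memory / 1024**3, 2),
        }
    mps = getattr(torch.backends, "mps", None)  # absent on very old PyTorch builds
    if mps is not None and mps.is_available():
        return {"device": "mps"}
    return {"device": "cpu"}


info = describe_device()
print(info)
```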

---

## 📚 API Reference

Full technical documentation for `SelgisCore`, `Trainer`, `Callbacks`, and configuration classes is available in [API.md](API.md).

Key components:
*   **SelgisCore**: The brain of the system (protection, rollback, state management).
*   **TransformerTrainer**: Wrapper for the HuggingFace ecosystem with native BitsAndBytes support.
*   **HistoryCallback**: Automatically saves training history to JSON for later analysis.
*   **LRFinder**: Tool for finding the optimal learning rate.
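
As an illustration of the history-to-JSON idea, here is a minimal standard-library sketch. The class `JsonHistory`, its `on_epoch_end` hook, and the record layout are hypothetical; the actual `HistoryCallback` interface and file format are documented in API.md and may differ.

```python
import json
import os
import tempfile


class JsonHistory:
    """Hypothetical stand-in for a history callback; not the Selgis class."""

    def __init__(self, path):
        self.path = path
        self.records = []

    def on_epoch_end(self, epoch, metrics):
        """Append one epoch's metrics and rewrite the file after every epoch."""
        self.records.append({"epoch": epoch, **metrics})
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.records, f, indent=2)


path = os.path.join(tempfile.mkdtemp(), "history.json")
history = JsonHistory(path)
history.on_epoch_end(1, {"loss": 0.92, "lr": 1e-3})
history.on_epoch_end(2, {"loss": 0.61, "lr": 1e-3})

# Later analysis: reload the file and find the best epoch by loss.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)
best = min(loaded, key=lambda r: r["loss"])
print(best)
```

Rewriting the file after each epoch means the history recorded so far survives even if the run is interrupted.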

---

## 📄 License

Licensed under the Apache License 2.0, which permits both commercial and research use. See `LICENSE` and `NOTICE` for details.

**Selgis ML**: Make training boring (in a good way).
