Metadata-Version: 2.4
Name: aion-torch
Version: 0.3.3
Summary: Adaptive Input/Output Normalization for deep neural networks. Enables stable training of extremely deep networks through adaptive residual scaling (Alpha) by Babayev's Theory
Author: Abbasagha Babayev
License-Expression: MIT
Project-URL: Homepage, https://github.com/Croxus-Labs/aion-torch
Project-URL: Repository, https://github.com/Croxus-Labs/aion-torch
Project-URL: Issues, https://github.com/Croxus-Labs/aion-torch/issues
Project-URL: Documentation, https://github.com/Croxus-Labs/aion-torch#readme
Project-URL: Changelog, https://github.com/Croxus-Labs/aion-torch/blob/main/CHANGELOG.md
Keywords: deep-learning,normalization,pytorch,residual-connections
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch>=2.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: pre-commit>=3.0.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Provides-Extra: bench
Requires-Dist: matplotlib>=3.5; extra == "bench"
Requires-Dist: pandas>=1.5; extra == "bench"
Dynamic: license-file

# AION-Torch: Adaptive Input/Output Normalization

[![PyPI version](https://img.shields.io/pypi/v/aion-torch.svg)](https://pypi.org/project/aion-torch/)
[![PyPI downloads](https://img.shields.io/pypi/dm/aion-torch.svg)](https://pypi.org/project/aion-torch/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/release/python-3100/)

**AION-Torch** is a PyTorch library that implements **Adaptive Input/Output Normalization (AION)**, a method for stabilizing deep neural networks. AION automatically adjusts residual connections to prevent vanishing and exploding gradients, enabling stable training of very deep networks with minimal configuration.

---

## 🚀 Features

- **Adaptive Residual Scaling**: Automatically adjusts residual connection strength based on signal statistics
- **Stable Deep Training**: Prevents vanishing/exploding gradients even in networks with 1000+ layers
- **Drop-in Replacement**: Works with any architecture using residual connections (Transformers, ResNets, etc.)
- **Distributed Ready**: Fully supports DDP with synchronized statistics across all GPUs
- **Zero Config**: Sensible defaults work out-of-the-box, no hyperparameter tuning needed

## 📦 Installation

### From PyPI
```bash
pip install aion-torch
```

## ⚡ Quick Start

### 1. The `AionBlock` (Recommended)
The easiest way to use AION is to replace your standard residual blocks with `AionBlock`. It implements the **Pre-LayerNorm** pattern augmented with AION scaling.

```python
import torch
import torch.nn as nn
from aion_torch import AionBlock

# Define your transformation layer (e.g., Attention or MLP)
mlp_layer = nn.Sequential(
    nn.Linear(512, 2048),
    nn.GELU(),
    nn.Linear(2048, 512)
)

# Wrap it in an AionBlock
# Structure: x + alpha * layer(norm(x))
block = AionBlock(layer=mlp_layer, dim=512)

# Forward pass
x = torch.randn(8, 128, 512)
output = block(x)
```

### 2. Low-Level `AionResidual`
For custom architectures, you can use the `AionResidual` adapter directly.

```python
from aion_torch import AionResidual

class MyLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.ffn = nn.Linear(dim, dim)
        # Initialize AION adapter
        self.aion = AionResidual(alpha0=0.1, beta=0.05)

    def forward(self, x):
        residual = x
        x_norm = self.norm(x)
        y = self.ffn(x_norm)

        # Apply adaptive residual connection
        # Formula: x + alpha * y
        return self.aion(residual, y)
```

## 🧠 How It Works

AION adaptively scales residual connections using a simple but effective formula:

$$
\alpha = \frac{\alpha_0}{1 + \beta \cdot \text{ratio}}
$$

where `ratio` measures the relative magnitude of the transformation output compared to the input. When the network becomes unstable (high ratio), AION automatically reduces the scaling factor. When stable (low ratio), it uses a stronger connection.

**Key insight**: By maintaining balanced signal propagation, AION ensures gradients flow stably through arbitrarily deep networks without exponential growth or decay.

### AION as the General Form

Mathematically, other stabilization methods are just **special cases** of the AION formula where adaptivity ($\beta$) is turned off:

| Method | AION Equivalent Parameters | Behavior |
|:---|:---|:---|
| **DeepNorm** | $\beta=0, \alpha_0 = \frac{1}{\sqrt{2L}}$ | Fixed static scaling based on depth |
| **Pre-LN** | $\beta=0, \alpha_0 = 1$ | No scaling (identity) |
| **ReZero** | $\beta=0, \alpha_0 = \text{learnable}$ | Learnable static scalar |
| **AION** | $\beta > 0$ | **Dynamic adaptation** based on signal energy |

AION generalizes these approaches by adding the **control term** ($1 + \beta \cdot \text{ratio}$), allowing it to react to instability in real-time rather than relying on static assumptions.

## 📚 Documentation

For the theoretical foundation and mathematical proofs, see the following documents:

- [Balance Theory](https://drive.google.com/open?id=1Go8uOayDtJykcdOlZ1RQgs3uenwaZmQr&usp=drive_copy) - Core theoretical foundation for AION

These are more general math papers that inspired the ideas, but are **not required** to use the library:

- [Angular Symmetry Goldbach](https://drive.google.com/open?id=1EiPzyXHHuYCfYIBzDY1rqqGz9pDlhEM1&usp=drive_copy)
- [Cone Symmetry Goldbach](https://drive.google.com/open?id=1jPQxw96TO85S07HXl7W7dvGuOTZtPJPE&usp=drive_copy)

## 🤝 Contributing

Contributions are welcome! Please read our [Contributing Guide](CONTRIBUTING.md) (coming soon) and check out the issues.

1.  Fork the repository
2.  Create your feature branch (`git checkout -b feature/amazing-feature`)
3.  Commit your changes (`git commit -m 'Add some amazing feature'`)
4.  Push to the branch (`git push origin feature/amazing-feature`)
5.  Open a Pull Request

## 📜 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

<p align="center">
  <sub>Built with ❤️ for the ML community</sub>
</p>
