Metadata-Version: 2.4
Name: elra
Version: 0.0.2
Summary: ELRA: exponential learning rate adaption - gradient descent optimizer
Project-URL: Repository, https://git.th-wildau.de/alkl9873/adaptive_grad_decent
Author-email: Alexander Kleinsorge <alexander.kleinsorge@th-wildau.de>, Alexander Fauck <alexander.fauck@th-wildau.de>
License-Expression: GPL-3.0-or-later
License-File: LICENSE
Keywords: approximation,gradient descent,numerical,numerical solver,optimizer
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Python: >=3.10
Requires-Dist: torch>=2.0.0
Provides-Extra: dev
Requires-Dist: timm>=1.0.24; extra == 'dev'
Requires-Dist: torchvision>=0.25.0; extra == 'dev'
Provides-Extra: test
Requires-Dist: timm>=1.0.24; extra == 'test'
Requires-Dist: torchvision>=0.25.0; extra == 'test'
Description-Content-Type: text/markdown

# elra - gradient descent solver

Gradient descent solver/optimizer that controls its own hyperparameters internally.

## Overview

This package provides a gradient descent solver/optimizer compatible with `torch.optim.Optimizer` (the PyTorch optimizer interface used by, e.g., Adam or SGD).
Although the original paper describes two variants (C2M and P2M), only P2M is still being developed.

**Reference:** Alexander Fauck and Alexander Kleinsorge,
"ELRA: Exponential learning rate adaption gradient descent optimization method" (2023)

## Installation

For a quick start, you can just install the package via pip:

```bash
pip install elra
```

If you want to use a specific version of PyTorch, install it before installing this package.
See the [PyTorch website](https://pytorch.org/) for more information.

### Requirements

- Python 3.10+
- PyTorch 2.0+ (for vector math and CUDA support)
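
To check an environment against the requirements above, a small script along these lines may help. It is only a sketch: the helper name `environment_report` is made up for this example and is not part of the package, and it uses nothing beyond the Python standard library.

```python
import importlib.metadata
import sys


def environment_report() -> dict:
    """Collect the facts relevant to the requirements listed above."""
    report = {
        # The package metadata asks for Python >= 3.10.
        "python_ok": sys.version_info >= (3, 10),
        # Stays None if PyTorch is not installed in this environment.
        "torch_version": None,
    }
    try:
        report["torch_version"] = importlib.metadata.version("torch")
    except importlib.metadata.PackageNotFoundError:
        pass
    return report


print(environment_report())
```

If PyTorch is installed, `torch.cuda.is_available()` additionally reports whether a CUDA device can be used.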

## Quick Start

### Example 1

This example is adapted from the [PyTorch: optim](https://docs.pytorch.org/tutorials/beginner/pytorch_with_examples.html#pytorch-optim) example in the PyTorch documentation.

```python
import torch
import math

from elra import ElraOptimizer  # import the elra optimizer

# Create Tensors to hold input and outputs.
x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)

# Prepare the input tensor (x, x^2, x^3).
p = torch.tensor([1, 2, 3])
xx = x.unsqueeze(-1).pow(p)

# Use the nn package to define our model and loss function.
model = torch.nn.Sequential(
    torch.nn.Linear(3, 1),
    torch.nn.Flatten(0, 1)
)
loss_fn = torch.nn.MSELoss(reduction='sum')

# Use the optim package to define an Optimizer that will update the weights of
# the model for us. Here we will use elra
learning_rate = 1e-6

optimizer = ElraOptimizer(model.parameters(), model=model, lr=learning_rate)

for t in range(20):
    # Forward pass: compute predicted y by passing x to the model.
    y_pred = model(xx)

    # Compute and print loss.
    loss = loss_fn(y_pred, y)
    loss_val = loss.item()

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable
    # weights of the model). This is because, by default, gradients are
    # accumulated in buffers (i.e., not overwritten) whenever .backward()
    # is called. Check out the docs of torch.autograd.backward for more details.
    optimizer.zero_grad()

    # Backward pass: compute gradient of the loss with respect to model
    # parameters
    loss.backward()

    # Calling the step function on an Optimizer makes an update to its
    # parameters
    if isinstance(optimizer, ElraOptimizer):
        optimizer.step(loss_val)  # elra needs loss value for internal control
    else:
        optimizer.step()

linear_layer = model[0]
print(f'Result: y = {linear_layer.bias.item()} + {linear_layer.weight[:, 0].item()} x + {linear_layer.weight[:, 1].item()} x^2 + {linear_layer.weight[:, 2].item()} x^3')
```

### Example 2

```python
import torch
from elra import ElraOptimizer  # ELRA_class.py (tbd 2026)

assert callable(model_class), "model_class not model()"  # as usual

batch_size, num_classes = 32, 10  # int: optional (0=default)
lr, wdecay = 1e-5, 0.99999  # float

# create/init optimizer class
optim = ElraOptimizer(model.parameters(), None, batch_size, num_classes, lr=lr, weight_decay=wdecay)
optim.add_param_group(..)  # optional (e.g. YOLO-5), Caution: on WD is considered

loss_limit: float = initial_loss * 1.5  # for retrace decision (failure check)
params = tuple(model.parameters())
optim.get_boost_model(True, model, None)  # turn on optional boosting (see below)

for X, y in data_loader:
    # Training Step
    optim.zero_grad(set_to_none=True)

    loss = loss_func(model(X), y)
    loss_item: float = loss.item()

    if not (loss_item < loss_limit):  # also true when loss_item is NaN
        optim.step_retrace(loss_item)  # happens in < 1% of steps
        continue  # skip the backward pass on a mis-step

    # grads = tt.cat( [p.grad.data.flatten() for p in params] )
    loss.backward()  # computes gradient (as usual)
    optim.step(loss_item)  # Caution: ELRA needs loss value (by design)

# WD control needs validation loss values (a float, e.g. once per epoch)
optim.set_valid_loss(val_loss_item)

# optional boosting: a kind of trajectory averaging (Polyak, 1992)
cnt, avg_loss, model2 = optim.get_boost_model(True, model, device=None)
# Early indication of a better loss: because of its aggressive step-size
# oscillation, ELRA can seem slow in the middle training phase.
calc_FullBatchLoss(model2)
```
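
The failure check in Example 2 is written as `not (loss_item < loss_limit)` rather than `loss_item >= loss_limit` because every comparison with NaN is false, so only the negated form also catches a NaN loss. The sketch below illustrates that check together with a toy "go back and try a smaller step" loop on a 1-D function. It is an illustration of the general idea only, not the logic inside `step_retrace`; the names `needs_retrace` and `minimize_with_retrace` and the step-halving rule are assumptions made for this example.

```python
def needs_retrace(loss_value: float, loss_limit: float) -> bool:
    """True if the loss is too large or is NaN."""
    # `nan < limit` is False, so the negation is True for NaN as well.
    return not (loss_value < loss_limit)


def minimize_with_retrace(loss, grad, x0, lr, loss_limit, steps=50):
    """Toy 1-D gradient descent that undoes a step when the check fails."""
    x, x_prev = x0, x0
    retraces = 0
    for _ in range(steps):
        if needs_retrace(loss(x), loss_limit):
            # Return to the last accepted point and halve the step size.
            x, lr = x_prev, lr * 0.5
            retraces += 1
            continue
        x_prev = x
        x = x - lr * grad(x)
    return x, retraces


# f(x) = x^2 with a deliberately too large initial step size.
x_final, retraces = minimize_with_retrace(
    loss=lambda x: x * x,
    grad=lambda x: 2.0 * x,
    x0=1.0,
    lr=1.2,
    loss_limit=1.5,  # 1.5 * initial loss, as in Example 2
)
print(x_final, retraces)
print(needs_retrace(float("nan"), 1.5), float("nan") >= 1.5)
```

In the toy loop, as in Example 2, a rejected step costs only a forward evaluation, since the gradient is not computed for it.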

## Creating Custom Tests

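Tests can be written as for any `torch.optim.Optimizer`. The one difference shown in the examples above is that `step()` is called with the current loss value.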
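
A minimal sketch of such a test is given below, assuming PyTorch is installed. It uses `torch.optim.SGD` as a stand-in so that it runs without this package; the helper `train_steps` and its `pass_loss` flag are hypothetical names for this example. To test ELRA instead, the optimizer would be constructed as in Example 1 and `pass_loss=True` would be set so that `step()` receives the loss value.

```python
import torch


def train_steps(model, optimizer, x, y, steps, pass_loss=False):
    """Run a plain training loop and return the loss recorded at each step."""
    loss_fn = torch.nn.MSELoss()
    losses = []
    for _ in range(steps):
        loss = loss_fn(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        if pass_loss:
            optimizer.step(loss.item())  # optimizers that need the loss value
        else:
            optimizer.step()  # standard PyTorch optimizers
        losses.append(loss.item())
    return losses


def test_loss_decreases():
    torch.manual_seed(0)
    # Tiny linear regression problem: y = 3x + 0.5
    x = torch.linspace(-1.0, 1.0, 64).unsqueeze(-1)
    y = 3.0 * x + 0.5
    model = torch.nn.Linear(1, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # stand-in optimizer
    losses = train_steps(model, optimizer, x, y, steps=100)
    assert losses[-1] < losses[0]


test_loss_decreases()
```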


## Documentation

The user manual is included in the package:

```python
import elra_optimizer as elra

print(elra.get_user_manual_path())  # Path to userManual.pdf
```

## License

GNU General Public License v3 or later (GPLv3+)

## Authors

- Alexander Kleinsorge (Technische Hochschule Wildau)
- Alexander Fauck (Technische Hochschule Wildau)

## Links

- [Repository](https://git.th-wildau.de/alkl9873/adaptive_grad_decent)
- [Documentation (PDF)](https://jugit.fz-juelich.de/mlz/ppapp/-/blob/main/py/R/userManual/userManual.pdf)


