Metadata-Version: 2.4
Name: dpo
Version: 2.5.0
Summary: Debt Payment Optimization
Author-email: Arya H <arya.h1718@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/Arya1718/dpo
Project-URL: Repository, https://github.com/Arya1718/dpo
Project-URL: Documentation, https://dpo-nas.readthedocs.io/
Project-URL: Issues, https://github.com/Arya1718/dpo/issues
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: numpy>=1.21.0
Provides-Extra: benchmarks
Requires-Dist: nasbench>=1.0; extra == "benchmarks"
Requires-Dist: nas-bench-x11>=2.0; extra == "benchmarks"
Requires-Dist: nasbench301>=0.2; extra == "benchmarks"
Requires-Dist: hpobench>=0.0.8; extra == "benchmarks"
Requires-Dist: nats-bench>=1.0; extra == "benchmarks"
Provides-Extra: dev
Requires-Dist: pytest>=6.0; extra == "dev"
Requires-Dist: black>=21.0; extra == "dev"
Requires-Dist: flake8>=3.9; extra == "dev"
Requires-Dist: mypy>=0.910; extra == "dev"
Provides-Extra: docs
Requires-Dist: sphinx>=4.0; extra == "docs"
Requires-Dist: sphinx-rtd-theme>=1.0; extra == "docs"
Provides-Extra: gpu
Requires-Dist: torch>=1.9.0; extra == "gpu"

<p align="center">
  <h1 align="center">DPO — Debt-Payment Optimization</h1>
  <p align="center">
    <em>A population-based metaheuristic that intentionally accepts worse moves, records them as <strong>debt</strong>, and repays with interest to escape local optima.</em>
  </p>
  <p align="center">
    <a href="https://pypi.org/project/dpo/"><img src="https://img.shields.io/pypi/v/dpo?color=blue" alt="PyPI version"></a>
    <a href="https://pypi.org/project/dpo/"><img src="https://img.shields.io/pypi/pyversions/dpo" alt="Python"></a>
    <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-green.svg" alt="License: MIT"></a>
    <a href="https://pypi.org/project/dpo/"><img src="https://img.shields.io/pypi/dm/dpo?color=orange" alt="Downloads"></a>
  </p>
</p>

---

## What is DPO?

DPO is an optimization algorithm inspired by financial debt dynamics. Simulated annealing also accepts worse moves but keeps no record of them, and evolutionary algorithms mostly carry forward their fittest individuals. DPO **deliberately accepts worse solutions**, records the degradation as **debt**, then **repays it with interest**, which pushes the search to overshoot past local optima before converging toward the global best.

$$X_{\text{new}} = X + \underbrace{\beta \cdot \text{debt}}_{\text{repayment}} + \underbrace{\gamma \cdot \text{debt}}_{\text{overshoot}} + \underbrace{\delta \cdot (\text{gBest} - X)}_{\text{global pull}}$$
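
As a rough illustration, the update can be applied coordinate by coordinate. The sketch below assumes `debt` is a vector with the same shape as the position `X` and uses the `beta_0`, `gamma_0`, and `delta_0` defaults from the Configuration Reference. The function name `dpo_step` is hypothetical and not part of the package API.

```python
from typing import List


def dpo_step(x: List[float], debt: List[float], g_best: List[float],
             beta: float = 1.0, gamma: float = 1.0, delta: float = 0.2) -> List[float]:
    """Apply the update equation once (hypothetical helper, not the library's code)."""
    return [
        xi + beta * di + gamma * di + delta * (gi - xi)  # repayment + overshoot + global pull
        for xi, di, gi in zip(x, debt, g_best)
    ]


x_new = dpo_step(x=[0.0, 0.0], debt=[1.0, -1.0], g_best=[2.0, 2.0])
print([round(v, 3) for v in x_new])  # [2.4, -1.6]
```

Because the repayment and overshoot terms have the same form, the combined step along the debt direction is `(beta + gamma) * debt`.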

**Key features:**

- **5 built-in presets** — NAS, HPO, Resource Allocation, Pathfinding/TSP, Scheduling
- **4 problem types** — Continuous, Combinatoric, NAS, and Hybrid (mixed continuous + discrete)
- **Island model** with specialized sub-populations and migration
- **NSGA-II** multi-objective support with Pareto archiving
- **Adaptive controls** — mutation, acceptance, temperature, debt memory all self-tune
- **One dependency** — only `numpy>=1.21.0` required; everything else is optional

---

## Benchmark Results

Comprehensive benchmark across **24 datasets** (NASBench-201, HPOBench, HPOLib, Synthetic, Noisy Synthetic), **14 methods**, **3 seeds**, **1,008 total runs**:

### Overall Ranking (all 24 datasets)

| Rank | Method | Mean Rank ↓ | #1 Wins | Mean AUC |
|---:|---|---:|---:|---:|
| 1 | JADE | 3.19 | 5 | 0.9581 |
| 2 | GWO | 3.83 | 4 | 0.9616 |
| 3 | **DPO** | **4.27** | **6** | **0.9301** |
| 4 | DE | 5.23 | 3 | 0.9264 |
| 5 | ACO | 6.02 | 0 | 0.9138 |
| 6 | GA | 6.19 | 3 | 0.9529 |
| 7 | PSO | 8.58 | 0 | 0.9206 |
| 8 | WOA | 10.12 | 1 | 0.9178 |
| 9 | FA | 11.17 | 0 | 0.9098 |
| 10 | ABC | 11.67 | 0 | 0.9013 |
| 11 | SA | 14.00 | 0 | 0.5282 |

> **DPO achieves the most #1 wins (6) of any method** and ranks **3rd overall** by mean rank across all benchmarks. The table lists 11 of the 14 methods; the remaining three are the ablation variants (DPO-NoDebt, DPO-NoAccept, DPO-NoRepay), which consistently rank below full DPO, indicating that each component contributes to performance.

### Per-Family Highlights

| Benchmark Family | DPO Rank | Datasets | Notable Results |
|---|---:|---:|---|
| **NASBench-201** | 1st–5th | 3 | **#1 on CIFAR-10 & CIFAR-100** |
| **HPOBench** | 4th–5th | 8 | **#1 on Credit-G, Car** |
| **HPOLib** | 1st–8th | 5 | **#1 on Naval Propulsion, Slice Localization** |
| **Synthetic (20-D)** | 4th | 6 | Strong on high-dimensional BBOB functions |
| **Noisy Synthetic** | 4th | 3 | Robust under evaluation noise |

### NASBench-201 Detailed Results

| Dataset | DPO Score | Best Possible | DPO Rank | 95% Convergence |
|---|---:|---:|---:|---:|
| CIFAR-10 | **0.9172** | 0.9172 | **1st** | iter 29.7 |
| CIFAR-100 | **0.7372** | 0.7372 | **1st** | iter 34.0 |
| ImageNet-16-120 | 0.4706 | 0.4740 | 5th | iter 25.7 |

### HPOBench Highlights

| Dataset | DPO Score | DPO Rank |
|---|---:|---:|
| Australian | 0.9682 | 4th |
| Blood Transfusion | 0.7747 | 2nd |
| Car | **1.0000** | **1st** (tied) |
| Credit-G | **0.9241** | **1st** |
| Segment | 0.9757 | 2nd |

**Reproduce all benchmarks:**
```bash
python -m dpo.benchmarks.hpo_comprehensive_benchmark --seeds 3 --population 40 --iterations 60
```
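
For readers interpreting the Overall Ranking table: a mean rank is conventionally the average of a method's per-dataset rank, and `#1 Wins` counts the datasets where it ranked first. The sketch below shows that aggregation on made-up scores. The method names and numbers are placeholders, not benchmark results. It assumes higher scores are better, that tied scores share the average rank, and that a tie for first counts as a win; the benchmark script may handle these details differently.

```python
from typing import Dict


def average_ranks(scores: Dict[str, float]) -> Dict[str, float]:
    """Rank methods on one dataset (1 = best); tied scores share the mean rank."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    ranks: Dict[str, float] = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        shared = (i + 1 + j + 1) / 2.0  # mean of 1-based positions i+1 .. j+1
        for name, _ in ordered[i:j + 1]:
            ranks[name] = shared
        i = j + 1
    return ranks


# Placeholder scores: dataset -> {method: score}. Not real benchmark output.
toy = {
    "dataset_a": {"A": 0.91, "B": 0.90, "C": 0.85},
    "dataset_b": {"A": 0.70, "B": 0.74, "C": 0.74},
    "dataset_c": {"A": 0.48, "B": 0.47, "C": 0.45},
}

per_dataset = [average_ranks(s) for s in toy.values()]
methods = ["A", "B", "C"]
mean_rank = {m: sum(r[m] for r in per_dataset) / len(per_dataset) for m in methods}
wins = {m: sum(1 for r in per_dataset if r[m] == min(r.values())) for m in methods}

print({m: round(v, 2) for m, v in mean_rank.items()})  # {'A': 1.67, 'B': 1.83, 'C': 2.5}
print(wins)                                            # {'A': 2, 'B': 1, 'C': 1}
```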

---

## Installation

```bash
pip install dpo
```

**From source (editable):**
```bash
git clone https://github.com/Arya1718/dpo.git
cd dpo
pip install -e .
```

**Optional extras:**
```bash
# Quoting keeps shells such as zsh from treating the brackets as a glob pattern
pip install "dpo[dev]"         # pytest, black, flake8, mypy
pip install "dpo[gpu]"         # PyTorch support
pip install "dpo[benchmarks]"  # NASBench, HPOBench backends
pip install "dpo[docs]"        # Sphinx documentation
```

**Requirements:** Python ≥ 3.8, NumPy ≥ 1.21.0

---

## Quick Start

### One-Line Optimization

```python
from dpo import dpo

result = dpo(preset='nas')
print(result['best_fitness'])
print(result['best_accuracy'])
```

### Recommended: DPO_Universal

```python
from dpo.core.universal import DPO_Universal, DPO_Presets

config = DPO_Presets.NAS_Config(population_size=60, max_iterations=200)
optimizer = DPO_Universal(config=config)
result = optimizer.optimize()

print(f"Best fitness:  {result['best_fitness']:.6f}")
print(f"Best accuracy: {result['best_accuracy']:.4f}")
print(f"Best solution: {optimizer.get_best_solution()}")
```

---

## How to Use DPO

DPO provides **three levels of control**, from simplest to most advanced:

```
Level 1 ─ dpo(preset='nas')                         # One function call
Level 2 ─ DPO_Universal(config=..., problem=...)     # Preset + custom problem
Level 3 ─ DPO_NAS(config=..., estimator=...)         # Full manual control
```

### Available Presets

| Preset | Factory Method | Best For | Key Traits |
|---|---|---|---|
| `'nas'` | `DPO_Presets.NAS_Config()` | Neural Architecture Search | 3 islands, NSGA-II, accuracy-dominant |
| `'hpo'` | `DPO_Presets.HyperparameterTuning_Config()` | ML hyperparameter tuning | single island, continuous mode, long debt memory |
| `'resource'` | `DPO_Presets.ResourceAllocation_Config()` | Cloud/network balancing | 4 islands, heaviest constraint penalty |
| `'pathfinding'` | `DPO_Presets.Pathfinding_Config()` | TSP, routing, paths | 5 islands, highest exploration, fast migration |
| `'scheduling'` | `DPO_Presets.Scheduling_Config()` | Job/factory scheduling | 4 islands, highest cost weight |

DPO also has built-in `DPO_Config` presets:

| Config Preset | Use Case | Population | Iterations |
|---|---|---:|---:|
| `DPO_Config.fast()` | Quick testing | 20 | 50 |
| `DPO_Config.balanced()` | General use | 40 | 150 |
| `DPO_Config.thorough()` | Deep search | 80 | 300 |
| `DPO_Config.publication()` | Reproducible research | 80 | 300 |
| `DPO_Config.continuous_analytic()` | Continuous benchmarks | 30 | 100 |

---

## Problem Types & Complete Examples

### 1. Continuous Optimization (HPO, Calibration, Black-Box)

Use `dpo_optimize()` for the simplest setup, or `ContinuousOptimizationProblem` for more control.

**Simple (one function call):**

```python
from dpo import dpo_optimize

def objective(params):
    lr = params['learning_rate']
    dropout = params['dropout']
    # Simulated training loss
    loss = (lr - 0.001)**2 + (dropout - 0.3)**2
    accuracy = max(0.0, 0.95 - loss)
    return loss, {
        'accuracy': accuracy,
        'latency_ms': 1.0,
        'memory_mb': 1.0,
        'flops_m': 1.0,
    }

result = dpo_optimize(
    objective=objective,
    bounds=[(1e-5, 0.1), (0.0, 0.9)],
    names=['learning_rate', 'dropout'],
    preset='balanced',
    max_iterations=100,
    population_size=40,
)
print(f"Best params: {result['best_solution']}")
print(f"Best loss:   {result['best_fitness']:.6f}")
```

**With Problem class:**

```python
from dpo.core.problem import ContinuousOptimizationProblem
from dpo.core.universal import DPO_Presets, DPO_Universal

def objective(params):
    x, y = params['x'], params['y']
    fitness = x**2 + y**2
    return fitness, {
        'accuracy': 1.0 / (1.0 + fitness),
        'latency_ms': 1.0,
        'memory_mb': 1.0,
        'flops_m': 1.0,
    }

problem = ContinuousOptimizationProblem(
    objective_fn=objective,
    param_bounds=[(-5.0, 5.0), (-5.0, 5.0)],
    param_names=['x', 'y'],
)

config = DPO_Presets.HyperparameterTuning_Config(
    population_size=30,
    max_iterations=100,
)
optimizer = DPO_Universal(problem=problem, config=config)
result = optimizer.optimize()

best = optimizer.get_best_solution()
print(f"x={best['x']:.6f}, y={best['y']:.6f}")
print(f"Minimum: {result['best_fitness']:.8f}")
```

### 2. Combinatoric Optimization (TSP, Routing, Scheduling)

**TSP (one function call):**

```python
import numpy as np
from dpo import dpo_solve_tsp

n_cities = 20
coords = np.random.default_rng(42).uniform(0, 100, (n_cities, 2))
dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=2)

result = dpo_solve_tsp(
    distance_matrix=dist,
    preset='balanced',
    max_iterations=120,
    population_size=60,
)
print(f"Best tour length: {result['best_fitness']:.2f}")
```
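
For reference, the quantity a TSP solver minimizes is the length of the closed tour. The sketch below computes it for a permutation of city indices and returns it in the `(fitness, metrics_dict)` shape described under Objective Function Contract. `tour_length` and `tsp_objective` are illustrative names; how `dpo_solve_tsp` evaluates tours internally is not shown in this README.

```python
import math
from typing import Dict, Sequence, Tuple


def tour_length(tour: Sequence[int], dist: Sequence[Sequence[float]]) -> float:
    """Length of the closed tour: consecutive legs plus the return to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def tsp_objective(tour: Sequence[int], dist: Sequence[Sequence[float]]) -> Tuple[float, Dict[str, float]]:
    """Wrap the tour length in the (fitness, metrics) tuple; lower is better."""
    cost = tour_length(tour, dist)
    return cost, {
        'accuracy': 1.0 / (1.0 + cost),
        'latency_ms': 1.0,
        'memory_mb': 1.0,
        'flops_m': 1.0,
    }


# Four cities on the corners of a unit square.
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dist = [[math.dist(a, b) for b in coords] for a in coords]

print(tour_length([0, 1, 2, 3], dist))  # 4.0 (the perimeter)
print(tour_length([0, 2, 1, 3], dist))  # uses both diagonals, so it is longer
```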

**Job scheduling:**

```python
import numpy as np
from dpo.core.problem import CombinatoricOptimizationProblem
from dpo.core.universal import DPO_Presets, DPO_Universal

n_jobs, n_machines = 30, 4
proc_times = np.random.default_rng(0).integers(1, 20, (n_jobs, n_machines))

def makespan(seq_dict):
    seq = seq_dict['sequence']
    machine_time = np.zeros(n_machines)
    for job_idx in seq:
        m = np.argmin(machine_time)
        machine_time[m] += proc_times[job_idx % n_jobs, m]
    cost = float(np.max(machine_time))
    return cost, {
        'accuracy': 1.0 / (1.0 + cost),
        'latency_ms': cost,
        'memory_mb': 1.0,
        'flops_m': 1.0,
    }

problem = CombinatoricOptimizationProblem(
    objective_fn=makespan,
    problem_size=n_jobs,
)
config = DPO_Presets.Scheduling_Config(population_size=45, max_iterations=120)
optimizer = DPO_Universal(problem=problem, config=config)
result = optimizer.optimize()
print(f"Best makespan: {result['best_fitness']:.1f} time units")
```

### 3. Neural Architecture Search (NAS)

```python
from dpo.core.problem import NASProblem
from dpo.core.universal import DPO_Presets, DPO_Universal

class MyNASEstimator:
    """Your evaluator must implement estimate() returning (fitness, metrics_dict)."""

    def estimate(self, arch_dict, search_mode=True, iteration=0, **kwargs):
        # arch_dict contains: operations, kernels, skip_connections,
        #                     depth_multiplier, channel_multiplier
        ops = arch_dict.get('operations', [])
        depth = float(arch_dict.get('depth_multiplier', 1.0))
        channels = float(arch_dict.get('channel_multiplier', 1.0))

        # Replace with your real training / proxy evaluation
        accuracy = 0.80 + 0.05 * (depth + channels) / 3.0
        latency = 20.0 + 5.0 * depth
        memory = 10.0 + 5.0 * channels
        flops = 80.0 + 12.0 * depth * channels

        fitness = 1.0 - accuracy  # DPO minimizes fitness
        return fitness, {
            'accuracy': accuracy,
            'latency_ms': latency,
            'memory_mb': memory,
            'flops_m': flops,
        }

problem = NASProblem(
    evaluator=MyNASEstimator(),
    constraints={'latency': 100.0, 'memory': 50.0, 'flops': 300.0},
)

config = DPO_Presets.NAS_Config(population_size=60, max_iterations=200)
optimizer = DPO_Universal(problem=problem, config=config)
result = optimizer.optimize()

arch = optimizer.get_best_solution()
print(f"Best accuracy:  {result['best_accuracy']:.4f}")
print(f"Operations:     {arch['operations']}")
print(f"Kernels:        {arch['kernels']}")
print(f"Skip connects:  {arch['skip_connections']}")
print(f"Depth mult:     {arch['depth_multiplier']:.2f}")
print(f"Channel mult:   {arch['channel_multiplier']:.2f}")
```

**Aggressive NAS mode (best benchmark performance):**

```python
config = DPO_Presets.NAS_Config(
    population_size=80,
    max_iterations=300,
    aggressive_mode=True,   # stronger β=1.70, γ=1.30, elite_ratio=0.25
)
config.verbose = False
optimizer = DPO_Universal(problem=problem, config=config)
result = optimizer.optimize()
```

### 4. Hybrid Optimization (Mixed Continuous + Discrete)

```python
from dpo.core.problem import HybridProblem
from dpo.core.universal import DPO_Presets, DPO_Universal

def objective(params):
    lr = params.get('num_0', 1e-3)           # continuous
    reg = params.get('num_1', 1e-4)          # continuous
    model = params.get('disc_0', 'mlp')      # discrete

    bias = {'mlp': 0.02, 'cnn': 0.01, 'transformer': 0.015}.get(model, 0.02)
    fitness = (lr - 0.002)**2 * 1000 + (reg - 0.0002)**2 * 20000 + bias

    return float(fitness), {
        'accuracy': float(1.0 / (1.0 + fitness)),
        'latency_ms': 30.0,
        'memory_mb': 12.0,
        'flops_m': 80.0,
    }

problem = HybridProblem(
    objective_fn=objective,
    numeric_bounds=[(1e-5, 1e-2), (1e-6, 1e-3)],           # 2 continuous params
    discrete_options={'model_family': ['mlp', 'cnn', 'transformer']},  # 1 discrete
)

config = DPO_Presets.HyperparameterTuning_Config(population_size=30, max_iterations=100)
optimizer = DPO_Universal(problem=problem, config=config)
result = optimizer.optimize()
print(f"Best fitness: {result['best_fitness']:.6f}")
```

### 5. Resource Allocation

```python
from dpo.core.problem import ContinuousOptimizationProblem
from dpo.core.universal import DPO_Presets, DPO_Universal

def allocation_objective(params):
    cpu = params['cpu_fraction']
    mem = params['mem_fraction']
    bw = params['bandwidth']

    imbalance = abs(cpu - 0.45) + abs(mem - 0.35) + abs(bw - 0.20)
    over_commit = max(0.0, cpu + mem + bw - 1.0)
    fitness = imbalance + 6.0 * over_commit

    return fitness, {
        'accuracy': 1.0 / (1.0 + fitness),
        'latency_ms': 30.0 + 20.0 * imbalance,
        'memory_mb': 8.0 + 25.0 * mem,
        'flops_m': 50.0 + 40.0 * bw,
    }

problem = ContinuousOptimizationProblem(
    objective_fn=allocation_objective,
    param_bounds=[(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)],
    param_names=['cpu_fraction', 'mem_fraction', 'bandwidth'],
)

config = DPO_Presets.ResourceAllocation_Config(population_size=50, max_iterations=150)
optimizer = DPO_Universal(problem=problem, config=config)
result = optimizer.optimize()

best = optimizer.get_best_solution()
print(f"CPU={best['cpu_fraction']:.3f}  MEM={best['mem_fraction']:.3f}  BW={best['bandwidth']:.3f}")
```

### 6. Custom Problem (Full Extensibility)

Implement the `Problem` abstract class for any domain:

```python
import numpy as np
from dpo.core.problem import Problem
from dpo.core.solution import NumericSolution
from dpo.core.universal import DPO_Universal

class RosenbrockProblem(Problem):
    """Custom problem: implement evaluate() and create_solution()."""

    def evaluate(self, solution, **kwargs):
        p = solution.to_dict()
        x, y = p['x'], p['y']
        fitness = (1 - x)**2 + 100 * (y - x**2)**2
        return fitness, {
            'accuracy': 1.0 / (1.0 + fitness),
            'latency_ms': 1.0,
            'memory_mb': 1.0,
            'flops_m': 1.0,
        }

    def create_solution(self, **kwargs):
        vals = np.random.uniform(-5.0, 5.0, size=2)
        return NumericSolution(vals, [(-5.0, 5.0)] * 2, ['x', 'y'])

    def get_problem_info(self):
        return {'name': 'Rosenbrock', 'type': 'hpo'}  # auto-selects HPO preset

result = DPO_Universal(problem=RosenbrockProblem()).optimize()
print(f"Minimum: {result['best_fitness']:.6f}")
```
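
The Rosenbrock function has its global minimum of 0 at `(x, y) = (1, 1)`, so a `best_fitness` close to zero indicates the search found the optimum. A standalone check of the objective, independent of the DPO classes:

```python
def rosenbrock(x: float, y: float) -> float:
    """Same expression as in RosenbrockProblem.evaluate above."""
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2


print(rosenbrock(1.0, 1.0))   # 0.0, the global minimum
print(rosenbrock(0.0, 0.0))   # 1.0
print(rosenbrock(-1.0, 1.0))  # 4.0, on the curved valley floor but far from the optimum
```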

---

## Objective Function Contract

Every objective function must return a **tuple** of `(fitness, metrics_dict)`:

```python
def my_objective(params: dict) -> tuple:
    fitness = ...       # float, lower is better — DPO minimizes this
    metrics = {
        'accuracy':   ...,  # float, higher is better (DPO tracks this internally)
        'latency_ms': ...,  # float, constraint metric
        'memory_mb':  ...,  # float, constraint metric
        'flops_m':    ...,  # float, constraint metric
    }
    return fitness, metrics
```
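
When an existing function returns only a scalar loss, a thin adapter can produce the required tuple. The following is a minimal sketch. `with_default_metrics` is a hypothetical helper, not part of the `dpo` package, and the accuracy proxy `1 / (1 + fitness)` simply mirrors the examples earlier in this README.

```python
from typing import Callable, Dict, Tuple

Metrics = Dict[str, float]


def with_default_metrics(loss_fn: Callable[[dict], float]) -> Callable[[dict], Tuple[float, Metrics]]:
    """Adapt a scalar loss function to the (fitness, metrics_dict) contract."""
    def objective(params: dict) -> Tuple[float, Metrics]:
        fitness = float(loss_fn(params))
        return fitness, {
            'accuracy': 1.0 / (1.0 + fitness),  # proxy; assumes fitness >= 0
            'latency_ms': 1.0,                  # placeholder constraint metrics
            'memory_mb': 1.0,
            'flops_m': 1.0,
        }
    return objective


def sphere(params: dict) -> float:
    return params['x'] ** 2 + params['y'] ** 2


objective = with_default_metrics(sphere)
fitness, metrics = objective({'x': 3.0, 'y': 4.0})
print(fitness)          # 25.0
print(sorted(metrics))  # ['accuracy', 'flops_m', 'latency_ms', 'memory_mb']
```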

> **Tip:** If your problem doesn't have natural latency/memory/flops metrics, just set them to `1.0`. DPO will still work correctly using only the fitness value.

---

## Reading Results

`optimizer.optimize()` returns a result dictionary:

```python
result = optimizer.optimize()

# ── Core metrics ──────────────────────────────────────────────
result['best_fitness']          # float — lowest fitness found (lower is better)
result['best_accuracy']         # float — highest accuracy across all iterations
result['best_architecture']     # dict  — best solution as architecture dict
result['best_metrics']          # dict  — full metrics of the best solution

# ── Convergence history ───────────────────────────────────────
history = result['history']
history['best_accuracy']        # List[float] — accuracy curve (monotonic improving)
history['best_fitness']         # List[float] — fitness curve per iteration
history['avg_fitness']          # List[float] — population average
history['debt_norms']           # List[float] — mean debt magnitude per iteration
history['diversity_scores']     # List[float] — population diversity
history['acceptance_rates']     # List[float] — worse-move acceptance rates
history['auc_10']               # float — area under accuracy curve at 10%
history['auc_25']               # float — AUC at 25%
history['auc_50']               # float — AUC at 50%
history['time_to_95']           # int   — iteration reaching 95% of best
history['time_to_99']           # int   — iteration reaching 99% of best

# ── Acceptance statistics ─────────────────────────────────────
stats = result['acceptance_stats']
stats['total_candidates']       # total candidate solutions evaluated
stats['accepted_better']        # accepted because strictly better
stats['accepted_worse']         # accepted despite being worse (DPO debt events)
stats['rejected']               # rejected candidates

# ── Helpers on the optimizer object ───────────────────────────
optimizer.get_best_solution()   # dict — best solution found
optimizer.get_history()         # same as result['history']
optimizer.get_config()          # DPO_Config — active configuration
```
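
If you need the same kind of summary for a curve produced elsewhere, the `time_to_95` idea is easy to recompute. The sketch below assumes it means the first iteration whose best accuracy reaches 95% of the final best value. The package may define it slightly differently, and `time_to_fraction` is a hypothetical helper.

```python
from typing import Sequence


def time_to_fraction(curve: Sequence[float], fraction: float) -> int:
    """Index of the first iteration whose value reaches `fraction` of the curve's maximum."""
    if not curve:
        raise ValueError("curve must not be empty")
    target = fraction * max(curve)
    for i, value in enumerate(curve):
        if value >= target:
            return i
    return len(curve) - 1  # reached only when no value meets the target (e.g. fraction > 1)


# Made-up, non-decreasing accuracy curve for illustration only.
curve = [0.50, 0.70, 0.85, 0.90, 0.92, 0.92]
print(time_to_fraction(curve, 0.95))  # 3  (0.90 >= 0.95 * 0.92 = 0.874)
print(time_to_fraction(curve, 0.99))  # 4  (0.92 >= 0.99 * 0.92 = 0.9108)
```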

### Plot Convergence

```python
import matplotlib.pyplot as plt

history = optimizer.get_history()
plt.plot(history['best_accuracy'], label='DPO', color='red', lw=2)
plt.xlabel('Iteration')
plt.ylabel('Best Accuracy')
plt.title('DPO Convergence')
plt.grid(True, alpha=0.3)
plt.legend()
plt.tight_layout()
plt.savefig('convergence.png', dpi=150)
```

---

## Configuration Reference

All parameters live in `DPO_Config` (from `dpo.core.config`):

### Core Algorithm

| Parameter | Default | Description |
|---|---|---|
| `alpha_0` | 0.15 | Exploration / mutation magnitude |
| `beta_0` | 1.0 | Debt repayment force |
| `gamma_0` | 1.0 | Overshoot multiplier |
| `delta_0` | 0.2 | Global-best pull strength |
| `decay_power` | 0.5 | Power-law decay exponent |

### Population & Islands

| Parameter | Default | Description |
|---|---|---|
| `population_size` | 40 | Total agents across all islands |
| `max_iterations` | 200 | Maximum optimization iterations |
| `elite_ratio` | 0.10 | Fraction of elites preserved |
| `island_model` | True | Enable island sub-populations |
| `num_islands` | 3 | Number of islands |
| `migration_freq` | 10 | Iterations between island migrations |
| `diversity_inject_freq` | 5 | Iterations between diversity injections |

### Debt Memory

| Parameter | Default | Description |
|---|---|---|
| `debt_memory_lambda` | 0.85 | EMA decay for debt accumulation (λ ∈ [0.7, 0.95]) |
| `min_debt_memory` | 0.70 | Minimum debt retention |
| `max_debt_memory` | 0.95 | Maximum debt retention |
| `debt_persistence_start` | 0.80 | Debt persistence at iteration 0 |
| `debt_persistence_end` | 0.20 | Debt persistence at final iteration |
| `debt_persistence_decay` | `'linear'` | Schedule: `'linear'`, `'exponential'`, `'cosine'` |

### Temperature & Acceptance

| Parameter | Default | Description |
|---|---|---|
| `temperature_start` | 1.0 | Initial temperature (simulated-annealing style) |
| `temperature_min` | 0.02 | Floor temperature |
| `temperature_decay` | 0.95 | Multiplicative cooling per iteration |
| `force_late_debt` | True | Force debt accumulation in late phase |
| `late_debt_start_ratio` | 0.60 | When late-phase debt forcing begins |

### Fitness Weights

| Parameter | Default | Description |
|---|---|---|
| `w_accuracy` | 0.60 | Weight on accuracy (set to 0.95 for NAS SOTA) |
| `w_cost` | 0.30 | Weight on cost (latency + memory + flops) |
| `w_penalty` | 0.10 | Weight on constraint violations |

### Constraints

| Parameter | Default | Description |
|---|---|---|
| `latency_constraint` | 100.0 | Max latency (ms) |
| `memory_constraint` | 50.0 | Max memory (MB) |
| `flops_constraint` | 300.0 | Max FLOPs (M) |
| `constraint_penalty_scale` | 2.0 | Multiplier on penalty terms |

### Convergence

| Parameter | Default | Description |
|---|---|---|
| `patience` | 30 | Early-stop patience (iterations) |
| `stagnation_threshold` | 15 | Iterations before stagnation boost |
| `exploration_phase_ratio` | 0.30 | Fraction of run in exploration mode |
| `adaptive_early_stop` | True | Enable adaptive early stopping |

### Ablation Flags

| Parameter | Default | Description |
|---|---|---|
| `enable_debt_accumulation` | True | Toggle debt accumulation |
| `enable_worse_acceptance` | True | Toggle worse-move acceptance |
| `enable_debt_repayment` | True | Toggle debt repayment |

### Override Any Parameter

```python
from dpo.core.universal import DPO_Presets, DPO_Universal

config = DPO_Presets.NAS_Config(aggressive_mode=True)
config.max_iterations = 300
config.population_size = 100
config.verbose = False

optimizer = DPO_Universal(config=config)
result = optimizer.optimize()
```
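
To get a feel for the `temperature_*` defaults listed under Temperature & Acceptance, the sketch below assumes a standard multiplicative cooling schedule with a floor and a Metropolis-style acceptance probability, `exp(-Δ / T)`. Both are assumptions made for illustration: this README does not spell out the exact acceptance rule DPO uses, and the function names are hypothetical.

```python
import math


def temperature(iteration: int, start: float = 1.0, decay: float = 0.95, floor: float = 0.02) -> float:
    """Multiplicative cooling: start * decay**iteration, never below the floor."""
    return max(floor, start * decay ** iteration)


def acceptance_probability(delta_fitness: float, temp: float) -> float:
    """Metropolis-style rule for minimization: improvements are always accepted."""
    if delta_fitness <= 0.0:
        return 1.0
    return math.exp(-delta_fitness / temp)


# Temperature and the chance of accepting a move that is worse by 0.05.
for k in (0, 10, 50, 100):
    t = temperature(k)
    print(k, round(t, 4), round(acceptance_probability(0.05, t), 4))

# First iteration at which the floor is reached under these defaults.
first_floor = next(k for k in range(1000) if temperature(k) == 0.02)
print(first_floor)  # 77
```

Under these assumptions the floor is reached well before the default `max_iterations` of 200, so late-phase acceptance of worse moves would depend mostly on `temperature_min`.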

---

## Advanced Usage

### Silence All Logging

```python
import logging

from dpo.core.universal import DPO_Universal

logger = logging.getLogger("DPO-Silent")
logger.setLevel(logging.CRITICAL)
logger.disabled = True

optimizer = DPO_Universal(preset='nas', logger=logger)
```

### Multi-Seed Reproducible Runs

```python
import numpy as np
from dpo.core.universal import DPO_Universal

results = []
for seed in [42, 123, 999]:
    np.random.seed(seed)
    r = DPO_Universal(preset='nas').optimize()
    results.append(r['best_accuracy'])

print(f"Accuracy: {np.mean(results):.4f} ± {np.std(results):.4f}")
```

### Direct Access to Core Optimizer

```python
from dpo.core.optimizer import DPO_NAS
from dpo.core.config import DPO_Config
from dpo.evaluation.ensemble import EnsembleEstimator
from dpo.constraints.handler import AdvancedConstraintHandler

config = DPO_Config(population_size=80, max_iterations=200, alpha_0=0.25)
config.validate()

optimizer = DPO_NAS(
    config=config,
    estimator=EnsembleEstimator(),
    constraint_handler=AdvancedConstraintHandler(config),
)
optimizer.initialize_population()
result = optimizer.optimize()

print(optimizer.best_accuracy)
print(optimizer.best_agent.gene.to_architecture_dict())
print(len(optimizer.pareto_archive), "Pareto-optimal solutions")
```

### Auto-Detection of Problem Type

If your custom `Problem.get_problem_info()` returns a recognizable `type`, DPO auto-selects the matching preset — no `preset=` argument needed:

```python
from dpo.core.problem import Problem
from dpo.core.universal import DPO_Universal

class MyProblem(Problem):
    def get_problem_info(self):
        return {'name': 'MyTSP', 'type': 'pathfinding'}  # auto-selects pathfinding preset
    # ... evaluate(), create_solution() ...

optimizer = DPO_Universal(problem=MyProblem())   # preset auto-detected
result = optimizer.optimize()
```

| `type` keyword | Auto-Selected Preset |
|---|---|
| `'nas'` / `'architecture'` | NAS |
| `'resource'` / `'allocation'` | Resource Allocation |
| `'pathfinding'` / `'routing'` / `'tsp'` / `'vrp'` | Pathfinding |
| `'hpo'` / `'hyperparameter'` / `'tuning'` | HPO |
| `'scheduling'` / `'schedule'` / `'job'` | Scheduling |

---

## How DPO Works (Algorithm Overview)

| Step | What Happens |
|---|---|
| 1 | Evaluate a candidate solution |
| 2 | If **worse** than current → it can still be **accepted**; record `debt = Δfitness` |
| 3 | Debt accumulates each time a worse move is accepted |
| 4 | When an **improvement** is found → repay debt with interest, then **overshoot** |
| 5 | A **global-best pull** (`δ · (gBest − X)`) keeps the search anchored throughout the run |
| 6 | Late phase (t > 0.75): mutation suppressed, elites re-optimized every iteration |

### Three-Phase Search Strategy

```
┌─────────────┬──────────────────┬────────────────────┐
│ Exploration  │  Transition       │  Convergence        │
│ 0% ─── 30%  │  30% ─── 75%     │  75% ─── 100%      │
│              │                   │                     │
│ High α       │  Balanced α       │  Suppressed α       │
│ Wide search  │  Debt repayment   │  Elite local search │
│ Accumulate   │  active, balanced │  Aggressive repay   │
│ debt freely  │  exploitation     │  Overshoot active   │
└─────────────┴──────────────────┴────────────────────┘
```
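
The phase boundaries in the diagram can be expressed as a small lookup on normalized progress `t = iteration / max_iterations`. This is a sketch: the 0.30 boundary matches the `exploration_phase_ratio` default and 0.75 matches the late-phase threshold in the step table, but the function itself is hypothetical, and how the optimizer treats the exact boundary points is an assumption.

```python
def search_phase(iteration: int, max_iterations: int,
                 exploration_ratio: float = 0.30, convergence_start: float = 0.75) -> str:
    """Map an iteration to one of the three phases shown in the diagram."""
    if max_iterations <= 0:
        raise ValueError("max_iterations must be positive")
    t = iteration / max_iterations  # normalized progress in [0, 1]
    if t < exploration_ratio:
        return "exploration"
    if t <= convergence_start:
        return "transition"
    return "convergence"  # t > 0.75, as in step 6 of the table


for it in (0, 59, 60, 150, 151, 199):
    print(it, search_phase(it, max_iterations=200))
```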

### DPO vs Other Algorithms

| Feature | DPO | SA | GA | DE | PSO |
|---|:---:|:---:|:---:|:---:|:---:|
| Accepts worse moves | ✅ | ✅ | ❌ | ❌ | ❌ |
| Remembers degradation history | ✅ | ❌ | ❌ | ❌ | ❌ |
| Repays with overshoot | ✅ | ❌ | ❌ | ❌ | ❌ |
| Population-based | ✅ | ❌ | ✅ | ✅ | ✅ |
| Island model | ✅ | ❌ | Optional | ❌ | ❌ |
| Multi-objective (NSGA-II) | ✅ | ❌ | Optional | ❌ | ❌ |
| Adaptive parameters | ✅ | ❌ | ❌ | ❌ | Partial |

---

## CLI Entry Points

After installing, three CLI commands are available:

```bash
dpo-benchmark    # Run NAS example
dpo-optimize     # Run HPO example
dpo-routing      # Run pathfinding example
```

---

## API Summary

### Top-Level Functions (from `dpo`)

| Function | Purpose |
|---|---|
| `dpo(problem, config, preset)` | Universal one-call optimizer |
| `dpo_optimize(objective, bounds, names, preset)` | Continuous optimization shortcut |
| `dpo_solve_tsp(distance_matrix, preset)` | TSP solver shortcut |
| `dpo_solve_nas(estimator, constraints, preset)` | NAS solver shortcut |

### Core Classes

| Class | Import | Purpose |
|---|---|---|
| `DPO_Universal` | `dpo.core.universal` | High-level optimizer (recommended) |
| `DPO_Presets` | `dpo.core.universal` | Pre-tuned config factories |
| `DPO_Config` | `dpo.core.config` | All tunable parameters |
| `DPO_NAS` | `dpo.core.optimizer` | Low-level core engine |
| `EnsembleEstimator` | `dpo.evaluation.ensemble` | Default NAS evaluator |
| `AdvancedConstraintHandler` | `dpo.constraints.handler` | Constraint penalty handler |

### Problem Classes

| Class | Import | For |
|---|---|---|
| `Problem` | `dpo.core.problem` | Abstract base — implement for custom domains |
| `ContinuousOptimizationProblem` | `dpo.core.problem` | Real-valued optimization |
| `CombinatoricOptimizationProblem` | `dpo.core.problem` | Permutation/sequence optimization |
| `NASProblem` | `dpo.core.problem` | Neural architecture search |
| `HybridProblem` | `dpo.core.problem` | Mixed continuous + discrete |

### Solution Classes

| Class | Import | For |
|---|---|---|
| `NumericSolution` | `dpo.core.solution` | Real-valued vectors with bounds |
| `CombinatoricSolution` | `dpo.core.solution` | Permutations / sequences |
| `HybridSolution` | `dpo.core.solution` | Mixed numeric + combinatoric |

---

## Project Structure

```
dpo/
├── __init__.py              # Top-level API: dpo(), dpo_optimize(), dpo_solve_tsp(), dpo_solve_nas()
├── core/
│   ├── optimizer.py         # DPO_NAS — core algorithm (2200+ lines)
│   ├── universal.py         # DPO_Universal + DPO_Presets (recommended entry point)
│   ├── config.py            # DPO_Config dataclass (70+ tunable parameters)
│   ├── agent.py             # SearchAgent with debt tracking
│   ├── problem.py           # Problem ABC + 4 concrete problem classes
│   └── solution.py          # Solution ABC + 3 concrete solution classes
├── architecture/
│   └── gene.py              # ArchitectureGene (NAS-specific encoding)
├── evaluation/
│   ├── ensemble.py          # EnsembleEstimator + ProblemBasedEstimator
│   ├── estimators.py        # ZeroShotEstimator, SurrogateEstimator
│   └── cache.py             # LRU evaluation cache with noise injection
├── constraints/
│   └── handler.py           # AdvancedConstraintHandler (adaptive penalties)
├── utils/
│   ├── helpers.py           # save_json, load_json
│   └── logger.py            # get_logger
├── examples/                # 6 runnable example scripts
└── benchmarks/              # Comprehensive evaluation framework
```

---

## Runnable Examples

| Script | Description | Run Command |
|---|---|---|
| `example_nas.py` | NAS with a mock estimator | `python -m dpo.examples.example_nas` |
| `example_hpo.py` | 3 HPO scenarios | `python -m dpo.examples.example_hpo` |
| `example_tsp.py` | TSP (10 & 150 cities) | `python -m dpo.examples.example_tsp` |
| `example_pathfinding.py` | 2D grid pathfinding | `python -m dpo.examples.example_pathfinding` |
| `example_resource_allocation.py` | Cloud resource balancing | `python -m dpo.examples.example_resource_allocation` |
| `example_hybrid.py` | Mixed continuous + discrete | `python -m dpo.examples.example_hybrid` |

---

## Citation

If you use DPO in your research, please cite:

```bibtex
@software{dpo2026,
  author = {Arya H},
  title  = {DPO: Debt-Payment Optimization},
  year   = {2026},
  url    = {https://github.com/Arya1718/dpo}
}
```

---

## License

MIT License. See [LICENSE](LICENSE) for details.

## Author

**Arya H** — [arya.h1718@gmail.com](mailto:arya.h1718@gmail.com)

## Links

- **PyPI:** [pypi.org/project/dpo](https://pypi.org/project/dpo/)
- **GitHub:** [github.com/Arya1718/dpo](https://github.com/Arya1718/dpo)
- **Docs:** [dpo-nas.readthedocs.io](https://dpo-nas.readthedocs.io/)
- **Issues:** [github.com/Arya1718/dpo/issues](https://github.com/Arya1718/dpo/issues)
