Metadata-Version: 2.4
Name: basanos
Version: 0.4.3
Summary: Correlation-aware portfolio optimization and analytics for Python.
Project-URL: Homepage, https://github.com/Jebel-Quant/basanos
Project-URL: Repository, https://github.com/Jebel-Quant/basanos
Project-URL: Issues, https://github.com/Jebel-Quant/basanos/issues
License-File: LICENSE
Requires-Python: >=3.11
Requires-Dist: jinja2>=3.1
Requires-Dist: numpy<3,>=2.4
Requires-Dist: plotly<7,>=6.6.0
Requires-Dist: polars>=1.37.1
Requires-Dist: pydantic<3,>=2.12.5
Requires-Dist: scipy<2,>=1.17.0
Description-Content-Type: text/markdown

<div align="center">

# <img src="https://raw.githubusercontent.com/Jebel-Quant/rhiza/main/.rhiza/assets/rhiza-logo.svg" alt="Rhiza Logo" width="30" style="vertical-align: middle;"> Basanos

**Correlation-aware portfolio optimization and analytics for Python.**

![Synced with Rhiza](https://img.shields.io/badge/synced%20with-rhiza-2FA4A9?color=2FA4A9)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
[![Python versions](https://img.shields.io/badge/Python-3.11%20•%203.12%20•%203.13%20•%203.14-blue?logo=python)](https://www.python.org/)
[![CI](https://github.com/Jebel-Quant/basanos/actions/workflows/rhiza_ci.yml/badge.svg?event=push)](https://github.com/Jebel-Quant/basanos/actions/workflows/rhiza_ci.yml)
[![Coverage](https://raw.githubusercontent.com/Jebel-Quant/basanos/refs/heads/gh-pages/coverage-badge.svg)](https://jebel-quant.github.io/basanos/tests/html-coverage/index.html)
[![Code style: ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg?logo=ruff)](https://github.com/astral-sh/ruff)
[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv)
![Last Updated](https://img.shields.io/github/last-commit/jebel-quant/basanos/main?label=Last%20updated&color=blue)
[![Paper](https://img.shields.io/badge/paper-basanos.pdf-red?logo=adobeacrobatreader)](https://github.com/jebel-quant/basanos/blob/paper/basanos.pdf)

</div>

---

Basanos computes **correlation-adjusted risk positions** from price data and expected-return signals. It estimates time-varying EWMA correlations, applies shrinkage towards the identity matrix, and solves a normalized linear system per timestamp to produce stable, scale-invariant positions — implementing a first hurdle for expected returns.

## Table of Contents

- [Idea](#idea)
- [Features](#features)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Notebooks](#notebooks)
- [How It Works](#how-it-works)
- [Shrinkage Methodology](#shrinkage-methodology)
- [Performance Characteristics](#performance-characteristics)
- [API Reference](#api-reference)
- [Configuration Reference](#configuration-reference)
- [Development](#development)
- [License](#license)

## Idea

Most systematic strategies produce a raw signal vector μ — one number per asset indicating how bullish or bearish the model is. Sizing each position in direct proportion to its signal ignores the fact that correlated assets will receive large, overlapping bets in the same direction, concentrating risk rather than diversifying it.

Basanos treats position sizing as a **linear system**:

```
C · x = μ
```

where C is the (shrunk, time-varying) correlation matrix and μ is the signal. Solving for x inverts the correlation structure — assets that share a lot of co-movement with the rest of the portfolio receive smaller positions, while idiosyncratic assets can carry more. The result is a set of *risk positions* that express the full information in the signal while respecting the portfolio's correlation geometry.

Three design choices keep the output stable and usable in practice:

1. **EWMA estimates** — both volatility and correlations are computed as exponentially weighted moving averages, so the optimizer adapts to changing regimes without requiring a fixed lookback window.
2. **Identity shrinkage** — the estimated correlation matrix is blended toward the identity matrix. This regularises the solve, guards against noise in the off-diagonal entries, and prevents numerically extreme positions when the sample is small relative to the number of assets. Setting `cfg.shrink = 0` (full shrinkage, C = I) is a meaningful corner case: the system reduces to `x = μ`, i.e. signal-proportional sizing — which is also the solution a Markowitz optimizer produces when all assets are treated as uncorrelated.
3. **Scale invariance** — positions are normalised by the inverse-matrix norm of μ, so doubling the signal magnitude does not double the position. Sizing is driven instead by a running estimate of realised profit variance, which scales risk up in good regimes and down in bad ones.

The output of the solve is a *risk position* (units of volatility). Dividing by per-asset EWMA volatility converts it into a *cash position* — how many dollars to hold in each asset.

**Why not just run a full optimizer?** The primary use case for basanos is **signal assessment**, not production execution. A fully constrained Markowitz optimizer — with turnover limits, sector caps, leverage constraints, and factor neutrality targets — will bend positions away from what the signal actually implies. The resulting P&L reflects the interaction of the signal with all those constraints, making it hard to tell whether the underlying signal has edge. Basanos deliberately avoids hard constraints to give the signal room to express itself cleanly. By orthogonalizing μ to known risk factors before passing it in, you can further isolate the pure alpha component and measure how much of the return comes from the signal itself versus incidental factor exposures. This makes it a natural *first hurdle*: a signal that cannot generate a reasonable Sharpe through this minimal framework is unlikely to survive the additional friction of a production optimizer.

## Features

- **Correlation-Aware Optimization** — EWMA correlation estimation with shrinkage towards identity
- **Dynamic Risk Management** — Volatility-normalized positions with configurable clipping and variance scaling
- **Signal Evaluation** — IC and Rank IC time series, ICIR summary statistics, and a naïve equal-weight Sharpe benchmark to isolate signal skill
- **Diagnostic Properties** — Condition number, effective rank, solver residual, signal utilisation, risk position, and gross leverage for every timestamp
- **Factor Risk Model** — `FactorModel` decomposition Σ = B·F·Bᵀ + D fitted via truncated SVD for low-rank covariance inspection
- **Portfolio Analytics** — Sharpe, VaR, CVaR, drawdown, skew, kurtosis, and more
- **Performance Attribution** — Tilt/timing decomposition to isolate allocation vs. selection effects
- **Interactive Visualizations** — Plotly dashboards for NAV, drawdown, lead/lag analysis, and correlation heatmaps
- **Trading Cost Analysis** — Estimate the impact of one-way trading costs on Sharpe ratio across a configurable basis-point range
- **Config Reports** — Self-contained HTML report for `BasanosConfig` with parameter table, shrinkage guidance, and interactive lambda-sweep chart
- **HTML Reports** — One-call self-contained dark-themed HTML report with statistics tables and interactive Plotly charts
- **Polars-Native** — Built on Polars DataFrames for high-performance, memory-efficient computation

## Installation

```bash
pip install basanos
```

Or with [uv](https://github.com/astral-sh/uv):

```bash
uv add basanos
```

## Quick Start

### Portfolio Optimization

```python
import numpy as np
import polars as pl
from basanos.math import BasanosConfig, BasanosEngine

n_days = 100
dates = pl.date_range(
    pl.date(2023, 1, 1),
    pl.date(2023, 1, 1) + pl.duration(days=n_days - 1),
    eager=True,
)
rng = np.random.default_rng(42)

prices = pl.DataFrame({
    "date": dates,
    "AAPL":  100.0 + np.cumsum(rng.normal(0, 1.0, n_days)),
    "GOOGL": 150.0 + np.cumsum(rng.normal(0, 1.2, n_days)),
})

# Expected-return signals in [-1, 1] (e.g. from a forecasting model)
mu = pl.DataFrame({
    "date": dates,
    "AAPL":  np.tanh(rng.normal(0, 0.5, n_days)),
    "GOOGL": np.tanh(rng.normal(0, 0.5, n_days)),
})

cfg = BasanosConfig(
    vola=16,    # EWMA lookback for volatility (days)
    corr=32,    # EWMA lookback for correlation (days, must be >= vola)
    clip=3.5,   # Clipping threshold for vol-adjusted returns
    shrink=0.5, # Shrinkage intensity towards identity [0, 1]
    aum=1e6,    # Assets under management
)

engine    = BasanosEngine(prices=prices, mu=mu, cfg=cfg)
positions = engine.cash_position  # pl.DataFrame of optimized cash positions
portfolio = engine.portfolio      # Portfolio object for analytics
```

### Portfolio Analytics

```python
import numpy as np
import polars as pl
from basanos.analytics import Portfolio

n_days = 60
dates = pl.date_range(
    pl.date(2023, 1, 1),
    pl.date(2023, 1, 1) + pl.duration(days=n_days - 1),
    eager=True,
)
rng = np.random.default_rng(42)

prices = pl.DataFrame({
    "date": dates,
    "AAPL":  100.0 * np.cumprod(1 + rng.normal(0.001, 0.020, n_days)),
    "GOOGL": 150.0 * np.cumprod(1 + rng.normal(0.001, 0.025, n_days)),
})

positions = pl.DataFrame({
    "date": dates,
    "AAPL":  np.full(n_days, 10_000.0),
    "GOOGL": np.full(n_days, 15_000.0),
})

portfolio = Portfolio.from_cash_position(prices=prices, cash_position=positions, aum=1e6)

# Performance metrics
nav      = portfolio.nav_accumulated   # Cumulative additive NAV
returns  = portfolio.returns           # Daily returns scaled by AUM
drawdown = portfolio.drawdown          # Distance from high-water mark

# Statistics
stats  = portfolio.stats
sharpe = stats.sharpe()["returns"]
vol    = stats.volatility()["returns"]
```

### Visualizations

```python
fig = portfolio.plots.snapshot()                          # NAV + drawdown dashboard
fig = portfolio.plots.lead_lag_ir_plot(start=-10, end=20) # Sharpe across position lags
fig = portfolio.plots.lagged_performance_plot(lags=[0, 1, 2, 3, 4])
fig = portfolio.plots.correlation_heatmap()
# fig.show()
```

### Generating Reports

`portfolio.report` returns a `Report` facade that produces a self-contained, dark-themed HTML document with a performance-statistics table and multiple interactive Plotly charts.

```python
report = portfolio.report

# Render to a string (e.g. to serve via an API or display in a notebook)
html_str = report.to_html()

# Or save directly to disk — a .html extension is added automatically
saved_path = report.save("output/report")
# → saves to output/report.html

# Customize the page title
report.save("output/my_report.html", title="My Strategy Report")
```

The generated report contains the following sections:

| Section | Content |
|---------|---------|
| **Performance** | Cumulative NAV + drawdown snapshot |
| **Risk Analysis** | Rolling Sharpe ratio and rolling volatility charts |
| **Annual Breakdown** | Sharpe ratio by calendar year |
| **Monthly Returns** | Monthly returns heatmap |
| **Performance Statistics** | Full statistics table (returns, drawdown, risk-adjusted, distribution) |
| **Correlation Analysis** | Asset correlation heatmap |
| **Lead / Lag** | Lead/lag information ratio chart |
| **Turnover Summary** | Portfolio turnover metrics |
| **Trading Cost Impact** | Sharpe ratio vs. one-way trading cost (basis points) |

### Trading Cost Analysis

`Portfolio` exposes two methods for understanding how trading costs erode strategy edge:

```python
# Net-of-cost daily returns (5 bps one-way cost)
adj_returns = portfolio.cost_adjusted_returns(cost_bps=5)

# Sharpe ratio sweep from 0 to 20 bps
impact = portfolio.trading_cost_impact(max_bps=20)
# Returns a pl.DataFrame with columns: cost_bps (Int64), sharpe (Float64)

# Interactive Plotly chart — Sharpe vs cost
fig = portfolio.plots.trading_cost_impact_plot(max_bps=20)
# fig.show()
```

### Config Reports

`BasanosConfig` and `BasanosEngine` each expose a report property that produces a self-contained, dark-themed HTML document with:

- a **parameter table** (all fields, values, constraints, and descriptions),
- a **shrinkage guidance table** (n/T regime heuristics), and
- a **theory section** on Ledoit-Wolf shrinkage.

When accessed from `BasanosEngine`, the report additionally includes an **interactive lambda-sweep chart** — the annualised Sharpe ratio as the shrinkage parameter λ is swept across [0, 1].

```python
import numpy as np
import polars as pl
from basanos.math import BasanosConfig, BasanosEngine

cfg = BasanosConfig(vola=16, corr=32, clip=3.5, shrink=0.5, aum=1e6)

# Config-only report (no lambda sweep)
html_str = cfg.report.to_html()
cfg.report.save("output/config_report")  # → output/config_report.html

# Engine report (includes lambda-sweep chart)
n = 100
_dates = pl.date_range(pl.date(2023, 1, 1), pl.date(2023, 1, 1) + pl.duration(days=n - 1), eager=True)
_rng = np.random.default_rng(0)
_prices = pl.DataFrame({"date": _dates, "AAPL": 100.0 + np.cumsum(_rng.normal(0, 1.0, n)), "GOOGL": 150.0 + np.cumsum(_rng.normal(0, 1.2, n))})
_mu = pl.DataFrame({"date": _dates, "AAPL": np.tanh(_rng.normal(0, 0.5, n)), "GOOGL": np.tanh(_rng.normal(0, 0.5, n))})
cfg_engine = BasanosEngine(prices=_prices, mu=_mu, cfg=cfg)
cfg_engine.config_report.save("output/config_with_sweep")
```

## Notebooks

Three interactive [Marimo](https://marimo.io/) notebooks live under
`book/marimo/notebooks/`. They are self-contained — each embeds its own
dependency list ([PEP 723](https://peps.python.org/pep-0723/)), so `uv run`
installs everything automatically.

| Notebook | Description | Key concepts |
|---|---|---|
| [`demo.py`](book/marimo/notebooks/demo.py) | End-to-end interactive demo of the Basanos optimizer | Signal generation, correlation-aware position sizing, portfolio analytics, reactive UI |
| [`ewm_benchmark.py`](book/marimo/notebooks/ewm_benchmark.py) | Validates and benchmarks the NumPy/SciPy EWM correlation implementation against the legacy pandas version | EWM, `scipy.signal.lfilter`, NaN handling, performance comparison |
| [`shrinkage_guide.py`](book/marimo/notebooks/shrinkage_guide.py) | Theoretical and empirical guide to tuning the shrinkage parameter λ | Marchenko-Pastur law, linear shrinkage `C(λ) = λ·C_EWMA + (1−λ)·I`, Sharpe vs. λ sweep, turnover analysis |

### Running the notebooks

```bash
# Launch all notebooks in the Marimo editor (opens http://localhost:2718)
make marimo

# Open a single notebook for interactive editing
marimo edit book/marimo/notebooks/demo.py

# Run a single notebook read-only / presentation mode
marimo run book/marimo/notebooks/demo.py

# Self-contained via uv — no prior install needed
uv run book/marimo/notebooks/demo.py
uv run book/marimo/notebooks/ewm_benchmark.py
uv run book/marimo/notebooks/shrinkage_guide.py
```

**Prerequisites**: Python ≥ 3.11 and `uv`. Each notebook's dependencies
(marimo, basanos, numpy, polars, plotly, …) are resolved automatically by `uv`.
If you are editing source code alongside the notebook, run `make install` first
so the local package is available.

## How It Works

The optimizer implements a three-step pipeline per timestamp:

1. **Volatility adjustment** — Log returns are normalized by an EWMA volatility estimate and clipped at `cfg.clip` standard deviations to limit the influence of outliers.

2. **Correlation estimation** — An EWMA correlation matrix is computed from the vol-adjusted returns using a lookback of `cfg.corr` days. The matrix is shrunk toward the identity matrix with retention weight `cfg.shrink` (λ):

   ```
   C_shrunk = λ · C_ewma + (1 − λ) · I
   ```

   where λ = `cfg.shrink`. `λ = 1.0` uses the raw EWMA matrix; `λ = 0.0` replaces it with the identity (treating all assets as uncorrelated). See [Shrinkage Methodology](#shrinkage-methodology) below for guidance on choosing λ.

3. **Position solving** — For each timestamp, the system `C_shrunk · x = mu` is solved for `x` (the risk position vector). The solution is normalized by the inverse-matrix norm of `mu`, making positions scale-invariant with respect to signal magnitude. Positions are further scaled by a running profit-variance estimate to adapt risk dynamically.

Cash positions are obtained by dividing risk positions by per-asset EWMA volatility.

## Shrinkage Methodology

### Why shrink?

Sample correlation matrices estimated from *T* observations of *n* assets are
poorly conditioned when *n* is large relative to *T* — the classical
*curse of dimensionality*. The Marchenko–Pastur law shows that extreme
eigenvalues of the sample matrix are severely biased (small eigenvalues are
deflated, large ones are inflated), making the matrix difficult to invert
reliably. Linear shrinkage toward the identity corrects this by pulling all
eigenvalues toward a common value, improving the numerical condition of the
matrix and reducing out-of-sample estimation error.

Basanos uses **convex linear shrinkage** (Ledoit & Wolf, 2004):

```
C_shrunk = λ · C_ewma + (1 − λ) · I_n
```

This is a special case of the general Ledoit–Wolf framework where the
shrinkage target is the identity matrix and the retention weight λ is
treated as a user-controlled hyperparameter. Unlike the analytically optimal
Ledoit–Wolf or Oracle Approximating Shrinkage (OAS) estimators, Basanos uses
a fixed λ — appropriate for *regularising a linear solver* rather than
*estimating a covariance matrix*, where practical stability often matters
more than minimum Frobenius loss.

### How to choose `cfg.shrink` (= λ)

The key quantity is the **concentration ratio** *n / T*, where *n* = number
of assets and *T* = `cfg.corr` (the EWMA lookback).

| Regime | n / T ratio | Suggested λ | Rationale |
|--------|------------|-------------|-----------|
| Many assets, short lookback | > 0.5 | 0.3 – 0.5 | High noise; strong regularisation |
| Moderate assets and lookback | 0.1 – 0.5 | 0.5 – 0.7 | Balanced |
| Few assets, long lookback | < 0.1 | 0.7 – 0.9 | Well-conditioned sample; light regularisation |

A useful heuristic starting point is **λ ≈ 1 − n / (2·T)** (where *n* = number
of assets and *T* = `cfg.corr`), which approximates the Ledoit–Wolf optimal
intensity. Always validate on held-out data.

**Sensitivity notes:**

- Below λ ≈ 0.3 the matrix can become nearly singular for small portfolios
  (e.g., n > 10 with `corr` < 50), leading to numerically unstable positions.
- Above λ ≈ 0.8 the off-diagonal correlations are so heavily damped that the
  optimizer behaves almost as if all assets were independent.
- Shrinkage is most influential in the range λ ∈ [0.3, 0.8].

### Interactive demonstration

The `book/marimo/notebooks/shrinkage_guide.py` notebook shows the empirical
effect of different shrinkage levels on portfolio Sharpe ratio and position
stability for a realistic synthetic dataset.

### References

- Ledoit, O., & Wolf, M. (2004). *A well-conditioned estimator for
  large-dimensional covariance matrices.* Journal of Multivariate Analysis,
  88(2), 365–411. https://doi.org/10.1016/S0047-259X(03)00096-4
- Chen, Y., Wiesel, A., Eldar, Y. C., & Hero, A. O. (2010). *Shrinkage
  algorithms for MMSE covariance estimation.* IEEE Transactions on Signal
  Processing, 58(10), 5016–5029. https://doi.org/10.1109/TSP.2010.2053029
- Stein, C. (1956). *Inadmissibility of the usual estimator for the mean of a
  multivariate normal distribution.* Proceedings of the Third Berkeley
  Symposium, 1, 197–206.

## Performance Characteristics

> **TL;DR** — the optimizer is practical for ≤ 250 assets with ≤ 10 years of
> daily data on a 16 GB workstation.  Beyond those limits, memory or compute
> time becomes the bottleneck.

### Computational complexity

Let *N* = number of assets and *T* = number of timestamps.

| Step | Complexity | Bottleneck |
|------|-----------|------------|
| Vol-adjustment (`ret_adj`, `vola`) | O(T·N) | EWMA per asset; scales linearly |
| EWM correlation (`cor`) | **O(T·N²)** | `lfilter` over all N² pairs in parallel |
| Linear solve per row (`cash_position`) | **O(N³) × T solves** | Cholesky/LU decomposition per timestamp |

For most practical portfolio sizes (N ≤ 200) the correlation step dominates.
At very large N (≥ 500) the per-solve cost O(N³) can also become significant.

### Memory usage

`_ewm_corr_numpy` allocates roughly **14 float64 arrays** of shape `(T, N, N)`
simultaneously at peak (input sequences fed to `scipy.signal.lfilter`, the IIR
filter outputs, the five EWM component arrays, and the result tensor):

```
Peak RAM ≈ 14 × 8 × T × N²  bytes  ≈  112 × T × N²  bytes
```

Practical working sizes:

| N (assets) | T (daily rows) | Approx. history | Peak memory |
|-----------|---------------|-----------------|-------------|
| 50 | 252 | ~1 year | ~70 MB |
| 100 | 252 | ~1 year | ~280 MB |
| 100 | 1 260 | ~5 years | ~1.4 GB |
| 100 | 2 520 | ~10 years | **~2.8 GB** |
| 200 | 1 260 | ~5 years | ~5.6 GB |
| 200 | 2 520 | ~10 years | **~11 GB** |
| 500 | 2 520 | ~10 years | **~70 GB ⚠** |
| 1 000 | 2 520 | ~10 years | **~280 GB ⛔** |

### Practical limits

| Zone | Condition | Guidance |
|------|-----------|----------|
| ✅ Comfortable | N ≤ 150, T ≤ 1 260 (~5 yr daily) | Runs on an 8 GB laptop in seconds |
| ⚠ Feasible with care | N ≤ 250, T ≤ 2 520 (~10 yr daily) | Requires ~11–12 GB RAM; plan for 10–60 s wall time |
| 🔴 Impractical | N > 500 or T > 5 000 | Peak memory exceeds 16 GB; consider mitigation strategies below |
| ⛔ Not supported | N > 1 000 with multi-year history | Solve cost and memory are prohibitive on commodity hardware |

> **Note on `cfg.corr`** — this is the EWM lookback window, not the total
> dataset length.  Even if you have 10 years of prices, keeping `cfg.corr`
> short (e.g., 63 days) does *not* reduce the peak memory cost of
> `_ewm_corr_numpy`: the function always allocates the full `(T, N, N)` tensor
> regardless of the lookback value.  To limit memory, reduce the number of rows
> passed in *T* itself (e.g., trim old prices) rather than adjusting `cfg.corr`.

### Mitigation strategies

When you hit memory or performance limits:

1. **Reduce the asset universe** — keep only the most liquid or relevant assets;
   pre-filter with univariate signal strength before running the optimizer.
2. **Shorten the price history** — `_ewm_corr_numpy` processes every row; trim
   older data to the minimum needed for the EWM warm-up (`cfg.corr` rows).
3. **Increase `cfg.shrink` toward 1.0** — stronger identity shrinkage reduces
   the sensitivity of the solve to noisy off-diagonal entries, allowing a
   shorter effective lookback without instability.
4. **Process in rolling windows** — run the optimizer on overlapping windows
   (e.g., 1-year chunks) and stitch results; correlation estimates will differ
   slightly at window boundaries but memory stays bounded.
5. **Use `cor_tensor` instead of `cor`** — returns a single `(T, N, N)` NumPy
   array rather than a Python dict, avoiding Python object overhead for large T.

### Benchmark data

Measured on a GitHub Actions runner (AMD EPYC 7763, 4 vCPUs, Python 3.12):

| Dataset | `cor` time | `cash_position` time |
|---------|-----------|---------------------|
| 5 assets, 252 rows (~1 yr) | 1.2 ms | 56 ms |
| 5 assets, 1 260 rows (~5 yr) | 5.4 ms | 222 ms |
| 20 assets, 252 rows (~1 yr) | 13.6 ms | — |

See [`BENCHMARKS.md`](BENCHMARKS.md) for full results and regression baselines.

## API Reference

### `basanos.math`

```python
from basanos.math import BasanosConfig, BasanosEngine, FactorModel
```

| Class | Description |
|-------|-------------|
| `BasanosConfig` | Immutable configuration (Pydantic model) |
| `BasanosEngine` | Core optimizer; produces positions and a `Portfolio` |
| `FactorModel` | Factor risk model decomposition Σ = B·F·Bᵀ + D |

**`BasanosEngine` properties**

| Property | Returns | Description |
|----------|---------|-------------|
| `assets` | `list[str]` | Numeric asset column names |
| `ret_adj` | `pl.DataFrame` | Vol-adjusted, clipped log returns |
| `vola` | `pl.DataFrame` | Per-asset EWMA volatility |
| `cor` | `dict[date, np.ndarray]` | EWMA correlation matrices keyed by date |
| `cor_tensor` | `np.ndarray` | All correlation matrices stacked as a `(T, N, N)` tensor; supports `.npy` round-trip |
| `cash_position` | `pl.DataFrame` | Optimized cash positions (risk divided by EWMA volatility) |
| `position_status` | `pl.DataFrame` | Per-row reason code for cash_position: `warmup`, `zero_signal`, `degenerate`, or `valid` |
| `risk_position` | `pl.DataFrame` | Risk positions before volatility scaling (= `cash_position × vola`) |
| `position_leverage` | `pl.DataFrame` | L1 norm of cash positions (gross leverage) per timestamp |
| `condition_number` | `pl.DataFrame` | Condition number κ of the shrunk correlation matrix per timestamp |
| `effective_rank` | `pl.DataFrame` | Entropy-based effective rank of the shrunk correlation matrix per timestamp |
| `solver_residual` | `pl.DataFrame` | Euclidean residual norm ‖C_shrunk·x − μ‖₂ per timestamp |
| `signal_utilisation` | `pl.DataFrame` | Per-asset fraction of μ surviving the correlation filter; 1 when C = I |
| `ic` | `pl.DataFrame` | Cross-sectional Pearson IC time series (signal vs. one-period forward return) |
| `rank_ic` | `pl.DataFrame` | Cross-sectional Spearman Rank IC time series |
| `ic_mean` | `float` | Mean IC across all timestamps (NaN-ignoring) |
| `ic_std` | `float` | Sample standard deviation of IC |
| `icir` | `float` | IC Information Ratio (`ic_mean / ic_std`) |
| `rank_ic_mean` | `float` | Mean Rank IC across all timestamps |
| `rank_ic_std` | `float` | Sample standard deviation of Rank IC |
| `naive_sharpe` | `float` | Sharpe ratio of the naïve equal-weight signal (μ = 1 benchmark) |
| `portfolio` | `Portfolio` | Ready-to-use portfolio for analytics |
| `config_report` | `ConfigReport` | HTML report with lambda-sweep chart + parameter table |

**`BasanosEngine` methods**

| Method | Signature | Description |
|--------|-----------|-------------|
| `sharpe_at_shrink` | `sharpe_at_shrink(shrink: float) → float` | Annualised Sharpe ratio for a given shrinkage weight λ ∈ [0, 1]; use for lambda sweeps |

**`BasanosConfig` properties**

| Property | Returns | Description |
|----------|---------|-------------|
| `report` | `ConfigReport` | HTML report with parameter table, shrinkage guidance, and theory (no lambda sweep) |

---

### `basanos.analytics`

```python
from basanos.analytics import Portfolio
```

| Class | Description |
|-------|-------------|
| `Portfolio` | Central data model for P&L, NAV, and attribution |
| `Stats` | Statistical risk/return metrics |
| `Plots` | Plotly-based interactive visualizations |
| `Report` | HTML report facade; produces self-contained dark-themed reports |

**`Portfolio` properties**

| Property | Description |
|----------|-------------|
| `profits` | Per-asset daily P&L |
| `profit` | Aggregate daily portfolio profit |
| `nav_accumulated` | Cumulative additive NAV |
| `nav_compounded` | Compounded NAV |
| `returns` | Daily returns scaled by AUM |
| `monthly` | Monthly compounded returns |
| `highwater` | Running NAV maximum |
| `drawdown` | Drawdown from high-water mark |
| `tilt` | Static allocation (average position) |
| `timing` | Dynamic timing (deviation from average) |
| `turnover` | Daily one-way turnover as a fraction of AUM |
| `turnover_weekly` | Weekly-aggregated turnover |
| `stats` | `Stats` instance |
| `plots` | `Plots` instance |
| `report` | `Report` instance for HTML report generation |

**`Portfolio` methods (trading costs)**

| Method | Description |
|--------|-------------|
| `cost_adjusted_returns(cost_bps)` | Daily returns net of estimated one-way trading costs (in bps) |
| `trading_cost_impact(max_bps=20)` | Sharpe ratio at each integer cost level 0 … max_bps (returns `pl.DataFrame`) |
| `turnover_summary()` | Summary DataFrame: mean daily/weekly turnover and turnover std |

**`Stats` methods**

| Method | Description |
|--------|-------------|
| `sharpe(periods)` | Annualized Sharpe ratio |
| `volatility(periods, annualize)` | Standard deviation of returns |
| `rolling_sharpe(window, periods)` | Rolling annualised Sharpe ratio time series |
| `rolling_volatility(window, periods, annualize)` | Rolling volatility time series |
| `annual_breakdown()` | Full summary statistics broken down by calendar year |
| `skew()` | Skewness |
| `kurtosis()` | Excess kurtosis |
| `value_at_risk(alpha, sigma)` | Parametric VaR |
| `conditional_value_at_risk(alpha, sigma)` | Expected shortfall (CVaR) |
| `avg_return()` | Mean return (zeros excluded) |
| `avg_win()` | Mean positive return |
| `avg_loss()` | Mean negative return |
| `win_rate()` | Fraction of profitable periods |
| `profit_factor()` | Gross wins / absolute gross losses |
| `payoff_ratio()` | Average win / absolute average loss |
| `monthly_win_rate()` | Fraction of profitable calendar months |
| `best()` | Maximum single-period return |
| `worst()` | Minimum single-period return |
| `worst_n_periods(n)` | *N* worst return periods (default 5) |
| `up_capture(benchmark)` | Up-market capture ratio vs. benchmark |
| `down_capture(benchmark)` | Down-market capture ratio vs. benchmark |
| `max_drawdown()` | Largest peak-to-trough decline as a fraction of peak |
| `avg_drawdown()` | Mean drawdown across all underwater periods |
| `max_drawdown_duration()` | Longest consecutive underwater period (calendar days) |
| `calmar(periods)` | Annualized return divided by max drawdown |
| `recovery_factor()` | Total return divided by max drawdown |

**`Plots` methods (rolling & sub-period)**

| Method | Description |
|--------|-------------|
| `rolling_sharpe_plot(window)` | Line chart of rolling Sharpe ratio |
| `rolling_volatility_plot(window)` | Line chart of rolling annualised volatility |
| `annual_sharpe_plot()` | Bar chart of Sharpe ratio by calendar year |
| `trading_cost_impact_plot(max_bps=20)` | Line chart of Sharpe ratio vs. one-way trading cost (0–max_bps bps) |

---

### `basanos.math.FactorModel`

A frozen dataclass representing a factor risk model decomposition:

```
Σ = B · F · Bᵀ + D
```

where **B** (n×k) is the factor loading matrix, **F** (k×k) is the factor covariance matrix, and **D** is the diagonal idiosyncratic variance.

```python
from basanos.math import FactorModel
import numpy as np

# Fit from a return matrix (T×n) using truncated SVD
returns = np.random.default_rng(0).normal(size=(200, 5))
fm = FactorModel.from_returns(returns, k=2)

fm.n_assets    # 5
fm.n_factors   # 2
fm.covariance  # reconstructed (5, 5) covariance matrix
```

**`FactorModel` attributes**

| Attribute | Shape | Description |
|-----------|-------|-------------|
| `factor_loadings` | `(n, k)` | Factor loading matrix **B**; column *j* gives each asset's sensitivity to factor *j* |
| `factor_covariance` | `(k, k)` | Factor covariance matrix **F** (positive definite) |
| `idiosyncratic_var` | `(n,)` | Per-asset idiosyncratic variance (strictly positive) |

**`FactorModel` properties and methods**

| Name | Returns | Description |
|------|---------|-------------|
| `n_assets` | `int` | Number of assets *n* |
| `n_factors` | `int` | Number of factors *k* |
| `covariance` | `np.ndarray (n, n)` | Reconstructed full covariance matrix Σ = B·F·Bᵀ + D |
| `from_returns(returns, k)` | `FactorModel` | Class method; fits the model from a `(T, n)` return matrix via truncated SVD |

---

### `basanos.math.ConfigReport`

Accessed via `cfg.report` (config-only) or `engine.config_report` (includes lambda sweep).

```python
from basanos.math import BasanosConfig, BasanosEngine

cfg    = BasanosConfig(vola=16, corr=32, clip=3.5, shrink=0.5, aum=1e6)
engine = BasanosEngine(prices=_prices, mu=_mu, cfg=cfg)

cfg.report.to_html()          # parameter table + shrinkage guidance + theory
engine.config_report.to_html()  # above + interactive lambda-sweep chart
```

**`ConfigReport` methods**

| Method | Signature | Description |
|--------|-----------|-------------|
| `to_html` | `to_html(title="Basanos Config Report") -> str` | Returns the complete HTML document as a string. |
| `save` | `save(path, title="Basanos Config Report") -> Path` | Writes the HTML document to *path*. A `.html` suffix is appended when missing. |

---

### `basanos.analytics.Report`

Accessed via `portfolio.report`.  Produces a self-contained HTML document with
dark-themed styling, a statistics table, and embedded interactive Plotly charts.

**`Report` methods**

| Method | Signature | Description |
|--------|-----------|-------------|
| `to_html` | `to_html(title="Basanos Portfolio Report") -> str` | Returns the complete HTML document as a string.  Plotly.js is loaded once from the CDN. |
| `save` | `save(path, title="Basanos Portfolio Report") -> Path` | Writes the HTML document to *path*.  A `.html` suffix is appended when the path has no extension. |

## Configuration Reference

**Required parameters**

| Parameter | Type | Constraint | Description |
|-----------|------|------------|-------------|
| `vola` | `int` | `> 0` | EWMA lookback for volatility (days) |
| `corr` | `int` | `>= vola` | EWMA lookback for correlation (days) |
| `clip` | `float` | `> 0` | Clipping threshold for vol-adjusted returns |
| `shrink` | `float` | `[0, 1]` | Shrinkage intensity — `0` = identity (no correlation adj.), `1` = raw EWMA |
| `aum` | `float` | `> 0` | Assets under management for position scaling |

**Optional parameters** (sensible defaults for most use cases)

| Parameter | Type | Default | Constraint | Description |
|-----------|------|---------|------------|-------------|
| `profit_variance_init` | `float` | `1.0` | `> 0` | Initial value for the profit-variance EMA |
| `profit_variance_decay` | `float` | `0.99` | `(0, 1)` | EMA decay factor λ for realised P&L variance; default gives ~69-period half-life |
| `denom_tol` | `float` | `1e-12` | `> 0` | Minimum normalisation denominator; positions are zeroed at or below this threshold |
| `position_scale` | `float` | `1e6` | `> 0` | Multiplicative factor applied to dimensionless risk positions to obtain base-currency cash positions |
| `min_corr_denom` | `float` | `1e-14` | `> 0` | Guard threshold for the EWMA correlation denominator; correlations below this are set to NaN |
| `max_nan_fraction` | `float` | `0.9` | `(0, 1)` | Maximum tolerated fraction of null values in any asset price column before raising `ExcessiveNullsError` |

```python
from basanos.math import BasanosConfig

# Conservative — longer lookbacks, stronger shrinkage
conservative = BasanosConfig(vola=32, corr=64, clip=3.0, shrink=0.7, aum=1e6)

# Responsive — shorter lookbacks, lighter shrinkage
responsive   = BasanosConfig(vola=8,  corr=16, clip=4.0, shrink=0.3, aum=1e6)
```

## Development

```bash
git clone https://github.com/Jebel-Quant/basanos.git
cd basanos
uv sync
```

| Command | Action |
|---------|--------|
| `make test` | Run the test suite |
| `make fmt` | Format and lint with ruff |
| `make typecheck` | Static type checking |
| `make deptry` | Audit declared dependencies |

Before submitting a PR, ensure all checks pass:

```bash
make fmt && make test && make typecheck
```

For architectural decisions and their rationale, see [docs/adr/](docs/adr/).

## License

See [LICENSE](LICENSE) for details.
