Metadata-Version: 2.4
Name: alphapurify
Version: 0.1.2
Summary: High-performance quantitative factor analysis and purification toolkit
Author-email: Elias Wu <elaiswu71@gmail.com>
License-Expression: MIT
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE.txt
Requires-Dist: pandas
Requires-Dist: polars
Requires-Dist: duckdb
Requires-Dist: plotly
Requires-Dist: numpy
Requires-Dist: scipy
Requires-Dist: pyarrow
Requires-Dist: joblib
Requires-Dist: scikit-learn
Requires-Dist: tqdm
Dynamic: license-file

# AlphaPurify

**AlphaPurify** is a high-performance quantitative factor analysis and purification toolkit designed for institutional-grade research workflows.

It provides a modular, vectorized, multiprocessing-enabled framework for factor cleaning, evaluation, exposure decomposition, and portfolio attribution. It is built on Polars and targets large-scale cross-sectional datasets.

---

## 🚀 Key Features

- ⚡ **High Performance**
  - Almost entirely vectorized architecture powered by Polars
  - Optimized for large-scale cross-sectional panel data
  - Memory-efficient structural safeguards

- 🧩 **Fully Modular Design**
  - Each module can be used independently
  - Seamlessly integrated into custom research pipelines
  - Minimal coupling between components

- 📊 **Comprehensive Factor Research Engine**
  - Cross-sectional IC analysis
  - Horizon autocorrelation
  - Quantile portfolio backtesting
  - Turnover measurement
  - Industry-level attribution
  - Long–short, long-only, and short-only evaluation

- 🧪 **Advanced Factor Cleaning Toolkit**
  - 40+ preprocessing techniques
  - Robust winsorization
  - Regression-based neutralization
  - Polynomial & robust regression options
  - Advanced standardization methods

- 📈 **Exposure & Return Attribution**
  - Systematic exposure decomposition
  - Residual alpha estimation
  - Cumulative attribution curves
  - Interactive Plotly visualizations

- 🕒 **Frequency-Agnostic**
  - Supports intraday, daily, weekly, and high-frequency datasets
  - No structural modifications required

- 🛡 **Look-Ahead Bias Protection**
  - Forward return construction safeguards
  - Rebalancing alignment protection
  - Parameter-level anti-leakage controls
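
To make the cleaning and evaluation features above concrete, here is a minimal NumPy sketch of MAD winsorization, regression-based neutralization, z-score standardization, and a rank IC on a single cross-section. It is an illustration only, not AlphaPurify's implementation: the function names (`winsorize_mad`, `neutralize`, `zscore`, `rank_ic`), the `n_mads` parameter, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def winsorize_mad(x: np.ndarray, n_mads: float = 3.0) -> np.ndarray:
    """Clip values to median +/- n_mads * scaled MAD (hypothetical helper)."""
    med = np.median(x)
    # 1.4826 rescales the MAD so it estimates the standard deviation
    # when the data are normally distributed.
    mad = 1.4826 * np.median(np.abs(x - med))
    return np.clip(x, med - n_mads * mad, med + n_mads * mad)


def neutralize(x: np.ndarray, exposures: np.ndarray) -> np.ndarray:
    """Return OLS residuals of x on an intercept plus the exposures."""
    design = np.column_stack([np.ones(len(x)), exposures])
    beta, *_ = np.linalg.lstsq(design, x, rcond=None)
    return x - design @ beta


def zscore(x: np.ndarray) -> np.ndarray:
    """Standardize to zero mean and unit sample standard deviation."""
    return (x - x.mean()) / x.std(ddof=1)


def rank_ic(factor: np.ndarray, fwd_ret: np.ndarray) -> float:
    """Pearson correlation of ranks (Spearman-style IC).

    Ranks come from a double argsort, so ties are broken by position
    rather than averaged; this is exact only when there are no ties.
    """
    ranks_f = np.argsort(np.argsort(factor))
    ranks_r = np.argsort(np.argsort(fwd_ret))
    return float(np.corrcoef(ranks_f, ranks_r)[0, 1])


# One synthetic cross-section of 200 assets (made-up data).
rng = np.random.default_rng(0)
size = rng.normal(size=200)
raw = 0.5 * size + rng.normal(size=200)
raw[0] = 50.0  # inject an outlier for winsorization to clip
fwd_ret = 0.1 * raw + rng.normal(size=200)

clean = zscore(neutralize(winsorize_mad(raw), size))
ic = rank_ic(clean, fwd_ret)
```

After these steps `clean` has zero mean, unit standard deviation, and zero linear correlation with `size`. In a real study the same steps would run once per date, and the per-date IC values would be averaged over time.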

---

## 📦 Installation

```bash
pip install alphapurify
```

## 📊 Example Workflow

```python
from alphapurify import AlphaPurifier, FactorAnalyzer

# Load your DataFrame
df = ...

# Clean factor
cleaned = (
    AlphaPurifier(df, factor_col="alpha")
    .winsorize(method="mad")
    .neutralize(neutralizer_cols=["size", "industry"])
    .standardize(method="zscore")
    .to_result()
)
```
