Metadata-Version: 2.4
Name: winnow-ml
Version: 1.0.0
Summary: An intelligent feature selection library for Machine Learning
Author-email: Your Name <your.email@example.com>
Project-URL: Homepage, https://github.com/yourusername/winnow
Project-URL: Bug Tracker, https://github.com/yourusername/winnow/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.21.0
Requires-Dist: pandas>=1.3.0
Requires-Dist: scikit-learn>=1.0.0
Requires-Dist: shap>=0.40.0
Dynamic: license-file

# 🌾 Winnow

[![Python Version](https://img.shields.io/badge/python-3.8%2B-blue.svg)](https://www.python.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](http://makeapullrequest.com)
[![Maintainability](https://img.shields.io/badge/maintainability-A-brightgreen.svg)](https://github.com/)

**The Intelligent Feature Selection Library for Machine Learning**

Winnow is a modular Python library for separating the signal from the noise in your data. It provides classic statistical criteria alongside more advanced selection techniques such as BorutaSHAP and mRMR, all wrapped in a clean, developer-friendly API.

---

## 🎯 Why Winnow?

In data science, more data isn't always better. Irrelevant or redundant features can lead to overfitting, slow training times, and poor model interpretability. Winnow helps you "winnow the wheat from the chaff," leaving you with only the most impactful features.

### ✨ Key Features

*   **⚡ Base Selection**: Correlation-based pruning, variance thresholding, and target relevancy.
*   **🧠 Advanced Techniques**: 
    *   **mRMR**: Minimum Redundancy Maximum Relevance for diverse feature sets.
    *   **BorutaSHAP**: Rigorous statistical selection using shadow features and SHAP values.
    *   **PCA Contribution**: Filter features based on their influence on principal components.
*   **🛠️ Custom Logic**: Easily register your own selection criteria into the pipeline.
*   **📊 Transparency**: Comprehensive reporting of why each feature was kept or removed.
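To make the base selection ideas concrete, here is a minimal NumPy sketch of variance thresholding followed by correlation-based pruning. It illustrates the general techniques only and is not Winnow's implementation; the helper names `drop_low_variance` and `drop_correlated` are hypothetical, and the 0.85 threshold simply mirrors the `correlation_threshold` value used in the Quick Start.

```python
import numpy as np


def drop_low_variance(X, names, threshold=0.0):
    """Keep columns whose variance is strictly greater than `threshold`."""
    keep = X.var(axis=0) > threshold
    return X[:, keep], [n for n, k in zip(names, keep) if k]


def drop_correlated(X, names, threshold=0.85):
    """Greedily keep a column only if its |Pearson r| with every
    previously kept column is at most `threshold`."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return X[:, kept], [names[j] for j in kept]


# Synthetic data: one near-duplicate column and one constant column.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([
    a,
    2.0 * a + rng.normal(scale=0.01, size=200),  # near-duplicate of "a"
    b,
    np.ones(200),                                # zero variance
])
names = ["a", "a_copy", "b", "constant"]

# Remove constant columns first: their correlation is undefined (NaN).
X1, names1 = drop_low_variance(X, names)
X2, names2 = drop_correlated(X1, names1, threshold=0.85)
print(names2)
```

In this sketch the order of columns decides which member of a correlated pair survives; a real selector might instead keep the one more relevant to the target.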

---

## 📦 Installation

```bash
# Run from the root of a cloned repository to install the dependencies
pip install -r requirements.txt
```

---

## 🚀 Quick Start

### Basic Selection

```python
import pandas as pd
from winnow import BaseSelector

# Load your data
df = pd.read_csv("data.csv")

# Initialize and transform
selector = BaseSelector(correlation_threshold=0.85)
result = selector.fit_transform(df, target_column='target')

print(f"Selected features: {result.selected_features}")
```

### Advanced Selection (Boruta + mRMR)

```python
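import pandas as pd
from winnow import AdvancedSelector

# Load your data and split it into features and target
df = pd.read_csv("data.csv")
X = df.drop(columns=["target"])
y = df["target"]

selector = AdvancedSelector(
    use_borutashap=True,
    n_boruta_trials=20,
    use_mrmr=True,
    n_mrmr_features=10
)

result = selector.fit_transform(X, y)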
```
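
For intuition about what an mRMR step does, the following is a simplified greedy sketch: at each round it picks the feature with the highest relevance to the target minus its mean redundancy with the features already chosen. It is not Winnow's implementation. The function name `mrmr_sketch` is hypothetical, and absolute Pearson correlation stands in for mutual information, which is the more common choice for both terms.

```python
import numpy as np


def mrmr_sketch(X, y, n_features):
    """Greedy mRMR using |Pearson r| for both relevance and redundancy."""
    n_cols = X.shape[1]
    # Relevance: how strongly each feature correlates with the target.
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_cols)]
    )
    # Redundancy: pairwise correlation between features.
    redundancy = np.abs(np.corrcoef(X, rowvar=False))

    selected = [int(np.argmax(relevance))]
    while len(selected) < min(n_features, n_cols):
        remaining = [j for j in range(n_cols) if j not in selected]
        scores = [
            relevance[j] - redundancy[j, selected].mean() for j in remaining
        ]
        selected.append(remaining[int(np.argmax(scores))])
    return selected


# Synthetic data: x1 is a noisy copy of x0, x2 carries independent signal.
rng = np.random.default_rng(0)
x0 = rng.normal(size=1000)
x1 = x0 + 0.5 * rng.normal(size=1000)
x2 = rng.normal(size=1000)
X = np.column_stack([x0, x1, x2])
y = x0 + 0.8 * x2

picked = mrmr_sketch(X, y, n_features=2)
print(picked)
```

Although `x1` is individually more relevant to `y` than `x2`, the redundancy penalty steers the second pick toward `x2`, which is the behaviour that gives mRMR its diverse feature sets.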

---

## 📁 Project Structure

```
winnow/
├── winnow/                   # Core source code
│   ├── base_selector.py      # Core selection logic
│   ├── advanced_selector.py  # Boruta, mRMR, and PCA
│   └── logger.py             # Centralized logging
├── examples/                 # Real-world usage examples
├── tests/                    # Robust test suite
└── README.md
```

---

## 📄 License

Winnow is released under the **MIT License**.
