Metadata-Version: 2.4
Name: proxai-ms
Version: 0.1.3
Summary: Mass spectrometry machine learning utilities derived from ProXAI notebooks.
Author: Benjamin Nouri Nigjeh
License: MIT
Keywords: mass spectrometry,proteomics,machine learning,saliency,tensorflow
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24
Requires-Dist: pandas>=2.0
Requires-Dist: matplotlib>=3.8
Provides-Extra: ml
Requires-Dist: tensorflow>=2.14; extra == "ml"
Requires-Dist: scikit-learn>=1.3; extra == "ml"
Requires-Dist: scipy>=1.11; extra == "ml"
Provides-Extra: raw
Requires-Dist: h5py>=3.10; extra == "raw"
Requires-Dist: tqdm>=4.66; extra == "raw"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: ruff>=0.4; extra == "dev"
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"
Requires-Dist: jupyter>=1.0; extra == "dev"
Requires-Dist: nbformat>=5.10; extra == "dev"
Dynamic: license-file

<p align="center">
  <img src="https://raw.githubusercontent.com/benjaminnigjeh/proxai-ms/main/assets/logo.png" alt="ProXAI-MS Logo" width="220"/>
</p>

<h1 align="center">ProXAI-MS</h1>

<p align="center">
  Transforming MS1 Data into Learnable Representations<br>
  via Gradient-Based Pseudo-MS1 Spectra
</p>

<p align="center">
  <img src="https://img.shields.io/badge/python-3.10+-blue"/>
  <img src="https://img.shields.io/badge/status-active-success"/>
  <img src="https://img.shields.io/badge/license-MIT-green"/>
</p>

---

## 🚀 Overview

**ProXAI-MS** is a machine learning framework that converts MS1 data into **interpretable pseudo-MS1 spectra** using gradient-based saliency mapping.

It enables:
- Binary classification (control vs experiment)
- Gradient-based feature attribution
- Reconstruction of spectra from learned signal importance

---

## 🔥 Key Features

- Train ML models on binned MS1 data  
- Gradient-based explainability  
- Separate **positive vs negative gradients**  
- Convert gradients → **pseudo-MS1 spectra**  
- Supports flexible dataset formats (long or wide)  
- CLI + Python API  

---

## 📦 Installation

### From PyPI

```bash
pip install proxai-ms
```

The package metadata also declares optional extras: `ml` (TensorFlow, scikit-learn, SciPy), `raw` (h5py, tqdm), and `dev`. Install one with, for example, `pip install "proxai-ms[ml]"`.

### From source

```bash
git clone https://github.com/benjaminnigjeh/proxai-ms.git
cd proxai-ms
pip install -e .
```

---

## ⚙️ CLI Usage

Run the full pipeline:

```bash
proxai-ms run \
  --csv "F:\20251110\dataset_rt.csv" \
  --label-column target \
  --bin-column bin \
  --bin-values 15 \
  --control-labels 0 1 2 \
  --experiment-labels 3 4 \
  --out-prefix "F:\20261110\proxai_test"
```

### 🔹 Arguments

- `--csv` → Input dataset  
- `--label-column` → Label column  
- `--bin-column` → Column for bin grouping  
- `--bin-values` → Number of bins per sample  
- `--control-labels` → Control group labels  
- `--experiment-labels` → Experiment group labels  
- `--out-prefix` → Output prefix  

---

### 📤 Outputs

- `<prefix>_pseudo_ms1_positive.csv`  
- `<prefix>_pseudo_ms1_negative.csv`  
- `<prefix>_pseudo_ms1_plot.png`  

---

## 🧪 Python API

Run the full ProXAI pipeline directly in Python:

```python
from proxai_ms import run_pipeline

result = run_pipeline(
    csv_path="dataset.csv",
    label_column="target",
    bin_column="bin",
    bin_values=15,
    control_labels=[0, 1, 2],
    experiment_labels=[3, 4],
)
```

---

## 📊 Input Format

Supports:

### Wide format
Rows = samples, columns = m/z bins

### Long format
- bin column (grouping index)
- intensity values
- label column
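
To make the relationship between the two layouts concrete, here is a minimal pandas sketch that pivots a long table into the wide layout. The column names `sample`, `bin`, `intensity`, and `target` and the toy values are illustrative assumptions, not names that ProXAI-MS requires, and the package may handle this conversion internally in a different way.

```python
import pandas as pd

# Hypothetical long-format table: one row per (sample, bin) pair.
# All column names and values here are illustrative assumptions.
long_df = pd.DataFrame(
    {
        "sample": ["s1", "s1", "s1", "s2", "s2", "s2"],
        "bin": [0, 1, 2, 0, 1, 2],
        "intensity": [10.0, 0.0, 5.0, 2.0, 8.0, 1.0],
        "target": [0, 0, 0, 3, 3, 3],
    }
)

# Pivot to wide format: rows = samples, columns = bins.
# Duplicate (sample, bin) pairs are summed; missing pairs become 0.
wide_df = long_df.pivot_table(
    index="sample",
    columns="bin",
    values="intensity",
    aggfunc="sum",
    fill_value=0.0,
)

# One label per sample (assumes the label is constant within a sample).
labels = long_df.groupby("sample")["target"].first()

print(wide_df)
print(labels)
```

Summing duplicates is only one possible choice; a mean or max may suit some acquisitions better.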

---

## 🔬 Core Concept

ProXAI learns signal importance via gradients:

- Positive gradients → experiment signal  
- Negative gradients → control signal  

These are mapped back to spectral space to form:

> **Pseudo-MS1 = learned biochemical representation**

Unlike traditional pipelines:
- No peak picking required  
- No manual feature engineering  
- Fully data-driven representation learning  
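
The idea can be sketched on toy data with plain NumPy. This is not the ProXAI-MS implementation: it assumes a simple logistic-regression classifier in place of the package's model, uses synthetic binned intensities, and every name in it is hypothetical. It shows only the general pattern of taking the gradient of the predicted probability with respect to each input bin and splitting it by sign.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 20 control and 20 experiment samples, 6 bins each.
# Bin 1 is elevated in the experiment group, bin 4 in the control group.
n_per_group, n_bins = 20, 6
X_control = rng.normal(1.0, 0.1, size=(n_per_group, n_bins))
X_experiment = rng.normal(1.0, 0.1, size=(n_per_group, n_bins))
X_control[:, 4] += 2.0
X_experiment[:, 1] += 2.0
X = np.vstack([X_control, X_experiment])
y = np.concatenate([np.zeros(n_per_group), np.ones(n_per_group)])


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Fit logistic regression with plain gradient descent (stand-in model).
w = np.zeros(n_bins)
b = 0.0
for _ in range(2000):
    p = sigmoid(X @ w + b)
    w -= 0.1 * (X.T @ (p - y)) / len(y)
    b -= 0.1 * np.mean(p - y)

# Per-sample gradient of the predicted probability w.r.t. each bin:
# d p / d x_j = p * (1 - p) * w_j
p = sigmoid(X @ w + b)
grads = (p * (1.0 - p))[:, None] * w[None, :]

# Split by sign first, then aggregate across samples.
positive = np.maximum(grads, 0.0).mean(axis=0)  # experiment-associated bins
negative = np.minimum(grads, 0.0).mean(axis=0)  # control-associated bins

print("positive pseudo-spectrum:", np.round(positive, 4))
print("negative pseudo-spectrum:", np.round(negative, 4))
```

In this sketch the positive profile peaks at the bin enriched in the experiment group and the negative profile at the bin enriched in the control group, which is the behaviour the two bullet points above describe.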

---

## 🧱 Project Structure

proxai-ms/

├── assets/  
├── src/proxai_ms/  
│   ├── training.py  
│   ├── explain.py  
│   ├── pipeline.py  
│   └── cli.py  
├── notebooks/  
├── docs/  
├── scripts/  
└── pyproject.toml  

---

## ⚠️ Important Notes

- Disable normalization if gradients collapse to zero  
- Avoid averaging signed gradients directly across samples, since opposite-signed values can cancel  
- Always separate positive and negative gradients before aggregation  
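
A small numeric illustration of why sign separation matters, using a hypothetical gradient matrix (rows are samples, columns are bins). The values are made up for the example and do not come from ProXAI-MS output.

```python
import numpy as np

# Hypothetical per-sample gradients for three bins.
grads = np.array(
    [
        [0.8, -0.5, 0.1],
        [-0.8, -0.5, 0.1],
        [0.8, -0.5, 0.1],
        [-0.8, -0.5, 0.1],
    ]
)

# Naive signed average: bin 0 cancels out even though every sample
# assigns it a large gradient magnitude.
signed_mean = grads.mean(axis=0)

# Sign-separated aggregation keeps both contributions visible.
positive_mean = np.maximum(grads, 0.0).mean(axis=0)
negative_mean = np.minimum(grads, 0.0).mean(axis=0)

print("signed mean:  ", signed_mean)
print("positive mean:", positive_mean)
print("negative mean:", negative_mean)
```

Here bin 0 vanishes from the signed average but remains visible in both the positive and the negative profile.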

---

## 🛣️ Roadmap

- Deep learning models (CNN / Transformer)  
- Multi-class classification  
- UniDec integration  
- mzML export  
- GUI (ProXAI Desktop)  

---

## 👨‍🔬 Author

Benjamin Nouri Nigjeh  
Proteomics • Machine Learning • Mass Spectrometry  

---

## 📜 License

MIT License
