Metadata-Version: 2.4
Name: pepred
Version: 0.1.0
Summary: PE prediction model (batch scoring and evaluation).
Author-email: Your Name <you@example.com>
License: MIT
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: scikit-learn
Requires-Dist: torch

# Batch PE Risk Prediction

This tool runs the Pre-Eclampsia (PE) risk prediction model on a **CSV file**. Each row represents a patient, and the script outputs a new CSV with an added `score` column (predicted probability).

---

## 1. Input CSV Format

Column names in your input CSV must match the feature names listed in **Features.csv**.

### **Core numeric features**

* `age` – years
* `ht` – height in **centimeters**
* `wt` – weight in **kilograms**
* `interval` – inter-pregnancy interval
* `last.ga` – gestational age at last delivery (weeks)
* *(full model only)* `ga`, `pappa`, `plgf`, `utpi`, `map`, `plgf.machine`

### **Optional raw-unit features**

Instead of the base-unit columns, you may provide any of these raw-unit columns:

* `ht_ft`, `ht_in` → height in feet and inches *(converted to `ht` in cm)*
* `wt_lb` → weight in pounds *(converted to `wt` in kg)*
* `ga_weeks` → gestational age in weeks *(converted to `ga` in days; full model only)*

If both raw-unit and base-unit columns exist, the script prefers:

* `ht_ft` / `ht_in` over `ht`
* `wt_lb` over `wt`
* `ga_weeks` over `ga`
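
The conversions and precedence rules above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `to_base_units` is a hypothetical helper, and it assumes each row has already been parsed into numbers, with `None` for absent values.

```python
# Hypothetical helper; the tool's real conversion code may differ.
CM_PER_INCH = 2.54
KG_PER_LB = 0.45359237
DAYS_PER_WEEK = 7


def to_base_units(row: dict) -> dict:
    """Return a copy of row with raw-unit columns converted to ht, wt, ga."""
    out = dict(row)
    # Feet/inches, when present, override any existing ht value.
    if row.get("ht_ft") is not None or row.get("ht_in") is not None:
        feet = float(row.get("ht_ft") or 0)
        inches = float(row.get("ht_in") or 0)
        out["ht"] = (feet * 12 + inches) * CM_PER_INCH
    # Pounds, when present, override any existing wt value.
    if row.get("wt_lb") is not None:
        out["wt"] = float(row["wt_lb"]) * KG_PER_LB
    # Weeks, when present, override any existing ga value (stored in days).
    if row.get("ga_weeks") is not None:
        out["ga"] = float(row["ga_weeks"]) * DAYS_PER_WEEK
    return out


patient = {"ht": 150.0, "ht_ft": 5, "ht_in": 4, "wt_lb": 150, "ga_weeks": 12}
converted = to_base_units(patient)
print(round(converted["ht"], 2), round(converted["wt"], 2), converted["ga"])
# prints: 162.56 68.04 84.0
```

Note that the existing `ht` of 150.0 is overwritten, because the feet/inches columns take precedence.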

### **Categorical features**

Use the category names defined in **Features.csv → valid_options** (e.g., `White`, `Spontaneous`, `No`), or provide the numeric codes (`0`, `1`, `2`) that correspond to those options.
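
As a rough sketch of how a name or a numeric code might be normalized to a single integer code: the helper `encode_category` is hypothetical, and it assumes (the source does not confirm this) that code `i` refers to the `i`-th entry of a feature's `valid_options`. The `["No", "Yes"]` option list is a made-up example.

```python
def encode_category(value, valid_options):
    """Map a category name or numeric code to an integer code.

    Assumption: code i refers to the i-th entry of valid_options.
    """
    if value is None or str(value).strip() == "":
        return None  # missing; left for the imputation step
    text = str(value).strip()
    if text in valid_options:
        return valid_options.index(text)
    # Not a known name, so treat it as a numeric code such as "1" or 1.0.
    code = int(float(text))  # raises ValueError for unknown names
    if not 0 <= code < len(valid_options):
        raise ValueError(f"code {code} is out of range for {valid_options}")
    return code


options = ["No", "Yes"]  # made-up option list for illustration
print(encode_category("Yes", options), encode_category("0", options))
# prints: 1 0
```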

Missing values are allowed and will be imputed automatically using the **median** (or median category) from `Features.csv`.
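
A minimal sketch of this kind of imputation, assuming the reference medians have already been loaded into a dictionary. The function name `impute_missing` and the median values below are placeholders, not values taken from `Features.csv`.

```python
import math


def impute_missing(row: dict, medians: dict) -> dict:
    """Fill missing entries in row with the reference median for that feature."""
    out = dict(row)
    for name, median in medians.items():
        value = out.get(name)
        is_missing = (
            value is None
            or value == ""
            or (isinstance(value, float) and math.isnan(value))
        )
        if is_missing:
            out[name] = median
    return out


# Placeholder medians for illustration only.
medians = {"age": 30.0, "wt": 65.0}
filled = impute_missing({"age": None, "wt": 70.0}, medians)
print(filled)
# prints: {'age': 30.0, 'wt': 70.0}
```

The key point is that the fill value comes from the reference file rather than from the batch being scored, so a row gets the same imputed value regardless of which other rows are in the input.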

---

## 2. Running From the Terminal

Assuming your script is named `main.py`:

### **Partial model, case 3**

```bash
python main.py \
  --input data/input.csv \
  --output results/output.csv \
  --case 3
```

### **Full model, case 3**

```bash
python main.py \
  --input data/input_full.csv \
  --output results/output_full.csv \
  --case 3 \
  --full
```
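
For orientation, the four flags used above could be parsed with `argparse` along these lines. This is only a sketch of what the command-line interface might look like; whether `main.py` marks `--case` as required, or restricts its values, is an assumption here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown in the commands above."""
    parser = argparse.ArgumentParser(description="Batch PE risk prediction")
    parser.add_argument("--input", required=True, help="path to the input CSV")
    parser.add_argument("--output", required=True, help="path for the scored CSV")
    # Assumption: the case is an integer and must be given explicitly.
    parser.add_argument("--case", type=int, required=True, help="model case")
    parser.add_argument("--full", action="store_true", help="use the full model")
    return parser


args = build_parser().parse_args(
    ["--input", "data/input.csv", "--output", "results/output.csv", "--case", "3"]
)
print(args.case, args.full)
# prints: 3 False
```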

---

## 3. Running a Different CSV File

### **Step 1: Add your CSV**

Place your file in the `data/` folder, for example:

```
data/new_patients.csv
```

### **Step 2: Edit `launch.json` (VS Code)**

Modify the arguments:

* Change `data/input.csv` → `data/new_patients.csv`
* Change `results/output.csv` → `results/new_output.csv` (optional)
* Change `3` → `1` if you want to run `--case 1`
* Add `--full` to use the full model

### **Step 3: Run**

In VS Code:

* Open the **Run/Debug** panel
* Select **Python: Run main.py**
* Click **Run**

The output CSV will be saved at the path specified in `--output`, containing all original columns plus the new `score` column.
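
After a run, a quick sanity check on the output can be sketched as below. `check_scores` is a hypothetical helper, not part of the tool; it relies only on the output having a `score` column that holds a probability.

```python
import csv


def check_scores(path: str) -> list:
    """Read the scored CSV and return its scores, validating each one."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        if "score" not in (reader.fieldnames or []):
            raise ValueError("output CSV has no 'score' column")
        scores = [float(record["score"]) for record in reader]
    # A predicted probability should lie between 0 and 1.
    out_of_range = [s for s in scores if not 0.0 <= s <= 1.0]
    if out_of_range:
        raise ValueError(f"{len(out_of_range)} score(s) fall outside [0, 1]")
    return scores
```

For example, `check_scores("results/output.csv")` returns the list of scores, or raises `ValueError` if the column is missing or a value is out of range.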

---
