Metadata-Version: 2.4
Name: dicom_native
Version: 0.2.0
Summary: Feed raw DICOM X-rays into PyTorch and YOLO — preserves 6.1x more intensity data than JPG pipelines. Validated on 71 real DICOMs with Ultralytics YOLOv8.
Author: Anand Bobba
License: MIT
Keywords: dicom,medical imaging,pytorch,yolo,deep learning,radiology
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydicom>=2.4
Requires-Dist: numpy>=1.24
Requires-Dist: opencv-python-headless>=4.8
Provides-Extra: torch
Requires-Dist: torch>=2.0; extra == "torch"
Requires-Dist: torchvision>=0.15; extra == "torch"
Provides-Extra: yolo
Requires-Dist: ultralytics>=8.0; extra == "yolo"
Provides-Extra: all
Requires-Dist: torch>=2.0; extra == "all"
Requires-Dist: torchvision>=0.15; extra == "all"
Requires-Dist: ultralytics>=8.0; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: pydicom[codecs]; extra == "dev"
Dynamic: author
Dynamic: license-file
Dynamic: requires-python

# dicom_native

**Feed raw DICOM X-rays directly into PyTorch U-Net and Ultralytics YOLO — zero disk conversion, full 12-bit/16-bit fidelity.**

---

## Validation Results

Validated on 71 real DICOM files with Ultralytics YOLOv8, running on Windows 11, Python 3.13, CPU-only.

| Metric                        | Result              |
|-------------------------------|---------------------|
| DICOM files tested            | 71                  |
| Bit depths covered            | 1, 8, 12, 13, 14, 16, 32-bit |
| Photometric types             | 7 (MONO1/2, RGB, YBR, palette) |
| YOLO training epochs          | 30                  |
| read_dicom() calls (training) | 333 (100% .dcm)     |
| Training loss trend           | All 3 loss terms decreased |
| NaN in training               | None                |
| Memory leak (100 iterations)  | 0.7 MB (negligible) |
| Native vs JPG intensity levels| 1,453 vs 238 (6.1x) |
| Crashes                       | 0                   |

---

## Windows / CPU Quick Start

If you are on Windows or a CPU-only machine, use these safe training defaults:

```python
from dicom_native.integrations.yolo import build_dicom_yolo_trainer

trainer = build_dicom_yolo_trainer(
    data="dataset/data.yaml",
    model="yolov8n.pt",
    epochs=30,
    imgsz=640,
    batch=4,
    # These safe defaults are already set automatically:
    #   amp=False        (prevents CPU segfault)
    #   mosaic=0.0       (prevents IndexError on small datasets)
    #   close_mosaic=0   (prevents mosaic re-enabling)
    #   workers=0        (prevents Windows multiprocessing crashes)
)

trainer.train()
```

On a Linux machine with a GPU, override the defaults:

```python
from dicom_native.integrations.yolo import build_dicom_yolo_trainer

trainer = build_dicom_yolo_trainer(
    data="dataset/data.yaml",
    model="yolov8s.pt",
    epochs=100,
    imgsz=640,
    batch=16,
    extra_args={
        "amp": True,         # safe with CUDA
        "mosaic": 1.0,       # enable mosaic with large datasets
        "workers": 8,        # multi-worker loading on Linux
        "device": "0",       # GPU device
    },
)
```
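
The override behaviour in the two snippets above can be pictured as a plain dictionary merge. The sketch below is an assumption about how such merging might work, not the library's actual code: `SAFE_DEFAULTS` and `merge_train_args` are hypothetical names, and only the four default values are taken from the comments in the first snippet.

```python
# Hypothetical sketch: caller-supplied extra_args take precedence over safe defaults.
SAFE_DEFAULTS = {
    "amp": False,       # avoids the CPU segfault noted above
    "mosaic": 0.0,      # avoids IndexError on small datasets
    "close_mosaic": 0,  # keeps mosaic from being re-enabled
    "workers": 0,       # avoids Windows multiprocessing crashes
}


def merge_train_args(extra_args=None):
    """Return a new dict in which keys from extra_args override SAFE_DEFAULTS."""
    merged = dict(SAFE_DEFAULTS)
    merged.update(extra_args or {})
    return merged


cpu_args = merge_train_args()
gpu_args = merge_train_args({"amp": True, "mosaic": 1.0, "workers": 8, "device": "0"})
print(cpu_args)
print(gpu_args)
```

Keys that are not overridden (here `close_mosaic`) keep their safe value, so a GPU configuration only needs to list what it changes.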

---

## Why?

Converting DICOMs to 8-bit `.jpg` / `.png` before training silently destroys data (PNG can store 16-bit grayscale, but typical export pipelines write 8-bit):

| Format | Bit depth | Unique intensity levels |
|--------|-----------|------------------------|
| JPEG   | 8-bit     | 256                    |
| PNG (typical export) | 8-bit | 256          |
| **DICOM** | **12–16 bit** | **4,096 – 65,536** |

Bone cortex, soft-tissue planes, and early lesions live in those extra bits.  
`dicom_native` reads `.dcm` files **in memory**, applies correct medical preprocessing, and hands tensors straight to your training loop.
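
To see how many grey levels an 8-bit export can keep, the sketch below quantises a synthetic 12-bit ramp with NumPy. The ramp is made-up illustration data, not a real DICOM, and the code models only the bit-depth reduction (JPEG compression would discard more on top of that).

```python
import numpy as np

# Synthetic 12-bit ramp: every value from 0 to 4095 appears exactly once.
native = np.arange(4096, dtype=np.uint16)

# Typical 8-bit export: rescale to [0, 255] and round.
as_8bit = np.round(native / 4095 * 255).astype(np.uint8)

native_levels = np.unique(native).size
export_levels = np.unique(as_8bit).size
print(f"native: {native_levels} levels, 8-bit export: {export_levels} levels")
# prints "native: 4096 levels, 8-bit export: 256 levels"
```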

---

## Installation

```bash
# Core library
pip install dicom_native

# With PyTorch support
pip install "dicom_native[torch]"

# With YOLO support
pip install "dicom_native[yolo]"

# Everything
pip install "dicom_native[all]"
```

---

## Core concepts

### `read_dicom()` — the I/O engine

Every path through this library starts here.  The function performs four steps in order:

1. **Load** pixel data via `pydicom` (no temp files).
2. **Apply** `RescaleSlope` / `RescaleIntercept` if present in the DICOM header (a linear map from stored pixel values to the modality's output units).
3. **Invert** if `PhotometricInterpretation == MONOCHROME1` (so bright always means high density).
4. **Normalise** to `float32` in `[0, 1]`.

```python
from dicom_native import read_dicom

# NumPy array (H, W) float32
array = read_dicom("chest_pa.dcm")

# PyTorch tensor (1, H, W) float32 — ready for Conv2d
tensor = read_dicom("chest_pa.dcm", output="torch")
```
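
The four steps listed above can be sketched with `pydicom` and NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: `normalise_pixels` and `read_dicom_sketch` are hypothetical names, and the min-max normalisation in step 4 is one plausible choice among several.

```python
import numpy as np


def normalise_pixels(raw, slope=1.0, intercept=0.0, photometric="MONOCHROME2"):
    """Steps 2-4 applied to an already-decoded pixel array."""
    # Step 2: linear rescale of stored values
    img = raw.astype(np.float32) * np.float32(slope) + np.float32(intercept)
    # Step 3: invert MONOCHROME1 so that high values are always bright
    if photometric == "MONOCHROME1":
        img = img.max() + img.min() - img
    # Step 4: min-max normalise to [0, 1] (assumed normalisation scheme)
    lo, hi = float(img.min()), float(img.max())
    if hi <= lo:
        return np.zeros_like(img, dtype=np.float32)
    return ((img - lo) / (hi - lo)).astype(np.float32)


def read_dicom_sketch(path):
    """Step 1 plus the helper above; needs pydicom and a real .dcm file."""
    import pydicom  # imported lazily so normalise_pixels works without it

    ds = pydicom.dcmread(path)
    return normalise_pixels(
        ds.pixel_array,
        slope=float(getattr(ds, "RescaleSlope", 1.0)),
        intercept=float(getattr(ds, "RescaleIntercept", 0.0)),
        photometric=str(getattr(ds, "PhotometricInterpretation", "MONOCHROME2")),
    )


# Demonstration on a made-up 12-bit array instead of a real file.
raw = np.array([[0, 2048], [4095, 1024]], dtype=np.uint16)
normal = normalise_pixels(raw)
inverted = normalise_pixels(raw, photometric="MONOCHROME1")
print(normal.dtype, float(normal.min()), float(normal.max()))
```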

---

## PyTorch U-Net training

### 1. Prepare your data

```
data/
├── images/
│   ├── patient_001.dcm
│   ├── patient_002.dcm
│   └── ...
└── masks/
    ├── patient_001.png   ← binary segmentation mask
    ├── patient_002.png
    └── ...
```

### 2. Build dataset & dataloader

```python
import torch
from torch.utils.data import DataLoader, random_split

from dicom_native.integrations.pytorch import NativeDicomDataset
from dicom_native.transforms import (
    Compose, Resize, PercentileClip,
    RandomHorizontalFlip, GaussianNoise,
)

# ── Augmentation pipeline (operates on float32 — no quantisation) ──────────
train_tfm = Compose([
    Resize((512, 512)),
    PercentileClip(p_low=1, p_high=99),   # suppress artefacts
    RandomHorizontalFlip(p=0.5),
    GaussianNoise(std=0.02, p=0.3),
])

val_tfm = Compose([
    Resize((512, 512)),
    PercentileClip(p_low=1, p_high=99),
])

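# ── Datasets (same files, different transforms) ─────────────────────────────
train_base = NativeDicomDataset(
    image_dir="data/images",
    mask_dir="data/masks",
    transform=train_tfm,        # random augmentation for training only
    num_classes=1,              # binary segmentation
)
val_base = NativeDicomDataset(
    image_dir="data/images",
    mask_dir="data/masks",
    transform=val_tfm,          # deterministic preprocessing for validation
    num_classes=1,
)

# Split the indices once so train and val never overlap.
# Assumes both instances list the files in the same order.
n_val = int(0.2 * len(train_base))
n_train = len(train_base) - n_val
perm = torch.randperm(len(train_base)).tolist()
train_ds = torch.utils.data.Subset(train_base, perm[:n_train])
val_ds   = torch.utils.data.Subset(val_base,   perm[n_train:])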

train_dl = DataLoader(train_ds, batch_size=4, shuffle=True,  num_workers=4, pin_memory=True)
val_dl   = DataLoader(val_ds,   batch_size=4, shuffle=False, num_workers=2, pin_memory=True)

print(f"Train: {len(train_ds)} | Val: {len(val_ds)}")
```

### 3. Training loop (U-Net example)

```python
import torch
import torch.nn as nn

# Replace with your U-Net implementation, e.g. segmentation_models_pytorch
# pip install segmentation-models-pytorch
import segmentation_models_pytorch as smp

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = smp.Unet(
    encoder_name="resnet34",
    encoder_weights="imagenet",
    in_channels=1,          # grayscale DICOM
    classes=1,
).to(device)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = smp.losses.DiceLoss(mode="binary")

for epoch in range(1, 51):
    # ── Train ──────────────────────────────────────────────────────────────
    model.train()
    for images, masks in train_dl:
        # images: (B, 1, H, W) float32  |  masks: (B, 1, H, W) float32
        images, masks = images.to(device), masks.to(device)

        preds = model(images)
        loss  = criterion(preds, masks)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # ── Validate ───────────────────────────────────────────────────────────
    model.eval()
    dice_scores = []
    with torch.no_grad():
        for images, masks in val_dl:
            images, masks = images.to(device), masks.to(device)
            preds = (torch.sigmoid(model(images)) > 0.5).float()
            # Compute Dice per batch
            intersection = (preds * masks).sum()
            dice = (2 * intersection) / (preds.sum() + masks.sum() + 1e-6)
            dice_scores.append(dice.item())

    print(f"Epoch {epoch:3d} | Val Dice: {sum(dice_scores)/len(dice_scores):.4f}")
```
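
The validation metric in the loop above is the Dice coefficient, twice the overlap divided by the total foreground in both masks. A small NumPy worked example on made-up 2×3 masks shows the arithmetic; `dice_score` is a hypothetical helper name, not part of the library.

```python
import numpy as np


def dice_score(pred, target, eps=1e-6):
    """Dice coefficient between two binary masks of the same shape."""
    pred = pred.astype(np.float32)
    target = target.astype(np.float32)
    intersection = (pred * target).sum()
    return float(2 * intersection / (pred.sum() + target.sum() + eps))


pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])

# Overlap is 2 pixels and each mask has 3 foreground pixels: 2*2 / (3+3) ≈ 0.667
score = dice_score(pred, target)
print(round(score, 3))
```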

---

## Ultralytics YOLO training

### 1. Prepare your data

```
dataset/
├── images/
│   ├── train/
│   │   ├── patient_001.dcm
│   │   └── patient_002.dcm
│   └── val/
│       └── patient_003.dcm
├── labels/
│   ├── train/
│   │   ├── patient_001.txt   ← YOLO format bounding boxes
│   │   └── patient_002.txt
│   └── val/
│       └── patient_003.txt
└── data.yaml
```

**`data.yaml`:**
```yaml
path: /absolute/path/to/dataset
train: images/train
val:   images/val

nc: 2
names: ["fracture", "lesion"]
```

**Label format** (`patient_001.txt`): one box per line as `class x_c y_c w h`, with all coordinates normalised to [0, 1].
```
0  0.512  0.388  0.124  0.096
1  0.310  0.701  0.088  0.072
```
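
To check labels against an image, it can help to convert a normalised row back to pixel corners. The helper below is a plain-Python sketch; `yolo_line_to_xyxy` is a hypothetical name and the 1000 × 1000 image size is an assumption chosen to keep the arithmetic readable.

```python
def yolo_line_to_xyxy(line, img_w, img_h):
    """Convert 'class x_c y_c w h' (normalised) to (class_id, x1, y1, x2, y2) in pixels."""
    cls, x_c, y_c, w, h = line.split()
    x_c, y_c, w, h = (float(v) for v in (x_c, y_c, w, h))
    x1 = (x_c - w / 2) * img_w
    y1 = (y_c - h / 2) * img_h
    x2 = (x_c + w / 2) * img_w
    y2 = (y_c + h / 2) * img_h
    return int(cls), x1, y1, x2, y2


# First label row from the example above, on a hypothetical 1000 x 1000 image.
cls_id, x1, y1, x2, y2 = yolo_line_to_xyxy("0  0.512  0.388  0.124  0.096", 1000, 1000)
print(cls_id, round(x1), round(y1), round(x2), round(y2))  # 0 450 340 574 436
```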

### 2. Train

```python
from dicom_native.integrations.yolo import build_dicom_yolo_trainer

trainer = build_dicom_yolo_trainer(
    data="dataset/data.yaml",
    model="yolov8s.pt",          # or "yolo11s.pt"
    epochs=100,
    imgsz=640,
    batch=16,
)

trainer.train()
# Weights saved to runs/detect/train/weights/best.pt
```

### 3. Inference on a new DICOM

```python
from dicom_native.integrations.yolo import dicom_predict

results = dicom_predict(
    model_path="runs/detect/train/weights/best.pt",
    dicom_path="new_patient.dcm",
    conf=0.4,
)

for r in results:
    print(r.boxes.xyxy)   # bounding box coordinates
    r.show()              # display the annotated image
    r.save("output.jpg")  # save annotated image
```

---

## Transform reference

| Class | Purpose |
|-------|---------|
| `Compose([...])` | Chain multiple transforms |
| `Resize((H, W))` | Bilinear resize for images, nearest for masks |
| `MinMaxNorm()` | Re-normalise to [0, 1] after augmentation |
| `PercentileClip(p_low, p_high)` | Clip + normalise; suppresses implant artefacts |
| `WindowLevel(ww, wc)` | Radiological window/level contrast |
| `RandomHorizontalFlip(p)` | Random L-R flip |
| `RandomVerticalFlip(p)` | Random U-D flip |
| `RandomRotate90(p)` | Random 90°/180°/270° rotation |
| `GaussianNoise(std, p)` | Additive Gaussian noise |

All transforms accept `(image, mask=None)` and return `(image, mask)`.
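
The `(image, mask)` protocol makes it straightforward to write custom transforms. The sketch below is not the library's code: `WindowLevelSketch`, `HorizontalFlipSketch`, and `compose` are hypothetical stand-ins, and the window/level formula is the common simplified form (clip to `wc ± ww/2`, then scale to [0, 1]).

```python
import numpy as np


class WindowLevelSketch:
    """Hypothetical intensity transform: changes the image, leaves the mask alone."""

    def __init__(self, ww, wc):
        self.ww, self.wc = float(ww), float(wc)

    def __call__(self, image, mask=None):
        lo = self.wc - self.ww / 2
        hi = self.wc + self.ww / 2
        image = np.clip((image - lo) / (hi - lo), 0.0, 1.0).astype(np.float32)
        return image, mask


class HorizontalFlipSketch:
    """Hypothetical geometric transform: must be applied to image AND mask."""

    def __call__(self, image, mask=None):
        image = image[..., ::-1].copy()
        if mask is not None:
            mask = mask[..., ::-1].copy()
        return image, mask


def compose(transforms):
    """Chain transforms, threading (image, mask) through each one in order."""
    def run(image, mask=None):
        for t in transforms:
            image, mask = t(image, mask)
        return image, mask
    return run


image = np.array([[0.0, 0.25, 0.5, 1.0]], dtype=np.float32)
mask = np.array([[0, 0, 1, 1]], dtype=np.uint8)

pipeline = compose([WindowLevelSketch(ww=0.5, wc=0.5), HorizontalFlipSketch()])
out_img, out_mask = pipeline(image, mask)
print(out_img.tolist())   # [[1.0, 0.5, 0.0, 0.0]]
print(out_mask.tolist())  # [[1, 1, 0, 0]]
```

Returning the mask unchanged from intensity transforms, and flipping it together with the image in geometric ones, keeps image and annotation aligned through the whole pipeline.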

---

## API reference

### `read_dicom(path, *, output, force_single_channel, dtype)`

| Parameter | Default | Description |
|-----------|---------|-------------|
| `path` | required | Path to `.dcm` file |
| `output` | `"numpy"` | `"numpy"` → `(H,W)` ndarray · `"torch"` → `(1,H,W)` tensor |
| `force_single_channel` | `True` | Average RGB channels if present |
| `dtype` | `np.float32` | Output numeric type |

### `read_dicom_metadata(path)`

Returns a dict of key DICOM tags without loading pixel data (fast).
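
A header-only read of this kind can be sketched with `pydicom.dcmread(..., stop_before_pixels=True)`, which parses the header and skips the pixel data. The code below is an illustration, not the library's implementation: `summarise_header`, `read_metadata_sketch`, and the chosen tag list are assumptions.

```python
from types import SimpleNamespace

DEFAULT_TAGS = ("Modality", "Rows", "Columns", "BitsStored", "PhotometricInterpretation")


def summarise_header(ds, tags=DEFAULT_TAGS):
    """Collect selected attributes from a dataset-like object; missing tags become None."""
    return {tag: getattr(ds, tag, None) for tag in tags}


def read_metadata_sketch(path):
    """Read only the header of a .dcm file; needs pydicom and a real file."""
    import pydicom  # imported lazily so summarise_header works without it

    ds = pydicom.dcmread(path, stop_before_pixels=True)
    return summarise_header(ds)


# Demonstration with a stand-in header object instead of a real file.
fake_header = SimpleNamespace(Modality="DX", Rows=2048, Columns=2048, BitsStored=12)
summary = summarise_header(fake_header)
print(summary)
```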

### `NativeDicomDataset(...)`

| Parameter | Default | Description |
|-----------|---------|-------------|
| `image_dir` | required | Directory of `.dcm` files |
| `mask_dir` | required | Directory of mask files |
| `transform` | `None` | `Compose` pipeline |
| `mask_suffix` | `""` | Extra suffix in mask filename |
| `num_classes` | `1` | `>1` returns one-hot mask `(C,H,W)` |
| `skip_missing_masks` | `False` | Skip unpaired images silently |
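
For `num_classes > 1`, the one-hot layout `(C, H, W)` can be pictured with a short NumPy sketch. It assumes the mask file stores integer class indices in `[0, num_classes)`; how the library actually reads and encodes masks is not shown in this README, and `one_hot_mask` is a hypothetical name.

```python
import numpy as np


def one_hot_mask(mask, num_classes):
    """Convert an integer mask (H, W) with values in [0, num_classes) to (C, H, W) float32."""
    classes = np.arange(num_classes).reshape(-1, 1, 1)
    return (mask[None, ...] == classes).astype(np.float32)


mask = np.array([[0, 1], [2, 1]])
encoded = one_hot_mask(mask, num_classes=3)
print(encoded.shape)  # (3, 2, 2)
```

Each pixel has exactly one channel set to 1, so summing over the channel axis gives an all-ones map.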

### `build_dicom_yolo_trainer(data, model, epochs, imgsz, batch, target_size, extra_args)`

Returns a patched `DetectionTrainer`. Call `.train()` to start.

---

## Design decisions

**Why not subclass `YOLODataset` statically?**  
Ultralytics' internal class hierarchy changes frequently across minor versions.  
Monkey-patching `load_image` on a constructed instance is more robust.
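
Instance-level monkey-patching can be illustrated with the standard library alone. The sketch below uses a stand-in class rather than any real Ultralytics type; `StandInDataset` and `dicom_load_image` are hypothetical, and the string return values only mark which loader ran.

```python
import types


class StandInDataset:
    """Stand-in for a third-party dataset class; NOT the Ultralytics API."""

    def __init__(self, files):
        self.files = files

    def load_image(self, i):
        return f"decoded-as-jpeg:{self.files[i]}"


def dicom_load_image(self, i):
    """Replacement loader; a real one would decode the .dcm file in memory."""
    return f"decoded-as-dicom:{self.files[i]}"


patched = StandInDataset(["patient_001.dcm"])
untouched = StandInDataset(["photo.jpg"])

# Bind the replacement to this one instance; the class itself is unchanged.
patched.load_image = types.MethodType(dicom_load_image, patched)

print(patched.load_image(0))    # decoded-as-dicom:patient_001.dcm
print(untouched.load_image(0))  # decoded-as-jpeg:photo.jpg
```

Because only the instance attribute is replaced, the patch does not depend on where the method sits in the class hierarchy, only on its name and call signature.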

**Why convert back to uint8 for YOLO?**  
The YOLO backbone, Albumentations augmentations, and letterbox all assume `uint8 BGR`.  Passing float32 would silently corrupt augmentations like `RandomBrightnessContrast`.  The conversion preserves relative contrast; absolute 16-bit precision is lost, but the pretrained backbone was trained on 8-bit images and never saw raw 16-bit values anyway.
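
The float-to-`uint8` step discussed above can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the library's code: `to_uint8_bgr` is a hypothetical name, and it assumes the input is a single-channel image already normalised to [0, 1].

```python
import numpy as np


def to_uint8_bgr(image):
    """Map a float image in [0, 1] with shape (H, W) to uint8 with shape (H, W, 3)."""
    scaled = np.clip(image, 0.0, 1.0) * 255.0
    gray = np.round(scaled).astype(np.uint8)
    # Grayscale has identical B, G and R planes, so channel order does not matter here.
    return np.repeat(gray[..., None], 3, axis=2)


image = np.array([[0.0, 0.2], [1.0, 1.7]], dtype=np.float32)
bgr = to_uint8_bgr(image)
print(bgr.shape, bgr.dtype)  # (2, 2, 3) uint8
```

Clipping before the cast matters: values outside [0, 1] (such as the 1.7 above) would otherwise wrap around when converted to `uint8`.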

**Why PercentileClip instead of plain MinMax?**  
Metal implants and collimator edges create extreme outliers.  A global min-max would compress 95% of diagnostically relevant tissue into a narrow band.
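
The effect is easy to reproduce on synthetic numbers. The sketch below builds made-up data (10,000 "tissue" values plus 20 very bright "implant" values, not taken from any real DICOM) and compares how much of the [0, 1] range the tissue occupies under each scheme; `min_max` and `percentile_clip` are illustrative helpers, not the library's classes.

```python
import numpy as np


def min_max(image):
    """Scale linearly so the global minimum maps to 0 and the maximum to 1."""
    lo, hi = image.min(), image.max()
    return (image - lo) / (hi - lo)


def percentile_clip(image, p_low=1, p_high=99):
    """Clip to the given percentiles, then scale the clipped range to [0, 1]."""
    lo, hi = np.percentile(image, [p_low, p_high])
    return (np.clip(image, lo, hi) - lo) / (hi - lo)


# Synthetic data: tissue between 1000 and 2000, implant pixels at 60000.
tissue = np.linspace(1000, 2000, 10_000)
image = np.concatenate([tissue, np.full(20, 60_000.0)])

tissue_minmax = min_max(image)[: tissue.size]
tissue_clip = percentile_clip(image)[: tissue.size]

span_minmax = float(tissue_minmax.max() - tissue_minmax.min())
span_clip = float(tissue_clip.max() - tissue_clip.min())
print(f"min-max:         tissue spans {span_minmax:.3f} of [0, 1]")
print(f"percentile clip: tissue spans {span_clip:.3f} of [0, 1]")
```

With these numbers the tissue occupies under 2% of the output range after min-max scaling, and essentially the full range after percentile clipping.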

---

## License

MIT — see `LICENSE`.
