Metadata-Version: 2.4
Name: pyihw
Version: 0.2.0
Summary: Independent Hypothesis Weighting for large-scale multiple testing
Author: Chiraag Gohel
Author-email: Chiraag Gohel <37530396+chi-raag@users.noreply.github.com>
License-Expression: MIT
Requires-Dist: numpy>=2.4.2
Requires-Dist: scipy>=1.17.1
Requires-Python: >=3.13
Description-Content-Type: text/markdown

# pyIHW


Python implementation of [Independent Hypothesis
Weighting](https://bioconductor.org/packages/IHW/) (IHW), the method
introduced by Ignatiadis et al. (2016, *Nature Methods*).

IHW increases power in large-scale multiple testing by learning
data-driven hypothesis weights from a covariate that is independent of
the p-values under the null hypothesis, while controlling the FDR at a
user-specified level.
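
To make the role of the weights concrete, here is a minimal sketch of the
weighted Benjamini-Hochberg step that hypothesis weighting builds on: each
p-value is divided by its weight before the usual step-up rule is applied.
The function name `weighted_bh` is hypothetical and not part of the pyIHW
API. The sketch takes the weights as given and does not show how IHW learns
them from the covariate.

``` python
import numpy as np

def weighted_bh(pvalues, weights, alpha):
    """Boolean rejection mask from a weighted Benjamini-Hochberg procedure.

    Illustrative only; not pyIHW's implementation.
    """
    pvalues = np.asarray(pvalues, dtype=float)
    weights = np.asarray(weights, dtype=float)
    m = pvalues.size
    # Weights must be non-negative and average to 1 so the overall
    # error budget matches the unweighted procedure.
    if np.any(weights < 0) or not np.isclose(weights.mean(), 1.0):
        raise ValueError("weights must be non-negative and average to 1")
    # A hypothesis with zero weight can never be rejected.
    with np.errstate(divide="ignore", invalid="ignore"):
        q = np.where(weights > 0, pvalues / weights, np.inf)
    # Step-up rule on the weighted p-values.
    order = np.argsort(q)
    passed = q[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.nonzero(passed)[0].max()
        reject[order[: k + 1]] = True
    return reject

p = np.array([0.02, 0.04, 0.6, 0.9])
print(weighted_bh(p, np.ones(4), alpha=0.05).sum())                      # 0
print(weighted_bh(p, np.array([2.0, 2.0, 0.0, 0.0]), alpha=0.05).sum())  # 2
```

With uniform weights this reduces to ordinary BH. Shifting weight toward
the two hypotheses with small p-values lets both be rejected at the same
nominal level.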

## Installation

``` bash
pip install pyihw
```

## Quick start

pyIHW ships with DESeq2 results from the
[airway](https://bioconductor.org/packages/airway/) RNA-seq dataset
(Himes et al. 2014):

``` python
import numpy as np
from pyihw import ihw, load_airway, bh_threshold

pvalues, basemean = load_airway()
print(f"{len(pvalues)} hypotheses")
```

    33469 hypotheses

Run IHW with `baseMean` (the DESeq2 mean of normalized counts per gene)
as the covariate and compare against the standard Benjamini-Hochberg
(BH) procedure at the same level:

``` python
result = ihw(pvalues, basemean, alpha=0.1, rng=np.random.default_rng(42))

t_bh = bh_threshold(pvalues, alpha=0.1)
bh_rejections = int(np.sum(pvalues <= t_bh))

print(f"BH rejections:  {bh_rejections}")
print(f"IHW rejections: {result.n_rejections}")
print(f"Improvement:    +{result.n_rejections - bh_rejections} discoveries")
```

    BH rejections:  4099
    IHW rejections: 4876
    Improvement:    +777 discoveries
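
For readers unfamiliar with the BH baseline, the sketch below shows one way
to compute a BH rejection threshold with NumPy alone: the largest sorted
p-value `p_(k)` satisfying `p_(k) <= k * alpha / m`. The name
`bh_threshold_sketch` is a hypothetical stand-in, not the library's
`bh_threshold`, and the sketch assumes the p-values contain no NaNs.

``` python
import numpy as np

def bh_threshold_sketch(pvalues, alpha):
    """Largest p-value rejected by Benjamini-Hochberg, or 0.0 if none.

    Illustrative only; pyIHW's own bh_threshold may differ in details.
    """
    p = np.sort(np.asarray(pvalues, dtype=float))
    m = p.size
    # Compare the k-th smallest p-value against k * alpha / m.
    passed = p <= alpha * np.arange(1, m + 1) / m
    if not passed.any():
        return 0.0  # nothing passes, so nothing is rejected
    return float(p[np.nonzero(passed)[0].max()])

p = np.array([0.001, 0.03, 0.04, 0.5])
t = bh_threshold_sketch(p, alpha=0.1)
print(t, int(np.sum(p <= t)))  # 0.04 3
```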

## Parameters

``` python
ihw(
    pvalues,
    covariates,
    alpha,
    *,
    covariate_type="ordinal",   # "ordinal" or "nominal"
    nbins="auto",               # number of covariate strata
    nfolds=5,                   # cross-validation folds
    adjustment_type="bh",       # "bh" (FDR) or "bonferroni" (FWER)
    null_proportion=False,      # Storey's pi0 estimation
    rng=None,                   # numpy.random.Generator for reproducibility
)
```
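
As an illustration of what `covariate_type` and `nbins` control, the sketch
below assigns hypotheses to strata: ordinal covariates are split into
roughly equal-sized bins by rank, while nominal covariates use one stratum
per distinct label. The helper `assign_strata` is hypothetical, and the
binning rule is an assumption about how such stratification is commonly
done, not a description of pyIHW's internals.

``` python
import numpy as np

def assign_strata(covariates, covariate_type="ordinal", nbins=4):
    """Return an integer stratum index per hypothesis (illustrative only)."""
    covariates = np.asarray(covariates)
    if covariate_type == "nominal":
        # One stratum per distinct label; nbins is ignored here.
        _, strata = np.unique(covariates, return_inverse=True)
        return strata
    if covariate_type != "ordinal":
        raise ValueError("covariate_type must be 'ordinal' or 'nominal'")
    m = covariates.size
    # Rank the covariate (0 .. m-1), then cut the ranks into nbins groups.
    ranks = np.argsort(np.argsort(covariates, kind="stable"), kind="stable")
    return (ranks * nbins) // m

print(assign_strata([10.0, 1.0, 5.0, 7.0], nbins=2).tolist())         # [1, 0, 0, 1]
print(assign_strata(["a", "b", "a", "c"], "nominal").tolist())        # [0, 1, 0, 2]
```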

## Reproducibility

Pass a `numpy.random.Generator` as the `rng` argument to get
deterministic results:

``` python
result = ihw(pvalues, covariates, alpha=0.1, rng=np.random.default_rng(42))
```
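
The NumPy-only sketch below shows why a freshly seeded generator matters.
The helper `draw_folds` is hypothetical and merely stands in for any step
that consumes randomness, such as assigning hypotheses to cross-validation
folds; it is not taken from pyIHW.

``` python
import numpy as np

def draw_folds(m, nfolds=5, rng=None):
    """Randomly assign m items to nfolds folds (illustrative only)."""
    # default_rng accepts None, an int seed, or an existing Generator
    # (an existing Generator is returned unchanged).
    rng = np.random.default_rng(rng)
    return rng.integers(0, nfolds, size=m)

# Two fresh generators with the same seed give identical assignments.
a = draw_folds(1000, rng=np.random.default_rng(42))
b = draw_folds(1000, rng=np.random.default_rng(42))
print(np.array_equal(a, b))  # True

# Reusing one generator object advances its state between calls,
# so the second call sees different random draws.
shared = np.random.default_rng(42)
c = draw_folds(1000, rng=shared)
d = draw_folds(1000, rng=shared)
print(np.array_equal(c, d))  # False
```

To reproduce a run exactly, create a new generator from the same seed for
each call rather than sharing one generator across calls.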

## Dependencies

NumPy (>= 2.4.2) and SciPy (>= 1.17.1) only. Requires Python 3.13 or
later.

## Acknowledgments

pyIHW is a Python reimplementation of the
[IHW](https://bioconductor.org/packages/IHW/) R/Bioconductor package by
Nikolaos Ignatiadis and Wolfgang Huber. The method is described in:

> Ignatiadis, N., Klaus, B., Zaugg, J.B. et al. Data-driven hypothesis
> weighting increases detection power in genome-scale multiple testing.
> *Nature Methods* 13, 577–580 (2016).
> [doi:10.1038/nmeth.3885](https://doi.org/10.1038/nmeth.3885)

The bundled airway dataset is from Himes et al. (2014), *PLoS ONE* 9(6):
e99625 ([GEO
GSE52778](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE52778)),
processed through [DESeq2](https://bioconductor.org/packages/DESeq2/).
