Metadata-Version: 2.1
Name: nunchaku
Version: 0.9.0
Summary: Dividing data into linear regions.
Home-page: https://nunchaku.readthedocs.io
License: MIT
Keywords: linear segment,systems biology,bioinformatics,plate readers
Author: Yu Huo
Author-email: yu.huo@ed.ac.uk
Requires-Python: >=3.8,<3.11
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Dist: matplotlib (>=3.5.0,<4.0.0)
Requires-Dist: numpy (>=1.20.3,<2.0.0)
Requires-Dist: pandas (>=1.4,<2.0)
Requires-Dist: scipy (>=1.7.3,<2.0.0)
Requires-Dist: sympy (>=1.9,<2.0)
Requires-Dist: tqdm (>=4.65.0,<5.0.0)
Project-URL: Documentation, https://nunchaku.readthedocs.io
Project-URL: Repository, https://git.ecdf.ed.ac.uk/s1856140/nunchaku
Description-Content-Type: text/markdown

# nunchaku: Dividing data into linear regions

`nunchaku` is a Python module for dividing data into linear regions.
It answers two questions:
1. how many linear regions best fit the data without overfitting (by Bayesian model comparison);
2. given the number of linear regions, where the boundaries between them are (by finding the posterior of the boundaries).

## Installation
For users, type in terminal
```
> pip install nunchaku
```

For developers, create a virtual environment and then install with Poetry: 
```
> git clone https://git.ecdf.ed.ac.uk/s1856140/nunchaku.git
> cd nunchaku 
> poetry install --with dev 
```

## Quickstart
Data `x` is a list or a 1D Numpy array, sorted ascendingly; the data `y` is a list or a Numpy array, with each row being one replicate of measurement.
```
>>> from nunchaku.nunchaku import nunchaku, get_example_data
>>> x, y = get_example_data()
>>> nc = nunchaku(x, y, prior=[-5,5]) # load data and set prior of slope
>>> # compare models with 1, 2, 3 and 4 linear regions
>>> num_regions, evidences = nc.get_number(max_num=4)
>>> # get the mean and standard deviation of the boundary points
>>> bds, bds_std = nc.get_iboundaries(num_regions)
>>> # get the information of all regions
>>> info_df = nc.get_info(bds)
>>> # plot the data and the regions
>>> nc.plot(info_df)
```

## Documentation
Detailed documentation is available on [Readthedocs](https://nunchaku.readthedocs.io/en/latest/).

## Citation
A preprint is coming soon.

