Metadata-Version: 2.1
Name: proteomicruler
Version: 0.1.6
Summary: Estimate copy number from deep profile MS experiment using the Proteomic Ruler algorithm from Wiśniewski, J. R., Hein, M. Y., Cox, J. and Mann, M. (2014) A “Proteomic Ruler” for Protein Copy Number and Concentration Estimation without Spike-in Standards. Mol Cell Proteomics 13, 3497–3506.
Home-page: https://github.com/noatgnu/proteomicRuler
License: MIT
Keywords: proteomic,ruler,histone,mass spectrometry
Author: Toan K. Phung
Author-email: toan.phungkhoiquoctoan@gmail.com
Requires-Python: >=3.11,<=3.13
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Dist: click (>=8.3.0,<9.0.0)
Requires-Dist: pandas (>=2.3.3,<3.0.0)
Requires-Dist: requests (==2.32.4)
Requires-Dist: scipy (>=1.16.2,<2.0.0)
Requires-Dist: seaborn (>=0.13.2,<0.14.0)
Requires-Dist: uniprotparser (>=1.2.1,<2.0.0)
Project-URL: Repository, https://github.com/noatgnu/proteomicRuler
Description-Content-Type: text/markdown

Proteomic Ruler
--

An implementation of the Proteomic Ruler algorithm as provided in Perseus (`Wiśniewski, J. R., Hein, M. Y., Cox, J. and Mann, M. (2014) A “Proteomic Ruler” for Protein Copy Number and Concentration Estimation without Spike-in Standards. Mol Cell Proteomics 13, 3497–3506.`), used to estimate protein copy numbers from deep-profiling mass spectrometry experiments.

Requirements
--

Python >= 3.11

Installation
--
```bash
pip install proteomicruler
```

Usage
--

To use the package, load the input data into a `pandas.DataFrame`. The following
basic parameters are also required:
- `accession_id_col` - column name that contains protein accession ids
- `mw_col` - column name that contains molecular weight of proteins
- `ploidy` - ploidy number
- `total_cellular_protein_concentration` - total cellular protein concentration used for calculation of total volume
- `intensity_columns` - list of column names that contain sample intensities

```python
import pandas as pd

accession_id_col = "Protein IDs"
# used as unique index and to directly fetch mw data from UniProt

mw_col = "Mass"
# molecular weight column name

ploidy = 2
# ploidy number

total_cellular_protein_concentration = 200
# cellular protein concentration used for calculation of total volume

filename = "example_data/example_data.tsv"  # example data from Perseus
df = pd.read_csv(filename, sep="\t")

# select the 16 sample intensity columns, starting at positional index 57
intensity_columns = df.columns[57:57+16]
```
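Positional slicing depends on the exact column order of the input file. As an alternative, the intensity columns can be picked by name. The following is a minimal sketch, assuming the sample columns share a common prefix; the `"Intensity "` prefix and the helper `select_intensity_columns` are hypothetical and not part of the package.

```python
def select_intensity_columns(columns, prefix="Intensity "):
    """Return the column names that start with `prefix`, in their original order."""
    return [name for name in columns if str(name).startswith(prefix)]


# Toy column names for illustration only; with real data, pass `df.columns`.
example_columns = ["Protein IDs", "Mass", "Intensity A", "Intensity B", "Score"]
selected = select_intensity_columns(example_columns)
print(selected)
```

The returned list can be used wherever `intensity_columns` is expected, provided the prefix matches the naming scheme of the actual file.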

If the data does not contain molecular weight information, it has to be fetched from UniProt first.

```python
from proteomicRuler.ruler import add_mw

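# fetch molecular weights from UniProt using the accession ids
df = add_mw(df, accession_id_col)
# keep only rows that have a molecular weight, then make the column numeric
df = df[pd.notnull(df[mw_col])].copy()
df[mw_col] = df[mw_col].astype(float)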
```

The Ruler object can be created by passing the `DataFrame` object and the required parameters.

```python
from proteomicRuler.ruler import Ruler

ruler = Ruler(df, intensity_columns, mw_col, accession_id_col, ploidy, total_cellular_protein_concentration)
# write `ruler.df` to a tab-separated file
ruler.df.to_csv("output.txt", sep="\t", index=False)
```
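For intuition about how `ploidy` and `total_cellular_protein_concentration` enter the calculation, the central relation of the cited paper can be sketched as follows. The paper scales each protein's MS signal by the summed histone signal, on the premise that histone mass per cell is approximately equal to DNA mass per cell. This is a rough illustration and not the package's actual code: the function names are hypothetical, the DNA mass per cell (which depends on ploidy and genome size) is passed in directly, and the concentration is assumed to be in g/L.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def estimate_copy_number(protein_intensity, histone_intensity_sum,
                         dna_mass_per_cell_g, mw_da):
    """Estimate copies per cell of one protein (hypothetical helper).

    The protein's share of the histone signal is converted to a mass per cell
    via the DNA mass, then to a molecule count via the molecular weight.
    """
    protein_mass_per_cell_g = (
        protein_intensity / histone_intensity_sum * dna_mass_per_cell_g
    )
    return protein_mass_per_cell_g * AVOGADRO / mw_da


def estimate_cell_volume_l(total_protein_mass_per_cell_g,
                           total_protein_concentration_g_per_l):
    """Estimate cell volume in litres from total protein mass and concentration."""
    return total_protein_mass_per_cell_g / total_protein_concentration_g_per_l


# Illustrative inputs only: a 50 kDa protein and an assumed DNA mass of 6.5 pg per cell.
copies = estimate_copy_number(1e6, 1e8, 6.5e-12, 50_000)
volume_l = estimate_cell_volume_l(2e-10, 200.0)
print(copies, volume_l)
```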

The package can also be used through its command line interface.

```bash
Usage: ruler [OPTIONS]

Options:
  -i, --input FILENAME          Input file containing intensity of samples and
                                uniprot accession ids
  -o, --output FILENAME         Output file
  -p, --ploidy INTEGER          Ploidy of the organism
  -t, --total-cellular FLOAT    Total cellular protein concentration
  -m, --mw-column TEXT          Molecular weight column name
  -a, --accession-id-col TEXT   Accession id column name
  -c, --intensity-columns TEXT  Intensity columns list delimited by commas
  -g, --get-mw                  Get molecular weight from uniprot
  --help                        Show this message and exit.
```
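Because `-c` expects the intensity columns as a single comma-delimited string, it can be convenient to assemble the command from Python. The following is a minimal sketch that only uses the options listed above; the helper `build_ruler_command`, its defaults, and the file and column names are hypothetical, and it assumes that no column name itself contains a comma.

```python
import shlex


def build_ruler_command(input_path, output_path, intensity_columns,
                        accession_id_col, mw_col, ploidy=2,
                        total_cellular=200.0, get_mw=False):
    """Assemble the argument list for the `ruler` command (hypothetical helper)."""
    command = [
        "ruler",
        "-i", str(input_path),
        "-o", str(output_path),
        "-p", str(ploidy),
        "-t", str(total_cellular),
        "-m", mw_col,
        "-a", accession_id_col,
        "-c", ",".join(intensity_columns),  # comma-delimited column list
    ]
    if get_mw:
        command.append("-g")  # ask the tool to get molecular weights from UniProt
    return command


# Placeholder file and column names for illustration only.
command = build_ruler_command(
    "input.tsv", "output.txt",
    ["Intensity A", "Intensity B"],
    accession_id_col="Protein IDs", mw_col="Mass",
)
print(shlex.join(command))
```

With the package installed, the resulting list could be passed to `subprocess.run(command, check=True)`.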
