Metadata-Version: 2.1
Name: mspypeline
Version: 0.2
Summary: PLACEHOLDER
Home-page: UNKNOWN
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown
Requires-Dist: numpy (>=1.17.4)
Requires-Dist: pandas (>=0.25.3)
Requires-Dist: scipy (>=1.3.1)
Requires-Dist: rpy2 (>=2.9.4)
Requires-Dist: tzlocal (>=2.0.0)
Requires-Dist: ruamel-yaml (>=0.15.46)
Requires-Dist: matplotlib (>=3.1.1)
Requires-Dist: matplotlib-venn (>=0.11.5)
Requires-Dist: adjusttext (>=0.7.3)
Requires-Dist: scikit-learn (>=0.22.1)
Requires-Dist: plotly (>=4.6.0)

![full test run](https://github.com/siheming/mspypeline/workflows/full%20test%20run/badge.svg?branch=master)
[![Coverage](https://codecov.io/gh/siheming/mspypeline/branch/master/graph/badge.svg?flag=full-test-run)](https://codecov.io/gh/siheming/mspypeline/branch/master)

![basic test run](https://github.com/siheming/mspypeline/workflows/basic%20test%20run/badge.svg?branch=develop)
[![Coverage](https://codecov.io/gh/siheming/mspypeline/branch/develop/graph/badge.svg?flag=basic-test-run)](https://codecov.io/gh/siheming/mspypeline/branch/develop)

# README
This pipeline can be used to analyze the results of a MaxQuant analysis.

## Requirements
It is recommended to use this pipeline with git and Anaconda; install both if they are not
already available. If your network uses a proxy (as at the DKFZ), it must be configured for both tools.
The repository can be downloaded, for example, via
`git clone https://github.com/siheming/mspypeline.git`.

## Usage
This pipeline is run from the command line and needs a Python
installation with certain packages. A virtual environment containing
all packages specified in the `environment.yml` file is recommended.
It can be created, for example, via:
```bash
# package specs such as python=3.7 belong in environment.yml, not on this command line
conda env create -f environment.yml
```
which can then be activated and deactivated via:
```bash
conda activate mspypeline # activation
conda deactivate  # deactivation
```
When the environment is activated, or the default Python installation
satisfies the requirements, the script can be run via:
```bash
python3 main.py
```
or
```bash
python main.py
```
If the script is started with no further arguments, the first prompt asks for the directory and
the second prompt for the yml config file. If the second prompt is cancelled, the default yml file is used.
To see help for the command line options, type:
```bash
python3 main.py --help
```
The arguments that can be specified when using the pipeline are:
- `--dir` the path to the directory that should be analyzed.
When this is not specified, a window opens and asks you to select a directory.
- `--yml-file` the path to a yml file which should be used for analysis.
If this is not specified and the directory contains a config dir with a yml file, that file is used
for analysis. Otherwise the user is asked to select a yml file.
When this is skipped, the default yml file is used instead.
Using the default yml file can also be forced via `--yml-file default`.
- `--loglevel` the logging level used during the run. Should be one of the options
(lowest to highest): DEBUG < INFO < WARNING < ERROR.
- `--has-replicates` whether the names of the experiments in the result files include technical replicates. Default is false.
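
The option handling described above can be sketched in Python as follows. This is a minimal illustration and not the code in `main.py`: the function names `build_parser` and `resolve_yml_file`, the `ask_user` callback, the `WARNING` default log level, the directory name `config`, and treating `--has-replicates` as a plain on/off flag are all assumptions made for the example.

```python
import argparse
from pathlib import Path
from typing import Callable, Optional

LOG_LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the four documented options."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--dir", default=None, help="directory to analyze")
    parser.add_argument("--yml-file", default=None,
                        help="path to a yml file, or 'default'")
    # The default level is an assumption; the README does not state one.
    parser.add_argument("--loglevel", default="WARNING", choices=LOG_LEVELS)
    # Assumed to be a switch that is false unless given.
    parser.add_argument("--has-replicates", action="store_true")
    return parser


def resolve_yml_file(directory: Path,
                     yml_file: Optional[str],
                     default_yml: Path,
                     ask_user: Callable[[], Optional[str]]) -> Path:
    """Pick the yml file in the order described in the list above."""
    if yml_file == "default":          # forced default
        return default_yml
    if yml_file is not None:           # explicit path on the command line
        return Path(yml_file)
    config_dir = directory / "config"  # directory name is an assumption
    if config_dir.is_dir():
        candidates = sorted(config_dir.glob("*.yml"))
        if candidates:
            return candidates[0]
    chosen = ask_user()                # e.g. a file dialog; None if cancelled
    return Path(chosen) if chosen else default_yml
```

A call such as `build_parser().parse_args(["--dir", "results", "--yml-file", "default"])` yields a namespace whose `yml_file` is `"default"`, which `resolve_yml_file` then maps to the default yml file without prompting.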

## Dependencies
The pipeline requires multiple input files to perform the analysis. They
should be stored in a config dir on the same level as the pipeline script.
The requirements are:
- `ms_analysis_default.yml` a file which contains all defaults for the
analysis pipeline.
- `go_terms` a directory containing (GO-term) txt files listing the proteins that
should be analyzed. This influences the enrichment analysis of the GO-term plot.
- `pathways` a directory containing (pathway) txt files listing the proteins that
should be analyzed. This setting impacts descriptive plots and score calculations.
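
Before starting a run, it can help to confirm that the three entries listed above are present. The following is a small sketch of such a check; the helper `missing_config_entries` is hypothetical and not part of the pipeline, and the directory name `config` in the usage note is an assumption.

```python
from pathlib import Path
from typing import List

# Names taken from the list above.
REQUIRED_FILES = ["ms_analysis_default.yml"]
REQUIRED_DIRS = ["go_terms", "pathways"]


def missing_config_entries(config_dir: Path) -> List[str]:
    """Return the names of required files/directories absent from config_dir."""
    missing = [name for name in REQUIRED_FILES
               if not (config_dir / name).is_file()]
    missing += [name for name in REQUIRED_DIRS
                if not (config_dir / name).is_dir()]
    return missing
```

For example, `missing_config_entries(Path("config"))` returns an empty list when everything is in place, and otherwise the names that still need to be added.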

## Support
If additional support is required, try searching online, asking a programmer, or
contacting me via `Simon.Heming@gmx.de`.


