Metadata-Version: 2.4
Name: omixvizpy
Version: 0.1.2
Summary: A Python package for omics data visualization with a focus on PCA plotting
Project-URL: Homepage, https://github.com/Leslie-Lu/omixvizpy
Project-URL: Bug Reports, https://github.com/Leslie-Lu/omixvizpy/issues
Project-URL: Source, https://github.com/Leslie-Lu/omixvizpy
Project-URL: Documentation, https://github.com/Leslie-Lu/omixvizpy#readme
Author-email: Zhen Lu <luzh29@mail2.sysu.edu.cn>
Maintainer-email: Zhen Lu <luzh29@mail2.sysu.edu.cn>
License: MIT License
        
        Copyright (c) 2025 Zhen Lu
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: bioinformatics,data visualization,omics,pca,plotting,principal component analysis,visualization
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Visualization
Requires-Python: >=3.8
Requires-Dist: matplotlib>=3.3.0
Requires-Dist: numpy>=1.20.0
Requires-Dist: pandas>=1.3.0
Requires-Dist: seaborn>=0.11.0
Provides-Extra: dev
Requires-Dist: black; extra == 'dev'
Requires-Dist: flake8; extra == 'dev'
Requires-Dist: mypy; extra == 'dev'
Requires-Dist: pre-commit; extra == 'dev'
Requires-Dist: pytest-cov; extra == 'dev'
Requires-Dist: pytest>=6.0; extra == 'dev'
Provides-Extra: docs
Requires-Dist: myst-parser; extra == 'docs'
Requires-Dist: sphinx-rtd-theme; extra == 'docs'
Requires-Dist: sphinx>=4.0; extra == 'docs'
Description-Content-Type: text/markdown

# omixvizpy

A Python package for omics data visualization, particularly focused on Principal Component Analysis (PCA) plotting.

## Features

- **PCA Visualization**: Create variance-explained, scatter, and pair plots from PCA results, grouped by up to two covariates
- **Flexible Plotting**: Support for scatter plots and pair plots of principal components
- **Customizable**: Legend titles, level order, figure sizes, and output names are set through keyword arguments
- **Publication Ready**: High-quality plots suitable for scientific publications, optionally saved as PNG files

## Installation

### From PyPI (recommended)

```bash
pip install omixvizpy
```

### From Source

```bash
git clone https://github.com/Leslie-Lu/omixvizpy.git
cd omixvizpy
pip install -e .
```

## Quick Start

```python
import omixvizpy

# Plot PCA results with covariates
omixvizpy.plot_pca(
    eigenvec_file="path/to/your/eigenvec.txt",
    covar_file="path/to/your/covariates.csv",
    cov1="Country_of_birth",  # First covariate
    cov2="Ethnic_background",  # Second covariate (optional)
    legend_title_cov1="Country of Birth",  # Legend title for first covariate
    legend_title_cov2="Ethnicity",  # Legend title for second covariate
    cov1_levels=["England", "Wales", "Scotland", "Others"],  # Labels for first covariate
    cov2_levels=["White", "Asian", "Black", "Others"],  # Labels for second covariate
    fig_path="output/directory",  # Output directory
    fig1_name="variance_explained",  # Variance explained plot
    fig2_name="pc1_vs_pc2",  # PC1 vs PC2 scatter plot
    fig3_name="pca_by_country",  # Pairplot by first covariate
    fig4_name="pca_by_ethnicity",  # Pairplot by second covariate
    fig1_size=(11, 9),  # Size of variance explained plot
    fig2_size=(12, 12),  # Size of PC1 vs PC2 scatter plot
    save_figs=True,  # Save figures to fig_path
)
```

## Function Reference

### `plot_pca`

Create a set of PCA plots: variance explained, a PC1 vs PC2 scatter plot, and pair plots colored by each covariate.

**Parameters:**
- `eigenvec_file` (str): Path to the eigenvec file containing PCA results
- `covar_file` (str): Path to the CSV file containing covariate information
- `cov1` (str): Name of the first covariate column (default: 'Country_of_birth')
- `cov2` (Optional[str]): Name of the second covariate column
- `legend_title_cov1` (str): Title for the first covariate's legend
- `legend_title_cov2` (Optional[str]): Title for the second covariate's legend
- `cov1_levels` (List[str]): Labels for the first covariate's values
- `cov2_levels` (Optional[List[str]]): Labels for the second covariate's values
- `fig_path` (Optional[str]): Directory path where figures will be saved
- `fig1_name` (str): Name for the variance explained plot (default: 'variance_explained')
- `fig2_name` (str): Name for the PC1 vs PC2 scatter plot
- `fig3_name` (str): Name for the pairplot colored by first covariate
- `fig4_name` (str): Name for the pairplot colored by second covariate
- `fig1_size` (Tuple[int, int]): Size of the variance explained plot (default: (11, 9))
- `fig2_size` (Tuple[int, int]): Size of the PC1 vs PC2 scatter plot (default: (12, 12))
- `save_figs` (bool): Whether to save the figures (default: False)

**Returns:**
- The function is used for its side effects: it displays the plots and, when `save_figs` is `True`, saves them as PNG files under `fig_path`
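
Whether `plot_pca` creates `fig_path` when it does not exist is not stated here, so it may be safer to create the directory before calling it with `save_figs=True`. The sketch below shows one way to do that with the standard library; the helper name `ensure_fig_dir` is hypothetical and not part of omixvizpy.

```python
import tempfile
from pathlib import Path


def ensure_fig_dir(fig_path):
    """Create fig_path (and any missing parents) and return it as a Path."""
    out = Path(fig_path)
    out.mkdir(parents=True, exist_ok=True)
    return out


# A temporary location is used here only to keep the example self-contained.
with tempfile.TemporaryDirectory() as tmp:
    fig_dir = ensure_fig_dir(Path(tmp) / "output" / "directory")
    created = fig_dir.is_dir()
    print(created)  # True
```

In practice the returned path could be passed on as `fig_path=str(fig_dir)`.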

## Input Data Format

### Eigenvec File
The eigenvec file should be a tab-separated file with the following columns:
- `eid`: Sample identifier
- `PC1`, `PC2`, `PC3`, etc.: Principal component values
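
As a rough illustration of this layout, the sketch below writes a small synthetic eigenvec file with pandas and reads it back. The helper `write_example_eigenvec`, the file name, the sample IDs, and the random PC values are all placeholders rather than real data. It also assumes the file carries a header row with the column names, which the description above implies but does not state outright.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd


def write_example_eigenvec(path, n_samples=5, n_pcs=3, seed=0):
    """Write a synthetic tab-separated eigenvec file and return its path."""
    rng = np.random.default_rng(seed)
    table = pd.DataFrame(
        rng.normal(size=(n_samples, n_pcs)),
        columns=[f"PC{i}" for i in range(1, n_pcs + 1)],
    )
    # Put the sample identifier first, as in the layout described above.
    table.insert(0, "eid", [f"S{i:03d}" for i in range(1, n_samples + 1)])
    table.to_csv(path, sep="\t", index=False)
    return Path(path)


with tempfile.TemporaryDirectory() as tmp:
    path = write_example_eigenvec(Path(tmp) / "example_eigenvec.txt")
    loaded = pd.read_csv(path, sep="\t")
    print(loaded.columns.tolist())  # ['eid', 'PC1', 'PC2', 'PC3']
    print(loaded.shape)             # (5, 4)
```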

### Covariate File
The covariate file should be a comma-separated (CSV) file with the following structure:
- `eid`: Sample identifier (matching eigenvec file)
- Additional columns for covariates (e.g., `Country_of_birth`, `Ethnic_background`)
  - Values in these columns should correspond to the levels specified in `cov1_levels` and `cov2_levels`
  - The order of levels in `cov*_levels` determines the order in the plot legend
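
Before plotting, it can help to confirm that the two files line up. The sketch below is a minimal pre-flight check, assuming the covariate values are stored as the same strings that appear in `cov1_levels`; if your file stores coded values instead, the comparison would need adapting. The function `check_covariates` and the toy data are hypothetical and not part of omixvizpy.

```python
import pandas as pd


def check_covariates(eigenvec, covar, cov, levels):
    """Return (eids missing from covar, covariate values not listed in levels)."""
    missing_eids = sorted(set(eigenvec["eid"].tolist()) - set(covar["eid"].tolist()))
    unexpected = sorted(set(covar[cov].dropna().tolist()) - set(levels))
    return missing_eids, unexpected


# Toy tables standing in for the eigenvec and covariate files.
eigenvec = pd.DataFrame(
    {"eid": [1, 2, 3], "PC1": [0.1, -0.2, 0.3], "PC2": [0.0, 0.5, -0.4]}
)
covar = pd.DataFrame({"eid": [1, 2], "Country_of_birth": ["England", "Ireland"]})

missing, unexpected = check_covariates(
    eigenvec, covar, "Country_of_birth", ["England", "Wales", "Scotland", "Others"]
)
print(missing)     # [3]
print(unexpected)  # ['Ireland']
```

An empty result for both lists suggests every sample has covariates and every value has a matching level.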

## Requirements

- Python >=3.8
- pandas >=1.3.0
- matplotlib >=3.3.0
- seaborn >=0.11.0
- numpy >=1.20.0
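
To see which versions are installed in the current environment, a small sketch using the standard library's `importlib.metadata` (available from Python 3.8) is shown below. It only reports versions next to the minimums listed above and does not compare them; the names `MINIMUMS` and `installed_versions` are illustrative.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the list above.
MINIMUMS = {
    "pandas": "1.3.0",
    "matplotlib": "3.3.0",
    "seaborn": "0.11.0",
    "numpy": "1.20.0",
}


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, installed in installed_versions(MINIMUMS).items():
    print(f"{name}: installed={installed}, minimum={MINIMUMS[name]}")
```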

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## License

This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.

## Citation

If you use omixvizpy in your research, please cite:

```bibtex
@software{omixvizpy,
  title={omixvizpy: A Python package for omics data visualization},
  author={Zhen Lu},
  year={2025},
  url={https://github.com/Leslie-Lu/omixvizpy}
}
```
