Metadata-Version: 2.1
Name: scMDCF
Version: 1.1.2
Summary: Unified Cross-modality Integration and Inference of Single-Cell Multiomic Data with Deep Contrastive Learning
Home-page: https://github.com/DARKpmm/scMDCF
Author: Yue Cheng
Author-email: chengyue22@mails.jlu.edu.cn
Classifier: Programming Language :: Python :: 3.9
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: anndata ==0.10.5.post1
Requires-Dist: array-api-compat ==1.4.1
Requires-Dist: contourpy ==1.2.0
Requires-Dist: cycler ==0.12.1
Requires-Dist: exceptiongroup ==1.2.0
Requires-Dist: filelock ==3.13.1
Requires-Dist: fonttools ==4.49.0
Requires-Dist: fsspec ==2024.2.0
Requires-Dist: get-annotations ==0.1.2
Requires-Dist: h5py ==3.10.0
Requires-Dist: importlib-resources ==6.1.1
Requires-Dist: Jinja2 ==3.1.3
Requires-Dist: joblib ==1.3.2
Requires-Dist: kiwisolver ==1.4.5
Requires-Dist: llvmlite ==0.42.0
Requires-Dist: MarkupSafe ==2.1.5
Requires-Dist: matplotlib ==3.8.3
Requires-Dist: mpmath ==1.3.0
Requires-Dist: natsort ==8.4.0
Requires-Dist: networkx ==3.2.1
Requires-Dist: numba ==0.59.0
Requires-Dist: numpy ==1.26.4
Requires-Dist: nvidia-cublas-cu12 ==12.1.3.1
Requires-Dist: nvidia-cuda-cupti-cu12 ==12.1.105
Requires-Dist: nvidia-cuda-nvrtc-cu12 ==12.1.105
Requires-Dist: nvidia-cuda-runtime-cu12 ==12.1.105
Requires-Dist: nvidia-cudnn-cu12 ==8.9.2.26
Requires-Dist: nvidia-cufft-cu12 ==11.0.2.54
Requires-Dist: nvidia-curand-cu12 ==10.3.2.106
Requires-Dist: nvidia-cusolver-cu12 ==11.4.5.107
Requires-Dist: nvidia-cusparse-cu12 ==12.1.0.106
Requires-Dist: nvidia-nccl-cu12 ==2.19.3
Requires-Dist: nvidia-nvjitlink-cu12 ==12.3.101
Requires-Dist: nvidia-nvtx-cu12 ==12.1.105
Requires-Dist: packaging ==23.2
Requires-Dist: pandas ==2.2.1
Requires-Dist: patsy ==0.5.6
Requires-Dist: pillow ==10.2.0
Requires-Dist: pynndescent ==0.5.11
Requires-Dist: pyparsing ==3.1.1
Requires-Dist: python-dateutil ==2.8.2
Requires-Dist: pytz ==2024.1
Requires-Dist: scanpy ==1.9.8
Requires-Dist: scikit-learn ==1.4.1.post1
Requires-Dist: scipy ==1.12.0
Requires-Dist: seaborn ==0.13.2
Requires-Dist: session-info ==1.0.0
Requires-Dist: six ==1.16.0
Requires-Dist: statsmodels ==0.14.1
Requires-Dist: stdlib-list ==0.10.0
Requires-Dist: sympy ==1.12
Requires-Dist: threadpoolctl ==3.3.0
Requires-Dist: torch ==2.2.1
Requires-Dist: tqdm ==4.66.2
Requires-Dist: triton ==2.2.0
Requires-Dist: typing-extensions ==4.9.0
Requires-Dist: tzdata ==2024.1
Requires-Dist: umap-learn ==0.5.5
Requires-Dist: zipp ==3.17.0

# scMDCF

[![scMDCF badge](https://img.shields.io/badge/scMDCF-python-blue)](https://github.com/DARKpmm/scMDCF)
[![PyPI badge](https://img.shields.io/pypi/v/scMDCF.svg)](https://pypi.org/project/scMDCF/)
[![License](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)

`scMDCF` is a Python package for clustering single-cell multi-omics data. It uses cross-modality contrastive learning to learn a common latent representation and assign cells to clusters.

- [Overview](#overview)
- [System Requirements](#system-requirements)
- [Installation Guide](#installation-guide)
- [Usage](#usage)
- [Data Availability](#data-availability)
- [License](#license)

# Overview
Single-cell multi-omics (scMulti-omics) technologies have revolutionized our understanding of cellular functions and interactions by enabling the simultaneous measurement of diverse cellular modalities. However, the inherent complexity, high dimensionality, and heterogeneity of these datasets pose substantial challenges for integration and analysis across modalities.

To address these challenges, we developed scMDCF, a single-cell multi-omics deep learning model based on contrastive learning, tailored for the efficient characterization and integration of scMulti-omics data. scMDCF features a cross-modality contrastive learning module that harmonizes data representations across omics types, ensuring consistency while incorporating conditional entropy to preserve data heterogeneity. In addition, a cross-modality feature fusion module extracts a common low-dimensional latent representation of scMulti-omics data, balancing the characteristics of the different omics types.

Extensive empirical studies demonstrate that scMDCF outperforms existing state-of-the-art scMulti-omics models across various types of scMulti-omics data. In particular, scMDCF extracts cell-type-specific peak-gene associations and cis-regulatory elements from SNARE-seq data and elucidates immune regulation from CITE-seq data. On a post-BNT162b2 mRNA SARS-CoV-2 vaccination dataset, scMDCF also annotates specific vaccine-induced B cell subpopulations through integrative multimodal analysis, uncovering dynamic interactions and regulatory mechanisms within the immune system after vaccination.

![The framework plot of scMDCF](https://github.com/DARKpmm/scMDCF/raw/main/scMDCF.png)
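
As a rough illustration of what a cross-modality contrastive objective can look like, the sketch below implements a generic symmetric InfoNCE loss in plain Python. It is not taken from the scMDCF source code, and the package's own objective (which, per the description above, also accounts for conditional entropy) is not reproduced here. The function names, the `temperature` default, and the toy embeddings are assumptions made for this example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cross_modality_info_nce(z_a, z_b, temperature=0.5):
    """Symmetric InfoNCE loss for paired embeddings from two modalities.

    z_a[i] and z_b[i] are assumed to be embeddings of the same cell in
    modality A (e.g. RNA) and modality B (e.g. protein or ATAC).
    """
    if not z_a or len(z_a) != len(z_b):
        raise ValueError("z_a and z_b must be non-empty and of equal length")

    a = [l2_normalize(v) for v in z_a]
    b = [l2_normalize(v) for v in z_b]
    n = len(a)

    # sim[i][j]: temperature-scaled cosine similarity between cell i in
    # modality A and cell j in modality B.
    sim = [
        [sum(x * y for x, y in zip(a[i], b[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy(logits, target):
        # Numerically stable log-sum-exp minus the logit of the true pair.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
        return log_sum - logits[target]

    # A -> B: each row of sim should peak on its diagonal entry.
    loss_ab = sum(cross_entropy(sim[i], i) for i in range(n)) / n
    # B -> A: the same for each column of sim.
    loss_ba = sum(
        cross_entropy([sim[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (loss_ab + loss_ba)


if __name__ == "__main__":
    rna = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("matched pairs:", cross_modality_info_nce(rna, matched))
    print("swapped pairs:", cross_modality_info_nce(rna, swapped))
```

The loss is small when each cell's embeddings agree across the two modalities and grows when they are paired with the wrong cell, which is the behavior a contrastive module relies on to align modalities.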

# System Requirements
## Hardware requirements
The `scMDCF` package requires only a standard computer with enough RAM to support in-memory operations.

## Software requirements
### OS requirements
This package is supported on *Linux*. It has been tested on the following system:
* Linux: Ubuntu 18.04

### Python Dependencies
`scMDCF` mainly depends on the Python scientific stack:

* numpy
* pytorch
* scanpy
* pandas
* scikit-learn

For the exact pinned versions, see <a href="https://github.com/DARKpmm/scMDCF/blob/main/requirements.txt">requirements</a>.

# Installation Guide
## Install from PyPI
    conda create -n scMDCF_env python=3.9.16
    conda activate scMDCF_env
    pip install scMDCF==1.1.2
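
After installation, a quick sanity check can confirm that the package and its core dependencies are visible to the active environment. The following is a minimal sketch that uses only the Python standard library. The helper names `installed_version` and `missing_modules` are made up for this example, and the module list assumes the usual import names (`torch` for PyTorch, `sklearn` for scikit-learn).

```python
from importlib import metadata, util

# Import names of the core dependencies; PyTorch is imported as "torch"
# and scikit-learn as "sklearn".
CORE_MODULES = ["numpy", "torch", "scanpy", "pandas", "sklearn"]


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if util.find_spec(name) is None]


if __name__ == "__main__":
    print("scMDCF version:", installed_version("scMDCF"))
    print("missing core modules:", missing_modules(CORE_MODULES))
```

If the reported version is `None` or any module is listed as missing, the `conda` environment created above is probably not the active one.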

# Usage
`scMDCF` is a deep embedding learning method for clustering single-cell multi-omics data. It can be used for:
* CITE-seq dataset clustering. See the example in <a href="https://github.com/DARKpmm/scMDCF/tree/main/tutorial/main_CITE.py">main_CITE.py</a>.
* SNARE-seq (paired RNA-seq and ATAC-seq) dataset clustering. See the example in <a href="https://github.com/DARKpmm/scMDCF/tree/main/tutorial/main_SNARE.py">main_SNARE.py</a>.

# Data Availability
The datasets we used can be downloaded from <a href="https://github.com/DARKpmm/scMDCF/tree/main/dataset">dataset</a>.

# License
This project is covered under the **MIT License**.
