Metadata-Version: 2.4
Name: cosg
Version: 1.0.3
Summary: Accurate and fast cell marker gene identification with COSG
Project-URL: Documentation, https://genecell.github.io/COSG
Project-URL: Source, https://github.com/genecell/COSG
Project-URL: Homepage, https://genecell.github.io/COSG
Author-email: Min Dai <dai@broadinstitute.org>
Maintainer-email: Min Dai <dai@broadinstitute.org>
License: BSD-3-Clause
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Framework :: Jupyter
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Natural Language :: English
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Visualization
Requires-Python: >=3.6
Requires-Dist: anndata>=0.8
Requires-Dist: matplotlib>=3.5.2
Requires-Dist: networkx>=2.8.8
Requires-Dist: numpy>=1.17.0
Requires-Dist: pandas>=0.21
Requires-Dist: scanpy>=1.6.0
Requires-Dist: scikit-learn>=0.21.2
Requires-Dist: scipy>=1.4
Requires-Dist: typing-extensions
Provides-Extra: dev
Requires-Dist: pre-commit; extra == 'dev'
Description-Content-Type: text/x-rst

|Stars| |PyPI| |Docs| |Total downloads| |Monthly downloads|

.. |Stars| image:: https://img.shields.io/github/stars/genecell/COSG?logo=GitHub&color=yellow
   :target: https://github.com/genecell/COSG/stargazers
.. |PyPI| image:: https://img.shields.io/pypi/v/cosg?logo=PyPI
   :target: https://pypi.org/project/cosg
.. |Docs| image:: https://readthedocs.org/projects/cosg/badge/?version=latest
   :target: https://cosg.readthedocs.io
.. |Total downloads| image:: https://static.pepy.tech/personalized-badge/cosg?period=total&units=international_system&left_color=black&right_color=orange&left_text=downloads
   :target: https://pepy.tech/project/cosg
.. |Monthly downloads| image:: https://static.pepy.tech/personalized-badge/cosg?period=month&units=international_system&left_color=black&right_color=orange&left_text=downloads/month
   :target: https://pepy.tech/project/cosg

Accurate and fast cell marker gene identification with COSG
=============================================================

Overview
---------

COSG is a cosine similarity-based method for more accurate and scalable marker gene identification.

- COSG is a general method for cell marker gene identification across different data modalities, e.g., scRNA-seq, scATAC-seq, and spatially resolved transcriptome data.

- Marker genes or genomic regions identified by COSG are more indicative and have greater cell-type specificity.

- COSG is ultrafast for large-scale datasets and is capable of identifying marker genes for one million cells in less than two minutes.

The method and benchmarking results are described in `Dai et al. (2022)`_.
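
As a rough illustration of the cosine-similarity idea, the sketch below scores each gene by its cosine similarity to a 0/1 indicator vector for each cell group. It is a simplified toy, not the package's implementation: the function name ``cosine_to_group_indicator`` and the toy matrix are assumptions made for this example, and it omits options such as ``mu`` and ``expressed_pct``.

.. code-block:: python

   import numpy as np

   def cosine_to_group_indicator(expr, labels):
       """Cosine similarity of every gene to every group's 0/1 indicator vector.

       expr: (n_cells, n_genes) non-negative array.
       labels: length-n_cells array of group labels.
       Returns (groups, sim), where sim has shape (n_groups, n_genes).
       """
       expr = np.asarray(expr, dtype=float)
       labels = np.asarray(labels)
       groups = np.unique(labels)
       # One indicator row per group: 1 for cells in the group, 0 otherwise.
       indicators = (labels[None, :] == groups[:, None]).astype(float)
       gene_norms = np.linalg.norm(expr, axis=0)
       group_norms = np.linalg.norm(indicators, axis=1)
       dots = indicators @ expr
       denom = group_norms[:, None] * gene_norms[None, :]
       # Genes with zero expression everywhere get a similarity of 0.
       sim = np.divide(dots, denom, out=np.zeros_like(dots), where=denom > 0)
       return groups, sim

   # Toy data (hypothetical): 4 cells x 3 genes, two groups of two cells each.
   expr = np.array([
       [5.0, 0.0, 1.0],
       [4.0, 0.0, 1.0],
       [0.0, 3.0, 1.0],
       [0.0, 2.0, 1.0],
   ])
   labels = np.array(["A", "A", "B", "B"])
   groups, sim = cosine_to_group_indicator(expr, labels)

In this toy matrix, the first gene is expressed only in group ``A`` and scores close to 1 for that group, while the uniformly expressed third gene scores lower.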

Additionally, the R version of COSG is available `here <https://github.com/genecell/COSGR>`_.

Note: We recently released our Python toolkit, `PIASO <https://github.com/genecell/PIASO>`_, in which some methods build on COSG. Please try it out!

Documentation
--------------

`COSG documentation <https://genecell.github.io/COSG/>`_.


Release notes
-------------
**Release v1.0.2** (March 5, 2025)


- Added ``plotMarkerDotplot`` and ``plotMarkerDendrogram`` for enhanced marker gene visualization. 

- Introduced support for ``batch_key`` to compute cosine similarities separately across different batches.  

- Enabled calculation of normalized COSG scores for comparing gene expression specificity across cell types or datasets.  

- Resolved a SciPy version deprecation issue related to ``.A`` attribute usage.  

- Fixed a DataFrame manipulation warning.  

- Added verbosity control, allowing users to adjust log output levels.  

**Release v1.0.1** (June 15, 2021)


- First release on PyPI.

Installation
------------
Stable version:

.. code-block:: bash

   pip install cosg

Development version:

.. code-block:: bash

   pip install git+https://github.com/genecell/COSG.git


Example
---------
Run COSG:

.. code-block:: python
   
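   import cosg
   import pandas as pd   # used below to tabulate the results
   import scanpy as sc   # used below for the dendrogram and dot plot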
   n_gene=30
   groupby='CellTypes'
   cosg.cosg(
      adata,
      key_added='cosg',
      # use_raw=False, layer='log1p', ## e.g., if you want to use the log1p layer in adata
      mu=100,
      expressed_pct=0.1,
      remove_lowly_expressed=True,
      n_genes_user=100,
      groupby=groupby
   )

Draw the dot plot:

.. code-block:: python
   
   sc.tl.dendrogram(adata, groupby=groupby, use_rep='X_pca') ## Change use_rep to the cell embeddings key you'd like to use
   df_tmp=pd.DataFrame(adata.uns['cosg']['names'][:3,]).T
   df_tmp=df_tmp.reindex(adata.uns['dendrogram_'+groupby]['categories_ordered'])
   marker_genes_list={idx: list(row.values) for idx, row in df_tmp.iterrows()}
   marker_genes_list = {k: v for k, v in marker_genes_list.items() if not any(isinstance(x, float) for x in v)}
   
   sc.pl.dotplot(
      adata,
      marker_genes_list,
      groupby=groupby,              
      dendrogram=True,
      swap_axes=False,
      standard_scale='var',
      cmap='Spectral_r'
   )


Output the marker list as a pandas DataFrame:

.. code-block:: python
   
   marker_gene=pd.DataFrame(adata.uns['cosg']['names'])
   marker_gene.head()

You can also check the COSG scores:

.. code-block:: python
   
   marker_gene_scores=pd.DataFrame(adata.uns['cosg']['scores'])
   marker_gene_scores.head()

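The two tables can also be combined into one long table with a row per group and rank. The sketch below uses small hand-made DataFrames as stand-ins for ``marker_gene`` and ``marker_gene_scores``; the gene names and score values are hypothetical, and it assumes both tables share the same shape and column order.

.. code-block:: python

   import pandas as pd

   # Toy stand-ins (hypothetical values) for
   # pd.DataFrame(adata.uns['cosg']['names']) and
   # pd.DataFrame(adata.uns['cosg']['scores']).
   marker_gene = pd.DataFrame({"Tcell": ["CD3E", "CD2"], "Bcell": ["CD79A", "MS4A1"]})
   marker_gene_scores = pd.DataFrame({"Tcell": [0.9, 0.7], "Bcell": [0.8, 0.6]})

   # Melt each wide table (one column per group) into long form.
   long_names = marker_gene.melt(var_name="group", value_name="gene")
   long_scores = marker_gene_scores.melt(var_name="group", value_name="score")

   # Both melts keep the same row order, so the scores line up by index.
   marker_table = long_names.assign(
       rank=long_names.groupby("group").cumcount() + 1,
       score=long_scores["score"],
   )
   print(marker_table)
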

Questions
---------
For questions about the code and tutorial, please contact Min Dai, dai@broadinstitute.org.


Citation
---------
If COSG is useful for your research, please consider citing `Dai et al. (2022)`_.

.. _Dai et al. (2022): https://academic.oup.com/bib/advance-article-abstract/doi/10.1093/bib/bbab579/6511197?redirectedFrom=fulltext



