Metadata-Version: 2.1
Name: cryoemservices
Version: 0.5.1
Summary: Services for CryoEM processing
Author-email: "Diamond Light Source - Data Analysis et al." <dataanalysis@diamond.ac.uk>
License: BSD 3-Clause License
        
        Copyright (c) 2023, Diamond Light Source
        
        Redistribution and use in source and binary forms, with or without
        modification, are permitted provided that the following conditions are met:
        
        1. Redistributions of source code must retain the above copyright notice, this
           list of conditions and the following disclaimer.
        
        2. Redistributions in binary form must reproduce the above copyright notice,
           this list of conditions and the following disclaimer in the documentation
           and/or other materials provided with the distribution.
        
        3. Neither the name of the copyright holder nor the names of its
           contributors may be used to endorse or promote products derived from
           this software without specific prior written permission.
        
        THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
        AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
        IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
        DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
        FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
        DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
        SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
        CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
        OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
        OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
        
Project-URL: Bug-Tracker, https://github.com/DiamondLightSource/cryoem-services/issues
Project-URL: GitHub, https://github.com/DiamondLightSource/cryoem-services
Keywords: cryoem-services
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: BSD License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: defusedxml
Requires-Dist: gemmi==0.6.5
Requires-Dist: healpy
Requires-Dist: icebreaker-em
Requires-Dist: importlib-metadata
Requires-Dist: ispyb>=10.2.3
Requires-Dist: marshmallow-sqlalchemy
Requires-Dist: mrcfile
Requires-Dist: numpy<2
Requires-Dist: pillow
Requires-Dist: plotly
Requires-Dist: pydantic>=2
Requires-Dist: readlif
Requires-Dist: starfile
Requires-Dist: stomp-py==8.1.0
Requires-Dist: tifffile
Requires-Dist: workflows
Requires-Dist: zocalo>=1
Provides-Extra: dev
Requires-Dist: bump-my-version; extra == "dev"
Requires-Dist: ipykernel; extra == "dev"
Requires-Dist: pre-commit; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-datafiles; extra == "dev"
Requires-Dist: pytest-mock; extra == "dev"

# cryoem-services

Services and configuration for cryo-EM pipelines.

This package consists of a number of services to process cryo-EM micrographs,
both for single particle analysis and tomography,
using a range of commonly used cryo-EM processing software.
These services can be run independently to process data,
or as part of a wider structure for performing live analysis during microscope collection.
For live analysis, this package integrates with a package
for transferring and monitoring collected data,
[Murfey](https://github.com/DiamondLightSource/python-murfey),
and a database for storing processing outcomes,
[ISPyB](https://github.com/DiamondLightSource/ispyb-database).

To run these services, the external software executables they call must be installed separately.
These are not included with this package.

# Tomography processing

The tomography processing pipeline consists of:

- Motion correction
- CTF estimation
- Tomogram alignment
- Tomogram denoising using [Topaz](http://topaz-em.readthedocs.io)
- Segmentation using [membrain-seg](https://github.com/teamtomo/membrain-seg)

The results of this processing can be opened and continued using
[Relion 5.0](https://relion.readthedocs.io).

# Single particle analysis

The single particle analysis pipeline produces a project
that can be opened and continued using
[CCP-EM doppio](https://www.ccpem.ac.uk/docs/doppio/user_guide.html)
or [Relion](https://relion.readthedocs.io).

The processing pipeline consists of:

- Motion correction
- CTF estimation
- Particle picking
- (Optionally) Ice thickness estimation
- Particle extraction and rebatching
- 2D classification using Relion
- Automated 2D class selection using Relion
- 3D classification using Relion
- 3D refinement and post-processing
- BFactor estimation by refinement with varying particle count

# Services currently available

The following services are provided for running the pipelines:

- Utility services:
  - **ClusterSubmission**: Submits zocalo wrappers to an HPC cluster
  - **Dispatcher**: Converts recipes into messages suitable for processing services
  - **Images**: Creates thumbnail images for viewing processing outcomes
  - **ISPyB**: Inserts results into an ISPyB database
  - **NodeCreator**: Creates Relion project files for the services run
- Processing services:
  - **BFactor**: Performs the setup for 3D refinement with varying particle count
  - **CrYOLO**: Particle picking on micrographs using [crYOLO](https://cryolo.readthedocs.io)
  - **CTFFind**: CTF estimation on micrographs using [CTFFIND4](https://grigoriefflab.umassmed.edu/ctffind4)
  - **DenoiseSlurm**: Tomogram denoising, submitted to a slurm HPC cluster, using [Topaz](http://topaz-em.readthedocs.io)
  - **Extract**: Extracts picked particles from micrographs
  - **ExtractClass**: Extracts particles from a given 3D class
  - **IceBreaker**: Ice thickness estimation with [IceBreaker](https://github.com/DiamondLightSource/python-icebreaker)
  - **MembrainSeg**: Tomogram segmentation, submitted to a slurm HPC cluster, using [membrain-seg](https://github.com/teamtomo/membrain-seg)
  - **MotionCorr**: Motion correction of micrographs using [MotionCor2](http://emcore.ucsf.edu/ucsf-software) or [Relion](https://relion.readthedocs.io), optionally submitted to a slurm HPC cluster
  - **PostProcess**: Post-processing of 3D refinements using [Relion](https://relion.readthedocs.io)
  - **SelectClasses**: Runs automated 2D class selection using [Relion](https://relion.readthedocs.io) and re-batches the particles from these classes
  - **SelectParticles**: Creates files listing batches of extracted particles
  - **TomoAlign**: Tomogram reconstruction from a list of micrographs using [imod](https://bio3d.colorado.edu/imod) and [AreTomo2](https://github.com/czimaginginstitute/AreTomo2)
  - **TomoAlignSlurm**: Tomogram alignment processing submitted to a slurm HPC cluster

There are also three zocalo wrapper scripts that can be run on an HPC cluster.
These perform 2D classification, 3D classification and 3D refinement
using [Relion](https://relion.readthedocs.io).

# Running services using zocalo

The services in this package are run using
[zocalo](https://github.com/DiamondLightSource/python-zocalo)
and [python-workflows](https://github.com/DiamondLightSource/python-workflows).
To start a service, run the `zocalo.service` command and specify the service name.
For example, to start a motion correction service:

```bash
$ zocalo.service -s MotionCorr
```
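
A pipeline usually needs several services running at once, each in its own process.
As a rough sketch, the command above can be generated for a set of service names.
The helper name `service_command` and the selection of services are illustrative only;
the commands are printed here rather than launched.

```python
import shlex
from typing import List


def service_command(service_name: str) -> List[str]:
    """Build the argument list for starting one service with zocalo.service."""
    if not service_name:
        raise ValueError("A service name is required")
    return ["zocalo.service", "-s", service_name]


# Illustrative selection of services named in this README
for name in ["Dispatcher", "MotionCorr", "CTFFind"]:
    print(shlex.join(service_command(name)))
```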

Once started, these services will initialise and then wait for messages to be sent to them.
Messages are sent through a message broker.
Currently [RabbitMQ](http://www.rabbitmq.com) is supported, using the pika transport in `python-workflows`.
Individual processing stages can be run by sending a dictionary of the parameters,
but the processing pipelines are designed to run through recipes.

A recipe is a specification of a series of steps to carry out,
and how these steps interact with each other.
Recipes for the current processing pipelines are provided in the `recipes` folder.
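
As an illustration only, a recipe can be pictured as a mapping of numbered steps,
where each step names a service queue and the step that follows it.
The sketch below assumes that layout, in the style used by `python-workflows`.
The queue names, the parameters and the `step_order` helper are hypothetical
and are not taken from the recipes shipped in the `recipes` folder.

```python
from typing import Any, Dict, List

# Hypothetical two-step recipe: motion correction feeding CTF estimation.
# Queue names and parameters are placeholders, not the packaged recipes.
example_recipe: Dict[str, Any] = {
    "1": {
        "service": "MotionCorr",
        "queue": "motioncorr",
        "parameters": {"pix_size": "1"},
        "output": 2,
    },
    "2": {
        "service": "CTFFind",
        "queue": "ctffind",
        "parameters": {"pix_size": "1"},
    },
    "start": [[1, []]],
}


def step_order(recipe: Dict[str, Any]) -> List[str]:
    """Follow single integer "output" links from the first start step."""
    order: List[str] = []
    seen = set()
    step = recipe["start"][0][0]
    while step is not None and step not in seen:
        seen.add(step)
        node = recipe[str(step)]
        order.append(node["service"])
        # Only a single integer link is handled in this sketch
        step = node.get("output")
    return order


print(step_order(example_recipe))
```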

To run a recipe from Python, provide a dictionary containing
the recipe name and the parameters expected by the recipe.
The following snippet shows an example of the setup needed.
This will send a message to a running **Dispatcher** service which
prepares the recipe for the processing services.

```python
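import workflows.transport.pika_transport as pt

# Recipe name(s) to run and the parameters that recipe expects
example_message = {
    "recipes": ["em-tomo-align"],
    "parameters": {
        "path_pattern": "micrograph_*.mrc",
        "pix_size": "1",
        # ... remaining parameters expected by the recipe
    },
}

# Connect to the message broker and send the message to the Dispatcher
transport = pt.PikaTransport()
transport.connect()
transport.send("processing_recipe", example_message)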
```
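
A small helper can make the shape of this message explicit before it is sent.
The sketch below is not part of `cryoemservices`: `build_recipe_message` is a hypothetical name,
and it only assembles the dictionary described above (a list of recipe names plus the recipe parameters).

```python
from typing import Any, Dict


def build_recipe_message(recipe: str, parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble the dictionary sent to the Dispatcher for a single recipe."""
    if not recipe:
        raise ValueError("A recipe name is required")
    # Copy the parameters so later changes by the caller do not alter the message
    return {"recipes": [recipe], "parameters": dict(parameters)}


message = build_recipe_message(
    "em-tomo-align",
    {"path_pattern": "micrograph_*.mrc", "pix_size": "1"},
)
print(message)
```

The resulting dictionary could then be passed to `transport.send("processing_recipe", message)` as in the snippet above.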
