Metadata-Version: 1.1
Name: jobarchitect
Version: 0.5.0
Summary: Tools for batching jobs and dealing with file paths
Home-page: https://github.com/JIC-CSB/jobarchitect
Author: Tjelvar Olsson
Author-email: tjelvar.olsson@jic.ac.uk
License: MIT
Download-URL: https://github.com/JIC-CSB/jobarchitect/tarball/0.5.0
Description: Architect jobs for running analyses
        ===================================
        
        .. image:: https://badge.fury.io/py/jobarchitect.svg
           :target: http://badge.fury.io/py/jobarchitect
           :alt: PyPI package
        
        .. image:: https://readthedocs.org/projects/jobarchitect/badge/?version=latest
           :target: http://jobarchitect.readthedocs.io/en/latest/?badge=latest
           :alt: Documentation Status
        
        - Documentation: http://jobarchitect.readthedocs.io
        - GitHub: https://github.com/JIC-CSB/jobarchitect
        - PyPI: https://pypi.python.org/pypi/jobarchitect
        - Free software: MIT License
        
        
        Overview
        --------
        
        This tool is intended to automate generation of scripts to run analysis on data
        sets. To use it, you will need a data set that has been created (or annotated)
        with `dtool <https://github.com/JIC-CSB/dtool>`_.
        It aims to help by:
        
        1. Removing the need to know where specific data items are stored in a data set
        2. Providing a means to split an analysis into several chunks (file based
           parallelization)
        3. Providing a framework for seamlessly running an analysis inside a container
        
        
        Design
        ------
        
        This project has two main components. The first is a command line tool named
        ``sketchjob`` intended to be used by the end user. It is used to generate
        scripts defining jobs to be run. The second (``_analyse_by_ids``) is a command
        line tool that is used by the scripts generated by ``sketchjob``. The end user
        is not meant to make use of this second script directly.
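        
        The relationship between the two components can be sketched as a function
        that writes a bash script calling ``_analyse_by_ids``. The option names are
        copied from the example output in the Use section; the function
        ``render_job_script`` is hypothetical and is not how ``sketchjob`` is
        necessarily implemented::
        
            import shlex
        
            def render_job_script(cwl_path, dataset_path, output_root, identifiers):
                """Return the text of a bash script that runs _analyse_by_ids."""
                lines = [
                    "#!/bin/bash",
                    "",
                    "_analyse_by_ids \\",
                    "  --cwl_tool_wrapper_path={} \\".format(shlex.quote(cwl_path)),
                    "  --input_dataset_path={} \\".format(shlex.quote(dataset_path)),
                    "  --output_root={} \\".format(shlex.quote(output_root)),
                    # All identifiers for this job go on the final line.
                    "  " + " ".join(shlex.quote(i) for i in identifiers),
                ]
                return "\n".join(lines) + "\n"
        
            script = render_job_script(
                "shasum.cwl",
                "example_dataset/",
                "output/",
                ["290d3f1a902c452ce1c184ed793b1d6b83b59164"],
            )
            print(script)
        
        Quoting each value with ``shlex.quote`` keeps the generated script valid
        when a path contains spaces.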
        
        
        Installation
        ------------
        
        To install the jobarchitect package from a checkout of the source code::
        
            $ cd jobarchitect
            $ python setup.py install
        
        
        Use
        ---
        
        To generate bash scripts for data analysis, first create a Common Workflow
        Language (CWL) tool description file. The examples below assume it is named
        ``shasum.cwl``.
        
        Then create an example dataset::
        
            $ datatool new dataset
            project_name [project_name]:
            dataset_name [dataset_name]: example_dataset
            ...
        
            $ echo "My example data" > example_dataset/data/my_file.txt
            $ datatool manifest update example_dataset/
        
        Create an output directory::
        
            $ mkdir output
        
        Then you can generate analysis run scripts with::
        
            $ sketchjob shasum.cwl example_dataset output/
            #!/bin/bash
        
            _analyse_by_ids \
              --cwl_tool_wrapper_path=shasum.cwl \
              --input_dataset_path=example_dataset/ \
              --output_root=output/ \
              290d3f1a902c452ce1c184ed793b1d6b83b59164
        
        Try the script with::
        
            $ sketchjob shasum.cwl example_dataset output/ > run.sh
            $ bash run.sh
            $ cat output/first_image.png
            290d3f1a902c452ce1c184ed793b1d6b83b59164  /private/var/folders/hn/crprzwh12kj95plc9jjtxmq82nl2v3/T/tmp_pTfc6/stg02d730c7-17a2-4d06-a017-e59e14cb8885/first_image.png
        
        Working with Docker
        -------------------
        
        Building a Docker image
        ^^^^^^^^^^^^^^^^^^^^^^^
        
        For the tests to pass, you will need to build an example Docker image, which
        you do with the provided script::
        
            $ bash build_docker_image.sh
        
        Running code with the Docker backend
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        
        By inspecting the script and its associated Dockerfile, you can get an idea of
        how to build Docker images that can be used with the jobarchitect Docker
        backend, e.g.::
        
            $ sketchjob sha1sum.cwl ~/junk/cotyledon_images ~/junk/output --backend=docker --image-name=jicscicomp/jobarchitect
            #!/bin/bash
        
            IMAGE_NAME=jicscicomp/jobarchitect
            docker run  \
              --rm  \
              -v /Users/olssont/junk/cotyledon_images:/input_dataset:ro  \
              -v /Users/olssont/junk/output:/output  \
              -v /Users/olssont/sandbox/cwl_v1/sha1sum.cwl:/tool.cwl:ro \
              $IMAGE_NAME  \
              _analyse_by_ids  \
                --cwl_tool_wrapper_path=/tool.cwl  \
                --input_dataset_path=/input_dataset  \
                --output_root=/output  \
                290d3f1a902c452ce1c184ed793b1d6b83b59164 09648d19e11f0b20e5473594fc278afbede3c9a4
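        
        The volume mapping in the generated script can be summarised with a short
        sketch. The container paths (``/input_dataset``, ``/output``, ``/tool.cwl``)
        and option names are taken from the output above; the function
        ``docker_run_args`` and the host paths in the usage lines are assumptions
        for illustration, not part of jobarchitect::
        
            import os
        
            def docker_run_args(image_name, cwl_path, dataset_path, output_root,
                                identifiers):
                """Return an argument list mirroring the docker command shown above."""
        
                def mount(host_path, container_path, read_only):
                    # docker -v expects an absolute host path for a bind mount.
                    spec = "{}:{}".format(os.path.abspath(host_path), container_path)
                    return ["-v", spec + (":ro" if read_only else "")]
        
                args = ["docker", "run", "--rm"]
                args += mount(dataset_path, "/input_dataset", True)  # inputs read-only
                args += mount(output_root, "/output", False)         # outputs writable
                args += mount(cwl_path, "/tool.cwl", True)
                args += [
                    image_name,
                    "_analyse_by_ids",
                    "--cwl_tool_wrapper_path=/tool.cwl",
                    "--input_dataset_path=/input_dataset",
                    "--output_root=/output",
                ]
                args += list(identifiers)
                return args
        
            # Hypothetical host paths, chosen only to make the example concrete.
            args = docker_run_args(
                "jicscicomp/jobarchitect",
                "/data/sha1sum.cwl",
                "/data/cotyledon_images",
                "/data/output",
                ["290d3f1a902c452ce1c184ed793b1d6b83b59164"],
            )
        
        Inside the container the tool only ever sees the fixed container paths, so
        the same image works regardless of where the data set lives on the host.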
        
Platform: UNKNOWN
