Metadata-Version: 1.1
Name: hcpre
Version: 0.5.0
Summary: Generalized launcher for human connectome project BOLD preprocessing
Home-page: https://github.com/beOn/hcpre
Author: Ben Acland
Author-email: benacland@gmail.com
License: BSD
Description: Installation
        ============
        
        Quick Note
        ----------
        
        For brevity's sake, all instructions assume that you are using a BASH shell.
        If you have made an informed decision to do otherwise, I assume you know
        enough to translate everything into your own environment. If you have been
        forced to do otherwise by some powers that be, poor soul, ask your systems
        administrator (or local friendly nerd) for help, or take this as a learning
        opportunity.
        
        These instructions also assume that your python environment is already set up.
        If that's not the case, you may find it helpful to consult our [cluster setup
        instructions](https://github.com/beOn/hcpre/wiki/Setup-for-the-WUSTL-HPC-Cluster).
        While the instructions there are specific to the High Performance
        Computing cluster at Washington University in St Louis, most of them will
        apply to your environment, as well.
        
        The HCP Pipeline
        ----------------
        
        This code has been tested against HCP Pipeline v3.0RC3 (commit 058c132fc, Tue,
        Jan 14 2014). You'll have to make sure that this or a later compatible
        version of the HCP code is installed on all of the machines on which you want
        to run the workflow. Installation is fairly easy - as long as you already have
        all of the HCP Pipeline's dependencies installed. Check the HCP Pipeline
        readme.txt for more information on how to get that done, paying special
        attention to FSL and FreeSurfer versions, and installing all of the
        dependencies of gradient_unwarp.py. If you have multiple versions of
        gradient_unwarp.py on your machine - be careful! Make sure that the version
        you call from the command line is the first version found on your python path,
        otherwise you might see some crashes.
        
        One final note: because the HCP Pipelines include some pretty large files,
        your systems admin would probably appreciate it if there weren't a new
        installation for every user. Check around with anyone else who uses the
        systems you're planning to use who might also use the HCP Pipelines. If
        they're already installed, it'll save you some pain.
        
        hcpre (this project)
        --------------------
        
        This project was developed for python versions > 2.7 and < 3.0, Nipype >
        0.9.1.
        
        To get the latest release, install nipype, then hcpre using pip. The nipype
        installation is a little obnoxious right now. You can either check their
        install documentation, or go ahead and call ```pip install hcpre``` then check
        the errors to see what dependencies you're missing. The first time you run it,
        for example, you may get an error like:
        
        ```
        Need nisext package from nibabel installation - please install nibabel first
        ```
        
        Okay. So call ```pip install nibabel```, then call ```pip install hcpre```
        again. That'll get you the next error. Perhaps it'll be one of these:
        
        ```bash
        RuntimeError: Cannot import package "networkx" - is it installed?
        # or
        RuntimeError: Cannot import package "scipy" - is it installed?
        ```
        
        So, once again, call ```pip install X``` where X is ```networkx```, or
        ```scipy```, or whatever it tells you is missing. I know this stinks, and
        believe me I have tried to make this dependency install cleanly, but had to
        throw in the towel (for now). Numpy and scipy can take a while to build, but
        hopefully this process won't take too long, and eventually this
        command will work:
        
        ```bash
        pip install hcpre
        ```
        
        Once that works, you should get yourself a beer. I sure did.
        
        To install the development version, clone this repository to your machine, and
        update your PATH and PYTHONPATH variables, or run setup.py manually:
        
        ```bash
        export PATH=$PATH:/path/to/hcpre/hcpre
        export PYTHONPATH=$PYTHONPATH:/path/to/hcpre
        
        # or...
        
        cd /path/to/hcpre
        python setup.py install
        ```
        
        You can also try using the requirements.txt file to install dependencies using
        pip. Again, you may have to pip install some stuff one-by-one.
        
        ```bash
        pip install -r requirements.txt
        ```
        
        If you're working on a community machine, talk to your systems administrator
        about the contents of requirements.txt, whether or not these dependencies are
        already installed, and any modifications that you may need to make to your
        environment to make sure that they're on your PYTHONPATH.
        
        You'll also need to install mricron, and make sure that its dcm2nii DICOM
        conversion application is on your PATH.
        
        Environment Variables
        ---------------------
        
        The HCP Pipelines make heavy use of environment variables - most of that is
        taken care of by the nipype workflow. But there are still a couple of
        variables that it's important to set correctly: ```$FREESURFER_HOME``` and
        ```$FSLDIR```. What's more, it's important that you call the FSL and
        FreeSurfer setup scripts from your .bashrc or .bash_profile file. Check the
        HCP Pipeline readme for information regarding which particular versions of
        FreeSurfer and FSL their code currently targets.
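        
        As a quick sanity check before running anything, you could confirm that
        both variables are set and point at real directories. The snippet below is
        only a minimal sketch: the helper name ```missing_env_vars``` is made up for
        illustration and is not part of hcpre.
        
        ```python
        import os
        
        # The two variables singled out above as important to set correctly.
        REQUIRED_VARS = ("FREESURFER_HOME", "FSLDIR")
        
        
        def missing_env_vars(env=None):
            """Return the required variables that are unset or not a directory."""
            env = os.environ if env is None else env
            return [name for name in REQUIRED_VARS
                    if not os.path.isdir(env.get(name, ""))]
        
        
        if __name__ == "__main__":
            for name in missing_env_vars():
                print("%s is unset or does not point to a directory" % name)
        ```
        
        If this prints nothing, both variables resolve to existing directories. It
        does not check that the FSL and FreeSurfer setup scripts have been sourced.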
        
        Running the Pipeline
        ====================
        
        The file that contains the nipype workflow, hcpre.py, can be called as a
        command line script. See the section on configuration; then, when you're ready
        to run, pass the -r argument, along with any others you choose to use (see
        below).
        
        For help, call:
        
        ```bash
        hcpre.py --help
        ```
        
        Configuring the Pipeline
        ========================
        
        We currently use [configobj](https://pypi.python.org/pypi/configobj/) to write
        and read files. hcpre.py includes some tools to help you build and update
        config files pretty quickly, but since they're plain text you can always open
        them up with a text editor and change them by hand (more on this below).
        
        To build a new config file, call hcpre.py with the -i or --init argument.
        You'll be walked through the creation of a new config file. You'll want to
        have already downloaded your data, and you should be sure that you have run
        the freesurfer and fsl setup scripts. You'll also need to know the path to the
        HCP Pipeline code. Let's quickly walk through the config steps as they are
        now. I'll take some time to discuss each of the questions, and how to figure
        out the appropriate answers.
        
        Configuration Walkthrough
        -------------------------
        
        We start by initializing a new config file.
        
        ```
        hcpre.py --init
        New config file name [hcp.conf]:
        ```
        
        Decisions decisions - what shall we call the new config file? The square
        brackets mean that whatever they contain is the default value. So if we just
        press return, we'll choose the name "hcp.conf". Sounds good to me, so let's
        just press return. If the file already exists, the script will exit.
        
        ```
        The subjects directory should contain all raw data for all subjects.
        Subjects Directory [./]: /data/nil-external/ccp/MOOD_RISK/DICOMs
        ```
        
        The subjects directory is the lowest directory below which we can find all of
        your experiment's dicoms. So, if you had dicoms sorted into /some/dir/sub_a,
        /some/dir/sub_b, etc., then the subjects directory would be /some/dir. Here,
        I've chosen something appropriate for my current experiment.
        
        ```
        The DICOM template should be a format string for a glob which, when combined
        with an individual subject ID, will get us all of the subject's DICOM files.
        DICOM template [data/raw_dicom/%s/*.dcm]: DR%s/SCANS/*/DICOM/*.dcm
        ```
        
        This is perhaps the trickiest one to answer, so I'm going to walk you through
        it step by step. The goal here is to tell the workflow how to find all of the
        dicoms for a given subject. We do that by giving it what is called a format
        string (google FMI), which allows us to substitute in each subject number in
        place of what is called a string format specifier, in this case '%s'. If that
        all seems like jargon, just follow along and hopefully things will start to
        make sense.
        
        The first step towards finding my format string is understanding how my data
        is organized. Let's take a look. We know my subjects directory is
        /data/nil-external/ccp/MOOD_RISK/DICOMs, so let's make a quick exploration of
        that directory's organization:
        
        ```
        $> ls /data/nil-external/ccp/MOOD_RISK/DICOMs
        DR060  DR061  DR064
        $> ls /data/nil-external/ccp/MOOD_RISK/DICOMs/DR060/
        SCANS
        $> ls /data/nil-external/ccp/MOOD_RISK/DICOMs/DR060/SCANS/
        1  10  11  12  13  14  15  16  17  18  19  2  20  21  22  23  24  25  26  27
        28  29  3  30  31  32  33  34  35  36  37  38  39  4  40  5  6  7  8  9
        $> ls /data/nil-external/ccp/MOOD_RISK/DICOMs/DR060/SCANS/1/
        DICOM/     SNAPSHOTS/
        $> ls /data/nil-external/ccp/MOOD_RISK/DICOMs/DR060/SCANS/1/DICOM/
        DR060.MR.Barch_MoodRisk.1.1.20131205.173445.s02tx.dcm    DR060.MR.Barch_MoodRisk.1.3.20131205.173445.19qm3q2.dcm
        DR060.MR.Barch_MoodRisk.1.2.20131205.173445.10iky0u.dcm  scan_1_catalog.xml
        ```
        
        I might do the same looking into other subjects to verify that the
        organization is consistent. If that's not the case, you'll need to do some
        cleanup to make it so, then come back to this point. Assuming that we're
        satisfied on this point for now, we can see that a valid path to a specific
        dicom might be:
        
        ```
        /data/nil-external/ccp/MOOD_RISK/DICOMs/DR060/SCANS/1/DICOM/DR060.MR.Barch_MoodRisk.1.1.20131205.173445.s02tx.dcm
        ```
        
        So how do we get from here to a list of *all* of the experiment's dicoms?
        First, we make use of the wildcard, ```*```. Because we use this string as
        what is known as a 'glob,' the character * will be expanded to match any
        number of characters, with a few exceptions (like '/'). So using this, we can
        begin to shrink our string:
        
        ```
        /data/nil-external/ccp/MOOD_RISK/DICOMs/DR060/SCANS/*/DICOM/*.dcm
        ```
        
        So here we've used two wildcards, one to replace the specific file name (we want
        everything that ends in .dcm), and another to replace the scan number. So now
        what about the subject number? This case is a little special. Since we want
        the script to be able to substitute in specific subject numbers, we use a
        string format specifier here instead of a wildcard. In the case of this data,
        the pattern seems to be ```.../DICOMs/DR<SUBJECT_NUMBER>/SCANS...```, so let's put a
        string format specifier in place of the subject number:
        
        ```
        /data/nil-external/ccp/MOOD_RISK/DICOMs/DR%s/SCANS/*/DICOM/*.dcm
        ```
        
        One last change. We don't need to include the subject directory prepended to
        the DICOM template (in fact, it is important that we do not). So let's get
        that out of there, leaving us with:
        
        ```
        DR%s/SCANS/*/DICOM/*.dcm
        ```
        
        Which is what we hand to the script! Moving right along:
        
        ```
        Subjects should be a comma separated list of subject ids.
        Subject list ['']: 060, 061, 064
        ```
        
        Feel free to provide nothing here. If you want to store a particular list of
        subjects to which this config file should be applied, you can supply them here
        or later by hand. You can also specify them on the command line when you call
        hcpre.py using the -s parameter.
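        
        To see how the subjects directory, the DICOM template, and a subject ID fit
        together, here is a rough sketch of the idea in python. It is not hcpre's
        actual code: the function name ```find_dicoms``` is hypothetical, and the
        assumption is simply that the template is filled in with ```%``` formatting,
        joined onto the subjects directory, and expanded as a glob.
        
        ```python
        import glob
        import os
        
        
        def find_dicoms(subjects_dir, dicom_template, subject_id):
            """Fill the subject ID into the template, then expand the wildcards."""
            pattern = os.path.join(subjects_dir, dicom_template % subject_id)
            return sorted(glob.glob(pattern))
        
        
        if __name__ == "__main__":
            # Values taken from the walkthrough above.
            subjects_dir = "/data/nil-external/ccp/MOOD_RISK/DICOMs"
            template = "DR%s/SCANS/*/DICOM/*.dcm"
            for subject in ["060", "061", "064"]:
                files = find_dicoms(subjects_dir, template, subject)
                print("%s: %d DICOM files" % (subject, len(files)))
        ```
        
        Running something like this against your own data is a cheap way to check a
        template before handing it to the config script: a subject with zero matches
        usually means the template or the directory layout is off.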
        
        After pressing enter, the script will look through all of the DICOM files that
        it can find. If you need to speed this up, I'll leave it as an exercise for
        the reader to figure out how to make it so the script only finds one or two
        subjects worth of data.
        
        ```
        Checking series names (this may take some time) (41 chunks remaining)...
        ```
        
        After it gets to 0, we can start in on the fun stuff. The script will print a
        numbered list of all the Series Descriptions it was able to find. Our job now
        is to tell it how to use that information to feed our data through the HCP
        Pipeline. Let's take a look at what it found in my case:
        
        ```
        Found 25 unique series descriptions.
        -------
        Series:
        -------
        0:  AAHScout
        1:  AAHScout_MPR_cor
        2:  AAHScout_MPR_sag
        3:  AAHScout_MPR_tra
        4:  BIAS_32CH
        5:  BIAS_BC
        6:  BOLD_FACE1
        7:  BOLD_FACE1_SBRef
        8:  BOLD_FACE2
        9:  BOLD_FACE2_SBRef
        10: BOLD_REWARD1
        11: BOLD_REWARD1_SBRef
        12: BOLD_REWARD2
        13: BOLD_REWARD2_SBRef
        14: BOLD_REWARD3
        15: BOLD_REWARD3_SBRef
        16: BOLD_TEST
        17: BOLD_TEST_SBRef
        18: FieldMap
        19: Localizer
        20: Localizer_aligned
        21: SpinEchoFieldMap_AP
        22: SpinEchoFieldMap_PA
        23: T1w_MPR_08mm
        24: T2w_SPC_08mm
        ```
        
        Alright - the first couple of questions are pretty easy. Your SBRef images
        might be called Scout or something else instead:
        
        ```
        Which series do you use for 'bold'?
        [None] or comma separated values 0-24: 6,8,10,12,14
        
        Which series do you use for 'bold_sbref'?
        [None] or comma separated values 0-24: 7,9,11,13,15
        ```
        
        For the next two, you'll often provide the same answer twice:
        
        ```
        Which series do you use for 'fieldmap_magnitude'?
        [None] or comma separated values 0-24: 18
        
        Which series do you use for 'fieldmap_phase'?
        [None] or comma separated values 0-24: 18
        ```
        
        If you collected Spin Echo Fieldmaps, they're probably either PA/AP, or RL/LR.
        Just leave blank responses for those you didn't collect:
        
        ```
        Which series do you use for 'fieldmap_ap'?
        [None] or comma separated values 0-24: 21
        
        Which series do you use for 'fieldmap_lr'?
        [None] or comma separated values 0-24:
        
        Which series do you use for 'fieldmap_pa'?
        [None] or comma separated values 0-24: 22
        
        Which series do you use for 'fieldmap_rl'?
        [None] or comma separated values 0-24:
        ```
        
        T1/T2 are pretty easy:
        
        ```
        Which series do you use for 't1'?
        [None] or comma separated values 0-24: 23
        
        Which series do you use for 't2'?
        [None] or comma separated values 0-24: 24
        ```
        
        The next one, ```polarity_swapped```, requires some explanation. In some
        experiments, you might acquire both RL and LR (or AP and PA) BOLD images.
        We're always trying to improve the list of values that the workflow derives at
        runtime, but for various reasons detecting this switch is difficult to do
        reliably. So we need the user's help. If you acquire images with opposing
        polarities, choose one of them, say ```LR``` if you're RL/LR, or ```PA``` if
        you're AP/PA, to call "swapped." We don't see that in this experiment, so
        we'll leave this blank. But if I did have two of each bold image, one with
        suffix ```_AP``` and one with suffix ```_PA```, I would list here the numbers
        of all of the "swapped" series, i.e., those that ended with ```_PA```.
        
        ```
        Which series do you use for 'polarity_swapped'?
        [None] or comma separated values 0-24:
        ```
        
        If the FreeSurfer and FSL environment variables are set correctly, the next
        two questions should provide correct default options. This works for me, so I
        don't supply an alternate value:
        
        ```
        Path for FREESURFER_HOME [/usr/local/freesurfer]:
        
        Path for FSLDIR [/usr/share/fsl/5.0]:
        ```
        
        Cool beans. For the next one, I need to know the location of the HCP Pipeline
        code:
        
        ```
        Path to HCP Pipelines dir [ ]: /scratch_cl1/hcp_pipe/Pipelines
        ```
        
        Pop quiz: do you know the resolution of your structural data?
        
        ```
        What is your structural image resolution (mm)?
        [Skip] or one of (.7, .8, 1): .8
        ```
        
        Now it asks if I want to use the default template files for my t1 resolution,
        and the default config files. Yes and yes.
        
        ```
        Use default template files for resolution 0.8 [y]/n?
        
        Use default config files [y]/n?
        ```
        
        Getting close! The next step was added to handle cases in which you may
        acquire more than one Spin Echo Fieldmap. In these cases you have a choice
        between two policies. Either we'll just choose the first set we find in each
        session (AP/PA or RL/LR pair), or we'll do something a little subtler. If you
        want the simple option, just choose 'first.' If you choose 'most_recent,' then
        for each bold we'll either use the SE Fieldmap pair most recently acquired
        prior to that particular BOLD, or if none were acquired before the BOLD run,
        then the first one acquired thereafter. In our case, we want the more complex
        option:
        
        ```
        If you have more than one ep fieldmap set, you may either
        want to use the first, or always use the most recent.
        Which policy would you like to use:
        
        0: first
        1: most_recent
        
        Select 0-1: 1
        ```
        
        This next one is pretty self explanatory:
        
        ```
        If you collect multiple t1 or t2 images, and averaging them yields warped
        results, try blocking structural image averaging.
        
        Block averaging of structural images [y]/n? y
        ```
        
        The last thing that the config setup script does is to make a guess at the
        unwarp direction.
        
        ```
        Very weak guess that your primary unwarp direction is y.
        Did I mention this is a GUESS?
        
        When finished, please open your config file check the value for ep_unwarp_dir.
        ```
        
        At this point, unfortunately, we do not have fully automated derivation of
        unwarpdir in place. So this really is just a guess. You should look over the
        final results carefully, taking care to check for any untoward distortions
        (like swirls, or unlikely overall shapes). If you see any of these things, try
        opening the config file and changing the unwarp direction from x to y, from -x
        to x, or from y to -y, or any other combination. You might even try z in a
        pinch - but I wouldn't try it first!
        
        That's it for the config script at this point. To re-run the later part
        (optionally skipping the series mapping), call hcpre.py -u or hcpre.py --update.
        
        Config by Hand
        --------------
        
        If you choose to edit the config file by hand, please note that we use
        configobj in unrepr mode, which means that it expects the values to be in
        python format. This means that strings "need to be quoted," lists
        ["must","be","in","brackets"], and other primitives like 2.3 (floats), 2
        (integers), True and False (boolean values) can be used just as you would use
        them in a python script.
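        
        If you're unsure whether a value you typed is in valid python format, you can
        test it with the standard library before saving the file. The sketch below
        uses ```ast.literal_eval``` as a stand-in for what unrepr mode does with
        simple values; that equivalence is an assumption, and ```parse_value``` is a
        hypothetical helper, not part of hcpre or configobj.
        
        ```python
        import ast
        
        
        def parse_value(text):
            """Interpret the right-hand side of a config line as a python literal."""
            try:
                return ast.literal_eval(text)
            except (ValueError, SyntaxError):
                raise ValueError("not a valid python literal: %s" % text)
        
        
        if __name__ == "__main__":
            # The last entry is deliberately wrong: the string is not quoted.
            for text in ["'most_recent'", "True", "['060', '061']", "2.3", "most_recent"]:
                try:
                    value = parse_value(text)
                    print("%s -> %s" % (text, type(value).__name__))
                except ValueError as err:
                    print(err)
        ```
        
        The unquoted ```most_recent``` is rejected, which is the most common mistake
        when editing these files by hand.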
        
        Customizing Node Configuration
        ------------------------------
        
        If you want to customize any of the hcp node settings, you can either write
        your own script that creates a HCPrepWorkflow instance and sets them
        programmatically, or you can extend the config file. Just before running,
        nifti_wrangler and all of the hcp script wrapper nodes check to see if any of
        their values have been set in the config file. To do this, the workflow first
        checks to see if the config file contains sections with names matching any of
        these nodes, including: nifti_wrangler, pre_freesurfer, freesurfer,
        post_freesurfer, volume_processing, and surface_processing. Then it checks to
        see if any of the settings in that section have names that match any of the
        respective node's input parameters. If so, it tries to set the value. We can see
        this at work in the default config script, where we supply several arguments
        to nifti_wrangler:
        
        ```
        [nifti_wrangler]
        ep_fieldmap_selection = 'most_recent'
        block_struct_averaging = True
        ep_unwarp_dir = 'y'
        ```
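        
        The matching rule described above (section name matches a node, setting name
        matches one of that node's inputs) can be sketched as follows. This is a
        simplified illustration using plain dictionaries in place of the config file
        and the nipype nodes; ```apply_config``` is a hypothetical name, and the real
        workflow's code may differ.
        
        ```python
        # Node names listed in the text above.
        NODE_NAMES = ("nifti_wrangler", "pre_freesurfer", "freesurfer",
                      "post_freesurfer", "volume_processing", "surface_processing")
        
        
        def apply_config(config, node_inputs):
            """Copy matching config values onto node inputs; return what was set.
        
            config:      {section_name: {setting_name: value}}
            node_inputs: {node_name: {input_name: value}}  (stand-in for real nodes)
            """
            applied = []
            for node_name in NODE_NAMES:
                section = config.get(node_name, {})
                inputs = node_inputs.get(node_name, {})
                for setting, value in section.items():
                    if setting in inputs:  # skip settings the node doesn't have
                        inputs[setting] = value
                        applied.append((node_name, setting))
            return applied
        ```
        
        In this sketch, a setting whose name doesn't match any input of its node is
        silently skipped, so a typo in a setting name would go unnoticed. Whether
        hcpre itself warns in that case isn't covered here, so double check your
        spelling against interface_docs.txt.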
        
        For a full list of the inputs of each node, check out interface_docs.txt in
        the docs directory. This file is generated by looking at the current interface
        specifications, so it should be accurate.
        
        Take note that not all attributes are config-file-settable. Those that are not
        include those that are derived at runtime. This list is undocumented at the
        moment.
        
        Validation
        ----------
        
        Some time in the future, we'll put in some nice config file validation stuff.
        Maybe.
        
        Project Status
        ==============
        
        Alpha release.
        
        This software is not a replacement for knowing what you're doing, and you
        should inspect all of the pipeline's output logs carefully. We make no
        promises whatsoever at this point that the output of this pipeline is worth
        using.
        
        Reporting Bugs
        ==============
        
        Please use the GitHub issues tab for bug reports and feature requests.
Keywords: connectome preprocessing fmri bold
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: License :: OSI Approved :: BSD License
Classifier: Intended Audience :: Science/Research
