# Contributing to VoxelOps

First off, thank you for considering contributing to VoxelOps! We welcome contributions of all kinds, from bug fixes to new features. This guide focuses on how to add a new “runner” to the project.

## What is a Runner?

In VoxelOps, a “runner” is a Python function that wraps a command-line tool, typically a neuroimaging tool distributed as a Docker image. Runners give these tools a simple, consistent Python interface.

A runner is composed of three main parts:

  1. The Runner Function: This is the main function that users will call. It takes an inputs object and an optional config object, builds and executes a command, and returns a dictionary of results. (e.g., src/voxelops/runners/qsiprep.py)

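  2. The Schemas: These are dataclasses that define the inputs, default parameters, and expected outputs for the runner. They provide type hints and one explicit definition of each field. Dataclasses do not enforce types at runtime, so actual input checks happen in the runner function. (e.g., src/voxelops/schemas/qsiprep.py)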

  3. Tests: Each runner should have corresponding tests to ensure it works correctly. (e.g., tests/test_runners_qsiprep.py)

## Step-by-Step Guide to Adding a New Runner

Let’s say we want to add a new runner for a tool called mytool.

### 1. Create the Schema File

First, create a new file in src/voxelops/schemas/ named mytool.py. In this file, you’ll define three dataclasses:

  • MyToolInputs: Required inputs for your tool, like bids_dir or participant.

  • MyToolDefaults: Default parameters for the tool, like the Docker image name or the number of processors.

  • MyToolOutputs: The expected outputs of the tool, like file paths.

Here’s an example for src/voxelops/schemas/mytool.py:

from dataclasses import dataclass
from pathlib import Path

@dataclass
class MyToolInputs:
    bids_dir: Path
    participant: str
    output_dir: Path | None = None
    work_dir: Path | None = None

@dataclass
class MyToolDefaults:
    docker_image: str = "myorg/mytool:latest"
    nprocs: int = 2

@dataclass
class MyToolOutputs:
    output_file: Path

    @classmethod
    def from_inputs(cls, inputs: MyToolInputs, output_dir: Path) -> "MyToolOutputs":
        return cls(
            output_file=output_dir / f"sub-{inputs.participant}" / "output.txt"
        )
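
To make the role of from_inputs concrete, here is a small usage sketch. It redefines trimmed copies of the schemas above so that it runs on its own; the paths and the participant label are made-up example values, not anything VoxelOps requires.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


# Trimmed copies of the schemas above, repeated so the snippet is standalone.
@dataclass
class MyToolInputs:
    bids_dir: Path
    participant: str
    output_dir: Optional[Path] = None


@dataclass
class MyToolOutputs:
    output_file: Path

    @classmethod
    def from_inputs(cls, inputs: MyToolInputs, output_dir: Path) -> "MyToolOutputs":
        return cls(
            output_file=output_dir / f"sub-{inputs.participant}" / "output.txt"
        )


# Example values only: any BIDS directory and participant label would do.
inputs = MyToolInputs(bids_dir=Path("/data/bids"), participant="01")
outputs = MyToolOutputs.from_inputs(inputs, Path("/data/derivatives"))

# The expected output path is derived purely from the inputs; nothing is run.
print(outputs.output_file.as_posix())  # /data/derivatives/sub-01/output.txt
```

Because from_inputs only computes paths, callers can learn where results will land before the tool has run, and tests can check those paths without Docker.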

### 2. Create the Runner Function File

Next, create the main runner file in src/voxelops/runners/, also named mytool.py. This file will contain the run_mytool function.

This function should:

  • Accept inputs (MyToolInputs) and optional config (MyToolDefaults).

  • Use helpers from voxelops.runners._base to validate inputs.

  • Construct the full docker run command as a list of strings.

  • Call run_docker from the base module to execute the command.

  • Return the resulting execution dictionary, augmented with inputs, config, and expected outputs.

Here is an example for src/voxelops/runners/mytool.py:

import os
from pathlib import Path
from typing import Dict, Optional, Any

from voxelops.runners._base import (
    run_docker,
    validate_input_dir,
    validate_participant,
)
from voxelops.schemas.mytool import (
    MyToolInputs,
    MyToolOutputs,
    MyToolDefaults,
)

def run_mytool(
    inputs: MyToolInputs, config: Optional[MyToolDefaults] = None, **overrides
) -> Dict[str, Any]:

    config = config or MyToolDefaults()
    for key, value in overrides.items():
        if hasattr(config, key):
            setattr(config, key, value)

    validate_input_dir(inputs.bids_dir, "BIDS")
    validate_participant(inputs.bids_dir, inputs.participant)

    output_dir = inputs.output_dir or (inputs.bids_dir.parent / "derivatives")
    work_dir = inputs.work_dir or (output_dir.parent / "work" / "mytool")
    output_dir.mkdir(parents=True, exist_ok=True)
    work_dir.mkdir(parents=True, exist_ok=True)

    expected_outputs = MyToolOutputs.from_inputs(inputs, output_dir)

    uid = os.getuid()
    gid = os.getgid()

    cmd = [
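        # "-ti" is omitted: allocating a TTY fails when stdin is not a terminal
        # (for example in batch jobs or CI).
        "docker", "run", "--rm",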
        "--user", f"{uid}:{gid}",
        "-v", f"{inputs.bids_dir}:/data:ro",
        "-v", f"{output_dir}:/out",
        "-v", f"{work_dir}:/work",
        config.docker_image,
        "/data", "/out", "participant",
        "--participant-label", inputs.participant,
        "--nprocs", str(config.nprocs),
    ]

    log_dir = output_dir.parent / "logs"
    result = run_docker(
        cmd=cmd,
        tool_name="mytool",
        participant=inputs.participant,
        log_dir=log_dir,
    )

    result["inputs"] = inputs
    result["config"] = config
    result["expected_outputs"] = expected_outputs

    return result
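
The override loop in run_mytool sets attributes directly on the config object it receives and skips any key the config does not have. If you would rather leave the caller's config untouched and fail loudly on a misspelled option, one possible variant is sketched below. The helper name apply_overrides is hypothetical and is not part of voxelops.runners._base; MyToolDefaults is repeated here only so the snippet runs standalone.

```python
from dataclasses import dataclass, fields, replace


# Copy of the defaults schema from step 1, repeated for a standalone example.
@dataclass
class MyToolDefaults:
    docker_image: str = "myorg/mytool:latest"
    nprocs: int = 2


def apply_overrides(config, **overrides):
    """Hypothetical helper: return a copy of config with overrides applied.

    Unknown keys raise TypeError instead of being ignored.
    """
    valid = {f.name for f in fields(config)}
    unknown = sorted(set(overrides) - valid)
    if unknown:
        raise TypeError(f"Unknown config option(s): {', '.join(unknown)}")
    # dataclasses.replace builds a new instance; the original is not mutated.
    return replace(config, **overrides)


base = MyToolDefaults()
tuned = apply_overrides(base, nprocs=8)
print(base.nprocs, tuned.nprocs)  # 2 8
```

Whether to raise on unknown keys or ignore them is a design choice; check how the existing runners behave before changing it for yours.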

### 3. Export the Runner from __init__.py

Make your new runner easily importable by adding it to src/voxelops/runners/__init__.py:

# src/voxelops/runners/__init__.py
...
from .mytool import run_mytool
...

Then re-export it from the top-level package in src/voxelops/__init__.py:

# src/voxelops/__init__.py
...
from .runners import (
    ...
    run_mytool,
)
...
__all__ = [
    ...
    "run_mytool",
]

### 4. Write Tests

Finally, add tests for your new runner. Create a new file tests/test_runners_mytool.py. You should at least test:

  • That the runner function runs without errors (mock the subprocess.run call so the test does not need Docker).

  • That the Docker command is built correctly.

  • That input validation works as expected.

Refer to existing tests like tests/test_runners_qsiprep.py for examples.
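
As a rough illustration of the mocking approach, the sketch below patches subprocess.run and inspects the command that would have been executed. It uses a stand-in function, run_tool_stub, in place of run_mytool so that it runs without VoxelOps installed; the stub and the keys of its return dictionary are assumptions for illustration only. In a real test you would import run_mytool, and you would likely also need a temporary BIDS directory (or patched validators) so that input validation passes.

```python
import subprocess
from unittest.mock import MagicMock, patch


def run_tool_stub(participant: str, nprocs: int = 2) -> dict:
    """Stand-in for run_mytool: builds a docker command and executes it."""
    cmd = [
        "docker", "run", "--rm",
        "myorg/mytool:latest",
        "/data", "/out", "participant",
        "--participant-label", participant,
        "--nprocs", str(nprocs),
    ]
    completed = subprocess.run(cmd, capture_output=True, text=True)
    # Hypothetical result shape, used only by this example.
    return {"command": cmd, "exit_code": completed.returncode}


def test_command_is_built_correctly():
    # Pretend Docker ran and exited cleanly; nothing is actually executed.
    fake = MagicMock(returncode=0, stdout="", stderr="")
    with patch("subprocess.run", return_value=fake) as mock_run:
        result = run_tool_stub("01", nprocs=4)

    mock_run.assert_called_once()
    cmd = mock_run.call_args.args[0]
    assert cmd[:2] == ["docker", "run"]
    assert cmd[cmd.index("--participant-label") + 1] == "01"
    assert cmd[cmd.index("--nprocs") + 1] == "4"
    assert result["exit_code"] == 0
```

Looking up each flag by name and checking the value that follows it keeps the test stable if unrelated arguments are added to the command later.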

### 5. Update Documentation

Add your runner to the list of available procedures in docs/index.rst, and create a new .rst file for it in the docs/source/ folder.

## Final Words

Once you’ve followed these steps, open a pull request on GitHub. We’ll review your contribution and work with you to get it merged.

Thank you for helping us make VoxelOps better!