Motion correction¶

The first step in processing the raw data is to perform motion correction.

Outputs¶

We want to produce a motion-corrected NIFTI file, with a similar filename structure to the raw data but with the additional BIDS entity desc that has the label mc.

We start by creating a new rule (workflow/rules/mot_correct.smk) that has this output information:

workflow/rules/mot_correct.smk¶

rule mot_correct:
    "Runs motion correction"
    output:
        img="results/sub-{sub_num}/func/sub-{sub_num}_task-{task}_desc-mc_bold.nii.gz",

Inputs¶

Unlike the previous step, motion correction requires an input to be specified (the raw data). While we could specify the path directly, a better approach is to instead reference the output of the rule that produced the input. We can do that by referring to rules.{rule_name}.output.{output_id}, where rule_name is the name that we have given to the rule (here, acquire_func) and output_id is the name that we have given to the output of interest (here, img).

Note

A good heuristic is to aim to only ever specify the full file path (containing wildcards) once within a workflow - and refer to this specification in other rules that require knowledge of the path.

workflow/rules/mot_correct.smk¶

rule mot_correct:
    "Runs motion correction"
    input:
        img=rules.acquire_func.output.img,
    output:
        img="results/sub-{sub_num}/func/sub-{sub_num}_task-{task}_desc-mc_bold.nii.gz",

The motion correction algorithm also needs to know which volume it should use as the reference, to which all the other volumes are registered. Here, we will designate the first volume in the “stopsignal” task acquisition as the reference (“base”) volume. We can tell Snakemake that this is a required input by adding another entry within the input directive:

workflow/rules/mot_correct.smk¶

rule mot_correct:
    "Runs motion correction"
    input:
        img=rules.acquire_func.output.img,
        base=lambda wildcards: rules.acquire_func.output.img.format(
            sub_num=wildcards.sub_num,
            task="stopsignal",
        ),
    output:
        img="results/sub-{sub_num}/func/sub-{sub_num}_task-{task}_desc-mc_bold.nii.gz",

Note that this needs to be an input function, because we want to use the active subject number wildcard value with a fixed task wildcard value. We use the standard Python string formatting function (format) to insert the desired values into the sub_num and task wildcards from the img entry in the output directive of the acquire_func rule.

Parameters¶

In the previous section, we mentioned that the base for the motion correction algorithm was the first volume in the image file (which contains many volumes). We can specify this volume information as a parameter:

workflow/rules/mot_correct.smk¶

rule mot_correct:
    "Runs motion correction"
    input:
        img=rules.acquire_func.output.img,
        base=lambda wildcards: rules.acquire_func.output.img.format(
            sub_num=wildcards.sub_num,
            task="stopsignal",
        ),
    output:
        img="results/sub-{sub_num}/func/sub-{sub_num}_task-{task}_desc-mc_bold.nii.gz",
    params:
        base_volume=0,

This provides us with information that we can use within the mechanism to specify the base volume.

Mechanism¶

Now we need to think about the rule’s mechanism — how the output files are produced.

Container¶

We will be using AFNI to perform the motion correction, so we need to find a container that provides AFNI. A great resource for neuroimaging-related containers is the NeuroDesk package respository. If we go to that site and search for ‘AFNI’, there is an afni_26.0.07 package that we can use. Clicking on the link shows that a Docker container is available at ghcr.io/neurodesk/afni_26.0.07:20260128. That gives us all the information we need to specify the container for the rule, which we add to common.smk:

workflow/rules/common.smk¶

SUB_NUMS = ["10159", "10171", "10189"]
TASKS = ["taskswitch", "stopsignal"]

CONTAINER_SOURCES = {
    "AWS-CLI": "docker://amazon/aws-cli:2.32.21",
    "AFNI": "docker://ghcr.io/neurodesk/afni_26.0.07:20260128",
}

and then to the mot_correct.smk rule:

workflow/rules/mot_correct.smk¶

rule mot_correct:
    "Runs motion correction"
    input:
        img=rules.acquire_func.output.img,
        base=lambda wildcards: rules.acquire_func.output.img.format(
            sub_num=wildcards.sub_num,
            task="stopsignal",
        ),
    output:
        img="results/sub-{sub_num}/func/sub-{sub_num}_task-{task}_desc-mc_bold.nii.gz",
    params:
        base_volume=0,
    container:
        CONTAINER_SOURCES["AFNI"]

Note

The AFNI container is quite large (~8 GB) and so can take quite a while to download. Unfortunately, there is no progress indicator while it is downloading. It is cached though, so it only needs to be downloaded once.

Logging¶

As usual, we need to specify the file to store logging information:

workflow/rules/mot_correct.smk¶

rule mot_correct:
    "Runs motion correction"
    input:
        img=rules.acquire_func.output.img,
        base=lambda wildcards: rules.acquire_func.output.img.format(
            sub_num=wildcards.sub_num,
            task="stopsignal",
        ),
    output:
        img="results/sub-{sub_num}/func/sub-{sub_num}_task-{task}_desc-mc_bold.nii.gz",
    params:
        base_volume=0,
    container:
        CONTAINER_SOURCES["AFNI"]
    log:
        "logs/mot_correct/mot_correct_{sub_num}_{task}.txt"

Script¶

In the previous rule, we directly specified a shell command. Although the command we need to run to do the motion correction is not very complex, we will instead use a Python script to execute the rule. This is implemented using the script directive, which has a value that is the path to the Python file relative to the location of the rule. By convention, scripts are stored in workflow/scripts/; given that this rule is in workflow/rules/, the relative path begins with ../scripts/.

workflow/rules/mot_correct.smk¶

rule mot_correct:
    "Runs motion correction"
    input:
        img=rules.acquire_func.output.img,
        base=lambda wildcards: rules.acquire_func.output.img.format(
            sub_num=wildcards.sub_num,
            task="stopsignal",
        ),
    output:
        img="results/sub-{sub_num}/func/sub-{sub_num}_task-{task}_desc-mc_bold.nii.gz",
    params:
        base_volume=0,
    container:
        CONTAINER_SOURCES["AFNI"]
    log:
        "logs/mot_correct/mot_correct_{sub_num}_{task}.txt"
    script:
        "../scripts/mot_correct.py"

Before looking at the Python script, it is worth thinking about the command that the script will need to execute. For motion correction, we will use the 3dvolreg command with base and prefix named parameters and the input as positional arguments.

When the Python script is executed by Snakemake, it inserts a special object named snakemake that contains useful information for constructing the command. The snakemake object has a log attribute, which contains the information from the log directive in the rule. We start by opening this log file to be able to write to it during execution:

workflow/scripts/mot_correct.py¶

# open the log file for writing
with open(snakemake.log[0], "w") as log_handle:

Now we can build the command, using the snakemake object to obtain the necessary information for the command arguments:

Note

In AFNI, the volume within a file is referenced using square brackets after the path.

workflow/scripts/mot_correct.py¶

# open the log file for writing
with open(snakemake.log[0], "w") as log_handle:

    cmd = [
        "3dvolreg",
        "-base", f"{snakemake.input.base}[{snakemake.params.base_volume}]",
        "-prefix", snakemake.output.img,
        snakemake.input.img,
    ]

It is helpful for record-keeping and debugging to store the command that was executed. We can do that by printing the command to the log file:

workflow/scripts/mot_correct.py¶

# open the log file for writing
with open(snakemake.log[0], "w") as log_handle:

    cmd = [
        "3dvolreg",
        "-base", f"{snakemake.input.base}[{snakemake.params.base_volume}]",
        "-prefix", snakemake.output.img,
        snakemake.input.img,
    ]

    # keep a record of the command in the log file
    print(" ".join(cmd), file=log_handle, flush=True)

Finally, we execute the command using the run function from the built-in Python subprocess package — providing the log file handle so that any content from standard output or standard error is stored within the log file:

workflow/scripts/mot_correct.py¶

import subprocess

# open the log file for writing
with open(snakemake.log[0], "w") as log_handle:

    cmd = [
        "3dvolreg",
        "-base", f"{snakemake.input.base}[{snakemake.params.base_volume}]",
        "-prefix", snakemake.output.img,
        snakemake.input.img,
    ]

    # keep a record of the command in the log file
    print(" ".join(cmd), file=log_handle, flush=True)

    # run the command
    subprocess.run(cmd, stdout=log_handle, stderr=log_handle, check=True)

Resources¶

We can provide an indication of the RAM that will be used by a single invocation of the rule by setting the mem key as part of the resources directive. Here, we will specify that Snakemake should budget for the job using up to 1 GB of RAM.

workflow/rules/mot_correct.smk¶

rule mot_correct:
    "Runs motion correction"
    input:
        img=rules.acquire_func.output.img,
        base=lambda wildcards: rules.acquire_func.output.img.format(
            sub_num=wildcards.sub_num,
            task="stopsignal",
        ),
    output:
        img="results/sub-{sub_num}/func/sub-{sub_num}_task-{task}_desc-mc_bold.nii.gz",
    params:
        base_volume=0,
    container:
        CONTAINER_SOURCES["AFNI"]
    log:
        "logs/mot_correct/mot_correct_{sub_num}_{task}.txt"
    resources:
        mem="1GB",
    script:
        "../scripts/mot_correct.py"

Preparing for execution¶

As usual, our next step is to add the new rule file to the Snakefile:

workflow/Snakefile¶

include: "rules/common.smk"
include: "rules/acquire_anat.smk"
include: "rules/acquire_func.smk"
include: "rules/mot_correct.smk"

rule all:
    input:
        expand(rules.acquire_anat.output.img, sub_num=SUB_NUMS),
        expand(rules.acquire_func.output.img, sub_num=SUB_NUMS, task=TASKS),

Then we need to tell Snakemake that we want to create the output from the motion correction rule, via the all rule. Note that we no longer need to specify the output from the acquire_func rule in all, because the output from acquire_func is needed as an input for mot_correct.

workflow/Snakefile¶

include: "rules/common.smk"
include: "rules/acquire_anat.smk"
include: "rules/acquire_func.smk"
include: "rules/mot_correct.smk"

rule all:
    input:
        expand(rules.acquire_anat.output.img, sub_num=SUB_NUMS),
        expand(rules.mot_correct.output.img, sub_num=SUB_NUMS, task=TASKS),

If you do a test invocation of Snakemake, you will find that Snakemake gives an error:

$ uv run snakemake --dry-run

AmbiguousRuleException:
Rules mot_correct and acquire_func are ambiguous for the file results/sub-10159/func/sub-10159_task-taskswitch_desc-mc_bold.nii.gz.
Consider starting rule output with a unique prefix, constrain your wildcards, or use the ruleorder directive.
Wildcards:
        mot_correct: sub_num=10159,task=taskswitch
        acquire_func: sub_num=10159,task=taskswitch_desc-mc
Expected input files:
        mot_correct: results/sub-10159/func/sub-10159_task-taskswitch_bold.nii.gz results/sub-10159/func/sub-10159_task-stopsignal_bold.nii.gz
        acquire_func:
Expected output files:
        mot_correct: results/sub-10159/func/sub-10159_task-taskswitch_desc-mc_bold.nii.gz
        acquire_func: results/sub-10159/func/sub-10159_task-taskswitch_desc-mc_bold.nii.gz

The error is caused by Snakemake getting confused about how to resolve the wildcard values. From the mot_correct rule, it correctly infers the wildcards (task=taskswitch). However, the task wildcard is inferred incorrectly in the acquire_func rule (task=taskswitch_desc-mc).

The solution is to provide a constraint on the potential wildcard values, using the wildcard_constraints directive; i.e., we tell Snakemake that the task wildcard can only possibly resolve to taskswitch or stopsignal. Given the task wildcard applies over multiple rules, we specify this constraint within the Snakefile:

workflow/Snakefile¶

include: "rules/common.smk"
include: "rules/acquire_anat.smk"
include: "rules/acquire_func.smk"
include: "rules/mot_correct.smk"

wildcard_constraints:
    task="|".join(TASKS)

rule all:
    input:
        expand(rules.acquire_anat.output.img, sub_num=SUB_NUMS),
        expand(rules.mot_correct.output.img, sub_num=SUB_NUMS, task=TASKS),

If you now run Snakemake again, it should correctly resolve the wildcards:

$ uv run snakemake --dry-run

Executing the workflow¶

Finally, you can run Snakemake and execute the workflow:

$ uv run snakemake

Note that it will not need to re-run the acquire_anat and acquire_func rules — it knows that the output from those rules is already present, so it only needs to run the mot_correct rule to produce all the inputs to the all rule.

Note

Snakemake can sometimes fail at this point due to being unable to download the AFNI container. Downloading from the package repository can become throttled and time out.

An alternative approach is to manually acquire the container:

$ mkdir containers
$ apptainer pull containers/afni.sif docker://ghcr.io/neurodesk/afni_26.0.07:20260128

This downloads the container into the path containers/afni.sif. This path can then be referenced in the common.smk rule instead of the URL (i.e., the value of the AFNI key in the CONTAINER_SOURCES dictionary becomes "containers/afni.sif").