Our workflows are composed of various pieces of bioinformatics software that, simply put, read some input, do some work, and write some output. During the middle step, a program may create intermediate files that it uses while working but no longer needs once it has completed. Keeping these intermediate files would be wasteful, so Unix-like systems provide a dedicated location where such temporary files can be routinely cleaned up without risking important data.

However, the default temporary directory provided by your system (by convention, /tmp) may be too small, frequently full, or both, causing programs to crash with errors such as “No space left on device”. Compute clusters are often configured by a system administrator to provide a larger, shared temporary location that accommodates many users and their jobs, as this is easier to manage than a separate temporary directory on each compute node that could fill up and cause programs to crash. In such cases, running programs need to be instructed to place intermediate files elsewhere to avoid errors.

This post is written for users of Singularity who are aware that they have a non-default temporary directory and need to configure Nextflow appropriately. If you’re a system administrator who has been sent this page, you can find a TL;DR at the bottom. Otherwise, by the end of this guide you’ll know:

What Singularity uses the temporary directories for, and what the TMPDIR and SINGULARITY_TMPDIR environment variables are.
How to configure a Nextflow workflow running with Singularity to use a non-default temporary directory with TMPDIR.
As a bonus: How to use the NXF_SINGULARITY_CACHEDIR environment variable to re-use Singularity images built by Nextflow, saving disk space and workflow time.

These instructions are written with SingularityCE in mind, but we expect them to be relevant to Apptainer users too.

Aside: How did I know you were using Singularity, and why don’t Docker users need to worry about this?

We package up the software required for our workflows inside Docker images, which are effectively snapshots of a computer that have everything required for a workflow (or part of one) preinstalled. Docker and Singularity use these images to start a virtual computer that our workflows run inside, ensuring that the software we’ve packaged runs on your computer just as it did during our testing.

How Docker and Singularity differ is well beyond the scope of this how-to, but the main point is that Docker containers are automatically configured with their own temporary directory that is not mapped to the system’s temporary directory, meaning that a small or full temporary directory will not cause the programs inside the Docker container to crash. Singularity, however, will make your system’s temporary directory available to the jobs running inside your container by default.

So, if you’re in an environment where you want to use a non-default temporary directory, you will need to provide some additional configuration to avoid potential workflow errors.

Singularity and its temporary directories

Thankfully, a deep dive into Singularity is also beyond the scope of this how-to, and you just need to know that, among other things, Singularity needs a location for:

Storing intermediate work when running bioinformatics programs inside Singularity containers as part of a workflow execution
Converting our published Docker images into Singularity compatible images (this happens first, but it makes sense to talk about this second)
Storing those converted Singularity images (this takes us away from temporary matters, but more on this later)

We’ll talk about the configuration required for each in turn.

TMPDIR and singularity.runOptions: For intermediate work

Unix-like systems provide ways for people and programs to ask the system for a temporary file or directory. You can see this for yourself: for example on the command line, the mktemp command will return the path of a temporary file for you to use. Neatly, these utilities will honour the TMPDIR environment variable, which is used to indicate where temporary files should be placed instead of whatever the system default is (e.g. /tmp). Users (or their administrators) can override the location of the temporary directory by setting the TMPDIR environment variable and most well-behaved programs will then place their temporary files in the TMPDIR location.
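To see this for yourself, here is a minimal sketch that points TMPDIR at a scratch directory and asks mktemp for a file. The demo_tmp directory is an assumption made purely for illustration; on your cluster you would use the location recommended by your administrator.

```shell
# Create a scratch directory to stand in for a non-default temporary location
# (illustrative only; demo_tmp is not a name used by Nextflow or Singularity).
demo_tmp="$(mktemp -d)"

# With TMPDIR exported, mktemp places new temporary files under it
export TMPDIR="$demo_tmp"
tmp_file="$(mktemp)"
echo "temporary file created at: $tmp_file"

# Remove the demonstration file again
rm -f "$tmp_file"
```

The path that is printed should sit inside the scratch directory rather than the system default.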

Two things must be configured for intermediate files created by programs during workflow execution to be placed in the desired TMPDIR correctly:

The TMPDIR environment variable itself
Nextflow must be configured with singularity.runOptions to instruct Singularity to make the TMPDIR location visible to the Nextflow processes running inside Singularity containers

TMPDIR

Your system administrator may have set TMPDIR already, but you can set it yourself in the script or shell before calling nextflow run to run our workflow with:

export TMPDIR=/your/path/to/tmp/here

You should consult any documentation you have been provided about your cluster to determine if there is an appropriate TMPDIR to use. If in doubt, consult your system administrator.
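Before committing to a location, it can help to confirm that it exists and that you can write to it. The following is a rough sketch under the assumption that you already have a candidate path in mind; check_tmpdir is a hypothetical helper written for this post, not something provided by Nextflow or Singularity.

```shell
# check_tmpdir is a hypothetical helper: it reports whether a candidate
# temporary directory exists and is writable by the current user.
check_tmpdir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir" >&2
        return 1
    fi
    # Show size, used and available space for the filesystem holding $dir
    df -h "$dir"
}

# Usage: only export TMPDIR once the candidate passes the check
candidate="${TMPDIR:-/tmp}"
if check_tmpdir "$candidate"; then
    export TMPDIR="$candidate"
fi
```

A check like this will not tell you whether the location is the right one for your cluster, so your administrator's guidance still takes priority.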

singularity.runOptions

Nextflow will forward the value of TMPDIR to all programs inside a workflow. The catch is that Singularity will not automatically make the location of TMPDIR visible to the containers, causing an ominous “failed to create file via template: No such file or directory” error. You may well have been linked to this post by us to explain and resolve this very issue!

To instruct Singularity to make the TMPDIR visible to Nextflow processes running in Singularity containers, you should edit your global Nextflow configuration (which is typically stored at $HOME/.nextflow/config, but see Nextflow’s documentation for further guidance). This file may exist already, or you may have to create it. Either way you will want to add the following line:

singularity.runOptions = "-B \$TMPDIR"

This will adjust Nextflow’s Singularity options to “bind” (i.e. make visible) the location of TMPDIR to all Nextflow processes, resolving the “No such file or directory” error you may have encountered. If you have existing singularity.runOptions in this file, you should append -B \$TMPDIR inside the double quotes. Make sure to keep the backslash in \$: it stops Nextflow from expanding TMPDIR prematurely as a Nextflow variable.
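If you would rather script this edit than make it by hand, something along these lines may work. It is a sketch only: add_tmpdir_bind is a hypothetical helper, and it deliberately refuses to touch a file that already mentions runOptions, because merging with existing options is best done by a person.

```shell
# add_tmpdir_bind is a hypothetical helper: it appends the bind option to a
# Nextflow config file unless runOptions already appears somewhere in it.
add_tmpdir_bind() {
    config="$1"
    mkdir -p "$(dirname "$config")"
    touch "$config"
    if grep -q 'runOptions' "$config"; then
        # Do not guess how to merge with existing options
        echo "runOptions already present in $config; edit it by hand" >&2
        return 1
    fi
    # Single quotes keep \$TMPDIR literal so the shell does not expand it here
    printf '%s\n' 'singularity.runOptions = "-B \$TMPDIR"' >> "$config"
}

# Usage (the typical location of the global config):
# add_tmpdir_bind "$HOME/.nextflow/config"
```

The usage line is commented out so that nothing is written until you have checked the path is right for your setup.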

If you’re on a cluster with a shared temporary directory, you could alternatively ask your system administrator to add a permanent bind in singularity.conf; this will avoid needing to provide singularity.runOptions.

SINGULARITY_TMPDIR: For converting Docker images to Singularity images

Naming things happens to be one of the two most difficult problems in computing (along with cache invalidation and off-by-one errors). With this in mind, you would be forgiven for thinking that the SINGULARITY_TMPDIR environment variable would tell programs running inside Singularity containers what TMPDIR to use. Instead, SINGULARITY_TMPDIR tells Singularity itself what temporary directory it should use.

When running one of our Nextflow workflows with Singularity, any container images that are required for the workflow to run will be automatically downloaded from the internet and converted into Singularity images by Nextflow, usually rather seamlessly. During this process, Singularity will read the Docker image as input, and write a Singularity compatible image as output. Just like our bioinformatics tools, Singularity needs to do some intermediate work to unpack the Docker image and build a Singularity one. Singularity requires at least as much space as the size of the resulting image, which will be several gigabytes at a minimum. If the system’s default temporary directory is small, you may encounter obscure errors during this Singularity build step.

Our recommendation is to set the SINGULARITY_TMPDIR environment variable to your TMPDIR:

export SINGULARITY_TMPDIR=$TMPDIR

This should be set after TMPDIR but before calling nextflow run.
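Since the build step needs several gigabytes of scratch space, a quick free-space check before launching can save a failed run. The sketch below is illustrative: free_kb is a hypothetical helper, the 5 GiB threshold is an arbitrary figure chosen for the example, and the fallback from SINGULARITY_TMPDIR to TMPDIR to /tmp is an assumption made so that the snippet runs anywhere.

```shell
# free_kb is a hypothetical helper: it prints the space available (in KiB)
# on the filesystem that holds the given directory.
free_kb() {
    # -P gives portable single-line records, -k reports 1024-byte blocks;
    # column 4 of the second line is the available space
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Fallback order here is an assumption for the sketch
build_tmp="${SINGULARITY_TMPDIR:-${TMPDIR:-/tmp}}"
required_kb=$((5 * 1024 * 1024))   # 5 GiB, an arbitrary illustrative threshold

available_kb="$(free_kb "$build_tmp")"
if [ "${available_kb:-0}" -lt "$required_kb" ]; then
    echo "warning: only ${available_kb:-0} KiB free in $build_tmp" >&2
fi
```

Adjust the threshold to suit the images your workflow uses.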

NXF_SINGULARITY_CACHEDIR: For storing Singularity images (built via Nextflow)

Orthogonal to temporary directories but relevant to Singularity and disk space, I would be remiss to not mention the problem of storing the Singularity images built when using Nextflow. By default, Nextflow will store the Singularity images it builds in a directory inside the work/ directory created by the workflow you are running. Each time you run a Nextflow workflow, it will have its own work/ directory, which means that each time you run a Nextflow workflow with Singularity, Docker images will be downloaded and converted all over again!

To our rescue comes the NXF_SINGULARITY_CACHEDIR environment variable, which specifies the location that Singularity images should be saved to after they have been built. Before downloading and converting images, Nextflow checks the NXF_SINGULARITY_CACHEDIR for existing images, only downloading and converting images it has not seen before, which saves time and disk space. This is particularly useful on a cluster, as any user with access to the NXF_SINGULARITY_CACHEDIR location can re-use Singularity images that already exist.

Like the other environment variables, you will set this in your shell or in your script before calling nextflow run:

export NXF_SINGULARITY_CACHEDIR=/your/path/to/saved/images/here

You should speak to your system administrator about setting an appropriate location for NXF_SINGULARITY_CACHEDIR that can be shared between users.
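As a small convenience, you might wrap this in a function that creates the directory if needed and tells you how much is already in it. This is a sketch under assumptions: use_image_cache is a hypothetical helper, and the path in the usage line is a placeholder.

```shell
# use_image_cache is a hypothetical helper: it creates the cache directory
# if needed, exports NXF_SINGULARITY_CACHEDIR, and counts existing entries.
use_image_cache() {
    mkdir -p "$1" || return 1
    export NXF_SINGULARITY_CACHEDIR="$1"
    # Count the entries already saved in the cache directory
    count="$(ls -1 "$1" | wc -l | tr -d ' ')"
    echo "$count file(s) already in $1"
}

# Usage (placeholder path):
# use_image_cache /your/path/to/saved/images/here
```

If the count is above zero on a shared cluster, another user may already have built the images you need.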

TL;DR: A crash course on avoiding crashes with TMPDIR

We’ve covered the reason that Singularity is different to Docker when it comes to temporary files, and what Singularity needs to store on your disk. To recap the configuration required for setting a non-default temporary directory, you’ll want to:

Set your desired temporary directory with TMPDIR somewhere that nextflow run will see it (e.g. your shell or job script)
Instruct Nextflow to bind TMPDIR to Singularity containers with singularity.runOptions in your global Nextflow config
Instruct Singularity to use that temporary directory for converting images with SINGULARITY_TMPDIR
Instruct Nextflow to store converted images for re-use with NXF_SINGULARITY_CACHEDIR

You may want to put these lines in your shell’s rc file (e.g. ~/.bashrc), but if you’re still not sure what these lines do, you should speak to your administrator first:

export TMPDIR=/your/path/to/tmp/here
export SINGULARITY_TMPDIR=$TMPDIR
export NXF_SINGULARITY_CACHEDIR=/your/path/to/saved/images/here

Don’t forget to update your global Nextflow configuration to update the Singularity run options:

singularity.runOptions = "-B \$TMPDIR"
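If you want a last sanity check before calling nextflow run, a short preflight along these lines could be added to your job script. It is a sketch only: preflight is a hypothetical helper, and it checks nothing more than that each variable is set and points to an existing directory.

```shell
# preflight is a hypothetical helper: it checks that each variable discussed
# in this post is set and refers to a directory that exists.
preflight() {
    status=0
    for name in TMPDIR SINGULARITY_TMPDIR NXF_SINGULARITY_CACHEDIR; do
        # Look up the value of the variable whose name is held in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name points to a missing directory: $value" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage: report problems before launching the workflow
preflight || echo "fix the settings above before calling nextflow run" >&2
```

It does not inspect your Nextflow configuration, so the singularity.runOptions line still needs checking by eye.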

Now you should be all set to run a Nextflow workflow with Singularity and fill up your non-default temporary directory!

Sam Nicholls

Workflow wrangler
