Your outflow pipelines can run on different backends. Three backends are currently available in outflow:

  • default

  • parallel

  • slurm

Default backend

The default backend executes tasks sequentially in a single Python process. Each task returns a dictionary, and its entries are passed to the next tasks as parameters, so objects never leave the Python process. You can therefore pass objects of any size between tasks.
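The idea can be illustrated with a minimal sketch in plain Python. This is not outflow's implementation: run_sequentially, LargeDataset, load and summarize are hypothetical names used only to show how dictionaries returned by one task become the parameters of the next, all within one process.

```python
def run_sequentially(tasks):
    """Call each task in order, feeding it the previous task's dictionary."""
    outputs = {}
    for task in tasks:
        # The returned dictionary is unpacked as keyword arguments
        # for the next task; nothing is serialized or copied.
        outputs = task(**outputs)
    return outputs


class LargeDataset:
    """Stand-in for an arbitrarily large in-memory object (hypothetical)."""

    def __init__(self, size):
        self.values = list(range(size))


shared_dataset = LargeDataset(100_000)


def load():
    return {"dataset": shared_dataset}


def summarize(dataset):
    # `dataset` is the very object returned by load(), not a copy.
    return {
        "same_object": dataset is shared_dataset,
        "total": sum(dataset.values),
    }


result = run_sequentially([load, summarize])
```

Since both tasks run in the same process, the second task receives a reference to the original object, which is why its size does not matter here.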

Parallel backend

The parallel backend executes tasks inside a multiprocessing pool, so independent tasks in two different branches of a workflow can run at the same time. It is especially useful with MapTask.

Slurm backend

The slurm backend executes MapTasks by submitting a Slurm job array to a Slurm cluster. See SlurmMapTask for details.

Specify a backend

There are multiple ways to define the backend that will be used:

  • on the command line with the argument --backend

  • per command, inside the command definition

  • inside the config.yml file

The sources are applied in the following order of priority:

command line > command definition > config.yml
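The priority rule can be sketched as a small lookup that returns the first value that is set. This is only an illustration, not outflow's code: resolve_backend and its parameters are hypothetical names, and the fallback to "default" when nothing is set is an assumption.

```python
from typing import Optional


def resolve_backend(
    cli_backend: Optional[str] = None,
    command_backend: Optional[str] = None,
    config_backend: Optional[str] = None,
) -> str:
    """Return the first backend that is set, from highest to lowest priority."""
    for candidate in (cli_backend, command_backend, config_backend):
        if candidate is not None:
            return candidate
    # Assumption: with no backend set anywhere, use the default backend.
    return "default"


# The command line wins over the command definition and config.yml.
chosen = resolve_backend(
    cli_backend="default",
    command_backend="parallel",
    config_backend="slurm",
)
```

With this ordering, a backend set in a command definition applies to normal runs, while a value given on the command line still overrides it for a single invocation.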

The recommended approach is to leave backend: default in config.yml (or remove the key) and to define the preferred backend for a given command in its definition:

class MyCommand(Command):

To better track what is happening, you can still run this command with the default backend: python my_command --backend default. This is useful for debugging.