Workflow automation and orchestration, on your workstation and in HPC environments using Snakemake#
SkelShop includes tools for running extraction pipelines orchestrated with Snakemake. These workflows can be run on a single node, but more typically they are run in an HPC environment, where SLURM orchestrates a heterogeneous mix of GPU and CPU nodes so that skeleton/face extraction from a large video corpus finishes in a reasonable amount of time.
SkelShop is intended to be run in a Singularity container in a SLURM-based HPC environment. You run Snakemake on one node (typically a login node, since this node performs no heavy computation), and the actual steps run on different nodes chosen according to a JSON configuration file, all from a single Singularity container. This workflow is enabled by the singslurm2 project, which is based on the Snakemake SLURM profile.
Running Snakemake on a single node#
Snakemake can be run on a single node, which might be appropriate if you have a very small video corpus or a lot of time(!)
For example, assuming you have followed the manual installation instructions and want to use 8 cores:
$ cd /path/to/skelshop
$ poetry run snakemake tracked_all \
--cores 8 \
--config \
VIDEO_BASE=/path/to/my/video/corpus/ \
DUMP_BASE=/path/to/my/dump/directory
Running Snakemake on a SLURM cluster#
First set up singslurm2:
$ cd ~
$ git clone --recursive
Now you can download the Docker image with Singularity:
$ singularity pull skelshop.sif docker://
Next, create a JSON file specifying which type of node to assign to each rule (step in the workflow). There is an example for the Case Western Reserve University SLURM cluster. See also the SLURM documentation and the SLURM Snakemake profile documentation for information on how to write this file.
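As a rough illustration of what such a file might contain, the sketch below writes a configuration in the `__default__`-plus-per-rule layout of Snakemake cluster configuration files. The rule name `hypothetical_gpu_rule`, the partition names, and the resource values are all made-up placeholders, and the keys are only assumed to be sbatch-style option names. Take real rule names from `snakemake --list` and check the SLURM Snakemake profile documentation for the keys your setup accepts.

```python
import json
from pathlib import Path

# Hypothetical example: neither the rule name nor the partitions come from
# SkelShop or from any real cluster.
EXAMPLE_CONFIG = {
    # Options applied to every rule unless overridden below.
    "__default__": {"partition": "cpu", "time": "02:00:00", "mem": "8G"},
    # Per-rule overrides, e.g. sending a GPU-heavy step to a GPU partition.
    "hypothetical_gpu_rule": {
        "partition": "gpu",
        "gres": "gpu:1",
        "time": "12:00:00",
    },
}


def write_cluster_config(path, config):
    """Serialise the configuration to a JSON file and return its path."""
    path = Path(path)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return path


def options_for(config, rule):
    """Return the defaults merged with any overrides for one rule."""
    merged = dict(config.get("__default__", {}))
    merged.update(config.get(rule, {}))
    return merged
```

Calling `options_for` on each rule name is a quick way to check which options a rule would end up with before submitting anything. The path of the written file is what the `CLUSC_CONF` environment variable points to.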
You can see the names of the steps in the workflow at any time by running:
$ poetry run snakemake --list
So for example you might:
Download the example cluster configuration.
$ wget
Edit it if need be.
Then run the following command after editing the placeholders:
$ SIF_PATH=$(pwd)/skelshop.sif \
SNAKEFILE=/opt/skelshop/workflow/Snakefile \
CLUSC_CONF=$(pwd)/skels.tracked.clusc.json \
NUM_JOBS=42 \
SING_EXTRA_ARGS="--bind /path/to/my/extra/bind" \
~/singslurm2/ \
tracked_all \
--config \
VIDEO_BASE=/path/to/my/video/corpus/ \
DUMP_BASE=/path/to/my/dump/directory
Please see the singslurm repository
for more information about the environment variables passed to singslurm2/
Integrating SkelShop into your own pipelines#
If you are using SkelShop as part of a larger pipeline, or want to customise
your workflow further, you should write your own Snakefile. See the
Snakemake documentation. You may want to reuse the rules and scripts from
SkelShop. In this case, the current best approach is to copy or symlink
everything under workflow/rules and workflow/scripts into your own workflow.
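The symlinking approach could be sketched as follows. This is a minimal illustration, not a tool shipped with SkelShop: the function name `link_skelshop_workflow` and its arguments are hypothetical, and it assumes your own workflow directory should mirror the `rules` and `scripts` subdirectories.

```python
from pathlib import Path


def link_skelshop_workflow(skelshop_root, my_workflow, subdirs=("rules", "scripts")):
    """Symlink every entry under SkelShop's workflow/rules and workflow/scripts
    into the matching subdirectory of your own workflow directory.

    Returns the list of symlinks that were created.
    """
    skelshop_root = Path(skelshop_root).resolve()
    my_workflow = Path(my_workflow)
    linked = []
    for subdir in subdirs:
        src_dir = skelshop_root / "workflow" / subdir
        dst_dir = my_workflow / subdir
        dst_dir.mkdir(parents=True, exist_ok=True)
        for src in sorted(src_dir.iterdir()):
            dst = dst_dir / src.name
            if dst.exists() or dst.is_symlink():
                continue  # never overwrite files you already have
            dst.symlink_to(src)  # absolute target, since src_dir is resolved
            linked.append(dst)
    return linked
```

For example, `link_skelshop_workflow("/path/to/skelshop", "workflow")` would populate `workflow/rules` and `workflow/scripts`, after which your own Snakefile can pull in the linked rule files with Snakemake's `include:` directive. Symlinks keep your copy in step with the SkelShop checkout. Copy the files instead if you need to modify them.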
Other HPC utilities#
The contrib/slurm directory contains some run examples that are specific to the Case Western Reserve University SLURM cluster.