SLURM notes

(31 May 2022) python emacs

The compute cluster at HMS uses SLURM as a job scheduler. It has taken be a good 6 months of working on the cluster on and off to have gotten efficient at using it as a compute cluster and not just a high powered node on the interactive partition.

The painful part of the tooling in bioinformatics is that a lot of the pipelines eventually involve R, and R packages are really not meant to be run as scripts. I am still struggling to find a way of running R scripts with easy debugging from the command line.

Useful bash aliases

alias sint="srun --pty -p interactive -t 0-2:00 --mem 2G -c 2 /bin/bash" 
alias ws="watch squeue -u $(whomai)" 

Snakefile cluster config

{
    "__default__" :
    {
        "n" : 1,
        "c" : 1,
        "p" : "short",
        "mem" : "4G",
        "t" : "0-0:30"
    },
    "metaspades" :
    {
        "c" : 4,
        "mem" : "8G",
        "t" : "0-0:45"
    },
    "cutadapt" :
    {
        "c" : 4,
        "mem" : "2G",
        "t" : "0-1:00"
    },
    "kraken2" :
    {
        "c" : 4,
        "mem" : "64G",
        "t" : "0-0:30"
    },
    "usearch" :
    {
        "t" : "0-1:00"
    },
    "blca" :
    {
        "mem" : "8G",
        "t" : "0-1:00"
    },
    "bowtie2" :
    {
        "c" : 8,
        "mem" : "8G",
        "t" : "0-0:30"
    },
    "blastn" :
    {
        "c" : 6,
        "t" : "0-"
    }
}

And finally this is a useful sbatch template which sends you an email when the job finishes.

    #!/bin/bash
    #SBATCH -p short
    #SBATCH -t 0-1:0:0
    #SBATCH -c 4
    #SBATCH --mem=16G
    #SBATCH -o %j.out ##  %j uses job id
    #SBATCH -e %j.err ##
    #SBATCH --mail-user=you@email.id
    #SBATCH --mail-type=END
    # module load commands
    YOUR COMMAND HERE