SLURM notes
The compute cluster at HMS uses SLURM as a job scheduler. It has taken be a good 6 months of working on the cluster on and off to have gotten efficient at using it as a compute cluster and not just a high powered node on the interactive partition.
The painful part of the tooling in bioinformatics is that a lot of the pipelines eventually involve R, and R packages are really not meant to be run as scripts. I am still struggling to find a way of running R scripts with easy debugging from the command line.
Useful bash aliases
alias sint="srun --pty -p interactive -t 0-2:00 --mem 2G -c 2 /bin/bash"
alias ws="watch squeue -u $(whomai)"
Snakefile cluster config
{
"__default__" :
{
"n" : 1,
"c" : 1,
"p" : "short",
"mem" : "4G",
"t" : "0-0:30"
},
"metaspades" :
{
"c" : 4,
"mem" : "8G",
"t" : "0-0:45"
},
"cutadapt" :
{
"c" : 4,
"mem" : "2G",
"t" : "0-1:00"
},
"kraken2" :
{
"c" : 4,
"mem" : "64G",
"t" : "0-0:30"
},
"usearch" :
{
"t" : "0-1:00"
},
"blca" :
{
"mem" : "8G",
"t" : "0-1:00"
},
"bowtie2" :
{
"c" : 8,
"mem" : "8G",
"t" : "0-0:30"
},
"blastn" :
{
"c" : 6,
"t" : "0-"
}
}
And finally this is a useful sbatch template which sends you an email when the job finishes.
#!/bin/bash
#SBATCH -p short
#SBATCH -t 0-1:0:0
#SBATCH -c 4
#SBATCH --mem=16G
#SBATCH -o %j.out ## %j uses job id
#SBATCH -e %j.err ##
#SBATCH --mail-user=you@email.id
#SBATCH --mail-type=END
# module load commands
YOUR COMMAND HERE