Add the following line to your `~/.bashrc`, then `source ~/.bashrc`:

```shell
alias myjob="qsub -I -q glean -l nodes=1:ppn=2 -l walltime=08:00:00"
```

You can then type `myjob` to quickly start an interactive job without needing to remember the details.
- Lack of configuration file support (e.g. YAML, JSON).
- Lack of support for jobs on HPC.
- Snakemake may die out in the future, but Make should still be around.
Snakemake:

- Written in Python, which makes it simple to use.
- Supports config files such as YAML and JSON.
- Supports HPC: both PBS and Slurm.

What it looks like:
```python
## optional config file
configfile: "config.yaml"

content = "say hi"
samples = ["a", "b"]

## all the output files you want to have
rule all:
    input:
        expand("flag/pre_{s}.done", s=samples),
        expand("flag/first_{s}.done", s=samples)

## then set up the rules for how to generate them
rule pre:
    output:
        # s is inferred from the targets requested in rule all;
        # here it will be a or b.
        # snakemake will run s=a and s=b in parallel if possible.
        # touch() will automatically generate the flag file
        # once the rule is done.
        touch("flag/pre_{s}.done")
    log:
        # one log per rule, so the rules do not overwrite each other's logs
        "log/pre_{s}.log"
    shell:
        """
        # wildcards.s to get a / b
        echo "pre:" {content} {wildcards.s} 2> {log}
        """

rule first:
    # snakemake knows this rule depends on the output of pre,
    # so it will run after pre
    input:
        "flag/pre_{s}.done"
    output:
        touch("flag/first_{s}.done")
    log:
        "log/first_{s}.log"
    shell:
        """
        echo "first:" {content} {wildcards.s} 2> {log}
        """
```
Save the above into a file named `demo.snakefile`, then run it with `snakemake --snakefile demo.snakefile --cores 2` (recent Snakemake versions require `--cores`; with 2 cores, the two samples can run in parallel).
Summary

- snakefile
  - just Python syntax plus Snakemake keywords, such as `config`, `rule`, `expand`, `wildcards`, and so on.
  - [OPTIONAL, but good in practice] a `config.yaml` to declare variables.
  - the `input` of `rule all` declares all the final outputs.
  - the `input` and `output` of each rule organize the dependencies between tasks.
- wildcards
  - inferred from file names.
  - the key mechanism for running multiple jobs in parallel.
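Conceptually, wildcard inference can be sketched in plain Python: given an output pattern and a requested target file, extract the wildcard values by matching. This is only an illustration, not Snakemake's actual implementation, and `infer_wildcards` is a hypothetical helper:

```python
import re

def infer_wildcards(pattern, target):
    """Turn an output pattern like 'flag/pre_{s}.done' into a regex
    with one named group per {wildcard}, then match a concrete file name."""
    parts = re.split(r"(\{\w+\})", pattern)
    regex = "".join(
        f"(?P<{p[1:-1]}>.+)" if p.startswith("{") else re.escape(p)
        for p in parts
    )
    m = re.fullmatch(regex, target)
    return m.groupdict() if m else None

# requesting flag/pre_a.done against the pattern flag/pre_{s}.done gives s=a
print(infer_wildcards("flag/pre_{s}.done", "flag/pre_a.done"))  # {'s': 'a'}
```

Once the wildcard values are known, each value (here `s=a` and `s=b`) becomes an independent job, which is what lets Snakemake schedule them in parallel.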
Use Snakemake to control jobs on TSCC

- Use a profile to set up the particular environment of the HPC.
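A minimal profile sketch for a PBS scheduler, in the older `--cluster` style (Snakemake ≥ 8 uses executor plugins instead); the path and resource values here are examples, not TSCC-specific settings:

```yaml
# ~/.config/snakemake/pbs/config.yaml
cluster: "qsub -l nodes=1:ppn={threads} -l walltime=08:00:00"
jobs: 10            # max jobs submitted to the scheduler at once
latency-wait: 60    # seconds to wait for output files on shared filesystems
```

With this in place, `snakemake --profile pbs --snakefile demo.snakefile` submits each rule as its own cluster job instead of running locally.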