Snakemake - override LSF cluster configuration (bsub) according to rules
Is it possible to define defaults for memory and resources in the cluster configuration file and then override them on a per-rule basis when needed? Is the resources field in rules directly tied to the cluster configuration file, or is it just a fancy way of writing the params field for readability?
In the example below, how can I use the default cluster configuration for rule a, but apply custom settings (memory=40000 and rusage=15000) for rule b?
cluster.json:
{
    "__default__":
    {
        "memory": 20000,
        "resources": "\"rusage[mem=8000] span[hosts=1]\"",
        "output": "logs/cluster/{rule}.{wildcards}.out",
        "error": "logs/cluster/{rule}.{wildcards}.err"
    }
}
Snakefile:
rule all:
    input:
        'a_out.txt', 'b_out.txt'

rule a:
    input:
        'a.txt'
    output:
        'a_out.txt'
    shell:
        'touch {output}'

rule b:
    input:
        'b.txt'
    output:
        'b_out.txt'
    shell:
        'touch {output}'
Command to execute:
snakemake --cluster-config cluster.json \
    --cluster "bsub -M {cluster.memory} -R {cluster.resources} -o logs.txt" \
    -j 50
I understand that rule-specific requirements can be defined in the cluster configuration file, but I would prefer to define them directly in the Snakefile if possible.
Or if there is a better way to implement this please let me know.
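For reference, the rule-specific override in the cluster configuration file mentioned above behaves roughly like a dictionary merge: the entry named after the rule is layered on top of __default__. The sketch below only illustrates that idea in plain Python and is not Snakemake's own code. The "b" entry and the helper name settings_for are hypothetical, built from the values asked about in the question.

```python
# Hypothetical cluster configuration: the "b" entry is an assumption that
# uses the values from the question (memory=40000, rusage=15000).
cluster_config = {
    "__default__": {
        "memory": 20000,
        "resources": '"rusage[mem=8000] span[hosts=1]"',
    },
    "b": {
        "memory": 40000,
        "resources": '"rusage[mem=15000] span[hosts=1]"',
    },
}


def settings_for(rule_name, config):
    """Return the defaults, overridden by any rule-specific entry."""
    merged = dict(config.get("__default__", {}))
    merged.update(config.get(rule_name, {}))
    return merged


# Rule a has no entry of its own, so it falls back to the defaults;
# rule b picks up its custom values.
settings_a = settings_for("a", cluster_config)
settings_b = settings_for("b", cluster_config)
print(settings_a["memory"], settings_b["memory"])
```

With this layout rule a keeps the default memory while rule b gets the larger value, which is the behaviour the question is after, only expressed in the configuration file rather than in the Snakefile.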
You can add resources directly to each of your rules:
rule all:
    input:
        'a_out.txt', 'b_out.txt'

rule a:
    input:
        'a.txt'
    output:
        'a_out.txt'
    resources:
        mem_mb=40000
    shell:
        'touch {output}'

rule b:
    input:
        'b.txt'
    output:
        'b_out.txt'
    resources:
        mem_mb=20000
    shell:
        'touch {output}'
Then remove the resource settings (memory and resources) from your .json file so that the command line doesn't override the Snakefile:
new.cluster.json:
{
    "__default__":
    {
        "output": "logs/cluster/{rule}.{wildcards}.out",
        "error": "logs/cluster/{rule}.{wildcards}.err"
    }
}
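Once the memory settings live in the Snakefile, the submit command can no longer rely on {cluster.memory} and would need to read the per-rule value instead, for example through a placeholder such as {resources.mem_mb}. The following Python sketch only mimics how such a template could be filled in for each rule; the template string, the helper render_submit_command, and the reuse of mem_mb for the rusage value are assumptions for illustration, not Snakemake internals.

```python
from types import SimpleNamespace

# Hypothetical submit template: memory comes from the rule's resources,
# log paths still come from the cluster configuration file.
TEMPLATE = (
    'bsub -M {resources.mem_mb} '
    '-R "rusage[mem={resources.mem_mb}] span[hosts=1]" '
    '-o {cluster.output} -e {cluster.error}'
)


def render_submit_command(template, resources, cluster):
    """Fill the template using attribute-style placeholders."""
    return template.format(
        resources=SimpleNamespace(**resources),
        cluster=SimpleNamespace(**cluster),
    )


# Values taken from the Snakefile above (rule a: 40000, rule b: 20000).
cmd_a = render_submit_command(
    TEMPLATE,
    {"mem_mb": 40000},
    {"output": "logs/cluster/a.out", "error": "logs/cluster/a.err"},
)
cmd_b = render_submit_command(
    TEMPLATE,
    {"mem_mb": 20000},
    {"output": "logs/cluster/b.out", "error": "logs/cluster/b.err"},
)
print(cmd_a)
print(cmd_b)
```

Each rule then gets its own memory request while the log locations stay in the cluster configuration file.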