- Difficulty level: easy
- Time needed to learn: 10 minutes or less
- Key points:
  - Task options walltime, nodes, cores, and mem specify the resources needed for tasks
  - These options are adjusted automatically when tasks are combined with the trunk_size and trunk_workers options
  - These options will be presented to task templates in standardized formats
  - These options can be overridden by values specified on the command line
- Task options
When you submit tasks to a cluster system, you need to specify the resources required for your task. These resources can be specified on the command line, but are mostly specified as comments in the submitted job script.
A typical PBS shell script looks like
#!/bin/bash
#PBS -N 1fafb4489a7b4b47
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -l mem=2GB
#PBS -o ~/.sos/tasks/1fafb4489a7b4b47.out
#PBS -e ~/.sos/tasks/1fafb4489a7b4b47.err
#PBS -m n
module load R
sos execute 1fafb4489a7b4b47 -v 2 -s force -m interactive
and is expanded from a task_template similar to this:
hosts:
  htc:
    queue: medium
    task_template: |
      #!/bin/bash
      #PBS -N {job_name}
      #PBS -l nodes={nodes}:ppn={cores}
      #PBS -l walltime={walltime}
      #PBS -l mem={mem//10**9}GB
      #PBS -q {queue}
      #PBS -o ~/.sos/tasks/{task}.out
      #PBS -e ~/.sos/tasks/{task}.err
      #PBS -m n
      #PBS -v {workdir}
      module load R
      {command}
The template has the following variables:
- job_name: this is typically the same as task
- command: this is supplied by SoS, which is a sos execute ... command
- nodes, cores, walltime, mem: resource parameters
- workdir: the current working directory, which will be translated to the corresponding path on the remote host
- Customized variables such as queue
All these variables need to be properly specified to successfully generate the task execution script.
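The expansion described above can be sketched in Python. This is only an illustration under stated assumptions: the function name expand_template is hypothetical, the use of eval for expressions such as {mem//10**9} is a simplification, and SoS's own interpolation may work differently.

```python
import re


def expand_template(template: str, variables: dict) -> str:
    """Replace each {expression} with its value evaluated against `variables`.

    Hypothetical helper for illustration; not SoS's actual implementation.
    """

    def substitute(match: re.Match) -> str:
        expression = match.group(1)
        try:
            # eval lets expressions such as mem//10**9 work; builtins are
            # disabled so only the supplied variables are visible.
            return str(eval(expression, {"__builtins__": {}}, variables))
        except NameError as exc:
            # An undefined variable means the script cannot be generated.
            raise KeyError(f"template variable not defined: {exc}") from None

    return re.sub(r"\{([^{}]+)\}", substitute, template)


template = """#PBS -N {job_name}
#PBS -l nodes={nodes}:ppn={cores}
#PBS -l walltime={walltime}
#PBS -l mem={mem//10**9}GB
#PBS -q {queue}
{command}"""

variables = {
    "job_name": "1fafb4489a7b4b47",
    "nodes": 1,
    "cores": 1,
    "walltime": "01:00:00",
    "mem": 2_000_000_000,  # bytes
    "queue": "medium",
    "command": "sos execute 1fafb4489a7b4b47 -v 2 -s force -m interactive",
}

script = expand_template(template, variables)
print(script)
```

Leaving out any variable used by the template (for example queue) makes this sketch raise a KeyError, which mirrors the requirement that all variables be defined before a script can be generated.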
The resource options such as walltime and cores will be sent to individual task queues in the appropriate format. You do not have to specify all options because task queues can support a subset of these options, and some task queues provide default values (and some do not). It is however generally a good idea to specify them all so that your tasks can be executed on all types of task queues.
The execution options such as workdir, env, and concurrent specify the environments in which tasks will be submitted and executed.
task_template is expanded with variables defined in
- Command line
- Task options
- Host definition
in that order; the first definition found for a variable is used.
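This lookup order can be modeled with Python's collections.ChainMap, which searches a list of mappings from left to right and returns the first match. The dictionaries and their values below are made up for illustration, and this is a sketch of the precedence rule rather than the way SoS implements it.

```python
from collections import ChainMap

# Hypothetical values from the three sources, lowest priority first
host_definition = {"queue": "medium", "walltime": "01:00:00"}
task_options = {"walltime": "10:00:00", "mem": "4G"}
command_line = {"queue": "long", "walltime": "24:00:00"}

# ChainMap searches its mappings left to right and returns the first match
variables = ChainMap(command_line, task_options, host_definition)
resolved = dict(variables)
print(resolved)

# Without command line definitions, task options and host defaults apply
resolved_without_cli = dict(ChainMap({}, task_options, host_definition))
print(resolved_without_cli)
```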
As a lesser-known feature, option -q accepts KEY=VALUE definitions in addition to the name of a queue. For example,
%run -q queue=long walltime=24:00:00
will specify variables queue and walltime with values 'long' and '24:00:00' respectively, which will override variables defined in task options and host definitions.
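As a rough sketch of how such arguments could be told apart, the hypothetical function below treats any value containing = as a variable definition and anything else as a queue name. The function name and the exact rules are assumptions for illustration; SoS's own parsing may differ.

```python
def parse_queue_args(args):
    """Split the values given to -q into a queue name and KEY=VALUE variables.

    Hypothetical helper for illustration; not part of SoS.
    """
    queue_name = None
    variables = {}
    for arg in args:
        if "=" in arg:
            # Split on the first '=' only, so values may contain '='
            key, value = arg.split("=", 1)
            variables[key] = value
        else:
            queue_name = arg
    return queue_name, variables


name, variables = parse_queue_args(["queue=long", "walltime=24:00:00"])
print(name, variables)
```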
The task statement accepts arbitrary keyword arguments. SoS will process reserved arguments such as walltime and mem and pass the rest directly to task_template. For example, you can define variables for the template as
task: queue='htc', walltime='24:00:00', queue='long', mem='4G'
The configuration of a host can have any keys, which are used as default values of the variables. For example, in the host definition above, queue is defined as medium, so the medium queue on the cluster will be used if queue is not defined in task options or on the command line.
SoS recognizes the following resource-related variables that are commonly used in task templates. Because different cluster systems use different syntax for these variables, SoS accepts a variety of input for these parameters and passes a standard format to templates.
walltime: Estimated maximum running time of the task. This parameter is sent to the task queue, and it is up to the task queue to decide whether the task is killed if it cannot be completed within the specified walltime.
walltime can be specified as a string in the format HH:MM:SS, where HH, MM, and SS are hours, minutes, and seconds, or as an integer with unit s (second), m (minute), h (hour), or d (day). SoS converts all input to the format HH:MM:SS when walltime is passed to task_template. That is to say, you could use walltime='120m' or walltime='2h' and the templates will see walltime='02:00:00' in both cases.
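The conversion to HH:MM:SS can be sketched as follows. The function normalize_walltime is hypothetical; how it handles days (folded into the hours field) and how strictly it rejects other input are assumptions of this sketch, not documented SoS behavior.

```python
import re

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def normalize_walltime(value) -> str:
    """Convert 'HH:MM:SS' or '<integer><unit>' into 'HH:MM:SS'.

    Hypothetical helper for illustration; not SoS's actual implementation.
    """
    text = str(value).strip()
    match = re.fullmatch(r"(\d+):(\d{1,2}):(\d{1,2})", text)
    if match:
        hours, minutes, seconds = (int(part) for part in match.groups())
        total = hours * 3600 + minutes * 60 + seconds
    else:
        match = re.fullmatch(r"(\d+)\s*([smhd])", text)
        if not match:
            raise ValueError(f"unrecognized walltime: {value!r}")
        total = int(match.group(1)) * UNIT_SECONDS[match.group(2)]
    # Assumption: days are folded into hours, so '1d' becomes '24:00:00'
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


for example in ("120m", "2h", "01:00:00", "1d"):
    print(example, "->", normalize_walltime(example))
```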
nodes: Number of computing nodes that a task will use, which defaults to 1.
cores: Number of cores on each computing node, which corresponds to the ppn option of a PBS system. This option defaults to 1 if left unspecified.
mem: The total amount of memory needed across all nodes. The default unit is bytes, so you can specify an integer (number of bytes) for this option. It is however more convenient to specify it with other units such as megabytes (mem=4000MB), gigabytes (mem=4GB), or gibibytes (mem=4GiB), although all inputs are converted to bytes internally. To use this option in a task_template, you generally need to use expressions such as {mem//10**9}GB to convert it to a cluster-specific format.
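The unit handling can be illustrated with a small sketch. The function mem_to_bytes and its unit table are hypothetical and cover only decimal (MB, GB, ...) and binary (MiB, GiB, ...) suffixes; the set of spellings that SoS accepts may be wider.

```python
import re

# Decimal units are powers of 1000, binary units are powers of 1024
UNIT_BYTES = {
    "": 1, "B": 1,
    "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12,
    "KIB": 2**10, "MIB": 2**20, "GIB": 2**30, "TIB": 2**40,
}


def mem_to_bytes(value) -> int:
    """Convert an integer or a string such as '4GB' into a number of bytes.

    Hypothetical helper for illustration; not SoS's actual implementation.
    """
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]*)\s*", value)
    if not match or match.group(2).upper() not in UNIT_BYTES:
        raise ValueError(f"unrecognized memory size: {value!r}")
    return int(match.group(1)) * UNIT_BYTES[match.group(2).upper()]


mem = mem_to_bytes("4GB")
# Integer division keeps the result an integer, so the template gets '4GB'
pbs_mem = f"{mem // 10**9}GB"
print(mem, pbs_mem)
print(mem_to_bytes("4000MB"), mem_to_bytes("4GiB"))
```

Note that 4GB and 4GiB are different amounts: the first is 4 × 10^9 bytes and the second is 4 × 2^30 bytes, so an integer-division expression in the template rounds down to whole units of whichever size it divides by.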
Options walltime, mem, cores, and nodes define the resources required for a single task. If multiple tasks are combined into a master task with options trunk_size and trunk_workers, the resources for the master task are calculated automatically. For example, trunk_size=10 will increase walltime tenfold, and trunk_workers=2 will cut the total walltime by half but double cores and mem. Please see Combining tasks (options trunk_size and trunk_workers) for details.
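The arithmetic in this example can be written out as a short sketch. The function master_task_resources is hypothetical, and rounding the number of rounds up when trunk_size is not a multiple of trunk_workers is an assumption; the exact calculation used by SoS may differ.

```python
import math


def master_task_resources(walltime_seconds, cores, mem_bytes,
                          trunk_size=1, trunk_workers=1):
    """Estimate resources for a master task that combines several tasks.

    Hypothetical helper for illustration; not SoS's actual calculation.
    """
    # Tasks run in batches of `trunk_workers`; rounding up is an assumption
    rounds = math.ceil(trunk_size / trunk_workers)
    return {
        "walltime_seconds": walltime_seconds * rounds,
        "cores": cores * trunk_workers,
        "mem_bytes": mem_bytes * trunk_workers,
    }


# A single task needs 1 hour, 1 core, and 2 GB
sequential = master_task_resources(3600, 1, 2 * 10**9, trunk_size=10)
parallel = master_task_resources(3600, 1, 2 * 10**9,
                                 trunk_size=10, trunk_workers=2)
print(sequential)
print(parallel)
```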