- Difficulty level: easy to intermediate
- Time needed to learn: 30 minutes or less
- Key points:
  - Steps can access system configurations through variables `CONFIG` and `SOS_VERSION`
  - Steps can access workflow and step information through variables `step_name`, `step_id`, `workflow_id`, and `master_id`
  - Steps have step- and substep-specific variables such as `step_input`, `_input`, `_depends`, and `_output`
  - Steps can use variables defined locally and in global sections
  - Steps can inherit variables passed as output of other steps, e.g. from `output_from`
  - Steps can use variables shared by other steps through `sos_variable`
SoS steps are isolated in the sense that they have access to a limited set of variables and, by default, do not share variables. This tutorial lists all variables that can be used in a step.
Variables `SOS_VERSION` and `CONFIG`
- `SOS_VERSION` is a string for the version of the SoS interpreter
- `CONFIG` is a dictionary containing content from multiple system configuration files
SoS provides two system variables: `SOS_VERSION`, which is the version of the SoS interpreter, and `CONFIG`, which is a dictionary that contains all the configurations stored in system configuration files, as well as configurations specified with option `-c` (config).
Variable `SOS_VERSION` is just a string.
Variable `CONFIG` is a dictionary and can be much more complex. For this particular system, it contains keys `localhost`, `hosts`, and `user_name`, and you can access values in this dictionary just like in any other dictionary, for example with `CONFIG['user_name']`.
Note that, by default, `CONFIG` only contains content from the system configuration files `{SOS_DIRECTORY}/site_config.yml`, `~/.sos/hosts.yml`, and `~/.sos/config.yml`. If you have your own configuration in a file in JSON or YAML format, you can specify it from the command line using option `-c`. The content of that file will then be available in variable `CONFIG`.
For example, suppose a workflow uses a `report` action to create a `local.yml` file with content
my_setting: 5
Then, with option `-c local.yml`, the content of this file becomes part of `CONFIG` and can be used as
CONFIG['my_setting']
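The effect of option `-c` can be pictured with plain Python dictionaries. This is only a minimal sketch, not SoS code: the values in `system_config` are hypothetical placeholders, the user file is assumed to be JSON rather than YAML so that only the standard library is needed, and the simple `dict.update` merge is an assumption about the end result, not a description of how SoS implements it.

```python
import json

# Hypothetical stand-in for content loaded from the system configuration files.
system_config = {
    "localhost": "workstation",
    "hosts": {"workstation": {"address": "localhost"}},
    "user_name": "alice",
}

# Content of a user configuration file passed with -c, written as JSON here;
# the tutorial's local.yml holds the same single setting in YAML form.
local_config_text = '{"my_setting": 5}'

# Assumed merge: user-supplied settings are added on top of the system settings.
CONFIG = dict(system_config)
CONFIG.update(json.loads(local_config_text))

print(CONFIG["my_setting"])
print(sorted(CONFIG))
```

After the merge, the user setting sits next to the system keys and is read with ordinary dictionary indexing.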
Variables `step_name`, `step_id`, `workflow_id` and `master_id`
- `step_name`: name of the step
- `step_id`: hash ID of the step, which is determined by the content of the step
- `workflow_id`: hash ID of the workflow in which the step is defined. It would be the ID of the nested workflow if the workflow is nested.
- `master_id`: hash ID of the entire workflow, regardless of whether the step is defined in a nested workflow.
SoS passes identifications of workflows and steps as variables during the execution of a step. For example, if one of two steps belongs to a nested workflow `nested`, the two steps have different step names and workflow IDs, but they share the same `master_id`.
These variables are mostly used internally, although you can use them to create step- or workflow-specific log messages or output files. `step_name`, however, is an exception because it is useful for creating steps that can be used by multiple workflows.
For example, if you have a workflow that can handle both human and mouse data, you can define a section that can be used by both workflows. Inside the step, you can use variable `step_name` to determine which workflow is being executed and act accordingly.
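The dispatch logic inside such a shared step can be sketched in plain Python. The sketch assumes step names of the form `human_10` or `mouse_10` (workflow name, underscore, index); the helper `reference_for` and the reference paths are hypothetical and stand in for whatever the step actually needs to choose.

```python
# Hypothetical reference locations, one per workflow.
REFERENCES = {
    "human": "/path/to/human/reference",
    "mouse": "/path/to/mouse/reference",
}


def reference_for(step_name):
    """Pick a reference from the workflow prefix of a step name such as 'mouse_10'."""
    workflow = step_name.split("_")[0]
    return REFERENCES[workflow]


# In a real step, step_name would be provided by SoS; here it is passed in by hand.
print(reference_for("human_10"))
print(reference_for("mouse_10"))
```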
In SoS, the `input` statement mostly creates a `step_input` object with provided parameters. That is to say,
input: 'a.txt', 'b.txt', group_by=1
is almost equivalent to
step_input = sos_targets('a.txt', 'b.txt', group_by=1)
and we can use `sos_targets` objects directly in an `input` statement in more complicated cases.
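The relationship between `step_input` and the per-substep `_input` can be modeled with ordinary lists. This is a rough sketch that does not use the `sos_targets` class: `group_inputs` is a hypothetical helper, and it assumes only that `group_by=1` places each input file in its own group, with one substep per group.

```python
def group_inputs(targets, group_by):
    """Split a list of targets into consecutive groups of `group_by` items."""
    return [targets[i:i + group_by] for i in range(0, len(targets), group_by)]


# Plain-list stand-in for the step-level input.
step_input = ["a.txt", "b.txt"]

# Each group plays the role of _input in one substep.
for _input in group_inputs(step_input, group_by=1):
    print(_input)
```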
Global sections of a workflow can be considered as part of the step process. As a matter of fact, because steps and substeps are executed in separate processes, statements in global sections will be executed repeatedly for each step and substep. Consequently, you can use variables or parameters defined in the global sections in each step.
However, because steps are executed separately, each step works on its own copy of the variables defined by the global section: the variables are available to all steps, but changing them in one step will not affect their values in other steps.
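This isolation can be mimicked in plain Python by re-executing the global statements in a fresh namespace for every step. The sketch below is a simplified model and not how SoS is implemented: the variable `threshold`, the step names, and the `run_step` helper are all hypothetical, and a new dictionary stands in for a separate process.

```python
# Source of a hypothetical global section and two hypothetical step bodies.
GLOBAL_SECTION = "threshold = 10"
STEP_BODIES = {
    "step_10": "threshold += 5",  # modifies its own copy of the variable
    "step_20": "pass",            # leaves the variable untouched
}


def run_step(body):
    """Run one step body in a fresh namespace that re-executes the global section."""
    namespace = {}
    exec(GLOBAL_SECTION, namespace)  # global statements are repeated for every step
    exec(body, namespace)
    return namespace["threshold"]


results = {name: run_step(body) for name, body in STEP_BODIES.items()}
print(results)
```

The change made in `step_10` is visible only in the namespace of that step, while `step_20` still sees the value set by the global section.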
`_output` of a step is a `sos_targets` object. As with other SoS targets, you can attach variables to this object. Interestingly, when `_output` is passed to another step and becomes the `_input` of a substep, its associated variables will be accessible from the substep.
For example, consider a workflow in which an attribute `my_i` is assigned to each `_output` with the value of variable `i`. When step `20` inherits the output of step `10` and executes the substeps, `my_i` becomes available to each substep.
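The mechanism can be illustrated with a small plain-Python model. Everything here is a hypothetical stand-in and not the SoS implementation: the `Target` class imitates an output that carries extra variables, its `set` method only mirrors the name of `_output.set()`, and the file names are invented for the example.

```python
class Target:
    """Tiny stand-in for an output target that can carry extra variables."""

    def __init__(self, name):
        self.name = name
        self.attached = {}

    def set(self, **variables):
        """Attach variables to this target and return the target itself."""
        self.attached.update(variables)
        return self


# "Step 10": one output per value of i, each tagged with my_i.
outputs = [Target(f"out_{i}.txt").set(my_i=i) for i in range(3)]


def run_substep(_input):
    """'Step 20': the substep namespace is seeded with the variables attached to its input."""
    namespace = {"_input": _input.name, **_input.attached}
    return f"{namespace['_input']} has my_i={namespace['my_i']}"


messages = [run_substep(target) for target in outputs]
print(messages)
```

The receiving substep never defines `my_i` itself, which is why such variables can look as if they appear from nowhere.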
The same holds for outputs imported by functions `named_output` and `output_from`. For example, using `output_from('A')` in the input of step `default` imports the `step_output` and its groups from step `A`, and `my_i` attached to `_input` becomes available in the substeps of `default`.
Inherited output variables can be tricky to use
Whereas attaching variables to `_output` allows you to attach information to the output of a step and pass such information around with it, use of such variables can be tricky because they are defined implicitly by `_output.set()` and can appear magical to users who are not familiar with SoS.
Sharing variables between steps
Sharing of variables can be achieved by using the `shared` section option in the source step and `depends: sos_variable(name)` in the destination step. The usage pattern is
[A: shared='var']
var = ...
...
[B]
depends: sos_variable('var')
Finally, if you really want to access variables created in another step, you will have to explicitly share that variable between the steps. For example, suppose a variable `var` is defined in step `A` and exposed to other steps through the `shared` step option. In step `default`, a `depends: sos_variable('var')` statement makes sure that step `A` is executed before step `default`, and transfers the variable from `A` to `default`.
The tutorial How to pass variables between SoS steps describes the `shared` section option in more detail.