
Global and local variables

  • Difficulty level: easy to intermediate
  • Time needed to learn: 30 minutes or less
  • Key points:
    • Steps can access system configurations through variables CONFIG and SOS_VERSION
    • Steps can access workflow and step information through variables step_name, step_id, workflow_id, and master_id
    • Steps have step and substep specific variables such as step_input, _input, _depends, _output
    • Steps can use variables defined locally and in global sections
    • Steps can inherit variables passed as output of other steps, e.g. from output_from
    • Steps can use variables shared by other steps through sos_variable

SoS steps are isolated in the sense that they have access to a limited set of variables and, by default, do not share variables with each other. This tutorial lists all variables that can be used in a step.

System variables SOS_VERSION and CONFIG

SoS provides two system variables. SOS_VERSION is the version of the SoS interpreter. CONFIG is a dictionary that contains all configurations stored in the system configuration files, together with any configurations specified with option -c (config).

Variable SOS_VERSION is just a string:

In [1]:
Out[1]:
'0.21.2'

Variable CONFIG is a dictionary and can be much more complex. For this particular system, its keys include localhost, hosts, and user_name,

In [2]:
Out[2]:
dict_keys(['localhost', 'dqs-server', 'hosts', 'cutoff', 'user_name'])

and you can access values in this dictionary just like any other dictionary:

In [3]:
Out[3]:
'bpeng1'
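
Because CONFIG behaves like an ordinary Python dictionary, the usual dictionary idioms apply. The sketch below uses a stand-in dictionary rather than the real CONFIG object: the key names and the user_name value are taken from the output above, while every other value and the max_threads key are placeholders invented for illustration.

```python
# Stand-in for the CONFIG variable that SoS provides inside a step.
# Only the key names and the user_name value come from the output above;
# every other value is a placeholder invented for illustration.
CONFIG = {
    'localhost': 'placeholder-host',
    'dqs-server': {},
    'hosts': {},
    'cutoff': 0,
    'user_name': 'bpeng1',
}

# Direct indexing raises KeyError if the key is missing.
user = CONFIG['user_name']

# .get() supplies a fallback for keys that may not be configured.
# 'max_threads' is a hypothetical key, not one defined by SoS.
threads = CONFIG.get('max_threads', 1)

print(user, threads)
```

Using .get() with a default is a reasonable choice when a workflow should still run on systems whose configuration files lack an optional key.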

Note that, by default, CONFIG only contains content from the system configuration files {SOS_DIRECTORY}/site_config.yml, ~/.sos/hosts.yml, and ~/.sos/config.yml. If you keep your own configuration in a file in JSON or YAML format, you can specify it from the command line using option -c. The content of that file will then be available in variable CONFIG.

For example, using a report action, the following workflow creates a local.yml file with content

my_setting: 5
In [4]:

Then, with option -c local.yml, the content of this file becomes part of CONFIG and can be used as

CONFIG['my_setting']
In [5]:
[#] 1 step processed (1 job completed)
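
One way to picture how the file given with -c ends up in CONFIG is as a dictionary merge. The following is a plain-Python sketch, not the loading code SoS actually uses: the function name load_config is hypothetical, JSON is used only because it is available in the standard library, and the rule that later sources override earlier ones is an assumption that this tutorial does not state.

```python
import json

def load_config(*sources):
    """Merge JSON documents into one dict; later sources win (an assumption)."""
    config = {}
    for text in sources:
        config.update(json.loads(text))
    return config

# Placeholder standing in for the system configuration files.
system_config = '{"user_name": "bpeng1"}'
# Content equivalent to local.yml above, written as JSON.
local_config = '{"my_setting": 5}'

CONFIG = load_config(system_config, local_config)
print(CONFIG['my_setting'])
```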

Workflow and step identifications

SoS passes the identifications of workflows and steps as variables during the execution of a step. In the following example, the two steps have different step names and workflow IDs because nested is a nested workflow, but they share the same master_id.

In [6]:
[##] 2 steps processed (2 jobs completed)

These variables are mostly used internally, although you can use them to create step- or workflow-specific log messages or output files. step_name is an exception, because it is useful for creating steps that can be used by multiple workflows.

For example, if you have a workflow that can handle both human and mouse data, you can define a section that can be used by both workflows. Inside the step, you can use variable step_name to determine which workflow is being executed and act accordingly:

In [7]:
[#] 1 step processed (1 job completed)
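
The branching logic inside such a shared step can be sketched in plain Python. The function name reference_genome, the step names human_10 and mouse_10, and the genome labels are all hypothetical; in a real workflow, step_name is supplied by SoS rather than passed in as an argument.

```python
def reference_genome(step_name):
    """Pick a reference based on which workflow the step is running in."""
    # Assumes hypothetical step names of the form '<workflow>_<index>'.
    workflow = step_name.split('_')[0]
    if workflow == 'human':
        return 'hg38'   # illustrative label only
    if workflow == 'mouse':
        return 'mm10'   # illustrative label only
    raise ValueError(f'unexpected step name: {step_name}')

print(reference_genome('human_10'))
print(reference_genome('mouse_10'))
```

Raising an error for an unrecognized name keeps a shared step from silently running with the wrong settings if it is later reused by a third workflow.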

Runtime variables for steps and substeps

step_input and _input

In SoS, the input statement mostly creates a step_input object with provided parameters. That is to say,

input: 'a.txt', 'b.txt', group_by=1

is almost equivalent to

step_input = sos_targets('a.txt', 'b.txt', group_by=1)

and we can use sos_targets objects directly in an input statement in more complicated cases.
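
The effect of group_by=1 can be pictured as splitting the input files into groups of one, with each group serving as the _input of one substep. The helper below is a plain-Python model with a hypothetical name (group_inputs); it does not use or reproduce the sos_targets class.

```python
def group_inputs(files, group_by):
    """Split files into consecutive groups of group_by items."""
    return [files[i:i + group_by] for i in range(0, len(files), group_by)]

# Plain list standing in for the step_input object.
step_input = ['a.txt', 'b.txt']

# Each group plays the role of _input for one substep.
for _input in group_inputs(step_input, group_by=1):
    print(_input)
```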

Variables defined in global sections

Global sections of a workflow can be considered as part of the step process. As a matter of fact, because steps and substeps are executed in separate processes, statements in global sections will be executed repeatedly for each step and substep. Consequently, you can use variables or parameters defined in the global sections in each step.

In [8]:
[##] 2 steps processed (2 jobs completed)

However, because steps are executed separately, although variables defined by the global section are shared by all steps, changing these variables in a step will not affect variables in other steps:

In [9]:
[##] 2 steps processed (2 jobs completed)
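
This behavior can be modeled in plain Python by re-running the global section in a fresh namespace for every step. The sketch below is only an analogy for the isolation described above, not how SoS is implemented; the names GLOBAL_SECTION and run_step, and the variable cutoff with its value, are made up for illustration.

```python
# Source text standing in for a workflow's global section.
GLOBAL_SECTION = "cutoff = 5"

def run_step(step_body):
    """Run the global section, then the step body, in a fresh namespace."""
    namespace = {}
    exec(GLOBAL_SECTION, namespace)   # re-executed for every step
    exec(step_body, namespace)
    return namespace['cutoff']

# The first step modifies the variable; the second only reads it.
seen_by_step_10 = run_step("cutoff = cutoff + 1")
seen_by_step_20 = run_step("pass")

print(seen_by_step_10, seen_by_step_20)
```

Because each call starts from a new namespace, the change made in the first step never reaches the second one, which still sees the value assigned in the global section.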

Variables passed from output of other steps

The _output of a step is a sos_targets object. As with other SoS targets, you can attach variables to this object. When _output is passed to another step and becomes the _input of a substep, its associated variables become accessible from that substep.

For example, in the following workflow, an attribute my_i is assigned to each _output with value of variable i. When step 20 inherits the output of step 10 and executes the substeps, my_i becomes available to each substep.

In [10]:
[##] 2 steps processed (4 jobs completed)
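
The way attached variables travel with the output can be sketched with a small stand-in class. Everything here is hypothetical and written in plain Python: the Target class, the step_10 and step_20 functions, and the file names are illustrations, and the real sos_targets interface may differ.

```python
class Target:
    """Minimal stand-in for an output group that can carry variables."""
    def __init__(self, path, **attrs):
        self.path = path
        self.attrs = attrs

def step_10():
    # One output per substep, each tagged with the loop variable i.
    return [Target(f'a_{i}.txt', my_i=i) for i in range(2)]

def step_20(step_input):
    # Each substep sees one group and the variables attached to it.
    return [(_input.path, _input.attrs['my_i']) for _input in step_input]

print(step_20(step_10()))
```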

The same holds for outputs imported by functions named_output and output_from. For example, the use of output_from('A') in the following workflow imports the step_output and its groups from step A, and my_i attached to _input becomes available in the substeps of default.

In [11]:
[##] 2 steps processed (4 jobs completed)

Variables shared from other steps

Finally, if you really want to access variables created in another step, you have to share those variables between the steps explicitly. For example, in the following workflow, a variable var is defined in step A and exposed to other steps through the shared step option. In step default, a depends: sos_variable('var') statement makes sure that step A is executed before step default, and transfers the variable from A to default.

In [12]:
[##] 2 steps processed (2 jobs completed)
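
Explicit sharing can be modeled as exporting only the named variables from one step's namespace. This is a plain-Python analogy rather than SoS code: the run_step function and its shared parameter are hypothetical, as are the variable private and the values used.

```python
def run_step(body, shared=()):
    """Run a step body in its own namespace; export only shared names."""
    namespace = {}
    exec(body, namespace)
    return {name: namespace[name] for name in shared}

# Step A defines two variables but shares only 'var'.
exported = run_step("var = 100\nprivate = 'hidden'", shared=['var'])

# The dependent step starts from the variables shared by A.
consumer = dict(exported)
print(consumer)
```

Only names listed as shared cross the step boundary, so anything else defined in step A stays private to it.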

Tutorial How to pass variables between SoS steps describes the shared section option in more detail.