- Difficulty level: intermediate
- Time needed to learn: 20 minutes or less
- Key points:
  - Variables defined in a step are not accessible from other steps
  - Variables can be shared with steps that depend on them through `sos_variable` targets
SoS executes each step in a separate process and by default does not return any result to the master SoS process. The `shared` option is used to share variables between steps. This option accepts:
- A string (variable name),
- A map between variable names and expressions (strings) that will be evaluated upon the completion of the step, or
- A sequence of strings (variables) or maps.
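To make the three accepted forms concrete, here is a toy model in plain Python. It is not SoS syntax and not how SoS is implemented; the helper names `normalize_shared` and `collect_shared` are hypothetical, and the sketch assumes that a bare variable name behaves like an expression that evaluates to the variable itself.

```python
def normalize_shared(shared):
    """Turn a string, a mapping, or a sequence of either into {name: expression}."""
    if isinstance(shared, str):
        # Assumption: a bare name is an expression that evaluates to itself.
        return {shared: shared}
    if isinstance(shared, dict):
        return dict(shared)
    merged = {}
    for item in shared:
        merged.update(normalize_shared(item))
    return merged


def collect_shared(shared, step_namespace):
    """Evaluate each expression in the namespace the finished step left behind."""
    return {
        name: eval(expr, {}, step_namespace)
        for name, expr in normalize_shared(shared).items()
    }


# Namespace of a hypothetical finished step (placeholder values).
finished = {"res": 1, "step_output": {"stat": "a.txt"}, "scratch": [0, 0, 0]}

print(collect_shared("res", finished))
print(collect_shared({"stat": 'step_output["stat"]'}, finished))
print(collect_shared(["res", {"stat": 'step_output["stat"]'}], finished))
```

Only the names listed in `shared` leave the step; `scratch` stays behind in every case.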
For example,
The dict format of the `shared` option allows the specification of expressions to be evaluated after the completion of the step, and can be used to pass pieces of `step_output` as follows:
When we share variables from a step, the variables are available to the steps that are executed after it. This is why `res` and `stat` are accessible from step `20` after the completion of step `10`. However, in a more general case, a step needs to depend on a `sos_variable` target to access the shared variable in a non-forward-style workflow.
For example, in the following workflow, two `sos_variable` targets create dependencies on steps `notebookCount` and `lineCount`, so that these two steps are executed before `default` and provide the required variables.
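The dependency logic can be sketched as a toy model in plain Python. This is not SoS syntax or SoS's scheduler. The function `run_workflow`, the variable names `numNotebooks` and `numLines`, and the values 3 and 120 are all hypothetical placeholders; only the step names come from the text above.

```python
def run_workflow(steps, target):
    """Run `target` after the steps that share the variables it depends on."""
    # Map each shared variable to the step that provides it.
    providers = {var: name for name, step in steps.items() for var in step["shared"]}
    shared_vars, order = {}, []

    def run(name):
        if name in order:
            return
        step = steps[name]
        for var in step["depends"]:
            run(providers[var])  # provider steps are executed first
        # A step sees only the variables it explicitly depends on.
        visible = {var: shared_vars[var] for var in step["depends"]}
        result = step["body"](visible)
        shared_vars.update({var: result[var] for var in step["shared"]})
        order.append(name)

    run(target)
    return order, shared_vars


def notebook_count(_visible):
    return {"numNotebooks": 3, "tmp": "not shared"}  # placeholder value


def line_count(_visible):
    return {"numLines": 120}  # placeholder value


def default(visible):
    print(f"{visible['numNotebooks']} notebooks, {visible['numLines']} lines")
    return {}


steps = {
    "notebookCount": {"shared": ["numNotebooks"], "depends": [], "body": notebook_count},
    "lineCount": {"shared": ["numLines"], "depends": [], "body": line_count},
    "default": {"shared": [], "depends": ["numNotebooks", "numLines"], "body": default},
}

order, shared_vars = run_workflow(steps, "default")
print(order)
```

Requesting `default` pulls in both provider steps even though nothing in the step numbering implies an order.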
When you share a variable from a step with multiple substeps, each substep has its own copy of the variable and it is uncertain which copy SoS will return. The current implementation returns the variable from the last substep, but this is not guaranteed.
For example, in the following workflow multiple random seeds are generated, but only the last `seed` is shared outside of step `1` and obtained by step `2`.
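The last-substep behaviour can be imitated with a toy model in plain Python. It is not SoS code, and the function `run_step` is hypothetical. Returning the last substep's copy is hard-coded here only to mirror the current behaviour described above, which is not guaranteed.

```python
import random


def run_step(n_substeps, shared_name, generator):
    """Run a step body once per substep; each substep gets its own namespace."""
    namespaces = []
    for i in range(n_substeps):
        ns = {"i": i}  # fresh namespace per substep
        ns["seed"] = generator.randint(0, 1000)  # the step body
        namespaces.append(ns)
    # Toy assumption: only the last substep's copy is handed back.
    return namespaces, {shared_name: namespaces[-1][shared_name]}


generator = random.Random(0)  # fixed seed so that reruns print the same numbers
namespaces, shared = run_step(5, "seed", generator)

print([ns["seed"] for ns in namespaces])  # one seed per substep
print(shared)  # a single value survives
```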
If you would like to see the variable from all substeps, you can prefix the variable name with `step_`, which is a convention for option `shared` to collect variables from all substeps.
You can also use the `step_*` variables in expressions, as in the following example:
Here we used `group_by='all'` to collapse multiple substeps into one.
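How a `step_` prefix and an expression over it might fit together can be sketched as a toy model in plain Python. This is not SoS syntax. The helpers `run_substeps` and `collect_shared` and the names `rng`, `rngs` and `summed` are hypothetical, and the sketch assumes that `step_<name>` is simply the list of `<name>` across substeps.

```python
import random


def run_substeps(n_substeps, generator):
    """Run the step body once per substep and keep every substep's namespace."""
    return [{"rng": generator.randint(0, 10)} for _ in range(n_substeps)]


def collect_shared(shared, namespaces):
    """Evaluate shared expressions; step_<name> lists <name> across substeps."""
    scope = dict(namespaces[-1])  # plain names resolve to the last substep
    names = {key for ns in namespaces for key in ns}
    for name in names:
        scope[f"step_{name}"] = [ns[name] for ns in namespaces]
    return {target: eval(expr, {}, scope) for target, expr in shared.items()}


namespaces = run_substeps(10, random.Random(1))  # fixed seed for repeatability
shared = collect_shared({"rngs": "step_rng", "summed": "sum(step_rng)"}, namespaces)

# A downstream step that processes all inputs as one group sees the whole list.
print(shared["rngs"])
print(shared["summed"])
```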
Variables generated by external tasks add another layer of complexity because tasks usually do not share variables with the substep they belong to. To solve this problem, you have to use the `shared` option of `task` to return the variable to the substep:
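The reason an explicit hand-back is needed can be shown with a toy model in plain Python. It is not SoS syntax and not how SoS runs tasks. The helper `run_task` is hypothetical; it assumes the task body is Python source and that the requested values are JSON-serializable.

```python
import json
import subprocess
import sys


def run_task(body, shared):
    """Run `body` in a separate Python process and return only the `shared` names."""
    # The child prints the requested variables as JSON on its last output line.
    report = (
        "\nimport json\n"
        f"print(json.dumps({{k: globals()[k] for k in {list(shared)!r}}}))"
    )
    done = subprocess.run(
        [sys.executable, "-c", body + report],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(done.stdout.splitlines()[-1])


substep = {}  # namespace of the substep that owns the task
task_body = "rng = 4 * 2\nscratch = 'local to the task'"
substep.update(run_task(task_body, shared=["rng"]))

print(substep)  # only the requested variable crosses the process boundary
```

Once the variable is back in the substep, the step-level `shared` option can pass it on to other steps as usual.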