
Explicit step and workflow dependency

  • Difficulty level: intermediate
  • Time needed to learn: 10 minutes or less
  • Key points:
    • A step can explicitly depend on another step with target sos_step(step_name)
    • A step can explicitly depend on another workflow with target sos_step(workflow_name)

A step can depend on the execution of another step that provides its required input or dependent targets. However, if the step to be executed does not produce any output, you can depend on the step itself instead of its outputs.

Use sos_step to depend on a specific step

The sos_step target represents a SoS step and provides a straightforward way to specify step dependencies. For example, adding sos_step("init") to the depends statement of step 10 forces step init to execute before step 10, as the DAG of the workflow shows.
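The notebook cell behind this example is not reproduced here. A minimal sketch of a script matching the description could look like the following; the step bodies (the print calls) are illustrative assumptions, not the original code.

```sos
# Hypothetical reconstruction; step bodies are placeholders.
[init]
print('Initializing')

[10]
# init produces no output, so depend on the step itself.
depends: sos_step("init")
print(f'I am {step_name}')
```

Saved to a file (for instance a hypothetical example.sos), a script like this would be run with sos run example.sos.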

In [1]:
[##] 2 steps processed (2 jobs completed)

Depending on a workflow

Technically, sos_step('init') creates a dependency on a workflow init with a single step. This can be extended to workflows with multiple steps. For example, in the following workflow, the default step depends on target sos_step('work'), so steps work_1 and work_2 are executed before step default.
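As a rough sketch of such a workflow (the original cell is missing from this text), the script below defines a two-step workflow work and a default step that depends on it. The file names and the use of _output.touch() to create the files are assumptions made for illustration.

```sos
# Hypothetical reconstruction; file names and step bodies are assumed.
[work_1]
output: 'result.txt'
_output.touch()

[work_2]
output: 'result.txt.bak'
_output.touch()

[default]
# Depends on the whole workflow "work", that is work_1 and work_2.
depends: sos_step('work')
print('Running default after workflow work')
```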

In [2]:
[###] 3 steps processed (3 jobs completed)
> sos_step_wf.dot (859 B):
[Figure: workflow DAG rendered from sos_step_wf.dot]

If you have learned about the function sos_run and know how to execute nested workflows, you might be wondering about the difference between

depends: sos_step('work')

and

sos_run('work')

The difference becomes clear when we observe the DAG and the output of the following workflow. When you run a nested workflow with sos_run, a separate DAG is created and executed inside step default: the default step starts first, and within it the workflow work is created and executed. In contrast, with depends: sos_step('work'), as in the previous workflow, the default step is executed only after the work workflow.
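A sketch of the nested variant, under the same assumptions as before (the original cell is not available; the output names are taken from the log shown below and the step bodies are placeholders), might read:

```sos
# Hypothetical reconstruction of the nested-workflow variant.
[work_1]
output: 'result.txt'
_output.touch()

[work_2]
output: 'result.txt.bak'
_output.touch()

[default]
# Workflow "work" runs inside this step, in its own DAG.
sos_run('work')
```

Here default appears first in the log because work is started from within the body of default rather than scheduled ahead of it.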

In [3]:
INFO: Running default:
INFO: Running work_1:
INFO: work_1 output: result.txt
INFO: Running work_2:
INFO: work_2 output: result.txt.bak
INFO: Workflow default (ID=312e32b9792bd988) is executed successfully with 3 completed steps.
INFO: Workflow DAG saved to sos_run_wf.dot
> sos_run_wf.dot (917 B):
[Figure: workflow DAG rendered from sos_run_wf.dot]

Depending on one step of a numerically indexed workflow

When you add a step as a dependency to another step, you are introducing a new set of dependencies, and perhaps a new set of nodes to the workflow. For example, because step B depends on step A, adding sos_step('B') to the default step actually adds both steps B and A.
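One way this chain could be written is sketched below. The original cell is not shown, so how B depends on A is an assumption: here it is expressed with an explicit sos_step target, although a dependency through input and output files would have the same effect.

```sos
# Hypothetical reconstruction; B is assumed to depend on A explicitly.
[A]
print('Running A')

[B]
depends: sos_step('A')
print('Running B')

[default]
# Pulls in B, which in turn pulls in A.
depends: sos_step('B')
print('Running default')
```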

In [4]:
[###] 3 steps processed (3 jobs completed)

This also works for implicit dependencies of numerically indexed workflows, where step dependencies are created by default from the indexes. For example, in the following workflow, steps B_1 and B_2 create their own output files and do not appear to be related. However, adding step B_2 through sos_step('B_2') will also add step B_1 because B_2 depends on B_1.
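A possible shape for this workflow is sketched here; the file names b1.txt and b2.txt are hypothetical, and the original cell may differ.

```sos
# Hypothetical reconstruction; file names are placeholders.
[B_1]
output: 'b1.txt'
_output.touch()

[B_2]
# No input statement: B_2 implicitly follows B_1 by index.
output: 'b2.txt'
_output.touch()

[default]
depends: sos_step('B_2')
print('Running default')
```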

In [5]:
[###] 3 steps processed (3 jobs completed)

If this is not what you want, that is, if you only want to include step B_2, you need to state explicitly that step B_2 does not have any input. For example, if you add input: None to step B_2, B_1 will no longer be included with B_2.
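Under the same assumptions as the previous sketch (hypothetical file names, original cell unavailable), the change amounts to one extra line in step B_2:

```sos
# Hypothetical reconstruction; file names are placeholders.
[B_1]
output: 'b1.txt'
_output.touch()

[B_2]
# Declare that B_2 takes no input, so it no longer follows B_1.
input: None
output: 'b2.txt'
_output.touch()

[default]
depends: sos_step('B_2')
print('Running default')
```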

In [6]:
[##] 2 steps processed (2 jobs completed)