Workflow-execution magics

  • Difficulty level: easy
  • Time needed to learn: 15 minutes or less
  • Key points:
    • SoS workflows can be embedded in Jupyter notebooks
    • Magic %run executes workflows defined in the current cell
    • Magic %sosrun executes workflows defined in the entire notebook
    • Magic %runfile executes workflows defined in a specified file

SoS Notebook is an IDE for SoS workflows that allows you to develop and execute workflows in a Jupyter environment.

Scratch workflow steps

The SoS workflow language is extended from Python 3.6, so you can execute any Python statement using the SoS kernel. In other words, you can use a SoS kernel just like a Python 3 kernel.

For example, the following cell uses SoS to execute a Python statement, which is treated as a simple SoS step without a header.

In [1]:
This is our first greeting: Hello world
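
The input of this cell is not shown on this page. A minimal sketch that is consistent with the output above and with the greeting variable used later, assuming the cell simply defines greeting and prints it with an f-string, could look like this:

```python
# Assumed content of the cell (the original input is not shown on this page).
# It is plain Python, which a SoS kernel treats as a headerless scratch step.
greeting = 'Hello world'
message = f'This is our first greeting: {greeting}'
print(message)
```

Because the statement is ordinary Python, the same lines behave identically in a Python 3 kernel.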

In addition to regular Python statements, you can use SoS-specific syntax, functions, and statements in SoS cells.

For example, the following cell uses the output statement to specify the step output, and calls the sh function in script format.

In [2]:

The statements are executed in a global SoS namespace, so variables defined in another cell (such as greeting) can be used here.

Technically speaking, we have just executed a single SoS step in a global SoS namespace. Such steps are called scratch steps because they do not contain a header.

In contrast, a formal SoS step is defined as a step with a header. Formal SoS steps and workflows have to be executed by SoS magics or commands. As a matter of fact, nothing will happen if you execute the following cell in Jupyter.

In [3]:

Magic %run

Magic %run executes workflows defined in the current cell. SoS starts an external sos process, executes the workflow, and displays the output in the notebook. For example, the hello workflow could be executed as follows:

In [4]:
This is our first hello world workflow

The workflow is executed independently and does not share any variables with the SoS kernel. For example, you cannot use the variable greeting in the workflow:

In [5]:
ERROR: [greet-wrong]: [0]: 
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
script_3050659269724098055 in <module>
----> print(f'This is our first greeting: {greeting}')
      

NameError: name 'greeting' is not defined
RuntimeError: Workflow exited with code 1
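
The same effect can be reproduced in plain Python. The sketch below is only an analogy and does not use SoS: it assumes that an external process behaves like a child Python interpreter, which starts with a fresh namespace and therefore cannot see variables defined in the parent.

```python
import subprocess
import sys

# Defined only in this (parent) interpreter, like a variable in the SoS kernel.
greeting = 'Hello world'

# The child process starts with an empty namespace, so `greeting` is unknown there.
child_code = "print(f'This is our first greeting: {greeting}')"
result = subprocess.run(
    [sys.executable, '-c', child_code],
    capture_output=True,
    text=True,
)

# A non-zero exit code and a NameError on stderr mirror the failure shown above.
print('exit code:', result.returncode)
print(result.stderr.strip().splitlines()[-1])
```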

To pass variables to these workflows, you have to define them as parameters and pass their values from the command line.

In [6]:
> %run --greeting 'Hello world' -v0
[#] 1 step processed (1 job completed)

Note that SoS expands { } in the magic line, so the magic actually executed was %run --greeting 'Hello world' -v0, as echoed in the first line of the output.
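
The idea of declaring a variable as a command-line parameter can be illustrated in plain Python with argparse. This is only an analogy under stated assumptions: the parser below is hypothetical and is not how SoS implements parameters.

```python
import argparse
import shlex

# Hypothetical stand-in for a workflow that declares `greeting` as a parameter.
parser = argparse.ArgumentParser(prog='greet')
parser.add_argument('--greeting', required=True, help='text to print')

# Split the option string the way a shell would, keeping the quoted value whole.
argv = shlex.split("--greeting 'Hello world'")
args = parser.parse_args(argv)

print(f'This is our first greeting: {args.greeting}')
```

The quotes matter: without them, shlex.split would produce Hello and world as two separate arguments instead of a single value.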

Magic %sosrun

A SoS notebook can have multiple workflow sections defined in multiple code cells. These sections constitute the content of the embedded SoS script of the notebook. For example, the following steps, defined in three separate cells, are all part of the embedded SoS script of this notebook.

In [7]:
In [8]:
In [9]:

The easiest way to view the embedded script of a SoS notebook is to use the %preview --workflow magic as follows (the option -n displays the script in the notebook instead of in the console panel). As you can see, the embedded script consists of steps from the entire notebook.

In [10]:
#!/usr/bin/env sos-runner
#fileformat=SOS1.0

[hello]
print('This is our first hello world workflow')

[hello-world]
print('This is our first hello world workflow')

[greet-wrong]
print(f'This is our first greeting: {greeting}')

[greet]
print(f'This is our first greeting: {greeting}')

[global]
excel_file = 'data/DEG.xlsx'
csv_file = 'DEG.csv'
figure_file = 'output.pdf'

[plot_1]
sh: expand=True
    xlsx2csv {excel_file} > {csv_file}

[plot_2]
R: expand=True
    data <- read.csv('{csv_file}')
    pdf('{figure_file}')
    plot(data$log2FoldChange, data$stat)
    dev.off()

The %sosrun magic can be used to execute any of the workflows defined in the notebook. For example, the following magic executes the workflow plot defined in the script above. Because multiple workflows are defined in this notebook (such as hello-world and plot), a workflow name is required for this magic.

In [11]:
null device 
          1 

Magic %runfile

The third magic for executing SoS workflows in SoS Notebook is %runfile, which executes workflows from a specified external file. For example, instead of using magic %sosrun, you can execute the current notebook with the following magic:

In [12]:
null device 
          1 

Command sos

The %sosrun magic calls the external command sos to execute workflows defined in the notebook. Although we use magic %run to execute workflows throughout this documentation for the sake of convenience, remember that you can also execute the notebook using the command sos from the command line.

[Figure: running a notebook from the command line]

Alternatively, you can write the workflow in a text file (usually with the extension .sos) and execute it with the command sos run:

[Figure: running a script from the command line]

Running workflows in background

SoS Notebook usually starts a workflow and waits until the workflow is completed. If the workflow takes a long time to execute, you can send it to a queue, in which workflows are executed one by one while you continue to work on the notebook. A status table is displayed for each queued workflow, and log messages and results continue to be sent back to SoS Notebook.

In [13]:
0
1
2
3
4
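
The queueing behavior described in this section can be pictured with a single-worker executor in plain Python. This is a rough sketch of the concept only, not the mechanism SoS Notebook uses; run_workflow is a hypothetical stand-in for a long-running workflow.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_workflow(name, steps=5):
    """Hypothetical stand-in for a long-running workflow."""
    for _ in range(steps):
        time.sleep(0.01)  # pretend each step takes a while
    return f'{name}: {steps} steps completed'


# One worker means queued workflows run one at a time, in submission order.
executor = ThreadPoolExecutor(max_workers=1)
futures = [executor.submit(run_workflow, name) for name in ('first', 'second')]

# submit() returns immediately, so the caller can keep working here ...
print('workflows queued')

# ... and collect the results later, once they are needed.
results = [future.result() for future in futures]
executor.shutdown()
print(results)
```

Limiting the pool to one worker is what makes the jobs run sequentially; a larger pool would run them concurrently instead.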