
Named input

  • Difficulty level: easy
  • Time needed to learn: 10 minutes or less
  • Key points:
    • Use a dictionary or keyword arguments to specify labels for inputs
    • _input[name] returns the subset of _input with label name

Named inputs

Let us first create a few temporary files to use as inputs in the examples.

In [1]:

In SoS, we usually specify one or more files as the input of a step and refer to them through the variable _input:

In [2]:
a.txt b.txt

Using keyword parameters, you can assign labels to these files and access them separately:

In [3]:
input of the substep is a.txt b.txt
input of the substep with label A is a.txt
input of the substep with label B is b.txt

Note that although _input['A'] and _input['B'] are used to refer to subsets of _input, the variable _input can still be used and refers to all input files.
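The relationship between the full _input and its labeled subsets can be pictured with a small plain-Python model. This is only an illustrative sketch, not SoS syntax and not how SoS is implemented: the class LabeledFiles and its attributes are hypothetical names invented for this example.

```python
class LabeledFiles:
    """Toy model of a list of files in which every file carries a label."""

    def __init__(self, **named_files):
        # Each keyword is a label; its value is one file name or a list of them.
        self.files = []
        self.labels = []
        for label, value in named_files.items():
            names = [value] if isinstance(value, str) else list(value)
            self.files.extend(names)
            self.labels.extend([label] * len(names))

    def __getitem__(self, label):
        # Indexing by a label returns only the files that carry that label.
        return [f for f, l in zip(self.files, self.labels) if l == label]

    def __str__(self):
        # The whole collection still lists every file, regardless of label.
        return " ".join(self.files)


inputs = LabeledFiles(A="a.txt", B="b.txt")
print(f"all files: {inputs}")
print(f"label A: {' '.join(inputs['A'])}")
print(f"label B: {' '.join(inputs['B'])}")
```

In this model, asking for a label never removes anything from the collection, which mirrors the point made above: the labeled subsets and the complete input remain available side by side.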

Named inputs can be used to pick a subset of the input when specifying the step output. For example, in the following print statements, _input["data"], _input["reference"], etc. are used to obtain subsets of _input. These subsets of inputs are called named inputs. Here we use group_by='pairlabel' to group step_input["data"]. Please refer to option group_by for details.

In [4]:
Input of step is a.txt ref.txt with labels ['data', 'data', 'reference']

Input data is a.txt
Reference is ref.txt

Output is a.res

Input of step is b.txt ref.txt with labels ['data', 'data', 'reference']

Input data is b.txt
Reference is ref.txt

Output is b.res
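The printed output shows each result named after the data file of its group, with the reference file playing no part in the name. A plain-Python sketch of that naming rule follows. The two groups are copied from the output above; the dictionary layout and the function name result_name are assumptions made for illustration, and the snippet is not SoS syntax.

```python
from pathlib import Path

# The two groups as they appear in the printed output above.
groups = [
    {"data": ["a.txt"], "reference": ["ref.txt"]},
    {"data": ["b.txt"], "reference": ["ref.txt"]},
]


def result_name(group):
    # Only the subset labeled "data" determines the output name;
    # the "reference" subset is deliberately left out.
    (data_file,) = group["data"]
    return str(Path(data_file).with_suffix(".res"))


outputs = [result_name(g) for g in groups]
print(outputs)
```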

In addition to the use of keyword arguments, you can use a dictionary directly to specify inputs with names:

In [5]:

Named input inherited from named output

Inputs created from named outputs inherit their labels, unless the labels are overridden by keyword arguments in the input statement.

For example, in the following workflow, step default gets the outputs of steps A and B using the function output_from(['A', 'B']). Because the default labels for the outputs of steps A and B are A and B respectively, you can differentiate the inputs using _input['A'] and _input['B'].

In [6]:
[###] 3 steps processed (3 jobs completed)

However, if you use keyword arguments in the input statement, the default or inherited labels will be overridden:

In [7]:
[###] 3 steps processed (1 job completed, 2 jobs ignored)
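The inheritance and override rules described in this section can be summarized with a plain-Python model. It is a sketch under stated assumptions rather than SoS code: the function collect, the label name combined, and the file names a.out and b.out are all hypothetical, since the text does not say what steps A and B produce.

```python
def collect(step_outputs, **relabel):
    """Return (file, label) pairs for the outputs of upstream steps.

    step_outputs maps a step name to the files it produced. Without
    keyword arguments every file is labeled with the name of the step
    that produced it. Each keyword argument names a new label and lists
    the steps whose files should carry that label instead.
    """
    override = {step: label for label, steps in relabel.items() for step in steps}
    return [
        (f, override.get(step, step))
        for step, files in step_outputs.items()
        for f in files
    ]


# Hypothetical outputs of steps A and B.
upstream = {"A": ["a.out"], "B": ["b.out"]}

inherited = collect(upstream)                        # labels come from step names
overridden = collect(upstream, combined=["A", "B"])  # keyword label replaces them

print(inherited)
print(overridden)
```

Once a keyword label is applied in this model, the step names no longer appear as labels, so a lookup by A or B would find nothing and the files would have to be retrieved through the new label.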