
Input options paired_with and group_with

  • Difficulty level: easy
  • Time needed to learn: 10 minutes or less
  • Key points:
    • SoS targets can have arbitrary attributes
    • Options paired_with and group_with make it easy to attach variables to groups of targets

Option paired_with

Option paired_with assigns attributes to each of the targets in step_input. For example,

In [1]:
mkdir: case: File exists
mkdir: ctrl: File exists
Sample case/A1.bam is of type case
Sample case/A2.bam is of type case
Sample ctrl/A1.bam is of type ctrl
Sample ctrl/A2.bam is of type ctrl

Here the dictionary syntax expands to

paired_with={'mutated': ['case', 'case', 'ctrl', 'ctrl']}

and assigns each value in the list to the attribute mutated of the corresponding target.
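The pairing rule can be sketched in plain Python. This is only a model of the behavior described above, not the SoS implementation: the class Target and the function pair_with are hypothetical names introduced for illustration.

```python
# Plain-Python model of paired_with; this is NOT the SoS implementation.
class Target:
    """Hypothetical stand-in for a SoS target: a path plus free-form attributes."""

    def __init__(self, path):
        self.path = path

    def __str__(self):
        return self.path


def pair_with(targets, **attrs):
    """Attach the i-th value of every attribute list to the i-th target."""
    for name, values in attrs.items():
        if len(values) != len(targets):
            raise ValueError(
                f"{name}: expected {len(targets)} values, got {len(values)}"
            )
        for target, value in zip(targets, values):
            setattr(target, name, value)
    return targets


paths = ["case/A1.bam", "case/A2.bam", "ctrl/A1.bam", "ctrl/A2.bam"]
step_input = pair_with(
    [Target(p) for p in paths],
    mutated=["case"] * 2 + ["ctrl"] * 2,
)

lines = [f"Sample {t} is of type {t.mutated}" for t in step_input]
print("\n".join(lines))
```

Because the values are stored on the targets themselves, they stay attached however the targets are later split or regrouped.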

Although this example is not particularly exciting, it becomes useful when the step_input is grouped,

In [2]:
Group 0
Sample case/A1.bam is of type case
Sample case/A2.bam is of type case

Group 1
Sample ctrl/A1.bam is of type ctrl
Sample ctrl/A2.bam is of type ctrl

The dictionary syntax can be a little long to type so SoS provides a shortcut

paired_with='name'

which is equivalent to

paired_with=dict(_name=name)

Note that SoS-created variables have a leading underscore to differentiate them from regular variables.
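The expansion of the shortcut can be modeled in plain Python as follows. The function expand_paired_with and the dictionary namespace are hypothetical and only illustrate the naming rule. Support for a list of names is an assumption, inferred from the output below, which reports both _mutated and _sample_name.

```python
# Plain-Python model of the string shortcut; NOT the SoS implementation.
def expand_paired_with(option, namespace):
    """Expand variable names into a {'_name': value} dictionary.

    `namespace` is a hypothetical stand-in for the variables defined
    in the workflow.
    """
    if isinstance(option, dict):
        return dict(option)  # dictionary syntax is used as given
    # A single name or (assumed) a list of names.
    names = [option] if isinstance(option, str) else list(option)
    return {"_" + name: namespace[name] for name in names}


namespace = {
    "mutated": ["case", "case", "ctrl", "ctrl"],
    "sample_name": ["A1", "A2", "A1", "A2"],
}

single = expand_paired_with("mutated", namespace)
several = expand_paired_with(["mutated", "sample_name"], namespace)
print(single)
print(several)
```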

In [3]:
0: _input=case/A1.bam _mutated=case, _sample_name=A1
1: _input=case/A2.bam _mutated=case, _sample_name=A2
2: _input=ctrl/A1.bam _mutated=ctrl, _sample_name=A1
3: _input=ctrl/A2.bam _mutated=ctrl, _sample_name=A2

Another convenience feature is that SoS creates a step-level variable from these attributes so that you can access all values at the same time. That is to say, _mutated is created as a shortcut for

[x._mutated for x in _input]

In [4]:
0: _input=case/A1.bam case/A2.bam _mutated=['case', 'case'], _sample_name=['A1', 'A2']
1: _input=ctrl/A1.bam ctrl/A2.bam _mutated=['ctrl', 'ctrl'], _sample_name=['A1', 'A2']
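How the shortcut variable follows the grouping can be sketched in plain Python. The helper group_by below is a hypothetical stand-in for grouping the input two targets at a time, and SimpleNamespace objects stand in for SoS targets. None of this is the SoS implementation.

```python
from types import SimpleNamespace

# Plain-Python model; NOT the SoS implementation.
paths = ["case/A1.bam", "case/A2.bam", "ctrl/A1.bam", "ctrl/A2.bam"]
labels = ["case", "case", "ctrl", "ctrl"]

# Each stand-in target carries its own _mutated attribute.
step_input = [SimpleNamespace(path=p, _mutated=m) for p, m in zip(paths, labels)]


def group_by(items, size):
    """Split items into consecutive groups of `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


summary = []
for index, _input in enumerate(group_by(step_input, 2)):
    # The per-group variable is rebuilt from the targets in this group.
    _mutated = [x._mutated for x in _input]
    summary.append((index, [x.path for x in _input], _mutated))
    print(index, [x.path for x in _input], _mutated)
```

Each group sees only the values that belong to its own targets, which is why the paired lists never need to be sliced by hand.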

Values passed to option paired_with are usually lists of the same length as step_input, but they can also be of other types such as paths and sos_targets. In this case the iterator variables (e.g. _mutated for mutated) will have the same type as the input variable.

Option group_with

Similar to option paired_with, which associates variables with input files, option group_with associates items of a sequence with each substep. This option is applied after group_by and before for_each, which means that the length of the sequence should equal the number of substeps, and that the variables will be the same for each for_each loop. Also similar to option paired_with, option group_with can take a string (name of a variable) or a dictionary.

Using the above example, you can assign a label to each group by passing the name of a sequence variable

In [5]:
0: _input=case/A1.bam case/A2.bam _mutated=case
1: _input=ctrl/A1.bam ctrl/A2.bam _mutated=ctrl

or a dictionary with variable name and values:

In [6]:
0: _input=case/A1.bam case/A2.bam mutated=case
1: _input=ctrl/A1.bam ctrl/A2.bam mutated=ctrl
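The per-group assignment can be modeled in plain Python as follows. The function group_with here is a hypothetical sketch of the rule described in this section (one value per group, with the sequence length equal to the number of groups). It is not the SoS implementation.

```python
# Plain-Python model of group_with; NOT the SoS implementation.
def group_with(groups, **variables):
    """Pair the i-th value of every sequence with the i-th group."""
    for name, values in variables.items():
        if len(values) != len(groups):
            raise ValueError(
                f"{name}: expected {len(groups)} values, got {len(values)}"
            )
    return [
        (group, {name: values[i] for name, values in variables.items()})
        for i, group in enumerate(groups)
    ]


groups = [
    ["case/A1.bam", "case/A2.bam"],
    ["ctrl/A1.bam", "ctrl/A2.bam"],
]

substeps = group_with(groups, mutated=["case", "ctrl"])
for index, (group, extra) in enumerate(substeps):
    print(f"{index}: _input={' '.join(group)} mutated={extra['mutated']}")
```

Unlike the paired values, which are stored on individual targets, each value here belongs to a whole group, so every substep receives a single value rather than a list.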