Using remote targets

Remote targets are targets that reside on remote hosts. They are handled only inside tasks because only tasks are executed remotely.

All remote targets

The most straightforward way to specify a remote target is to use the remote() function, which converts a local target to a remote target that is resolved only during task execution.

For example, the following step specifies both its input and its output to be on the remote host. The step is then executed on the remote host with no data synchronization.

In [2]:
1 task completed.

The dd and bnn converters might be confusing. Basically, d obtains the parent directory of the input, so dd obtains the grandparent directory (which is R376-8-P8 in the case of R376-8-P8/raw_data/....fastq.gz). The n converter removes the extension from the input filename, so we need to use nn to remove .gz and then .fastq from the filename.
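To make the effect of these converters concrete, here is a minimal sketch in plain Python that mimics them with os.path. The helper names d, b and n are hypothetical stand-ins and not part of SoS, and treating b as "take the base name" is an assumption, since only d and n are described above. The sample filename is the one used later in this section.

```python
import os.path

# Hypothetical helpers that mimic the converters; they are not SoS functions.
def d(path):
    # Parent directory of the path.
    return os.path.dirname(path)

def b(path):
    # Base name of the path (assumed meaning of the b converter).
    return os.path.basename(path)

def n(path):
    # Remove the last extension only.
    return os.path.splitext(path)[0]

path = "R376-8-P8/raw_data/R376-8-P8_S3_L001_R1_001.fastq.gz"

# dd: apply d twice to reach the grandparent directory.
grandparent = d(d(path))

# bnn: take the base name, then strip .gz and .fastq in turn.
stem = n(n(b(path)))

print(grandparent)  # R376-8-P8
print(stem)         # R376-8-P8_S3_L001_R1_001
```

A single n would leave R376-8-P8_S3_L001_R1_001.fastq, which is why the second n is needed.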

Note that the remote function accepts multiple arguments and lists of inputs, so if there are multiple input or output files you can apply it to all of them at once, as in remote('file1', 'file2').

Mixed remote and local targets

You can mix remote and local targets in a step. For example, you can pass a local resource file to the remote host and retrieve results from the remote host once the task is completed. It is, however, important to remember that local targets should be relative to the local filesystem and remote targets should be relative to the remote filesystem.

The following example specifies a local output file to request that the result be transferred back from the remote host once the task is completed. Specifically,

  1. The input of the step is remote so it is not handled locally.
  2. The input is resolved to ~/RNASeq/R376-8-P8/raw_data/R376-8-P8_S3_L001_R1_001.fastq.gz on the remote host.
  3. The output is a local file ~/RNASeq/R376-8-P8/QC/R376-8-P8_S3_L001_R1_001_fastqc.html (e.g. /home/user/RNASeq/...), which will be translated to a path on the remote host (e.g. /scratch/user/RNASeq/...). It is your responsibility to make sure that the task actually writes its output to the translated path.
  4. After the task is completed, the output file is transferred to the local host.
In [4]:
1 task completed.
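The path translation in step 3 can be pictured as a prefix substitution. The following is only a sketch of that idea, not how SoS implements it: translate_path is a hypothetical helper, and the /home/user and /scratch/user roots are the illustrative values from the list above.

```python
from pathlib import PurePosixPath

def translate_path(local_path, local_root, remote_root):
    """Hypothetical helper: swap the local root prefix for the remote one."""
    # relative_to raises ValueError if local_path is not under local_root.
    relative = PurePosixPath(local_path).relative_to(local_root)
    return str(PurePosixPath(remote_root) / relative)

local_output = "/home/user/RNASeq/R376-8-P8/QC/R376-8-P8_S3_L001_R1_001_fastqc.html"
remote_output = translate_path(local_output, "/home/user", "/scratch/user")
print(remote_output)
```

The task has to write its result to the translated location; otherwise there is nothing to transfer back to the local path.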

The above example uses

output: "~/RNASeq/R376-8-P8/QC/R376-8-P8_S3_L001_R1_001_fastqc.html"

to specify the output directly. It is possible to derive the output from the input as in the first two examples, but {_input} is now a remote target and cannot be used directly. The trick is to use an R (resolve) converter to obtain the string representation of {_input} before using it for string interpolation.

In [5]:
1 task completed.