`remote` targets are targets that reside on remote hosts. They are handled only inside a `task`, because only tasks are executed remotely.
The most straightforward way to specify a remote target is to use the `remote()` function, which converts a local target into a remote target that is resolved only during task execution.
For example, the following example specifies both the `input` and the `output` of the step to be on the remote host. The step is then executed on the remote host with no data synchronization.
The `dd` and `bnn` converters might be confusing. Basically, `d` obtains the parent directory of the input, so `dd` obtains the grandparent directory (which is `R376-8-P8` in the case of `R376-8-P8/raw_data/....fastq.gz`). The `n` converter removes the extension from the input filename, so `nn` is needed to remove `.gz` and then `.fastq` from the filename.
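To make the converter chain concrete, here is a minimal Python sketch that mimics what `dd` and `bnn` do, using only `os.path`. It is an illustration, not how SoS implements its converters: the helper names `d`, `n`, and `b` are hypothetical stand-ins, `b` is assumed to yield the base name of the path, and the full filename is borrowed from the remote input path used later on this page.

```python
import os.path


def d(path):
    """Stand-in for the d converter: parent directory of the path."""
    return os.path.dirname(path)


def n(path):
    """Stand-in for the n converter: strip the last extension only."""
    return os.path.splitext(path)[0]


def b(path):
    """Stand-in for the b converter (assumed): base name of the path."""
    return os.path.basename(path)


path = "R376-8-P8/raw_data/R376-8-P8_S3_L001_R1_001.fastq.gz"

# dd: apply d twice to climb from the file to its grandparent directory
grandparent = d(d(path))

# bnn: take the base name, then strip .gz and .fastq in turn
stem = n(n(b(path)))

print(grandparent)  # R376-8-P8
print(stem)         # R376-8-P8_S3_L001_R1_001
```

Applying `n` only once would leave `R376-8-P8_S3_L001_R1_001.fastq`, which is why the converter has to be repeated for double extensions such as `.fastq.gz`.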
Note that the `remote` function accepts multiple arguments and lists of inputs, so if there are multiple input or output files you can apply it to all of them in the form `remote('file1', 'file2')`.
You can mix remote and local targets in a step. For example, you can pass a local resource file to the remote host and retrieve results from the remote host once the task is completed. It is, however, important to remember that local targets should be relative to the local filesystem and remote targets should be relative to the remote filesystem.
The following example specifies a local `output` file to request that the result be transferred back from the remote host once the task is completed. Basically,
- The `input` of the step is `remote`, so it is not handled locally.
- The `input` is resolved to `~/RNASeq/R376-8-P8/raw_data/R376-8-P8_S3_L001_R1_001.fastq.gz` on the remote host.
- The `output` is a local file `~/RNASeq/R376-8-P8/QC/R376-8-P8_S3_L001_R1_001_fastqc.html` (e.g. `/home/user/RNASeq/...`), which will be translated to a path on the remote host (e.g. `/scratch/user/RNASeq/...`). It is your responsibility to make sure that the translated output is the output of the task.
- After the task is completed, the output file is transferred to the local host.
The above example uses `output: "~/RNASeq/R376-8-P8/QC/R376-8-P8_S3_L001_R1_001_fastqc.html"` to specify the output directly. It is possible to use `input` to specify `output` as in the first two examples, but `{_input}` is now a remote target and cannot be used directly. The trick is to use the `R` (resolve) converter to obtain the string representation of `{_input}` before using it for string interpolation.