vtools association analysis can be quite time consuming, but the work is easily distributed across multiple nodes. Here we show an example of how to run a vtools association job on a cluster using PBS.
The template PBS script is vtools_association_cluster.pbs in the /src/variant_tools folder. Adjust the number of nodes and the number of cores per node according to your needs, and provide values for PROJECTFOLDER, COMMAND, and NUMBER_OF_PROCESSES_PER_NODE. The current setup runs the main program on one node and submits calculation tasks to the remaining nodes; adjust the bash script to obtain the node names if needed, as sketched below. Please also make sure the openmpi module is loaded so that the mpiexec command can be executed.
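One common way to obtain node names inside a PBS job is to read the node file that PBS exposes through the $PBS_NODEFILE environment variable. The snippet below is only a sketch of that approach, not the shipped template; the variable names MAIN_NODE and WORKER_NODES are illustrative.
# Collect the unique node names allocated to this job
NODES=$(sort -u "$PBS_NODEFILE")
# Use the first node for the main program and the rest as workers
MAIN_NODE=$(echo "$NODES" | head -n 1)
WORKER_NODES=$(echo "$NODES" | tail -n +2)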
Like any other PBS script, you first specify the number of nodes and cores assigned to the job. In this example, we use four nodes and eight cores from each node, and the job is submitted to a queue named short.
#!/bin/bash
#PBS -l nodes=4:lowmem:ppn=8,walltime=01:00:00
#PBS -V
#PBS -q short
You need to specify the path to an existing vtools project (the folder created by the vtools init command) by assigning it to PROJECTFOLDER. PROJECTFOLDER can either be set as an environment variable or given a value directly in the script.
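For example, setting it directly in the script might look like the following; the path shown is only a placeholder for your own project folder.
# Folder that contains the project created by `vtools init`
PROJECTFOLDER=/home/user/my_vtools_project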
At this step, you should already have imported the data and added the necessary annotations. Next, assign your preferred vtools associate command to the COMMAND variable. The command-line parameters are the same as when running vtools associate on a local desktop, with one additional flag, -mpi, to indicate that the job will be run on the cluster.
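A sketch of such an assignment is shown below; the variant table, phenotype name, association method, and output database name are illustrative and should be replaced with the ones from your own analysis.
# Same syntax as a local run, plus the -mpi flag for cluster execution
COMMAND="vtools associate variant BMI --method LinRegBurden -mpi --to_db assoc"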
If you request four nodes, one node will be used to run the main program and the other three will act as worker nodes. If you request eight cores per node, NUMBER_OF_PROCESSES_PER_NODE can be set to 8.
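In the script this is a simple assignment; the value should match the ppn setting in the #PBS -l resource line.
# Should match ppn=8 requested above
NUMBER_OF_PROCESSES_PER_NODE=8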
After the PBS script is submitted with qsub, the job is launched on the cluster with mpiexec, and communication between the main node and the worker nodes is handled through ZeroMQ. The results can be viewed in the output file, or saved into a database by setting the --to_db parameter of the associate command.
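For example, assuming the template script is in the current directory and PROJECTFOLDER is passed as an environment variable (qsub's -v option exports the listed variables to the job), submission might look like this; the project path is a placeholder.
# Submit the job, exporting PROJECTFOLDER to the job environment
qsub -v PROJECTFOLDER=/home/user/my_vtools_project vtools_association_cluster.pbs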