HpcGridRunner-Sapelo2
Category
Tools
Program On
Sapelo2
Version
1.0.2
Author / Distributor
Please see https://github.com/HpcGridRunner/HpcGridRunner
Description
HpcGridRunner executes a file of commands in parallel on a compute cluster.
Running Program
Also refer to Running Jobs on Sapelo2.
- Version 1.0.2, installed in /apps/eb/HpcGridRunner/1.0.2
To use this version of HpcGridRunner, please first load the module with
module load HpcGridRunner/1.0.2
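As a minimal sketch of the general use case, the snippet below builds a commands file with one independent command per line and passes it to hpc_cmds_GridRunner.pl. The sample names and the gzip command are hypothetical placeholders. The -c option for the commands file follows the upstream HpcGridRunner documentation and should be checked against the script's usage message on Sapelo2.

```shell
# Write one independent shell command per line into a commands file.
# Arguments: output file, then any number of (hypothetical) sample names.
make_cmds_file() {
    out="$1"
    shift
    : > "$out"                                  # start with an empty file
    for sample in "$@"; do
        echo "gzip -k ${sample}.fastq" >> "$out"   # placeholder command
    done
}

make_cmds_file commands.txt sampleA sampleB sampleC

# Submit only when the module is loaded and the script is on PATH.
# Assumption: -c names the commands file, as in the upstream documentation.
if command -v hpc_cmds_GridRunner.pl >/dev/null 2>&1; then
    hpc_cmds_GridRunner.pl --grid_conf "$HPCGR_CONF_DIR/sapelo_1c_3G.conf" -c commands.txt
else
    echo "hpc_cmds_GridRunner.pl not found; load the HpcGridRunner module first" >&2
fi
```

Each line of commands.txt must be runnable on its own, because the commands are distributed across separate cluster jobs.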
Running a BLAST job in parallel by splitting the FASTA file: Change YOUR_QUERY_FASTA.fasta to the name of your FASTA file. The cmd_template can be changed to reflect your BLAST command. Leave the -query parameter set to __QUERY_FILE__; hpc_FASTA_GridRunner.pl fills it in dynamically. You can also change the values for outfmt, evalue, max_target_seqs and db. Do not change the num_threads value unless you use a conf file that allocates more than one CPU. The grid_conf file specified below requests one CPU core and 3 GB of RAM.
#!/bin/bash
#SBATCH --partition batch
#SBATCH --ntasks=1
#SBATCH --time=48:00:00
#SBATCH --mem=3gb

ml HpcGridRunner/1.0.2
ml BLAST+/2.10.1-gompi-2019b

hpc_FASTA_GridRunner.pl --grid_conf=$HPCGR_CONF_DIR/sapelo_1c_3G.conf \
    --cmd_template "blastp -query __QUERY_FILE__ -db /db/uniprot/latest/uniprot_sprot -max_target_seqs 1 -outfmt 6 -evalue 1e-5 -num_threads 1" \
    --seqs_per_bin 250 \
    --query_fasta YOUR_QUERY_FASTA.fasta \
    --out_dir blast_result
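To judge how many grid commands a given --seqs_per_bin value will produce, a rough estimate can be computed before submitting. The sketch below assumes each bin holds up to seqs_per_bin sequences and that every sequence header starts with ">". The function name estimate_bins is hypothetical and not part of HpcGridRunner.

```shell
# Estimate the number of bins (and therefore BLAST commands) for a FASTA file.
# Arguments: FASTA file, sequences per bin (defaults to 250 as in the script above).
estimate_bins() {
    fasta="$1"
    seqs_per_bin="${2:-250}"
    # Count header lines; grep -c exits non-zero when the count is 0, hence "|| true".
    n_seqs=$(grep -c '^>' "$fasta" || true)
    # Ceiling division: a partially filled last bin still counts as one bin.
    echo $(( (n_seqs + seqs_per_bin - 1) / seqs_per_bin ))
}

# Usage: estimate_bins YOUR_QUERY_FASTA.fasta 250
```

A smaller seqs_per_bin gives more, shorter jobs. A larger value gives fewer, longer ones.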
The BLAST results for each split group are written to a separate file under the specified output directory. Each file name ends in fa.OUT. To gather the BLAST output, you can generally concatenate the individual output files. The following command finds and concatenates all of them; replace OUTPUT_DIR with the directory given to --out_dir.
find OUTPUT_DIR -name '*.fa.OUT' | xargs cat > combined_blast_results.out
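A slightly more defensive variant of the same idea is sketched below. It stops if no output files are found, handles paths that contain spaces, and concatenates in sorted path order so that repeated runs give the same file. The function name combine_blast_outputs is hypothetical, and the null-delimited options (-print0, sort -z, xargs -0) assume the GNU tools found on Linux systems.

```shell
# Concatenate all *.fa.OUT files under a directory into one combined file.
# Arguments: output directory used with --out_dir, path of the combined file.
combine_blast_outputs() {
    out_dir="$1"
    combined="$2"
    n_files=$(find "$out_dir" -type f -name '*.fa.OUT' 2>/dev/null | wc -l | tr -d ' ')
    if [ "$n_files" -eq 0 ]; then
        echo "no *.fa.OUT files found under $out_dir" >&2
        return 1
    fi
    # Null-delimited names survive spaces; sort -z fixes the concatenation order.
    find "$out_dir" -type f -name '*.fa.OUT' -print0 | sort -z | xargs -0 cat > "$combined"
    echo "combined $n_files files into $combined" >&2
}

# Usage: combine_blast_outputs blast_result combined_blast_results.out
```

Comparing the reported file count with the expected number of bins is a quick way to spot groups that did not finish.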
Documentation
Please see http://hpcgridrunner.github.io/
Installation
Installed in /apps/eb/HpcGridRunner/1.0.2
System
64-bit Linux