 

BioHPC Lab:
User Guide

 


BioHPC Lab Software

There are 421 software titles installed in BioHPC Lab. The software is available on all machines unless stated otherwise in the notes. The complete list of programs is below; click a title to see details and instructions. A tabular list of the software is also available.

Please read the details and instructions before running any program; they may contain important information on how to use the software properly in BioHPC Lab.

, 454 gsAssembler or gsMapper, a5, ABruijn, ABySS, AdapterRemoval, Admixtools, Admixture, albacore, Alder, AlleleSeq, ALLMAPS, ALLPATHS-LG, AMOS, AMPHORA, analysis, ANGSD, Annovar, apollo, Arlequin, Atlas-Link, ATLAS_GapFill, ATSAS, Augustus, bamtools, Basset, BayeScan, BBmap, BCFtools, bcl2fastq, BCP, Beagle, Beagle4, Beast2, bedops, BEDtools, bfc, bgc, biobambam, Bioconductor, BioPerl, BioPython, Birdsuite, Bismark, blasr, BLAST, blast2go, BLAT, bmtagger, Boost, Bowtie, Bowtie2, BPGA, breseq, BSseeker2, BUSCO, BWA, canu, CAP3, cBar, CBSU RNAseq, cd-hit, CEGMA, CellRanger, CheckM, Circos, Circuitscape, CLUMPP, Clustal Omega, CLUSTALW, Cluster, cmake, CNVnator, cortex_var, CrossMap, CRT, cuda, Cufflinks, cutadapt, dadi, dadi-1.6.3_modif, dDocent, DeconSeq, deepTools, delly, destruct, DETONATE, diamond, Discovar, Discovar de novo, distruct, Docker, dREG, Drop-seq, dropSeqPipe, dsk, ea-utils, ecopcr, EDGE, EIGENSOFT, EMBOSS, entropy, ermineJ, ete3, exabayes, exonerate, eXpress, FALCON, FALCON_unzip, Fast-GBS, fasta, fastcluster, FastML, fastq_species_detector, FastQC, fastStructure, FastTree, FASTX, fineSTRUCTURE, flash, Flexible Adapter Remover, FMAP, FragGeneScan, freebayes, FunGene Pipeline, GAEMR, GATK, GBRS, GCTA, GEM library, GEMMA, geneid, GeneMark, GeneMarker, Genome STRiP, GenomeMapper, GenomeStudio (Illumina), GenomicConsensus, gensim, germline, GMAP/GSNAP, GNU Compilers, GNU parallel, Grinder, GROMACS, Gubbins, HapCompass, HAPCUT, HAPCUT2, hapflk, HaploMerger, Haplomerger2, HapSeq2, HiC-Pro, HISAT2, HMMER, Homer, HOTSPOT, HTSeq, HUMAnN2, HyPhy, iAssembler, IBDLD, IDBA-UD, IgBLAST, IGV, IMa2, IMa2p, IMAGE, impute2, INDELseek, infernal, InStruct, InteMAP, InterProScan, iRep, java, jbrowse, jellyfish, JoinMap, julia, jupyter, kallisto, Kent Utilities, khmer, LACHESIS, lcMLkin, LDAK, leeHom, LINKS, LocusZoom, longranger, LUCY, LUCY2, LUMPY, MACS, MaCS simulator, MACS2, MAFFT, Magic-BLAST, MAKER, MAQ, MASH, MaSuRCA, Mauve, MaxBin, mccortex, megahit, MEGAN, MEME 
Suite, MERLIN, MetaBAT, metaCRISPR, MetAMOS, MetaPathways, MetaPhlAn, MetaVelvet, MetaVelvet-SL, Migrate-n, mira, miRDeep2, MISO (misopy), MixMapper, MKTest, MMSEQ, mothur, MrBayes, mrsFAST, msld, MSMC, msprime, MSR-CA Genome Assembler, msstats, MSTMap, mugsy, MultiQC, MUMmer, muscle, MUSIC, muTect, ncftp, Nemo, Netbeans, NEURON, new_fugue, NextGenMap, NGSadmix, ngsDist, ngsF, ngsTools, NGSUtils, Novoalign, NovoalignCS, Oases, OBITools, Orthomcl, PAGIT, PAML, pandas, pandaseq, Panseq, PASA, PASTEC, pbalign, pbh5tools, PBJelly, PBSuite, PeakRanger, PeakSplitter, PEAR, PennCNV, PGDSpider, ph5tools, Phage_Finder, PHAST, PHYLIP, PhyloCSF, phylophlan, PhyML, Picard, Pindel, piPipes, PIQ, Platypus, plink, Plotly, popbam, prinseq, prodigal, progressiveCactus, prokka, pyRAD, Pyro4, PySnpTools, PyTorch, PyVCF, QIIME, QIIME2 q2cli, QTCAT, Quake, QuantiSNP2, QUAST, QUMA, R, RACA, RADIS, RAPTR-SV, RAxML, Ray, Rcorrector, RDP Classifier, REAPR, RepeatMasker, RepeatModeler, RFMix, RNAMMER, rnaQUAST, Roary, Rqtl, Rqtl2, RSEM, RSeQC, RStudio, sabre, SaguaroGW, samblaster, Samtools, Satsuma, Satsuma2, scikit-learn, scythe, selscan, Sentieon, SeqPrep, sgrep, sgrep sorted_grep, SHAPEIT, shore, SHOREmap, shortBRED, SHRiMP, sickle, SignalP, simuPOP, skewer, SLiM, smcpp, SMRT Analysis, snakemake, snap, SNAPP, SNeP, SNPhylo, SOAP2, SOAPdenovo, SOAPdenovo-Trans, SOAPdenovo2, SomaticSniper, sorted_grep, SPAdes, SRA Toolkit, srst2, stacks, stampy, STAR, statmodels, STITCH, Strelka, StringTie, STRUCTURE, supernova, SURPI, sutta, SVDetect, svtools, SweepFinder, sweepsims, tabix, Tandem Repeats Finder (TRF), TASSEL 3, TASSEL 4, TASSEL 5, tcoffee, TensorFlow, TEToolkit, TMHMM, TopHat, traitRate, Trans-Proteomic Pipeline (TPP), TransComb, TransDecoder, transrate, TRAP, treeCl, treemix, trimmomatic, Trinity, Trinotate, tRNAscan-SE, UCSC Kent utilities, UMI-tools, usearch, Variant Effect Predictor, VarScan, vcf2diploid, vcfCooker, vcflib, vcftools, Velvet, VESPA, ViennaRNA, VIP, VirusDetect, 
VirusFinder 2, VizBin, vsearch, WASP, wgs-assembler (Celera), Wise2 (Genewise), Xander_assembler, yaha

Details for GATK

Name:GATK
Version:3.8
OS:Linux
About:The Genome Analysis Toolkit or GATK is a software package developed at the Broad Institute to analyze high-throughput sequencing data.
Added:12/13/2011 2:43:50 PM
Updated:11/1/2017 10:59:44 AM
Link:http://www.broadinstitute.org/gsa/wiki/index.php/The_Genome_Analysis_Toolkit
Platform:Illumina
Notes:The latest version of the program can be launched using the following command:

java -jar /programs/bin/GATK/GenomeAnalysisTK.jar [options]

Older versions of GATK (3.6 and 3.7) are also available, but they are no longer compatible with the Linux version on BioHPC systems and have to be run in the Docker environment. For example, to run HaplotypeCaller from version 3.6, follow these steps:

 

1. Create (as usual) a directory /workdir/myid (replace myid with your own user ID), and copy all input files there.

2. Import the GATK Docker image:

 

docker1 import /programs/GenomeAnalysisTK-3.x_docker/ubuntu_oldgatk3.tar

 

3. Create a shell script containing the GATK command(s) you want to run (the script may be created anywhere, for example, in your HOME directory). For example, using a text editor, create a script called run.sh

 

#!/bin/bash

 

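# Path to GATK 3.6 as used by the java command below, which runs inside the container
GATKDIR=/usr/local/GenomeAnalysisTK-3.6

# Run HaplotypeCaller in the container; all file paths are given as seen from inside it
docker1 run biohpc_myid/ubuntu_oldgatk3 /bin/bash -c "java -Djava.io.tmpdir=/workdir -jar $GATKDIR/GenomeAnalysisTK.jar \
     -T HaplotypeCaller \
     -R /workdir/reference.fa \
     -I /workdir/alignments.bam \
     --emitRefConfidence GVCF \
     --num_cpu_threads_per_data_thread 8 \
     -o /workdir/output_file.g.vcf"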
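Because the script assumes that its input files are already staged, a quick check before launching can save a failed run. The following is a minimal sketch only: check_inputs is a hypothetical helper, not a BioHPC or GATK command, and the file names passed to it are whatever your own GATK command expects.

```shell
#!/bin/bash
# Hypothetical helper (not provided by BioHPC): verify that each expected
# input file exists and is non-empty in the staging directory.
# Usage: check_inputs <staging_dir> <file> [<file> ...]
check_inputs() {
    local stage_dir="$1"
    shift
    local missing=0
    local name
    for name in "$@"; do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "${stage_dir}/${name}" ]; then
            echo "missing or empty: ${stage_dir}/${name}" >&2
            missing=1
        fi
    done
    # Returns 0 when every file was found, 1 otherwise
    return "$missing"
}
```

For the example above, one might call check_inputs /workdir/myid reference.fa alignments.bam and start run.sh only if it succeeds.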

 

 

Note that the example above assumes the input files reference.fa and alignments.bam were placed in /workdir/myid before the script was started. Inside the Docker container, /workdir/myid is seen as /workdir, hence the myid part of the path is omitted from the GATK command passed to docker1. The resulting output file output_file.g.vcf will appear (outside of Docker) in /workdir/myid. Your input and output files may be organized in a more involved directory structure, but the whole structure must be located under /workdir/myid, otherwise the Docker container will not see it. In the GATK command, all paths to input and output files must have the myid part omitted.
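The path mapping described above can be illustrated with a small sketch. The function to_container_path is a hypothetical helper written for this illustration (it is not part of docker1 or BioHPC); it only rewrites strings and does not touch any files.

```shell
#!/bin/bash
# Hypothetical helper: translate a host path under /workdir/<user ID> into
# the path seen inside the container, where /workdir/<user ID> is /workdir.
# Usage: to_container_path <host_path> [<user_id>]   (user_id defaults to $USER)
to_container_path() {
    local host_path="$1"
    local user_id="${2:-$USER}"
    local prefix="/workdir/${user_id}"
    case "$host_path" in
        "$prefix")
            printf '%s\n' "/workdir" ;;
        "$prefix"/*)
            # Strip the /workdir/<user ID> prefix and re-root at /workdir
            printf '%s\n' "/workdir${host_path#"$prefix"}" ;;
        *)
            # Anything outside /workdir/<user ID> is not visible in the container
            echo "not under ${prefix}: ${host_path}" >&2
            return 1 ;;
    esac
}
```

With myid as the user ID, /workdir/myid/reference.fa maps to /workdir/reference.fa, while a path in a home directory is rejected because the container cannot see it.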

 

4. Run the script: sh ./run.sh >& run.log &

5. During the run, you can monitor the log file run.log as well as any output files produced in /workdir/myid.

6. If you need to kill the run for any reason, you will need to kill the docker container.

 

docker1 ps -a      (In the list of containers, find the CONTAINER ID of your container, e.g., 3d04770dc8bd)

docker1 kill 3d04770dc8bd

 

7. After a finished (or killed) run, regain control of all files created within the Docker container:

 

docker1 claim
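For step 6, picking the CONTAINER ID out of the listing by eye can be replaced with a small filter. This is a sketch under assumptions: it assumes that docker1 ps -a prints the standard Docker table layout (a header line, then CONTAINER ID in the first column and IMAGE in the second), and container_ids_for_image is a hypothetical helper, not a docker1 subcommand.

```shell
#!/bin/bash
# Hypothetical helper: read a container listing on standard input and print
# the CONTAINER ID (first column) of every row whose IMAGE (second column)
# equals the given image name. The header row is skipped.
# Assumption: the listing uses the standard "docker ps" column layout.
container_ids_for_image() {
    awk -v image="$1" 'NR > 1 && $2 == image { print $1 }'
}
```

A possible use is docker1 ps -a | container_ids_for_image biohpc_myid/ubuntu_oldgatk3, after which each printed ID can be passed to docker1 kill as shown in step 6.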

 

 



 
