 

BioHPC Cloud: User Guide

 

 


BioHPC Cloud Software

There are 1095 software titles installed in BioHPC Cloud. The software is available on all machines unless stated otherwise in the notes. The complete list of programs is below; click on a title to see details and instructions. A tabular list of software is also available.

Please read the details and instructions before running any program; they may contain important information on how to use the software properly in BioHPC Cloud.

3D Slicer, 3d-dna, 454 gsAssembler or gsMapper, a5, ABRicate, ABruijn, ABySS, AdapterRemoval, adephylo, Admixtools, Admixture, AF_unmasked, AFProfile, AGAT, agrep, albacore, Alder, AliTV-Perl interface, AlleleSeq, ALLMAPS, ALLPATHS-LG, Alphafold, AMOS, AMPHORA, amplicon.py, AMRFinder, analysis, ANGSD, AnnotaPipeline, Annovar, ant, antiSMASH, anvio, apollo, arcs, ARGweaver, aria2, ariba, Arlequin, ART, ASEQ, aspera, assembly-stats, ASTRAL, atac-seq-pipeline, ataqv, athena_meta, ATLAS, Atlas-Link, ATLAS_GapFill, atom, ATSAS, Augustus, AWS command line interface, AWS v2 Command Line Interface, axe, axel, BA3, BactSNP, bakta, bamsnap, bamsurgeon, bamtools, bamUtil, barcode_splitter, BarNone, Basset, BayeScan, Bayescenv, bayesR, baypass, bazel, BBMap/BBTools, BCFtools, bcl2fastq, BCP, Beagle, Beast2, bedops, BEDtools, bfc, bgc, bgen, bicycle, BiG-SCAPE, bigQF, bigWig, bioawk, biobakery, biobambam, Bioconductor, biom-format, BioPerl, BioPython, Birdsuite, Bismark, Blackbird, blasr, BLAST, BLAST_to_BED, blast2go, BLAT, BlobToolKit, BLUPF90, BMGE, bmtagger, bonito, Boost, Bowtie, Bowtie2, BPGA, Bracken, BRAKER, BRAT-NextGen, BRBseqTools, BreedingSchemeLanguage, breseq, brocc, bsmap, BSseeker2, BUSCO, BUSCO Phylogenomics, BWA, bwa-mem2, bwa-meth, bwtool, cactus, CAFE, caffe, cagee, canu, Canvas, CAP3, caper, CarveMe, catch, cBar, CBSU RNAseq, CCMetagen, CCTpack, cd-hit, cdbfasta, cdo, CEGMA, CellRanger, cellranger-arc, cellranger-atac, cellranger-dna, centrifuge, centroFlye, CFM-ID, CFSAN SNP pipeline, CheckM, CheckM2, chimera, chimerax, chip-seq-pipeline, chromosomer, Circlator, Circos, Circuitscape, CITE-seq-Count, ClermonTyping, clues, CLUMPP, clust, Clustal Omega, CLUSTALW, Cluster, cmake, CMSeq, CNVnator, coinfinder, colabfold, CombFold, compat, CONCOCT, Conda, Cooler, copyNumberDiff, cortex_var, CoverM, crabs, CRISPRCasFinder, CRISPResso, Cromwell, CrossMap, CRT, cuda, Cufflinks, curatedMetagenomicDataTerminal, cutadapt, cuteSV, dadi, dadi-1.6.3_modif, dadi-cli, 
danpos, DAS_Tool, DBSCAN-SWA, dDocent, DeconSeq, Deepbinner, deeplasmid, DeepTE, deepTools, Deepvariant, defusion, delly, DESMAN, destruct, DETONATE, diamond, dipcall, diploSHIC, discoal, Discovar, Discovar de novo, distruct, DiTASiC, DIYABC, dnmtools, Docker, dorado, DRAM, dREG, dREG.HD, drep, Drop-seq, dropEst, dropSeqPipe, dsk, dssat, Dsuite, dTOX, duphold, DWGSIM, dynare, ea-utils, ecopcr, ecoPrimers, ectyper, EDGE, edirect, EDTA, eems, EgaCryptor, EGAD, EIGENSOFT, elai, ElMaven, EMBLmyGFF3, EMBOSS, EMIRGE, Empress, enfuse, EnTAP, entropy, epa-ng, ephem, epic2, ermineJ, ete3, EukDetect, EukRep, EVM, exabayes, exonerate, ExpansionHunterDenovo-v0.8.0, eXpress, FALCON, FALCON_unzip, Fast-GBS, fasta, FastANI, fastcluster, FastME, FastML, fastp, FastQ Screen, fastq-multx-1.4.3, fastq_demux, fastq_pair, fastq_species_detector, FastQC, fastqsplitter, fastsimcoal2, fastspar, fastStructure, FastTree, FASTX, fcs, feems, feh, FFmpeg, fgbio, figaro, Filtlong, fineRADstructure, fineSTRUCTURE, FIt-SNE, flash, flash2, flexbar, Flexible Adapter Remover, Flye, FMAP, FragGeneScan, FragGeneScan, FRANz, freebayes, FSA, funannotate, FunGene Pipeline, FunOMIC, G-PhoCS, GADMA, GAEMR, Galaxy, Galaxy in Docker, GATK, gatk4, gatk4amplicon.py, gblastn, Gblocks, GBRS, gcc, GCTA, GDAL, gdc-client, GEM library, GEMMA, GeMoMa, GENECONV, geneid, GeneMark, Genespace, genomad, Genome STRiP, Genome Workbench, GenomeMapper, GenomeThreader, genometools, GenomicConsensus, genozip, gensim, GEOS, germline, gerp++, GET_PHYLOMARKERS, gfaviz, GffCompare, gffread, giggle, git, glactools, GlimmerHMM, GLIMPSE, GLnexus, Globus connect personal, GMAP/GSNAP, GNU Compilers, GNU parallel, go-perl, GO2MSIG, GONE, GoShifter, gradle, graftM, grammy, GraPhlAn, graphtyper, graphviz, greenhill, GRiD, gridss, Grinder, grocsvs, GROMACS, GroopM, GSEA, gsort, GTDB-Tk, GTFtools, Gubbins, GUPPY, hail, hal, HapCompass, HAPCUT, HAPCUT2, hapflk, HaploMerger, Haplomerger2, haplostrips, HaploSync, HapSeq2, HarvestTools, haslr, 
hdf5, hget, hh-suite, HiC-Pro, hic_qc, HiCExplorer, HiFiAdapterFilt, hifiasm, hificnv, HISAT2, HMMER, Homer, HOTSPOT, HTSeq, htslib, https://github.com/CVUA-RRW/RRW-PrimerBLAST, hugin, humann, HUMAnN2, hybpiper, hyperopt, HyPhy, hyphy-analyses, iAssembler, IBDLD, idba, IDBA-UD, IDP-denovo, idr, idseq, IgBLAST, IGoR, IGV, IMa2, IMa2p, IMAGE, ImageJ, ImageMagick, Immcantation, impute2, impute5, IMSA-A, INDELseek, infernal, Infomap, inStrain, inStrain_lite, InStruct, Intel MKL, InteMAP, InterProScan, ipyrad, IQ-TREE, iRep, JaBbA, jags, Jane, java, jbrowse, JCVI, jellyfish, juicer, julia, jupyter, jupyterlab, kaiju, kallisto, Kent Utilities, keras, khmer, kinfin, king, kma, KmerFinder, KmerGenie, kneaddata, kraken, KrakenTools, KronaTools, kSNP, kWIP, LACHESIS, lammps, LAPACK, LAST, lastz, lcMLkin, LDAK, LDhat, LeafCutter, leeHom, lep-anchor, Lep-MAP3, LEVIATHAN, lftp, Liftoff, Lighter, LinkedSV, LINKS, localcolabfold, LocARNA, LocusZoom, lofreq, longranger, Loupe, LS-GKM, LTR_retriever, LUCY, LUCY2, LUMPY, lyve-SET, m6anet, MACE, MACS, MaCS simulator, MACS2, macs3, maffilter, MAFFT, mafTools, MAGeCK, MAGeCK-VISPR, Magic-BLAST, magick, MAGScoT, MAKER, manta, mapDamage, mapquik, MAQ, MARS, MASH, mashtree, Mashtree, MaSuRCA, MATLAB, Matlab_runtime, Mauve, MaxBin, MaxQuant, McClintock, mccortex, mcl, MCscan, MCScanX, medaka, medusa, megahit, MeGAMerge, MEGAN, MELT, MEME Suite, MERLIN, merqury, MetaBAT, MetaBinner, MetaboAnalystR, MetaCache, MetaCRAST, metaCRISPR, metamaps, MetAMOS, MetaPathways, MetaPhlAn, metapop, metaron, MetaVelvet, MetaVelvet-SL, metaWRAP, methpipe, mfeprimer, MGmapper, MicrobeAnnotator, MiFish, Migrate-n, mikado, MinCED, minigraph, Minimac3, Minimac4, minimap2, mira, miRDeep2, mirge3, miRquant, MISO, MITObim, MitoFinder, mitohelper, MitoHiFi, mity, MiXCR, MixMapper, MKTest, mlift, mlst, MMAP, MMSEQ, MMseqs2, MMTK, MobileElementFinder, modeltest, MODIStsp-2.0.5, module, moments, MoMI-G, mongo, mono, monocle3, mosdepth, mothur, MrBayes, mrsFAST, msld, 
MSMC, msprime, MSR-CA Genome Assembler, msstats, MSTMap, mugsy, MultiQC, multiz-tba, MUMandCo, MUMmer, mummer2circos, muscle, MUSIC, Mutation-Simulator, muTect, MZmine, nag-compiler, nanocompore, nanofilt, NanoPlot, Nanopolish, nanovar, ncftp, ncl, NECAT, Nemo, Netbeans, NEURON, new_fugue, Nextflow, NextGenMap, NextPolish2, nf-core/rnaseq, ngmlr, NGS_data_processing, NGSadmix, ngsDist, ngsF, ngsLD, NGSNGS, NgsRelate, ngsTools, NGSUtils, NINJA, NLR-Annotator, NLR-Parser, Novoalign, NovoalignCS, nQuire, NRSA, NuDup, numactl, nvidia-docker, nvtop, Oases, OBITools, Octave, OMA, Oneflux, OpenBLAS, openmpi, openssl, orthodb-clades, OrthoFinder, orthologr, Orthomcl, pacbio, PacBioTestData, PAGIT, pal2nal, paleomix, PAML, panaroo, pandas, pandaseq, pandoc, PanPhlAn, Panseq, Parsnp, PASA, PASTEC, PAUP*, pauvre, pb-assembly, pbalign, pbbam, pbh5tools, PBJelly, pblat, pbmm2, PBSuite, pbsv, pbtk, PCAngsd, pcre, pcre2, PeakRanger, PeakSplitter, PEAR, PEER, PennCNV, peppro, PERL, PfamScan, pgap, PGDSpider, ph5tools, Phage_Finder, pharokka, phasedibd, PHAST, phenopath, Phobius, PHRAPL, PHYLIP, PhyloCSF, phyloFlash, phylophlan*, PhyloPhlAn2, phylophlan3, phyluce, PhyML, Picard, PICRUSt2, pigz, Pilon, Pindel, piPipes, PIQ, PlasFlow, platanus, Platypus, plink, plink2, Plotly, plotsr, Point Cloud Library, popbam, PopCOGenT, PopLDdecay, Porechop, poretools, portcullis, POUTINE, pplacer, PRANK, preseq, primalscheme, primer3, PrimerBLAST, PrimerPooler, prinseq, prodigal, progenomics, progressiveCactus, PROJ, prokka, Proseq2, ProtExcluder, protolite, PSASS, psmc, psutil, pullseq, purge_dups, pyani, PyCogent, pycoQC, pyfaidx, pyGenomeTracks, PyMC, pymol-open-source, pyopencl, pypy, pyRAD, Pyro4, pyseer, PySnpTools, python, PyTorch, PyVCF, qapa, qcat, QIIME, QIIME2, QTCAT, Quake, Qualimap, QuantiSNP2, QUAST, quickmerge, QUMA, R, RACA, racon, rad_haplotyper, RADIS, RadSex, RagTag, rapt, RAPTR-SV, RATT, raven, RAxML, raxml-ng, Ray, rck, rclone, Rcorrector, RDP Classifier, REAGO, REAPR, 
Rebaler, Red, ReferenceSeeker, regenie, regtools, Relate, RelocaTE2, Repbase, RepeatMasker, RepeatModeler, RERconverge, ReSeq, RevBayes, RFdiffusion, RFMix, RGAAT, rgdal, RGI, Rgtsvm, Ribotaper, ripgrep, rJava, rMATS, RNAMMER, rnaQUAST, Rnightlights, Roary, Rockhopper, rohan, RoseTTAFold2NA, rphast, Rqtl, Rqtl2, RSAT, RSEM, RSeQC, RStudio, rtfbs_db, ruby, run_dbcan, sabre, SaguaroGW, salmon, SALSA, Sambamba, samblaster, sample, SampleTracker, samplot, samtabix, Samtools, Satsuma, Satsuma2, SCALE, scanorama, scikit-learn, Scoary, scythe, seaborn, SEACR, SecretomeP, self-assembling-manifold, selscan, Sentieon, seqfu, seqkit, SeqPrep, seqtk, SequelTools, sequenceTubeMap, Seurat, sf, sgrep, sgrep sorted_grep, SHAPEIT, SHAPEIT4, SHAPEIT5, shasta, Shiny, shore, SHOREmap, shortBRED, SHRiMP, sickle, sift4g, SignalP, SimPhy, simuPOP, singularity, sinto, sirius, sistr_cmd, SKESA, skewer, SLiM, SLURM, smap, smcpp, smoove, SMRT Analysis, SMRT LINK, snakemake, snap, SnapATAC, SNAPP, SnapTools, snATAC, SNeP, Sniffles, snippy, snp-sites, SnpEff, SNPgenie, SNPhylo, SNPsplit, SNVPhyl, SOAP2, SOAPdenovo, SOAPdenovo-Trans, SOAPdenovo2, SomaticSniper, sorted_grep, spaceranger, SPAdes, SPALN, SparCC, sparsehash, SPARTA, split-fasta, sqlite, SqueezeMeta, SQuIRE, SRA Toolkit, srst2, stacks, Stacks 2, stairway-plot, stampy, STAR, Starcode, statmodels, STITCH, STPGA, StrainPhlAn, strawberry, Strelka, stringMLST, StringTie, STRUCTURE, Structure_threader, Struo2, stylegan2-ada-pytorch, subread, sumatra, supernova, suppa, SURPI, surpyvor, SURVIVOR, sutta, SV-plaudit, SVaBA, SVclone, SVDetect, svengine, SVseq2, svtools, svtyper, svviz2, SWAMP, sweed, SweepFinder, SweepFinder2, sweepsims, swiss2fasta.py, sword, syri, tabix, tagdust, Taiji, Tandem Repeats Finder (TRF), tardis, TargetP, TASSEL 3, TASSEL 4, TASSEL 5, tbl2asn, tcoffee, TensorFlow, TEToolkit, TEtranscripts, texlive, TFEA, tfTarget, thermonucleotideBLAST, ThermoRawFileParser, TMHMM, tmux, Tomahawk, TopHat, Torch, traitRate, 
Trans-Proteomic Pipeline (TPP), TransComb, TransDecoder, TRANSIT, transrate, TRAP, tree, treeCl, treemix, Trim Galore!, trimal, trimmomatic, Trinity, Trinotate, TrioCNV2, tRNAscan-SE, Trycycler, UCSC Kent utilities, ultraplex, UMAP, UMI-tools, UMIScripts, Unicycler, UniRep, unitig-caller, unrar, usearch, valor, vamb, Variant Effect Predictor, VarScan, VCF-kit, vcf2diploid, vcfCooker, vcflib, vcftools, vdjtools, Velvet, vep, VESPA, vg, Vicuna, ViennaRNA, VIP, viral-ngs, virmap, VirSorter, VirusDetect, VirusFinder 2, vispr, VizBin, vmatch, vsearch, vt, WASP, webin-cli, wget, wgs-assembler (Celera), WGSassign, What_the_Phage, windowmasker, wine, Winnowmap, Wise2 (Genewise), wombat, Xander_assembler, xpclr, yaha, yahs

Details for GATK (If the copy-pasted commands do not work, use this tool to remove unwanted characters)

Name:GATK
Version:3.8.1
OS:Linux
About:The Genome Analysis Toolkit or GATK is a software package developed at the Broad Institute to analyze high-throughput sequencing data.
Added:12/13/2011 2:43:50 PM
Updated:8/29/2018 11:00:05 PM
Link:http://www.broadinstitute.org/gsa/wiki/index.php/The_Genome_Analysis_Toolkit
Platform:Illumina
Notes:The latest version of the program can be launched using the following command:

java -jar /programs/bin/GATK/GenomeAnalysisTK.jar [options]

To use GATK4, please refer to this page: https://cbsu.tc.cornell.edu/lab/userguide.aspx?a=software&i=445#c

 

Older versions of GATK 3 (3.6, 3.7) are also available, but they are no longer compatible with the Linux version on BioHPC systems and have to be run in the docker environment. For the current version of GATK 3 (v3.8), certain functions might have problems under CentOS and need to be run under Ubuntu. We keep an Ubuntu docker image for v3.8 in the same directory.

Before running GATK v3.8 directly (i.e., not in docker), you will need to set up the Java 8 environment by running the following commands (or including them in a script before launching GATK):

export JAVA_HOME=/usr/local/jdk1.8.0_121
export PATH=$JAVA_HOME/bin:$PATH
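
A job script can confirm that Java 8 is the active runtime before GATK starts. The sketch below is one possible check, not part of the BioHPC setup: is_java8 is a hypothetical helper name, and it assumes the usual Java 8 banner of the form version "1.8.0_...".

```shell
#!/bin/bash
# Sketch only: is_java8 is a hypothetical helper, not a BioHPC command.
# It inspects the first line printed by `java -version` and succeeds
# only when that line reports a 1.8.x runtime (i.e., Java 8).
is_java8() {
    local banner="$1"
    case "$banner" in
        *'version "1.8.'*) return 0 ;;   # e.g. java version "1.8.0_121"
        *) return 1 ;;                   # any other banner is rejected
    esac
}

# Usage in a job script, after the two export lines from the guide
# (java -version writes its banner to stderr, hence 2>&1):
#   banner=$(java -version 2>&1 | head -n 1)
#   is_java8 "$banner" || { echo "Java 8 is not active: $banner" >&2; exit 1; }
```

Failing early this way avoids starting a long GATK run under the wrong Java version.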

 

To run HaplotypeCaller from version 3.6, follow these steps:

1. Create (as usual) a directory /workdir/myid (replace myid with your own user ID), and copy all input files there.

2. Import the GATK docker image:

 

docker1 import /programs/GenomeAnalysisTK-3.x_docker/ubuntu_oldgatk3.tar

 

3. Create a shell script containing the GATK command(s) you want to run (the script may be created anywhere, for example, in your HOME directory). For example, using a text editor, create a script called run.sh

 

#!/bin/bash

# Location of GATK 3.6 inside the docker image
GATKDIR=/usr/local/GenomeAnalysisTK-3.6

# $GATKDIR is expanded by this script before the command is passed to the container
docker1 run biohpc_myid/ubuntu_oldgatk3 /bin/bash -c "java -Djava.io.tmpdir=/workdir -jar $GATKDIR/GenomeAnalysisTK.jar \
     -T HaplotypeCaller \
     -R /workdir/reference.fa \
     -I /workdir/alignments.bam \
     --emitRefConfidence GVCF \
     --num_cpu_threads_per_data_thread 8 \
     -o /workdir/output_file.g.vcf"

 

 

Note that the example above assumes the input files reference.fa and alignments.bam have been placed in /workdir/myid before the script is started. Inside the docker container, /workdir/myid is seen as /workdir, hence the myid part of the path has been omitted from the GATK command fed to docker. The resulting output file output_file.g.vcf will be seen (outside of docker) in /workdir/myid. Your input and output files may be organized in a more involved directory structure, but the whole structure must be copied under /workdir/myid, otherwise the docker container will not see it. In the GATK command, the myid part must be omitted from all paths to input and output files.

 

4. Run the script: sh ./run.sh >& run.log &

5. During the run, you can monitor the log file run.log as well as any output files produced in /workdir/myid.

6. If you need to kill the run for any reason, you will need to kill the docker container.

 

docker1 ps -a      (In the list of containers, find the CONTAINER ID of your container, e.g., 3d04770dc8bd)

docker1 kill 3d04770dc8bd

 

7. After a finished (or killed) run, regain control of all files created within the docker container:

 

docker1 claim
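
Two parts of the procedure above lend themselves to small helpers: the path convention from step 3 (/workdir/myid on the host is /workdir in the container) and the container lookup from step 6. The sketch below shows one way to write them. Both function names are hypothetical, and the second function assumes that docker1 ps -a prints the same column layout as the standard docker ps (container ID in the first column, image name in the second), which this guide does not confirm.

```shell
#!/bin/bash
# Sketch only: both helpers are hypothetical, not BioHPC commands.

# Translate a host path under /workdir/<user_id> into the path seen
# inside the container, where /workdir/<user_id> appears as /workdir.
to_container_path() {
    local host_path="$1" user_id="$2"
    local prefix="/workdir/$user_id"
    case "$host_path" in
        "$prefix")   printf '%s\n' "/workdir" ;;
        "$prefix"/*) printf '%s\n' "/workdir${host_path#"$prefix"}" ;;
        *)           echo "not under $prefix: $host_path" >&2
                     return 1 ;;   # the container cannot see this file
    esac
}

# Read a container listing on stdin and print the IDs of the rows
# whose IMAGE column equals the given image name.
# ASSUMPTION: column 1 is the container ID, column 2 the image name,
# and the first line is a header, as in standard `docker ps` output.
container_ids_for_image() {
    local image="$1"
    awk -v img="$image" 'NR > 1 && $2 == img { print $1 }'
}

# Usage (replace myid with your own user ID):
#   to_container_path /workdir/myid/alignments.bam myid
#   docker1 ps -a | container_ids_for_image biohpc_myid/ubuntu_oldgatk3
```

Rejecting paths outside /workdir/myid catches the most common mistake early: files kept elsewhere are not visible to the container.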

 

 



 
