 

BioHPC Cloud: User Guide

 

 


BioHPC Cloud Software

There are 1114 software titles installed in BioHPC Cloud. The software is available on all machines (unless stated otherwise in the notes). The complete list of programs is below; click on a title to see details and instructions. A tabular list of software is available here.

Please read the details and instructions before running any program; they may contain important information on how to properly use the software in BioHPC Cloud.

3D Slicer, 3d-dna, 454 gsAssembler or gsMapper, a5, ABRicate, ABruijn, ABySS, AdapterRemoval, adephylo, Admixtools, Admixture, AF_unmasked, AFProfile, AGAT, agrep, albacore, Alder, AliTV-Perl interface, AlleleSeq, ALLMAPS, ALLPATHS-LG, Alphafold, alphapickle, Alphapulldown, AMOS, AMPHORA, amplicon.py, AMRFinder, analysis, ANGSD, AnnotaPipeline, Annovar, ant, antiSMASH, anvio, apollo, arcs, ARGweaver, aria2, ariba, Arlequin, ART, ASEQ, aspera, assembly-stats, ASTRAL, atac-seq-pipeline, ataqv, athena_meta, ATLAS, Atlas-Link, ATLAS_GapFill, atom, ATSAS, Augustus, AWS command line interface, AWS v2 Command Line Interface, axe, axel, BA3, BactSNP, bakta, bamsnap, bamsurgeon, bamtools, bamUtil, barcode_splitter, BarNone, Basset, BayeScan, Bayescenv, bayesR, baypass, bazel, BBMap/BBTools, BCFtools, BCL convert, bcl2fastq, BCP, Beagle, Beast2, bedops, BEDtools, bettercallsal, bfc, bgc, bgen, bicycle, BiG-SCAPE, bigQF, bigWig, bioawk, biobakery, biobambam, Bioconductor, biom-format, BioPerl, BioPython, Birdsuite, Bismark, Blackbird, blasr, BLAST, BLAST_to_BED, blast2go, BLAT, BlobToolKit, BLUPF90, BMGE, bmtagger, bonito, Boost, Bowtie, Bowtie2, BPGA, Bracken, BRAKER, BRAT-NextGen, BRBseqTools, BreedingSchemeLanguage, breseq, brocc, bsmap, BSseeker2, BUSCO, BUSCO Phylogenomics, BWA, bwa-mem2, bwa-meth, bwtool, cactus, CAFE, caffe, cagee, canu, Canvas, CAP3, caper, CarveMe, catch, cBar, CBSU RNAseq, CCMetagen, CCTpack, cd-hit, cdbfasta, cdo, CEGMA, CellRanger, cellranger-arc, cellranger-atac, cellranger-dna, centrifuge, centroFlye, CFM-ID, CFSAN SNP pipeline, CheckM, CheckM2, chimera, chimerax, chip-seq-pipeline, chromosomer, Circlator, Circos, Circuitscape, CITE-seq-Count, ClermonTyping, clues, CLUMPP, clust, Clustal Omega, CLUSTALW, Cluster, cmake, CMSeq, CNVnator, coinfinder, colabfold, CombFold, Comparative-Annotation-Toolkit, compat, CONCOCT, Conda, Cooler, copyNumberDiff, cortex_var, CoverM, crabs, CRISPRCasFinder, CRISPResso, Cromwell, CrossMap, CRT, cuda, Cufflinks, 
curatedMetagenomicDataTerminal, cutadapt, cuteSV, dadi, dadi-1.6.3_modif, dadi-cli, danpos, DAS_Tool, DBSCAN-SWA, dDocent, DeconSeq, Deepbinner, deeplasmid, DeepTE, deepTools, Deepvariant, defusion, delly, DESMAN, destruct, DETONATE, dfast, diamond, dipcall, diploSHIC, discoal, Discovar, Discovar de novo, distruct, DiTASiC, DIYABC, dnmtools, Docker, dorado, DRAM, dREG, dREG.HD, drep, Drop-seq, dropEst, dropSeqPipe, dsk, dssat, Dsuite, dTOX, duphold, DWGSIM, dynare, ea-utils, ecopcr, ecoPrimers, ectyper, EDGE, edirect, EDTA, eems, EgaCryptor, EGAD, eggnog-mapper, EIGENSOFT, elai, ElMaven, EMBLmyGFF3, EMBOSS, EMIRGE, Empress, enfuse, EnTAP, entropy, epa-ng, ephem, epic2, ermineJ, ete3, EukDetect, EukRep, EVM, exabayes, exonerate, ExpansionHunterDenovo-v0.8.0, eXpress, FALCON, FALCON_unzip, Fast-GBS, fasta, FastANI, fastcluster, FastME, FastML, fastp, FastQ Screen, fastq-multx-1.4.3, fastq_demux, fastq_pair, fastq_species_detector, FastQC, fastqsplitter, fastsimcoal2, fastspar, fastStructure, FastTree, FASTX, fcs, feems, feh, FFmpeg, fgbio, figaro, Filtlong, fineRADstructure, fineSTRUCTURE, FIt-SNE, flash, flash2, flexbar, Flexible Adapter Remover, Flye, FMAP, FragGeneScan, FragGeneScan, FRANz, freebayes, FSA, funannotate, FunGene Pipeline, FunOMIC, G-PhoCS, GADMA, GAEMR, Galaxy, Galaxy in Docker, GATK, gatk4, gatk4amplicon.py, gblastn, Gblocks, GBRS, gcc, GCTA, GDAL, gdc-client, GEM library, GEMMA, GeMoMa, GENECONV, geneid, GeneMark, Genespace, genomad, Genome STRiP, Genome Workbench, GenomeMapper, Genomescope, GenomeThreader, genometools, GenomicConsensus, genozip, gensim, GEOS, germline, gerp++, GET_PHYLOMARKERS, gfaviz, GffCompare, gffread, giggle, git, glactools, GlimmerHMM, GLIMPSE, GLnexus, Globus connect personal, GMAP/GSNAP, GNU Compilers, GNU parallel, go-perl, GO2MSIG, GONE, GoShifter, gradle, graftM, grammy, GraPhlAn, graphtyper, graphviz, greenhill, GRiD, gridss, Grinder, grocsvs, GROMACS, GroopM, GSEA, gsort, GTDB-Tk, GTFtools, Gubbins, gunc, GUPPY, 
hail, hal, HapCompass, HAPCUT, HAPCUT2, hapflk, HaploMerger, Haplomerger2, haplostrips, HaploSync, HapSeq2, HarvestTools, haslr, hdf5, hget, hh-suite, HiC-Pro, hic_qc, HiCExplorer, HiFiAdapterFilt, hifiasm, hificnv, HISAT2, HMMER, Homer, HOTSPOT, HTSeq, htslib, https://github.com/CVUA-RRW/RRW-PrimerBLAST, hugin, humann, HUMAnN2, hybpiper, hyperopt, HyPhy, hyphy-analyses, iAssembler, IBDLD, idba, IDBA-UD, IDP-denovo, idr, idseq, IgBLAST, IGoR, IGV, IMa2, IMa2p, IMAGE, ImageJ, ImageMagick, Immcantation, impute2, impute5, IMSA-A, INDELseek, infernal, Infomap, inStrain, inStrain_lite, InStruct, Intel MKL, InteMAP, InterProScan, ipyrad, IQ-TREE, iRep, JaBbA, jags, Jane, java, jbrowse, JCVI, jellyfish, juicer, julia, jupyter, jupyterlab, kaiju, kallisto, Kent Utilities, keras, khmer, kinfin, king, kma, KMC, KmerFinder, KmerGenie, kneaddata, kraken, KrakenTools, KronaTools, kSNP, kWIP, LACHESIS, lammps, LAPACK, LAST, lastz, lcMLkin, LDAK, LDhat, LeafCutter, leeHom, lep-anchor, Lep-MAP3, LEVIATHAN, lftp, Liftoff, Lighter, LinkedSV, LINKS, localcolabfold, LocARNA, LocusZoom, lofreq, longranger, Loupe, LS-GKM, LTR_retriever, LUCY, LUCY2, LUMPY, lyve-SET, m6anet, MACE, MACS, MaCS simulator, MACS2, macs3, maffilter, MAFFT, mafTools, MAGeCK, MAGeCK-VISPR, Magic-BLAST, magick, MAGScoT, MAKER, manta, mapDamage, mapquik, MAQ, MARS, MASH, mashtree, Mashtree, MaSuRCA, MATLAB, Matlab_runtime, Mauve, MaxBin, MaxQuant, McClintock, mccortex, mcl, MCscan, MCScanX, medaka, medusa, megahit, MeGAMerge, MEGAN, MELT, MEME Suite, MERLIN, merqury, MetaBAT, MetaBinner, MetaboAnalystR, MetaCache, MetaCRAST, metaCRISPR, metamaps, MetAMOS, MetaPathways, MetaPhlAn, metapop, metaron, MetaVelvet, MetaVelvet-SL, metaWRAP, methpipe, mfeprimer, MGmapper, MicrobeAnnotator, microtrait, MiFish, Migrate-n, mikado, MinCED, minigraph, Minimac3, Minimac4, minimap2, mira, miRDeep2, mirge3, miRquant, MISO, MITObim, MitoFinder, mitohelper, MitoHiFi, mity, MiXCR, MixMapper, MKTest, mlift, mlst, MMAP, MMSEQ, 
MMseqs2, MMTK, MobileElementFinder, modeltest, MODIStsp-2.0.5, module, moments, MoMI-G, mongo, mono, monocle3, mosdepth, mothur, MrBayes, mrsFAST, msdial, msld, MSMC, msprime, MSR-CA Genome Assembler, msstats, MSTMap, mugsy, MultiQC, multiz-tba, MUMandCo, MUMmer, mummer2circos, muscle, MUSIC, Mutation-Simulator, muTect, MZmine, nag-compiler, nanocompore, nanofilt, NanoPlot, Nanopolish, nanovar, ncbi_datasets, ncftp, ncl, NECAT, Nemo, Netbeans, NEURON, new_fugue, Nextflow, NextGenMap, NextPolish2, nf-core/rnaseq, ngmlr, NGS_data_processing, NGSadmix, ngsDist, ngsF, ngsLD, NGSNGS, NgsRelate, ngsTools, NGSUtils, NINJA, NLR-Annotator, NLR-Parser, Novoalign, NovoalignCS, nQuire, NRSA, NuDup, numactl, nvidia-docker, nvtop, Oases, OBITools, Octave, OMA, Oneflux, OpenBLAS, openmpi, openssl, orthodb-clades, OrthoFinder, orthologr, Orthomcl, pacbio, PacBioTestData, PAGIT, pal2nal, paleomix, PAML, panaroo, pandas, pandaseq, pandoc, PanPhlAn, Panseq, Parsnp, PASA, PASTEC, PAUP*, pauvre, pb-assembly, pbalign, pbbam, pbh5tools, PBJelly, pblat, pbmm2, PBSuite, pbsv, pbtk, PCAngsd, pcre, pcre2, PeakRanger, PeakSplitter, PEAR, PEER, PennCNV, peppro, PERL, PfamScan, pgap, PGDSpider, ph5tools, Phage_Finder, pharokka, phasedibd, PHAST, phenopath, Phobius, PHRAPL, PHYLIP, PhyloCSF, phyloFlash, phylophlan*, PhyloPhlAn2, phylophlan3, phyluce, PhyML, phyx, Picard, PICRUSt2, pigz, Pilon, Pindel, piPipes, PIQ, PlasFlow, platanus, Platypus, plink, plink2, Plotly, plotsr, Point Cloud Library, popbam, PopCOGenT, PopLDdecay, Porechop, poretools, portcullis, POUTINE, pplacer, PRANK, preseq, primalscheme, primer3, PrimerBLAST, PrimerPooler, prinseq, prodigal, progenomics, progressiveCactus, PROJ, prokka, Proseq2, ProtExcluder, protolite, PSASS, psmc, psutil, pullseq, purge_dups, pyani, PyCogent, pycoQC, pyfaidx, pyGenomeTracks, PyMC, pymol-open-source, pyopencl, pypy, pyRAD, Pyro4, pyseer, PySnpTools, python, PyTorch, PyVCF, qapa, qcat, QIIME, QIIME2, QTCAT, Quake, Qualimap, QuantiSNP2, QUAST, 
quickmerge, QUMA, R, RACA, racon, rad_haplotyper, RADIS, RadSex, RagTag, rapt, RAPTR-SV, RATT, raven, RAxML, raxml-ng, Ray, rck, rclone, Rcorrector, RDP Classifier, REAGO, REAPR, Rebaler, Red, ReferenceSeeker, regenie, regtools, Relate, RelocaTE2, Repbase, RepeatMasker, RepeatModeler, RERconverge, ReSeq, RevBayes, RFdiffusion, RFMix, RGAAT, rgdal, RGI, Rgtsvm, Ribotaper, ripgrep, rJava, rMATS, RNAMMER, rnaQUAST, Rnightlights, Roary, Rockhopper, rohan, RoseTTAFold2NA, rphast, Rqtl, Rqtl2, RSAT, RSEM, RSeQC, RStudio, rtfbs_db, ruby, run_dbcan, sabre, SaguaroGW, salmon, SALSA, Sambamba, samblaster, sample, SampleTracker, samplot, samtabix, Samtools, Satsuma, Satsuma2, SCALE, scanorama, scikit-learn, Scoary, scythe, seaborn, SEACR, SecretomeP, self-assembling-manifold, selscan, Sentieon, seqfu, seqkit, SeqPrep, seqtk, SequelTools, sequenceTubeMap, Seurat, sf, sgrep, sgrep sorted_grep, SHAPEIT, SHAPEIT4, SHAPEIT5, shasta, Shiny, shore, SHOREmap, shortBRED, SHRiMP, sickle, sift4g, SignalP, SimPhy, simuPOP, sina, singularity, sinto, sirius, sistr_cmd, SKESA, skewer, SLiM, SLURM, smap, smcpp, smoove, SMRT Analysis, SMRT LINK, snakemake, snap, SnapATAC, SNAPP, SnapTools, snATAC, SNeP, Sniffles, snippy, snp-sites, SnpEff, SNPgenie, SNPhylo, SNPsplit, SNVPhyl, SOAP2, SOAPdenovo, SOAPdenovo-Trans, SOAPdenovo2, SomaticSniper, sorted_grep, spaceranger, SPAdes, SPALN, SparCC, sparsehash, SPARTA, split-fasta, sqlite, SqueezeMeta, SQuIRE, SRA Toolkit, srst2, stacks, Stacks 2, stairway-plot, stampy, STAR, Starcode, statmodels, stellarscope, STITCH, STPGA, StrainPhlAn, strawberry, Strelka, stringMLST, StringTie, STRUCTURE, Structure_threader, Struo2, stylegan2-ada-pytorch, subread, sumatra, supernova, suppa, SURPI, surpyvor, SURVIVOR, sutta, SV-plaudit, SVaBA, SVclone, SVDetect, svengine, SVseq2, svtools, svtyper, svviz2, SWAMP, sweed, SweepFinder, SweepFinder2, sweepsims, swiss2fasta.py, sword, syri, tabix, tagdust, Taiji, Tandem Repeats Finder (TRF), tardis, TargetP, TASSEL 3, 
TASSEL 4, TASSEL 5, tax_myPHAGE, tbl2asn, tcoffee, telescope, TensorFlow, TEToolkit, TEtranscripts, texlive, TFEA, tfTarget, thermonucleotideBLAST, ThermoRawFileParser, TMHMM, tmux, Tomahawk, TopHat, Torch, traitRate, Trans-Proteomic Pipeline (TPP), TransComb, TransDecoder, TRANSIT, transrate, TRAP, tree, treeCl, treemix, Trim Galore!, trimal, trimmomatic, Trinity, Trinotate, TrioCNV2, tRNAscan-SE, Trycycler, UCSC Kent utilities, ultraplex, UMAP, UMI-tools, UMIScripts, Unicycler, UniRep, unitig-caller, unrar, usearch, valor, vamb, Variant Effect Predictor, VarScan, VCF-kit, vcf2diploid, vcfCooker, vcflib, vcftools, vdjtools, Velvet, vep, VESPA, vg, Vicuna, ViennaRNA, VIP, viral-ngs, virmap, VirSorter, VirusDetect, VirusFinder 2, vispr, VizBin, vmatch, vsearch, vt, WASP, webin-cli, wget, wgs-assembler (Celera), WGSassign, What_the_Phage, wiggletools, windowmasker, wine, Winnowmap, Wise2 (Genewise), wombat, Xander_assembler, xpclr, yaha, yahs

Details for BRAKER (If the copy-pasted commands do not work, use this tool to remove unwanted characters)

Name: BRAKER
Version: 3
OS: Linux
About: Uses genomic and RNA-Seq data to automatically generate full gene structure annotations in novel genomes
Added: 9/1/2019 8:49:03 PM
Updated: 10/17/2023 1:09:11 PM
Link: https://github.com/Gaius-Augustus/BRAKER
Notes:

Braker3

As braker3 is updated frequently, please build your own image. Details can be found on the braker3 web site: https://github.com/Gaius-Augustus/BRAKER#container

#build singularity image

mkdir -p /workdir/$USER
cd /workdir/$USER
singularity build braker3.sif docker://teambraker/braker3:latest
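
Because the "latest" tag changes over time, it can help to record the build date in the image file name so that older builds are not overwritten. A minimal sketch, assuming a hypothetical helper named dated_image_name (it is not part of BRAKER or Singularity); the build command is only printed, not executed:

```shell
# Hypothetical helper: build a dated file name such as braker3_20240131.sif
dated_image_name() {
    # $1 = base name, $2 = date stamp in YYYYMMDD form
    printf '%s_%s.sif\n' "$1" "$2"
}

image=$(dated_image_name braker3 "$(date +%Y%m%d)")

# Print the build command instead of running it, so it can be reviewed first
echo "singularity build $image docker://teambraker/braker3:latest"
```

If you build the image this way, use the dated file name wherever the commands on this page refer to braker3.sif.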

# make a copy of  the Augustus config outside the container

cd /workdir/$USER

singularity run --bind $PWD braker3.sif cp -r /opt/Augustus/config $PWD

#run annotation

#keep your data files under /workdir/$USER
cd /workdir/$USER

singularity exec -C --env AUGUSTUS_CONFIG_PATH=$PWD/config --bind $PWD --pwd $PWD braker3.sif braker.pl [other parameters] 

#see this page for details of [other parameters]. For example, if you have protein sequences, the other parameters are "--genome=genome.fa --prot_seq=proteins.fa". If you have STAR-aligned RNA-seq BAM files, they are "--species=yourSpecies --genome=genome.fasta --bam=file1.bam,file2.bam" (multiple BAM files are separated by commas).
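
The comma-separated --bam value described above can be assembled from a list of files. A minimal sketch, assuming a hypothetical helper named join_bams; the file names are the placeholders from the text, and the final command is only printed:

```shell
# Hypothetical helper: join file names with commas, as expected by --bam=
join_bams() (
    out=""
    for f in "$@"; do
        if [ -z "$out" ]; then
            out=$f
        else
            out="$out,$f"
        fi
    done
    printf '%s\n' "$out"
)

# Placeholder file names from the text above; replace them with your own
bam_list=$(join_bams file1.bam file2.bam)
other_params="--species=yourSpecies --genome=genome.fasta --bam=$bam_list"

# Print the braker.pl part of the command for review
echo "braker.pl $other_params"
```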

# you might want to save a copy of the config directory, which contains your new species model that can be used for future predictions.
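
One way to save the config directory is to archive it as a tarball. A minimal sketch, assuming a hypothetical helper named backup_config and an output file name chosen only for illustration:

```shell
# Hypothetical helper: archive a config directory into a gzip-compressed tarball
backup_config() (
    # $1 = directory that contains "config", $2 = absolute path of the output tarball
    cd "$1" && tar czf "$2" config
)

# Only run when a config directory is present in the current directory
if [ -d config ]; then
    backup_config "$PWD" "$PWD/augustus_config_backup.tar.gz"
fi
```

The tarball can then be copied from /workdir to your home or other permanent storage.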

#see this page to get orthodb fasta files.

To run v2.1.6

#Register and download GeneMark software and license key

  • Go to the GeneMark web site
  • Check "GeneMark-ES/ET/EP ver 4.71_lic" and "LINUX 64 kernel 3.10 - 5" next to it.
  • Fill out the form and click "I agree to the terms"
  • Download the file "gm_key_64.gz" and put it under /workdir/$USER/

#Prepare Braker2 software

cd /workdir/$USER/

cp -r /programs/braker2-2.1.6/* ./

tar xvfz gmes_linux_64.tgz

zcat gm_key_64.gz > gmes_linux_64/.gm_key
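
Before starting a long run, you may want to confirm that the GeneMark directory and license key are in place. A minimal sketch, assuming a hypothetical helper named check_genemark and that it is run from /workdir/$USER:

```shell
# Hypothetical check: the GeneMark directory exists and its key file is not empty
check_genemark() {
    # $1 = path to the gmes_linux_64 directory
    [ -d "$1" ] || { echo "missing directory: $1"; return 1; }
    [ -s "$1/.gm_key" ] || { echo "missing or empty key: $1/.gm_key"; return 1; }
    echo "GeneMark directory and key found in $1"
}

# Relative path: assumes the current directory is /workdir/$USER
check_genemark gmes_linux_64 || echo "fix the problem above before running braker2"
```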

#run command example

#run in a "screen" session
#the "braker2" command is an executable singularity image file which can be run like "braker.pl"
#make sure gmes_linux_64, config and the input files are present in the current directory where the command is executed.
#change "--cores NUMBER" in the command based on CPU core availability on your server
cd /workdir/$USER

singularity run -C --bind $PWD --pwd $PWD ./braker2 --genome=myGenomeAssembly.fa --bam=myRNAseqAlignment.bam --softmasking --workingdir=outPutDirectory --cores 8  &
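
The value for --cores can be capped at the number of cores the server reports. A minimal sketch, assuming a hypothetical helper named cap_cores; nproc is part of GNU coreutils, and the fallback of 1 is an assumption for systems without it:

```shell
# Hypothetical helper: use the requested core count unless the machine has fewer
cap_cores() {
    # $1 = requested cores, $2 = cores available
    if [ "$1" -gt "$2" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}

# nproc reports the number of available processing units; fall back to 1 if it is missing
available=$(nproc 2>/dev/null || echo 1)
cores=$(cap_cores 8 "$available")

# Print the option to paste into the braker2 command
echo "--cores $cores"
```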

 

************************************************* END OF INSTRUCTIONS**************************************************

If you want to install braker2 yourself, follow the instructions here:

  • The Docker image is provided by BioContainers;
  • Click here to get the latest version tag;

Prepare software

1. Download and build the Singularity image (replace "2.1.6--hdfd78af_5" with the latest version tag)

cd /workdir/$USER

singularity pull braker-2.1.6.sif docker://quay.io/biocontainers/braker2:2.1.6--hdfd78af_5
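
When you replace the version tag, the image file name can be derived from it so the two stay in sync. A minimal sketch using shell parameter expansion; the variable names are illustrative, and the pull command is only printed:

```shell
# Version tag as used in the text above; replace it with the latest tag
tag="2.1.6--hdfd78af_5"

# Strip the build suffix (everything from "--" onward) to get the plain version
version=${tag%%--*}
image="braker-$version.sif"

# Print the pull command for review
echo "singularity pull $image docker://quay.io/biocontainers/braker2:$tag"
```

With a different tag, use the resulting file name wherever later steps refer to braker-2.1.6.sif.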

2. Prepare Genemark software (Genemark is free for academic users, but it requires you to register.)

cd /workdir/$USER

# Copy the software file gmes_linux_64.tar.gz here;
# Copy the key file gm_key_64.gz here;

tar xvfz gmes_linux_64.tar.gz

zcat gm_key_64.gz > $HOME/.gm_key

# Fix the shebang line of PERL scripts
cd gmes_linux_64
./change_path_in_perl_scripts.pl "/usr/bin/env perl"

# (Optional) Verify that Genemark is properly installed
./check_install.bash

# ProtHint distributed with Genemark on 3/11/2021 does not work with the container.
# Replace the scripts with latest ProtHint from github  
cd /workdir/$USER
git clone https://github.com/gatech-genemark/ProtHint.git
cp ProtHint/bin/* gmes_linux_64/ProtHint/bin/

3. Copy the Augustus config directory from inside container to outside container, so that it is writable by you. 

./braker-2.1.6.sif cp -r /usr/local/config /workdir/$USER

4. (Optional) Testing your installation

  • 4.1 Download the braker2 testing data set and script
git clone https://github.com/Gaius-Augustus/BRAKER.git

cd BRAKER/example
wget http://topaz.gatech.edu/GeneMark/Braker/RNAseq.bam
  • 4.2 Run testing data
#singularity must be launched from the parent directory of the genemark and augustus config directories.
cd /workdir/$USER

#start singularity braker2 shell
./braker-2.1.6.sif

#set environment variables based on PATH on host machine
export PROTHINT_PATH=/workdir/$USER/gmes_linux_64/ProtHint/bin
export GENEMARK_PATH=/workdir/$USER/gmes_linux_64
export AUGUSTUS_CONFIG_PATH=/workdir/$USER/config

#these two environmental variables must be set as /usr/local/bin (path internal to container)
export AUGUSTUS_BIN_PATH=/usr/local/bin
export AUGUSTUS_SCRIPTS_PATH=/usr/local/bin

cd /workdir/$USER/BRAKER/example/tests

#testing RNA-seq input as training data
./test1.sh 

#testing protein sequence input as training data
./test2.sh 

#once done, exit singularity shell
exit

Your test run results are in example/tests/test1 and example/tests/test2. Compare them with the results provided by the developer (example/results).

5. Now you are ready to run Braker to annotate your genome. Run it in a persistent "screen" session.

cd /workdir/$USER
mkdir -p /workdir/$USER/myRun1

export PROTHINT_PATH=/workdir/$USER/gmes_linux_64/ProtHint/bin
export GENEMARK_PATH=/workdir/$USER/gmes_linux_64
export AUGUSTUS_CONFIG_PATH=/workdir/$USER/config
export AUGUSTUS_BIN_PATH=/usr/local/bin
export AUGUSTUS_SCRIPTS_PATH=/usr/local/bin

./braker-2.1.6.sif braker.pl --genome=myGenome.fa --bam=myRNAseq.bam --softmasking --workingdir=/workdir/$USER/myRun1 --cores 48

After the job is done, the result files are in /workdir/$USER/myRun1 .
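
The same five environment variables are set for both the test run and the real run, so it can be convenient to generate them from one place. A minimal sketch, assuming a hypothetical helper named braker_env and paths that contain no spaces:

```shell
# Hypothetical helper: print the export lines for a given base directory
braker_env() {
    # $1 = base directory holding gmes_linux_64 and config
    echo "export PROTHINT_PATH=$1/gmes_linux_64/ProtHint/bin"
    echo "export GENEMARK_PATH=$1/gmes_linux_64"
    echo "export AUGUSTUS_CONFIG_PATH=$1/config"
    # These two paths are internal to the container
    echo "export AUGUSTUS_BIN_PATH=/usr/local/bin"
    echo "export AUGUSTUS_SCRIPTS_PATH=/usr/local/bin"
}

# Apply the settings to the current shell
eval "$(braker_env "/workdir/$USER")"

# Show one of the values as a quick check
echo "$GENEMARK_PATH"
```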

If you plan to run this version of Braker again later, here is a list of the software files/directories under /workdir/$USER that you need to keep.

  • braker-2.1.6.sif
  • gmes_linux_64
  • config

########################End of instructions######################################################

To run the previous version v2.1.5 (using the Docker image built by the BioHPC team)

The braker software is implemented as a Docker image. One of its components, GeneMark ET/ES/EP, requires a license. You will need to register and download the software and license file.

Current version:

  • Ubuntu 18.04
  • Braker: 2.1.5
  • Augustus: between 3.3.3 and 3.3.4 (github master branch on 10/16/2020)

#get docker image

docker1 pull biohpc/braker2

#prepare input

Create a data directory under /workdir/$USER/ (the command below assumes /workdir/$USER/mydata). Put the following items in the directory:

  • Genome assembly in fasta;
  • RNA-seq bam;
  • genemark software directory: gmes_linux_64
  • genemark license: .gm_key  (file name starts with ".")

# run command

docker1 run --rm -v /workdir/$USER/mydata:/data  biohpc/braker2 sh -c ". /root/source.sh; braker.pl --species=mysp --genome=mygenome.fa.masked --bam=RNA.sorted.bam --softmasking --cores=24"
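
Before launching the container, you may want to confirm that the data directory holds all four items listed above. A minimal sketch, assuming a hypothetical helper named check_data_dir and the file names used in the command above:

```shell
# Hypothetical check of the data directory that is mounted as /data
check_data_dir() {
    # $1 = data directory, $2 = genome file name, $3 = BAM file name
    status=0
    for item in "$2" "$3" gmes_linux_64 .gm_key; do
        if [ ! -e "$1/$item" ]; then
            echo "missing: $1/$item"
            status=1
        fi
    done
    return $status
}

# File names match the docker1 command above; replace them with your own
check_data_dir "/workdir/$USER/mydata" mygenome.fa.masked RNA.sorted.bam || echo "data directory is incomplete"
```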

 


 

 

