Meet the New DNAnexus and Its Extensible Genomics Toolbox

Written by Brigitte Ganter | Feb 14, 2013 10:23:26 AM

This week we continue our look at unique facets of the new DNAnexus with a focus on the “Extensible Genomics Toolbox”: the platform’s ability to let bioinformaticians tailor their analyses through custom Apps and workflows. The Apps provided within the platform serve as a starting toolkit, but users can also build new Apps from scratch or modify and combine existing ones to create truly custom pipelines.

Customization is one of the most important components when it comes to data analysis. Depending on the question that has spurred their experiments, researchers have a broad range of data analysis needs. The tools, algorithms, or annotations they find relevant will vary greatly from one experiment to the next — no single solution works for all of them.

DNAnexus’ goal is to provide a turn-key platform with a comprehensive menu of built-in functionality while also providing the freedom to add new capabilities as you need them. You can use what is already available in the constantly growing Apps library, which includes a rich set of industry-recognized tools for data QC, DNA resequencing, and RNA-seq, such as FastQC, RSeQC, BWA, GATK, SAMtools, Picard, SomaticSniper, TopHat, Cufflinks, and many more. Alongside these tools, you can take advantage of an expanding set of integrated reference genomes and variant databases, including dbSNP and COSMIC. All the Apps provided are open source. We also provide a large set of useful example applets that developers can use as a starting point for building their own Apps.

For example, if you need your own reference genome and annotation database for annotating and interpreting your data, DNAnexus provides the functionality to let you upload and integrate your own proprietary reference genome into your custom workflow. Combine and configure multiple tools — whether they’re provided by DNAnexus or built in your own lab — into best-practice workflows that can be used in your lab or shared with collaborators within or across institutions.

The flexibility of the new DNAnexus platform allows you to run Linux programs written in any language. You can develop your own parallelized tools using the APIs and the SDK for Bash shell, Python, C++, and Java (with more coming soon). In addition, you can now automate your batch data analysis by scripting from the command line, or set it up in an easy-to-use, drag-and-drop web interface. The web interface dynamically validates App compatibility, letting you know whether you have selected the proper input file type for a specified App.
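To make the idea of input validation concrete, here is a minimal sketch of how a check for App compatibility could work in principle. It is not DNAnexus code: `AppSpec`, `check_input`, `incompatible_inputs`, and the `example_aligner` App are hypothetical names invented for illustration, and the platform's real validation may work quite differently.

```python
# Hypothetical sketch: none of these names come from the DNAnexus SDK.
from dataclasses import dataclass


@dataclass(frozen=True)
class AppSpec:
    """Illustrative description of an App and the input file types it accepts."""
    name: str
    accepted_types: frozenset


def check_input(app, file_type):
    """Return True if the given file type is an accepted input for the App."""
    return file_type.upper() in app.accepted_types


def incompatible_inputs(app, file_types):
    """List the file types the App would reject, preserving input order."""
    return [t for t in file_types if not check_input(app, t)]


# Assumed example: an aligner-like App that takes FASTQ reads and a FASTA reference.
aligner = AppSpec(name="example_aligner",
                  accepted_types=frozenset({"FASTQ", "FASTA"}))
```

Calling `incompatible_inputs(aligner, ["FASTQ", "VCF"])` would flag the VCF file before any job is launched, which is the kind of early feedback the paragraph above describes.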

The new DNAnexus platform allows upload and storage of any file type via the API, so users can store and retrieve any file from their account workspace in its original form. To take full advantage of the programmatic capabilities of the new platform, DNAnexus also automatically converts certain file types into objects optimized for fast programmatic access. These new objects are called Genomic Tables, or simply gtables, and they can be generated with file-conversion Apps provided by DNAnexus. File types supported for import and export are FASTA, FASTQ, SAM, BAM, VCF, BED, GFF, GTF, and WIG (with more coming soon).
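As a rough illustration of how a script might decide which uploaded files are candidates for conversion, the sketch below maps a file name to one of the formats listed above. The format list comes from this post; the extension aliases, the handling of compressed files, and the `detect_format` helper are assumptions made for the example, not part of the DNAnexus API.

```python
from pathlib import Path

# Formats this post lists as supported for import and export.
SUPPORTED_FORMATS = {"FASTA", "FASTQ", "SAM", "BAM", "VCF",
                     "BED", "GFF", "GTF", "WIG"}

# Assumed extension aliases; not taken from DNAnexus documentation.
EXTENSION_ALIASES = {"fa": "FASTA", "fasta": "FASTA",
                     "fq": "FASTQ", "fastq": "FASTQ"}

# Assumed compression suffixes to ignore when reading the extension.
COMPRESSION_SUFFIXES = {".gz", ".bz2"}


def detect_format(filename):
    """Return the supported format name for a file, or None if unrecognized."""
    suffixes = [s.lower() for s in Path(filename).suffixes]
    # Drop trailing compression suffixes such as ".gz".
    while suffixes and suffixes[-1] in COMPRESSION_SUFFIXES:
        suffixes.pop()
    if not suffixes:
        return None
    ext = suffixes[-1].lstrip(".")
    fmt = EXTENSION_ALIASES.get(ext, ext.upper())
    return fmt if fmt in SUPPORTED_FORMATS else None
```

Files that return `None` would simply be stored in their original form, while recognized formats could be handed to a conversion step.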

The rich genomics toolbox also contains an integrated, interactive, HTML5-based Genome Browser that lets you create custom tracks to view your data alongside reference data without any additional downloads or plugins. You can use the Genome Browser immediately and stream data as needed across the internet. Add tracks from our included reference data sets and variant databases, such as dbSNP and COSMIC, or from whatever data you choose to upload.

Taken together, this results in an environment in which you can design, script, and fully automate custom workflows: filtering data, querying massive data sets, and handling batch analysis with ease, thanks to the extensible genomics toolbox within the configurable cloud.

Creating Apps and workflows tailored to the needs of your project and lab is the foundation of the platform’s ability to facilitate “Instant Collaboration,” another core capability that we’ll discuss in next week’s close-up look at the new DNAnexus. Haven’t taken it for a test drive yet? Take advantage of our free beta trial period and sign up for an account here to explore it for yourself.