Meet the new DNAnexus and its Instant Collaboration Environment

The life sciences field has a long and respected tradition of collaboration among researchers. Genomics as we know it today kicked off with one of the biggest biology collaborations of all time, the Human Genome Project.

This has led to a highly cooperative mindset among many participants in the community, and it’s a mentality that we at DNAnexus share and embrace. A successful collaboration is more than just a mindset, though. It requires infrastructure of all sorts: today’s partnerships benefit from technology advances such as Skype, instant messaging, FaceTime, Google Docs, and more.

These collaborations often involve many participants with a range of backgrounds and expertise, such as bioinformatics, medicine, microbiology, and molecular biology. The new DNAnexus includes a set of features to facilitate research projects for teams within and across organizations. As a bioinformatician, for example, you can upload your data, build Apps, create custom workflows, and then share all of it with your research partners, whose access and permissions you control. Because you have the ability to define the project and design the workflow yourself, your collaborators (who might be clinicians or biologists with little to no expertise in bioinformatics) will have easy access to the entire project via an intuitive web interface to analyze and visualize their data.

With the new DNAnexus, you not only enable non-bioinformaticians to run your custom analysis tools and best-practice workflows, but also eliminate the data transfers, format conversions, and other incompatibilities that currently slow down even the most efficient collaborative efforts. The platform offers a secure and reliable environment in which you can instantly collaborate with team members without the hassle of synchronizing data or shipping hard drives.

Here are a few key features of the new platform that are especially useful for collaborations:

Instant access from anywhere: When you’re part of a team that could be spread out across an organization, country, or even the world, you need to have easy access to your data from anywhere, at any time. The beauty of using a cloud-based platform is that it offers just that: peace of mind that your data will be ready and waiting for you whenever and wherever you need it. One of the main values of the new DNAnexus is collaboration support without the need to transfer files between collaborating sites. All the data is in one location and can be accessed by any permitted person from anywhere.

Intuitive permission definitions for users: As the project leader, you’ll be able to use the new DNAnexus to set access permissions for the other users of your project. Some people may be able to just view or download the data, while others can be made contributors, allowing them to manage and run analyses on the data — it’s all up to you, the administrator.
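
To make the idea of tiered access concrete, here is a minimal Python sketch of how such permissions could be modeled. It is an illustration only, not the DNAnexus API: the names Access, Project, set_access, and can are hypothetical, and the three levels are simply the roles described above (viewing or downloading, contributing, and administering).

```python
from enum import IntEnum


class Access(IntEnum):
    """Hypothetical access levels, ordered from least to most privileged."""
    VIEW = 1        # may view or download project data
    CONTRIBUTE = 2  # may also manage data and run analyses
    ADMINISTER = 3  # may also change other members' access


class Project:
    """Toy project that tracks one access level per member."""

    def __init__(self, owner):
        # The creator of the project starts out as its administrator.
        self.members = {owner: Access.ADMINISTER}

    def set_access(self, actor, user, level):
        # Only an administrator may grant or change permissions.
        if self.members.get(actor) != Access.ADMINISTER:
            raise PermissionError(f"{actor} cannot change permissions")
        self.members[user] = level

    def can(self, user, required):
        # Non-members have no access at all.
        level = self.members.get(user)
        return level is not None and level >= required


project = Project(owner="alice")
project.set_access("alice", "bob", Access.VIEW)
project.set_access("alice", "carol", Access.CONTRIBUTE)
```

Because the levels are ordered, a single comparison answers "may this user do that?": in this sketch, carol can do everything bob can, while anyone who is not a member is refused.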

Enterprise-grade data security: Just as you want the right people to have access to your project data, you want to keep everyone else out. Drawing on its extensive background in data and cloud security, the DNAnexus team has built this platform with enterprise-grade, user-controlled permissions for sharing data, analysis tools, and workflows. Your data is not only stored in high-end physical data centers, but also fully encrypted at rest and in transit.

If you haven’t yet tried the new DNAnexus for yourself, what are you waiting for? Sign up here for your free beta testing account. And check back on this blog for our next in-depth look, when we will discuss security and compliance.

Meet the new DNAnexus and its Extensible Genomics Toolbox

This week we continue our look at unique facets of the new DNAnexus with a focus on the “Extensible Genomics Toolbox”, that is, the platform’s ability to let bioinformaticians tailor their analyses through custom Apps and workflows. The Apps provided within the platform serve as a starting toolkit, but users can also build new Apps from scratch or modify and combine existing ones to create truly custom pipelines.

Customization is one of the most important components when it comes to data analysis. Depending on the question that has spurred their experiments, researchers have a broad range of data analysis needs. The tools, algorithms, or annotations they find relevant will vary greatly from one experiment to the next — no single solution works for all of them.

DNAnexus’ goal is to provide a turn-key platform with a comprehensive menu of built-in functionality while also providing the freedom to add new capabilities as you need them. You can use what is already available in the constantly growing Apps library, which includes a rich set of industry-recognized tools for data QC, DNA resequencing, and RNA-seq, such as FastQC, RSeQC, BWA, GATK, SAMtools, Picard, SomaticSniper, Tophat, Cufflinks, and many more. In addition to these tools, you can take advantage of an expanding set of integrated reference genomes and variant databases, including dbSNP and COSMIC. All the Apps provided are open source. In addition, we provide a large set of useful example applets that can be used as a starting point for developers to build their own Apps.

For example, if you need your own reference genome and annotation database for annotating and interpreting your data, DNAnexus provides the functionality to let you upload and integrate your own proprietary reference genome into your custom workflow. Combine and configure multiple tools — whether they’re provided by DNAnexus or built in your own lab — into best-practice workflows that can be used in your lab or shared with collaborators within or across institutions.

The flexibility of the new DNAnexus platform allows you to run Linux programs written in any language. You can develop your own parallelized tools using APIs and the SDK for Bash shell, Python, C++, and Java (with more coming soon). In addition, you can now automate your batch data analysis via scripting using the command line, or with an easy-to-use, drag-and-drop web interface with dynamic validation of App compatibility to let you know whether you have selected the proper input file type for a specified App.
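
The compatibility check mentioned above can be pictured with a short Python sketch. It is written under stated assumptions and does not reproduce the platform’s actual validation logic: AppSpec, validate_chain, and the two example specs are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppSpec:
    """Hypothetical description of what an App consumes and produces."""
    name: str
    accepts: frozenset  # input file types the App can read
    produces: str       # file type the App writes


def validate_chain(input_type, apps):
    """Return the problems found when running apps in order on input_type."""
    problems = []
    current = input_type
    for app in apps:
        if current not in app.accepts:
            problems.append(
                f"{app.name} cannot read {current}; expected one of "
                f"{sorted(app.accepts)}"
            )
        # Continue with the declared output so later stages are checked too.
        current = app.produces
    return problems


# Illustrative specs only; real Apps declare their own inputs and outputs.
aligner = AppSpec("aligner", frozenset({"FASTQ"}), "BAM")
caller = AppSpec("variant_caller", frozenset({"BAM", "SAM"}), "VCF")
```

Calling validate_chain("FASTQ", [aligner, caller]) returns an empty list, while putting the two stages in the wrong order reports a mismatch at each step. Checking declared types before anything runs is what lets an interface flag a wrong input file as soon as it is selected.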

The new DNAnexus platform will allow upload and storage of any file type via the API, so users can store any file in their account workspace and retrieve it in its original form. To take full advantage of the programmatic capabilities of the new platform, DNAnexus will also automatically convert certain file types into objects optimized for fast programmatic access. These new objects are called Genomic Tables, or simply gtables, and they are generated with file-conversion Apps provided by DNAnexus. File types supported for import and export are FASTA, FASTQ, SAM, BAM, VCF, BED, GFF, GTF, and WIG (more soon).
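
As a rough illustration, a script preparing an upload might first check whether a file is one of the formats listed above. The Python sketch below is a guess at how such a check could look, not platform code: the extension mapping, the handling of compressed files, and the name detect_format are all assumptions.

```python
from pathlib import Path

# Formats named in the post, keyed by an assumed conventional extension.
SUPPORTED = {
    ".fasta": "FASTA", ".fa": "FASTA",
    ".fastq": "FASTQ", ".fq": "FASTQ",
    ".sam": "SAM", ".bam": "BAM", ".vcf": "VCF",
    ".bed": "BED", ".gff": "GFF", ".gtf": "GTF", ".wig": "WIG",
}
COMPRESSION = {".gz", ".bz2"}


def detect_format(filename):
    """Return the format name for filename, or None if it is not recognized."""
    suffixes = [s.lower() for s in Path(filename).suffixes]
    # Ignore a trailing compression suffix such as ".gz".
    if suffixes and suffixes[-1] in COMPRESSION:
        suffixes = suffixes[:-1]
    return SUPPORTED.get(suffixes[-1]) if suffixes else None
```

A file that returns None here would simply be stored as-is, while a recognized format could also be handed to a conversion step.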

The rich genomics toolbox also contains an HTML5-based, integrated, and interactive Genome Browser that lets you create custom tracks to view your data alongside reference data without any additional downloads or plugins. You can immediately leverage the Genome Browser and stream data as needed across the internet. Add tracks from our included reference data sets and variant databases, such as dbSNP and COSMIC, or from whatever data you choose to upload.

Taken together, this results in an environment that makes it possible for you to design, script, and fully automate custom workflows (filtering data, querying massive data sets, and handling batch analysis with ease) thanks to the extensible genomics toolbox within the configurable cloud.

This ability to create Apps and workflows tailored to the needs of your project and lab is the foundation of the platform’s ability to facilitate “Instant Collaboration”, another core capability that we’ll discuss in next week’s close-up look at the new DNAnexus. Haven’t taken it for a test drive yet? Take advantage of our free beta trial period and sign up for an account here to explore it for yourself.


Meet the new DNAnexus and its Configurable Cloud Infrastructure

It’s been a busy first week since we launched the beta of the new DNAnexus, our cloud-based DNA analysis platform designed for bioinformaticians. We’ve been blown away by the number of people who have signed up for the program and provided a lot of very positive and constructive feedback. We encourage all of our beta users to continue to comment on their experience. Request access today and see for yourself what it is all about.


This week we’d like to highlight one of the core capabilities of the new DNAnexus platform, the configurable cloud infrastructure, which lets you take full advantage of Amazon’s scalable and cost-effective Web Services. It not only allows you to scale your computational and data storage needs to any level, but is also fully scriptable, so you can create an analysis solution that fits your specific needs. This eliminates capacity planning: you can store and process any data on demand and pay only for what you use.


At DNAnexus we have always used the pay-as-you-go model for computational and storage services; this will continue with the new DNAnexus. The benefit of a pay-as-you-go approach is that you can cost-effectively address your needs today and scale up or down as those needs change. Whether you are familiar with or new to sequence data analysis, you can immediately get started with your data analysis projects without any setup costs or capacity planning risks — regardless of how many samples you might have. This is because the new platform, with its configurable infrastructure, processes samples in parallel, resolving resource contention issues among different teams.
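
The pattern of handling samples independently and side by side can be sketched locally in a few lines of Python. This is only an analogy for the fan-out idea, not a description of how the platform schedules work in the cloud: analyze is a placeholder that counts bases, and the sample data is made up.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze(sample):
    """Placeholder analysis: count the bases in a sample's reads."""
    name, reads = sample
    return name, sum(len(read) for read in reads)


def analyze_all(samples, max_workers=4):
    """Run analyze on every sample concurrently and collect results by name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map preserves input order; each sample is handled independently.
        return dict(pool.map(analyze, samples))


# Made-up example data: (sample name, list of reads).
samples = [("s1", ["ACGT", "GG"]), ("s2", ["TTTAA"]), ("s3", [])]
results = analyze_all(samples)
```

Because no sample depends on another, adding samples only means adding workers, which is the property that makes on-demand scaling attractive.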


When we set out to build the new platform, one of the most common requests we heard was for a fully configurable solution that gives bioinformaticians and computational analysts the ability to run custom programs, tune compute performance through parallelization, and more. All of this is now possible through the platform’s well-documented APIs and SDK, which allow rich scripting for any data management, analysis, visualization, or reporting need.


Another advantage of this new infrastructure is that you can now manage and manipulate your data not only via the web interface, but also through the command line, which is compatible with Linux and Mac OS X. The open and flexible new DNAnexus platform, with its SDK language support, allows you to run any tool in any language and perform platform operations through API bindings in Python, C++, Java, and the Bash shell. This allows you to fully automate entire workflows, from sequencing data upload to analysis and report generation. You may also create best-practice workflows that can be easily shared with non-bioinformaticians within or across institutions.


In the weeks to come, we’ll explore the many additional capabilities of the new DNAnexus (e.g., the “Extensible Genomics Toolbox”, “Instant Collaboration”, and “Security and Compliance”). In the meantime, please take advantage of our beta program and sign up for your own account and explore firsthand what the new DNAnexus has to offer.