Load Up on Caffeine … AGBT Is Almost Here

View from the Marco Island Marriott, the AGBT venue

We’re gearing up for the Super Bowl of the next-gen sequencing field – the Advances in Genome Biology and Technology (AGBT) meeting held annually in Marco Island, Fla. In a typical year, there would be major announcements from the established sequencing vendors at this event, but given that Life Technologies and Illumina already went public with their big news at JP Morgan, and the Roche bid for Illumina will likely still be playing out, the big stories from this year’s meeting will probably revolve around major research findings, technology applications, and what’s going on with the sequencing upstarts. (Oxford Nanopore, for example, will be announcing plans to commercialize its instrument later this year and providing attendees a sneak peek. GnuBio will also be presenting on its desktop sequencer, the iGnuIT 1000.)

As usual, this year’s agenda is chock full of thought-provoking presentations, including a talk by DNAnexus co-founder Arend Sidow, who will be presenting on the use of deep whole-genome sequencing to monitor breast cancer progression (Thursday, Feb. 16, at 4:35pm).

We’ll be there to meet with colleagues, customers, and potential collaborators. We’ll also be presenting two posters on current DNAnexus projects. If you’ll be there, we encourage you to stop by — find out more about us, get a demo, have some wine and cheese, you name it. Here’s a quick preview of what we’ll be showcasing:

  • Candidate Gene Variants in “Micronesian” Autosomal Recessive Aplastic Anemia – Brigitte Ganter, Majed Dasouki, S. Abhyankar, M. Furness, R. Calado
    This work was done with collaborators at the University of Kansas Medical Center and National Heart, Lung, and Blood Institute (NHLBI). In the project, researchers performed exome sequence and nucleotide-level variation analyses for two siblings with aplastic anemia, a condition where bone marrow does not produce sufficient new cells to replenish blood cells. The results led to the identification of 12 candidate homozygous variants in 9 different genes. In this poster, we’ll discuss how DNAnexus was used to identify these variants and characterize their potential role in aplastic anemia.
  • Expanding and Enhancing Access to the Sequence Read Archive (SRA) Through a Complementary New Web-Based Mirror – Brigitte Ganter, Evan Worley, Bing Xia, Andreas Sundquist
    As we announced last October, we teamed up with Google to develop a complementary hosted mirror of NCBI’s Sequence Read Archive (SRA). Through a typical user scenario, we will discuss the underlying data processing pipeline, key features of the new web-based interface, and how researchers can use it to quickly identify and browse datasets of interest, link out to PubMed references, and integrate data into follow-on analysis workflows.

Preserving and Enhancing an Important Community Resource

Today, DNAnexus is pleased to announce the launch of our hosted SRA site!

The DNAnexus SRA site is a hosted version of NCBI’s Sequence Read Archive (SRA). As the most comprehensive archive of publicly available next-generation sequencing data, the SRA is an important resource to researchers around the world. The SRA remains the single best resource of useful sequence data from research initiatives such as the 1000 Genomes Project and institutions like the Broad Institute, Washington University, and the Wellcome Trust Sanger Institute.

DNAnexus has teamed up with Google to create a mirrored site of this resource, providing access to all publicly accessible datasets for specific studies, experiments, samples, and runs that are currently available via the NCBI SRA website. (Note: Currently these data do not include the analysis data or the Trace Archive repository.)

The New Interface

In addition to maintaining free access to the SRA database, we have taken this opportunity to improve the experience of using and accessing these data. The new web-based user interface was built using the latest cloud-based technologies and genomic data standards. Central to this effort were the many conversations we had with researchers about how they search and interact with data of this type. Their feedback was the basis of our development plan, which drew on our own experience in developing web-based sequence data analysis solutions as well as Google’s big data expertise.

Searching and Browsing

Our main goal in developing the new interface was to vastly enhance the way you find data of interest, understand sample-to-project associations, and download files for subsequent analysis in your tool of choice.

The most significant difference you will notice is the new web-based searching and browsing interface. The new search tool allows you to simultaneously look across multiple data annotations and keywords for objects of interest that are embedded in the SRA database. Each search returns a ranked list of results with relevant metadata for easy follow-on browsing.

We have developed a number of features to simplify how you can scan results and quickly narrow in on relevant data. We are particularly excited about the links to published data. PubMed references now permit users to link directly to journals for descriptions of samples, experiments, studies, or runs as they appeared in the referenced publication.

Once you have identified samples of interest, you can easily download them. In addition to the SRA standard format, we have also made it possible to download these data in the more popular FASTQ format.
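
If you plan to work with a downloaded FASTQ file in your own scripts, the sketch below shows one way to read it. It is a minimal illustration, not part of the DNAnexus site or its API: the names `FastqRecord`, `read_fastq`, and `mean_phred` are hypothetical, and it assumes the common four-line FASTQ layout (header, sequence, separator, quality) with Phred+33 quality encoding, which you should confirm for your own files.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One read from a FASTQ file (hypothetical helper type)."""
    read_id: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-read FASTQ stream.

    Assumes each read occupies exactly four lines; multi-line
    (wrapped) FASTQ is not handled by this sketch.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near: {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in: {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Average base quality, assuming Phred+33 encoding by default."""
    if not quality:
        return 0.0
    return sum(ord(char) - offset for char in quality) / len(quality)


if __name__ == "__main__":
    # Tiny in-memory example; replace with open("your_file.fastq") in practice.
    example = io.StringIO("@read1\nACGT\n+\nIIII\n")
    for record in read_fastq(example):
        print(record.read_id, len(record.sequence), mean_phred(record.quality))
```

A quick per-read quality summary like this can be a useful sanity check before you commit to a full mapping run.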

For more details on the sra.dnanexus.com functionality and how the website works, please visit the SRA FAQ.

Transforming Data into Real Insights Using DNAnexus

Since the SRA primarily contains raw sequence data, the ability to import them into a platform such as DNAnexus is essential for further analyses. For example, by uploading your results into DNAnexus you can access tools that will map your data to a reference genome so you can better understand data quality, a critical step in determining whether to move forward with the data. DNAnexus also allows you to analyze and visualize these data as a standalone dataset or in conjunction with other data already in the system, using our interactive web-based Genome Browser.

Analyze and Visualize SRA Data for Free

For the next 30 days, you can import SRA data directly into DNAnexus at no cost. If you already have a DNAnexus account, simply log in and import your SRA data. If you are not yet a user of DNAnexus, you can sign up for a free trial account and import your data. Once logged in, you can perform mapping, RNA-seq, ChIP-seq, variant analysis, and data visualization on your SRA data for a total of two years.

Special note for our users from academic institutions… We have just reduced the standard academic pricing by half!