SOT: Still Early Days for Next-Gen Sequencing in Molecular Toxicology

The Society of Toxicology’s 51st annual meeting was held this week right in our back yard. Since I am a longtime member, I headed up to the Moscone Convention Center in San Francisco to check it out. The Annual Meeting and ToxExpo were packed, with almost 7,500 attendees and more than 350 exhibitors.

SOT isn’t like the sequencing-focused meetings I’ve been attending since I joined DNAnexus, but it’s actually home turf for my own research background in toxicogenomics. This year’s meeting sponsors included a number of pharmas and biotechs, from Novartis and Bristol-Myers Squibb to Amgen and Syngenta. Scientific themes at the conference ranged from environmental health to clinical toxicology to regulatory science and toxicogenomics. Next-gen sequencing is still in its infancy in the world of molecular toxicology, which remains dominated by microarray expression experiments. There were very few posters showing applications of NGS data in toxicogenomics, and those that did tended to center on microRNAs. Still, many of the people I spoke with have recently started running sequencing studies, with the aim of eventually retiring microarray-type experiments.

I found Lee Hood’s opening presentation particularly interesting because he focused on the need to combine data from various technology platforms and institutions all over the world. He talked about his P4 vision, of course — the idea that medicine going forward will have to be predictive, personalized, preventive, and participatory. He also included great gems about fostering a cross-disciplinary culture, mentioning genome sequencing of families, the human proteome, and mining genomic data together with phenotypic and clinical data.

Lee Hood. Photo Copyright Chuck Fazio

Another exciting and well-received talk came from Joe DeRisi at the University of California, San Francisco. He presented work analyzing hundreds of honey bee samples with microarrays combined with DNA and RNA sequencing. Using an internally developed de novo assembler called PRICE (short for Paired-Read Iterative Contig Extension; freely available on his website), his team identified a number of different organisms in the sequence data from the honey bee samples, including various viruses, phorid flies, and parasites. At this point it’s not clear what is causing the honey bee population decline; multiple factors appear to contribute to the phenomenon. It is great to see that DeRisi and team will continue working in this area.

Last but not least, Scott Auerbach from the National Toxicology Program announced the release of the previously commercial toxicogenomics database DrugMatrix to the public for free (announced earlier this year, but now officially made public). With this release, DrugMatrix is now the largest freely available scientific toxicogenomic reference database and informatics system. The data are based on rat organ toxicogenomic profiles for 638 compounds; DrugMatrix allows an investigator to formulate a comprehensive picture of a compound’s potential for toxicity with greater efficiency than traditional methods. All of the molecular data stems from microarray experiments, but Auerbach and team are now investigating what it will take to move from microarrays to RNA-seq experiments and how to integrate the different types of data. They are currently running a pilot on a subset of compounds, using the same RNA that was used for the microarray experiments. Their challenge, as he sees it, lies in the interpretation and validation of the newly generated RNA-seq data: what qualifies one platform as superior to the other? Since they are interested in the biology and in generating drug classifiers, one way of looking at it is to assess which platform yields better classifiers, judged against sensitivity and specificity thresholds. It will be interesting to see whether the RNA-seq-based classifiers will be comparable or superior to microarray classifiers.
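To make the comparison idea concrete, here is a minimal Python sketch of scoring two classifiers against sensitivity and specificity thresholds. Everything in it is an assumption for illustration: the function names, the 0.8 cutoffs, and the toy labels are made up and have nothing to do with DrugMatrix data or with how Auerbach's team actually runs its evaluation.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(truth: Sequence[int], predicted: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = toxic, 0 = non-toxic)."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # toxic, called toxic
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # toxic, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-toxic, called non-toxic
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-toxic, called toxic
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def meets_thresholds(sensitivity: float, specificity: float,
                     min_sensitivity: float = 0.8, min_specificity: float = 0.8) -> bool:
    """Hypothetical acceptance rule: both metrics must reach their cutoffs."""
    return sensitivity >= min_sensitivity and specificity >= min_specificity


# Toy labels for ten compounds (invented, not real study data).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
platform_a_calls = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]  # e.g. one platform's classifier
platform_b_calls = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]  # e.g. the other platform's classifier

for name, calls in [("platform A", platform_a_calls), ("platform B", platform_b_calls)]:
    sens, spec = sensitivity_specificity(truth, calls)
    print(name, round(sens, 3), round(spec, 3), meets_thresholds(sens, spec))
```

In a real evaluation the predictions would come from held-out compounds, and the cutoffs would be set by whatever the classifier is meant to be used for.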

AGBT in Review: Highlights and High Hopes for Data

Last week’s Advances in Genome Biology and Technology (AGBT) meeting was every bit the fast-paced roller coaster ride we were anticipating. As expected, there were no major leaps announced by the established vendors, although Illumina, Life Tech’s Ion Torrent, and Pacific Biosciences all had a big presence at the conference.

View from my hotel room: I got lucky with an ocean front room

The biggest splash by far came from Oxford Nanopore Technologies, which emerged from stealth mode with a talk from Chief Technology Officer Clive Brown. The company’s technology sequences DNA by detecting changes in electrical current as the strand moves through a nanopore. Brown said the technology had been used successfully to sequence the phi X genome (a single 10 kb read covered both the sense and antisense strands) and the lambda genome (a 48 kb genome, also covered in a single pass). Brown reported a raw read error rate of 4 percent, mostly caused by the DNA strand oscillating in the nanopore instead of moving smoothly through it. Other significant features: the nanopore can read RNA directly, detect methylation status, and be used directly on a sample (such as blood), with no prep required.

What I thought was most interesting, though, was that at a meeting known for being wall-to-wall sequencing technology, this year’s event really focused more on two arenas: clinical genomics and data analysis. The conference kicked off with a session on clinical translation of genomics, with speakers including Lynn Jorde from the University of Utah and Heidi Rehm from Harvard. Both talked about the key challenges in data analysis and interpretation, with Rehm in particular stressing the need for a broadly accessible data platform with clinical-grade information that could be ranked with confidence level and would pull data together from a variety of disparate sources. Notably, the clinical talks generally were limited by small sample sizes, and sometimes wound up with results that were inconclusive in recommending a particular course of treatment. That’s to be expected in the early stages of moving sequence data into a clinical environment, of course, but it also underscores the opportunities here once low-cost sequencing becomes widely available.

The trend was clear: data, data, data. And the only way to make the most of all that data will be to pave the way to an environment where information can be accessed and shared easily, with as many tools as possible to interrogate, analyze, and validate it.

Load Up on Caffeine … AGBT Is Almost Here

View from the Marco Island Marriott, the AGBT venue

We’re gearing up for the Super Bowl of the next-gen sequencing field – the Advances in Genome Biology and Technology (AGBT) meeting held annually in Marco Island, Fla. In a typical year, there would be major announcements from the established sequencing vendors at this event, but given that Life Technologies and Illumina already went public with their big news at JP Morgan, and the Roche bid for Illumina will likely still be playing out, the big stories from this year’s meeting will probably revolve around major research findings, technology applications, and what’s going on with the sequencing upstarts. (Oxford Nanopore, for example, will be announcing plans to commercialize its instrument later this year and providing attendees a sneak peek. GnuBio will also be presenting on its desktop sequencer, the iGnuIT 1000.)

As usual, this year’s agenda is chock full of thought-provoking presentations, including a talk by DNAnexus co-founder Arend Sidow, who will be presenting on the use of deep whole-genome sequencing to monitor breast cancer progression (Thursday, Feb. 16, at 4:35pm).

We’ll be there to meet with colleagues, customers, and potential collaborators. We’ll also be presenting two posters on current DNAnexus projects. If you’ll be there, we encourage you to stop by — find out more about us, get a demo, have some wine and cheese, you name it. Here’s a quick preview of what we’ll be showcasing:

  • Candidate Gene Variants in “Micronesian” Autosomal Recessive Aplastic Anemia – Brigitte Ganter, Majed Dasouki, S. Abhyankar, M. Furness, R. Calado
    This work was done with collaborators at the University of Kansas Medical Center and National Heart, Lung, and Blood Institute (NHLBI). In the project, researchers performed exome sequencing and nucleotide-level variation analyses for two siblings with aplastic anemia, a condition in which bone marrow does not produce sufficient new cells to replenish blood cells. The results led to the identification of 12 candidate homozygous variants in 9 different genes. In this poster, we’ll discuss how DNAnexus was used to identify these variants and characterize their potential role in aplastic anemia.
  • Expanding and Enhancing Access to the Sequence Read Archive (SRA) Through a Complementary New Web-Based Mirror – Brigitte Ganter, Evan Worley, Bing Xia, Andreas Sundquist
    As we announced last October, we teamed up with Google to develop a complementary hosted mirror of NCBI’s Sequence Read Archive (SRA). Through a typical user scenario, we will discuss the underlying data processing pipeline, key features of the new web-based interface and how researchers can use it to quickly identify and browse datasets of interest, link-out to PubMed references, and integrate data into follow-on analysis workflows.