Join DNAnexus for a Lunchtime Workshop at ASHG

Whether you join us for lunch to explore the dark matter of ENCODE or visit our booth (#507) to check out the latest updates and get a demo of DNAnexus, we’d love to catch up with you at ASHG. Plus, introduce yourself at the DNAnexus booth and you’ll be entered to win an iPod Shuffle.


We are honored to have Dr. Anshul Kundaje, research scientist at MIT, review his latest ENCODE consortium paper and share insights into gene regulation. By looking at 119 transcription factors and regulatory proteins, Kundaje’s team found that chromatin diversity at the regulatory level is the norm, rather than the exception.


During this lunchtime talk you’ll discover how Kundaje and team used DNAnexus to process and map some 5 billion reads and then identified chromatin patterns via a newly developed probabilistic mapping tool, the Clustered AGgregation Tool (CAGT). The clustering approach yielded dramatically different results compared to the standard method of averaging chromatin marks across populations. Importantly, Kundaje’s team was able to identify about 25 “metapatterns,” or signatures that represent the diversity of modifications across binding sites and cell types. These distinct patterns will be valuable for other scientists examining chromatin modifications and making inferences about what those changes are doing in the genome.
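To get a feel for why clustering can tell a different story than averaging, here is a minimal toy sketch. It is not the CAGT algorithm: the signal values are made up for illustration, the function names are hypothetical, and plain k-means stands in for whatever clustering the tool actually performs. Half of the synthetic binding sites carry signal on the left and half on the right, so the overall average looks symmetric even though no single site does.

```python
# Toy illustration only: synthetic data and plain k-means, not the CAGT algorithm.

def mean_profile(profiles):
    """Average the signal position by position across all sites."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, centers, iterations=10):
    """Plain k-means, starting from the given initial centers."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for profile in profiles:
            nearest = min(range(len(centers)),
                          key=lambda i: sq_dist(profile, centers[i]))
            groups[nearest].append(profile)
        # Keep the old center if a group ends up empty.
        centers = [mean_profile(g) if g else c
                   for g, c in zip(groups, centers)]
    return centers, groups


# Hypothetical signal at 5 positions around a binding site (site in the middle).
left_heavy = [[9, 8, 2, 1, 0], [8, 9, 1, 0, 1], [9, 9, 2, 1, 1]]
right_heavy = [[0, 1, 2, 8, 9], [1, 0, 1, 9, 8], [1, 1, 2, 9, 9]]
profiles = left_heavy + right_heavy

# The aggregate view: one averaged profile for all sites.
aggregate = mean_profile(profiles)

# The clustered view: two groups, seeded with the first and last site.
centers, groups = kmeans(profiles, centers=[profiles[0], profiles[-1]])

print("aggregate:", [round(v, 2) for v in aggregate])
for center, group in zip(centers, groups):
    print(len(group), "sites:", [round(v, 2) for v in center])
```

In this toy data the averaged profile is symmetric around the site, while the two cluster centers are each strongly one-sided. That is the kind of asymmetry an aggregate plot would hide.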



• Wednesday, November 7th, 12:45 – 2:15 pm
• Moscone Center, Room 301 (Esplanade Level)
• Boxed lunch & refreshments will be provided!



Brigitte Ganter, Ph.D., Director of Product Marketing, DNAnexus

Ubiquitous Heterogeneity and Asymmetry of the Chromatin Environment at Human Regulatory Elements
Anshul Kundaje, Ph.D., Research Scientist, MIT

DNAnexus: A Collaborative and Scalable Data Technology Platform
Andreas Sundquist, Ph.D., CEO and Co-founder, DNAnexus


Explore the data yourself; you’ll find the 20 samples accessible via DNAnexus in our Public Data folder, labeled Encode.


Collaborative Research Was the Big Winner at Bio-IT World Europe


Earlier this month I attended the 4th annual Bio-IT World Europe Conference & Expo in Vienna, where I found the scientific community’s enthusiasm for high-performance, cloud-based computing higher than ever. I was thrilled to see growing demand for resources that help scientists and bioinformaticians store, manage, and analyze their data, particularly in ways that facilitate collaboration among larger groups. Quite a bit of money seems to be spent in Europe on cloud-based and open source tools to support and advance genomics research. The money comes from research funds, but also from the commercial sector. That’s especially interesting since Europe’s overall funding situation seems a bit shaky; it is great to see that there is still enough funding on the research side.


Indeed, collaboration was a central theme at the conference this year. A keynote presentation from Yike Guo, a professor in computing science at Imperial College London, focused on the Innovative Medicines Initiative (IMI), a public-private partnership bringing together biopharmaceutical companies, hospitals, universities, and others to help bring safer, more effective medicines to patients. Guo oversees a project called eTRIKS, or the European Translational Information & Knowledge Management Services, which received €24 million from IMI to build a cloud-based platform to improve collaboration among IMI members, including big pharma companies and academic institutions. The effort is open source, and Guo’s team hopes to have a prototype ready for testing in a few months.


In another keynote session, Paul Flicek, principal investigator and head of the vertebrate genomics team at EMBL’s European Bioinformatics Institute, spoke about evaluating cloud-based computing as part of his work with Ensembl, the 1000 Genomes Project, and ENCODE. He termed it “interacting with the cloud through the lens of Ensembl.” Flicek made the important point that the ultimate goal isn’t to amass sequence data, such as aligned reads, variation calls, and genome browser viewings, but rather to extract knowledge from that data to improve our understanding of biology and disease. He uses cloud services from Amazon to take advantage of its entire infrastructure, to distribute the data, and to provide genome annotation of more than 50 species.


I was also really interested in a talk from Veit Ulishoefer, who presented an update on the Pistoia Alliance. This group was formed a few years ago by informatics experts at some of the leading pharmaceutical companies who wanted to share precompetitive information to streamline the drug discovery process at all of their companies. Today, the group is made up of pharma companies, publishers, and academic institutes, among others. Ulishoefer spoke about a recent competition called Sequence Squeeze, hosted by Pistoia, to find the best compression tool for sequence data. The winning entry, which can be accessed through SourceForge, came from James Bonfield, a researcher at the Wellcome Trust Sanger Institute.


It was great to see that so many of these collaborative projects were driven by pharma, which isn’t necessarily known for having a share-and-share-alike mentality. If even these highly competitive corporations can find ways to work together, that gives me great hope that such alliances will help usher in an improved understanding of diseases and more effective medicines. Here at DNAnexus, we strongly believe in collaboration; it is a major focus of ours, and one that the core capabilities of the cloud support well.


At Beyond the Genome Conference, Lessons on Data Analysis and Clinical Studies


A few of us from DNAnexus had the privilege of attending Beyond the Genome 2012, a conference organized by BioMed Central and held at Harvard Medical School. The meeting, now in its third year, continued its trend of attracting top-notch speakers, including keynotes from Baylor’s Richard Gibbs and Stuart Schreiber from the Broad Institute.


From the first speaker, Gabor Marth from Boston College, it became clear that one of the major hurdles now facing scientists is not DNA sequencing, as was true in years past, but processing the data. This has led to a situation where many groups are writing their own algorithms to perform the same functions, a widely recognized obstacle to allocating resources productively. Scientists encouraged each other to stop reinventing the wheel, and also to ensure that bioinformatics tools can be used and reported on easily by biologists. That message resonated with us, as we have long championed the concept of a central data resource where excellent algorithms would be accessible to anybody. It’s gratifying to see that the same principle is gaining acceptance throughout academia as easy-to-use, reproducible data analysis becomes the real challenge in the sequencing process.



We also saw a string of fantastic talks on clinical sequencing. Sharon Plon from Baylor gave a very insightful “lessons learned” talk about their first year of clinical exome sequencing. The biggest pain point in the process was not sequencing, data analysis, insurance reimbursement, or finding patients in need; it was figuring out what to report to patients and how to do it. This underscores the need to bring genetic counselors, ethicists, and doctors into the conversation early to give guidance on what until recently has been a purely research-based endeavor. Dr. Plon and Joris Veltman from Radboud University presented several amazing case studies where sequencing had identified the cause of disease, allowed patients to take steps to improve their lives, and informed their families about the risk of recurrence. We look forward to hearing many more success stories.


Of course, cancer studies were a noteworthy trend at the conference. We heard about research on cancer genome evolution, epigenetic modification, sifting causative mutations from neutral ones, and the general effects of genome organization in three dimensions. But it was clear that integrating the information that’s being generated from all these techniques will be a big challenge. To get even deeper insights into human cancers, we’ll need to bring together the computational tools that we’ve already built and also bring together people from different scientific, medical, and social disciplines to apply that information intelligently. The good news is that this is already starting to happen, and we at DNAnexus are excited to be in a position to offer help as this approach gains traction.