Cancer Genomes Dataset Now Hosted on Amazon Web Services

Today, Amazon Web Services (AWS) and the Ontario Institute for Cancer Research (OICR) made the International Cancer Genome Consortium (ICGC) Pan-Cancer dataset available on Amazon’s Simple Storage Service (S3), with more than 2,400 consistently analyzed whole genomes from over 1,100 unique ICGC donors.

Hundreds of terabytes of genome sequence alignments and variant information are now hosted on AWS and can be used to explore the genomic basis of cancer, accelerate research, and develop more targeted therapies. In addition to the raw data, the dataset contains a total of nearly 4 million identified mutations and known differences between tumor and normal genes.

We are thrilled by this news, which reduces the technical barriers to accessing and working with ICGC data. DNAnexus users who are also authorized ICGC researchers will have convenient, fast access to the data without needing to transfer it from far-flung repositories or to their local infrastructure, a process that previously could take months and required substantial compute resources. This will allow the rich set of DNAnexus tools and pipelines, as well as the easy tool development environment of DNAnexus, to be applied to this cancer data, all with security, compliance, and practically unlimited scalability.

“The DNAnexus Platform provides an environment where our own data will be able to live with ICGC data, performing sophisticated analysis to extract knowledge and the ability to share with other researchers from around the world in real time,” said Steve Rozen, PhD, Director of Duke-NUS Center for Computational Biology.

ICGC researchers can apply for cloud data access through the ICGC DACO and use their access token with a DNAnexus app wrapping the ICGC DCC storage client. The controlled-access data transferred from S3 can then be either processed ephemerally or stored in a secure DNAnexus project for later use. Learn more at our ICGC_fetcher GitHub repository and DNAnexus project, or schedule a scientific consultation with our team.


With ICGC data now hosted on AWS, researchers at institutions large and small will be able to access this large dataset easily on the DNAnexus Platform, easing the technical and computational barriers for cancer genomics analysis and data sharing. We’re committed to furthering the collective understanding of cancer at the genomic level and moving science forward. Stay tuned for more!

Highlights from Festival of Genomics California 2016

PrecisionFDA, Explore Your Genome Giveaway, & Treadmills

Last week, we had the pleasure of attending Festival of Genomics California. We hope you were able to catch some of the truly inspiring talks from the plenary speakers, such as Karen Nelson from J. Craig Venter Institute, Manolis Kellis from MIT Computational Biology Group, and Carlos Bustamante from Stanford University School of Medicine.

Our big story at the conference was about precisionFDA – an open source platform to advance regulatory science about NGS-based analytic tools and datasets. DNAnexus is under contract with the FDA Office of Health Informatics to assist the FDA in preparation for a December 15th launch by developing the precisionFDA portal and assisting in the engagement of the genomics community. At the Festival, we organized a panel moderated by Omar Serang and George Asimenos who led a discussion involving Deanna Church, Personalis; Michael Eberle, Illumina; Kevin Jacobs, 23andMe; and Justin Zook, NIST.

We kicked off a closed panel discussion with a video overview of precisionFDA.

George showed screenshots of the precisionFDA platform, highlighting examples of how the community will be able to use the platform, including the variant call comparison tool that can compare results between a test dataset and a benchmark/reference dataset, and separate out true positives, false positives and false negatives.
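As a rough illustration of the idea behind such a comparison (not the precisionFDA implementation), the minimal Python sketch below treats each variant call as a hypothetical (chromosome, position, ref, alt) tuple and uses set operations to split a test call set against a benchmark. The function name and the sample variants are made up for this example.

```python
# Illustrative sketch only; this is NOT the precisionFDA comparison tool.
# Assumption: each variant is a (chromosome, position, ref, alt) tuple and
# two calls match only if all four fields are identical.

def compare_calls(test_calls, benchmark_calls):
    """Split test calls into true/false positives and find false negatives."""
    test = set(test_calls)
    benchmark = set(benchmark_calls)
    return {
        "true_positives": test & benchmark,    # called and in the benchmark
        "false_positives": test - benchmark,   # called but not in the benchmark
        "false_negatives": benchmark - test,   # in the benchmark but missed
    }

# Made-up example data
benchmark = [("chr1", 1000, "A", "G"), ("chr1", 2000, "C", "T"), ("chr2", 500, "G", "A")]
test = [("chr1", 1000, "A", "G"), ("chr2", 500, "G", "A"), ("chr3", 42, "T", "C")]

result = compare_calls(test, benchmark)
for category, variants in result.items():
    print(category, len(variants))
```

Exact tuple matching is a simplification: real comparison tools typically also have to reconcile different representations of the same variant before counting matches.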

From there, the panel delved into a lively discussion. Deanna pointed out that only roughly 1,500 of the 4,000 genes that have some clinical relevance are represented in their entirety in the Genome in a Bottle/National Institute of Standards and Technology (NIST) NA12878 dataset. Therefore, the precisionFDA community needs to spend time expanding that sample, not just adding more samples to the community’s database. Michael stressed the importance of pedigrees in validating variants in a reference sample.

Kevin made a passionate case for why the community needs to focus on conventions. How can we possibly enter an era of precision medicine if we are still unable to agree on where the human genome exons are located? Researchers use computers and algorithms to calculate results in binary formats, but the computer only does what the human researcher tells it to do, and since humans are not very precise, how can we be confident in our results? Justin stressed that NIST should not be the body dictating these standards; they need to be developed and agreed upon by a community.

In addition to its variant call comparison functionality, the precisionFDA platform will allow participants to conduct in-silico experiments using software and publish their conclusions in the form of electronic notes that can be placed with their experiments and reference the data and analyses. Using this feature, participants have the opportunity to showcase to the community their views on certain “hot” topics such as “the utility of simulations”.

Be sure to visit and follow us on @precisionFDA to get the latest. The platform will be accepting applications for membership in the community beginning December 15th!

Explore Your Genome Giveaway!
We also want to thank Sure Genomics for joining us in our booth to demo their product and discuss how the DNAnexus Platform analyzes their customers’ whole genome sequencing data. During the festival, Sure Genomics and DNAnexus announced two lucky winners who will have their genomes sequenced and analyzed by Sure Genomics. Each winner will receive a complimentary Personal DNA Kit, a personal DNA report mapping key markers, and one year of free storage.

Day 1 Winner:  Alain C., Insilico
Day 2 Winner:  Jennifer P., Kapa Biosystems

Congratulations Alain & Jennifer!

Returning Winners of Race the Helix Treadmill Challenge
Blazing fast. That’s the only way DNAnexus runs, whether in the cloud or on the treadmill. DNAnexus repeated its June victory in Boston to win the Race the Helix Treadmill Challenge in California.

We also want to thank all those that stopped by our booth. We look forward to seeing everyone again next year at Festival of Genomics Boston.

Festival of Genomics: Not Your Typical Genomics Conference

Whole Genome Sequencing Giveaway, FDA’s New Approach to Science & Treadmill Challenge

Front Line Genomics is bringing Festival of Genomics to California this week (Nov 3-5th). Their inaugural event, which was held in Boston in June, was a great success and we are looking forward to the festivities taking place in our own backyard. Festival of Genomics will include a stellar lineup of speakers, plus posters, workshops, exhibitors, and networking opportunities. DNAnexus returns as defending champion of the “Race the Helix” treadmill challenge, a fundraising event for the Greenwood Genetic Center.

If you’re headed to Festival of Genomics California, please visit the DNAnexus booth (#15) to learn about our latest research and development project with the FDA’s Office of Health Informatics to build precisionFDA, an open source platform to advance regulatory science around NGS-based analytic tools and datasets. Learn more.

In collaboration with Sure Genomics, we’ll be giving away one FREE full genome sequence each day! Stop by the DNAnexus booth (#15) between 1 and 2pm to learn more about Sure Genomics, their new personal genome sequencing service and interactive analytics platform, and be entered to win. More details below.

DNAnexus and Sure Genomics will sequence and analyze the whole genomes of two lucky winners ($2500 value)!

Sure Genomics Hour (DNAnexus Booth #15)
• Wednesday, November 4th at 1:00pm – 2:00pm
• Thursday, November 5th at 1:00pm – 2:00pm

Learn how Sure Genomics, a solutions provider for personal DNA analytics for health, fitness, and medical reports, has adopted the DNAnexus Cloud Platform to support its full genome sequencing offering. The Sure Genomics beta opens in early December and will start taking reservations; genome sequencing starts at the beginning of Q1 2016.

Each winner will receive a complimentary Personal DNA Kit, personal DNA report mapping key markers, and one year free storage. STOP BY TO LEARN MORE & ENTER TO WIN!

Details on the giveaway rules:

  • Entry will be open during Sure Genomics Hour November 4-5th from 1-2pm at the DNAnexus booth (#15).
  • Simply drop off your business card in the bowl at the DNAnexus booth and receive a sneak peek of the Sure Genomics product.
  • No purchase necessary and you don’t have to be an existing DNAnexus customer.
  • Winners will be announced and contacted by Sure Genomics after the Festival.

DNAnexus at Festival of Genomics California

 Genomics in the Age of Ubiquitous, Cheap Sequencing
Presenter:  Andrew Carroll, PhD, Director of Science, DNAnexus
When:  Wednesday, November 4th at 3:00pm
Where:  Stage G

Panel Discussions
Panel Discussion: Preparing (for) 1,000,000 Whole Genomes
When: Wednesday, November 4th at 2:00pm
Where: Stage G

Will Salerno, Sr Staff Scientist, Next-Gen Sequencing Informatics Human Genome Sequencing Center, Baylor College of Medicine

Matthew Trunnell, CIO and VP of Center Information Technology and Services, Fred Hutchinson Cancer Research Center
Jeffrey Reid, Sr Director, Head of Genome Informatics, Regeneron Genetics Center
Ali Torkamani, Director of Genome Informatics, The Scripps Translational Science Institute
Andrew Carroll, Director of Science, DNAnexus
Deniz Kural, CEO, Seven Bridges Genomics

PrecisionFDA Panel Discussion:  Advancing Precision Medicine by Enabling a Collaborative Informatics Community
When: Thursday, November 5th at 3:45pm
Where: Tech Forum Stage

Omar Serang, Chief Cloud Officer, DNAnexus
George Asimenos, Director of Strategic Projects, DNAnexus

Deanna Church, Senior Director of Genomics and Content, Personalis
Michael Eberle, Associate Director, Scientific Research, Illumina
Kevin Jacobs, Director of Laboratory R&D, 23andMe
Justin Zook, Research Scientist, National Institute of Standards & Technology

Learn about the development of an open-source platform for community contribution to NGS-related regulatory science.

Talks Powered by DNAnexus:

Title: Genome Informatics & Cloud Computing – the Internet Isn’t Just for Cat Pictures Anymore!
Presenter: Jeffrey Reid, Sr Director, Head of Genome Informatics, Regeneron Genetics Center
When: Thursday, November 5th at 5:00pm
Where:  Stage G

Visit us in booth #15 at Festival of Genomics!

F-U-N (Yes, there will be a treadmill on the exhibition floor!)
The DNAnexus Team swept the Race the Helix treadmill challenge at the last Festival of Genomics conference in June, and we want a chance to R-E-P-E-A-T. Come cheer on team DNAnexus Thursday 11:30 am, November 4th as we defend our title. 100% of funds raised by this event will go directly to the Greenwood Genetic Center Trust.