In medical school, perhaps the most indispensable texts were the “Ridiculously Simple” series – Clinical Anatomy Made Ridiculously Simple, Acid-Base Made Ridiculously Simple, etc. While you probably wouldn’t want to operate or dialyze based only on the knowledge in these short books, they nevertheless offered accessible overviews of complex and often intimidating topics.
In this spirit – and in response to questions from friends and family who regularly ask, “What does DNAnexus do?” – I thought I might offer this short post.
What Is DNAnexus?
DNAnexus is a platform – basically, a sophisticated software program – that makes it easier for users to do three things, each in a secure and compliant fashion:
- Analyze large amounts of raw genetic data
- Share and collaborate around large amounts of data (including but not limited to genetics)
- Integrate genetic data with other types of data, such as data from electronic medical records or imaging data, to advance science and to improve clinical care
Let’s take these one at a time.
(1) Analysis Of Raw Sequencing Data
The basic idea here is that the machines used to read DNA sequences are incredibly powerful, but they don’t generate a book of information that starts at the beginning of the first chromosome and concludes at the end of the last one. Rather, most sequencing machines spit out phrases of about 100 letters, each randomly located somewhere in the 3-billion-letter book that is the human genome. A computer must figure out where each individual phrase fits in the book, and must also determine whether there are any typos. This can be a computationally intensive task, but DNAnexus provides a way to do it efficiently, by dividing the task into multiple parallel streams, each of which can be tackled by a powerful computer.
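For the programmers in the family, here is a toy sketch of that idea in Python. It is only an illustration, not how DNAnexus or any production pipeline actually works: the "book" is a made-up 30-letter string, the "phrases" are 8 letters rather than about 100, and the names `place_read` and `place_reads_in_parallel` are invented for this example. Real tools rely on indexed searches rather than the brute-force scan shown here.

```python
from concurrent.futures import ThreadPoolExecutor

def place_read(reference, read):
    """Slide the read along the reference and return (position, typos)
    for the placement with the fewest mismatched letters."""
    best_pos, best_typos = -1, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        typos = sum(1 for a, b in zip(window, read) if a != b)
        if typos < best_typos:  # keep the earliest best placement
            best_pos, best_typos = pos, typos
    return best_pos, best_typos

def place_reads_in_parallel(reference, reads, workers=4):
    """Hand each read to a pool of workers; every read is independent,
    which is what makes the job easy to split into parallel streams."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: place_read(reference, r), reads))

# Hypothetical toy data: a tiny "genome" and three short "phrases".
REFERENCE = "ACGTTGCAAGGCTTACCGATAGGCTAACGT"
READS = ["TTGCAAGG", "CCGATAGG", "GGCTTTCC"]  # the last one contains a typo

if __name__ == "__main__":
    for read, (pos, typos) in zip(READS, place_reads_in_parallel(REFERENCE, READS)):
        print(f"{read} fits at position {pos} with {typos} typo(s)")
```

Because no phrase depends on any other, the work can be spread across as many workers as are available, which is the property that makes this kind of analysis a natural fit for many computers working side by side.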
The computers DNAnexus tends to use are run by Amazon (more precisely, by Amazon Web Services, or AWS), and our use of them is an example of what’s known as “cloud computing” because the computers operate from a massive, dedicated central facility, rather than from a user’s own institution. One advantage of cloud computing is that it is very much “on demand”: you have essentially unlimited access to as many computers as you need, and you pay only for the computers you actually use, and only while you are using them.
(2) Distributed Collaboration
Progress in both science and medicine can be accelerated when data can be easily shared. When there are large volumes of data, as is increasingly the case in research and clinical realms, sharing can be a real problem. Remarkably, the most common method of large-scale data sharing today is probably FedEx’ing hard drives between institutions. DNAnexus enables a distributed team of researchers or clinicians to all have access to the same data at the same time; by bringing together the data, the experts, and the tools for analysis, DNAnexus facilitates collaboration and accelerates knowledge turns.
DNAnexus is ideally suited to power consortia, whether NIH investigators (as is the case with our work with CHARGE in the area of cardiovascular disease or our work with ENCODE in the area of genetic annotation), diagnostic companies (our work on precisionFDA), translational research partnerships (our work with Regeneron and Geisinger Health System), or a public/private partnership of cancer researchers (our work with ITOMIC, led by University of Washington’s Tony Blau).
The ability to support distributed innovation also enables DNAnexus to provide global support for companies like Natera that send kits to sequencing labs worldwide, but collect and analyze the data centrally using DNAnexus.
(3) Integration With Other Data Types
The insights that may be available in genetic data are often revealed only when the information is considered and analyzed in the context of other data types, such as data from electronic health records (EHR) or imaging data (such as radiology images or pathology images). Integrating genetic and EHR data is fundamental to the drug discovery work of Regeneron, for example. In the same way our partners can easily access and efficiently utilize the fundamental tools of genetic analysis on our platform, so too can they access and utilize the tools required for integrating genetic data with other data types. DNAnexus is adding tools constantly, based on the needs expressed by our partners.
Guided by the visionary partners with whom we are privileged to work, DNAnexus continues to enhance our tools around each of these three areas: DNA analysis, distributed collaboration, and integration with other data types. We are constantly seeking opportunities to leverage the technology we’ve developed, as well as innovative leaders looking to bring the power of our platform to bear in original and impactful ways.