More Rapid Responses to Rare Disease

Families facing rare diseases know a thing or two about patience. They often face diagnostic odysseys that take years. Even when answers are forthcoming, therapies might not be. 

At DNAnexus, we’re proud to partner with several organizations to provide more rapid rare disease discovery, drug development, and delivery.   

One of these is Ultragenyx, a biopharmaceutical company based in Novato, CA, which is committed to bringing novel products to patients for the treatment of serious and debilitating rare and ultra-rare genetic diseases. The company has rapidly built a diverse portfolio of approved therapies and product candidates aimed at addressing diseases with high unmet medical need, yet clear underlying biology.

Delivering Safe and Effective Therapies

Ultragenyx is in the business of time. Its strategy is predicated upon time- and cost-efficient drug development, with the goal of delivering safe and effective therapies to patients with the utmost urgency.

Thanks to DNAnexus Titan, we were able to provide Ultragenyx with a more streamlined way to unlock the power of multi-omics data and accelerate its discovery process. The unified NGS analysis platform brought its data and pipelines together in one secure environment for enhanced data analysis, collaboration, visualization, and sharing of results.

With the right infrastructure in place, Ultragenyx simplified the complexities of secondary analysis, allowing researchers to focus on what’s important – rapid rare disease discovery.

“You save the bioinformatician’s time, compute time, and therefore decrease turnaround time. This enables R&D researchers to answer questions and get rare disease treatments to market faster. You can’t put a price tag on that.”

Associate Director of Bioinformatics, Ultragenyx

Data Sharing & Accelerated Rare Disease Discovery

Data sharing is key when researching a rare disease with many subtypes driven by diverse and distinct genetic alterations. Our partnership with St. Jude Children’s Research Hospital and Microsoft has resulted in a cloud-based, data-sharing ecosystem that has proved to be a model for harmonized genetic data and collaboration across the pediatric cancer community.

The more samples researchers are able to analyze, the more power they have for genomic discovery and clinical correlative analysis. The St. Jude Cloud holds more than 1.25 petabytes of data and has already enabled many discoveries, such as new insight into a rare C11orf95 fusion in ependymoma, the classification of 135 pediatric cancer subtypes by gene expression profiling, and the mapping of mutational signatures.

By allowing data to be authenticated, tracked, and monitored in a single, secure, and compliant system, our platforms reduce many of the logistical difficulties that researchers might otherwise face. 

We’re happy to support large-scale, global collaborations that are accelerating genetic discovery and providing actionable insights that will help the not-so-rare 300 million people impacted by genetic diseases get help sooner, rather than later. 

How Regeneron Bypasses Bottlenecks to Iterate at the Scale and Speed of Science

At the Regeneron Genetics Center (RGC), scientists are uncovering important genomic variants involved in human health and disease and enabling important research into novel drugs and therapies.  RGC receives 500,000 samples per year and generates about 500 billion reads per hour. To date, the center has sequenced over 1.5 million samples and created one of the largest catalogs of human genetic variation.

To gain insights from all that data, RGC needed infrastructure capable of capturing and handling large quantities of genome and phenotype information. In a recent webinar, William Salerno, RGC’s Senior Director of Genome and Sequencing Informatics, discussed the RGC’s infrastructure and how the center built a system capable of meeting its current and future data analysis needs.

The RGC accomplishes its data analysis using a combination of local infrastructure, cloud computing, and the DNAnexus Platform. RGC uses the platform to run various production and analytical workflows and to meet its data management and sharing needs. One component of the RGC platform is GLnexus, software for large-scale data merging that RGC researchers developed with DNAnexus and other partners. RGC researchers have tested it on over a million exome samples, so they are confident that it scales to meet their needs.

RGC needed a solution with metadata capture and pipeline version control to enable extensive logging and troubleshooting. The platform also provides a comprehensive security framework that keeps genomics data secure during analysis and processing. Salerno highlighted one example where RGC created an autonomous cloud environment for a partner that needed to analyze genomic and phenotypic data related to COVID-19. RGC was able to get the environment up and running in two days, and the partner was able to easily import data into the cloud and control who could access it.

For scientists looking to build infrastructure that can support large-scale genomics, Salerno highlighted some key factors to consider during the webinar. One is the cost of the platform, which includes not only the physical infrastructure but also audits, quality control, system redundancy, troubleshooting policies, managed services, and disaster recovery. Another is how metadata will be generated and captured on the platform.

The RGC is committed to ensuring that its work is equitable, open access, and transparent. To that end, the center makes open-source versions of the genome analysis pipelines that it uses on the Titan Platform available to the scientific community.

To learn more about how RGC’s platform is enabling scalable genome research, download the whitepaper or listen to the webinar.

The Hybrid Hackathons of the Future — now with Librarians!


Hannah Gunderman, Data, Gaming, and Popular Culture Librarian, Carnegie Mellon University Libraries

Ben Busby, Director, Solution Science and Principal Scientist, DNAnexus


With the world still reckoning with the impacts of the COVID-19 pandemic, one thing that has remained constant is the need to change how people collaborate and communicate ideas, often by shifting to remote and virtual formats. The pandemic accelerated the rate at which hackathons are hosted virtually. Remote hackathons have the potential to extend the personally and professionally transformative experiences of in-person events to those who cannot travel due to financial, physical, or environmental constraints. They allow the intellectual wealth of these scientists to be applied to the important topics and goals of the hackathon, while supporting their health and safety through virtual participation. We hope that hackathons will retain a hybrid model to maximize the scientific contributions of both in-person and remote participants.

Why are hackathons important?

Hackathons allow for concentrated, focused effort on a task or goal by bringing together scientific experts in a particular discipline, such as structural variants, or united by a common goal, such as ending neurofibromatosis.  Some hackathons solve thorny problems, make life easier for practitioners of specific disciplines, or push the boundaries of what a particular scientific field can do.  That said, hackathons not only produce content (usually software), but ideally also actively facilitate education and networking. Those who participate often have professionally transformative experiences that can lead to a wider scientific network, job opportunities, and increased confidence in their coding and research skills. 

Hackathons largely follow the model of “disruptive innovation” by serving as a prototyping layer across scientific organizations, producing new ideas and technologies that the community can then assess for value in their larger goals and initiatives. The prototypes that emerge either push the envelope of what is possible with biomedical informatics or make day-to-day bioinformatics easier. While the code isn’t necessarily persistent, these proofs of concept are intended for the community to build upon. Hackathons foster an environment with “buzz,” an economic geography concept referring to the serendipitous sharing of creative ideas that happens when people engage in face-to-face interactions. The last year has taught us that hybrid and fully remote formats can afford these benefits as well, providing hope for a positive future of hybrid hackathons in scientific advancement and discovery.

How do hackathons benefit the participants?

Not only do hackathons have an undeniable benefit to the broader scientific community, but they can also provide transformative and impactful experiences for the participants themselves. These experiences largely revolve around confidence-building, educational development, and camaraderie.

As described earlier, the “buzz” created in hackathon environments helps advance the sharing of creative and innovative ideas. Through this exchange of ideas, participants can advance their journeys in computational problem-solving and modern software development techniques. In the bioinformatics space, there are many beginner data scientists who are still learning foundational skills in computation and scientific collaboration. Hackathons, whether remote or in-person, offer a concentrated space for beginner data scientists to advance their skills in both of these areas alongside more established bioinformatics researchers. Not only does this afford educational benefits to these participants, but it can also increase their confidence as scientists who can contribute to important research endeavours. 

Finally, hackathons also create the opportunity for participants to forge close personal friendships and bonds, which can lead to long-term collaborations and network-building. 

Participants often find themselves in intensely challenging and time-limited environments as they race to accomplish the goals of the hackathon, and going through these transformative experiences together can lead to strong friendships and connections that span beyond the bounds of the hackathon itself. This is not limited to in-person hackathons, however: video-conferencing software such as Zoom and collaborative tools such as Slack allow participants to interact with each other and build both interpersonal and professional connections. 

A Retrospective Look Into CMU-DNAnexus Virtual “Genomic Data to the Clinic” Hackathon

The CMU-DNAnexus Virtual “Genomic Data to the Clinic” Hackathon (June 1st – June 4th, 2021) focused on bringing complex genomic data into the clinic. Specifically, we focused on integrating Expressed Variants, Polygenic Risk Scores, Structural Variants, and T-Cell Receptors into an Electronic Medical Record-readable format using OMOP, and worked on a clinically presentable interface. Remote support was offered by librarians from Carnegie Mellon University Libraries who have specialties in data management, bioinformatics, and information sciences. This support included collating important resources found by hackathon participants (such as tools, software, and literature) into a single spreadsheet for easy access, reviewing the hackathon manuscript for syntax and readability, and preparing the manuscript for submission to BioHackrXiv. Communication platforms such as Zoom and Slack can offer ways to stay in touch and facilitate collaboration during a remote hackathon, but information can still get lost in translation in environments where we can’t see each other face-to-face. Librarians are trained in the information sciences and are well-positioned to assist in keeping information organized and accessible during a remote or hybrid hackathon.

Participants not only effectively used online collaboration tools to create innovative workflows and deliverables supporting the goals of the hackathon, but also used tools such as Slack to develop interpersonal friendships. Much of the same dynamic energy and “buzz” felt during an in-person hackathon was also felt in this virtual space and the experience has already led to some promising future collaborations and scientific endeavors, including an accepted proposal for a presentation at the 2021 annual meeting of the American Society of Human Genetics that will share the scientific findings from this hackathon. 

Upcoming hybrid hackathons

Although the pandemic has a long tail, we can begin to envision what our post-pandemic future may look like, drawing on the lessons we have learned from navigating our remote environment for the past several months. One of those lessons is that hackathons with a virtual option can help us create more equitable and diverse intellectual spaces for tackling the most pressing issues we face in bioinformatics. Moving forward, hackathons should adopt a hybrid model that allows both in-person and remote participation, while giving more team leads the sequestration they need to focus their energies fully on these efforts instead of juggling both work and the hackathon.

Further, leveraging the support of librarians in the hackathon space can lead to a more organized, cohesive, and collaborative experience for participants. This is particularly true for fully remote or hybrid hackathons, where clear communication channels are crucial for all participants. Librarians can help facilitate collaboration and coordination between remote and in-person participants, and help collate resources (such as tools, software, and literature) found during the course of the hackathon.  

We are excited to see what the future of hybrid hackathons holds for our field at large, and the scientific discoveries that will result from these events.  Below are some upcoming hackathons you can follow or get involved in!

Everything is bigger in Texas: Pan-Structural Variation hackathon in the Cloud! 

October 10-13, 2021, hosted by the Baylor College of Medicine

BioHackathon Europe

November 8-12, 2021, hosted by ELIXIR Europe

CMU-DNAnexus Hybrid “Genomic Data to the Clinic” Hackathon

March 9-11, 2022, hosted by CMU Libraries (stay tuned for more details!)

We also recommend keeping an eye on future events and initiatives hosted by the DEMON network, an international network for applying data science and AI to dementia!

Keep an eye on this link for more information about these and other events: