Addressing the Complex Storage and Archival Needs of DNA Sequencing Data

DNA Data Archive

Computational biologist Michael Schatz, an expert in large-scale DNA analysis, estimates that by 2025 we will have amassed between 100 million and 2 billion sequenced human genomes.1 That is a massive amount of data to put to work improving human health, but there is a catch: we need creative solutions for storing it. These solutions must be economical and must allow rapid retrieval when the data are needed for analysis.

Cloud providers such as Amazon Web Services (AWS) and Microsoft Azure typically use a tiered pricing structure: data that are accessed frequently command a higher storage fee than data that are accessed infrequently and placed in cloud archives, or cold storage.
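To make the effect of tiered pricing concrete, here is a minimal sketch of the arithmetic. The per-terabyte prices are hypothetical placeholders chosen for illustration, not actual AWS, Azure, or DNAnexus rates, and the sketch ignores retrieval fees and minimum storage durations that real archive tiers may impose.

```python
# Hypothetical prices in USD per terabyte per month (illustration only,
# not real provider rates).
HOT_PRICE_PER_TB = 20.0
COLD_PRICE_PER_TB = 1.0


def monthly_storage_cost(total_tb, archived_tb,
                         hot_price=HOT_PRICE_PER_TB,
                         cold_price=COLD_PRICE_PER_TB):
    """Return the monthly storage bill when part of the data is archived.

    total_tb    -- total amount of data stored, in terabytes
    archived_tb -- portion of that data moved to cold storage, in terabytes
    """
    if archived_tb < 0 or archived_tb > total_tb:
        raise ValueError("archived_tb must be between 0 and total_tb")
    live_tb = total_tb - archived_tb
    return live_tb * hot_price + archived_tb * cold_price


if __name__ == "__main__":
    everything_live = monthly_storage_cost(total_tb=100, archived_tb=0)
    mostly_archived = monthly_storage_cost(total_tb=100, archived_tb=80)
    print(f"All data live:     ${everything_live:,.2f} per month")
    print(f"80 TB in archive:  ${mostly_archived:,.2f} per month")
```

Under these assumed prices, moving 80 of 100 terabytes to cold storage reduces the monthly bill from 2,000 to 480 dollars. The real saving depends on the provider's actual rates and on how often archived data must be retrieved.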

At DNAnexus, we are committed to supporting the sequencing community with creative solutions, which is why we are proud to announce the upcoming rollout of our new archival service.

The DNAnexus archival service provides a cost-effective and secure way to store files that do not need to be accessed frequently. More importantly, even after files are moved to cold storage, the DNAnexus Platform continues to maintain their data provenance and keeps their meta-information, such as tags and property key-value pairs, searchable. With the DNAnexus archival service, you can locate files, whether they are archived or live, simply by querying their meta-information.

With this feature, you can archive individual files, folders, or entire projects. You can also easily unarchive one or more files, folders, or projects when you need to make the data available for further analyses.

Currently, the DNAnexus archival service is available only via the application programming interface (API), only in AWS regions, and only with a license.
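Because the service is driven through the API, an archive request comes down to a small JSON payload naming the files or folder to archive. The sketch below only builds such a payload and makes no network call. The helper name `build_archive_request` is hypothetical, and the field names `files`, `folder`, and `recurse` are assumptions that should be checked against the DNAnexus API documentation before use.

```python
import json


def build_archive_request(file_ids=None, folder=None, recurse=True):
    """Build a JSON-serializable payload for an archive request.

    Hypothetical helper: the field names below are assumptions about the
    request format, not a confirmed description of the DNAnexus API.
    As a simplification, this sketch requires exactly one of `file_ids`
    (individual files) or `folder` (a folder path within the project).
    """
    if (file_ids is None) == (folder is None):
        raise ValueError("provide exactly one of file_ids or folder")

    if file_ids is not None:
        # Archive an explicit list of files.
        return {"files": list(file_ids)}

    # Archive a folder, optionally including its subfolders.
    return {"folder": folder, "recurse": bool(recurse)}


if __name__ == "__main__":
    # Placeholder IDs and paths, for illustration only.
    by_file = build_archive_request(file_ids=["file-xxxx", "file-yyyy"])
    by_folder = build_archive_request(folder="/raw_reads", recurse=True)
    print(json.dumps(by_file))
    print(json.dumps(by_folder))
```

With a licensed account, a payload like this would be sent to the project's archive route through an API client. That step is deliberately left out here because it depends on your credentials and on the exact request format in the official documentation.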

  • To learn more about archiving and unarchiving, click here.
  • To request a license to use this feature, contact sales@dnanexus.com

1. Fleishman G. The Data Storage Demands of Genome Sequencing Will Be Enormous. MIT Technology Review. https://www.technologyreview.com/s/542806/how-do-genome-sequencing-centers-store-such-huge-amounts-of-data/. Published October 26, 2015. Accessed October 3, 2019.