The Hundred Year Study? Newborn Sequencing Grants Bring Opportunity for Long-Term Data Analysis

The announcement last month that the National Human Genome Research Institute and National Institute of Child Health and Human Development were awarding $25 million to study genome sequencing for newborns was welcome news to the genomics community, and it will serve as a great opportunity to understand the long-term implications of analysis and storage of DNA data.

After all, in addition to the clinical and ethical implications of conducting genome sequencing from birth, there are a host of logistical questions, including how that data will be managed for the 90 or 100 years that many of these newborns will live.

Projects like these offer an incredible opportunity to think about lifelong, clinically useful genomic data. We anticipate the need for storage and infrastructure that is far more dynamic than what we’re used to today, with our flash drives, hard drives, and DVDs. Consider data gathered just 30 years ago: very little of the media on which it was stored is even readable with current technology. (Anyone remember ZIP drives and floppy disks?) Continuing innovations in media not only render older storage technologies obsolete; all too often, successive generations of media are completely incompatible with each other.

For these new projects that will sequence thousands of individuals from newborns to adults, it’s simply not realistic to expect a team of scientists to manually move data from one type of media to another every few years just to keep these important genome sequences easily accessible. That’s why we think cloud computing will be the best answer for programs like this one. Cloud providers already make it their business to use the latest and greatest technology, and they have entire teams of experts who spend their days making sure data will remain live and readily accessible in the long term.

Cloud computing services can also ensure ongoing clinical compliance and rock-solid security, two critical needs for data sets like these newborn genomes. They can also scale up easily and cost-effectively as demand for newborn genome sequencing soars in the coming years, providing non-redundant, secure, readily accessible resources.

We look forward to seeing the results of these valuable new studies, and to participating in the discussion as the community thinks about best practices for interpreting and managing data that may need to be maintained for a century or more.