
Inside DNAnexus

Product updates, industry insights, opinions and references. From the team powering the Genomics Revolution.

One Simple Solution for Ten Simple Rules

Like many in the systems biology space, we have been fans of Philip Bourne’s Ten Simple Rules articles since the first one, “Ten Simple Rules for Getting Published,” appeared in PLoS Computational Biology in October 2005.

The latest installment is especially near and dear to us at DNAnexus: “Ten Simple Rules for Reproducible Computational Research,” written by Geir Kjetil Sandve, Anton Nekrutenko, James Taylor, and Eivind Hovig. (And edited by Bourne, of course.) The writers begin with the premise that there is a growing need in the community for standards around reproducibility in research, noting that negative trends in paper retractions, clinical trial failures, and papers omitting necessary experimental details have been getting more attention lately.

“This has led to discussions on how individual researchers, institutions, funding bodies, and journals can establish routines that increase transparency and reproducibility,” Sandve et al. write. “In order to foster such aspects, it has been suggested that the scientific community needs to develop a ‘culture of reproducibility’ for computational science, and to require it for published claims.”

The rules begin with the lessons you learned when you got your first lab notebook — “Rule 1: For Every Result, Keep Track of How It Was Produced” — and progress to more complex mandates — “Rule 6: For Analyses That Include Randomness, Note Underlying Random Seeds.”
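To make those two rules concrete, here is a minimal sketch in Python of what following them might look like in a small script. It is our own illustration, not code from the paper: the function names, the bootstrap analysis, the input values, and the seed are all hypothetical, chosen only to show a result being stored alongside the seed and a fingerprint of the input that produced it.

```python
import hashlib
import json
import random
import sys


def run_analysis(values, seed):
    """Bootstrap estimate of the mean, driven by an explicit seed (Rule 6)."""
    rng = random.Random(seed)  # local generator, so no hidden global state
    resample_means = [
        sum(rng.choices(values, k=len(values))) / len(values)
        for _ in range(200)
    ]
    return sum(resample_means) / len(resample_means)


def with_provenance(values, seed):
    """Return the result plus a record of how it was produced (Rule 1)."""
    payload = json.dumps(values).encode("utf-8")
    return {
        "result": run_analysis(values, seed),
        "seed": seed,
        "input_sha256": hashlib.sha256(payload).hexdigest(),
        "python_version": sys.version.split()[0],
    }


if __name__ == "__main__":
    # Hypothetical input data and an arbitrary seed, for illustration only.
    record = with_provenance([2.1, 3.4, 1.9, 4.2, 2.8], seed=20131024)
    print(json.dumps(record, indent=2))
```

Rerunning the script with the same seed and input, on the same Python version, reproduces the same result, and the saved record tells a reader which seed and which input to use.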

What really stood out for us was that all of these guidelines are addressed by best practices in cloud computing. For example, when we built our new platform, we implemented strict procedures to ensure auditability of data — the system automatically tracks what you did to get a result, ensures version control, serves as an archive of the exact analytical process you used, and stores the raw data underlying analyses. Utilizing a cloud-based pipeline also offers true reproducibility because you can always perform the same analysis again (using the specific version of your pipeline) or make your pipeline publicly accessible, granting anyone else the ability to rerun the analysis.

Be sure to check out all 10 rules, and feel free to take a tour of the DNAnexus platform to see how it can help you achieve reproducibility in your own computational research.

About DNAnexus

DNAnexus, the leader in biomedical informatics and data management, has created the global network for genomics and other biomedical data, operating in 33 countries across regions including North America, Europe, China, Australia, South America, and Africa. The secure, scalable, and collaborative DNAnexus Platform helps thousands of researchers across a spectrum of industries (biopharmaceutical, bioagricultural, sequencing services, clinical diagnostics, government, and research consortia) accelerate their genomics programs.

The DNAnexus team is made up of experts in computational biology and cloud computing who work with organizations to tackle some of the most exciting opportunities in human health, making it easier—and in many cases feasible—to work with genomic data. With DNAnexus, organizations can stay a step ahead in leveraging genomics to achieve their goals. The future of human health is in genomics. DNAnexus brings it all together.