It’s been an exciting time for DNAnexus since launching our company at the recent Bio-IT World Conference & Expo in Boston. We’ve spoken with many of you about your experience using DNAnexus and received great feedback, much of which is already finding its way into future releases.
There is certainly skepticism about the cloud out there, and plenty of negative experiences. Vivien Marx wrote a great story in BioInform (full disclosure: we were interviewed for the article) that highlights the ongoing debate over cloud computing and gives examples of real problems people in the field have experienced. The challenges of using the cloud are of course not unique to computational biology, and have been discussed for years, for example in this excellent report from the UC Berkeley RAD Lab. The term “cloud” conjures up concerns about data transfer issues, security and control, platform lock-in, difficulty managing amorphous compute resources, the reliability of those resources, and over-crowding.
To address this skepticism, let’s first agree upon what we mean by “cloud,” because the term is used by some to describe anything that runs in your web browser, while to others it’s just a fashionable marketing tool for IT infrastructure. Our definition of cloud is an elastic and scalable infrastructure for compute, storage, and networking. Elastic means that we can grow or shrink our use of those resources at any time. Scalable means there’s always room to grow your infrastructure. These two traits of cloud computing are incredibly powerful. Do you have 100 jobs to run? Launch 100 compute nodes and run them all in parallel. Because you pay for the node-hours you use, you pay about the same as running the jobs serially, but finish in roughly 1/100th the time. Need to store 10 terabytes of data for a 6-month project? No problem, it’s available: just pay for 60 TB-months of storage. And when the day comes that you need to run 10,000 compute nodes or store 10 petabytes of data, you don’t have to worry about building out a datacenter, because the cloud will scale to those levels!
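To make that arithmetic concrete, here is a minimal sketch in Python. It assumes a simple pay-per-node-hour model in which nodes are released as soon as their jobs finish; the function names, the one-hour job length, and the $0.50 hourly rate are hypothetical values chosen for illustration, not actual DNAnexus or cloud provider pricing.

```python
import math


def compute_cost(jobs, hours_per_job, nodes, rate_per_node_hour):
    """Return (wall_clock_hours, total_cost) for running `jobs` on `nodes` nodes.

    Assumption: billing is per node-hour actually used, and every job
    takes the same amount of time.
    """
    rounds = math.ceil(jobs / nodes)       # batches of jobs run back to back
    wall_clock_hours = rounds * hours_per_job
    node_hours = jobs * hours_per_job      # total billed work is the same either way
    return wall_clock_hours, node_hours * rate_per_node_hour


def storage_tb_months(terabytes, months):
    """Storage is billed as capacity multiplied by duration."""
    return terabytes * months


# Hypothetical numbers: 100 one-hour jobs at $0.50 per node-hour.
print(compute_cost(100, 1.0, nodes=1, rate_per_node_hour=0.5))    # (100.0, 50.0)
print(compute_cost(100, 1.0, nodes=100, rate_per_node_hour=0.5))  # (1.0, 50.0)
print(storage_tb_months(10, 6))                                   # 60
```

Under these assumptions the bill is identical in both compute cases, while the wall-clock time drops from 100 hours to 1. Real pricing may add per-node startup overhead or minimum billing increments, so treat the result as an approximation.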
DNAnexus’ use of the cloud mirrors this: we’ve built a web-based platform on top of the cloud to harness its power. All the sequence analysis and data management tools are available to you through your web browser, and we transparently manage all the cloud resources. Moving data around the cloud, figuring out where and how to store it reliably, launching compute nodes and coordinating their work: all this happens below the surface. We present an intuitive interface that removes the challenges of using the cloud while passing through its main benefit, tremendous scalability on demand. Could we build this without the cloud? Yes, but we wouldn’t be able to amortize the infrastructure costs over the thousands of people working with similar data, and we wouldn’t be able to charge you a low per-sample cost.
Take a look for yourself. Sign up for a free account today and tell us what you think. Is the cloud hype? Or is it an innovative approach to next-gen DNA sequence analysis and data management?