
UK Biobank RAP Researcher Spotlight: March 2023

The Monthly Researcher Spotlight is our section highlighting the exciting work of the UK Biobank Research Analysis Platform user community. If you would like to be featured, email

This was simultaneously published in the March 2023 UK Biobank RAP Newsletter. You can sign up for future installments here.

This month's Spotlight features Simone Rubinacci, a member of the team led by Olivier Delaneau that used SHAPEIT5 to phase the 200k WGS samples from the UK Biobank. You can find more information on the data release and view the recording of him discussing his work at our April Community Meetup. You can also read his recent publications on imputation and rare variant phasing using the UK Biobank WGS data.


Simone Rubinacci
Postdoctoral fellow
Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School


What are the focus and discovery highlights of your research?

My research has focused on developing methods to retrieve information from noisy whole-genome sequencing or SNP array data, often involving haplotype phasing. For example, I worked on genotype imputation of low-coverage sequencing data, developing software called GLIMPSE that overcame a computational constraint in the field. This is what I am passionate about: producing scalable methods that work on the largest genetic datasets to take advantage of the big data revolution we are experiencing in human genetics.

What are some of the key questions that you are looking to answer using UK Biobank data?

The whole-genome sequencing data of the UK Biobank offers an unprecedented opportunity to look at the effect of extremely rare variants on complex diseases. Also, the extraordinary accuracy of haplotype phasing methods gives us the opportunity to make exciting new discoveries, such as examining compound heterozygous and parent-of-origin effects for hundreds of traits.

How has the UK Biobank Research Analysis Platform (UKB-RAP) helped you perform your research?

At first, I was hesitant to move to a cloud platform, as even a small mistake can lead to an expensive bill for the lab. However, as a more experienced user, I think the UKB-RAP is often a better alternative and offers new opportunities compared to local clusters. For example, a single WDL file shared between many users can be used as a “personal imputation server”. Overall, the UKB-RAP offers a flexible and reliable environment for genetic research.

Any tools or tutorials that you have developed that would be useful for the UKB-RAP community?

My colleagues at the University of Lausanne and I recently developed a tool for phasing the entire UK Biobank, called SHAPEIT5. This tool is designed to obtain fast and accurate phasing for rare variants. Another tool I have been working on recently is GLIMPSE2, which can perform low-coverage WGS imputation from the phased UKB resource for less than 0.1 GBP per genome. These tools have been fully integrated into the UKB-RAP, alongside a curated release of the data (you can find tutorials on how to dx compile the applets and run the whole process here).

About DNAnexus

DNAnexus, the leader in biomedical informatics and data management, has created the global network for genomics and other biomedical data, operating in 33 countries across North America, Europe, China, Australia, South America, and Africa. The secure, scalable, and collaborative DNAnexus Platform helps thousands of researchers across a spectrum of industries (biopharmaceutical, bioagricultural, sequencing services, clinical diagnostics, government, and research consortia) accelerate their genomics programs.

The DNAnexus team is made up of experts in computational biology and cloud computing who work with organizations to tackle some of the most exciting opportunities in human health, making it easier—and in many cases feasible—to work with genomic data. With DNAnexus, organizations can stay a step ahead in leveraging genomics to achieve their goals. The future of human health is in genomics. DNAnexus brings it all together.