The Monthly Researcher Spotlight is our section highlighting the exciting work of the UK Biobank Research Analysis Platform user community. If you would like to be featured, email email@example.com.
This was simultaneously published in the March 2023 UK Biobank RAP Newsletter. You can sign up for future installments here.
This month's Spotlight features Simone Rubinacci, a member of the team led by Olivier Delaneau that used SHAPEIT5 to phase the 200k WGS samples from the UK Biobank. You can find more information on the data release and view the recording of him discussing his work at our April Community Meetup. You can also read his recent publications on imputation and rare variant phasing using the UK Biobank WGS data.
Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School
What are the focus and discovery highlights of your research?
My research has focused on developing methods to retrieve information from noisy whole-genome sequencing or SNP array data, often involving haplotype phasing. As an example, I worked on genotype imputation of low-coverage sequencing data, developing software called GLIMPSE that addressed a computational constraint in the field. This is what I am passionate about: producing scalable methods that work on the largest genetic datasets to take advantage of the big data revolution we are experiencing in human genetics.
What are some of the key questions that you are looking to answer using UK Biobank data?
The whole-genome sequencing data of the UK Biobank offers an unprecedented opportunity to look at the effect of extremely rare variants on complex diseases. Also, the extraordinary accuracy of haplotype phasing methods gives us the opportunity to make exciting new discoveries, such as studying compound heterozygous and parent-of-origin effects for hundreds of traits.
How has the UK Biobank Research Analysis Platform (UKB-RAP) helped you perform your research?
At first, I was hesitant to move to a cloud platform, as even a small mistake can lead to an expensive bill for the lab. However, as a more experienced user, I think the UKB-RAP is often a better alternative and offers new opportunities compared to local clusters. For example, a single WDL file shared between many users can be used as a “personal imputation server”. Overall, the UKB-RAP offers a flexible and reliable environment for genetic research.
Any tools or tutorials that you have developed that would be useful for the UKB-RAP community?
My colleagues at the University of Lausanne and I recently developed a tool for phasing the entire UK Biobank, called SHAPEIT5. This tool is designed to obtain fast and accurate phasing for rare variants. Another piece of software I have been working on recently is GLIMPSE2, which can perform low-coverage WGS imputation from the phased UKB resource for less than 0.1 GBP per genome. These tools have been fully integrated into the UKB-RAP, alongside a curated release of the data (you can find tutorials on how to dx compile the applets and run the whole process here).