The Monthly Researcher Spotlight is our section highlighting the exciting work of the UK Biobank Research Analysis Platform user community. If you would like to be featured, email community@dnanexus.com.
This was simultaneously published in the March 2024 UK Biobank RAP Newsletter. You can sign up for future installments here.
This month's Spotlight features Dr. Gareth Hawkes, a researcher studying how regulatory regions of the genome impact human phenotypes.
Dr. Gareth Hawkes
NIHR BRC Translational Research Fellow
University of Exeter Medical School
What are the focus and discovery highlights of your research?
My research aims to understand how regulatory regions of the genome impact complex, common human phenotypes. Much work has been done to disentangle the biological pathways involved in coding variant associations, but less has been done to understand the non-coding genome because of the huge complexity of annotating and analysing it. The release of whole-genome sequencing (WGS) data for 500,000 participants in UK Biobank, paired with measurements of 3,000 circulating protein levels, has allowed us to identify thousands of non-coding associations, with some effect sizes on par with coding variation. One of our biggest discoveries so far has been to identify an allelic series of rare non-coding variants in an enhancer, which appear to independently increase height by up to 6 cm, next to a gene with no observed coding consequence. Our discoveries are some of the first aggregates of rare non-coding variants identified for common phenotypes.
What are some of the key questions that you are looking to answer using UK Biobank data?
We believe that the UKB WGS data may allow us to at least partially answer the ‘missing heritability’ conundrum, where common-variant imputation and exome sequencing have struggled. Our primary goal would be to find a novel drug target for a common, complex disease, such as obesity, which only becomes clear through regulatory pathways. To this end, we are interested in demonstrating how WGS can be used to fine-map known common-variant loci, 90% of which lie in non-coding regions of the genome.
How has the UK Biobank Research Analysis Platform (UKB-RAP) helped you perform your research?
The UKB-RAP has provided the computational infrastructure to perform whole-genome sequencing analysis on thousands of phenotypes. Cumulatively, these analyses require nearly 30 PB of data; being able to access it almost instantaneously, with no storage cost, is a huge boon to my research. As more and more cohorts come online, and we find ways to interface between them, I can only see the UKB-RAP’s usefulness growing exponentially.
Any tools or tutorials that you have developed that would be useful for the UKB-RAP community?
I would recommend our paper, which describes the impact that genetic variants in the non-coding genome can have on common complex phenotypes. We will shortly be making the computational framework we have developed for non-coding whole-genome analysis available. Stay tuned!