Refining GWAS Results Using Machine Learning

Genome-wide association studies (GWAS) give researchers a viable way to identify genetic variants associated with a particular trait. GWAS have already identified single nucleotide polymorphisms (SNPs) associated with diabetes and Parkinson’s disease, among other conditions. However, these comprehensive studies frequently identify large numbers of variants associated with a phenotype, not all of which are causal.

Fine mapping is a statistical process in which additional data are introduced to the GWAS dataset, enabling researchers to prioritize the variants that warrant further examination. It also helps them identify variants that narrowly missed the genome-wide significance threshold but are in fact causal.
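To make the idea of prioritizing variants concrete, here is a minimal sketch of one common Bayesian fine-mapping recipe. It is not necessarily the approach covered in the webinar. It assumes Wakefield-style approximate Bayes factors and at most one causal variant per locus. The variant IDs, effect sizes, standard errors, and the prior variance of 0.04 are all made up for illustration.

```python
import math

def log_abf(beta, se, prior_var=0.04):
    """Log approximate Bayes factor (Wakefield-style) for one variant.

    beta and se are the GWAS effect estimate and its standard error.
    prior_var is an assumed prior variance on the true effect size.
    """
    v = se * se
    z = beta / se
    shrink = prior_var / (v + prior_var)
    return 0.5 * math.log(v / (v + prior_var)) + 0.5 * z * z * shrink

def posterior_probs(log_abfs):
    """Posterior probability that each variant is causal, assuming exactly
    one causal variant in the locus and equal prior odds for every variant."""
    m = max(log_abfs)  # subtract the max before exponentiating for stability
    weights = [math.exp(x - m) for x in log_abfs]
    total = sum(weights)
    return [w / total for w in weights]

def credible_set(variant_ids, probs, coverage=0.95):
    """Smallest set of top-ranked variants whose probabilities reach coverage."""
    ranked = sorted(zip(variant_ids, probs), key=lambda p: p[1], reverse=True)
    chosen, cumulative = [], 0.0
    for variant_id, prob in ranked:
        chosen.append(variant_id)
        cumulative += prob
        if cumulative >= coverage:
            break
    return chosen

# Hypothetical summary statistics for one locus: id -> (beta, standard error)
summary = {"rsA": (0.12, 0.02), "rsB": (0.115, 0.02), "rsC": (0.03, 0.02)}

ids = list(summary)
probs = posterior_probs([log_abf(b, s) for b, s in summary.values()])
cs = credible_set(ids, probs)

for variant_id, prob in zip(ids, probs):
    print(f"{variant_id}: posterior probability {prob:.3f}")
print("95% credible set:", cs)
```

The credible set is the shortlist of variants to examine further. Real fine-mapping tools also account for linkage disequilibrium and for multiple causal variants, which this sketch ignores.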

But fine mapping is easier said than done. For starters, you have to set up the proper computing environment, one that promotes traceability and reproducibility. Both become even more important when you are testing a drug that may enter clinical trials. You also need to assemble the data in the form your fine-mapping algorithms expect, which can be challenging. Then there are the scientific challenges: it’s hard to compare and evaluate models, and there are no frameworks that enable you to interact with the models and improve upon them.

The DNAnexus Platform provides end-to-end support for machine learning. It also enables you to build and deploy models so that domain scientists can ask questions and interact with the models themselves.

Join us for our upcoming webinar, in which we provide an overview of how to refine your GWAS results using fine mapping. Specifically, borrowing from Bayesian statistical methods, we present an interactive approach for applying machine learning-based models in fine mapping. We will demonstrate real-life examples using UK Biobank data on the DNAnexus Platform. Register now.