B2091 - An investigation into the confounding associated with genome-wide score and methods to minimise its impact - 10/10/2013

B number: 
B2091
Principal applicant name: 
Dr Tauseef Ahmad Khan (University College London, UK)
Co-applicants: 
Dr Nic Timpson (University of Bristol, UK)
Title of project: 
An investigation into the confounding associated with genome-wide score and methods to minimise its impact.
Proposal summary: 

Aim

We plan to investigate the reasons for, and the extent of, confounding of the genome-wide genetic BMI score in ALSPAC, and to examine various means of minimising its impact on any subsequent analysis.

Hypothesis

In an ongoing study of ALSPAC participants (Deanfield London Study), our recruitment is based on extremes of BMI percentages as determined by BMI-associated common genetic variants. We used a genome-wide score that enabled us to reach a difference of 3 BMI units between the top and bottom percentages of BMI categories.

Preliminary data analysis on 200 subjects in the current study suggests that the genetic score has, as expected, separated the population into two distinct BMI categories with a mean difference of 3 BMI units. However, the score itself is not entirely free of confounding by environmental factors, contrary to what was predicted at the start. This is likely due to the use of a genome-wide score, instead of a score from a limited set of strongly associated SNPs, which appears to have re-introduced confounding to some extent, although the reason behind these unexpected properties is unclear.

Methods

We will explore this confounding and aim to answer the following questions.

a) What is the extent of the confounding of the genetic score i.e. what are the main confounders?

b) What is the main driver of this confounding: genuine or horizontal pleiotropy, vertical pleiotropy with SNPs acting on intermediate pathways to BMI, or population stratification?

c) Can this confounding be controlled by adjusting for the main confounders?

d) Is this confounding less than would be expected if the groups were generated by BMI alone? If yes, by how much?

e) In light of the above, what are the limitations of our selected groups with differential BMI for the purpose of studying the main programme question, i.e. adiposity-cardiovascular disease associations?

To answer the above questions we will require extensive data on confounders, the genetic score groups, obesity profiles and cardiovascular risk factors. In brief, the following analyses will be performed.

Using standard regression models we shall investigate whether the BMI genetic score is associated with an extensive list of confounders. We shall use adjustment methods for various confounders to investigate the drivers of confounding and whether their effect can be minimised. Intermediate pathway confounders will be exploited to investigate the presence or absence of vertical pleiotropy.
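
The confounder screen described above might be sketched as follows. This is a minimal illustration in Python on simulated data, not the analysis code of the study; the confounder names (`maternal_education`, `unrelated_trait`), the effect sizes and the sample size are all assumptions made for the example.

```python
import random
from statistics import mean


def ols_slope(x, y):
    """Return (slope, intercept) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def screen_confounders(score, confounders):
    """Regress each candidate confounder on the score; return the slope per name."""
    return {name: ols_slope(score, values)[0]
            for name, values in confounders.items()}


# Simulated (hypothetical) data: a standardised genetic score and two
# candidate confounders, one built to be associated with the score.
rng = random.Random(0)
n = 2000
score = [rng.gauss(0, 1) for _ in range(n)]
confounders = {
    "maternal_education": [0.3 * s + rng.gauss(0, 1) for s in score],
    "unrelated_trait": [rng.gauss(0, 1) for _ in range(n)],
}

slopes = screen_confounders(score, confounders)
for name, slope in slopes.items():
    print(f"{name}: slope per unit of score = {slope:.3f}")
```

In a real analysis each slope would be reported with a standard error and p-value, and the list of candidate confounders would be far longer.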

Employing PCA (principal component analysis) we shall also investigate whether there is structure in the SNP sets used in the score and whether PCA adjustment can remove the confounding effect.
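
One way the PCA adjustment could work is sketched below with NumPy on simulated genotypes. The two subpopulations, the allele-frequency shift, the environmental variable `env` and the 50-SNP score are hypothetical choices made so that population stratification induces a score-environment correlation; none of them come from the study data.

```python
import numpy as np


def residualise(y, covariates):
    """Residuals of y after least-squares regression on covariates plus an intercept."""
    X = np.column_stack([np.ones(len(y)), covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def top_pcs(genotypes, k):
    """Scores on the top-k principal components of the column-centred genotype matrix."""
    centred = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    return u[:, :k] * s[:k]


rng = np.random.default_rng(0)
n, m = 1000, 200                      # individuals, SNPs (assumed sizes)

# Two hypothetical subpopulations whose allele frequencies differ by 0.2.
pop = rng.integers(0, 2, size=n)
f0 = rng.uniform(0.2, 0.6, size=m)
f1 = f0 + 0.2
freq = np.where(pop[:, None] == 0, f0, f1)   # shape (n, m)
G = rng.binomial(2, freq)                    # genotypes coded 0/1/2

# Unweighted score over the first 50 SNPs; environment differs by subpopulation.
weights = np.zeros(m)
weights[:50] = 1.0
score = G @ weights
env = pop + rng.normal(size=n)

pcs = top_pcs(G, k=1)
r_raw = np.corrcoef(score, env)[0, 1]
r_adj = np.corrcoef(residualise(score, pcs), residualise(env, pcs))[0, 1]
print(f"raw r = {r_raw:.2f}, PC-adjusted r = {r_adj:.2f}")
```

Here the score-environment correlation arises only through stratification, so adjusting for the first component should remove most of it. Confounding arising from pleiotropy would not be expected to disappear in the same way.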

Using regression modelling we shall quantify the extent of confounding when conventional BMI is used, and compare and contrast it with the confounding revealed by the genetic score analysis.
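
A simple way to put the two exposures on a common scale is to compare standardised regression slopes, which for a single predictor equal the Pearson correlation. The sketch below uses simulated data in which measured BMI is, by construction, more strongly related to a confounder than the genetic score is; the coefficients 0.1, 0.5 and 0.6 are arbitrary assumptions.

```python
import random
from statistics import mean, pstdev


def standardised_slope(exposure, confounder):
    """Slope from regressing the z-scored confounder on the z-scored exposure.

    With one predictor this is the Pearson correlation coefficient.
    """
    mx, my = mean(exposure), mean(confounder)
    sx, sy = pstdev(exposure), pstdev(confounder)
    n = len(exposure)
    cov = sum((x - mx) * (y - my) for x, y in zip(exposure, confounder)) / n
    return cov / (sx * sy)


rng = random.Random(1)
n = 3000
confounder = [rng.gauss(0, 1) for _ in range(n)]          # hypothetical index
score = [0.1 * c + rng.gauss(0, 1) for c in confounder]   # weakly confounded
bmi = [0.5 * s + 0.6 * c + rng.gauss(0, 1)                # strongly confounded
       for s, c in zip(score, confounder)]

r_bmi = standardised_slope(bmi, confounder)
r_score = standardised_slope(score, confounder)
print(f"BMI vs confounder:   {r_bmi:.2f}")
print(f"score vs confounder: {r_score:.2f}")
print(f"ratio score/BMI:     {r_score / r_bmi:.2f}")
```

The ratio gives one possible answer to "by how much" the score is less confounded than BMI for a given confounder.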

Using dummy data, we shall study the main question of the ongoing study (Deanfield London Study), i.e. the association between visceral adiposity and cardiovascular disease markers. In this way we can inquire into any limitations that adjustment for these confounders would place on subsequent analysis of the real data when it becomes available after study completion.
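
A dummy-data check of this kind could look like the following sketch, which simulates an adiposity-marker association with a known effect and compares unadjusted and confounder-adjusted estimates. It is a deliberate simplification: it ignores recruitment by genetic score group, and the true effect (0.5) and confounder strengths are assumed values chosen for illustration.

```python
import numpy as np


def ols(y, *predictors):
    """Least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(2)
n = 5000
true_effect = 0.5   # assumed effect of adiposity on the marker in the dummy data

confounder = rng.normal(size=n)
adiposity = 0.7 * confounder + rng.normal(size=n)
marker = true_effect * adiposity + 0.8 * confounder + rng.normal(size=n)

unadjusted = ols(marker, adiposity)[1]
adjusted = ols(marker, adiposity, confounder)[1]
print(f"true effect {true_effect}, unadjusted {unadjusted:.2f}, adjusted {adjusted:.2f}")
```

Repeating the simulation with a mismeasured or omitted confounder would show how far adjustment can fall short of recovering the true effect.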

Date proposal received: 
Tuesday, 8 October, 2013
Date proposal approved: 
Thursday, 10 October, 2013
Keywords: 
Genetics, BMI
Primary keyword: 
Methods