B1338 - Predicting phenotypic outcomes of genetic variants - 29/03/2012

B number: 
B1338
Principal applicant name: 
Dr Julian Gough (University of Bristol, UK)
Co-applicants: 
Prof George Davey Smith (University of Bristol, UK)
Title of project: 
Predicting phenotypic outcomes of genetic variants
Proposal summary: 

AIMS

The aim of this project is to test and develop a proof of principle for a molecular-based bioinformatics approach to phenotype prediction.

HYPOTHESIS

The hypothesis is that single nucleotide variations in humans that lead to non-synonymous changes in protein-coding genes will affect the phenotype of the individual, and that it is possible to predict this effect from knowledge of the protein.

METHODOLOGY

Using bioinformatic techniques, we can already predict with a reasonable degree of success whether a mutation is likely to have a phenotypic impact. We compare the non-synonymous mutation to the wild type using hidden Markov models trained on all proteins of known structure. Specifically, we compare the emission probabilities that the hidden Markov model assigns to the wild-type and mutated amino acids; a large difference indicates that the change will not be well supported by the structure and function of the protein. In addition, independent work has shown that the function of an unknown protein can be predicted in most cases (this is the field of protein function prediction). As well as gene function, we are able to associate phenotypic outcomes (from, e.g., knockouts in mice or known disease-causing mutations in humans) with many proteins. The approach we wish to test on the ALSPAC data combines the likelihood that a mutation has a phenotypic impact with these phenotype and gene function associations. We wish to consider not only single point mutation outcomes but also multi-locus variations which, although individually they may not have a significant predicted phenotypic outcome, could when taken together across the genome have a summed effect that is measurable in the phenotype.
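
As an illustration only, the emission-probability comparison and the summed multi-locus effect described above might be sketched as follows. This is not the applicants' implementation: the emission values, the gene names, the probability floor, the choice of a log ratio as the measure of difference, and the weighted sum used to combine loci are all assumptions made for the example.

```python
import math

# Hypothetical emission probabilities for a single HMM match state.
# The values are illustrative and do not come from any real model.
EMISSIONS = {"L": 0.40, "I": 0.25, "V": 0.20, "D": 0.01}


def impact_score(emissions, wild_type, mutant, floor=1e-4):
    """Log ratio of wild-type to mutant emission probability at one position.

    A large positive value means the model supports the mutant residue far
    less than the wild type. The floor (an assumption) avoids dividing by
    zero for residues missing from the table.
    """
    p_wt = max(emissions.get(wild_type, 0.0), floor)
    p_mut = max(emissions.get(mutant, 0.0), floor)
    return math.log(p_wt / p_mut)


def summed_phenotype_score(variants, gene_weight):
    """Combine per-variant scores across loci for one phenotype.

    variants: iterable of (gene, impact_score) pairs for one individual.
    gene_weight: hypothetical strength of each gene's association with the
    phenotype; genes without an entry contribute nothing. Negative scores
    (mutant better supported than wild type) are clipped to zero.
    """
    return sum(
        gene_weight.get(gene, 0.0) * max(score, 0.0)
        for gene, score in variants
    )


if __name__ == "__main__":
    conservative = impact_score(EMISSIONS, "L", "I")  # similar residues
    radical = impact_score(EMISSIONS, "L", "D")       # poorly supported residue
    print(f"L->I: {conservative:.2f}, L->D: {radical:.2f}")

    variants = [("GENE_A", conservative), ("GENE_B", radical)]
    weights = {"GENE_A": 0.5, "GENE_B": 0.25}  # hypothetical associations
    print(f"summed score: {summed_phenotype_score(variants, weights):.2f}")
```

In this sketch, several variants that each score modestly can still add up to a noticeable total, which is the multi-locus effect the proposal aims to test against measured phenotypes.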

DATA

The objective of this project is not to detect new associations but to seek a proof of principle that some phenotypic information may be inferred from molecular data alone. We therefore require genome sequence information for as many individuals as possible, together with all associated phenotype data, for the purpose of validating predictions.

Date proposal received: 
Thursday, 29 March, 2012
Date proposal approved: 
Thursday, 29 March, 2012
Keywords: 
Genetics
Primary keyword: