B1470 - Investigation of the mapping from genetic markers to facial features - 22/11/2012

B number: 
B1470
Principal applicant name: 
Dr Colin Campbell (University of Bristol, UK)
Co-applicants: 
Prof Stephen Richmond (University of Cardiff, UK), Dr Dave Evans (University of Bristol, UK), Dr Lavinia Paternoster (University of Bristol, UK)
Title of project: 
Investigation of the mapping from genetic markers to facial features.
Proposal summary: 

Aim: We will investigate the mapping from genetic variants to measures of normal facial variation, using data from the Avon Longitudinal Study of Parents and Children (ALSPAC). High-resolution 3D images have been obtained for 4747 children using two laser scanners. The images were merged and aligned, and 22 important facial landmarks were identified. The x, y and z co-ordinates of these landmarks were used to generate 54 3D distances reflecting facial features. Genome-wide single nucleotide polymorphism (SNP) data, covering ~2.5 million genetic markers, are also available for these children.
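As an illustration of how landmark co-ordinates turn into distance features, the following is a minimal sketch, not the pipeline actually used for the ALSPAC images. The landmark names, the co-ordinates and the choice of pairs are all hypothetical. Note that 22 landmarks admit 231 possible pairs, so the 54 distances in the proposal are a selected subset whose definition is not given here.

```python
import math

def distances_for_pairs(landmarks, pairs):
    """Return {(a, b): Euclidean distance} for each requested landmark pair.

    `landmarks` maps a landmark name to its (x, y, z) co-ordinates.
    """
    return {(a, b): math.dist(landmarks[a], landmarks[b]) for a, b in pairs}

# Hypothetical landmark names and co-ordinates, for illustration only.
landmarks = {
    "left_eye": (-30.0, 0.0, 0.0),
    "right_eye": (30.0, 0.0, 0.0),
    "nose_tip": (0.0, -40.0, 30.0),
}

# Hypothetical subset of pairs; the real study selected 54 such distances.
pairs = [("left_eye", "right_eye"), ("left_eye", "nose_tip")]

features = distances_for_pairs(landmarks, pairs)
for pair, value in features.items():
    print(pair, round(value, 2))
```

Each child would then be represented by one such vector of distances, which serves as the continuous-valued target for prediction.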

Dr. Campbell has a long-standing interest in machine learning, and mapping from genetic data to a set of facial feature measures is a classic machine learning problem. We will start with Bayesian automatic relevance determination (ARD) algorithms previously devised by Dr. Campbell and collaborators. These algorithms have an inbuilt mechanism for discarding features (here, genetic variant data) that are irrelevant to the problem at hand. Because the facial features being predicted are continuous-valued, the algorithms derive a regression function. Performance will be evaluated on test data by ranked performance: for each test individual we make a prediction from the input data, compute the distance between this prediction and the facial feature measures of every member of the test group, and record the position of the correct match in the resulting ranked list. For n members of the test set, an average rank of about n/2 indicates null predictive performance. We will experiment with other types of predictor and other feature selection strategies, including predictors that give a posterior spread for the prediction (a mode and a spread for the predicted facial features). We also aim to look at other machine learning tools, for example sparse canonical correlation analysis (sparse CCA), which may indicate correlations between sets of input (genetic) features and facial features.
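The rank-based evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicants' evaluation code: Euclidean distance is assumed as the distance measure, ranks are 1-based, and the toy feature vectors and predictions are invented for the example.

```python
import math

def rank_of_true_match(prediction, test_features, true_index):
    """1-based rank of the true individual when all test members are
    ordered by Euclidean distance from the prediction (1 = closest)."""
    true_dist = math.dist(prediction, test_features[true_index])
    closer = sum(1 for f in test_features if math.dist(prediction, f) < true_dist)
    return closer + 1

def average_rank(predictions, test_features):
    """Mean rank over the test set; predictions[i] belongs to test member i."""
    ranks = [
        rank_of_true_match(p, test_features, i)
        for i, p in enumerate(predictions)
    ]
    return sum(ranks) / len(ranks)

# Toy data: hypothetical two-feature measurements for four test individuals.
test_features = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]

# Predictions that land close to each individual's true measurements.
good_predictions = [(1.1, 0.9), (2.1, 1.9), (2.9, 3.1), (3.9, 4.1)]

# A predictor that ignores its input and always returns the same vector.
constant_predictions = [(0.0, 0.0)] * 4

print(average_rank(good_predictions, test_features))
print(average_rank(constant_predictions, test_features))
```

With 1-based ranks, the uninformative constant predictor averages (n + 1)/2, which approaches the n/2 null level quoted above as n grows. An informative predictor pushes the average rank towards 1.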

Dr. Campbell has already contacted Dr. Lavinia Paternoster (MRC CAiTE centre) and will work closely with her and the rest of the team generating and analysing these data. We envisage only an initial exploratory investigation to determine the level of predictive performance that may be achievable. The size of the input data (~2.5 million genetic markers) may make the anticipated Bayesian ARD algorithms computationally prohibitive, and predictive performance may also prove too low. However, should positive results be obtained, we will communicate them to Dr. Paternoster and collaborators to discuss extensions and any possible scope for publications.
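To give a rough sense of the computational concern, the sketch below estimates the memory needed just to hold the full genotype matrix. It relies on the figures quoted in this proposal (4747 children, ~2.5 million markers). The three storage formats are assumptions made for illustration, and the 2-bit case further assumes hard-called genotypes rather than imputed dosages.

```python
N_CHILDREN = 4747
N_MARKERS = 2_500_000  # "~2.5 million" in the proposal, so only approximate

def matrix_gib(bytes_per_entry):
    """Size in GiB of a children-by-markers matrix at the given entry width."""
    return N_CHILDREN * N_MARKERS * bytes_per_entry / 2**30

dense_float64 = matrix_gib(8)   # one 64-bit float per genotype
int8 = matrix_gib(1)            # one byte per genotype
two_bit = matrix_gib(0.25)      # packed 2-bit codes (assumed hard calls)

print(f"float64: {dense_float64:.1f} GiB")
print(f"int8:    {int8:.1f} GiB")
print(f"2-bit:   {two_bit:.1f} GiB")
```

These figures cover storage only. Any algorithm that maintains per-marker parameters or repeatedly passes over the full matrix adds further cost, which is one reason feature pre-selection may be needed before a Bayesian ARD method is applied.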

Date proposal received: 
Thursday, 22 November, 2012
Date proposal approved: 
Thursday, 22 November, 2012
Keywords: 
GWAS
Primary keyword: