B2021 - Using linked health and administrative data to reduce bias in observational research - 06/06/2013

B number: 
B2021
Principal applicant name: 
Mrs Rosie Cornish (University of Bristol, UK)
Co-applicants: 
John Macleod (University of Bristol, UK), Prof Kate Tilling (University of Bristol, UK)
Title of project: 
Using linked health and administrative data to reduce bias in observational research.
Proposal summary: 

Aim

The overarching aim is to examine how linked health and administrative data can be used to avoid bias in prospective cohort studies, using the Avon Longitudinal Study of Parents and Children (ALSPAC) as an exemplar. This aim will be addressed using simulation studies and by examining three questions of epidemiological importance:

a) Is breastfeeding associated with IQ at age 15? Linkage to education data (GCSE results) will be used to examine the missingness mechanism for IQ, and may be used in imputation of the missing values.

b) Is smoking in the early teenage years associated with educational attainment at age 16? Data on smoking from the young people's GP records will be used to examine missing data patterns in self-reported smoking and to investigate misclassification. GCSE results from linked educational data will be used as the outcome in this analysis.

c) Is maternal smoking in pregnancy associated with depression at age 17? As with smoking in (b), linkage to relevant data held within GP records will be used to address the objectives below in relation to this outcome.
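The kind of simulation study mentioned in the aim can be illustrated with a minimal sketch. It is not the applicants' method: every number is hypothetical, the outcome is made more likely to be missing when a linked proxy (standing in for something like GCSE results) is low, and a single deterministic regression imputation is used in place of full multiple imputation, which would add random draws and pool across imputed datasets. All function names are illustrative.

```python
import random
from statistics import mean


def simulate(n=20000, effect=6.0, seed=2013):
    """Hypothetical cohort: binary exposure x, outcome y, linked proxy z."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        y = 100 + effect * x + rng.gauss(0, 15)  # outcome (an IQ-like score)
        z = y + rng.gauss(0, 5)                  # linked proxy, known for everyone
        # Assumption: the outcome is far more likely to be observed
        # when the linked proxy is high.
        p_observed = 0.95 if z > 103 else 0.15
        rows.append((x, y, z, rng.random() < p_observed))
    return rows


def group_difference(pairs):
    """Mean outcome among the exposed minus mean outcome among the unexposed."""
    exposed = [y for x, y in pairs if x == 1]
    unexposed = [y for x, y in pairs if x == 0]
    return mean(exposed) - mean(unexposed)


def fit_line(zs, ys):
    """Least-squares intercept and slope for ys regressed on zs."""
    mz, my = mean(zs), mean(ys)
    sxy = sum((z - mz) * (y - my) for z, y in zip(zs, ys))
    sxx = sum((z - mz) ** 2 for z in zs)
    slope = sxy / sxx
    return my - slope * mz, slope


def compare(rows):
    """Return (full-data, complete-case, proxy-imputed) exposure effects."""
    full = group_difference([(x, y) for x, y, _, _ in rows])
    complete_case = group_difference([(x, y) for x, y, _, obs in rows if obs])

    # Imputation model: outcome on proxy, fitted separately per exposure
    # group using only the participants whose outcome was observed.
    models = {}
    for group in (0, 1):
        seen = [(z, y) for x, y, z, obs in rows if obs and x == group]
        models[group] = fit_line([z for z, _ in seen], [y for _, y in seen])

    completed = []
    for x, y, z, obs in rows:
        if obs:
            completed.append((x, y))
        else:
            intercept, slope = models[x]
            completed.append((x, intercept + slope * z))  # predicted outcome
    imputed = group_difference(completed)
    return full, complete_case, imputed


full, complete_case, imputed = compare(simulate())
print(f"full data: {full:.2f}")
print(f"complete cases only: {complete_case:.2f}")
print(f"imputed using linked proxy: {imputed:.2f}")
```

Under these assumptions the complete-case estimate is attenuated, while the estimate that uses the linked proxy lies close to the full-data value. That recovery depends on missingness being driven by the proxy rather than by the unobserved outcome itself.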

Objectives

1. To develop methods for using linked health and administrative data to examine patterns of missing data and model missingness mechanisms in longitudinal studies such as ALSPAC, focussing in particular on outcomes and exposures that are likely to be MNAR (missing not at random).

2. To incorporate linked health and administrative data in multiple imputation models to explore biases introduced by missing data in exposures or outcomes in observational studies.

3. To compare self-reported data in ALSPAC with the equivalent outcomes recorded in linked electronic primary care records (GP data) in order to investigate misclassification in the self-reported outcomes and, in particular, to identify whether this misclassification is differential or non-differential.

4. To develop methods to use both linked data and self-reported data to minimise the impact of measurement error on analyses in observational studies.

As one of the exemplars involves obtaining data on depression from electronic patient records, a further objective is:

5. To devise new algorithms, and modify existing ones, for defining depression from electronic GP data using the information contained within Read codes, and to use these algorithms to estimate the prevalence of depression among ALSPAC teenagers.
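The distinction between differential and non-differential misclassification in objective 3 can be made concrete with a small sketch. It assumes, purely for illustration, that the linked GP record is treated as the reference measure (in practice GP records are themselves imperfect), and the counts, stratum labels and function names below are hypothetical rather than ALSPAC data. The sensitivity and specificity of the self-report are computed within each stratum of the other analysis variable; if they differ materially between strata, the misclassification is differential.

```python
from collections import defaultdict


def accuracy_by_stratum(records):
    """Sensitivity and specificity of self-report against a reference, per stratum.

    Each record is (self_report, reference, stratum), with the first two
    given as booleans. A measure is None when its denominator is zero.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for self_report, reference, stratum in records:
        cell = counts[stratum]
        if reference and self_report:
            cell["tp"] += 1
        elif reference and not self_report:
            cell["fn"] += 1
        elif not reference and not self_report:
            cell["tn"] += 1
        else:
            cell["fp"] += 1

    result = {}
    for stratum, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        result[stratum] = {
            "sensitivity": c["tp"] / positives if positives else None,
            "specificity": c["tn"] / negatives if negatives else None,
        }
    return result


def expand(tp, fn, tn, fp, stratum):
    """Build individual records from the four cells of a 2x2 table."""
    return ([(True, True, stratum)] * tp + [(False, True, stratum)] * fn
            + [(False, False, stratum)] * tn + [(True, False, stratum)] * fp)


# Hypothetical counts: self-reported smoking against the GP record,
# stratified by a binary attainment outcome.
records = expand(18, 2, 76, 4, "high attainment") + expand(12, 8, 76, 4, "low attainment")
result = accuracy_by_stratum(records)
for stratum, measures in result.items():
    print(stratum, measures)
```

In this made-up example specificity is the same in both strata but sensitivity is lower in the low-attainment stratum, which is the pattern that would indicate differential misclassification.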

Exposure variables

Breastfeeding, smoking in pregnancy, early teenage smoking (at 12/13 years) - from ALSPAC and linked GP records

Outcome variables

IQ at 15 years, GCSE results (linked data), depression at 17 years - from ALSPAC and linked GP records

Confounding variables

Maternal and paternal education, family occupational social class, housing tenure, family adversity index (and its individual components), family income, maternal and paternal smoking, maternal and paternal pre- and postnatal depression, parental conflict, parental marital status, maternal age at birth, maternal alcohol intake in pregnancy, family composition.

Date proposal received: 
Tuesday, 28 May, 2013
Date proposal approved: 
Thursday, 6 June, 2013
Keywords: 
Alcohol, Breast Feeding, Depression, Education, Smoking
Primary keyword: