B2063 - Assessing linkage error and bias between ALSPAC and HES - 15/08/2013

B number: 
B2063
Principal applicant name: 
Mr Andy Boyd (University of Bristol, UK)
Co-applicants: 
Prof Harvey Goldstein (University of Bristol, UK), Prof John Macleod (University of Bristol, UK)
Title of project: 
Assessing linkage error and bias between ALSPAC and HES.
Proposal summary: 

Longitudinal studies are making increasing use of routine health and administrative data as a means of informing missing data techniques and sustainable data collection. These advantages depend on the accurate interpretation of the linkage. Links between an individual and their routine records are established by comparing personal identifiers common to both datasets. How accurately this can be done depends on the choice and application of the linkage algorithms and on the quality and discriminatory potential of the available identifiers. Recent work by Goldstein, Harron and Wade (2012) demonstrated new methods to enhance the efficiency of the linkage process using multiple imputation (MI) techniques. Once the data are linked, the onus is on the study team to document the provenance of the data by describing the linkage methodology and assessing the quality of the linkage at an individual level.
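To illustrate how the discriminatory potential of identifiers affects linkage, the following is a minimal sketch of exact matching on shared identifiers. It is not the algorithm used by any linkage service; the function name, field names (sid, rid, dob, sex, postcode) and all records are hypothetical and invented for illustration.

```python
from collections import defaultdict

def link_records(study, routine, keys):
    """For each study participant, list the routine record ids that agree on every key."""
    # Index routine records by the tuple of identifier values.
    index = defaultdict(list)
    for rec in routine:
        index[tuple(rec[k] for k in keys)].append(rec["rid"])
    # Look up each participant; an empty list means no candidate record was found.
    return {p["sid"]: index.get(tuple(p[k] for k in keys), []) for p in study}

# Invented example data (not real participants or records).
study = [
    {"sid": "A", "dob": "1991-05-02", "sex": "F", "postcode": "BS1 1AA"},
    {"sid": "B", "dob": "1992-01-15", "sex": "M", "postcode": "BS2 2BB"},
]
routine = [
    {"rid": 1, "dob": "1991-05-02", "sex": "F", "postcode": "BS1 1AA"},
    {"rid": 2, "dob": "1991-05-02", "sex": "F", "postcode": "BS9 9ZZ"},
    {"rid": 3, "dob": "1992-01-15", "sex": "M", "postcode": "BS2 2BB"},
]

# Fewer identifiers: participant A has two candidate records (ambiguous).
weak = link_records(study, routine, ["dob", "sex"])
# Adding postcode narrows participant A to a single candidate.
strong = link_records(study, routine, ["dob", "sex", "postcode"])
```

With date of birth and sex alone, participant A matches two routine records and cannot be resolved to a single individual; adding postcode resolves the ambiguity in this toy example.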

Through the Project to Enhance ALSPAC through Record Linkage (PEARL) we are linking the study index children to their secondary health care records, held within the Hospital Episode Statistics (HES) dataset. The accuracy of this linkage is of concern because the personal identifiers held in early HES data (pre-1997) will in some cases lack the discriminatory power to identify a single individual. The NHS Data Linkage Service linked a pilot sample of 3,198 study participants to their 1991-2012 HES records. The linkage algorithm varied depending on the ability of the identifiers to establish a 'true match'.

AIM: To provide evidence of the quality of the linkage between ALSPAC and HES, particularly in terms of population coverage. Ultimately, if the hypothesis is true, to use this evidence to seek HES permissions to alter the linkage methodology, specifically to use the prior-informed imputation techniques proposed by Goldstein. To use ALSPAC data obtained through different channels (data abstraction) as a 'gold standard' to 'validate' the imputed output by replicating a known ALSPAC finding (this will be subject to a new research proposal with standard data access conditions).

HYPOTHESIS: That the linkage variables used by HES to conduct the match are insufficient to identify all possible ALSPAC records.

EXPOSURE VARIABLES: The administrative linkage data supplied to the NHS, residential address history (specifically whether participants lived within England & Wales or not), self-reported hospital admissions (including cause, date and length of stay).

OUTCOME VARIABLES: Linkage status, delivery location and date, birth weight, gestational age.

CONFOUNDING VARIABLES: These relate to the accuracy of the linkage variables: enrolment details (i.e. for new cases we won't have postcode at delivery) and indicators of address quality (participation status at delivery, home movement, house tenure, known birth outcome).
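One simple way to look for linkage bias across the variables listed above is to compare the proportion of participants linked within each stratum of a characteristic. The sketch below assumes a hypothetical record layout with a boolean "linked" field and a hypothetical grouping field "postcode_at_delivery"; the data are invented and do not represent any ALSPAC or HES results.

```python
def linkage_rate_by(records, group_key):
    """Return the proportion of linked records within each value of group_key."""
    totals, linked = {}, {}
    for r in records:
        g = r[group_key]
        totals[g] = totals.get(g, 0) + 1
        linked[g] = linked.get(g, 0) + (1 if r["linked"] else 0)
    return {g: linked[g] / totals[g] for g in totals}

# Invented example data (not real participants).
participants = [
    {"linked": True, "postcode_at_delivery": True},
    {"linked": True, "postcode_at_delivery": True},
    {"linked": False, "postcode_at_delivery": True},
    {"linked": True, "postcode_at_delivery": True},
    {"linked": False, "postcode_at_delivery": False},
    {"linked": True, "postcode_at_delivery": False},
]

# Compare linkage rates for participants with and without a postcode at delivery.
rates = linkage_rate_by(participants, "postcode_at_delivery")
```

A marked difference in linkage rate between strata would suggest that linkage success depends on that characteristic, which is the kind of pattern the proposal sets out to assess; a real analysis would also need to account for sampling variation and the other variables listed.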

Date proposal received: 
Thursday, 8 August, 2013
Date proposal approved: 
Thursday, 15 August, 2013
Keywords: 
Data Linkage
Primary keyword: 
Methods