B3737 - The CIVIC project: Predicting Covid-19 Impact on Vulnerable Individuals and Communities via Health and Loyalty Card data - 19/03/2021

B number: 
B3737
Principal applicant name: 
James Goulding | University of Nottingham (United Kingdom)
Co-applicants: 
Prof Nic Timpson, Prof Markus Owens, Dr Anya Skatova, Dr John Harvey, Prof Chris Starmer, Prof Andrew Smith
Title of project: 
The CIVIC project: Predicting Covid-19 Impact on Vulnerable Individuals and Communities via Health and Loyalty Card data
Proposal summary: 

Diseases such as COVID-19 have not been defeated, and there remains an urgent need for improved modelling of disease incidence to support future early-warning systems, improved prevalence estimation, and understanding of long-term impacts on vulnerable communities. Finding data to underpin such analyses, however, is extremely difficult. Much COVID-19 incidence goes completely unreported; even the largest studies cover only tiny fractions of the population and are biased towards specific demographics; and generalized studies must rely on tip-of-the-iceberg medical statistics (ONS, NHS), fine-grained but unsustainable self-reporting technologies (e.g. KCL’s COVID Symptom Study app), or broad-brush behavioural data (e.g. Google COVID-19 mobility reports, social media data). With a lack of widespread adoption of track-and-trace systems in the UK, and (understandably) declining public engagement with self-reporting initiatives, new approaches are urgently required.

A solution is available, however. Epidemics such as COVID-19 are now well recognized as being driven as much by behavioural factors as by clinical ones, and those behavioural factors are richly embedded in the mass, anonymized retail transaction logs held (untapped) in the datasets of CIVIC's private-sector partners. By interrogating such data (e.g. health purchases from the UK's leading health retailer), CIVIC will address key epidemiological knowledge gaps by: determining dynamic estimates of untested COVID-19 cases via fine-grained pharmacy and self-medication datasets; advancing AI and modelling knowledge by triangulating behavioural features, embedded in more than 1.5 billion loyalty-card logs, that can act as predictors of future outbreaks; and advancing the state of the art in identifying and mapping key vulnerable communities across the UK (Olio, Fareshare, ONS).

In such modelling, "ground-truth" labelling of target (dependent) variables is crucial. In Stage 1 of CIVIC, analysis will occur at the aggregated LSOA geographical level, with data points labelled with (a) time series of recorded incidence in each LSOA (NHS) and (b) time series of relevant 111-related activity. In Stage 2, however, CIVIC will also investigate the potential of "data linkage", an approach that could underpin finer-grained, individual-level analyses at scale. Such processes must embed privacy-by-design, and be underpinned by informed and participatory consent, if they are ever to contribute to public health.

Impact of research: 
The aim, and target impact, of this research is threefold: 1. To establish the potential of behavioural signals (held in mass, readily anonymizable transactional datasets) to shed light on public health risks such as COVID-19, and to underpin new forms of syndromic surveillance tools that provide early-warning capabilities at scale, sustainably, and without reliance on self-reporting apps. 2. To uncover potentially hidden impacts of COVID-19 on vulnerable and under-examined communities (e.g. BAME communities, areas of food poverty and deprivation). 3. To demonstrate the importance of transparency, security and privacy-by-design in data linkage systems, through open frameworks connecting key stakeholders: health services, researchers, private-sector data holders, and individuals and communities themselves. We expect the programme's work to have a direct impact on practice at our partners: the NHS, ONS, the Joint Biosecurity Centre and Boots Pharmacy.
Date proposal received: 
Thursday, 11 March, 2021
Date proposal approved: 
Thursday, 11 March, 2021
Epidemiology, Infection, Statistical methods, Linkage