B3471 - Using novel data collection approaches to enhance the ALSPAC resource - 20/03/2020

B number: 
B3471
Principal applicant name: 
Louise AC Millard | MRC IEU
Title of project: 
Using novel data collection approaches to enhance the ALSPAC resource
Proposal summary: 

Cohorts like ALSPAC typically collect data on their participants over several years, but because data collection is usually both expensive and burdensome, collection events tend to take place only every few years, each measuring or recording information at a particular point in time (e.g. via questionnaires or clinic visits). Hence, these data contain a limited amount of information on phenotypic variability across the life-course, which restricts the research questions that can be asked using them. There is much more scope to exploit existing and emerging technologies to collect data ‘continuously’ over the longer term in cost-effective and less burdensome ways.

Digital health devices have been successfully used to collect data on specific traits over a number of days (e.g. physical activity measured with accelerometers). However, each device tends to focus on particular traits, so collecting data in this way is expensive (specific devices must be bought to collect specific phenotypes). Many types of phenotypes also do not lend themselves to this type of data collection, in particular those that can (currently) only be collected via self-report. Recent advances in artificial intelligence and voice recognition technologies mean it is now feasible to use voice-based systems to collect self-reported data continuously over several days or weeks in a less burdensome way. However, to date, voice-based data collection has not been used in epidemiology.

A second potentially valuable source of data comes from our pervasive use of the world wide web (the ‘web’). ALSPAC has included items about web use in questionnaires (e.g. “Have you sought help or advice regarding your sex life from the internet in the last year?”), but passively collecting web usage information in ALSPAC using a technological approach, over a potentially long period of time (weeks, months or even years), could provide a very large and currently untapped source of health-related information.

In this study we aim to assess the feasibility and acceptability of a voice-based approach to data collection and of passive collection of web usage data. We then plan to collect these data in ALSPAC participants.

Impact of research: 
To raise the profile of ALSPAC as a leader in ‘deep’, innovative methods of data collection in epidemiological cohort studies, exploiting existing and emerging technologies so that new research questions can be answered. To demonstrate the feasibility and value of these technological approaches for epidemiological research. This will provide novel ‘deep’ data that widen the scope of research questions that can be answered with the ALSPAC resource, furthering understanding of the causes and consequences of traits and disease.
Date proposal received: 
Thursday, 19 March, 2020
Date proposal approved: 
Friday, 20 March, 2020
Statistics/methodology, Statistical methods, Voice-controlled data collection on wearable devices, using systems like Amazon Alexa and the Google Assistant, Technological approach to tracking web usage, Nutrition - breast feeding, diet