A fully funded 4-year PhD project with the Wellcome Trust Molecular, Genetic and Lifecourse Epidemiology programme
Supervisors
Dr Anya Skatova (lead), Prof Deborah Lawlor
Read more about how to apply and the PhD programme here
Rationale
Shopping history records collected by supermarkets contain population-level health information that may be missing from traditional health research data such as medical records. For example, shopping transactions can provide granular and objective data on under-reported or unreported behaviours and outcomes in the reproductive health domain (related to pain and weight management, vitamin consumption, infant feeding, etc.) that can be tracked longitudinally. Combining shopping history datasets with epidemiological methods has potential for health research and might improve diagnosis, disease prevention and the planning of interventions.
Aims & Objectives
The aim of the PhD is to explore the potential of shopping history data to identify key reproductive events, and the lifestyle choices around them, in real time. The specific focus of the PhD will be developed by the student, with potential objectives including: (1) assessing how accurately shopping histories identify one or more reproductive events, such as conception, pregnancy, breastfeeding or parenthood; (2) assessing whether shopping histories can identify lifestyle changes around these events, such as changes in diet related to pre-conception, pregnancy and breastfeeding; (3) assessing the extent to which shopping histories enhance repeat data collected in cohort studies. For example, real-time shopping histories might be able to pinpoint the timing of events such as planning a pregnancy and conception, whereas cohort data collected from movement sensors over periods that coincide with these events might better identify changes in physical activity and sleep patterns. The PhD will work with standalone population-level supermarket shopping history data, as well as a subset of shopping history data linked into the Avon Longitudinal Study of Parents and Children (ALSPAC).
Methods
The student will mainly work with shopping history data from a large UK health and beauty retailer, both standalone (>12.5m customers, >1.5 billion transactions) and linked into ALSPAC (for ~1,500 index ALSPAC participants). There is scope for additional new quantitative data collection with ALSPAC participants where it is needed to meet the research aims of the PhD project.
Shopping history data will first be used to identify a reproductive life event of interest (e.g., pregnancy) and a time window associated with it. Products bought during this time window will then be explored. This will make it possible to identify other behaviours (e.g., pain management, fertility issues) and health outcomes (e.g., miscarriages) associated with this life event. Those behaviours and outcomes will then be validated against contextual variables, using data available in ALSPAC (and new data collected through surveys) on the causes and consequences of the life event. The student will be expected to explore the structure of the repeat shopping data and identify appropriate methods for analysing those data. For example, repeat purchases of sanitary products (indicative of menstruation) that change over time might be analysed with multilevel models or structural equation models, depending on the structure of the data and the specific research question.
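The first two steps above (finding an event, then exploring purchases in a window around it) could be sketched roughly as follows in Python. This is an illustration only, not the project's method: the indicator category `prenatal_vitamins`, the 90-day window, the record layout and the toy transactions are all hypothetical assumptions, and a real analysis would need validated event definitions and the retailer's actual data structure.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical choices for illustration only: neither the indicator
# category nor the window length comes from the project description.
INDICATOR_CATEGORY = "prenatal_vitamins"
WINDOW_DAYS = 90


def event_windows(transactions, indicator=INDICATOR_CATEGORY, window_days=WINDOW_DAYS):
    """Map each customer to a (start, end) window around their first indicator purchase."""
    first_seen = {}
    for customer_id, purchase_date, category in transactions:
        if category == indicator:
            earliest = first_seen.get(customer_id)
            if earliest is None or purchase_date < earliest:
                first_seen[customer_id] = purchase_date
    delta = timedelta(days=window_days)
    return {cid: (d - delta, d + delta) for cid, d in first_seen.items()}


def purchases_in_window(transactions, windows):
    """Count product categories bought inside each customer's event window."""
    counts = Counter()
    for customer_id, purchase_date, category in transactions:
        window = windows.get(customer_id)
        if window is not None and window[0] <= purchase_date <= window[1]:
            counts[category] += 1
    return counts


# Made-up toy records: (customer_id, purchase_date, product_category).
transactions = [
    ("c1", date(2020, 1, 10), "pain_relief"),
    ("c1", date(2020, 3, 1), "prenatal_vitamins"),
    ("c1", date(2020, 4, 15), "pain_relief"),
    ("c1", date(2021, 6, 1), "pain_relief"),   # outside the window
    ("c2", date(2020, 2, 2), "pain_relief"),   # customer with no indicator purchase
]

windows = event_windows(transactions)
counts = purchases_in_window(transactions, windows)
print(counts.most_common())
```

Using a single indicator purchase to date an event is a deliberate simplification here; in practice the choice of indicator and window would itself be something to validate against the linked ALSPAC data.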
This is a data-intensive quantitative PhD. The successful candidate will be expected to have experience of statistical analysis from their first degree, to be competent in handling complex large-scale data, and to be eager to learn new quantitative methods and/or new topic areas within a multidisciplinary team. Depending on their previous experience, the successful candidate will receive training in epidemiology, survey design and data collection, advanced statistical methods, and data science, including the ethics, governance, management and use of data, through the completion of the research project and through postgraduate short courses.