Since the COVID-19 pandemic, there has been rapid progress in making national healthcare data available and accessible for research. Consequently, for the first time we are analysing data from over 65 million patients across the UK to help us understand more about COVID-19. Our analyses have the advantage of covering individuals with and without different health-related problems across all age groups, ethnicities, geographies and socioeconomic settings. Our results will be directly relevant to everyone living in the UK.
However, there are a number of challenges and limitations to using routinely collected healthcare data for research. The problems mainly arise because electronic health records are designed for clinical purposes, and do not necessarily provide an accurate picture of the true health status of all patients at all times. If we do not address these problems properly in the analysis, we will get biased results and draw incorrect conclusions.
We aim to identify the challenges and limitations in the analysis of population-wide healthcare data and to provide solutions that address them. Ultimately, we want to ensure that results arising from population-wide healthcare data are accurately reported.
View this project on the Health Data Research Gateway
Subprojects
CCU005_01: Linked electronic health records for research on a nationwide cohort of more than 54 million people in England: data resource
CCU005_02: Comparisons of study designs in estimating cardiovascular risk associated with COVID-19 vaccination and infection
CCU005_03: Harmonising electronic health records for reproducible research: challenges, solutions and recommendations from a UK-wide COVID-19 research collaboration
CCU005_04: Assessment of serial measures and data coverage in the National Institute for Cardiovascular Outcomes Research (NICOR) cardiovascular disease audits
CCU005_06: Efficient handling of missing values in population-wide e-health records: external validation of cardiovascular risk prediction models with competing risks
CCU005_07: Handling of missing values in whole-population electronic health records: a simulation study
CCU005_08: Assessing the impact of the COVID-19 pandemic on the accuracy, completeness and agreement of stroke cases across a national registry and whole population electronic health records
Outputs
Linked electronic health records for research on a nationwide cohort including over 54 million people in England
- BMJ publication 08/04/21 can be viewed here
- BMJ editorial 08/04/21 can be viewed here and public contributor opinion piece here
- The press release explaining this research can be viewed here
- medRxiv preprint 26/02/21 can be viewed here
- Code and phenotypes used to produce this paper are available in GitHub here
Harmonising electronic health records for reproducible research: challenges, solutions and recommendations from a UK-wide COVID-19 research collaboration
- BMC Medical Informatics and Decision Making publication 16/01/23 can be viewed here
- Preprint 28/10/22 can be viewed here
- Related GitHub repo can be accessed here