The BHF Data Science Centre has published a new study on how data science can improve our understanding of stroke outcomes and care.
Stroke is a leading cause of death and disability in the UK, affecting over 100,000 people every year. More than 1.3 million stroke survivors live with its lasting effects. High-quality stroke care is essential to give people the best chance of recovery, with prompt diagnosis and treatment crucial to minimise brain damage, enhance recovery and reduce the chance of further stroke.
To improve stroke care, we need to understand more about the people who have a stroke, the care they receive, and how they live after a stroke. Data and analysis are important learning tools to achieve this.
Stroke data in the NHS
Data is recorded in a person’s medical record whenever they interact with the health service, such as during a GP appointment or hospital visit. It is recorded by healthcare professionals for use by other healthcare professionals, and includes information about diagnosis, monitoring, and treatment of disease.
For people affected by stroke, information on their health and care is recorded in GP (primary care) records, hospital admission records and death records. The Sentinel Stroke National Audit Programme (SSNAP) is an audit that captures extra information to measure quality of care in stroke units. The SSNAP dataset provides details that are not available in other NHS records. However, it can miss some people affected by stroke, such as those not admitted to hospital.
Combining data to give a complete picture
The British Heart Foundation Data Science Centre and NHS England have recently worked together to develop techniques to link these datasets in England. This has enabled researchers to develop a better understanding of many health conditions, in order to understand the effect of the COVID-19 pandemic on the health system.
Combining data in this way can provide additional details about strokes, for example by filling in gaps with the results of hospital tests that can determine stroke type.
Carrying out these studies across the entire population is complicated and requires the development of complex computational instructions to efficiently extract, combine, and analyse relevant data across different sources.
What we found
By combining information from different data sources, we were able to capture a more complete picture of the number of strokes occurring. We analysed almost the whole population of England (everyone registered with a GP) and included all strokes reported in hospitals, primary care settings (e.g. GP records) and stroke units from 2020 to 2024. We identified almost twice as many people affected by stroke (over 400,000) as were recorded in the SSNAP dataset alone (220,000).
Using multiple datasets provided more complete information about the type of stroke, with the SSNAP dataset adding details on stroke type that were not available in other datasets. It also allowed us to look at the quality of care and outcomes after stroke.
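To illustrate the idea of counting people across sources, here is a minimal Python sketch. It assumes each dataset has already been reduced to a set of pseudonymised patient identifiers; the function name `combine_sources`, the dataset labels and the identifiers are hypothetical and are not taken from the study, whose actual linkage methods are not described here.

```python
from typing import Dict, Set


def combine_sources(sources: Dict[str, Set[str]]) -> Set[str]:
    """Return every patient ID that appears in at least one source."""
    combined: Set[str] = set()
    for ids in sources.values():
        combined |= ids  # set union, so a person seen in several sources is counted once
    return combined


# Hypothetical, made-up identifiers for illustration only.
sources = {
    "primary_care": {"p01", "p02", "p03"},
    "hospital_admissions": {"p02", "p03", "p04", "p05"},
    "ssnap": {"p03", "p04", "p06"},
}

all_people = combine_sources(sources)
ssnap_count = len(sources["ssnap"])
missed_by_ssnap = all_people - sources["ssnap"]  # people found only in other sources

print(len(all_people), ssnap_count, sorted(missed_by_ssnap))
```

In this toy data the combined count is larger than the audit-only count because some people appear only in GP or hospital records, which mirrors the kind of gap the study describes.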
By linking information on stroke type with the medication dispensed to patients, we were able to assess whether their treatment met guidelines for care after stroke. We found that fewer people start taking lipid-lowering and blood pressure medication after stroke than recommended. These guidelines are important as they have been designed to improve long-term outcomes after stroke.
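A check of this kind could be sketched as follows. This is a simplified Python illustration under stated assumptions, not the study's method: the function `proportion_started`, the drug class labels and all records are hypothetical, and the sketch simply counts anyone with a dispensing on or after their stroke date, without encoding any real clinical guideline or excluding people already on treatment.

```python
from datetime import date
from typing import Dict, List, Tuple

# (patient ID, drug class, dispensing date); all values are made up.
Dispensing = Tuple[str, str, date]


def proportion_started(
    stroke_dates: Dict[str, date],
    dispensings: List[Dispensing],
    drug_class: str,
) -> float:
    """Share of stroke patients with a dispensing of drug_class on or after their stroke date."""
    started = {
        pid
        for pid, cls, when in dispensings
        if cls == drug_class and pid in stroke_dates and when >= stroke_dates[pid]
    }
    return len(started) / len(stroke_dates) if stroke_dates else 0.0


stroke_dates = {
    "p01": date(2021, 3, 1),
    "p02": date(2021, 6, 15),
    "p03": date(2022, 1, 10),
    "p04": date(2022, 9, 5),
}
dispensings = [
    ("p01", "lipid_lowering", date(2021, 3, 20)),
    ("p01", "blood_pressure", date(2021, 4, 2)),
    ("p02", "lipid_lowering", date(2021, 1, 5)),  # before the stroke, so not counted
    ("p03", "lipid_lowering", date(2022, 2, 1)),
]

lipid_share = proportion_started(stroke_dates, dispensings, "lipid_lowering")
bp_share = proportion_started(stroke_dates, dispensings, "blood_pressure")
print(lipid_share, bp_share)
```

Comparing such proportions with what guidelines recommend is what shows where treatment falls short.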
Impact and next steps
Our study shows the benefit of combining data across sources. Currently, information on patients not admitted to stroke units is hidden from national measurements of care quality. By joining up multiple sources, we were able to paint a more complete picture of people affected by stroke and the care they received.
These findings highlight the advantages of using linked datasets to enhance our understanding of conditions and could enable improvements to our health care system, aligning with the key recommendations of the Sudlow review.