What do we mean by ‘defining disease’ and by ‘phenotypes’?
A phenotype is a characteristic of an individual which can be seen or measured. Phenotypes can include diseases (e.g. type 2 diabetes or heart failure), blood measurements (e.g. cholesterol), lifestyle risk factors (e.g. whether someone smokes or consumes alcohol), physiological measurements (e.g. blood pressure, body mass index), procedures (e.g. coronary artery bypass graft) or prescriptions of particular medications (e.g. statins).
Phenotypes can be used to identify and differentiate one person, or set of people, from another in research studies, for example to define a set of people with a particular disease.
A computable phenotype (also sometimes called a phenotyping algorithm) is a way of finding out if a person has a specific characteristic by analysing information about that person using a computer. To find out if a person has a particular disease or health condition, a computer would analyse the information contained in that person’s health records.
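As a rough illustration of the idea, the sketch below shows how a very simple phenotyping algorithm might check coded health record entries against a list of diagnosis codes. It is a minimal example under stated assumptions, not one of our published definitions: the `RecordEntry` structure, the function names and the sample records are all hypothetical, and the one-item codelist (ICD-10 codes beginning with E11, which denote type 2 diabetes) stands in for the clinically reviewed codelists that a real phenotype would use.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RecordEntry:
    """One coded entry in a person's health record (hypothetical structure)."""
    person_id: str
    code: str          # diagnosis code, here assumed to be ICD-10
    recorded_on: date


# Illustrative codelist only: ICD-10 codes starting with "E11" denote
# type 2 diabetes. A real definition would use a reviewed, fuller codelist.
TYPE_2_DIABETES_PREFIXES = ("E11",)


def has_phenotype(entries, person_id, prefixes=TYPE_2_DIABETES_PREFIXES):
    """Return True if any of this person's entries matches the codelist."""
    return any(
        e.person_id == person_id and e.code.startswith(prefixes)
        for e in entries
    )


def people_with_phenotype(entries, prefixes=TYPE_2_DIABETES_PREFIXES):
    """Return the set of person IDs with at least one matching entry."""
    return {e.person_id for e in entries if e.code.startswith(prefixes)}


# Made-up sample records for two fictional people.
records = [
    RecordEntry("p1", "E11.9", date(2020, 3, 1)),
    RecordEntry("p1", "I10", date(2021, 6, 15)),
    RecordEntry("p2", "I50.0", date(2019, 11, 2)),
]

if __name__ == "__main__":
    print(has_phenotype(records, "p1"))
    print(people_with_phenotype(records))
```

In practice, phenotyping algorithms are usually more involved than this: they may combine several data sources (for example diagnoses, prescriptions and blood measurements) and apply rules about timing, which is one reason sharing and describing them fully matters.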
Professor Spiros Denaxas is the Theme Lead for Defining Disease.
Why are ‘computable phenotypes’ important for cardiovascular disease research?
Computable phenotypes enable researchers to understand and use the detailed and complex information contained in a person’s health records for research. Analysing this data using a computer also enables researchers to look at information from very large numbers of people in an accurate and efficient manner.
Studies of large groups of people provide stronger and more reliable results that can directly inform health and healthcare policy. They also provide information that is more relevant to the entire population, such as identifying different types of cardiovascular disease.
What are we doing?
We want to improve the way that health information is analysed by computers. This will help researchers and clinicians to better understand complex health information, which could have benefits for all research, leading to better care for patients. Computers need to be given instructions to be able to analyse health information; these computational instructions are also called algorithms.
We are supporting researchers to use enormous health data sets by ensuring phenotyping algorithms are available to meet their needs and by providing guidance on phenotyping.
Ensuring phenotyping algorithms are FAIR
Currently, most phenotyping algorithms are not readily accessible because of limited sharing and a lack of defined standards. We want to make sure that phenotyping algorithms are findable, accessible, interoperable (usable across systems), and reusable (FAIR) by everyone.
Following community engagement and workshops to identify researchers’ needs, we have now published a report that includes our recommendations to ensure FAIR phenotyping algorithms are available to meet the needs of the cardiometabolic research community. Our recommendations include that phenotyping algorithms should be made available via a single, centrally accessible repository and should be fully described using the information we outline.
You can find the report HERE and read the related web story HERE.
To make it easier for researchers to combine and compare research studies across countries, we are part of a programme to optimise cardiovascular phenotyping between the major UK and German cohort studies (UK Biobank and NAKO). As part of this, we are leading the UK side of a collaboration to define and compare cardiovascular phenotypes between the UK and Germany and to develop best practices for performing these types of comparisons.
Making available phenotyping algorithms
We are also working with experts such as clinicians, researchers, and data scientists to make available validated phenotyping algorithms.
Many of these algorithms have been created as part of the research projects within CVD-COVID-UK and include phenotypes such as deep vein thrombosis, sudden cardiac death, stroke and cardiac arrhythmias.
These definitions and algorithms are being shared via the HDR UK Phenotype Library. They can then be re-used by other researchers, which stops duplication of effort and makes it easier to reproduce research. You can view all BHF Data Science Centre phenotype definitions available in the Phenotype Library HERE.
Generating systematic information on cardiovascular diseases in the population
The Disease Atlas is an ambitious project involving the generation of systematic, data-driven knowledge across all common and rare diseases. We are working to define and provide information on cardiovascular diseases within the Disease Atlas. We have engaged with clinicians to review key diseases and findings. Using nationwide data on 56 million people, the Atlas is generating novel comparative insights into the health needs of patients, the care provided, and the research that is carried out. We believe that the Disease Atlas may change the way we think about and research diseases, inform policy and practice, and unlock new ways to improve the health of patients and communities.
Areas of work
Find out more about our data-led research.

Whole Population Data
Better use of nationally-collated, structured, coded data: accessing, improving and using linked, national, population-wide health data.

Enhancing Cohorts
Facilitating the linkage of large, ‘omics-rich’ cohorts to electronic health records to better understand the causes of cardiovascular diseases.

Data Enabled Clinical Trials
Supporting the development of efficient, cost-effective trials, using routine health data to recruit and follow patients with cardiovascular conditions.

Imaging
Better use of unstructured data: addressing the challenges of accessing, improving and using unstructured data, for example from cardiac and brain imaging, medical free text and electrocardiograms.

Smartphones and Wearables
Exploring how data from apps and wearables, linked to other health datasets, can inform trajectories of cardiovascular health and disease.

CVD-COVID-UK / COVID-IMPACT
One of seven National Flagship Projects approved by the NIHR-BHF Cardiovascular Partnership, linking population healthcare datasets across the UK to understand the relationship between COVID-19 and cardiovascular diseases.

Diabetes Data Science Catalyst
This exciting partnership between the BHF Data Science Centre, Diabetes UK and HDR UK aims to develop improvements in our understanding of the link between cardiovascular diseases and diabetes.

Stroke Data Science Catalyst
This partnership between the BHF Data Science Centre, HDR UK and the Stroke Association will enable researchers to securely access, link and analyse existing UK health data, speeding up the search for better stroke prevention, treatments and care.

Kidney Data Science Catalyst
This partnership between the BHF Data Science Centre, Kidney Research UK and HDR UK will enable researchers to securely access, link and analyse existing UK health data, speeding up the search for better kidney and cardiovascular disease prevention, treatments, and care.