For the public
What are cohorts in health research?
When a group of people – usually patients with a particular condition – agree to be the subjects of an investigation aimed at producing some improvement in health-care, they are called a research cohort. Personal information about the cohort members and how they react during the research project is gathered and then analysed to produce the results of the research.
This information is nowadays stored electronically – in what is known as a data-set. It will be maintained and handled by a data-set curator or manager.
Because data-sets are “portable” – easily transmitted from one place to another – the possibility arises for using them in other research projects by uniting them with data-sets from elsewhere. It is this fact that is opening up an enormous new field of potential for data-research in health and medicine.
To achieve a linkage between different sets of data, and to organise them in the way that a new project needs, there must be a sophisticated facility for storing and handling the data-sets; this is known as a Trusted Research Environment (TRE).
What is a Trusted Research Environment?
Trusted Research Environments (TREs) are safe, secure computing environments where researchers, scientists and other experts can, after a rigorous approval process, gain access to valuable data for research that is in the public interest and aims to improve society.
TREs bring great benefits to the research community and, ultimately, to the public. TRE-enabling technology allows for deep insights and new discoveries by using incredibly rich health or administrative data. TREs allow researchers to work together on society’s biggest challenges by affording them a safe collaborative space with all the software and resources they need.
More information on what a TRE is can be found here: What is a TRE? – SAIL Databank
What is the UK Clinical Cohorts (UK CliC) TRE?
The BHF Data Science Centre is working with SAIL at Swansea University to create a new TRE for UK-wide research in cardiovascular disease, known as UK CliC. It will provide storage for cohort data-sets and the tools for linkage between them and with existing NHS data, in a secure environment. Researchers will have secure, remote access to the data.
This will enable cohort data-sets to reach their full research potential, reducing current inefficiencies, achieving economies of scale and accelerating research outputs. The TRE could be extended to support research in a range of other diseases.
More information on data linkage can be found here: Data linkages: explore the evolution of healthcare records in research (youtube.com)
What assurances can be given on the security of the Trusted Research Environment?
The TRE has adopted the Five Safes Framework, which allows TRE operators to provide safe access to data.
A description of the five safes is below and further information can be found here: Health data research explained – HDR UK
- Safe People: can the data user be trusted to use the data in an appropriate manner? Do the researchers have the knowledge and skills to act in accordance with the required standards of behaviour?
- Safe Projects: is the use of this data appropriate, lawful, ethical, and sensible? Is the project expected to deliver public benefit?
- Safe Settings: does the tool that the researcher is using to access the data prevent unauthorised use or mistakes? Are there controls on the way the data is accessed, both from a technology perspective and considering the physical environment?
- Safe Data: is there a disclosure risk in the data itself? Has the data been treated appropriately to minimise the potential for identification of individuals or organisations?
- Safe Outputs: do the results of the research using the data prevent someone from identifying individuals from the data? What can be done to minimise risk when releasing the findings of the project?
How will the TRE operate?
We are working with the Secure Anonymous Information Linkage Service (SAIL), which is part of Swansea University. We have partnered with SAIL because of their track record of developing Trusted Research Environments. The SAIL TRE is ISO 27001 certified and UK Statistics Authority accredited; these accreditations are designed to test and give confidence in the security of IT systems.
SAIL has a well-established governance and regulatory framework surrounding the use of person-based data. SAIL Databank reduces the risk to researchers associated with collecting, storing and analysing sensitive data. For more information on SAIL, please visit their website here: Home – SAIL Databank
Have patient/public groups been involved in the development of the UK Clinical Cohorts (UK CliC) TRE?
Yes, the BHF Data Science Centre has a Patient and Public Involvement and Engagement (PPIE) group that has co-developed the UK CliC TRE. This group meets on a regular basis and provides feedback on the aims, objectives and development of processes. SAIL also has a PPIE group that advises on the development of the SAIL infrastructure.
Has permission been sought from cohort participants?
Yes, all of the cohorts being included within the UK CliC TRE have asked their participants for permission for their data to be linked to routinely collected NHS information and other data sources.
Who will be given access to the data?
Only researchers associated with a bona fide research institution and with valid research objectives will be granted access to the data held on the TRE. Researchers will apply for access through the SAIL Information Governance Review Panel (IGRP) process. More information on this process can be found here: Apply to work with the data – SAIL Databank
Will my data be sold to external companies for profit?
No. Under no circumstances will the data held on the TRE be sold for profit. The UK CliC TRE will operate on a cost-recovery basis, meaning we charge users only what it costs to run the TRE.
For cohort data-set owners and managers
What is UK CliC?
The UK Clinical Cohorts (UK CliC) TRE aims to provide clinical cohort owners with a Trusted Research Environment where they can collect, deposit, link and share their cohort data. The BHF Data Science Centre will coordinate the linkage process, including submitting applications to Data Custodians, and will provide expert Data Scientist support in curating any newly linked datasets. Cohort data held in the UK CliC TRE will be made available to third-party researchers through an application process.
Who hosts the UK CliC TRE?
Swansea University has developed the United Kingdom Secure eResearch Platform (UKSeRP) technology, a software service that improves researcher access to large-scale and complex datasets.
UKSeRP has been developed based on experience in the use of the SAIL Gateway and provides a secure environment that allows research groups to conform to best practices in data management, security and governance.
The system delivers remote access to a large-scale IT infrastructure together with standard and bespoke analytical tools.
What are the benefits of sharing my data on the UK CliC TRE?
Administrative/Management and Governance support
The platform will provide research management, governance and data scientist input to your project. This includes providing oversight of the process for applying for access (for example through the NHS England DARS process).
Data collection module
The platform will have a dedicated area (called the Garden Shed) to collect and manage data prior to it being loaded.
Reduced costs
We anticipate economies of scale. Rather than multiple access applications for any one project, there will be one application, resulting in lower costs.
Access to BHF DSC Data Science team (for support during your research project)
Data access/disclosure control for external researchers
The SAIL Information Governance Review Panel (IGRP) process is well established and will provide a mechanism to share data with other researchers. This means that important data can be used beyond the purpose for which it was originally collected.
What are the governance considerations of sharing my data?
As the UK CliC TRE is part of the SAIL Databank, it will adhere to the governance structure of SAIL. SAIL is an ISO 27001 certified and UK Statistics Authority accredited environment which navigates the rigorous legal and regulatory framework surrounding the use of person-based data. SAIL Databank reduces the risk to researchers associated with collecting, storing and analysing sensitive data.
Cohort owners are required to liaise with their study sponsor and research governance team to ensure the transfer of data to the TRE is in line with the governance arrangements for the data. A data sharing agreement will be shared upon request, which outlines the governance responsibilities of each party.
How much is it to house my data on the platform?
The BHF DSC, HDR UK and SAIL all work on a not-for-profit basis, and so will the platform. A breakdown of charges will be provided at the scoping phase of onboarding.
Charges will cover:
- BHF DSC Data Scientist support time (if required)
- Base costs for access
- Project set-up costs
- Disclosure control
- Data loading (for a new dataset)
- SAIL analytical services
- Data refreshes (when updates of routinely collected NHS data become available)
- Data costs (including NHS England data, Public Health Scotland data and any other datasets that are requested)
- Project Management and Governance
- “Cohort Only Area” (also known as the Garden Shed) costs
Will I be able to review requests for access to my cohort data?
Yes. Once your data is on the platform, it will be advertised on the SAIL asset register. You can decide whether you want it to be listed as a Core Restricted or Core asset. For Core Restricted datasets, the original data owner will be consulted as part of the data access request process (known as IGRP).
Can I collect data directly into the UK CliC TRE?
Yes. A dedicated area for data analysis, data collection and data cleaning is to be made available for some cardiovascular and diabetes cohort projects in UKSeRP – this area is known as the Garden Shed.
Not all projects will choose to use this environment; some may have a separate arrangement, e.g. with a Clinical Trials Unit. Cohorts that decide to use this area are likely to need it during the recruitment phase of their study, and possibly longer into follow-up, while cohort research data is collected.
This environment would be for data collected on consented cohort participants only. There would be no opportunity for data to be linked to data from NHS organisations (e.g. NHS Wales or NHS England) within this environment. When a cohort is ready to share its data with external researchers, the data will be migrated to the UK CliC TRE, where it can be linked with NHS data and made available for external researchers to apply for access.
As part of the scoping process, we will be required to review the cohort consent forms and information sheets and potentially other relevant documents (e.g. privacy notices, newsletters/updates provided to cohort participants).
Who else will be able to see my cohort data?
The data you are using for a project will not be available to others until the project ends. It will then be available to other users of SAIL, but you can retain control of who has access.
How do I start the process of being onboarded to the platform?
A summary of the process is outlined below.
Stage 1 – Cohort data owner indicates interest in using the platform through our UK CliC questionnaire found here: British Heart Foundation Data Science Centre Trusted Research Environment: Vanguard Cohort Linkage Survey (surveymonkey.com).
BHF Data Science Centre staff will provide an overview of the aims and objectives of the platform and make sure it is a good fit for both parties.
BHF DSC will provide a welcome pack – which includes an overview document, template costs and a draft data sharing agreement.
Stage 2 – If both parties wish to go ahead with onboarding, undertake a scoping process with SAIL (this ensures that there is consent from the cohort members to perform linkage, clarifies funding arrangements and provides the SAIL team with key information about the dataset being onboarded).
Stage 3 – If the scoping review is satisfactorily concluded, proceed with contractual arrangements. Data-set owners will be required to sign up to a tripartite Data Sharing Agreement with SAIL and Digital Health and Care Wales (DHCW).
Stage 4 – Users set up, file transfer process implemented.
Stage 5 – Data curation and dataset listed on SAIL asset register as requested (i.e., Core Restricted or Core).
Stage 6 – External researchers can apply for access to the dataset through the SAIL IGRP process.
For external researchers
What is stored in the platform and how do I access it?
SAIL Databank has an asset register, which can be found here: https://saildatabank.com/data/explore-the-data/
Cohorts who have housed their data within the UK CliC TRE will be listed on this asset register.
You can apply for access to the datasets through the SAIL Information Governance Review Panel. Information and an enquiry form for this process can be found here: Information Governance – SAIL Databank
Is there a cost for me to access the data held by the platform?
Yes, but the charges are on a cost-recovery basis; there is no profit element. For fuller details, please enquire at bhfdsc@hdruk.ac.uk