Mechanism allows analysis of sensitive information without compromising data ownership
Africa’s population cohort studies are set to test a novel way of sharing their data: keeping the data at home and bringing the analysis to them.
That’s the idea behind TRE in a Box, a ‘trusted research environment’ about to be trialled in the African Population Cohorts Consortium. The tool is designed to help African cohorts take part in big, international studies while maintaining control of their data and staying on the right side of national laws.
The APCC brings together 67 population cohorts in 21 African countries, supported by a secretariat with a programme manager and a stakeholder engagement officer. TRE in a Box has been developed to overcome data-sharing challenges within the network, Kobus Herbst, co-chair of the APCC interim steering committee, told a 28 May webinar held to update stakeholders on the network’s progress.
“Data sovereignty is a hot topic at the moment in Africa, and I think we are ahead of the curve by establishing the capability of bringing analysis to the data rather than sending data outside of our institutions and our countries,” Herbst told the webinar.
Secure environment
TRE in a Box will give each cohort a secure environment in which approved researchers can analyse data, manage provenance and metadata, and integrate findings across studies, all without data leaving the cohort’s own infrastructure. “You have absolutely full control on what goes in there and what goes out of it,” said Herbst.
A prototype has been developed, he said, and an introductory workshop is planned for August, likely in Nairobi, to give selected cohorts hands-on experience before they begin piloting the system. The long-term ambition is to link these environments to allow federated, multi-cohort analysis.
The push for better data infrastructure is closely tied to the APCC’s scientific plans. A flagship project, Gen-Impact, funded by the Wellcome Trust, will deep-phenotype and genotype 10,000 participants from APCC cohorts covering under-studied African populations, examining how genes and environment interact in the burden of non-communicable and infectious diseases. It will support five sub-studies of about 2,000 participants each, with about US$1 million allocated to each sub-study.
The APCC is also developing work on climate and health and on universal health coverage indicators within cohorts as it continues to build out its data science capacity. Monthly webinars will continue as the data system evolves, and the consortium’s ethics and legal teams will keep liaising with institutions to refine the rules around data visiting and joint projects, the webinar heard.
