Under public health emergencies, and particularly the COVID-19 pandemic, it is fundamental that data is shared in both a timely and an accurate manner. Together with the harmonisation of the many diverse data infrastructures, this makes it more imperative than ever to share preliminary data and results early and often. It is clear that open research data is a key component of pandemic preparedness and response.
In late March 2020, RDA received a direct request from one of its funders, the European Commission, to create global guidelines and recommendations for data sharing under COVID-19 circumstances. Over 600 data professionals and domain experts signed up and began work in early April 2020.
They have produced a rich set of detailed guidelines to help researchers and data stewards follow best practices and maximise the efficiency of their work, and to act as a blueprint for future emergencies. These are coupled with recommendations to help policymakers and funders maximise timely, quality data sharing and appropriate responses in such health emergencies.
On 30 June 2020, RDA published the final version of the RDA COVID-19 Recommendations and Guidelines on data sharing covering four research areas – clinical data, omics practices, epidemiology and social sciences – complemented by overarching areas focusing on legal and ethical considerations, research software, community participation and indigenous data.
Some highlights of the common challenges that emerged from these areas include:
- The unprecedented spread of the virus has prompted a rapid and massive research response with a diversity of outputs that pose a challenge to interoperability.
- To make the most of global research efforts, findings and data need to be shared equally rapidly, in a way that is useful and comprehensible.
- The challenge here, of course, is the trade-off between timeliness and precision. The speed of data collection and sharing needs to be balanced with accuracy, which takes time.
- Archaic information systems and the lack of pre-approved data sharing agreements hinder rapid detection of emerging threats and development of an evidence-based response.
- While the research and data are abundant, multi-faceted, and globally produced, there is no universally adopted system or standard for collecting, documenting, and disseminating COVID-19 research outputs.
- Furthermore, many outputs are not reusable by, or useful to, different communities if they have not been sufficiently documented and contextualised, or appropriately licensed.
- Correspondingly, research software is developed and maintained in an ad hoc fashion. Access to the software developed for analysis is not provided consistently in papers and, where the software is available, it is often placed in arbitrary locations with no guarantee of its persistence.
The report specifically emphasises the importance of the following during the COVID-19 emergency response:
- sharing clinical data in a timely and trustworthy manner to maximise the impact of healthcare measures and clinical research during the emergency response;
- encouraging people to publish their data alongside a paper (particularly important in reference to omics data);
- underlining that epidemiology data underpin early response strategies and public health measures;
- providing general guidelines to collect or link important social and behavioural data in all pandemic studies;
- evidencing the importance of sharing research software alongside the research data it analyses, and providing guidelines and best practices for enabling this;
- offering general guidance to navigate the applicable rule of law and exploit relevant ethical frameworks relating to the collection, analysis and sharing of data in similar emergency situations;
- looking at data management and sharing issues related to the technical, social, legal and ethical considerations from the community participation perspective.
The RDA COVID-19 activities were conducted under the RDA guiding principles of Openness, Consensus, Balance, Harmonisation, Community-driven, Non-profit and Technology-neutral. The results and outputs are open to all.