More information
We invite you to read our other webpages about data management:
ZonMw is aware that FAIR data are crucial in realising the ambitions in research and innovation to meet the challenges with respect to health and health care processes. The knowledge we need must therefore be more precise, more personalised and available in a timely manner, and it must be based on data from multiple sources, often in large amounts (i.e. big data).
Realising these ambitions depends strongly on the opportunities to use computational technology and data science on a large scale. The data that we use in research and innovation must therefore be fully understandable for the computer, in other words: data must be ‘machine readable’ or ‘machine actionable’.
The purpose of this section is to provide background information for researchers and data stewards who are active in FAIRifying their data. With the term FAIRification we stress that the creation of FAIR data is a process, in which data gradually become more FAIR. In the end, data are optimally reusable, both by humans and, where possible, by machines, in full compliance with privacy protection regulations (if relevant). FAIRification is important for all types of data, whether they are generated through research, innovation processes, or societal activities.
FAIR data are reusable because they are Findable, Accessible, Interoperable and Reusable, both for humans and for machines (computers). In the ideal situation, data are made readable (or actionable) for machines, and they can be ‘visited’ by an algorithm while the data remain with the data owner. In that case, the acronym FAIR may also be read as ‘Fully (or Federated) AI-Ready’. Data become FAIR by taking the 15 FAIR principles as guidance for data stewardship. Here you can read the original publication about the ‘FAIR Guiding Principles for scientific data management and stewardship’, and a concise overview that explains each principle individually. Since their publication in 2016, the principles have been acknowledged and widely adopted in international scientific research. They have become an integral part of Open Science.
For ZonMw, the FAIR principles form the basis for its policy and requirements with respect to data management and stewardship in the projects it funds. However, for most researchers and data stewards it is quite challenging to put the FAIR principles into practice. ZonMw therefore organises support, and collaborates in the development of new tools and practices for the creation of FAIR data. An overview can be found on the M4M Metadata-for-machines resource page. M4M is one of the components of the three-point FAIRification Framework. The aim of these initiatives is to make FAIR data as easy as possible.
FAIR data is not a well-defined endpoint. Instead, data may gain a certain level of FAIRness through data stewardship actions, taking FAIR principles as a guidance. Depending on their goals, researchers and data stewards may decide to focus specifically on for instance findability, or interoperability (etc). Implementing all FAIR principles is very challenging though, and for most researchers and data stewards not yet possible because they lack the appropriate knowledge, tools or infrastructure. Strictly speaking, however, as long as data (or their metadata) are not machine readable, they should not be labelled as ‘FAIR’.
You can read more about a step-by-step workflow for FAIRification, and take a look at some examples of tools for this purpose, such as the RDA FAIR Data Maturity Model and the Data Stewardship Wizard.
ZonMw requires grant holders to take actions to make data as findable, accessible, interoperable and reusable as possible, and appropriate for the type of project. ZonMw’s M4M-workshops for the COVID-19 research programme were the first step towards machine readability, and thereby towards achieving some ‘true’ FAIRness of data in the projects it funds. You can read more about the concept of metadata for machines (M4M) and find out how they are produced and can be used.
The concept of FAIR is usually framed in connection with data. The FAIR principles, however, may also help to make other output (or assets, or artefacts) from projects reusable. Consider for example a collection of biomaterials, or the recording of an interview. Such assets can gain FAIRness, e.g. by improving their findability (through a persistent identifier such as a DOI), and by providing rich metadata to describe what the asset is about and how it was created (provenance). Note that metadata themselves should also be FAIR, as well as the code, software, algorithms, tools and services needed to use the (meta)data. You can read about FAIR digital objects to see what is actually ‘done’ (or handled, or curated) with data/assets to make them FAIR.
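As a minimal sketch of what such a description could look like, the Python snippet below builds a small metadata record for a non-data asset (an interview recording) and serialises it as JSON. The field names, the DOI and all values are hypothetical placeholders chosen for illustration. They are not taken from a ZonMw template or from any real project.

```python
import json


def build_asset_metadata(identifier, title, description, creator, created, provenance):
    """Return a plain dict describing an asset.

    The field names are illustrative assumptions, not an official schema.
    """
    # Findability: insist on a persistent identifier, here a DOI written as a URL.
    if not identifier.startswith("https://doi.org/"):
        raise ValueError("expected a DOI expressed as an https://doi.org/ URL")
    return {
        "identifier": identifier,
        "title": title,
        "description": description,
        "creator": creator,
        "created": created,
        "provenance": provenance,  # how the asset came into being
    }


# Hypothetical example values.
record = build_asset_metadata(
    identifier="https://doi.org/10.1234/example-interview",
    title="Interview recording on patient experiences",
    description="Audio recording of a semi-structured interview.",
    creator="Example Research Group",
    created="2021-05-17",
    provenance="Recorded on site with informed consent; transcribed manually.",
)

print(json.dumps(record, indent=2))
```

A record like this is still only a starting point: it becomes machine actionable once the field names and values are tied to shared standards and vocabularies agreed by a community.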
In principle, ZonMw’s requirements and procedures apply to all assets from the projects it funds. Through the list of key items, grant holders can show how their assets/products are made reusable.
ZonMw’s activities (etc) to FAIRify data have primarily been aimed at research data. The next steps are to FAIRify real world data, such that they can be reused for research and innovation as well. Real world data include among others patient or population data, and data derived from e.g. devices or eHealth programmes.
Looking at the table with the 15 FAIR guiding principles, the following aspects are notable:
Metadata: “FAIR is 90% metadata”. This expert opinion points at the crucial role of metadata (i.e. the data about data, or other assets). Metadata, after all, contain persistent identifiers, information on how data were generated, procedures to get access (or not), etc.
Machine actionable: the machine (or computer) can read, understand, and actually use the data, metadata, persistent identifiers, etc. The three-point FAIRification Framework involves the tools to accomplish this.
Generic vs. domain-specific: the FAIR principles guide both technical, generic operations (which are in principle the same for each research domain) and domain-relevant aspects. Both are needed for FAIRification. For the domain-relevant aspects, a community (e.g. researchers in a specific research domain) agrees on certain standards, tools or infrastructures that are commonly used. Promoting the use of such community choices improves the interoperability of data within that domain.
ZonMw promotes community choices: it stimulates grantees to choose, whenever this is possible, standards (etc) that are common within their research domain. By organising M4M-workshops for the COVID-19 research programme, ZonMw took the next step to develop machine readable metadata (M4M) schemes that include COVID-19 community choices.
Ideally, datasets from ZonMw’s projects are FAIR and open to provide access for reuse. However, in the case of privacy-sensitive datasets or IP issues, it may be necessary to restrict access to the data. Such a dataset can comply very well with the FAIR principles, while the data at the same time remain closed to other users. The FAIR principle for Accessibility (A1.2) covers the procedure for authentication and authorization. A data producer can thereby set the desired access restrictions, such that the computer (and an algorithm) understands whether it is allowed to get access to the dataset or not. Of course, the information on access restrictions can be made ‘human readable’ as well.
Note that OPEN data are not in themselves FAIR data. Open data, too, must be FAIRified! Finally, metadata (i.e. data that describe and lead to the dataset) must be FAIR and OPEN. Otherwise the dataset cannot be found.
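The combination of open metadata and restricted data can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an implementation of any real authentication or authorization protocol: the class, the two access values ("open" and "restricted") and the user names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    identifier: str
    title: str
    access_rights: str  # hypothetical vocabulary: "open" or "restricted"
    authorised_users: set = field(default_factory=set)


def public_metadata(record):
    """Metadata stay open, so the dataset can be found even when the data are closed."""
    return {
        "identifier": record.identifier,
        "title": record.title,
        "access_rights": record.access_rights,
    }


def may_access_data(record, user=None):
    """Decide, from the machine-readable access condition, whether a user may get the data."""
    if record.access_rights == "open":
        return True
    # Restricted data: only identified and authorised users get access.
    return user is not None and user in record.authorised_users


# Hypothetical example: a privacy-sensitive dataset with one authorised user.
cohort = DatasetRecord(
    identifier="https://doi.org/10.1234/example-cohort",
    title="Example patient cohort",
    access_rights="restricted",
    authorised_users={"researcher-a"},
)

print(public_metadata(cohort))
print(may_access_data(cohort))                  # anonymous request
print(may_access_data(cohort, "researcher-a"))  # authorised request
```

The point of the sketch is that the access decision is derived from the metadata alone, so a machine can determine whether access is allowed without ever touching the data themselves.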
The terms data management and data stewardship both refer to activities ‘to take care of your data’, with “the aim to optimise the usability, reusability and reproducibility of the resulting data”. You can read this and much more at DTL. Data management usually points at activities during a project to collect, annotate, and archive data, while data stewardship is about making data reusable for the long term, also after the project has ended. The FAIR principles provide the guidance for this. Taking care of your data may also be viewed as a series of steps in a data life cycle, corresponding with phases in research.
To avoid the discussion on definitions, ZonMw uses the term research data management and stewardship (RDM) on this website. More important is what a data provider actually does in the context of RDM to FAIRify their data! Planning and performing RDM must start as early as when you prepare your project and grant application, and continue throughout the project. Read more about ZonMw’s requirements and procedures for RDM.
To make FAIRification as easy as possible, researchers, data stewards and other data producers rely on state-of-the-art facilities (services, tools, infrastructures), which are integrated in a coherent FAIR data-ecosystem. In the Netherlands, the NPOS (Nationaal Programma Open Science) coordinates the development of such an ecosystem at the national level and for all research domains. NPOS thereby aligns with the facilities and concepts that are conceived within the EOSC (European Open Science Cloud). For the domain of life sciences and health in the Netherlands, the ecosystem is maturing within a network of research institutes and FAIR data expert organisations such as GO FAIR, Health-RI, and DTL. The role of ZonMw and other funding agencies is also growing.
Researchers who apply for or receive a grant from ZonMw are recommended to involve a data steward who can support the various RDM tasks and the FAIRification of their data. The rapidly growing awareness of the need to FAIRify data, and the emerging facilities for doing so, call for highly specialised data stewardship professionals.
The capacity of data stewardship, however, is not yet up to the level that is needed to meet the growing demand. Moreover, data stewardship is a new profession, which is still waiting for full acknowledgement as a crucial expertise in the FAIR data-ecosystem. On behalf of the NPOS, ZonMw delivered a report about Professionalising Data stewardship in the Netherlands, which provides recommendations and practical criteria for competences, training and education to improve the capacity. The report also states that data stewards need to be recognised as full members of research groups, and that sufficient budgets must become available to maintain their positions in the long run.
In its requirements for each grant proposal, ZonMw has stipulated that the time and costs of data stewardship must be planned and budgeted properly.
Once data are FAIR, a ‘machine’ (i.e. computer) can find the data (and metadata), understand them and use them. FAIR data are therefore ‘machine readable’ or ‘machine actionable’. In fact, with FAIR data, it is the ‘human’ (e.g. a researcher) who can use the computer to query the internet, and find and read the data, irrespective of their location. All the information about the data is captured in metadata. Metadata must also be FAIR, because machines and humans need the metadata to find out what the data are exactly about, and whether it is possible and allowed to use the data. The M4M templates of the COVID-19 programme were developed for this purpose. The most advanced way to use FAIR data is ‘data visiting’: the metadata and data that are found by the machine can be used for research with the help of an algorithm. This approach is not yet common practice. It is in development, for example in the personal health train concept.
In the current situation, however, machine readable metadata are already of benefit for researchers. For example, the metadata that are produced in ZonMw’s COVID-19 projects are exposed on the COVID-19 data portal that is built by Health-RI. There they can be searched and analysed, and data can be requested, according to the associated license, and governance defined per dataset. You can read more in The case of the COVID-19 research programme and metadata-for-machines.
Read more about these (and other) topics in Data Intelligence, the Special Issue on Emerging FAIR Practices. It includes an article by ZonMw and the Health Research Board (Ireland) about a FAIR funding model.
This presentation of the FAIR guiding principles (provided by Erik Schultes, GO FAIR) highlights a number of aspects that are important in the process of FAIRification. Read more in Some important aspects of FAIR data that we have to keep in mind.
ZonMw develops and implements innovative approaches in its research programmes that enable researchers, data stewards and other data producers to generate FAIR data. Together with FAIR data experts, ZonMw focusses on the development and implementation of FAIR metadata schemes for a standardised and machine-readable description of datasets with controlled vocabularies. The topics included in the metadata scheme are selected by a research community and point at domain-relevant standards, ontologies and technologies.
The significant impact of the COVID-19 outbreak on worldwide public health and economies urged ZonMw to take a next step towards the FAIRification of research data. Together with GO FAIR (see also GO FAIR Foundation), DTL and Health-RI, ZonMw developed and implemented a workflow to produce and use metadata for machines (M4M) to describe COVID-19 research projects, and to expose the metadata on the COVID-19 data portal at Health-RI.
In a number of workshops, these parties engaged ZonMw funded COVID-19 researchers and their data stewards to form a research community. Together, they selected the relevant elements and controlled vocabularies for the metadata schemes to describe COVID-19 datasets. On the basis of their input, and the topics of the ZonMw COVID-19 research programme, GO FAIR Foundation developed a machine readable collection of terms for COVID-19, that is published on Bioportal and implemented in the COVID-19 M4M templates. Next, researchers and data stewards were trained and supported to use the templates to describe their datasets and other assets.
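How a controlled vocabulary helps a machine check a dataset description can be sketched as follows. The vocabulary below is a made-up stand-in: its field names and terms are assumptions for illustration and are not the actual COVID-19 collection of terms published on Bioportal.

```python
# Hypothetical community vocabulary: each field accepts only the agreed terms.
VOCABULARY = {
    "study_design": {"cohort study", "case-control study", "randomised controlled trial"},
    "data_type": {"clinical data", "survey data", "imaging data"},
}


def find_vocabulary_problems(metadata, vocabulary):
    """Return a list of human-readable problems; an empty list means the record conforms."""
    problems = []
    for field_name, allowed_terms in vocabulary.items():
        value = metadata.get(field_name)
        if value is None:
            problems.append(f"missing field: {field_name}")
        elif value not in allowed_terms:
            problems.append(f"unknown term for {field_name}: {value!r}")
    return problems


conforming = {"study_design": "cohort study", "data_type": "survey data"}
free_text = {"study_design": "cohort"}  # abbreviated term, and data_type is absent

print(find_vocabulary_problems(conforming, VOCABULARY))
print(find_vocabulary_problems(free_text, VOCABULARY))
```

Because every project picks its values from the same agreed list, descriptions from different projects can be compared and combined by a machine without guessing what a free-text entry meant.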
This approach involves three types of metadata templates to describe datasets in the context of a project and its scientific background:
1. Project Admin template: provides administrative semantics to describe the project.
2. Project Content template: provides domain-specific, scientific semantics to describe the project.
3. Metadata forms for Dataset: describe the datasets, their interoperability, their conditions for access, etc. (read more in https://osf.io/2y6ba).
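The relation between the three templates can be sketched with three small Python data classes that are linked through a shared project identifier. The fields shown here are assumptions chosen for illustration. The actual M4M templates define their own, richer sets of fields.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class ProjectAdmin:
    """Administrative description of the project (hypothetical fields)."""
    project_id: str
    title: str
    funder: str


@dataclass
class ProjectContent:
    """Scientific, domain-specific description of the project (hypothetical fields)."""
    project_id: str
    research_topics: list


@dataclass
class DatasetDescription:
    """Description of one dataset produced in the project (hypothetical fields)."""
    project_id: str
    dataset_id: str
    file_format: str
    access_conditions: str


def describe_project(admin, content, datasets):
    """Combine the three descriptions into one JSON-serialisable structure."""
    for part in [content, *datasets]:
        if part.project_id != admin.project_id:
            raise ValueError("all templates must refer to the same project")
    return {
        "admin": asdict(admin),
        "content": asdict(content),
        "datasets": [asdict(d) for d in datasets],
    }


# Hypothetical example values.
description = describe_project(
    ProjectAdmin("project-001", "Example COVID-19 study", "ZonMw"),
    ProjectContent("project-001", ["long-term symptoms", "primary care"]),
    [DatasetDescription("project-001", "dataset-001", "CSV", "restricted")],
)

print(json.dumps(description, indent=2))
```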
Health-RI developed a COVID-19 data portal that exposes the machine-readable metadata in human-readable form as well. Irrespective of the repository where each project's data are stored, it is possible to find the metadata via the data portal. One can analyse the information that is derived from the metadata, and datasets can be requested, according to the associated license, and governance defined per dataset. In the future this will also become the route for data visiting with algorithms.
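The idea of finding datasets through their metadata, wherever the data themselves are stored, can be sketched as a simple search over a list of metadata records. This is not how the Health-RI portal is implemented: the records, repository names and field names below are hypothetical.

```python
# Hypothetical metadata records collected from different repositories.
RECORDS = [
    {
        "title": "Long-term symptoms after COVID-19 infection",
        "keywords": ["cohort", "symptoms"],
        "repository": "Repository A",
        "license": "CC BY 4.0",
    },
    {
        "title": "Vaccination survey among care workers",
        "keywords": ["survey", "vaccination"],
        "repository": "Repository B",
        "license": "restricted, on request",
    },
]


def search_metadata(records, keyword):
    """Return all records whose title or keywords contain the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        record
        for record in records
        if needle in record["title"].lower()
        or any(needle in k.lower() for k in record["keywords"])
    ]


for hit in search_metadata(RECORDS, "vaccination"):
    # The repository and license come from the metadata, not from the data themselves.
    print(hit["title"], "|", hit["repository"], "|", hit["license"])
```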
This initiative to FAIRify COVID-19 research data is a novel approach to the FAIRification of data. The lessons learned will lead to next steps, also in other research domains. The complete workflow of community choices, metadata development, and exposure on FAIR data points may be viewed stepwise in the so-called FAIR hourglass, a framework for FAIR implementation.
For the Infectious Diseases and Antimicrobial Resistance research community (ID&AMR), ZonMw used the approach that was developed for the COVID-19 research programme. Read more in the section The case of the COVID-19 research programme and metadata-for-machines.
The research community for ID&AMR selected the relevant elements and controlled vocabularies for the metadata (i.e. Project Content template) to describe ID&AMR related datasets. An extra component was added to describe collections of biological samples. On the basis of their input, a machine readable collection of terms for ID&AMR was produced, which is published on Bioportal and implemented in ID&AMR M4M templates (read more in https://osf.io/8gm4n).
Here you can learn more about the experiences of AMR researchers who FAIRify their data.
The new FAIRification approaches that ZonMw has implemented were developed in collaboration with GO FAIR (see also GO FAIR Foundation), DTL and Health-RI, and will be further optimised.
Disclaimer: this webpage will be gradually completed during 2022.