ZonMw is aware that FAIR data are crucial in realising the ambitions in research and innovation to meet the challenges with respect to health and health care processes. The knowledge we need must therefore be more precise, more personalised and available in a timely manner, and it must be based on data from multiple sources, often in large amounts (i.e. big data).
Realising these ambitions depends strongly on the opportunities to use computational technology and data science on a large scale. The data that we use in research and innovation must therefore be fully understandable for the computer, in other words: data must be ‘machine readable’ or ‘machine actionable’.
The purpose of this section is to provide background information for researchers and data stewards who are active in FAIRifying their data. With the term FAIRification we stress that the creation of FAIR data is a process, in which data gradually become more FAIR. At the end, data are optimally reusable, both by humans and, where possible, by machines, in full compliance with privacy protection regulations (if relevant). FAIRification is important for all types of data, whether they are generated through research, innovation processes, or societal activities.
FAIR data are reusable because they are Findable, Accessible, Interoperable and Reusable, both for humans and for machines (computers). In the ideal situation, data are made readable (or actionable) for machines, and they can be ‘visited’ by an algorithm while the data remain with the data owner. In that case, the acronym FAIR may also be read as ‘Fully (or Federated) AI-Ready’. Data become FAIR by taking the 15 FAIR principles as guidance for data stewardship. Here you can read the original publication about the ‘FAIR Guiding Principles for scientific data management and stewardship’, and a concise overview that explains each principle individually. Since their publication in 2016, the principles have been acknowledged and widely adopted in international scientific research. They have become an integral part of Open Science.
For ZonMw, the FAIR principles form the basis for its policy and requirements with respect to data management and stewardship in the projects it funds. However, for most researchers and data stewards it is quite challenging to put the FAIR principles into practice. ZonMw therefore organises support, and collaborates in the development of new tools and practices for the creation of FAIR data. An overview can be found on the M4M Metadata-for-machines resource page. M4M is one of the components of the three-point FAIRification Framework. The aim of these initiatives is to make FAIR data as easy as possible.
FAIR data is not a well-defined endpoint. Instead, data may gain a certain level of FAIRness through data stewardship actions, taking the FAIR principles as guidance. Depending on their goals, researchers and data stewards may decide to focus specifically on, for instance, findability or interoperability. Implementing all FAIR principles is very challenging though, and for most researchers and data stewards not yet possible because they lack the appropriate knowledge, tools or infrastructure. Strictly speaking, however, as long as data (or their metadata) are not machine readable, they should not be labelled as ‘FAIR’.
ZonMw requires grant holders to take actions to make data as findable, accessible, interoperable and reusable as possible, and appropriate for the type of project. ZonMw’s M4M-workshops for the COVID-19 research programme were the first step towards machine readability, and thereby towards achieving some ‘true’ FAIRness of data in the projects it funds. You can read more about the concept of metadata for machines (M4M) and find out how they are produced, and can be used.
The concept of FAIR is usually framed in connection with data. The FAIR principles, however, may also help to make other output (or assets, or artefacts) from projects reusable. Consider for example a collection of biomaterials, or the recording of an interview. Such assets can gain FAIRness, e.g. by improving their findability (through a persistent identifier such as a DOI), and by providing rich metadata to describe what the asset is about and how it was created (provenance). Note that metadata themselves should also be FAIR, as well as the code, software, algorithms, tools and services needed to use the (meta)data. You can read about FAIR digital objects to see what is actually ‘done’ (or handled, or curated) with data/assets to make them FAIR.
In principle, ZonMw’s requirements and procedures apply to all assets from the projects it funds. Through the list of key items, grant holders can show how their assets/products are made reusable.
ZonMw’s activities (etc) to FAIRify data have primarily been aimed at research data. The next steps are to FAIRify real world data, such that they can be reused for research and innovation as well. Real world data include among others patient or population data, and data derived from e.g. devices or eHealth programmes.
Looking at the table with the 15 FAIR guiding principles, the following aspects are notable:
Metadata: “FAIR is 90% metadata”. This expert opinion points at the crucial role of metadata (i.e. the data about data, or about other assets). Metadata, after all, contain persistent identifiers, information on how data were generated, procedures to get access (or not), etc.
Machine actionable: the machine (or computer) can read, understand, and actually use the data, metadata, persistent identifiers, etc. The three-point FAIRification Framework involves the tools to accomplish this.
Generic vs. domain-specific: the FAIR principles give guidance on technical, generic operations (these are in principle the same for each research domain), and on domain-relevant aspects. Both are needed for FAIRification. For the domain-relevant aspects, a community (e.g. researchers in a specific research domain) agrees on certain standards, tools or infrastructures that are commonly used. Promoting the use of such community choices improves the interoperability of data within that domain.
ZonMw promotes community choices: it stimulates grantees to choose, whenever this is possible, standards (etc) that are common within their research domain. By organising M4M-workshops for the COVID-19 research programme, ZonMw took the next step to develop machine readable metadata (M4M) schemes that include COVID-19 community choices.
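To make the notion of machine-actionable metadata more concrete, the following is a minimal sketch of a metadata record that a program can read and check. It is not the M4M template itself: the choice of fields, the helper names `build_record` and `missing_fields`, the DOI and the dataset title are all invented for illustration. The only things borrowed from existing standards are the term names from the DCAT and Dublin Core vocabularies, used here as an example of a generic community choice.

```python
import json

# Assumption for this sketch: the keys we treat as the minimum for a
# findable, reusable record. A real community would agree on its own list.
REQUIRED_FIELDS = ("dct:identifier", "dct:title", "dct:license", "dct:accessRights")


def build_record(doi, title, license_url, access_rights, keywords):
    """Return a JSON-LD style dictionary that describes one dataset."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        # The persistent identifier, written as a resolvable DOI link.
        "dct:identifier": f"https://doi.org/{doi}",
        "dct:title": title,
        "dct:license": license_url,
        "dct:accessRights": access_rights,
        "dcat:keyword": list(keywords),
    }


def missing_fields(record):
    """List the required keys that are absent or empty in a record."""
    return [key for key in REQUIRED_FIELDS if not record.get(key)]


record = build_record(
    doi="10.1234/example-dataset",  # invented DOI, for illustration only
    title="Example cohort questionnaire data",  # invented title
    license_url="https://creativecommons.org/licenses/by/4.0/",
    access_rights="restricted",
    keywords=["COVID-19", "questionnaire"],
)

# Serialising to JSON gives a text form that both people and programs can read.
serialised = json.dumps(record, indent=2)
```

Because the record uses agreed term names rather than free text, a program can check it (`missing_fields(record)`) or compare it with records from other projects without human interpretation.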
Ideally, datasets from ZonMw’s projects are FAIR and open to provide access for reuse. However, in the case of privacy-sensitive datasets or IP issues, it may be necessary to restrict access to the data. Such a dataset can comply very well with the FAIR principles, while the data remain at the same time closed for other users. The FAIR principle for Accessibility (A1.2) addresses the procedure for authentication and authorisation. A data producer can thereby set the desired access restrictions, such that the computer (and an algorithm) understands whether it is allowed to get access to the dataset or not. Of course, the information on access restrictions can be made ‘human readable’ as well.
Note that OPEN data are not in themselves FAIR data. Open data must be FAIRified as well! Finally, metadata (i.e. data that describe and lead to the dataset) must be FAIR and OPEN. Otherwise the dataset cannot be found.
The terms data management and data stewardship both refer to activities ‘to take care of your data’, with “the aim to optimise the usability, reusability and reproducibility of the resulting data”. You can read this and much more at DTL. Data management usually points at activities during a project to collect, annotate, and archive data, while data stewardship is about making data reusable for the long term, also after the project has ended. The FAIR principles provide the guidance for this. Taking care of your data may also be viewed as steps in a data life cycle, corresponding with phases in research:
To avoid the discussion on definitions, ZonMw uses for this website the term research data management and stewardship (RDM). More important is what a data provider actually does in the context of RDM to FAIRify their data! Planning and performing RDM must already start when you prepare your project and grant application, and continue throughout the project. Read more about ZonMw’s requirements and procedures for RDM.
To make FAIRification as easy as possible, researchers, data stewards and other data producers rely on state-of-the-art facilities (services, tools, infrastructures), which are integrated in a coherent FAIR data-ecosystem. In the Netherlands, the NPOS (Nationaal Programma Open Science) coordinates the development of such an ecosystem at the national level and for all research domains. NPOS thereby aligns with the facilities and concepts that are conceived within the EOSC (European Open Science Cloud). For the domain of life sciences and health in the Netherlands, the ecosystem is maturing within a network of research institutes and FAIR data expert organisations such as GO FAIR, Health-RI, and DTL. The role of ZonMw and other funding agencies is growing as well.
Researchers who apply for or receive a grant from ZonMw are recommended to involve a data steward who can provide support in performing the various tasks in RDM and in the FAIRification of their data. The rapidly growing awareness of the need to FAIRify data, and the emerging facilities for it, ask for highly specialised data stewardship professionals.
The capacity of data stewardship, however, is not yet up to the level that is needed to meet the growing demand. Moreover, data stewardship is a new profession, which still awaits full acknowledgement as a crucial expertise in the FAIR data-ecosystem. ZonMw delivered on behalf of the NPOS a report about Professionalising Data stewardship in the Netherlands, which provides recommendations and practical criteria for competences, training and education to improve the capacity. The report also states that data stewards need to be recognised as full members of research groups, and that sufficient budgets must become available to maintain their positions in the long run.
ZonMw requires, for each grant proposal, that the time and costs of data stewardship are planned and budgeted properly.
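The difference between FAIR and open can be illustrated with a small sketch of an access decision that a program could take from the metadata alone. This is an assumption-laden example, not a description of any existing authentication or authorisation service: the class `DatasetMetadata`, the three access labels and the agent identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    identifier: str
    title: str
    # Hypothetical labels for this sketch: "open", "restricted" or "closed".
    access_level: str
    # Identifiers of agents (people or algorithms) that the data producer
    # has authorised; only used when access_level is "restricted".
    authorised_agents: set = field(default_factory=set)


def may_read_metadata(metadata):
    """Metadata stay open, so that the dataset can always be found."""
    return True


def may_read_data(metadata, agent_id=None):
    """Decide from the metadata whether an agent may access the data."""
    if metadata.access_level == "open":
        return True
    if metadata.access_level == "restricted":
        # Access requires an authenticated agent that has been authorised.
        return agent_id is not None and agent_id in metadata.authorised_agents
    # "closed" or any unknown label: refuse by default.
    return False


sensitive = DatasetMetadata(
    identifier="https://doi.org/10.1234/example-sensitive",  # invented DOI
    title="Example privacy-sensitive dataset",
    access_level="restricted",
    authorised_agents={"researcher-042"},  # invented agent identifier
)
```

In this sketch the dataset is findable by everyone (`may_read_metadata` is always true), while `may_read_data(sensitive, "researcher-042")` succeeds and a request without an authorised identity is refused. Refusing unknown labels by default is a deliberate choice for privacy-sensitive data.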
Once data are FAIR, a ‘machine’ (i.e. computer) can find the data (and metadata), understand them and use them. FAIR data are therefore ‘machine readable’ or ‘machine actionable’. In fact, with FAIR data, it is the ‘human’ (e.g. a researcher) who can use the computer to query the internet, and find and read the data, irrespective of their location. All the information about the data is captured in metadata. Metadata must also be FAIR, because machines and humans need the metadata to find out what the data are exactly about, and whether it is possible and allowed to use the data. The M4M templates in the COVID-19 programme were developed for this purpose. The most advanced way to use FAIR data is ‘data visiting’: the metadata and data that are found by the machine can be used for research with the help of an algorithm. This approach is not yet common practice. It is in development, for example in the personal health train concept.
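The idea behind ‘data visiting’ can be sketched in a few lines: the algorithm runs where the data are held, and only an aggregated result travels back. The example below is a deliberately simple illustration and not the personal health train implementation; the function names, the two sites and their values are invented.

```python
def local_summary(values):
    """Runs at the data owner: returns a count and a sum, never the records."""
    return {"count": len(values), "total": sum(values)}


def combine(summaries):
    """Runs at the researcher: merges the summaries into one overall mean."""
    count = sum(s["count"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    if count == 0:
        raise ValueError("no observations were reported by any site")
    return total / count


# Invented measurements held by two separate data owners.
site_a = [4.0, 6.0]
site_b = [10.0]

# Each site computes its own summary; only these summaries are shared.
overall_mean = combine([local_summary(site_a), local_summary(site_b)])
```

The combined mean equals the mean over all three values, although no individual record has left its site. A real setting would add the authentication, authorisation and disclosure checks that this sketch leaves out.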
In the current situation, however, machine readable metadata are already of benefit for researchers. For example, the metadata that are produced in ZonMw’s COVID-19 projects are exposed on the COVID-19 data portal that is built by Health-RI. There they can be searched and analysed, and data can be requested, according to the associated license, and governance defined per dataset. You can read more in The case of the COVID-19 research programme and metadata-for-machines.
Read more about these (and other) topics in Data Intelligence, the Special Issue on Emerging FAIR Practices. It includes an article by ZonMw and the Health Research Board (Ireland) about a FAIR funding model.
This presentation of the FAIR guiding principles (provided by Erik Schultes, GO FAIR) highlights a number of aspects that are important in the process of FAIRification. Read more in Some important aspects of FAIR data that we have to keep in mind.
ZonMw is building expertise in its research programmes to enable researchers, data stewards and other data producers to generate FAIR data. ZonMw strives to incorporate state-of-the-art FAIRification processes in its procedures, tools and services such that it becomes as easy as possible to use them. To improve interoperability of data and metadata, ZonMw stimulates researchers to choose standards (etc) that are commonly used within their research domain. In addition, ZonMw has started in some programmes to organise activities for research communities to agree on such preferred domain-relevant standards and technologies.
The significant impact of the outbreak of the virus on public health and on economies worldwide urged ZonMw to develop together with GO FAIR Foundation, DTL and Health-RI a workflow, tools and a data portal to produce and use metadata-for-machines (M4M) for COVID-19 research projects.
In a number of workshops, the partners organised ‘support and community building’ for the COVID-19 researchers and their data stewards. The participants formed the community to choose and agree on the relevant elements and controlled vocabularies for the metadata schemes to describe COVID-19 datasets. On the basis of their input, and the topics of the ZonMw COVID-19 research programme, GO FAIR Foundation developed the COVID-19 M4M templates. Next, the researchers and data stewards were trained and supported to use the templates.
The metadata that COVID-19 researchers produce with the M4M templates to describe their projects and datasets are readable for machines (i.e. computers). Health-RI developed a COVID-19 data portal to expose the metadata, and thereby the information becomes ‘human readable’ as well. Irrespective of the repository where each project's data are stored, it is possible to find the metadata via the data portal. One can analyse the information that is derived from the metadata, and data can be requested, according to the associated license and the governance defined per dataset.
This initiative to FAIRify COVID-19 research data is an important step in the FAIRification of data. The lessons learned will lead to next steps, also in other research domains. You can read more about the concept of metadata for machines (M4M) and find out how they are produced, and can be used. The complete workflow of community choices, metadata development, and exposure on FAIR data points is now being consolidated in the three-point FAIRification framework.
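How a portal can find datasets irrespective of where they are stored can be sketched as a search over metadata records collected from several repositories. This is only an illustration of the principle and not the Health-RI portal: the record layout, the `search` helper, the repository names and the dataset titles are invented.

```python
# Invented metadata records, as they might be collected from two repositories.
records = [
    {
        "title": "Example antibody survey",
        "keywords": ["COVID-19", "serology"],
        "repository": "repository-a.example",
        "license": "https://creativecommons.org/licenses/by/4.0/",
    },
    {
        "title": "Example long-term care interviews",
        "keywords": ["COVID-19", "qualitative"],
        "repository": "repository-b.example",
        "license": "access on request",
    },
    {
        "title": "Example diabetes registry",
        "keywords": ["diabetes"],
        "repository": "repository-a.example",
        "license": "access on request",
    },
]


def search(all_records, term):
    """Return records whose title contains the term or whose keywords match it."""
    term = term.lower()
    return [
        record
        for record in all_records
        if term in record["title"].lower()
        or any(term == keyword.lower() for keyword in record["keywords"])
    ]


hits = search(records, "covid-19")
# The hits span both repositories; the data themselves never had to move.
repositories_found = sorted({record["repository"] for record in hits})
```

Each hit also carries its license, so a user (or a program) can see under which conditions the underlying data may be requested.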
We invite you to read our other webpages about data management:
Disclaimer: this webpage will be gradually completed during September 2021