Searching for data

Directories of research data repositories, such as re3data.org and FAIRsharing, web search engines, and colleagues can be consulted to discover domain-specific portals in your discipline. Repository developers invest significant time and energy organising data in ways to make them more discoverable; use their work to your advantage. Familiarise yourself with the controlled vocabularies, subject categories, and search fields used in particular repositories. | Gregory, 2018

Suppose you want to reuse research data from other studies, but you don't know where you could find such data. So, where do you start? In this section you will find a number of tips on how to find data and assess its value.

Ways to find data

The first step in finding research data for reuse is to state clearly which attributes the data must have to meet your research goals. Once that picture is clear, you can try the following ways of searching for data.

1. Use a catalogue of data archives

Through a catalogue or directory such as re3data.org (n.d.) or FAIRsharing (n.d.) you can search for data archives (not yet for the data itself). In re3data.org you can search by Subject, Content type and Country. You can also choose to search only for data archives with a quality seal, with datasets that are available through open access and/or with datasets that have a persistent identifier. The search for the datasets themselves then takes place in the data archive of your choice. Each data archive has its own search options.

2. Use a search engine

You can use Google (n.d.) to discover data archives that contain the type of research data you're looking for. In addition to keywords describing the research topic, it is important to add keywords such as 'data archive' or 'datasets' to the search. The advantage of this approach is that you are likely to look beyond the 'usual suspects', because Google indexes a vast number of web pages. The disadvantage is that filtering the results can take time.
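The keyword advice above can be sketched in a few lines of Python. This is only an illustration: the function names and the example topic are hypothetical, and the URL pattern assumes Google's ordinary web search address.

```python
from urllib.parse import urlencode

# Terms that steer a web search towards data rather than publications.
# These are the two examples suggested in the text.
DATA_TERMS = ("data archive", "datasets")


def build_queries(topic_keywords, data_terms=DATA_TERMS):
    """Combine the topic keywords with each data-related term, quoted as a phrase."""
    topic = " ".join(topic_keywords)
    return [f'{topic} "{term}"' for term in data_terms]


def search_url(query, base="https://www.google.com/search"):
    """Return a search URL for the query; `base` is an assumed address."""
    return base + "?" + urlencode({"q": query})


# Hypothetical topic, for illustration only.
for q in build_queries(["soil", "moisture", "Netherlands"]):
    print(search_url(q))
```

Running one query per data-related term, rather than putting all terms in a single query, keeps each result list a little narrower and easier to filter.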

3. Use a metadata aggregator

The following search tools are examples of tools with which you can search in the metadata of a selection of data archives. You search in the description of the data and not in the data itself.

  • Google Dataset Search
    Google Dataset Search (n.d.) locates datasets. The disadvantage is that it is not very clear which data archives Google Dataset Search covers. If you don't find suitable data, that doesn't mean the data don't exist; they may simply not be indexed by Google Dataset Search.
  • DataCite Search
    With DataCite Search (DataCite, n.d.) you search for datasets to which a DOI (Digital Object Identifier) has been assigned.
  • Data Citation Index
    The Data Citation Index is a paid service from Clarivate Analytics (n.d.).
  • NARCIS
    Through NARCIS (DANS, n.d.) you can search for datasets that are available from Dutch data archives and repositories.
  • B2FIND
    B2FIND (EUDAT, n.d.) is a dataset discovery service that searches the metadata of research data collections from EUDAT data centres and other repositories.
  • CESSDA Data Catalogue
    Through the CESSDA Data Catalogue (CESSDA, n.d.) you can search for social science datasets from the affiliated CESSDA data archives.
  • Survey Data Netherlands
    In Survey Data Netherlands (n.d.) you can search for Dutch survey data.
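
Some aggregators can also be queried from a script. The sketch below builds a query URL for DataCite and summarises a response. It rests on assumptions that should be checked against DataCite's own API documentation before use: the endpoint https://api.datacite.org/dois, the parameter names query, resource-type-id and page[size], and the layout of the response. The sample response is hand-written and abridged, with a made-up DOI; it is not real data.

```python
from urllib.parse import urlencode

# Assumed endpoint of the DataCite REST API; verify against the API docs.
DATACITE_DOIS = "https://api.datacite.org/dois"


def datacite_query_url(terms, page_size=5):
    """Build a URL that asks for dataset records matching the search terms."""
    params = {
        "query": terms,
        "resource-type-id": "dataset",  # restrict results to datasets
        "page[size]": page_size,        # number of records per page
    }
    return DATACITE_DOIS + "?" + urlencode(params)


def summarise(response):
    """Extract (DOI, first title, publication year) from a response dict."""
    rows = []
    for item in response.get("data", []):
        attrs = item.get("attributes", {})
        titles = attrs.get("titles") or [{}]
        rows.append(
            (attrs.get("doi"), titles[0].get("title"), attrs.get("publicationYear"))
        )
    return rows


# Hand-written, abridged stand-in for a response; not real data.
sample = {
    "data": [
        {
            "attributes": {
                "doi": "10.1234/example-1",
                "titles": [{"title": "Example soil moisture dataset"}],
                "publicationYear": 2019,
            }
        }
    ]
}

print(datacite_query_url("soil moisture"))
print(summarise(sample))
```

As with the web interfaces, a query like this searches the descriptions of datasets, not the data files themselves.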

Searching in data archives

If you have found a data archive in which you expect to find your type of data, you can take the next step: searching in the data archive. In a data archive, you usually search the metadata assigned to the datasets and not the actual content of the datasets themselves. Take this into account when formulating your search terms. If you study a number of datasets, you will learn which metadata fields the archive assigns, which gives you hints about appropriate search terms. To fully exploit the search possibilities, have a look at the 'Advanced Search' options.
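
The difference between searching metadata and searching data can be made concrete with a toy example. The records, field names and functions below are hypothetical and do not reflect any particular archive; they only show why a term that occurs inside the data files (here 'kWh') may return nothing, and how the keywords of a few records suggest better search terms.

```python
from collections import Counter

# Hypothetical metadata records, as an archive might expose them.
RECORDS = [
    {
        "title": "Household energy use survey 2015",
        "keywords": ["energy consumption", "households"],
        "description": "Annual survey of residential electricity and gas use.",
    },
    {
        "title": "Smart meter readings, pilot district",
        "keywords": ["energy consumption", "time series"],
        "description": "Quarter-hourly meter readings from a pilot district.",
    },
    {
        "title": "Commuting patterns interview study",
        "keywords": ["mobility", "households"],
        "description": "Interviews about daily travel to work.",
    },
]


def _as_text(value):
    """Flatten a metadata value (string or list of strings) into one string."""
    return " ".join(value) if isinstance(value, list) else str(value)


def search_metadata(records, term, fields=("title", "keywords", "description")):
    """Return titles of records whose chosen metadata fields contain the term."""
    needle = term.lower()
    return [
        rec["title"]
        for rec in records
        if any(needle in _as_text(rec.get(field, "")).lower() for field in fields)
    ]


def keyword_counts(records):
    """Count how often each keyword is assigned, as a hint for search terms."""
    return Counter(kw for rec in records for kw in rec.get("keywords", []))


print(search_metadata(RECORDS, "kWh"))          # a unit used only inside the data files
print(search_metadata(RECORDS, "electricity"))  # a word used in a description
print(keyword_counts(RECORDS).most_common(2))
```

Restricting `fields` to a single field, for example `fields=("keywords",)`, mimics what an 'Advanced Search' form typically offers.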

How useful is the dataset?

Finding a seemingly suitable dataset is the first step. Assessing whether that dataset is useful for your intended research purposes is an important second step. In the video below you will find a number of helpful criteria (Utrecht University, 2018).

