Directories of research data repositories, such as re3data.org and FAIRsharing, web search engines, and colleagues can be consulted to discover domain-specific portals in your discipline. Repository developers invest significant time and energy organising data in ways to make them more discoverable; use their work to your advantage. Familiarise yourself with the controlled vocabularies, subject categories, and search fields used in particular repositories. | Gregory et al., 2018
Suppose you want to reuse research data from other studies, but you don't know where you could find such data. So, where do you start? In this section you will find a number of tips on how to find data and assess its value.
Ways to find data
The first step in finding research data for reuse is to state clearly which attributes the data must have to meet your research goals. Once that picture is clear, you can try the following ways of searching for data.
1. Use a catalogue of data archives
Through a catalogue or directory such as Re3data.org (n.d.) or FAIRsharing (n.d.) you search for data archives, not yet for the data itself. In Re3data.org you can search by Subject, Content type and Country. You can also restrict the search to data archives with a quality seal, with datasets that are made available through open access and/or with datasets that have a persistent identifier. You then search for the datasets themselves in the data archive of your choice. Each data archive has its own search options.
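The selection logic described above can be sketched in a few lines of Python. This is only an illustration: the `Repository` record, its field names and the three example archives are hypothetical and do not reflect the actual Re3data.org schema or its contents.

```python
from dataclasses import dataclass


@dataclass
class Repository:
    # Hypothetical record; field names are illustrative, not the Re3data.org schema.
    name: str
    subjects: tuple
    content_types: tuple
    country: str
    certified: bool        # has a quality seal
    open_access: bool      # datasets available through open access
    persistent_ids: bool   # datasets carry a persistent identifier


def filter_repositories(repos, subject=None, content_type=None, country=None,
                        certified=None, open_access=None, persistent_ids=None):
    """Return the repositories that match every criterion that is not None."""
    selected = []
    for repo in repos:
        if subject is not None and subject not in repo.subjects:
            continue
        if content_type is not None and content_type not in repo.content_types:
            continue
        if country is not None and repo.country != country:
            continue
        if certified is not None and repo.certified != certified:
            continue
        if open_access is not None and repo.open_access != open_access:
            continue
        if persistent_ids is not None and repo.persistent_ids != persistent_ids:
            continue
        selected.append(repo)
    return selected


# Made-up catalogue entries, for illustration only.
CATALOGUE = [
    Repository("Archive A", ("Social Sciences",), ("Survey data",), "NLD", True, True, True),
    Repository("Archive B", ("Social Sciences", "Humanities"), ("Images",), "DEU", False, True, False),
    Repository("Archive C", ("Life Sciences",), ("Survey data",), "NLD", True, False, True),
]

matches = filter_repositories(CATALOGUE, subject="Social Sciences",
                              certified=True, open_access=True)
print([repo.name for repo in matches])  # ['Archive A']
```

Leaving a criterion unset keeps the search broad; each criterion you add narrows the list of archives in which you then search for the datasets themselves.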
2. Use a search engine
You can use Google (n.d.) to discover data archives that contain the type of research data you're looking for. In addition to keywords describing the research topic, add keywords such as 'data archive' or 'datasets' to the search. The advantage of this approach is that you look beyond the 'usual suspects', because Google indexes a very large part of the web. The disadvantage is that filtering the results can take time.
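As a small illustration of combining topic keywords with data-related keywords, the following Python sketch builds a search URL. The helper name `build_search_url` and the example keywords are assumptions made for this example; multi-word data keywords are put in quotation marks so that they are searched as a phrase.

```python
from urllib.parse import urlencode


def build_search_url(topic_keywords, data_keywords=("datasets",),
                     base_url="https://www.google.com/search"):
    """Combine topic keywords with data-related keywords into one search URL."""
    # Quote multi-word keywords such as 'data archive' so they are treated as a phrase.
    extra = [f'"{kw}"' if " " in kw else kw for kw in data_keywords]
    query = " ".join(list(topic_keywords) + extra)
    return base_url + "?" + urlencode({"q": query})


print(build_search_url(["soil", "moisture"], ["data archive"]))
# https://www.google.com/search?q=soil+moisture+%22data+archive%22
```

Trying the same topic with different data keywords ('datasets', 'data archive', 'repository') is a cheap way to see which phrasing surfaces archives rather than articles.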
3. Use a metadata aggregator
The following tools let you search the metadata of a selection of data archives. You search in the description of the data and not in the data itself.
- Google Dataset Search
Google Dataset Search (n.d.) locates datasets. The disadvantage is that it is not clear which data archives Google Dataset Search covers. If you don't find suitable data, that doesn't mean that they don't exist; they may simply not be indexed by Google Dataset Search.
- DataCite Search
With DataCite Search (DataCite, n.d.) you search for datasets to which a DOI (Digital Object Identifier) has been assigned.
- DataSearch
With DataSearch (Elsevier, n.d.a.) you can search for certain types of data, such as text, image, audio, slides, software, etc. This tool also allows you to search for supplementary data associated with articles from arXiv and ScienceDirect. The list of indexed data archives can be found under the FAQ (Elsevier, n.d.b.).
- Data Citation Index
The Data Citation Index is a paid service from Clarivate Analytics (n.d.).
- NARCIS
Through NARCIS (DANS, n.d.) you search for datasets that are available from Dutch data archives and repositories.
- B2FIND
B2FIND (EUDAT, n.d.) is a dataset discovery service that searches the metadata of research data collections from EUDAT data centres and other repositories.
- CESSDA Data Catalogue
Through the CESSDA Data Catalogue (CESSDA, n.d.) you can search for social science datasets from the affiliated CESSDA data archives.
- Survey Data Netherlands
In Survey Data Netherlands (n.d.) you can search for Dutch survey data.
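Some metadata aggregators can also be queried programmatically. The sketch below targets DataCite, which offers a public REST API alongside its search page. The endpoint, the `query`, `resource-type-id` and `page[size]` parameters, and the response fields read in `summarise` are written down from memory of that API and should be treated as assumptions to check against the current DataCite documentation; the function names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the DataCite REST API; verify against the DataCite docs.
API_URL = "https://api.datacite.org/dois"


def build_query_url(terms, page_size=5):
    """Build a URL that asks for DOIs of type 'dataset' matching the search terms."""
    params = {"query": terms, "resource-type-id": "dataset", "page[size]": page_size}
    return API_URL + "?" + urlencode(params)


def summarise(response):
    """Extract (DOI, first title) pairs from a decoded JSON response."""
    results = []
    for item in response.get("data", []):
        attrs = item.get("attributes", {})
        titles = attrs.get("titles") or [{}]
        results.append((attrs.get("doi"), titles[0].get("title")))
    return results


def search_datasets(terms, page_size=5):
    """Send the query over the network and return the summarised hits."""
    with urlopen(build_query_url(terms, page_size), timeout=30) as reply:
        return summarise(json.load(reply))
```

Calling `search_datasets("soil moisture")` would send one request and return a short list of DOI and title pairs. As with the search page itself, only the metadata is searched, and only datasets that have been assigned a DOI can be found this way.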
Searching in data archives
If you have found a data archive in which you expect your type of data, you can take the next step: searching in the data archive. In a data archive, you usually search in the metadata assigned to the datasets and not in the actual content of the datasets themselves. Take this into account when formulating your search terms. By examining a few datasets you learn which metadata fields the archive assigns, which gives you hints about appropriate search terms. To fully exploit the search possibilities, have a look at the 'Advanced Search' options.
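The difference between searching metadata and searching the data itself can be made concrete with a toy example in Python. The two dataset records, their field names and the `search_metadata` function are invented for illustration and do not represent any particular archive.

```python
# Made-up dataset records: three metadata fields plus the (unsearched) file contents.
DATASETS = [
    {"title": "Household energy use survey 2015",
     "keywords": ["energy", "households", "survey"],
     "description": "Annual questionnaire on domestic energy consumption.",
     "file_contents": "respondent_id,gas_m3,electricity_kwh"},
    {"title": "Commuting patterns in cities",
     "keywords": ["transport", "mobility"],
     "description": "Travel diaries collected from urban commuters.",
     "file_contents": "person,mode,distance_km"},
]

# Only these fields are searched, mirroring an archive that indexes metadata only.
METADATA_FIELDS = ("title", "keywords", "description")


def search_metadata(datasets, term, fields=METADATA_FIELDS):
    """Return titles of datasets whose metadata contains the term (case-insensitive)."""
    term = term.lower()
    hits = []
    for dataset in datasets:
        for field in fields:
            value = dataset.get(field, "")
            text = " ".join(value) if isinstance(value, list) else value
            if term in text.lower():
                hits.append(dataset["title"])
                break  # one matching field is enough for this dataset
    return hits


print(search_metadata(DATASETS, "energy"))           # ['Household energy use survey 2015']
print(search_metadata(DATASETS, "electricity_kwh"))  # []
```

The second search returns nothing even though the column `electricity_kwh` exists in the first dataset, because that name appears only in the file contents. Search terms therefore work best when they match the vocabulary used in titles, keywords and descriptions.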
How useful is the dataset?
Finding a seemingly suitable dataset is the first step. Assessing whether that dataset is useful for the intended research purposes is an important second step. In the video below you will find a number of helpful criteria (Utrecht University, 2018).
CESSDA (n.d.). CESSDA DC Data Catalogue. https://datacatalogue.cessda.eu/
CESSDA (2018). Data Management Expert Guide. Chapter 7. Discover. https://www.cessda.eu/Training/Training-Resources/Library/Data-Management-Expert-Guide/7.-Discover
Clarivate Analytics (n.d.). The Data Citation Index. http://wokinfo.com/products_tools/multidisciplinary/dci/
DANS (n.d.). NARCIS. https://www.narcis.nl/search/coll/dataset/Language/en
DataCite (n.d.). DataCite. Find, Access and Reuse Data. https://search.datacite.org/
Elsevier (n.d.a.). DataSearch Beta. https://datasearch.elsevier.com/#/
Elsevier (n.d.b.). FAQ. https://datasearch.elsevier.com/faq#/
EUDAT (n.d.). B2FIND. https://eudat.eu/services/b2find
FAIRsharing.org. (n.d.). A curated, informative and educational resource on data and metadata standards, inter-related to databases and data policies. https://fairsharing.org/
Google (n.d.). https://www.google.com
Google Dataset Search Beta (n.d.). https://toolbox.google.com/datasetsearch
Gregory, K., Khalsa, S.J., Michener, W.K., Psomopoulos, F.E., de Waard, A., Wu, M. (2018). Eleven quick tips for finding research data. PLoS Comput Biol 14(4): e1006038. https://doi.org/10.1371/journal.pcbi.1006038
Re3data (n.d.). Registry of research data repositories. https://www.re3data.org/search?query=
Survey Data Netherlands (n.d.). https://www.surveydata.nl/browse-our-data
Utrecht University (2018, July 20). How useful is this dataset? Follow this short tutorial. [video]. https://youtu.be/t1SZutbCAxI
VOGIN (n.d.). VOGIN-cursus Online opsporen van informatie (2 + 3 dagen). https://www.vogin.nl/academie/cursussen/