Data citation is the practice of providing a reference to data in the same way as researchers routinely provide a bibliographic reference to other scholarly resources. | ANDS, 2017
The correct citation of research data - data citation - is seen as one of the most important ways in which research data can be counted as 'first-class research output'. In this section, we will show what other advantages data citation offers, what role persistent identifiers (PIDs) play and what a data citation looks like.
Working on a culture of data citation
The publication of datasets increasingly counts as a citable contribution to a researcher's curriculum vitae. DataCite (n.d.a.) is an important player in building the technical infrastructure that enables data citation. In addition, the research community itself has published two manifestos to point the way: one with data citation principles (FORCE 11, 2014) and one with software citation principles (Smith et al., 2016). These initiatives form the basis for building a culture of data citation (ANDS, n.d.).
Citing research data is part of the altmetrics (alternative metrics) movement (Altmetrics, 2010), which holds that the impact of your research is determined by references to a wide range of research output, such as datasets, software, blog posts and presentations.
Data citation offers the following advantages:
- Makes data easier to find;
- Promotes reproducibility;
- Promotes reuse of data;
- Makes it possible to track the impact of the research data;
- Creates a publication structure that enables long-term availability of data;
- Provides a structure in which the impact of the data can be traced back to the researchers who created the data.
Persistent identifiers and data citation
To be citable, a dataset needs a persistent identifier (PID): a unique label that is linked to a digital object. This means that the object can always be found, even if its name or location changes. A PID thus prevents broken links and 'page not found' errors.
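As a small illustration of how a PID hides the actual location of an object, the sketch below turns a DOI written in one of several common ways into a link to the public DOI resolver at https://doi.org/. The function name `doi_to_url`, the loose validation pattern and the stripping of a trailing full stop are assumptions made for this example; they are not an official DOI syntax check.

```python
import re

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver

# Loose pattern, for illustration only: "10.", a registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways in which a DOI is prefixed in citations and links.
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(raw: str) -> str:
    """Turn a DOI, however it was written, into a resolvable URL."""
    doi = raw.strip()
    for prefix in PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Assumption: a trailing full stop belongs to the citation, not the DOI.
    doi = doi.rstrip(".")
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognisable DOI: {raw!r}")
    return DOI_RESOLVER + doi


print(doi_to_url("doi:10.4121/uuid:32c53005-a4f2-447c-b231-6cdb7dcdd17f."))
```

The resulting link keeps working when the archive moves the dataset, because only the resolver's record of the location has to be updated, not the citation.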
When data are published in a data archive, a PID is automatically assigned to them. A PID is a precondition for the F (Findable) of FAIR data: without a PID, a dataset cannot be found in a sustainable way. A PID is therefore necessary, but not sufficient, for FAIRness. If a dataset has a PID but no machine-readable metadata, it will still be difficult to find unless the PID is already known. It is via the metadata that a dataset is found, and via the PID that it is then located.
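The division of labour between metadata and PID can be sketched with a toy catalogue. Everything in this example is hypothetical: the `Record` class, the two catalogue entries, the `10.1234` identifiers and the `archive.example.org` addresses are placeholders, and a real archive would use a search index and a resolver service rather than a Python list.

```python
from dataclasses import dataclass


@dataclass
class Record:
    pid: str                   # persistent identifier, never changes
    title: str                 # descriptive metadata used for discovery
    keywords: tuple[str, ...]  # more descriptive metadata
    location: str              # current storage location, may change


# Hypothetical catalogue entries; PIDs and locations are placeholders.
CATALOGUE = [
    Record("10.1234/example.1", "Sediment erosion profiles",
           ("erosion", "sediment"), "https://archive.example.org/files/1"),
    Record("10.1234/example.2", "Dark matter halo statistics",
           ("cosmology", "simulation"), "https://archive.example.org/files/2"),
]


def find_pids(query: str) -> list[str]:
    """Discovery step: search the metadata and return the matching PIDs."""
    q = query.lower()
    return [r.pid for r in CATALOGUE
            if q in r.title.lower() or q in r.keywords]


def locate(pid: str) -> str:
    """Resolution step: map a known PID to the current location."""
    for r in CATALOGUE:
        if r.pid == pid:
            return r.location
    raise KeyError(pid)


for pid in find_pids("erosion"):
    print(pid, "->", locate(pid))
```

Without the title and keywords, `find_pids` has nothing to match against, and the dataset can only be reached by someone who already knows its PID.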
In the video below we explain the role of a PID - in this case the DOI (n.d.) - in data citation.
Persistent identifiers describe a kind of endpoint. To be really useful, these endpoints must be connected to each other (Haak et al., 2018). Creating a so-called 'research graph', in which the relationships between data, researchers, publications, research funders, organisational resources, etc. can be seen at a glance, requires more PIDs than those for the research data alone. A well-known PID for individual researchers is the ORCID iD (n.d.).
PIDs act as both unique identifiers and, critically, as connectors. By unambiguously identifying and connecting an individual researcher with their research organisations, professional activities and other contributions, we can be confident that we understand – and can assert – the relationships between each of them. And, by doing so using resolvable PIDs that incorporate FAIR metadata, we also make researchers, their affiliations and their contributions more easily discoverable. | Meadows, 2019
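The idea of PIDs as connectors in a research graph can be sketched with a handful of labelled links. All identifiers below are made-up placeholders, and the relation names (`created`, `cites`, and so on) are assumptions for this example rather than terms from any particular metadata schema.

```python
from collections import defaultdict

# Each edge links two PIDs with a relation label. All identifiers are
# placeholders invented for this sketch.
EDGES = [
    ("orcid:0000-0000-0000-0001", "created", "doi:10.1234/dataset.1"),
    ("orcid:0000-0000-0000-0001", "authored", "doi:10.1234/article.1"),
    ("doi:10.1234/article.1", "cites", "doi:10.1234/dataset.1"),
    ("orcid:0000-0000-0000-0001", "affiliated_with", "org:example-university"),
]


def build_graph(edges):
    """Index the edges in both directions so any PID can be a starting point."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        graph[source].append((relation, target))
        graph[target].append((f"inverse:{relation}", source))
    return graph


def neighbours(graph, pid):
    """Return everything that is directly connected to one PID."""
    return sorted(graph.get(pid, []))


GRAPH = build_graph(EDGES)

# Starting from the dataset, find who created it and which article cites it.
for relation, other in neighbours(GRAPH, "doi:10.1234/dataset.1"):
    print(relation, other)
```

Because every node is an unambiguous identifier, links recorded by different parties (an archive, a publisher, a funder) can be merged into one graph without guessing whether two names refer to the same person or dataset.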
Altmetrics (2010). Altmetrics: a manifesto. http://altmetrics.org/manifesto/
ANDS (n.d.). Building a culture of data citation. https://www.ands.org.au/__data/assets/pdf_file/0003/383025/data_citation_poster.pdf
ANDS (2017). Data citation. ANDS Guide. Awareness. https://www.ands.org.au/__data/assets/pdf_file/0005/724334/Data-citation.pdf
DANS (n.d.). Resolve identifier. http://www.persistent-identifier.nl/
DataCite (n.d.a.). https://datacite.org/
DataCite (n.d.b.). DataCite MDS API. https://mds.datacite.org/
DataCite (n.d.c.). DataCite - Cite Your Data. http://www.datacite.org.s3-website-eu-west-1.amazonaws.com/cite-your-data.html
DataCite (2019, August 16). DataCite Metadata Schema. Metadata Schema 4.4. https://schema.datacite.org/
DPC (n.d.). Persistent identifiers. https://dpconline.org/handbook/technical-solutions-and-tools/persistent-identifiers
Delft University of Technology (n.d.). DataCite Netherlands. https://www.tudelft.nl/en/library/support/datacite-netherlands/
DOI (n.d.). https://www.doi.org/
Figshare (n.d.). https://figshare.com/
FORCE 11 (2014). Joint Declaration of Data Citation Principles - Final. https://www.force11.org/datacitationprinciples
FREYA (n.d.). The FREYA project. https://www.project-freya.eu/en/about/mission
GitHub (2016). Making your code citable. https://guides.github.com/activities/citable-code/
Haak, L., Meadows, A., Brown, J. (2018). Using ORCID, DOI, and Other Open Identifiers in Research Evaluation. Front. Res. Metr. Anal. 3:28. https://doi.org/10.3389/frma.2018.00028
Ishiyama, T., Rieder, S., Makino, J., Zwart, S.P., Groen, D., Nitadori, K., Laat, C. de, McMillan, S., Hiraki, K., Harfst, S. (2011). The Cosmogrid Simulation: Statistical Properties of Small Dark Matter Halos (2048-103). Leiden University. 10.25606/SURF.578c6039-0bf84511
Keen, A.S. (2011). Erosive Bar Migration Using Density and Diameter Scaled Sediment Erosive Profile Set-Prototype Scale (Actual Scale 1:10). TU Delft. doi:10.4121/uuid:32c53005-a4f2-447c-b231-6cdb7dcdd17f
Meadows, A., Haak, L.L., Brown, J. (2019). Persistent Identifiers: The Building Blocks of the Research Information Infrastructure. Insights 32(1): 9. http://doi.org/10.1629/uksg.457
Netwerk Digitaal Erfgoed (n.d.). https://www.pidwijzer.nl/en/pid_results/new
ORCID (n.d.). Register for an ORCID iD. https://orcid.org/register
PID Forum (n.d.). https://www.pidforum.org/
Smith, A.M., Katz, D.S., Niemeyer, K.E., FORCE11 Software Citation Working Group (2016). Software Citation Principles. PeerJ Computer Science 2:e86. https://doi.org/10.7717/peerj-cs.86
SURF (n.d.a.). SURF Data Archive. https://www.surf.nl/en/secure-long-term-storage-with-data-archive
SURF (n.d.b.). Data Persistent Identifier: data always findable by permanent references. https://www.surf.nl/en/data-persistent-identifier-data-always-findable-by-permanent-references
SURF (n.d.c.). SURF Data Repository. https://repository.surfsara.nl/
Zenodo (n.d.a.). https://zenodo.org/
Zenodo (n.d.b.). Zenodo Communities. https://zenodo.org/communities/