I totally agree that we, as data supporters, should expect the question ‘What is in it for me?’. It might indeed be useful to have testimonials at your disposal, but I would rather have some ‘evidence’ in the form of scientific papers (or data, of course) that - preferably - shows a citation advantage for data sharing | Student Essentials 4 Data Support
As a data supporter, you advise a researcher about measures to work with integrity and reproducibility and to deliver FAIR data. You can expect the question "What is in it for me?". If you are asked this question, it can help to have testimonials and scientific evidence at your disposal that show how other researchers have benefited. In this section, we have prepared a list for you.
Researchers having their say
Below you will find a number of testimonials from researchers. Why do they consider reproducibility, open science and FAIR data management important?
- Open science
The following video, which is part of the Open Science MOOC (n.d.), gives a number of researchers a chance to talk about what open science means to them in practice.
- Working reproducibly
Have a look at Patrick Vandewalle's experiences with reproducible working, including the sharing of software and code.
Are you curious about the tools Patrick spoke about?
- Vandewalle, P., Kovacevic, J., Vetterli, M. (2009). Reproducible Research in Signal Processing - What, why, and how. IEEE Signal Processing Magazine, 26 (3). pp. 37-47.
- Vandewalle, P. (2012). Code Sharing is Associated with Research Impact in Image Processing. IEEE Computing in Science and Engineering, 14 (4). pp. 42-47.
- Publishing research data
The following showcases illustrate why researchers choose to publish their data in a data archive, in this case 4TU.Centre for Research Data (n.d.).
The science that speaks
Part of the answer to the question of what motivates researchers (other than the one in front of you) to share data can be found in the scientific literature. Several studies (e.g. Van den Eynden & Bishop, 2014; Digital Science, 2017; Houtkoop, 2018; Zuiderwijk & Spiers, 2019) independently point to three main reasons:
1. Sharing data leads to increased visibility and a citation advantage
Since as early as 2007, studies have shown that publishing research data leads to increased visibility, reuse and citation, and thus to recognition of scientific work (Piwowar, 2007). The following studies confirm this effect:
- Social sciences (Pienta, 2010)
- Genetics (Piwowar & Vision, 2013; Botstein, 2010)
- Astronomy (Henneken & Accomazzi, 2011; Dorch, 2012)
- Oceanography (Belter, 2014)
Research by Colavizza et al. (2019) shows that the citation advantage is greater if the underlying data is actually published in a data archive and not just as supplementary material.
2. Sharing data is good for science itself
The publication of data has direct benefits for the research itself, for the scientific discipline and for science in general, because it enables new collaborations and new types of use of existing data. The sharing of data is also a prerequisite for verifying or reproducing research, and this, in turn, builds trust in science. In addition, fewer resources and less time are wasted when data is reused.
Although it is theoretically very plausible that data sharing is good for science, in practice examples of reuse are easier to find in the big sciences, where an infrastructure often already exists and where responsible data management is of paramount importance. Think, for example, of the Hubble Telescope (Hubblesite, n.d.). The observations carried out with this telescope cost a lot of money and can only be done once. The Hubble Telescope data is reused on a large scale (NASA, 2011a).
- When reanalyzing old data, NASA researchers discovered a new planet (NASA, 2011b).
- When reanalyzing RNA sequence data, researchers discovered so-called fusion genes (Kangaspeska, 2012).
- Researchers were able to draw conclusions about the past climate from old records of the locations where whales were caught in the Antarctic (de la Mare, 1997).
- Since the end of 2010, DANS EASY has had a form of peer review. The reviews of one of the reviewed datasets (DANS, n.d.) show that 3 out of 8 users intend to use the data for a publication of their own.
Examples of the reuse of research data can be an incentive for researchers to make their research data available as well. However, the reuse of data is still insufficiently mapped (Pasquetto, 2017).
3. Third parties require it
Research funders and publishers have a significant influence on the sharing of research data. See the section on data policy for more information.
Research data published from 2007 onwards have gradually attracted more citations reflecting a bias towards more recent research data which might be due to the awareness of and demand for research data reuse | Fecher, 2015
4TU.Centre for Research Data (n.d.). https://researchdata.4tu.nl/en/
Belter, C.W. (2014). Measuring the Value of Research Data: A Citation Analysis of Oceanographic Data Sets. PLoS ONE 9(3): e92590. https://doi.org/10.1371/journal.pone.0092590
Botstein, D. (2010). It’s the data! Molecular Biology of the Cell, 21(1), pp.4–6. https://doi.org/10.1091/mbc.E09-07-0575
Digital Science. Hahnel, M., Treadway, J., Fane, B., Kiley, R., Peters, D., Baynes, G. (2017). The State of Open Data Report 2017. https://doi.org/10.6084/m9.figshare.5481187.v1
DANS (n.d.). Detailed reactions for 'Bestand Bodemgebruik 2006 - BBG'06'. Retrieved from http://datareviews.dans.knaw.nl/details.php?l=en&pid=urn:nbn:nl:ui:13-0an-1ei
DANS (2013). Data delen: goed voor de wetenschap, goed voor u. [video] https://youtu.be/DLt0xLyMEVw
Dorch, B. (2012). On the citation advantage of linking to data: Astrophysics. http://hprints.org/docs/00/71/47/34/PDF/Dorch_2012a.pdf
Fecher, B., Friesike, S., & Hebing, M. (2015). What drives academic data sharing? PLoS One, 10(2), e0118053. https://doi.org/10.1371/journal.pone.0118053
Goodman, A., et al. (2014). 10 Simple Rules for the Care and Feeding of Scientific Data. Retrieved from http://arxiv.org/pdf/1401.2134v1.pdf
Google Dataset Search (n.d.). https://toolbox.google.com/datasetsearch/
Henneken, E.A. & Accomazzi, A. (2011). Linking to data - effect on citation rates in astronomy, Digital Libraries; Instrumentation; Methods for Astrophysics. http://arxiv.org/abs/1111.3618v1
Houtkoop, B.L., Chambers, C., Macleod, M., Bishop, D.V.M., Nichols, T.E., Wagenmakers, E-J. (2018). Data sharing in psychology: A survey on barriers and preconditions. https://doi.org/10.1177/2515245917751886
Hubble Site (n.d.). Retrieved from hubblesite.org
Kangaspeska, S., Hultsch, S., Edgren, H., Nicorici, D., Murumägi, A., Kallioniemi, O. (2012). Reanalysis of RNA-Sequencing Data Reveals Several Additional Fusion Genes with Multiple Isoforms. PLoS ONE 7(10): e48745. https://doi.org/10.1371/journal.pone.0048745
Mare, W.K. de la. (1997). Abrupt mid-twentieth century decline in Antarctic sea-ice extent from whaling records. Nature, 389, 57-60. https://doi.org/10.1038/37956
NASA. (2011a). Hubble racks up 10,000 science papers [News] https://hubblesite.org/contents/news-releases/2011/news-2011-40.html
NASA. (2011b). Astronomers find elusive planets in decade-old Hubble data. http://www.nasa.gov/mission_pages/hubble/science/elusive-planets.html
Pasquetto, I.V., Randles, B.M., Borgman, C.L. (2017). On the Reuse of Scientific Data. Data Science Journal, 16, p.8. https://doi.org/10.5334/dsj-2017-008
Pienta, A.M., Alter, G. C. & Lyle, J.A. (2010). The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data. Retrieved from http://hdl.handle.net/2027.42/78307
Piwowar, H.A., Day, R.S., Fridsma, D.B. (2007). Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. https://doi.org/10.1371/journal.pone.0000308
Piwowar, H.A., Vision, T.J. (2013). Data reuse and the open data citation advantage. PeerJ 1:e175. https://doi.org/10.7717/peerj.175
Runmycode (n.d.). Retrieved from http://www.runmycode.org/
Research Compendia (n.d.). See notice on http://www.re3data.org/repository/r3d100010758
SPARC Europe (n.d.). European Open Data Champions. https://openscholarchampions.eu/opendata/
Utrecht University (n.d.). RDM Support. RDM Stories. https://www.uu.nl/en/research/research-data-management/rdm-stories
Van den Eynden, V., Knight, G., Vlad, A., Radler, B., Tenopir, C., Leon, D. et al. (2016): Survey of Wellcome researchers and their attitudes to open research. figshare. Paper. https://doi.org/10.6084/m9.figshare.4055448.v1
Vandewalle, P., Kovacevic, J., Vetterli, M. (2009). Reproducible Research in Signal Processing - What, why, and how. IEEE Signal Processing Magazine, 26 (3). pp. 37-47. https://doi.org/10.1109/MSP.2009.932122
Vandewalle, P. (2012). Code Sharing is Associated with Research Impact in Image Processing. IEEE Computing in Science and Engineering, 14 (4). pp. 42-47. https://doi.org/10.1109/MCSE.2012.63
Colavizza, G., Hrynaszkiewicz, I., Staden, I., Whitaker, K., McGillivray, B. (2019). The citation advantage of linking publications to research data. https://arxiv.org/pdf/1907.02565.pdf
Zuiderwijk, A., Spiers, H. (2019). Sharing and re-using open data: A case study of motivations in astrophysics. International Journal of Information Management. Volume 49, pages 228-241. https://doi.org/10.1016/j.ijinfomgt.2019.05.024