The variety of organisations and perspectives on data has led to different definitions of the same terms. In this course we use the definitions below.
|Data archive||A data archive is a facility that moves data to an environment for long-term retention. A data archive is indexed and has search facilities, enabling data to be retrieved.|
|Data format||The way in which data or information is coded and stored. A data format (or file format) gives information on how to process the data.|
|Data backup||A copy of the data for the purpose of creating a duplicate dataset.|
|Data lab||A data lab is a virtual research environment that enables researchers to organize and share their research data and related output during their research project.|
|Data management plan||A written agreement describing the research project, the type and volume of data produced and stating which data will be saved, how they will be saved (file format, version control, metadata), whether and when data will be submitted to a repository and under which terms. If necessary, it describes the tools (hardware and software) that are required to (re)use the data.|
|Data provenance||Data provenance is a historical record of the data and its origins. It refers to the process of tracing and recording the origins of data and its movement between databases. (see http://db.cis.upenn.edu/DL/fsttcs.pdf).|
|Data repository||A general term for a location to store data. A data repository with a policy for long-term preservation is called a data archive.|
|Data sharing policy||An institutional policy concerning the sharing of research data. It is often written as a letter of intent declaring that research data will be submitted to dedicated repositories as soon as possible, complying with international data and exchange formats.|
|Datatweeps||People tweeting about data|
|DOI||The digital object identifier (DOI) is a unique and stable identifier that ensures that a digital object can be permanently found on the World Wide Web, regardless of changes in the URL where the object is found. A central registry ensures that the user of a DOI will be referred to its current location. (see e.g. http://www.datacite.org/).|
|Data seal of approval||An archive holding a Data Seal of Approval (DSA) complies with requirements ensuring that in the future, research data can still be processed in a high-quality and reliable manner. (see http://www.datasealofapproval.org).|
|Linked data||A term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information and knowledge on the Semantic Web using RDF. Linked data refers to data published on the web in such a way that it is machine-readable, that its meaning is explicitly defined, that it is linked to other external data sets, and that in turn it can be linked to from external data sets. (see http://linkeddata.org).|
|Open data||A piece of data or content is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike. (see http://okfn.org/opendata/).|
|Persistent identifier||A unique code that is coupled to a digital object. With this code, the object can be identified even when the object is moved to a different location. The DOI and the URN:NBN are examples of persistent identifiers.|
|Preferred format||A file format that, to the best of our current knowledge, has the best chance of being usable in the (far) future.|
|Preservation||When speaking about preservation, two perspectives are distinguished. Keeping data in its present shape means protecting data from incidental loss and making data findable through proper metadata. Long-term preservation adds the task of changing the data format in a reliable way and being accountable for all manipulations, in order to keep the data in a shape that is demanded by future software or future working practices of the designated community.|
|RDF||RDF is a standard model for data interchange on the Web (see www.w3.org/RDF/).|
|Research data||Data are facts, observations or experiences on which an argument or theory is based. (see http://ands.org.au/guides/what-is-research-data.pdf).|
|Resolver||A system that provides the link between a persistent identifier and the location where the object is currently situated.|
|Text- and data mining||The computer-based process of deriving or organising information from text or data. It works by copying large quantities of material, extracting the data, and recombining it to identify patterns, trends and hypotheses or by providing the means to organise the information mined. (see www.ipo.gov.uk/ipreview-doc-t.pdf).|
|Trusted digital repository (TDR)||A certified digital repository that has been set up to provide reliable, sustainable access to the data deposited. TDRs may be certified at three levels:|
|URN:NBN||A unique and stable identifier (persistent identifier) that ensures that a digital object can be permanently found on the World Wide Web, regardless of changes in the URL where the object is found. A central registry (see www.persistent-identifier.nl) ensures that the user of a URN will be referred to its current location. This persistent identifier is based on the Uniform Resource Name (URN), the National Bibliography Number (NBN), a country code and a unique string.|
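The definitions of persistent identifier, resolver and URN:NBN can be illustrated with a small sketch. It is a toy, in-memory example and not how any real registry is implemented. The `Resolver` class, the `split_urn_nbn` helper, the identifier `urn:nbn:nl:example-0001` and both locations are assumptions invented for this illustration. The sketch also assumes that the four parts of a URN:NBN are separated by colons.

```python
class Resolver:
    """Toy in-memory resolver (illustrative only, not a real resolver service)."""

    def __init__(self):
        # Maps a persistent identifier to the object's current location.
        self._locations = {}

    def register(self, identifier, location):
        # Re-registering an identifier replaces its location;
        # the identifier itself never changes.
        self._locations[identifier] = location

    def resolve(self, identifier):
        # Raises KeyError if the identifier was never registered.
        return self._locations[identifier]


def split_urn_nbn(identifier):
    """Split an identifier of the assumed form urn:nbn:<country code>:<unique string>."""
    scheme, namespace, country, unique = identifier.split(":", 3)
    if scheme.lower() != "urn" or namespace.lower() != "nbn":
        raise ValueError(f"not a URN:NBN identifier: {identifier!r}")
    return scheme, namespace, country, unique


# Hypothetical identifier and locations, invented for this example.
pid = "urn:nbn:nl:example-0001"
resolver = Resolver()
resolver.register(pid, "https://old.example.org/dataset/1")
resolver.register(pid, "https://new.example.org/archive/dataset/1")  # the object moved
print(resolver.resolve(pid))
print(split_urn_nbn(pid))
```

The point of the sketch is that the identifier cited by a researcher stays the same when the object moves. Only the entry held by the resolver is updated.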
Eric Rumondor - I am gaining more insight into what a Trusted Digital Repository is, what the DSA does, and the role of ISO 16363 and the external audit.
Research Data Netherlands - A definition of "digital curation":
Digital curation is a series of repository activities including ingest, data management, (archival) storage, preservation planning, access, common services, and repository administration, as well as pre-repository activities (production and pre-ingest (e.g. appraisal, selection, preparation, rights)), post-repository activities, and management activities.
Marcel Leermakers - Good definitions that make a number of things clear. I certainly learned something from this section.
Sindiswa Sota - Some jargon is self explanatory, that makes it easy to understand. Some, obviously it is something that will be learned and understood during this process of getting to grips with RDM.