Data Selection

When selecting research data for archiving in a research data center, the intended purpose of reuse is crucial. Possible purposes that can guide the selection include:

  • For further publications: referenced (i.e., processed) data with additional documentation (metadata) are necessary.
  • For teaching: samples of original data and compiled data, including analysis steps, are useful.
  • For verification of research results: referenced data, including the analysis steps, should be archived so that all results can be traced.
  • For further analyses: if possible, all original data should be archived.

Data Evaluation

In the end, researchers and data curators must decide for themselves which data are actually relevant for potential reuse. The following checklist can help researchers decide whether data are worth archiving:

Checklist for clarifying the archivability of research data
  • Are there any third-party requirements (e.g., from research funding agencies, data policies, guidelines of the research institution) that make long-term storage necessary?
  • Do the data donors have the necessary rights to share the data? Under what conditions do they “own” the data?
  • Were the data collected only once and are they not reproducible, or are the costs of reproduction higher than the costs of long-term retention?
  • Is re-collection of the data unlikely to provide better results?
  • Is there strong interest in reusing the research data?
  • Have the data not yet been fully (scientifically) studied?
  • Are the data characteristic or atypical of a research area, or are they unique research findings?
  • Do the data have general or regional historical significance?
  • Are the data of good quality, both technically and in terms of content?
  • Are the descriptive metadata complete, or can they be generated?
  • Can the necessary preservation metadata (reference, provenance, context, and persistence information, as well as information on access rights) be provided?
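
For teams that want to record their answers in a structured way, the checklist above could be kept as a small script. The following is only a minimal sketch: the criterion keys and the review function are hypothetical names chosen for illustration, and the source does not prescribe any scoring rule or threshold, so the sketch merely groups the criteria by answer and leaves the decision to researchers and data curators.

```python
# Hypothetical short keys for the eleven checklist questions above.
# The keys and the wording of the labels are assumptions made for this sketch.
CHECKLIST = {
    "third_party_requirements": "Third-party requirements make long-term storage necessary",
    "rights_to_share": "Data donors hold the rights needed to share the data",
    "not_reproducible": "Data are not reproducible, or reproduction costs exceed retention costs",
    "recollection_not_better": "Re-collection is unlikely to provide better results",
    "reuse_interest": "There is strong interest in reusing the data",
    "not_fully_studied": "Data have not yet been fully studied",
    "characteristic_or_unique": "Data are characteristic, atypical, or unique for the research area",
    "historical_significance": "Data have general or regional historical significance",
    "good_quality": "Data quality is good technically and in terms of content",
    "descriptive_metadata": "Descriptive metadata are complete or can be generated",
    "preservation_metadata": "Preservation metadata can be provided",
}


def review(answers: dict[str, bool]) -> dict[str, list[str]]:
    """Group checklist criteria into 'yes', 'no', and 'open' (unanswered).

    No score or threshold is computed; the result only supports the
    human decision on whether the data are worth archiving.
    """
    unknown = set(answers) - set(CHECKLIST)
    if unknown:
        raise KeyError(f"Unknown criteria: {sorted(unknown)}")

    result: dict[str, list[str]] = {"yes": [], "no": [], "open": []}
    for key in CHECKLIST:
        if key not in answers:
            result["open"].append(key)
        elif answers[key]:
            result["yes"].append(key)
        else:
            result["no"].append(key)
    return result


if __name__ == "__main__":
    # Example answers for a fictional dataset.
    summary = review({"third_party_requirements": True, "rights_to_share": False})
    for group, keys in summary.items():
        print(group)
        for key in keys:
            print("  -", CHECKLIST[key])
```

Listing unanswered criteria separately keeps open questions visible instead of silently counting them as "no".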

[Source: Weber, Andreas and Piesche, Claudia. “4.2 Datenspeicherung, -kuration und Langzeitverfügbarkeit”. Praxishandbuch Forschungsdatenmanagement, edited by Markus Putnings, Heike Neuroth and Janna Neumann, Berlin, Boston: De Gruyter Saur, 2021, pp. 327-356. https://doi.org/10.1515/9783110657807-019 (translated by KonsortSWD)]