RDM Consultation: Organizing Research Data

Developing and using a clear structure is the first step towards efficient data management. Researchers who are new to a research group should first find out whether the group already has an established structure. One effective strategy for organizing data is a folder structure. Each folder should contain structurally or contextually related data and should be named appropriately.

Tips for a Folder Structure

  • Organize data in a hierarchical folder structure (e.g., Project → Data, Output, Code, Paper)
  • Use systematic, content-related folder names that are understandable to others
  • Avoid using more than three levels of subfolders
  • Maintain consistent file naming conventions, for instance for dates: YYYYMMDD_Name
  • Avoid special characters, capital letters, spaces, and periods in file names (apart from the period before the file extension)
  • After the project is completed, assess which files need to be archived and which can be deleted
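
The tips above can be sketched as a small Python script. This is only an illustration under stated assumptions, not a prescribed tool: the folder names, the function names, the reading of YYYYMMDD_Name as "eight digits, an underscore, a lowercase name, one extension", and the way nesting levels are counted (from the project folder downwards) are all hypothetical choices.

```python
from __future__ import annotations

import re
import tempfile
from pathlib import Path

# Hypothetical top-level folders, taken from the example hierarchy in the tips.
SUBFOLDERS = ("data", "output", "code", "paper")

# Assumed reading of YYYYMMDD_Name: eight digits, an underscore, then lowercase
# letters, digits, underscores or hyphens, then exactly one file extension.
# The pattern checks the shape only; it does not verify that the date exists.
NAME_PATTERN = re.compile(r"\d{8}_[a-z0-9_-]+\.[a-z0-9]+")


def create_project(root: Path, name: str) -> Path:
    """Create <root>/<name> with one subfolder per entry in SUBFOLDERS."""
    project = root / name
    for sub in SUBFOLDERS:
        (project / sub).mkdir(parents=True, exist_ok=True)
    return project


def is_valid_file_name(file_name: str) -> bool:
    """Return True if the whole name matches the assumed naming pattern."""
    return NAME_PATTERN.fullmatch(file_name) is not None


def too_deep(project: Path, max_levels: int = 3) -> list[Path]:
    """List folders nested more than max_levels below the project folder."""
    return [
        path
        for path in project.rglob("*")
        if path.is_dir() and len(path.relative_to(project).parts) > max_levels
    ]


if __name__ == "__main__":
    # A temporary directory keeps the demonstration away from real data.
    with tempfile.TemporaryDirectory() as tmp:
        project = create_project(Path(tmp), "my_project")
        print(sorted(p.name for p in project.iterdir()))
        print(is_valid_file_name("20240115_interview_notes.csv"))
        print(is_valid_file_name("Final Report (2).docx"))
        print(too_deep(project))
```

A check like this could be run before archiving to spot file names or folder depths that drift from the agreed convention; the pattern should be adapted to whatever convention the research group actually uses.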

Version Control

Handling individual files and datasets efficiently is just as important as organizing data effectively in folders. This is especially true if datasets change in the course of a research project. A functional version control system is therefore an important component of an organizational strategy.

Structured version control of files can be maintained in a separate version control document that records each version of a file together with relevant parameters (e.g., “Date of Last Modification,” “Modified By,” “Relevant Changes”).
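
As a minimal sketch, such a version control document could be kept as a CSV file that receives one row per file version. The file name version_log.csv, the column names, and the function names are hypothetical; only the three quoted parameters come from the text above.

```python
from __future__ import annotations

import csv
import tempfile
from datetime import date
from pathlib import Path

# Columns: the parameters named in the text, plus the file and its version label.
FIELDS = [
    "file",
    "version",
    "date_of_last_modification",
    "modified_by",
    "relevant_changes",
]


def log_version(log_path: Path, file_name: str, version: str, modified_by: str,
                changes: str, modified_on: date | None = None) -> None:
    """Append one row to the log, writing the header if the file is new."""
    is_new = not log_path.exists()
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "file": file_name,
            "version": version,
            "date_of_last_modification": (modified_on or date.today()).isoformat(),
            "modified_by": modified_by,
            "relevant_changes": changes,
        })


def read_log(log_path: Path) -> list[dict[str, str]]:
    """Return all logged versions as a list of dictionaries."""
    with log_path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "version_log.csv"
        log_version(log, "survey_data.csv", "v1", "A. Researcher", "Initial import")
        log_version(log, "survey_data.csv", "v1_01", "A. Researcher", "Fixed column labels")
        for row in read_log(log):
            print(row["version"], row["relevant_changes"])
```

A plain CSV file is used here because it opens in any spreadsheet program; a spreadsheet or text document maintained by hand serves the same purpose.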

Documents can also be versioned in their file names: whole numbers mark significant changes, and a number appended with an underscore marks a minor alteration (e.g., v1, v2, v1_01). Labels such as “final,” “final2,” or “revision” are not recommended, as it is easy to lose track of them.
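
The numbering scheme can be illustrated with a short Python sketch that parses labels, sorts them, and derives the next label. The function names are hypothetical, and two details are assumptions rather than part of the scheme as described: a label without a minor part (v1) is treated as minor number 0, and minor numbers are zero-padded to two digits.

```python
from __future__ import annotations

import re

# "v" + major number, optionally followed by "_" + minor number.
VERSION_PATTERN = re.compile(r"v(\d+)(?:_(\d+))?")


def parse_version(label: str) -> tuple[int, int]:
    """Split a label such as 'v1_01' into (major, minor) integers."""
    match = VERSION_PATTERN.fullmatch(label)
    if match is None:
        raise ValueError(f"not a version label: {label!r}")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)  # assumption: 'v1' means minor 0
    return major, minor


def next_minor(label: str) -> str:
    """Label for the next minor alteration, e.g. 'v1' -> 'v1_01'."""
    major, minor = parse_version(label)
    return f"v{major}_{minor + 1:02d}"


def next_major(label: str) -> str:
    """Label for the next significant change, e.g. 'v1_03' -> 'v2'."""
    major, _ = parse_version(label)
    return f"v{major + 1}"


if __name__ == "__main__":
    labels = ["v2", "v1_01", "v1", "v1_10", "v1_02"]
    # Sorting by the parsed numbers avoids the pitfalls of plain text sorting.
    print(sorted(labels, key=parse_version))
    print(next_minor("v1"), next_major("v1_03"))
    try:
        parse_version("final2")
    except ValueError as error:
        print(error)
```

Labels such as “final2” are rejected by the parser, which reflects the point made above: they carry no order that a person or a program can rely on.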

There are several free version control software tools that can be used for research data organization. Using version control software is especially advisable if you have to deal with large amounts of data.

Ghent University Data Stewards (2020) provide two videos that address data documentation and file organization.

Collaborative Work

Documents and data often need to be managed in a controlled and organized manner across multiple organizations or research institutions. Various options and tools are available to researchers today. The following requirements should be met by a collaborative environment:

  • Storage, exchange, and backup of files
  • Organization of files into folders
  • An access control system that manages authentication and authorization
  • Version control of files
  • Simultaneous work by several users on the same file
  • Ideally, a discussion platform

Be careful when storing data in a cloud. Before research documents are stored there, the terms of use should always be checked with regard to data security and ownership rights.

In addition to well-known cloud services, it is possible to establish a group drive within an institution or use a virtual research environment (e.g., MS SharePoint). Recommended tools can be found here, for example.

The German Network of Educational Research Data offers a comprehensive overview of relevant data organization topics on its website.