Working with research data carries responsibility. This is particularly true in the social, behavioral, educational, and economic sciences, where people are often the subject of research and sensitive personal data therefore form its core. In this context, various ethical principles can come into conflict. This issue is presented in our article on research ethics.
Legal requirements, such as data protection and copyright laws, also play a central role in research data management for data curators; they regulate the handling of research data at EU, federal, or state level. One of the key responsibilities of data managers is not only to be aware of these aspects, but also to actively work towards compliance with them, to recognize risks and violations, and to develop procedures for adhering to these requirements in their institutions.
When ingesting the selected research data into your data center, the anonymization and pseudonymization of sensitive data become relevant as part of the necessary data processing.
Anonymization | Anonymization refers to the process of altering personally identifiable data to ensure that no direct or indirect identification of a specific individual is possible. Anonymization involves removing or modifying any information that could lead to the identification of a person. The goal is to modify the data in such a way that it can no longer be attributed to a specific individual. |
Pseudonymization | Pseudonymization is the process of replacing personally identifiable data with pseudonyms or codes to make direct identification of a person more difficult. The link between the pseudonyms and actual identities is stored in a separate table or database accessible only to authorized personnel. |
[Source: Glossary | Practice in Short | Research Data and Research Data Management]
Both anonymization and pseudonymization are methods used to protect privacy and ensure data protection. In the social sciences, they are employed to ensure that sensitive data cannot be used to identify individuals or violate their privacy. These measures minimize the risk of unauthorized disclosure or misuse of personal data.
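The pseudonymization described in the glossary can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed procedure: the function name `pseudonymize`, the field name `name`, the `P-` code format, and the sample records are all hypothetical. The sketch replaces each identifier with a random code and returns the link between codes and identities as a separate key table.

```python
import secrets


def pseudonymize(records, id_field="name"):
    """Replace the value of id_field with a random code (hypothetical helper).

    Returns (pseudonymized_records, key_table). The key table maps each
    pseudonym back to the original value and is meant to be stored
    separately, accessible only to authorized personnel.
    """
    pseudonym_for = {}  # original value -> pseudonym
    key_table = {}      # pseudonym -> original value
    pseudonymized = []
    for record in records:
        original = record[id_field]
        if original not in pseudonym_for:
            # Random code, so the pseudonym itself reveals nothing.
            code = "P-" + secrets.token_hex(4)
            while code in key_table:  # regenerate in the unlikely case of a collision
                code = "P-" + secrets.token_hex(4)
            pseudonym_for[original] = code
            key_table[code] = original
        cleaned = dict(record)  # copy, so the input records stay unchanged
        cleaned[id_field] = pseudonym_for[original]
        pseudonymized.append(cleaned)
    return pseudonymized, key_table


# Made-up sample data: the same person appears twice.
interviews = [
    {"name": "Alice Example", "answer": "agree"},
    {"name": "Bob Example", "answer": "disagree"},
    {"name": "Alice Example", "answer": "neutral"},
]
shareable, key_table = pseudonymize(interviews)
```

Only `shareable` would be passed on for analysis. Because the same person always receives the same code, records can still be linked within the dataset, which is why pseudonymized data remain re-identifiable for anyone who holds `key_table`.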
Common strategies for anonymizing sensitive data include:
It is important to note that the choice of anonymization strategy depends on various factors, such as specific research context, data protection regulations, and data analysis requirements. It is advisable to adhere to applicable data protection policies and laws and to consult with data privacy experts if necessary.
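As a rough illustration of the "removing or modifying" named in the glossary definition of anonymization, the following Python sketch drops fields that identify a person directly and coarsens an exact age into a band. The field names, the set `DIRECT_IDENTIFIERS`, the function `anonymize`, and the band width are assumptions made for this example. Whether such steps are sufficient for a given dataset depends on the context and should be assessed case by case.

```python
# Hypothetical field names; adapt to the dataset at hand.
DIRECT_IDENTIFIERS = {"name", "email"}


def anonymize(record, age_band=10):
    """Return a copy of record without direct identifiers and with age coarsened."""
    # Remove information that identifies a person directly.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Modify an indirect identifier: replace the exact age with a band.
    if "age" in cleaned:
        lower = (cleaned["age"] // age_band) * age_band
        cleaned["age"] = f"{lower}-{lower + age_band - 1}"
    return cleaned


respondent = {
    "name": "Alice Example",
    "email": "alice@example.org",
    "age": 34,
    "answer": "agree",
}
print(anonymize(respondent))  # {'age': '30-39', 'answer': 'agree'}
```

Unlike pseudonymization, no key table is kept here, so the removed values cannot be restored from the output.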
The Consortium for Research Data in Education provides helpful working papers that examine the anonymization of both qualitative and quantitative research data. The free anonymization tool QualiAnon supports the anonymization and pseudonymization of text data.
Finally, for data whose comprehensive anonymization cannot be sufficiently guaranteed, there is the option of institutionally implemented access controls.
One example is the German Psychological Society (DGPs) recommendations on the handling of research data. The RDC at ZPID has implemented these recommendations in the form of an Access Class Model: for data with special requirements regarding data protection and research ethics, ZPID offers various data release levels.