RDM Consultation: Persistent Identifiers

Persistent Identifiers (PIDs) act as an intermediary layer between a reference and an object, decoupling the object from its electronic location. A PID does not refer to a location on the internet, but to the object itself, such as a dataset. This reduces the number of broken links (Error 404: Page not found) and keeps references stable even when the data changes location.

In contrast, URLs (Uniform Resource Locators) do not refer to specific content but to a location on the internet. If the desired content, such as a scientific dataset, is moved to a different location, the URL becomes useless for locating it. Furthermore, datasets are often published in multiple locations on the internet, resulting in several URLs pointing to the same dataset, which is impractical for reliable scholarly citation. Finally, URLs often contain semantic references to the domain on which they are based and are therefore not suitable as neutral identifiers.

For these reasons, the concept of persistent identifiers was developed, and in recent years it has become a widely accepted standard for identifying digital objects. Deleting a PID is technically possible, but it is not intended and should be avoided in practice. Occasionally, an object may need to be deleted for data protection or copyright reasons; in that case, the metadata (information about the data) associated with the object remains intact and discoverable.
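The indirection described above can be illustrated with a small sketch. It is a toy in-memory registry, not the implementation of any real PID service: the class and method names (`PidRegistry`, `move`, `withdraw`), the identifier, and the `example.org` URLs are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PidRecord:
    """What a PID points to: descriptive metadata plus the current location."""
    metadata: dict
    url: Optional[str]  # None once the object itself has been withdrawn


class PidRegistry:
    """Hypothetical registry mapping PIDs to records (illustration only)."""

    def __init__(self):
        self._records = {}

    def register(self, pid, url, metadata):
        self._records[pid] = PidRecord(metadata=metadata, url=url)

    def move(self, pid, new_url):
        # The object changed location; the PID used in citations stays the same.
        self._records[pid].url = new_url

    def withdraw(self, pid):
        # The object is removed, but the PID and its metadata stay discoverable.
        self._records[pid].url = None

    def resolve(self, pid):
        return self._records[pid]


registry = PidRegistry()
pid = "10.1234/example-dataset"  # placeholder identifier, not a real DOI
registry.register(pid, "https://repo-a.example.org/data/42",
                  {"title": "Example dataset"})

registry.move(pid, "https://repo-b.example.org/archive/42")
print(registry.resolve(pid).url)       # the new location, same PID

registry.withdraw(pid)
print(registry.resolve(pid).metadata)  # metadata survives the withdrawal
```

A citation that stores only the PID never has to be updated: moving the data changes the record inside the registry, not the identifier that readers hold.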

Types of Persistent Identifiers

Certain persistent identifiers have gained significant prominence in specific domains, while others are less commonly used in the German or European context.

Examples of persistent identifiers include:

  • Digital Object Identifier (DOI, http://doi.org): the most commonly used identifier for research data, recognized nationally and internationally; includes 5 mandatory metadata fields to ensure proper citation; available free of charge to academic institutions. Always cite the DOI if one is available.
  • Uniform Resource Name (URN, http://www.dnb.de/urnservice.html): primarily used for publications and more common in the German and European context; not universally resolvable; available free of charge through the German National Library.
  • Handle (http://www.handle.net): suitable for scenarios involving an extremely large number of objects that need global persistent identification; no mandatory metadata fields; associated with modest costs.
  • Persistent Uniform Resource Locator (PURL, http://purl.org): less well known in the German-speaking region; functions similarly to an HTTP redirect.
  • Permalink: a long-lasting URL, but one without consistent guidelines or quality standards.
  • Archival Resource Key (ARK): used internationally by libraries, publishers, academic institutions, archives, and museums; also based on a long-lasting URL but less common for research data.

[Source: https://www.cms.hu-berlin.de/de/dl/dataman/teilen/pid/persistente-identifikation]
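In practice, most of the identifier types listed above are turned into a clickable link by prepending the address of a public resolver. The following is a minimal sketch of that step, not an official client for any of these services. The resolver hosts (`doi.org`, `hdl.handle.net`, `nbn-resolving.org`, `n2t.net`) and the prefix conventions (`doi:`, `hdl:`, `urn:nbn:`, `ark:`) are assumptions that do not come from the text above, and the function name `resolver_url` and all sample identifiers are made up for illustration.

```python
def resolver_url(identifier: str) -> str:
    """Return a resolver link for a DOI, Handle, URN:NBN, or ARK string.

    The scheme is guessed from the prefix only; no validation is performed.
    """
    ident = identifier.strip()
    lowered = ident.lower()

    if lowered.startswith("doi:"):
        return "https://doi.org/" + ident[len("doi:"):]
    if lowered.startswith("10."):
        # A bare DOI always starts with the directory indicator "10."
        return "https://doi.org/" + ident
    if lowered.startswith("hdl:"):
        return "https://hdl.handle.net/" + ident[len("hdl:"):]
    if lowered.startswith("urn:nbn:"):
        return "https://nbn-resolving.org/" + ident
    if lowered.startswith("ark:"):
        return "https://n2t.net/" + ident

    raise ValueError(f"unrecognized identifier scheme: {identifier!r}")


# Placeholder identifiers; none of them is expected to resolve to a real object.
for sample in ("doi:10.1234/abcd",
               "hdl:20.500.99999/123",
               "urn:nbn:de:0000-example",
               "ark:/12345/x6example"):
    print(sample, "->", resolver_url(sample))
```

PURLs and permalinks are absent from the sketch because they are already URLs and need no separate resolver address.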

This video provides a comprehensive overview of Persistent Identifiers: