Transform

Conversion

If, during the preservation phase, data or delivery methods need to be adjusted because of changing technologies and/or the requirements of the target user group, these changes must be considered carefully.

First, the new requirements have to be defined, alternatives evaluated, and results analyzed; only then is a new preservation plan created. This plan may require adjustments to the software being used as well as modifications to file formats. Any data conversion can be carried out with the help of appropriate conversion software, even while the data is still in use and accessible.

Working within a specific discipline, with its associated choice of measurement methods, often leads to the use of data formats that are meaningful to that research community. It also involves ‘typical’ data collection and analysis tools and the corresponding software, which usually work with their own file formats but often include export options for other file formats.

Depending on the context of use and the discipline, certain data formats are more popular than others. For statistical analyses, SPSS, R, SAS, and Stata are common programs. For example, if you want to open an SPSS file (*.sav) in other software, the file must first be converted.

Conversion is the process of transferring data from one format to another. Before converting data, research the options available for the given source and target formats. The relevant information can usually be found in the software's help menu.

There are different ways to convert data:

Conversion via data export

For example, in SPSS (version 24) you can select the formats of other statistical software, such as Stata or SAS, under “File – Export”, or specify different versions of the respective statistical programs under “File – Save As.” The data export thus directly generates a different data format, so the data can be opened in the ‘new’ statistical program in the appropriate file format.

Conversion during data import

SPSS files can be converted into R-compatible data formats directly in R using various packages (for example, haven or foreign). Here, the data format is converted as part of the data import.
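The same idea of converting while importing can be sketched in Python rather than R. This is only an illustrative sketch: it assumes that the pandas and pyreadstat packages are installed, and the function name `import_spss` and the file path passed to it are hypothetical.

```python
def import_spss(sav_path):
    """Import an SPSS .sav file and return it as a pandas DataFrame.

    Assumption: pandas and pyreadstat are installed; sav_path is a placeholder.
    """
    import pandas as pd  # imported lazily so the sketch loads without pandas

    # The format conversion happens during the import itself.
    # convert_categoricals=True replaces coded values with their value labels.
    return pd.read_spss(sav_path, convert_categoricals=True)
```

A call such as `import_spss("survey.sav")` (hypothetical file name) would return a table that can be analyzed or saved in another format. Whether all SPSS metadata survives depends on the package used, so the result should be checked against the original.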

Conversion to other file formats can be lossless (the new data contains all the information that was present in the original), lossy (the new data no longer contains all the information from the original), or meaningful (conversion ‘in essence’: the essential content is preserved, with or without information loss). Lossless conversion should be the aim. Sometimes, however, you may need to opt for a smaller file size, which comes at the cost of information loss.
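One way to check whether a conversion is lossless is a round trip: convert the data, read it back, and compare it with the original. The following Python sketch illustrates this with a small invented dataset. The variable names, the value labels, and the choice of JSON and CSV as target formats are assumptions made for the illustration only.

```python
import csv
import io
import json

# Hypothetical mini-dataset: numeric variables and one coded variable.
records = [
    {"id": 1, "age": 34, "satisfaction": 2},
    {"id": 2, "age": 51, "satisfaction": 5},
]
# Value labels of the kind a statistics package stores next to the data.
value_labels = {"satisfaction": {"2": "rather dissatisfied", "5": "very satisfied"}}


def roundtrip_json(rows, labels):
    """Write data plus labels to JSON text and read them back."""
    text = json.dumps({"rows": rows, "labels": labels})
    restored = json.loads(text)
    return restored["rows"], restored["labels"]


def roundtrip_csv(rows):
    """Write only the data values to CSV text and read them back."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    buffer.seek(0)
    return list(csv.DictReader(buffer))


json_rows, json_labels = roundtrip_json(records, value_labels)
csv_rows = roundtrip_csv(records)

# Lossless here: values, number types, and labels survive the round trip.
is_lossless_json = (json_rows == records) and (json_labels == value_labels)
# Lossy here: plain CSV returns every value as text and carries no labels.
is_lossless_csv = csv_rows == records
```

In this sketch the CSV round trip is lossy because the numbers come back as text and the value labels have no place in the file. Whether such a loss matters depends on what counts as essential content for the dataset.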

In the context of data transformation, it may also be useful or necessary to supplement a dataset or split it into sub-datasets, for example because of changing research community requirements, new insights in the relevant research field, or new research possibilities.
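Splitting and supplementing a dataset can be sketched in a few lines of Python. The records, the grouping variable `wave`, the added variable `region`, and both helper functions are hypothetical and serve only to illustrate the two operations.

```python
from collections import defaultdict

# Hypothetical survey records; "wave" is an assumed grouping variable.
dataset = [
    {"id": 1, "wave": 2019, "score": 3.5},
    {"id": 2, "wave": 2020, "score": 4.0},
    {"id": 3, "wave": 2019, "score": 2.5},
]


def split_by(rows, variable):
    """Split a dataset into sub-datasets, one per value of `variable`."""
    subsets = defaultdict(list)
    for row in rows:
        subsets[row[variable]].append(row)
    return dict(subsets)


def supplement(rows, extra, key):
    """Return copies of the rows, extended with extra variables matched on `key`."""
    return [{**row, **extra.get(row[key], {})} for row in rows]


subdatasets = split_by(dataset, "wave")

# Hypothetical additional variable that is available only for some cases.
region_by_id = {1: {"region": "north"}, 3: {"region": "south"}}
extended = supplement(dataset, region_by_id, "id")
```

Both helpers leave the original records unchanged and return new structures, so the source dataset stays available for comparison or for documenting the transformation.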