Humboldt-Universität zu Berlin - Research data management

Choosing a file format

Recommendations for choosing a file format for long-term archiving.

For long-term archiving, your research data files should be:

  • unencrypted
  • uncompressed
  • free of proprietary or patent-encumbered technology
  • stored in an open, documented standard


The following file formats can be recommended:

  • Tabular data: CSV, TSV, SPSS portable
  • Text: TXT, HTML, RTF, PDF/A (only if layout matters)
  • Multimedia:
    • Container: AVI, WAV, MP4, Ogg
    • Codec: Theora, Dirac, FLAC
  • Image: TIFF, JPEG2000, PNG
  • Structured data: XML, RDF


The following file formats should be avoided:

  • Tabular data: XLS
  • Text: DOC, PPT
  • Multimedia: Windows Media Video, QuickTime, H.264
  • Image: GIF, JPG
  • Structured data: RDBMS
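
As a rough illustration of how the two lists above could be applied before depositing files, the following Python sketch sorts file names into "recommended", "avoid", or "unknown" by their extension. The extension mapping (for example ".por" for SPSS portable or ".jp2" for JPEG2000) and the function name classify are assumptions made for this example and are not part of the recommendations. PDF and the multimedia containers are deliberately left out, because an extension alone cannot tell PDF/A from ordinary PDF or reveal which codec a container holds.

```python
from pathlib import Path

# Assumed extension-to-format mapping, for illustration only.
# The lists above name formats, not file extensions.
RECOMMENDED = {
    ".csv": "CSV", ".tsv": "TSV", ".por": "SPSS portable",
    ".txt": "TXT", ".html": "HTML", ".rtf": "RTF",
    ".tif": "TIFF", ".tiff": "TIFF", ".jp2": "JPEG2000", ".png": "PNG",
    ".xml": "XML", ".rdf": "RDF",
}
AVOID = {
    ".xls": "XLS", ".doc": "DOC", ".ppt": "PPT",
    ".wmv": "Windows Media Video", ".mov": "QuickTime",
    ".gif": "GIF", ".jpg": "JPG", ".jpeg": "JPG",
}


def classify(filename: str) -> str:
    """Return 'recommended', 'avoid', or 'unknown' based on the extension."""
    ext = Path(filename).suffix.lower()  # case-insensitive comparison
    if ext in RECOMMENDED:
        return "recommended"
    if ext in AVOID:
        return "avoid"
    # Anything not in either table needs a manual check.
    return "unknown"


if __name__ == "__main__":
    for name in ["survey.CSV", "slides.ppt", "paper.pdf"]:
        print(f"{name}: {classify(name)}")
```

A result of "unknown" does not mean a file is unsuitable, only that it needs a manual check, for example to confirm that a PDF is actually PDF/A.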