  • Ensure that data is backed up to prevent data loss during post-processing.
    • Protect against accidental writes / deletions:
      • Linux: chmod -R a-w <backup directory>
    • Store the backup in a safe place, e.g. do not travel with the backup disks.
  • Control access to data
    • Protect against unauthorized access:
      • Linux: chmod -R o-rwx <data directory>
  • Do not work directly on the raw data (or, even worse, on the backup). Keep the corpus separate from derived data/features.
    • Because the directories are separate, this also makes it easy to restart automated processing from scratch after an error.
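
As an illustration of the two chmod calls above, here is a minimal Python sketch that removes permission bits recursively. The function name strip_permissions and the constants WRITE_ALL and OTHERS_ALL are made up for this example; they are not part of any existing script.

```python
import os
import stat
from pathlib import Path

# Permission masks corresponding to "a-w" and "o-rwx" (names are illustrative).
WRITE_ALL = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
OTHERS_ALL = stat.S_IRWXO


def strip_permissions(root, bits):
    """Clear the given permission bits on root and everything below it."""
    root = Path(root)
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(Path(dirpath) / name for name in dirnames + filenames)
    for path in paths:
        if path.is_symlink():
            continue  # leave symlinks and their targets untouched
        mode = stat.S_IMODE(path.stat().st_mode)
        path.chmod(mode & ~bits)
```

Calling strip_permissions(backup_dir, WRITE_ALL) mirrors chmod -R a-w, and strip_permissions(data_dir, OTHERS_ALL) mirrors chmod -R o-rwx. Note that a privileged user can still write to files without write bits, so this only guards against accidents.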


Always: Automate as much as possible.

Data Extraction

  • Extract trials which fulfill quality criteria:
    • Recording complete?
    • Use a validation script.
    • Ignore trials/recordings where issues occurred during the recording session (as indicated in the recording log).
  • Extract regions of interest from videos etc.
  • Conversion to target format:
    • Decide on target formats (video, audio, CSV, etc.) depending on usage patterns (H.264 video with AAC audio plays on Windows, macOS, and Linux).
    • Generate multiple resolutions/sizes if this speeds up annotation or processing.
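
The trial selection described above could be automated roughly as follows. This is only a sketch: the Trial record and its fields (trial_id, recording_complete, logged_issues) are assumptions about how the recording log might be represented, and validate stands in for whatever validation script is used.

```python
from dataclasses import dataclass, field


@dataclass
class Trial:
    # Hypothetical record built from the recording log.
    trial_id: str
    recording_complete: bool
    logged_issues: list = field(default_factory=list)


def select_trials(trials, validate=None):
    """Keep only trials that are complete, issue-free, and pass validation."""
    selected = []
    for trial in trials:
        if not trial.recording_complete:
            continue  # incomplete recording
        if trial.logged_issues:
            continue  # an issue was noted during the session
        if validate is not None and not validate(trial):
            continue  # rejected by the validation callback
        selected.append(trial)
    return selected
```

For example, select_trials([Trial("t01", True), Trial("t02", True, ["camera 2 dropped frames"])]) keeps only t01.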


  • Think about the required accuracy of synchronization.
    • Think about how to validate the accuracy.
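
One possible way to validate synchronization accuracy is to detect the same events (e.g. several claps) in two already aligned streams and compare their timestamps against the required tolerance. The sketch below assumes the event times are given in seconds and are listed in the same order for both streams; the function name and the returned keys are illustrative.

```python
from statistics import mean


def check_sync(events_a, events_b, tolerance_s):
    """Compare matching event times from two streams against a tolerance."""
    if len(events_a) != len(events_b) or not events_a:
        raise ValueError("need the same non-zero number of events per stream")
    residuals = [b - a for a, b in zip(events_a, events_b)]
    worst = max(abs(r) for r in residuals)
    return {
        "mean_offset": mean(residuals),    # systematic offset between streams
        "max_abs_error": worst,            # largest single deviation
        "within_tolerance": worst <= tolerance_s,
    }
```

With a required accuracy of one video frame at 25 fps, the tolerance would be 1 / 25 = 0.04 s. A mean offset that grows over the course of a recording hints at clock drift rather than a constant offset.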


  • Define and document an annotation scheme based on literature and with relation to hypotheses.
  • Think about how to demonstrate annotation reliability (inter-rater agreement etc.)
    • Have multiple persons annotate the data in parallel and in isolation from each other.
  • Train annotators.
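
Inter-rater agreement between two annotators is often reported as Cohen's kappa. The following is a minimal sketch, assuming both annotators assigned exactly one categorical label to the same sequence of items; other measures may suit a given annotation scheme better.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label sequences")
    n = len(labels_a)
    # Fraction of items on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Kappa is undefined here; returning 1.0 is a choice made for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

A value of 1 means perfect agreement, 0 means agreement no better than chance.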

Technical Solutions

Have a look at the Dataset Processing Project for some useful scripts.


  • Automate audio and video analyses (using existing tools)
    • Black-frame detection to synchronize videos containing black frames (e.g. with ffmpeg)
    • Detection of known sound patterns (e.g. clapperboard or robot utterance, via Praat and cross-correlation with a reference audio signal)
    • Estimation of temporal offsets between cameras (cross-correlation)
    • Clapperboard detection using Vicon markers (distance of two markers)
  • Align videos to system logs, e.g. by recording one audio channel via the middleware, or with more sophisticated solutions
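
To illustrate the cross-correlation idea mentioned above, here is a small pure-Python sketch that finds the lag at which two sampled signals match best. It assumes both signals share one sample rate and are short enough for a brute-force search; for real audio, a vectorized implementation (for example numpy.correlate) would be much faster. The function names are illustrative.

```python
def estimate_lag(reference, signal, max_lag):
    """Return the lag in samples by which signal is delayed against reference.

    A positive result means the pattern appears later in signal.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, ref_value in enumerate(reference):
            j = i + lag
            if 0 <= j < len(signal):
                score += ref_value * signal[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def lag_to_seconds(lag, sample_rate_hz):
    """Convert a lag in samples to a temporal offset in seconds."""
    return lag / sample_rate_hz
```

The same function works for locating a known sound pattern (the reference) in a recording and for estimating the offset between two cameras from their audio tracks.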

Video Processing

  • ffmpeg
  • fktool
  • mediainfo
  • mpv player
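
As an example of scripting ffmpeg, the sketch below builds command lines that convert a recording to H.264 video with AAC audio in several resolutions. It assumes an ffmpeg build with the libx264 and aac encoders; the helper names and the output naming scheme (<name>_<height>p.mp4) are made up for this example.

```python
from pathlib import Path


def build_ffmpeg_command(source, target, height=None):
    """Build an ffmpeg call converting source to H.264 video with AAC audio."""
    cmd = ["ffmpeg", "-i", str(source), "-c:v", "libx264", "-c:a", "aac"]
    if height is not None:
        # -2 lets ffmpeg pick an even width that preserves the aspect ratio.
        cmd += ["-vf", f"scale=-2:{height}"]
    cmd.append(str(target))
    return cmd


def commands_for_heights(source, out_dir, heights):
    """One command per target height, e.g. for quick-annotation copies."""
    source, out_dir = Path(source), Path(out_dir)
    return [
        build_ffmpeg_command(source, out_dir / f"{source.stem}_{h}p.mp4", h)
        for h in heights
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True). Writing the results to a directory separate from the raw corpus keeps derived data apart from the originals.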

Audio Processing

  • sox


  • Scripting languages (bash, Python)



  • ELAN: For data including videos
  • Praat: Audio-only