Documentation

Learn how to use the Multimodal Annotation Tool effectively

Getting Started

Quick Start Guide

  1. Set up your annotation schema (click the "Set Up Schema" button)
  2. Upload your media file (audio or video) and your transcript CSV file (drag and drop is supported)
  3. Start annotating by clicking on utterances and filling in the annotation fields
  4. Export your annotated data when finished

CSV Format Requirements

Required Columns

Column     Type     Description
turn_id    integer  Unique identifier for each utterance
speaker    string   Speaker identifier (e.g., "S1", "S2")
start      float    Start time in seconds
end        float    End time in seconds
utterance  string   The spoken text

Example CSV

turn_id,speaker,start,end,utterance
1,S1,0.0,2.5,"Hello, how are you?"
2,S2,3.0,5.2,"I'm doing well, thanks!"
3,S1,6.1,8.9,"That's great to hear."
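
Before uploading, a transcript can be checked against the required columns listed above. The following is a minimal sketch in Python, not part of the tool itself; the function name validate_transcript is hypothetical, and the rule that end must not precede start is an assumption that the requirements above do not state.

```python
import csv
import io

REQUIRED_COLUMNS = ("turn_id", "speaker", "start", "end", "utterance")


def validate_transcript(csv_text):
    """Return a list of problems found; an empty list means no problems."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        return ["missing required column(s): " + ", ".join(missing)]

    problems = []
    seen_ids = set()
    for row_no, row in enumerate(reader, start=1):  # data rows, header excluded
        try:
            turn_id = int(row["turn_id"])
            start = float(row["start"])
            end = float(row["end"])
        except (TypeError, ValueError):
            problems.append(f"row {row_no}: turn_id must be an integer and start/end must be numbers")
            continue
        if turn_id in seen_ids:
            problems.append(f"row {row_no}: duplicate turn_id {turn_id}")
        seen_ids.add(turn_id)
        # Assumption: an utterance cannot end before it starts.
        if end < start:
            problems.append(f"row {row_no}: end ({end}) is before start ({start})")
    return problems
```

Calling validate_transcript on the example CSV above returns an empty list. The utterance column is only checked for presence, since any text is acceptable there.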

Supported Media Formats

Audio Formats

  • MP3 (.mp3) (recommended)
  • WAV (.wav)
  • OGG (.ogg)
  • WebM (.webm)
  • M4A (.m4a)
  • AAC (.aac)

Video Formats

  • MP4 (.mp4) (recommended)
  • WebM (.webm)
  • MOV (.mov)
  • AVI (.avi)
  • MKV (.mkv)
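
The two lists above can be expressed as a simple extension lookup, for example to pre-sort files before uploading. This is a sketch under assumptions: the function name media_kinds is hypothetical, and the tool may well detect formats differently (for instance by inspecting file contents). Note that .webm appears in both lists, so the extension alone cannot tell audio from video.

```python
from pathlib import Path

# Extensions copied from the audio and video format lists above.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".ogg", ".webm", ".m4a", ".aac"}
VIDEO_EXTENSIONS = {".mp4", ".webm", ".mov", ".avi", ".mkv"}


def media_kinds(filename):
    """Return the set of media kinds ("audio", "video") a file extension may be."""
    ext = Path(filename).suffix.lower()  # case-insensitive comparison
    kinds = set()
    if ext in AUDIO_EXTENSIONS:
        kinds.add("audio")
    if ext in VIDEO_EXTENSIONS:
        kinds.add("video")
    return kinds
```

An empty set means the extension is not in either list.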

File Size Recommendations

MP3 and MP4 are compressed formats, so they produce smaller files that process faster.

  • Optimal performance: Keep files under 500 MB
  • Maximum recommended: 1 GB
  • Warning: Files over 1 GB may cause slow processing or browser performance issues, because all data is processed in browser memory
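
The thresholds above can be checked before a file is loaded. The sketch below is illustrative only: classify_media_size is a hypothetical helper, and it assumes that 1 MB means 1024 × 1024 bytes, which this page does not specify.

```python
MB = 1024 * 1024            # assumption: binary megabytes
OPTIMAL_LIMIT = 500 * MB    # "keep files under 500 MB"
MAX_RECOMMENDED = 1024 * MB # "maximum recommended: 1 GB"


def classify_media_size(size_bytes):
    """Map a file size in bytes onto the recommendations above."""
    if size_bytes < OPTIMAL_LIMIT:
        return "optimal"
    if size_bytes <= MAX_RECOMMENDED:
        return "acceptable"
    return "too_large"  # may cause slow processing in the browser
```

For a file on disk, the size in bytes can be obtained with os.path.getsize(path) and passed to this function.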

CSV Transcript Files

Keep CSV files under 10 MB for best performance. Large transcripts with thousands of rows may cause UI lag.

Keyboard Shortcuts

Play / Pause          Space
Navigate annotations  Tab

Features

Schema Customization

Define custom annotation columns with your own labels. Set up your schema before loading data for automatic column matching.
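
This page does not describe how automatic column matching works. One plausible reading is that columns in the loaded CSV whose names match schema labels are treated as existing annotation columns. The sketch below illustrates that reading only; the function name match_schema_columns and the case-insensitive comparison are assumptions, not the tool's documented behavior.

```python
def match_schema_columns(schema_labels, csv_header):
    """Pair each schema label with a CSV column of the same name, if one exists.

    Returns (matched, unmatched): a dict of label -> CSV column name,
    and a list of labels that have no corresponding column.
    """
    # Assumption: names are compared case-insensitively, ignoring outer spaces.
    by_key = {name.strip().lower(): name for name in csv_header}
    matched = {}
    unmatched = []
    for label in schema_labels:
        key = label.strip().lower()
        if key in by_key:
            matched[label] = by_key[key]
        else:
            unmatched.append(label)
    return matched, unmatched
```

Under this reading, unmatched labels would become new, empty annotation columns.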

Synchronized Playback

The transcript automatically highlights the current utterance during playback. Click any utterance to jump directly to that point in the media.
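
Conceptually, highlighting the current utterance means finding the row whose start and end times enclose the playback position. The following is a minimal sketch of that lookup using a binary search, assuming utterances do not overlap; build_index and utterance_at are hypothetical names, and the tool's actual implementation is not documented here.

```python
from bisect import bisect_right


def build_index(utterances):
    """Sort utterances by start time and precompute the list of start times."""
    ordered = sorted(utterances, key=lambda u: u["start"])
    starts = [u["start"] for u in ordered]
    return ordered, starts


def utterance_at(ordered, starts, t):
    """Return the turn_id playing at time t (seconds), or None during a gap."""
    # Index of the last utterance that starts at or before t.
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < ordered[i]["end"]:
        return ordered[i]["turn_id"]
    return None
```

With the three rows of the example CSV, a position of 1.0 seconds falls in turn 1, while 2.7 seconds falls in the silence between turns 1 and 2 and yields None. Jumping to an utterance is the reverse operation: seek the media to that row's start value.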

Drag and Drop

Simply drag media and CSV files directly onto the window for quick loading.

Export Annotations

Export your annotated data as a CSV file, preserving all original columns plus your annotations.

Resizable Panels

Adjust the layout by dragging panel dividers to customize your workspace.

Exporting Data

Click the "Save Annotations" button to export your annotated data. The exported CSV will include:

  • All original transcript columns
  • All annotation columns you defined
  • Your annotation values for each utterance

The exported file will be named original_filename_annotated.csv.
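
The shape of the exported file can be illustrated with a short sketch. It is not the tool's export code: annotated_filename and export_annotated are hypothetical names, and the sketch assumes that the output name is derived from the uploaded file's name without its extension and that annotations are keyed by turn_id.

```python
import csv
import io
from pathlib import Path


def annotated_filename(original_name):
    """Derive the export name, e.g. "interview.csv" -> "interview_annotated.csv"."""
    return f"{Path(original_name).stem}_annotated.csv"


def export_annotated(rows, original_columns, annotation_columns, annotations):
    """Return CSV text with the original columns followed by annotation columns.

    rows: list of dicts holding the original transcript columns.
    annotations: dict of turn_id -> {annotation column: value} (assumed layout).
    """
    out = io.StringIO()
    fieldnames = list(original_columns) + list(annotation_columns)
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        values = annotations.get(row["turn_id"], {})
        merged = dict(row)
        for col in annotation_columns:
            merged[col] = values.get(col, "")  # empty cell if not annotated
        writer.writerow(merged)
    return out.getvalue()
```

Utterances without annotations keep their original values and receive empty cells in the annotation columns.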