At Technicolor, we believe that sharing datasets within the scientific community makes research more effective: it helps propose challenges, compare results, forge links and foster collaborations. You are welcome to use them for your research; do not hesitate to contact us (see conditions and contacts inside).

The VSD benchmark is a collection of ground-truth files based on the extraction of violent events in movies, together with high-level audio and video concepts. It is intended for assessing the quality of methods for detecting violent scenes and/or recognizing high-level, violence-related concepts in movies.

Automatic extraction of face tracks is a key component of systems that analyze people in audio-visual content such as TV programs and movies. The Hannah dataset, based on the full audio-visual person annotation of a feature movie, enables state-of-the-art tracking metrics to be exploited to evaluate face tracks used by, e.g., automatic character naming systems.

The Interestingness Dataset is a collection of movie excerpts and key-frames, together with the corresponding ground-truth files classifying the samples as interesting or non-interesting. It is intended for assessing the quality of methods for predicting the interestingness of multimedia content.

LIRIS-ACCEDE, the Annotated Creative Commons Emotional Database, consists of video excerpts with a large content diversity, annotated along affective dimensions; this contrasts with existing datasets, which offer very few video resources and limited accessibility due to copyright constraints.

http://liris-accede.ec-lyon.fr/

The automatic recognition of human emotions is of great interest for multimedia applications and brain-computer interfaces. The HR-EEG4EMO dataset contains simultaneous physiological recordings of electroencephalography (EEG), electrocardiography (ECG), respiration, blood oxygen level, pulse rate and galvanic skin response (GSR) from different users watching various types of audio-visual content. The original intent was to infer emotions from these signals.

The Light-Field Technicolor Dataset is a collection of Light-Field video sequences captured by our 4x4 camera array (sparse Light-Field video).