Rushes video summarization by object and event understanding

Feng Wang, Chong-Wah Ngo

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review

Abstract

This paper explores a variety of visual and audio analysis techniques for selecting the most representative video clips for rushes summarization at TRECVID 2007. These techniques include object detection, camera motion estimation, keypoint matching and tracking, audio classification and speech recognition. Our system is composed of two major steps. First, based on video structuring, we filter undesirable shots and minimize inter-shot redundancy by repetitive shot detection. Second, a representability measure is proposed to model the presence of objects and four audio-visual events in a video clip: motion activity of objects, camera motion, scene changes, and speech content. The video clips with the highest representability scores are selected for summarization. The evaluation at TRECVID shows that our experimental results are highly encouraging: we rank first in EA (easy to understand), second in RE (little redundancy), and third in IN (inclusion of objects and events). Copyright 2007 ACM.
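The abstract describes the selection step only at a high level: each clip receives a representability score reflecting the presence of objects and four audio-visual events, and the highest-scoring clips are kept. The following is a minimal illustrative sketch of that selection logic, not the authors' actual formulation; the feature names, the uniform weights, and the weighted-sum combination are all assumptions.

```python
# Hypothetical sketch: combine per-clip evidence for objects and the four
# audio-visual events (object motion, camera motion, scene changes, speech)
# into one representability score, then keep the top-scoring clips.
# Weights and feature names are illustrative assumptions.

FEATURES = ["object", "object_motion", "camera_motion",
            "scene_change", "speech"]

def representability(scores, weights=None):
    """Weighted sum of normalized per-feature scores, each in [0, 1]."""
    if weights is None:
        weights = {f: 1.0 / len(FEATURES) for f in FEATURES}
    return sum(weights[f] * scores.get(f, 0.0) for f in FEATURES)

def select_clips(clips, k):
    """Return the k clips with the highest representability scores."""
    return sorted(clips, key=lambda c: representability(c["scores"]),
                  reverse=True)[:k]

clips = [
    {"id": "shot_1", "scores": {"object": 0.9, "speech": 0.8}},
    {"id": "shot_2", "scores": {"camera_motion": 0.3}},
    {"id": "shot_3", "scores": {"object": 0.7, "scene_change": 0.6,
                                "object_motion": 0.5}},
]
top = select_clips(clips, 2)
print([c["id"] for c in top])  # shot_3 and shot_1 score highest
```

In the paper's pipeline this scoring would run only after shot filtering and repetitive-shot detection have removed undesirable and redundant material.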
Original language: English
Title of host publication: Proceedings of the ACM International Multimedia Conference and Exhibition
Pages: 25-29
DOIs
Publication status: Published - 2007
Event: International Multimedia Conference, MM'07 - Workshop on TRECVID Video Summarization - Augsburg, Bavaria, Germany
Duration: 28 Sept 2007 – 28 Sept 2007


Research Keywords

  • Event understanding
  • Object detection
  • Rushes video summarization
