Time series models for semantic music annotation
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Author(s)
Coviello, Emanuele; Chan, Antoni B.; Lanckriet, Gert
Detail(s)
| Original language | English |
| --- | --- |
| Article number | 5613150 |
| Pages (from-to) | 1343-1359 |
| Journal / Publication | IEEE Transactions on Audio, Speech and Language Processing |
| Volume | 19 |
| Issue number | 5 |
| Publication status | Published - 2011 |
Abstract
Many state-of-the-art systems for automatic music tagging model music with bag-of-features representations, which give little or no account of temporal dynamics, a key characteristic of the audio signal. We describe a novel approach to automatic music annotation and retrieval that captures temporal (e.g., rhythmical) aspects as well as timbral content. The proposed approach leverages a recently proposed song model based on a generative time series model of the musical content: the dynamic texture mixture (DTM) model, which treats fragments of audio as the output of a linear dynamical system. To model characteristic temporal dynamics and timbral content at the tag level, a novel, efficient, hierarchical expectation-maximization (EM) algorithm for DTM (HEM-DTM) is used to summarize the common information shared by the DTMs modeling the individual songs associated with a tag. Experiments show that learning the semantics of music benefits from modeling temporal dynamics. © 2010 IEEE.
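For intuition, the linear dynamical system underlying a single dynamic texture can be sketched as a sampling loop: a hidden state x_t evolves as x_{t+1} = A x_t + v_t and emits an observed audio feature vector y_t = C x_t + w_t, with Gaussian noise terms v_t and w_t. The following is a minimal, illustrative sketch of that generative process, not the paper's implementation; every dimension and parameter value here (n, m, T, A, C, Q, R) is an assumption for demonstration, and a full DTM would mix several such components with learned parameters.

```python
import numpy as np

# Sketch: sample a sequence of audio feature vectors from one
# linear dynamical system (one dynamic texture component).
rng = np.random.default_rng(0)

n = 4    # hidden state dimension (assumed)
m = 13   # observed feature dimension, e.g., MFCC-like (assumed)
T = 100  # number of frames in the audio fragment (assumed)

A = 0.9 * np.eye(n)               # state transition (stable: spectral radius < 1)
C = rng.standard_normal((m, n))   # observation matrix
Q = 0.1 * np.eye(n)               # process noise covariance
R = 0.05 * np.eye(m)              # observation noise covariance
y_bar = np.zeros(m)               # mean feature vector

x = rng.multivariate_normal(np.zeros(n), Q)  # initial hidden state
frames = []
for _ in range(T):
    # y_t = C x_t + y_bar + w_t,  w_t ~ N(0, R): observed audio features
    y = C @ x + y_bar + rng.multivariate_normal(np.zeros(m), R)
    frames.append(y)
    # x_{t+1} = A x_t + v_t,  v_t ~ N(0, Q): hidden state evolution
    x = A @ x + rng.multivariate_normal(np.zeros(n), Q)

Y = np.stack(frames)  # (T, m) time series of feature vectors
print(Y.shape)
```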
Research Area(s)
- Audio annotation and retrieval, dynamic texture model, music information retrieval
Citation Format(s)
Time series models for semantic music annotation. / Coviello, Emanuele; Chan, Antoni B.; Lanckriet, Gert.
In: IEEE Transactions on Audio, Speech and Language Processing, Vol. 19, No. 5, 5613150, 2011, p. 1343-1359.