A bag of systems representation for music auto-tagging
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Article number | 6583960 |
Pages (from-to) | 2554-2569 |
Journal / Publication | IEEE Transactions on Audio, Speech and Language Processing |
Volume | 21 |
Issue number | 12 |
Publication status | Published - 2013 |
Link(s)
DOI | DOI |
---|---|
Attachment(s) | Documents: Publisher's Copyright Statement |
Link to Scopus | https://www.scopus.com/record/display.uri?eid=2-s2.0-84887360800&origin=recordpage |
Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(df8c51b2-8532-4d7d-b15a-cc38510de83d).html |
Abstract
We present a content-based automatic tagging system for music that relies on a high-level, concise "Bag of Systems" (BoS) representation of the characteristics of a musical piece. The BoS representation leverages a rich dictionary of musical codewords, where each codeword is a generative model that captures timbral and temporal characteristics of music. Songs are represented as a BoS histogram over codewords, which allows for the use of traditional algorithms for text document retrieval to perform auto-tagging. Compared to estimating a single generative model to directly capture the musical characteristics of songs associated with a tag, the BoS approach offers the flexibility to combine different generative models at various time resolutions through the selection of the BoS codewords. Additionally, decoupling the modeling of audio characteristics from the modeling of tag-specific patterns makes BoS a more robust and rich representation of music. Experiments show that this leads to superior auto-tagging performance. © 2006-2012 IEEE.
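As a rough illustration of the histogram step described in the abstract, the sketch below assigns each audio fragment to the codeword under which it is most likely, then normalizes the counts. It is a simplified stand-in, not the paper's implementation: the codewords here are one-dimensional Gaussians rather than the generative models used in the paper, each fragment is hard-assigned to a single codeword, and the names `gaussian_log_likelihood`, `bos_histogram`, `codewords`, and `fragments` are hypothetical.

```python
import math
from collections import Counter


def gaussian_log_likelihood(fragment, mean, var):
    """Log-likelihood of a fragment (list of floats) under a 1-D Gaussian.

    Assumption: frames are treated as independent. This is a toy stand-in
    for a real generative codeword model.
    """
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
        for x in fragment
    )


def bos_histogram(fragments, codewords):
    """Return the normalized count of fragments assigned to each codeword.

    Assumption: each fragment is hard-assigned to its single most likely
    codeword.
    """
    if not fragments:
        raise ValueError("at least one fragment is required")
    counts = Counter()
    for fragment in fragments:
        scores = [gaussian_log_likelihood(fragment, m, v) for m, v in codewords]
        best = max(range(len(codewords)), key=scores.__getitem__)
        counts[best] += 1
    return [counts[k] / len(fragments) for k in range(len(codewords))]


# Hypothetical dictionary of three codewords, each given as (mean, variance).
codewords = [(0.0, 1.0), (5.0, 1.0), (10.0, 1.0)]

# Hypothetical song split into four short fragments of scalar features.
fragments = [
    [0.1, -0.2, 0.3],
    [4.8, 5.1, 5.3],
    [5.2, 4.9, 5.0],
    [9.7, 10.2, 10.1],
]

histogram = bos_histogram(fragments, codewords)
print(histogram)  # [0.25, 0.5, 0.25]
```

The resulting vector plays the role that a word-count vector plays for a text document, which is what lets standard text retrieval and classification methods be applied to songs.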
Research Area(s)
- Audio annotation and retrieval, bag of systems, content-based music processing, dynamic texture model, music information retrieval
Citation Format(s)
A bag of systems representation for music auto-tagging. / Ellis, Katherine; Coviello, Emanuele; Chan, Antoni B. et al.
In: IEEE Transactions on Audio, Speech and Language Processing, Vol. 21, No. 12, 6583960, 2013, p. 2554-2569.