TY - GEN
T1 - Flickr distance
AU - Wu, Lei
AU - Hua, Xian-Sheng
AU - Yu, Nenghai
AU - Ma, Wei-Ying
AU - Li, Shipeng
PY - 2008
Y1 - 2008
N2 - This paper presents Flickr distance, a novel measurement of the relationship between semantic concepts (objects, scenes) in the visual domain. For each concept, a collection of images is obtained from Flickr, based on which the improved latent-topic-based visual language model is built to capture the visual characteristics of the concept. The Flickr distance between different concepts is then measured by the square root of the Jensen-Shannon (JS) divergence between the corresponding visual language models. Compared with WordNet, Flickr distance is able to handle far more concepts existing on the Web, and it can scale up with the increase of concept vocabularies. Compared with Google distance, which is generated in the textual domain, Flickr distance is more precise for visual domain concepts, as it captures the visual relationship between the concepts instead of their co-occurrence in text search results. Moreover, unlike Google distance, Flickr distance satisfies the triangle inequality, which makes it a more reasonable distance metric. Both a subjective user study and an objective evaluation show that Flickr distance is more coherent with human perception than Google distance. We also design several application scenarios, such as concept clustering and image annotation, to demonstrate the effectiveness of the proposed distance in image-related applications. Copyright 2008 ACM.
AB - This paper presents Flickr distance, a novel measurement of the relationship between semantic concepts (objects, scenes) in the visual domain. For each concept, a collection of images is obtained from Flickr, based on which the improved latent-topic-based visual language model is built to capture the visual characteristics of the concept. The Flickr distance between different concepts is then measured by the square root of the Jensen-Shannon (JS) divergence between the corresponding visual language models. Compared with WordNet, Flickr distance is able to handle far more concepts existing on the Web, and it can scale up with the increase of concept vocabularies. Compared with Google distance, which is generated in the textual domain, Flickr distance is more precise for visual domain concepts, as it captures the visual relationship between the concepts instead of their co-occurrence in text search results. Moreover, unlike Google distance, Flickr distance satisfies the triangle inequality, which makes it a more reasonable distance metric. Both a subjective user study and an objective evaluation show that Flickr distance is more coherent with human perception than Google distance. We also design several application scenarios, such as concept clustering and image annotation, to demonstrate the effectiveness of the proposed distance in image-related applications. Copyright 2008 ACM.
KW - Concept relationship
KW - Flickr distance
KW - Similarity measurement
KW - TagNet
KW - Visual concept net
KW - Visual distance
UR - http://www.scopus.com/inward/record.url?scp=70350258087&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-70350258087&origin=recordpage
U2 - 10.1145/1459359.1459364
DO - 10.1145/1459359.1459364
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 9781605583037
T3 - MM'08 - Proceedings of the 2008 ACM International Conference on Multimedia, with co-located Symposium and Workshops
SP - 31
EP - 40
BT - MM'08 - Proceedings of the 2008 ACM International Conference on Multimedia, with co-located Symposium and Workshops
T2 - 16th ACM International Conference on Multimedia, MM '08
Y2 - 26 October 2008 through 31 October 2008
ER -