ARISTA - Image search to annotation on billions of web photos

Xin-Jing Wang, Lei Zhang, Ming Liu, Yi Li, Wei-Ying Ma

Research output: Chapters, Conference Papers, Creative and Literary Works > RGC 32 - Refereed conference paper (with host publication) > peer-review

74 Citations (Scopus)

Abstract

Despite decades of research effort, object recognition remains a challenging problem. Traditional methods based on machine learning or computer vision are still at the stage of tackling hundreds of object categories. In recent years, non-parametric approaches, which understand the content of an image by propagating labels from similar images in a large-scale dataset, have demonstrated great success. However, due to limited dataset size and imperfect image crawling strategies, previous work can only address a small, biased subset of image concepts. Here we introduce the Arista project, which aims to build a practical image annotation engine targeting popular concepts in the real world. In this project, we are particularly interested in understanding how many image concepts can be addressed by the data-driven annotation approach (coverage) and how good the performance is (precision). This paper reports the first stage of the work. Two billion web images were indexed, and, based on simple yet effective near-duplicate detection, the system is capable of automatically generating accurate tags for popular web images that have near-duplicates in the database. We found that about 8.1% of web images have more than ten near-duplicates, and the number increases to 28.5% for top images in search results. Further, based on random samples in the latter case, we observed a precision of 57.9% at the point of highest recall, 28%, on ground-truth tags. ©2010 IEEE.
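
The data-driven idea summarized in the abstract (tagging an image from the text attached to its near-duplicates, then scoring the tags by precision and recall) can be illustrated with a minimal sketch. This is not the Arista implementation: the function names `propagate_tags` and `precision_recall`, the parameters `min_support` and `top_k`, and the simple one-vote-per-neighbor scheme are all assumptions made for illustration, and near-duplicate retrieval itself is taken as already done.

```python
from collections import Counter


def propagate_tags(neighbor_tags, min_support=2, top_k=5):
    """Vote over the tags of an image's near-duplicates (hypothetical helper).

    neighbor_tags: list of tag lists, one list per near-duplicate image.
    min_support:   assumed threshold; a tag must occur on at least this
                   many neighbors to be kept.
    top_k:         assumed cap on the number of returned tags.
    """
    votes = Counter()
    for tags in neighbor_tags:
        # Each neighbor contributes at most one vote per (lower-cased) tag.
        votes.update({t.lower() for t in tags})
    kept = [(tag, n) for tag, n in votes.items() if n >= min_support]
    # Sort by vote count (descending), then alphabetically for determinism.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in kept[:top_k]]


def precision_recall(predicted, ground_truth):
    """Set-based precision and recall of predicted tags against ground truth."""
    pred, truth = set(predicted), set(ground_truth)
    hits = len(pred & truth)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall
```

For instance, given three near-duplicates tagged `["Eiffel Tower", "Paris"]`, `["paris", "eiffel tower", "night"]`, and `["Paris", "travel"]`, the sketch keeps only the tags shared by at least two neighbors. Raising `min_support` would typically trade recall for precision, which is the kind of trade-off the abstract's precision and recall figures describe.
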
Original language: English
Title of host publication: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2010
Pages: 2987-2994
DOIs
Publication status: Published - 2010
Externally published: Yes
Event: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2010 - San Francisco, CA, United States
Duration: 13 Jun 2010 → 18 Jun 2010

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919

Conference

Conference: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2010
Place: United States
City: San Francisco, CA
Period: 13/06/10 → 18/06/10
