TY - GEN
T1 - Exploring LDA-based document model for geographic information retrieval
AU - Li, Zhisheng
AU - Wang, Chong
AU - Xie, Xing
AU - Wang, Xufa
AU - Ma, Wei-Ying
PY - 2008
Y1 - 2008
AB - The Latent Dirichlet Allocation (LDA) model, a formal generative model, has recently been used to improve ad-hoc information retrieval. However, its feasibility and effectiveness for geographic information retrieval have not been explored. This paper proposes an LDA-based document model to improve geographic information retrieval by combining the LDA model with a text retrieval model. The proposed model has been evaluated on the GeoCLEF2007 collection. This work is part of the experiments of the Columbus Project of Microsoft Research Asia (MSRA) in GeoCLEF2007, a cross-language geographical retrieval track that is part of the Cross Language Evaluation Forum. This is the second time we have participated in this event. Since the queries in GeoCLEF2007 are similar to those in GeoCLEF2006, we reuse most of the methods that we used in GeoCLEF2006, including the MSRAWhitelist, MSRAExpansion, MSRALocation and MSRAText approaches. The difference is that the MSRAManual approach is not included this time, and we use MSRALDA instead. The results show that the LDA model performs stably in the GeoCLEF monolingual English task but needs to be further explored. © 2008 Springer-Verlag Berlin Heidelberg.
KW - Evaluation
KW - Geographic information retrieval
KW - Latent Dirichlet Allocation
KW - System design
UR - http://www.scopus.com/inward/record.url?scp=70349795163&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-70349795163&origin=recordpage
U2 - 10.1007/978-3-540-85760-0_108
DO - 10.1007/978-3-540-85760-0_108
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 3540857591
SN - 9783540857594
VL - 5152 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 842
EP - 849
BT - Advances in Multilingual and Multimodal Information Retrieval - 8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, Revised Selected Papers
PB - Springer-Verlag
T2 - 8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007
Y2 - 19 September 2007 through 21 September 2007
ER -