TY - JOUR
T1 - A similarity reinforcement algorithm for heterogeneous web pages
AU - Liu, Ning
AU - Yan, Jun
AU - Bai, Fengshan
AU - Zhang, Benyu
AU - Xi, Wensi
AU - Fan, Weiguo
AU - Chen, Zheng
AU - Ji, Lei
AU - Hu, Chenyong
AU - Ma, Wei-Ying
PY - 2005
Y1 - 2005
N2 - Many machine learning and data mining algorithms rely crucially on similarity metrics. However, most early approaches, such as the Vector Space Model and Latent Semantic Indexing, used only a single relationship to measure the similarity of data objects. In this paper, we first use an Intra- and Inter-Type Relationship Matrix (IITRM) to represent a set of heterogeneous data objects and their inter-relationships. We then propose a novel similarity-calculating algorithm over the IITRM that iteratively integrates information from heterogeneous sources. This algorithm can help detect latent relationships among heterogeneous data objects. It is based on the intuition that the intra-relationship should affect the inter-relationship, and vice versa. Experimental results on the MSN logs dataset show that our algorithm outperforms traditional cosine similarity. © Springer-Verlag Berlin Heidelberg 2005.
AB - Many machine learning and data mining algorithms rely crucially on similarity metrics. However, most early approaches, such as the Vector Space Model and Latent Semantic Indexing, used only a single relationship to measure the similarity of data objects. In this paper, we first use an Intra- and Inter-Type Relationship Matrix (IITRM) to represent a set of heterogeneous data objects and their inter-relationships. We then propose a novel similarity-calculating algorithm over the IITRM that iteratively integrates information from heterogeneous sources. This algorithm can help detect latent relationships among heterogeneous data objects. It is based on the intuition that the intra-relationship should affect the inter-relationship, and vice versa. Experimental results on the MSN logs dataset show that our algorithm outperforms traditional cosine similarity. © Springer-Verlag Berlin Heidelberg 2005.
UR - http://www.scopus.com/inward/record.url?scp=24144480934&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-24144480934&origin=recordpage
U2 - 10.1007/978-3-540-31849-1_13
DO - 10.1007/978-3-540-31849-1_13
M3 - RGC 21 - Publication in refereed journal
SN - 0302-9743
VL - 3399
SP - 121
EP - 132
JO - Lecture Notes in Computer Science
JF - Lecture Notes in Computer Science
T2 - 7th Asia-Pacific Web Conference on Web Technologies Research and Development - APWeb 2005
Y2 - 29 March 2005 through 1 April 2005
ER -