TY - JOUR
T1 - Mining high-utility sequences with positive and negative values
AU - Zhang, Xiaojie
AU - Lai, Fuyin
AU - Chen, Guoting
AU - Gan, Wensheng
PY - 2023/8
Y1 - 2023/8
AB - Sequence pattern discovery is a fundamental topic in data mining and has been widely applied to problems such as behavior pattern discovery, gene pattern discovery in bioinformatics, and user click pattern mining. High-utility sequence mining is a newer and more challenging problem that has attracted considerable attention. This paper focuses on mining high-utility sequences efficiently in a more complicated setting. Most previous utility mining methods assume that all items have positive values, but most real-world situations involve items with both positive and negative values. Several existing algorithms handle this more complex setting and serve as baselines for comparison. We introduce the FHUSN (Fast mining High Utility Sequences with Negative item) algorithm, which mines high-utility sequences whether or not negative utility values are present. FHUSN stores data in a new utility array structure and applies several new pruning strategies, valid with or without negative values, to reduce the search space. Experiments on several benchmark datasets show that FHUSN achieves better performance than the compared algorithms. © 2023 Published by Elsevier Inc.
KW - Data mining
KW - Pattern mining
KW - High-utility sequence
KW - Negative value
KW - Pruning strategy
UR - http://www.scopus.com/inward/record.url?scp=85152591034&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85152591034&origin=recordpage
U2 - 10.1016/j.ins.2023.118945
DO - 10.1016/j.ins.2023.118945
M3 - RGC 21 - Publication in refereed journal
SN - 0020-0255
VL - 637
JO - Information Sciences
JF - Information Sciences
M1 - 118945
ER -