TY - JOUR
T1 - Federated Topic Discovery
T2 - A Semantic Consistent Approach
AU - Shi, Yexuan
AU - Tong, Yongxin
AU - Su, Zhiyang
AU - Jiang, Di
AU - Zhou, Zimu
AU - Zhang, Wenbin
PY - 2021/9
Y1 - 2021/9
N2 - General-purpose topic models have widespread industrial applications. Yet high-quality topic modeling is becoming increasingly challenging because accurate models require large amounts of training data typically owned by multiple parties, who are often unwilling to share their sensitive data for collaborative training without guarantees on their data privacy. To enable effective privacy-preserving multiparty topic modeling, we propose a novel federated general-purpose topic model named private and consistent topic discovery (PC-TD). On the one hand, PC-TD seamlessly integrates differential privacy in topic modeling to provide privacy guarantees on sensitive data of different parties. On the other hand, PC-TD exploits multiple sources of semantic consistency information to retain the accuracy of topic modeling while protecting data privacy. We verify the effectiveness of PC-TD on real-life datasets. Experimental results demonstrate its superiority over the state-of-the-art general-purpose topic models. © 2020 IEEE.
UR - http://www.scopus.com/inward/record.url?scp=85096125135&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85096125135&origin=recordpage
U2 - 10.1109/MIS.2020.3033459
DO - 10.1109/MIS.2020.3033459
M3 - RGC 21 - Publication in refereed journal
SN - 1541-1672
VL - 36
SP - 96
EP - 103
JO - IEEE Intelligent Systems
JF - IEEE Intelligent Systems
IS - 5
ER -