Towards Private and Scalable Cross-Media Retrieval

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) > 21_Publication in refereed journal > peer-review

6 Scopus Citations

Detail(s)

Original language: English
Pages (from-to): 1354-1368
Journal / Publication: IEEE Transactions on Dependable and Secure Computing
Volume: 18
Issue number: 3
Online published: 4 Jul 2019
Publication status: Published - May 2021

Abstract

Cross-media retrieval (CMR) is an attractive networked application in which a server responds to queries with retrieval results of different modalities. Unlike traditional information retrieval, CMR relies on a richer set of machine learning techniques to produce semantic models that project multimodal data into a common space. A larger training dataset usually yields more accurate models and therefore better retrieval results. Although CMR is very promising, with potential underpinnings in network analytics and multimedia applications, applying it in such contexts faces severe privacy challenges, because the data scattered among multiple parties may be sensitive and may not be shared publicly. Studies that jointly consider cross-media analytics, privacy protection, collaborative learning, and distributed networking contexts are relatively sparse. In this work, we propose the first practical system for privacy-preserving cross-media retrieval by utilizing trusted processors. Our scheme enables secure aggregation of data from distinct parties and secure canonical correlation analysis (CCA) over the jointly contributed data to obtain semantic models. Verification mechanisms are designed to defend against active attacks from a malicious adversary. Furthermore, to deal with large datasets, we provide a set of optimization methods that accommodate limited trusted memory and improve the efficiency of the training process in CMR. We consider data block splitting to manage memory overhead, ordering of operations together with parameter reuse and release to simplify I/O, and parallel computation to speed up dual operations. Our experiments over both synthetic and real datasets show that our solution is very efficient in practice, outperforms existing solutions, and performs comparably with the original CMR system.

Research Area(s)

  • collaborative learning, Cross-media retrieval, privacy protection, SGX, trusted processor

Citation Format(s)

Towards Private and Scalable Cross-Media Retrieval. / Hu, Shengshan; Zhang, Leo Yu; Wang, Qian; Qin, Zhan; Wang, Cong.

In: IEEE Transactions on Dependable and Secure Computing, Vol. 18, No. 3, 05.2021, p. 1354-1368.
