Enabling Metadata-private Communication at Scale
元數據隱私保護的加密通信關鍵技術研究
Student thesis: Doctoral Thesis
Author(s)
Related Research Unit(s)
Detail(s)
Awarding Institution | |
---|---|
Supervisors/Advisors | |
Award date | 25 Jun 2024 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/theses/theses(e08a5c98-0dac-4feb-b4c8-f26f2d93ff17).html |
---|---|
Abstract
Over the past decades, end-to-end encryption (E2EE) has become a staple of private communication in online services such as WhatsApp and Signal. While E2EE effectively protects message payloads, it leaves communication metadata exposed, including the identities of senders and receivers and the timing, frequency, and volume of conversations. Protecting this metadata is challenging because global adversaries can monitor, and even actively interfere with, network traffic. Established systems like Tor are not adequate under such adversarial models. Existing designs for metadata-private communication must trade off security, performance, and trust assumptions: 1) designs with cryptographic security often rely on heavyweight operations, creating performance bottlenecks and high operational costs for large-scale deployment; 2) more performant systems often settle for a weaker security guarantee, such as differential privacy, and generally demand more trust in the servers involved. So far, no solution has become dominant.
Towards metadata-private communication at scale, this thesis takes a different technical route from prior art and proposes several metadata-private messaging mechanisms and systems that leverage the readily available trust assumption of secure enclaves (such as those emerging in the cloud). Building on trusted hardware, we address challenges including in-memory access obliviousness, defense against global active attackers, system scalability, and utility enhancement in communication models. To this end, this study develops: 1) oblivious message exchange protocols using trusted hardware, 2) security-aware horizontal scaling mechanisms that do not reveal workload distribution patterns, and 3) a dialing-free metadata-private messaging system that aligns with user habits in modern instant messaging applications.
We implement these prototypes on cloud server clusters equipped with Intel SGXv2, demonstrating their effectiveness in maintaining metadata privacy in large-scale E2EE communication scenarios. Highlighted contributions are summarized below:
• Boomerang, a metadata-private messaging system that runs on a single trusted-hardware server and defends against active interference attacks. Based on the proposed proactive pattern detection and patching algorithms, Boomerang provides efficient cryptographic metadata privacy against active attackers. We also consider in-memory side-channel attacks on enclaves, and design memory-oblivious algorithms to achieve metadata privacy at the memory level. The system prototype achieves an end-to-end latency of 778 ms with 32,768 clients. We believe this performance can support the communication needs of deployments at this client scale in practice.
• Boomerang+, a security-aware horizontal-scaling system on distributed hardware-assisted server clusters. Inspired by balls-into-bins analysis, we derive and prove an upper bound on per-server load under the weighted balls-into-bins model. Based on this bound, we further propose an oblivious sub-batch allocation algorithm and a two-layer load-balancing architecture. Built on up to 48 hardware-assisted servers, Boomerang+ achieves an end-to-end latency of 615 ms with 65,536 clients and 7.8 seconds with 1.04 million clients. Under the same experimental setting, Boomerang+ achieves a 36× speedup over prior art.
• PingPong, a new framework that supports dialing-free metadata-private messaging. Based on the concept of message decoupling, PingPong instantiates a “notify-before-retrieval” workflow similar to that of modern messaging systems. PingPong integrates a metadata-private notification subsystem, Ping, and a metadata-private message store, Pong, mitigating existing systems’ reliance on synchronous communication and coordination. Built on up to 32 servers, PingPong achieves an end-to-end latency of 28.7 seconds with 100K clients, a 1.8× speedup over prior art. With a maximum of 5,000 friends, the communication cost of PingPong is 2700× smaller than that of prior art.
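The idea behind a load upper bound in Boomerang+ can be illustrated with a minimal sketch. The code below is not the thesis's algorithm: it assumes unweighted messages, a uniformly random assignment to servers, and a textbook Chernoff bound with a union bound over servers, whereas the thesis derives its own bound for the weighted balls-into-bins model. The names `capacity_bound`, `allocate`, `DUMMY`, and the failure probability `eps` are hypothetical. The sketch only shows why padding every sub-batch to a common bound hides the workload distribution: each server receives the same number of items regardless of how the real messages fall.

```python
import math
import random

DUMMY = None  # hypothetical stand-in for a padding message


def capacity_bound(n_msgs, n_servers, eps=2.0 ** -40):
    """Per-server capacity that overflows with probability at most eps.

    Assumption: unweighted balls thrown uniformly at random. Uses the
    Chernoff bound Pr[X >= (1+d)*mu] <= exp(-d^2 * mu / (2 + d)) and a
    union bound over all servers, solved for the smallest valid d.
    """
    mu = n_msgs / n_servers
    log_term = math.log(n_servers / eps)
    slack = (log_term + math.sqrt(log_term ** 2 + 8 * log_term * mu)) / 2
    return math.ceil(mu + slack)


def allocate(messages, n_servers, rng):
    """Assign messages to servers at random, then pad to a fixed size."""
    cap = capacity_bound(len(messages), n_servers)
    batches = [[] for _ in range(n_servers)]
    for msg in messages:
        batches[rng.randrange(n_servers)].append(msg)
    for batch in batches:
        if len(batch) > cap:
            raise OverflowError("load exceeded the padded capacity")
        # Pad with dummies so every sub-batch has the same length.
        batch.extend([DUMMY] * (cap - len(batch)))
    return batches, cap


if __name__ == "__main__":
    batches, cap = allocate(list(range(65536)), 48, random.Random(0))
    print("capacity per server:", cap)
    print("distinct batch sizes:", {len(b) for b in batches})
```

In a real deployment the dummies would have to be indistinguishable from real messages (for example, encrypted to the same length), and the assignment loop itself would need to run obliviously inside the enclave; this sketch covers neither.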
- Privacy-enhancing technologies, trusted hardware, anonymous communication, metadata privacy