Semantics-Aware Cookie Purpose Compliance

Baiqi Chen, Jiawei Lyu, Tingmin Wu, Mohan Baruwal Chhetri, Guangdong Bai*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works > RGC 32 - Refereed conference paper (with host publication) > peer-review

1 Citation (Scopus)

Abstract

Websites commonly display cookie banners to inform users about the use and purposes of cookies. However, they may still, whether intentionally or unintentionally (e.g., due to third-party libraries imported), mis-declare cookies that may be abused for tracking. In this work, we introduce Coover (cookie value examiner) to assess the non-compliance between the website-declared purpose and the semantic-intended purpose of cookies (denoted as potential cookie purpose violation). We advocate that the value of the cookie is a more reliable indicator of its semantic-intended purpose compared to other features such as expiration time. Coover decomposes the cookie value into primitive segments representing minimal semantic units, and fine-tunes a GPT-3.5 model to automatically interpret their value-inferred semantics. Based on the interpretation, it classifies cookies into four GDPR-defined purposes. Coover achieves an F1 score of 95%, significantly outperforming other methods. We employ Coover to analyze Alexa Top 1k websites to understand the status quo of potential cookie purpose violation on the web. Remarkably, out of 15,339 cookies across these websites, only 3.1% qualify as truly necessary cookies, while 44.1% of websites suffer from issues of potential purpose violation. © 2025 the owner/author(s).
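The decomposition and comparison steps described in the abstract can be illustrated with a minimal sketch. This is not Coover's implementation: the delimiter set, the function names `decompose` and `is_potential_violation`, and the purpose labels used in the usage note are all assumptions made for illustration. The interpretation step (the fine-tuned model) is left out entirely; its output is simply passed in as the inferred purpose.

```python
import re
from urllib.parse import unquote

# Hypothetical delimiter set; the segmentation rules actually used by
# Coover are not given in this record.
_DELIMITERS = re.compile(r"[.\-_|:;,&=/]+")


def decompose(cookie_value: str) -> list[str]:
    """Split a cookie value into candidate primitive segments.

    The value is URL-decoded first, then split on runs of the assumed
    delimiter characters; empty segments are dropped.
    """
    decoded = unquote(cookie_value)
    return [seg for seg in _DELIMITERS.split(decoded) if seg]


def is_potential_violation(declared: str, inferred: str) -> bool:
    """Return True when the declared purpose differs from the inferred one.

    `inferred` stands in for the label a classifier would produce from
    the interpreted segments; here it is supplied by the caller.
    """
    return declared.strip().lower() != inferred.strip().lower()
```

Under these assumptions, `decompose("GA1.2.1234567890.1700000000")` yields the segments `GA1`, `2`, `1234567890`, and `1700000000`, and `is_potential_violation("necessary", "analytics")` flags a mismatch. The labels `necessary` and `analytics` are placeholders, since the record does not enumerate the four purposes.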
Original language: English
Title of host publication: WWW '25 - Proceedings of the ACM Web Conference 2025
Place of Publication: New York, NY
Publisher: Association for Computing Machinery
Pages: 1602-1613
Number of pages: 12
ISBN (Print): 9798400712746
Publication status: Published - Apr 2025
Externally published: Yes
Event: 34th ACM Web Conference (WWW’25) - Sydney Convention & Exhibition Centre, Sydney, Australia
Duration: 28 Apr 2025 to 2 May 2025
https://www2025.thewebconf.org/

Publication series

Name: WWW - Proceedings of the ACM Web Conference

Conference

Conference: 34th ACM Web Conference (WWW’25)
Abbreviated title: WWW 2025
Place: Australia
City: Sydney
Period: 28/04/25 to 2/05/25

Funding

We thank reviewers for their insightful comments. This research has been partially supported by Australian Research Council Discovery Projects (DP230101196, DP240103068). Baiqi Chen is supported by the University of Queensland and CSIRO’s Data61 PhD scholarship.

Research Keywords

  • cookie policy
  • cookies
  • privacy compliance

Publisher's Copyright Statement

  • This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/
