Are Your Requests Your True Needs? Checking Excessive Data Collection in VPA Apps

  • Fuman Xie
  • Chuan Yan
  • Mark Huasong Meng
  • Shaoming Teng
  • Yanjun Zhang
  • Guangdong Bai*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review

Abstract

Virtual personal assistant (VPA) services encompass a large number of third-party applications (or apps) to enrich their functionalities. These apps have been well examined to scrutinize their data collection behaviors against their declared privacy policies. Nonetheless, it is often overlooked that most users tend to ignore privacy policies at installation time. Dishonest developers can thus exploit this situation by embedding excessive declarations to cover their data collection behaviors during compliance auditing.
In this work, we present Pico, a privacy inconsistency detector, which checks a VPA app's privacy compliance by analyzing the (in)consistency between the data it requests and the data essential for its functionality. Pico understands the app's functionality topics from its publicly available textual data, and leverages advanced GPT-based language models to address domain-specific challenges. Based on counterparts with similar functionality, suspicious data collection can be detected through the lens of anomaly detection. We apply Pico to understand the status quo of data-functionality compliance among all 65,195 skills in the Alexa app store. Our study reveals that 21.7% of the analyzed skills exhibit suspicious data collection, including Top 10 popular Alexa skills that pose threats to 54,116 users. These findings should raise an alert to both developers and users regarding compliance with the purpose limitation principle in data regulations.
© 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.
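
To illustrate the peer-comparison idea described in the abstract, the following is a minimal sketch and not Pico's actual implementation. It assumes each skill already has a functionality topic assigned (Pico derives topics from textual data with GPT-based models; here they are given as input), and it swaps in a simple frequency rule for the anomaly detector: a requested data type is flagged when fewer than a hypothetical `min_support` fraction of same-topic skills request it. The function name, field names, threshold, and sample data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def flag_suspicious(skills, min_support=0.2):
    """Map skill name -> sorted data types that few same-topic peers request.

    `skills` is a list of dicts with hypothetical keys "name", "topic",
    and "requests" (a list of requested data types).
    """
    # Group skills by their (pre-assigned) functionality topic.
    by_topic = defaultdict(list)
    for skill in skills:
        by_topic[skill["topic"]].append(skill)

    flagged = {}
    for group in by_topic.values():
        # Count how many skills in this topic request each data type.
        counts = Counter(d for s in group for d in set(s["requests"]))
        n = len(group)
        for s in group:
            # A request is "rare" if its share within the topic is below the threshold.
            rare = sorted(d for d in set(s["requests"]) if counts[d] / n < min_support)
            if rare:
                flagged[s["name"]] = rare
    return flagged


# Made-up sample data: six weather skills and two recipe skills.
skills = [
    {"name": f"weather_{c}", "topic": "weather", "requests": ["location"]}
    for c in "abcde"
]
skills.append(
    {"name": "weather_f", "topic": "weather", "requests": ["location", "phone_number"]}
)
skills += [
    {"name": "recipes_a", "topic": "recipes", "requests": []},
    {"name": "recipes_b", "topic": "recipes", "requests": []},
]

result = flag_suspicious(skills)
print(result)
```

In this toy data, every weather skill requests `location`, so it is treated as essential to the topic, while `phone_number` is requested by only one of six peers and is flagged. A rule this simple cannot flag anything in very small topic groups, which is one reason a real detector would need a more careful anomaly model.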
Original language: English
Title of host publication: ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering
Publisher: Association for Computing Machinery
ISBN (Print): 9798400702174
DOIs
Publication status: Published - May 2024
Externally published: Yes
Event: 46th IEEE/ACM International Conference on Software Engineering (ICSE 2024) - Centro Cultural de Belém, Lisbon, Portugal
Duration: 14 Apr 2024 → 20 Apr 2024
https://conf.researchr.org/home/icse-2024

Publication series

Name: Proceedings - International Conference on Software Engineering
ISSN (Print): 0270-5257

Conference

Conference: 46th IEEE/ACM International Conference on Software Engineering (ICSE 2024)
Place: Portugal
City: Lisbon
Period: 14/04/24 → 20/04/24
Internet address

Funding

We thank the anonymous reviewers for their insightful comments to improve this manuscript. This work is partially supported by Australian Research Council Discovery Projects (DP230101196, DP240103068).

Research Keywords

  • Alexa skills
  • privacy compliance
  • Virtual Personal Assistant
