
A Lightweight Multi-View Learning Approach for Phishing Attack Detection Using Transformer with Mixture of Experts

  • Yanbin Wang
  • Wenrui Ma
  • Haitao Xu*
  • Yiwei Liu*
  • Peng Yin

*Corresponding author for this work

Research output: Journal Publications and Reviews > RGC 21 - Publication in refereed journal > peer-review


Abstract

Phishing poses a significant threat to the financial and privacy security of internet users and often serves as the starting point for cyberattacks. Many machine-learning-based methods for detecting phishing websites rely on URL analysis, offering simplicity and efficiency. However, these approaches are not always effective due to the following reasons: (1) highly concealed phishing websites may employ tactics such as masquerading URL addresses to deceive machine learning models, and (2) phishing attackers frequently change their phishing website URLs to evade detection. In this study, we propose a robust, multi-view Transformer model with an expert-mixture mechanism for accurate phishing website detection utilizing website URLs, attributes, content, and behavioral information. Specifically, we first adapted a pretrained language model for URL representation learning by applying adversarial post-training learning in order to extract semantic information from URLs. Next, we captured the attribute, content, and behavioral features of the websites and encoded them as vectors, which, alongside the URL embeddings, constitute the website’s multi-view information. Subsequently, we introduced a mixture-of-experts mechanism into the Transformer network to learn knowledge from different views and adaptively fuse information from various views. The proposed method outperforms state-of-the-art approaches in evaluations of real phishing websites, demonstrating greater performance with less label dependency. Furthermore, we show the superior robustness and enhanced adaptability of the proposed method to unseen samples and data drift in more challenging experimental settings.

© 2023 by the authors.
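The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the gating function, the expert architecture, or how per-view outputs are combined, so the dense softmax gate, the single-matrix linear experts, the mean pooling across views, and every name below (`MixtureOfExperts`, `fuse_views`, the toy 4-dimensional view vectors) are assumptions made for illustration. The Transformer attention layers and the pretrained URL encoder are omitted entirely.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, vec):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class MixtureOfExperts:
    """Hypothetical dense (soft) mixture-of-experts layer.

    Each expert is a single linear map (an assumption; real experts are
    usually small feed-forward networks). A gating matrix scores the
    experts for each input vector.
    """

    def __init__(self, dim, num_experts, seed=0):
        rng = random.Random(seed)
        self.experts = [random_matrix(dim, dim, rng) for _ in range(num_experts)]
        self.gate = random_matrix(num_experts, dim, rng)

    def gate_weights(self, vec):
        """Per-expert mixing weights for one input; they sum to 1."""
        return softmax(linear(self.gate, vec))

    def forward(self, vec):
        """Gate-weighted sum of all expert outputs for one view vector."""
        weights = self.gate_weights(vec)
        outputs = [linear(expert, vec) for expert in self.experts]
        return [
            sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(len(vec))
        ]


def fuse_views(moe, views):
    """Route each view through the layer, then mean-pool the results.

    Mean pooling is an assumption; the abstract only says the views are
    fused adaptively.
    """
    outs = [moe.forward(v) for v in views]
    return [sum(col) / len(outs) for col in zip(*outs)]


# Toy stand-ins for the four views named in the abstract:
# URL embedding, attributes, content, and behavior.
views = [
    [0.1, 0.2, 0.3, 0.4],
    [1.0, 0.0, -1.0, 0.5],
    [0.0, 0.0, 0.0, 0.0],
    [0.3, 0.3, 0.3, 0.3],
]
moe = MixtureOfExperts(dim=4, num_experts=3, seed=0)
fused = fuse_views(moe, views)
print(fused)
```

Because the gate depends on the input vector, different views can weight the same experts differently, which is the sense in which such a layer fuses views adaptively. A trained model would learn the expert and gate weights rather than sampling them at random.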
Original language: English
Article number: 7429
Journal: Applied Sciences (Switzerland)
Volume: 13
Issue number: 13
Online published: 22 Jun 2023
Publication status: Published - Jul 2023
Externally published: Yes

Research Keywords

  • multi-view learning
  • phishing attack detection
  • self-supervised learning
  • transformer

Publisher's Copyright Statement

  • This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/
