
A Computational Aesthetic Design Science Study on Online Video Based on Triple-Dimensional Multimodal Analysis

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

Computational video aesthetic prediction uses models that automatically evaluate the features of a video to produce its aesthetic score. Current video aesthetic prediction models are built on bimodal frameworks. To address their limitations, we developed the Triple-Dimensional Multimodal Temporal Video Aesthetic neural network (TMTVA-net) model. Long Short-Term Memory (LSTM) forms the conceptual foundation of the design framework. In the multimodal transformer layer, we employed two distinct transformers, the multimodal transformer and the feature transformer, which enable the model to acquire modality-specific patterns and representational features adapted to each modality. The fusion layer has also been redesigned to compute both pairwise interactions and overall interactions among the features. This study contributes to the video aesthetic prediction literature by presenting a novel design framework that considers the synergistic effects of textual, audio, and video features.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
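The abstract does not specify how the fusion layer computes its interactions, so the following is only a minimal sketch of the general idea, not the TMTVA-net implementation. It assumes each modality has already been reduced to a fixed-length feature vector, and it uses element-wise products as a stand-in for both the pairwise and the overall (three-way) interaction terms. The function name `fuse` and the example values are hypothetical.

```python
from itertools import combinations
from math import prod


def fuse(features):
    """Concatenate pairwise and overall interaction terms.

    `features` maps a modality name to an equal-length list of floats.
    Element-wise products are an assumption made for illustration; the
    paper's actual interaction operator is not given in the abstract.
    """
    names = sorted(features)
    if len({len(features[n]) for n in names}) != 1:
        raise ValueError("all modality vectors must share one dimension")

    fused = []
    # Pairwise interactions: one element-wise product per modality pair.
    for a, b in combinations(names, 2):
        fused.extend(x * y for x, y in zip(features[a], features[b]))
    # Overall interaction: element-wise product across all modalities.
    fused.extend(prod(vals) for vals in zip(*(features[n] for n in names)))
    return fused


# Hypothetical two-dimensional features for the three modalities.
example = {
    "text": [1.0, 2.0],
    "audio": [0.5, 4.0],
    "video": [2.0, 0.5],
}
fused = fuse(example)
print(fused)
```

With three modalities of dimension d, this sketch yields a vector of length 4d: three pairwise blocks followed by one overall block. A learned fusion layer would typically replace the fixed products with trainable projections.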
Original language: English
Title of host publication: HCI International 2024 – Late Breaking Papers
Subtitle of host publication: 26th International Conference on Human-Computer Interaction, HCII 2024, Washington, DC, USA, June 29 – July 4, 2024, Proceedings
Editors: Aaron Marcus, Elizabeth Rosenzweig, Marcelo M. Soares, Pei-Luen Patrick Rau, Abbas Moallem
Place of Publication: Cham
Publisher: Springer
Pages: 68-79
Volume: Part VII
ISBN (Electronic): 978-3-031-76821-7
ISBN (Print): 9783031768200
Publication status: Published - 2025
Event: 26th International Conference on Human-Computer Interaction (HCII 2024) - Washington Hilton Hotel, Washington, United States
Duration: 29 Jun 2024 – 4 Jul 2024
https://2024.hci.international/index.html
https://2024.hci.international/about.html

Publication series

Name: Lecture Notes in Computer Science
Volume: 15380
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 26th International Conference on Human-Computer Interaction (HCII 2024)
Abbreviated title: HCI International 2024
Place: United States
City: Washington
Period: 29/06/24 – 4/07/24

Research Keywords

  • Computational Video Aesthetic
  • Design Science
  • Multimodal Analysis
  • Neural Network
