LLaVA-Oil Painting Appreciation: A Vision-Language Model for Enhancing Understanding of Oil Paintings Through AI-Driven Analysis and Conversation

Diyang Guan

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

Oil painting has been a dominant art form since the 15th century, known for its profound impact on artistic expression. While exhibitions offer people unprecedented access to masterpieces, many viewers lack formal artistic training or an understanding of oil painting and find it hard to appreciate the intricate techniques and historical context of these works. This paper introduces LLaVA-Oil Painting Appreciation (LLaVA-OPA), a vision-language conversational assistant designed to help individuals engage more deeply with oil paintings. Built on the LLaVA-1.6 model and fine-tuned with LoRA, LLaVA-OPA is trained on a curated dataset of oil paintings, supplemented by expert annotations from art scholars and professionals. The model identifies key elements such as brushwork, composition, and style, offering insightful, accessible feedback to users. By incorporating GPT-4 to generate open-ended questions and responses, LLaVA-OPA creates a conversational experience that allows users, regardless of prior art knowledge, to explore the complexities of oil paintings in an engaging manner. Initial evaluations demonstrate that LLaVA-OPA excels in providing tailored, expert-like analysis while maintaining user-friendly interactions. Future research will focus on deepening the model's understanding by integrating information about artists' creative inspirations and historical context, enabling even more nuanced and meaningful conversations in various art settings, including exhibitions. © 2024 IEEE.
Original language: English
Title of host publication: 2024 International Conference on Image Processing, Computer Vision and Machine Learning (ICICML)
Publisher: IEEE
Pages: 934-941
ISBN (Electronic): 979-8-3503-5541-3
Publication status: Published - Nov 2024
Externally published: Yes
Event: 2024 3rd International Conference on Image Processing, Computer Vision and Machine Learning (ICICML 2024) - Shenzhen, China
Duration: 22 Nov 2024 to 24 Nov 2024
https://www.icicml.org/history_icicml2024

Publication series

Name: International Conference on Image Processing, Computer Vision and Machine Learning, ICICML

Conference

Conference: 2024 3rd International Conference on Image Processing, Computer Vision and Machine Learning (ICICML 2024)
Abbreviated title: ICICML 2024
Country/Territory: China
City: Shenzhen
Period: 22/11/24 to 24/11/24

Research Keywords

  • instruction tuning
  • Multimodal large language model
  • Oil painting appreciation
