Segment Any Event Streams via Weighted Adaptation of Pivotal Tokens

Zhiwen Chen, Zhiyu ZHU, Yifan ZHANG, Junhui HOU, Guangming Shi, Jinjian Wu

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

Abstract

In this paper, we delve into the nuanced challenge of tailoring the Segment Anything Models (SAMs) for integration with event data, with the overarching objective of attaining robust and universal object segmentation within the event-centric domain. One pivotal issue at the heart of this endeavor is the precise alignment and calibration of embeddings derived from event-centric data such that they harmoniously coincide with those originating from RGB imagery. Capitalizing on the vast repositories of datasets with paired events and RGB images, our proposition is to harness and extrapolate the profound knowledge encapsulated within the pre-trained SAM framework. As a cornerstone to achieving this, we introduce a multi-scale feature distillation methodology. This methodology rigorously optimizes the alignment of token embeddings originating from event data with their RGB image counterparts, thereby preserving and enhancing the robustness of the overall architecture. Considering the distinct significance that token embeddings from intermediate layers hold for higher-level embeddings, our strategy is centered on accurately calibrating the pivotal token embeddings. This targeted calibration is aimed at effectively managing the discrepancies in high-level embeddings originating from both the event and image domains. Extensive experiments on different datasets demonstrate the effectiveness of the proposed distillation method. Code is available at https://github.com/happychenpipi/EventSAM. © 2024 IEEE.
Original language: English
Title of host publication: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Editors: Cristina Ceballos
Publisher: IEEE
Pages: 3890-3900
Number of pages: 11
ISBN (Electronic): 979-8-3503-5300-6
ISBN (Print): 979-8-3503-5301-3
Publication status: Published - 2024
Event: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024) - Seattle Convention Center, Seattle, United States
Duration: 17 Jun 2024 to 21 Jun 2024
https://cvpr.thecvf.com/Conferences/2024
https://ieeexplore.ieee.org/xpl/conhome/1000147/all-proceedings
https://cvpr.thecvf.com/virtual/2024/index.html

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919
ISSN (Electronic): 2575-7075

Conference

Conference: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024)
Place: United States
City: Seattle
Period: 17/06/24 to 21/06/24

Bibliographical note

Research Unit(s) information for this publication is provided by the author(s) concerned.

Funding

This work is supported by the Fundamental Research Funds for the Central Universities (QTZX23038), the CityU Strategic Research Grant (7005990), and the Innovation and Technology Fund (MHP/117/21).

Research Keywords

  • cross-modal knowledge distillation
  • event-based vision
  • segmentation
