
MobileSAM-Track: Lightweight One-Shot Tracking and Segmentation of Small Objects on Edge Devices

  • Yehui Liu
  • Yuliang Zhao*
  • Xinyue Zhang
  • Xiaoai Wang
  • Chao Lian
  • Jian Li
  • Peng Shan
  • Changzeng Fu
  • Xiaoyong Lyu
  • Lianjiang Li
  • Qiang Fu
  • Wen Jung Li

*Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review


Abstract

Tracking and segmenting small targets in remote sensing videos on edge devices has significant engineering value. However, many semi-supervised video object segmentation (S-VOS) methods rely heavily on video random-access memory (VRAM), which makes them difficult to deploy on edge devices. Our goal is to develop an edge-deployable S-VOS method that achieves high-precision tracking and segmentation from a single bounding box selected around the target object. First, a tracker is introduced to locate the tracked object in each frame, which eliminates the need to store segmentation results, as other S-VOS methods do, and thus avoids an increase in VRAM usage. Second, we use two lightweight key components, correlation filters (CFs) and the Mobile Segment Anything Model (MobileSAM), to keep inference fast. Third, we propose a mask diffusion module that improves the accuracy and robustness of segmentation without increasing VRAM usage. We evaluate our method on a self-built dataset containing airplanes and vehicles. On a GTX 1080 Ti, our model achieves a J&F score of 66.4% with VRAM usage below 500 MB while maintaining a processing speed of 12 frames per second (FPS). The proposed model performs well at tracking and segmenting small targets on edge devices, offering a solution for applications such as aircraft monitoring and vehicle tracking that require S-VOS on edge devices. © 2023 by the authors.
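To make the tracking step concrete, the following is a minimal sketch of a MOSSE-style correlation filter in NumPy. It is an illustration only and is not taken from the paper: the abstract does not say which CF variant is used, and the function names (`train_filter`, `locate`, `update_box`) and parameters (`sigma`, `eps`) are hypothetical. The sketch shows how a filter learned from one target patch can recover the target's displacement in a later frame, so that only the filter and the current box need to be kept rather than earlier masks. The updated box could then serve as a box prompt for a segmenter such as MobileSAM, a step not shown here.

```python
import numpy as np


def gaussian_response(h, w, sigma=2.0):
    """Desired filter output: a Gaussian peak at the patch centre."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = h // 2, w // 2
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))


def train_filter(patch, sigma=2.0, eps=1e-3):
    """Learn a single-sample correlation filter in the Fourier domain.

    Returns the conjugate filter H* = (G . conj(F)) / (F . conj(F) + eps),
    where F is the FFT of the patch and G the FFT of the desired response.
    eps is an assumed regularisation constant.
    """
    F = np.fft.fft2(patch)
    G = np.fft.fft2(gaussian_response(*patch.shape, sigma=sigma))
    return (G * np.conj(F)) / (F * np.conj(F) + eps)


def locate(h_conj, patch):
    """Return the (dy, dx) shift of the target relative to the patch centre."""
    response = np.real(np.fft.ifft2(np.fft.fft2(patch) * h_conj))
    py, px = np.unravel_index(np.argmax(response), response.shape)
    h, w = patch.shape
    return int(py) - h // 2, int(px) - w // 2


def update_box(box, shift):
    """Move an (x, y, w, h) box by a (dy, dx) shift; the size is unchanged."""
    x, y, w, h = box
    dy, dx = shift
    return (x + dx, y + dy, w, h)
```

In use, `train_filter` would be called once on the patch cropped from the user-selected bounding box, and `locate` followed by `update_box` would be called on the same search window in each later frame. A practical tracker would also update the filter over time and handle scale changes, which this sketch omits.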
Original language: English
Article number: 5665
Journal: Remote Sensing
Volume: 15
Issue number: 24
Online published: 7 Dec 2023
DOIs
Publication status: Published - Dec 2023

Research Keywords

  • correlation filters (CFs)
  • edge devices
  • remote sensing
  • Segment Anything Model (SAM)
  • semi-supervised video object segmentation (S-VOS)

Publisher's Copyright Statement

  • This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/
