
Multi-Scale and Multi-Network Deep Feature Fusion for Discriminative Scene Classification of High-Resolution Remote Sensing Images

Baohua Yuan, Sukhjit Singh Sehra, Bernard Chiu*

*Corresponding author for this work

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review


Abstract

The advancement in satellite image sensors has enabled the acquisition of high-resolution remote sensing (HRRS) images. However, accurately interpreting these images is challenging due to their complexity and the computational power required. This manuscript proposes a multi-stream convolutional neural network (CNN) fusion framework that integrates multiple scales and multiple CNNs for HRRS image recognition. Pre-trained CNNs were used to learn and extract semantic features from multi-scale HRRS images; feature extraction using pre-trained CNNs is more efficient than training a CNN from scratch or fine-tuning one. Discriminative canonical correlation analysis (DCCA) was used to fuse deep features extracted across CNNs and image scales. DCCA reduced the dimension of the features extracted from CNNs while providing a discriminative representation by maximizing the within-class correlation and minimizing the between-class correlation. The proposed model was evaluated on the NWPU-RESISC45 and UC Merced datasets. The accuracy associated with DCCA was 10% and 6% higher than that of discriminant correlation analysis (DCA) on the NWPU-RESISC45 and UC Merced datasets, respectively. The advantage of DCCA was better demonstrated on the NWPU-RESISC45 dataset because of its richer within-class variability. While both DCA and DCCA minimize between-class correlation, only DCCA maximizes the within-class correlation and, therefore, attains better accuracy. The proposed framework achieved higher accuracy than all state-of-the-art frameworks involving unsupervised learning and pre-trained CNNs, and 2–3% higher than the majority of fine-tuned CNNs. The proposed framework also offers a computational time advantage, requiring only 13 s for training on NWPU-RESISC45, compared to a day for fine-tuning existing CNNs. Thus, the proposed framework achieves a favourable balance between efficiency and accuracy in HRRS image recognition.
© 2024 by the authors.
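The abstract describes DCCA as fusing two deep feature sets by maximizing within-class correlation while minimizing between-class correlation. A minimal NumPy sketch of that idea is shown below; it is an illustrative reconstruction based only on the abstract, not the authors' code, and the function and variable names are hypothetical. It builds a cross-correlation matrix that weights same-class sample pairs positively and different-class pairs negatively, then takes its leading singular directions as the paired projections.

```python
import numpy as np

def dcca_projections(X, Y, labels, d=2):
    """Illustrative DCCA-style fusion sketch (assumed formulation).

    X, Y   : (n, p) and (n, q) feature matrices from two CNN streams/scales
    labels : (n,) class labels
    d      : number of projection directions to keep
    Returns projection matrices Wx (p, d) and Wy (q, d).
    """
    # Center each feature set
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)

    # Within-class cross-correlation: sum of X_c^T Y_c over classes c
    Cw = np.zeros((X.shape[1], Y.shape[1]))
    for c in np.unique(labels):
        mask = labels == c
        Cw += X[mask].T @ Y[mask]

    # Between-class part is the remainder of the total cross-correlation
    Cb = X.T @ Y - Cw

    # Maximize within-class, penalize between-class correlation
    M = Cw - Cb
    U, _, Vt = np.linalg.svd(M)
    return U[:, :d], Vt[:d].T

# Usage: fuse two 8-dim feature streams for 30 samples in 3 classes
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8))
Y = rng.normal(size=(30, 8))
labels = np.repeat([0, 1, 2], 10)
Wx, Wy = dcca_projections(X, Y, labels, d=2)
fused = np.hstack([X @ Wx, Y @ Wy])  # (30, 4) fused discriminative features
```

The projected features from the two streams can then be concatenated (as above) or summed before being passed to a simple classifier, which is consistent with the abstract's claim that training after fusion is fast.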
Original language: English
Article number: 3961
Journal: Remote Sensing
Volume: 16
Issue number: 21
Online published: 24 Oct 2024
Publication status: Published - Nov 2024

Funding

This research was funded by the Research Grant Council of the HKSAR, China (Project No. CityU 11205421) and the Jiangsu Engineering Research Center of Digital Twinning Technology for Key Equipment in Petrochemical Process under Grant DTEC202102.

Research Keywords

  • convolutional neural network (CNN)
  • discriminant correlation analysis (DCA)
  • discriminative canonical correlation analysis (DCCA)
  • feature fusion
  • scene classification

Publisher's Copyright Statement

  • This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/

RGC Funding Information

  • RGC-funded
