
WCET Estimation for CNN Inference on FPGA SoC With Multi-DPU Engines

  • Wei Zhang
  • Yunlong Yu
  • Xiao Jiang
  • Nan Guan
  • Naijun Zhan
  • Lei Ju*
  • *Corresponding author for this work

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

Abstract

The Deep Learning Processor Unit (DPU) released in the official Xilinx Vitis AI toolchain is a commercial off-the-shelf solution for accelerating convolutional neural network (CNN) inference on Xilinx FPGA devices. While most FPGA accelerators focus on high performance and energy efficiency, analyzing the worst-case execution time (WCET) bound is essential for using CNN accelerators in real-time embedded system design. In this work, we show that in a multi-DPU environment, the observed worst-case inference time for a CNN inference task can be 3X larger than the best-case inference time, which underscores the importance of static timing analysis for FPGA-based CNN inference. We propose, to the best of the authors’ knowledge, the first static timing analysis framework for CNN inference in a multi-DPU environment. The proposed framework introduces a generalized timing behavior model for shared bus arbitration and memory access contention between DPU engines running in parallel. Additionally, it incorporates a fine-grained memory access contention analysis that takes into account the characteristics of deep learning applications. For a single-DPU environment, the analysis result is 27% tighter on average than the state-of-the-art results. Furthermore, our proposed method produces relatively tight estimates in the multi-DPU environment. © 1990-2012 IEEE.
Original language: English
Pages (from-to): 1146-1160
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 36
Issue number: 6
Online published: 1 Apr 2025
DOIs
Publication status: Published - Jun 2025

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 62432005 and Grant 62302270, in part by Shandong Provincial Natural Science Foundation under Grant ZR20220F003 and Grant ZR2024MF099, in part by the Department of Science & Technology of Shandong Province under Grant SYS202201, in part by Quan Cheng Laboratory under Grant QCLZD202302, and in part by Taishan Scholars Program under Grant tsqn202211281.

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 7 - Affordable and Clean Energy

Research Keywords

  • FPGA
  • memory contention
  • static timing analysis
  • WCET estimation
