Low-Latency In Situ Image Analytics With FPGA-Based Quantized Convolutional Neural Network
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Pages (from-to) | 2853-2866 |
Journal / Publication | IEEE Transactions on Neural Networks and Learning Systems |
Volume | 33 |
Issue number | 7 |
Online published | 12 Jan 2021 |
Publication status | Published - Jul 2022 |
Externally published | Yes |
Link(s)
DOI | DOI |
---|---|
Attachment(s) | Documents: Publisher's Copyright Statement |
Link to Scopus | https://www.scopus.com/record/display.uri?eid=2-s2.0-85099543895&origin=recordpage |
Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(a68c24b2-e8a0-473f-8483-08a29618f4f6).html |
Abstract
Real-time in situ image analytics impose stringent latency requirements on intelligent neural network inference operations. While conventional software-based implementations on graphics processing unit (GPU)-accelerated platforms are flexible and have achieved very high inference throughput, they are not suitable for latency-sensitive applications where real-time feedback is needed. Here, we demonstrate that high-performance reconfigurable computing platforms based on field-programmable gate array (FPGA) processing can successfully bridge the gap between low-level hardware processing and high-level intelligent image analytics algorithm deployment within a unified system. The proposed design performs inference operations on a stream of individual images as they are produced and has a deeply pipelined hardware design that allows all layers of a quantized convolutional neural network (QCNN) to compute concurrently with partial image inputs. Using the case of label-free classification of human peripheral blood mononuclear cell (PBMC) subtypes as a proof-of-concept illustration, our system achieves an ultralow classification latency of 34.2 μs with over 95% end-to-end accuracy by using a QCNN, while the cells are imaged at throughput exceeding 29 200 cells/s. Our QCNN design is modular and is readily adaptable to other QCNNs with different latency and resource requirements. © 2012 IEEE.
Research Area(s)
- Cell image classification, convolutional neural network (CNN), field-programmable gate array (FPGA), hardware architecture, low-latency inference, multiplexed asymmetric-detection time-stretch optical microscopy (multi-ATOM), quantized convolutional neural network (QCNN), reconfigurable computing
Citation Format(s)
Low-Latency In Situ Image Analytics With FPGA-Based Quantized Convolutional Neural Network. / Wang, Maolin; Lee, Kelvin C. M.; Chung, Bob M. F. et al.
In: IEEE Transactions on Neural Networks and Learning Systems, Vol. 33, No. 7, 07.2022, p. 2853-2866.