TY - JOUR
T1 - Parallelization of cellular neural networks on GPU
AU - Ho, Tze-Yui
AU - Lam, Ping-Man
AU - Leung, Chi-Sing
PY - 2008/8
Y1 - 2008/8
AB - Recently, cellular neural networks (CNNs) have been demonstrated to be a highly effective paradigm applicable in a wide range of areas. Typically, CNNs can be implemented using VLSI circuits, but this would unavoidably require additional hardware. On the other hand, we can also implement CNNs purely by software; this, however, would result in very low performance when given a large CNN problem size. Nowadays, conventional desktop computers are usually equipped with programmable graphics processing units (GPUs) that can support parallel data processing. This paper introduces a GPU-based CNN simulator. In detail, we carefully organize the CNN data as 4-channel textures, and efficiently implement the CNN computation as fragment programs running in parallel on a GPU. In this way, we can create a high performance but low-cost CNN simulator. Experimentally, we demonstrate that the resultant GPU-based CNN simulator can run 8-17 times faster than a CPU-based CNN simulator. © 2008 Elsevier Ltd. All rights reserved.
KW - Cellular neural networks
KW - GPU
KW - SIMD
UR - http://www.scopus.com/inward/record.url?scp=42749094942&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-42749094942&origin=recordpage
U2 - 10.1016/j.patcog.2008.01.018
DO - 10.1016/j.patcog.2008.01.018
M3 - RGC 21 - Publication in refereed journal
SN - 0031-3203
VL - 41
SP - 2684
EP - 2692
JO - Pattern Recognition
JF - Pattern Recognition
IS - 8
ER -