TY - JOUR
T1 - NIC-QF
T2 - A design of FPGA based Network Interface Card with Query Filter for big data systems
AU - Zhan, Jinyu
AU - Jiang, Wei
AU - Li, Ying
AU - Wu, Junting
AU - Zhu, Jianping
AU - Yu, Jinghuan
PY - 2022/11
Y1 - 2022/11
AB - This paper presents an approach to accelerate query processing on storage and computing separated big data systems. Unlike traditional co-processor methods, we propose an FPGA-based Network Interface Card with Query Filter (NIC-QF) to pre-filter data on storage nodes. Without modifying the hardware architecture of storage nodes, the traditional NIC can be easily replaced with our NIC-QF to reduce the workload of computing nodes and the corresponding communication overhead. Integrating a PCIe core, a query filter and a NIC communicator, NIC-QF can filter the original data on storage nodes and send the filtered data directly to the computing nodes of big data systems, reducing the extra communication overhead inside the storage nodes. The filter units in the query filter perform multiple SQL tasks in parallel, and each filter unit is internally pipelined, which further speeds up data processing. The filter units are designed to support general SQL queries on various data formats, including TextFile (a row-based storage format) and RCFile (a column-based storage format). Experiments on the TPC-H benchmark and the Tencent data set demonstrate the superiority of our design, which reduces time overhead by up to 87.84% compared with traditional approaches.
KW - Acceleration
KW - Big data systems
KW - FPGA
KW - Network Interface Card
KW - Query filter
KW - Storage and computing separated architecture
UR - http://www.scopus.com/inward/record.url?scp=85132235344&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85132235344&origin=recordpage
U2 - 10.1016/j.future.2022.06.001
DO - 10.1016/j.future.2022.06.001
M3 - RGC 21 - Publication in refereed journal
SN - 0167-739X
VL - 136
SP - 153
EP - 169
JO - Future Generation Computer Systems
JF - Future Generation Computer Systems
ER -