TY - JOUR
T1 - An evaluation of deep neural network models for music classification using spectrograms
AU - Li, Jingxian
AU - Han, Lixin
AU - Li, Xiaoshuang
AU - Zhu, Jun
AU - Yuan, Baohua
AU - Gou, Zhinan
PY - 2022/2
Y1 - 2022/2
N2 - Deep neural network (DNN) models have recently received considerable attention because their network structure can extract deep features that improve classification accuracy, and they have achieved excellent results in the image domain. However, because music and images differ in content form, transferring deep learning to music classification remains a problem. To address this issue, in this paper we transfer state-of-the-art DNN models to music classification and evaluate their performance using spectrograms. First, we convert the music audio files into spectrograms by modal transformation and then classify the music through deep learning. To alleviate overfitting during training, we propose a balanced trusted loss function and build the balanced trusted model ResNet50_trust. Finally, we compare the performance of different DNN models in music classification. In addition, this work adds music sentiment analysis based on a newly constructed music emotion dataset. Extensive experimental evaluations on three music datasets show that our proposed model ResNet50_trust consistently outperforms the other DNN models.
AB - Deep neural network (DNN) models have recently received considerable attention because their network structure can extract deep features that improve classification accuracy, and they have achieved excellent results in the image domain. However, because music and images differ in content form, transferring deep learning to music classification remains a problem. To address this issue, in this paper we transfer state-of-the-art DNN models to music classification and evaluate their performance using spectrograms. First, we convert the music audio files into spectrograms by modal transformation and then classify the music through deep learning. To alleviate overfitting during training, we propose a balanced trusted loss function and build the balanced trusted model ResNet50_trust. Finally, we compare the performance of different DNN models in music classification. In addition, this work adds music sentiment analysis based on a newly constructed music emotion dataset. Extensive experimental evaluations on three music datasets show that our proposed model ResNet50_trust consistently outperforms the other DNN models.
KW - Deep learning
KW - DNN models
KW - Music classification
KW - Spectrograms
KW - Transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85101479960&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85101479960&origin=recordpage
U2 - 10.1007/s11042-020-10465-9
DO - 10.1007/s11042-020-10465-9
M3 - RGC 21 - Publication in refereed journal
SN - 1380-7501
VL - 81
SP - 4621
EP - 4647
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 4
ER -