Generative Adversarial Networks for Technical Ear Training in Perceiving How Reverberation Affects Frequency Contents in Music Production Education
Research output: Conference Papers › RGC 32 - Refereed conference paper (without host publication) › peer-review
Author(s)
Related Research Unit(s)
Detail(s)
Original language | English |
---|---|
Publication status | Published - Jan 2023 |
Conference
Title | International Conference on Music Education Technology 2023 (ICMdT2023) |
---|---|
Location | Hybrid, Education University of Hong Kong |
Place | Hong Kong |
Period | 10 - 12 January 2023 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(7c5acc17-1572-4a58-8ffd-0a628e6268d5).html |
---|
Abstract
Technical ear training is a type of “perceptual learning focused on timbral, dynamic and spatial attributes of sound” [1] and is an integral part of music production education. Reverberation is an audio effect frequently used in music production, and it affects both the spectrum of the audio and the timbre of the music. For music production students, technical ear training in how reverb affects the perceived spectrum is therefore important, since sound engineers need to perceive the subtle changes in timbre caused by reverb in order to reach the target texture they design. However, simply comparing audio without reverb (the original) against the same audio with reverb added is not enough to isolate the change in frequency content, because the version without reverb also lacks the spatial impression.
Therefore, two different frequency contents can be compared by listening to the audio processed with reverb and to the same reverberated audio with the spectral changes introduced by the reverb inverted. This comparison helps music production students train their ears to perceive how reverb affects the spectrum.
Additionally, as deep learning has been introduced into music production applications, a Dilated Residual Network (DRN) [2] and a Convolutional Neural Network (CNN) [3] have been applied. However, a generative adversarial network (GAN) has not yet been applied in music production education.
In this paper we aim to design an equalization filter by implementing a GAN that indicates the optimal filter coefficients, which provide a flat frequency response for audio with reverb. We feed the GAN with sine sweep signals processed with six different reverb presets, “chamber”, “hall”, “random hall”, “concert hall”, “plate” and “vintage plate”, from the Lexicon PCM Native Reverb plugin [4], and the neural networks output the filter coefficients. The goal of the optimization is to obtain filter coefficients that restore the frequency response of the original sine sweep signal (the target frequency response).
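To make the optimization target concrete, the following is a minimal sketch of what "filter coefficients that restore the frequency response of the original sweep" could mean, using only the Python standard library. It is not the GAN described above: it replaces the learned model with a direct spectral division, and the reverb is a hypothetical synthetic impulse response (`toy_reverb_ir`) standing in for a plugin preset. All function names, the sample rate, and the sweep range are assumptions made for illustration.

```python
import cmath
import math
import random


def log_sine_sweep(n, f_start, f_end, sample_rate):
    """Exponential sine sweep of n samples rising from f_start to f_end (Hz)."""
    duration = n / sample_rate
    rate = math.log(f_end / f_start)
    scale = 2 * math.pi * f_start * duration / rate
    return [math.sin(scale * (math.exp(rate * i / n) - 1)) for i in range(n)]


def toy_reverb_ir(length, seed=0):
    """Hypothetical reverb stand-in: direct sound plus a decaying noise tail."""
    rng = random.Random(seed)
    ir = [0.2 * rng.uniform(-1, 1) * math.exp(-0.2 * i) for i in range(length)]
    ir[0] = 1.0  # direct sound
    return ir


def convolve(x, h):
    """Plain time-domain linear convolution."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def magnitude_spectrum(x, n_fft):
    """Magnitudes of bins 0..n_fft/2 of a naive DFT (x is implicitly zero-padded)."""
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * n / n_fft) for n, v in enumerate(x)))
        for k in range(n_fft // 2 + 1)
    ]


def correction_gains(dry, wet, n_fft, floor=1e-12):
    """Per-bin gains that map the wet magnitude response back onto the dry one."""
    dry_mag = magnitude_spectrum(dry, n_fft)
    wet_mag = magnitude_spectrum(wet, n_fft)
    return [d / max(w, floor) for d, w in zip(dry_mag, wet_mag)]


sample_rate = 8000  # assumed, kept low so the naive DFT stays fast
dry = log_sine_sweep(256, 100.0, 3000.0, sample_rate)
ir = toy_reverb_ir(32)
wet = convolve(dry, ir)
gains = correction_gains(dry, wet, n_fft=512)

# Print the correction (in dB) at a few bins inside the sweep range.
for k in (16, 64, 128):
    freq = k * sample_rate / 512
    print(f"{freq:7.1f} Hz  {20 * math.log10(gains[k]):+.2f} dB")
```

Because the toy reverb is a linear convolution, each gain is simply the reciprocal of the reverb's magnitude response at that bin. A learned model would be expected to approximate such a curve with a small set of filter coefficients rather than one gain per bin.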
In conclusion, we propose a GAN-based method for designing a filter that inverts the frequency changes introduced by reverb, in order to facilitate technical ear training for music production students. The implementation can be extended to other audio effects, such as delay, to explore how they modify frequency content in the context of technical ear training.
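As an illustration of why a delay effect also changes frequency content, the sketch below computes the magnitude response of a single feedforward delay, y[n] = x[n] + g·x[n − D], both from the closed-form comb-filter expression and from a DFT of its impulse response. This is a textbook model chosen for illustration, not the authors' implementation; the delay length, gain, and sample rate are assumed values.

```python
import cmath
import math


def delay_magnitude_analytic(freq_hz, delay_samples, gain, sample_rate):
    """|H(f)| of y[n] = x[n] + gain * x[n - delay_samples] (comb filter)."""
    omega = 2 * math.pi * freq_hz / sample_rate
    return math.sqrt(1 + gain * gain + 2 * gain * math.cos(omega * delay_samples))


def delay_magnitude_dft(bin_index, n_fft, delay_samples, gain):
    """The same magnitude, read from a naive DFT of the impulse response."""
    ir = [0.0] * n_fft
    ir[0] = 1.0                 # direct sound
    ir[delay_samples] += gain   # single delayed copy
    acc = sum(v * cmath.exp(-2j * math.pi * bin_index * n / n_fft)
              for n, v in enumerate(ir))
    return abs(acc)


sample_rate = 8000   # assumed
delay_samples = 8    # 1 ms at the assumed sample rate
gain = 0.5           # level of the delayed copy

# The first notch falls at sample_rate / (2 * delay_samples) = 500 Hz.
peak = delay_magnitude_analytic(0.0, delay_samples, gain, sample_rate)
notch = delay_magnitude_analytic(500.0, delay_samples, gain, sample_rate)
print(f"gain at 0 Hz: {peak:.2f}, gain at 500 Hz: {notch:.2f}")
```

The alternating peaks and notches are the spectral change that an inverting filter would need to compensate if the same ear-training comparison were set up for delay.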
Research Area(s)
- Audio processing, Deep learning, Technical ear training, Generative adversarial networks, Music production
Bibliographic Note
Research Unit(s) information for this publication is provided by the author(s) concerned.
Citation Format(s)
Generative Adversarial Networks for Technical Ear Training in Perceiving How Reverberation Affects Frequency Contents in Music Production Education. / Chen, Manni; Lindborg, Per Magnus.
2023. Paper presented at International Conference on Music Education Technology 2023 (ICMdT2023), Hong Kong.