Title | Deep and CNN fusion method for binaural sound source localisation |
Authors | Jiang, Shilong; Wu, Lulu; Yuan, Peipei; Sun, Yongheng; Liu, Hong |
Affiliation | PKU KUST Shenzhen Hong Kong Inst, Shenzhen, Peoples R China; Peking Univ, Shenzhen Grad Sch, Key Lab Machine Percept, Shenzhen, Peoples R China |
Keywords | MODEL |
Issue Date | Jul-2020 |
Publisher | JOURNAL OF ENGINEERING-JOE |
Abstract | In binaural sound source localisation, front-back confusion is often a challenging problem when localising sources in noisy or reverberant environments. Hence, a novel algorithm fusing a deep neural network (DNN) and a convolutional neural network (CNN) is proposed to address this issue. First, joint features, which consist of interaural level differences (ILDs) and the cross-correlation function (CCF) within a lag range, are extracted from binaural signals. Second, with the extracted CCF-ILD features, the CNN is used for the front-back classification task, while the DNN is used for the azimuth classification task. The front-back features extracted by the CNN can be leveraged as additional information for the sound source localisation task. Also, an angle-loss function is designed to avoid overfitting and to improve the generalisation ability of the method in adverse acoustic conditions. Finally, the two branches are concatenated and followed by an output layer, which generates the posterior probability of the azimuth angles; the azimuth corresponding to the maximum posterior probability is chosen as the direction of the sound source. Experimental results demonstrate the effectiveness of the authors' method for front-back decision and azimuth estimation in noisy and reverberant environments. |
URI | http://hdl.handle.net/20.500.11897/591589 |
DOI | 10.1049/joe.2019.1207 |
Indexed | CCR ESCI IC |
Appears in Collections: | Shenzhen Graduate School (pending claim); Key Laboratory of Machine Perception and Intelligence, Ministry of Education |
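
The joint CCF-ILD feature described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ccf_ild_features`, the `max_lag` parameter, the single broadband frame (no auditory filterbank), and the decibel energy-ratio definition of the ILD are all assumptions made for illustration.

```python
import numpy as np


def ccf_ild_features(left, right, max_lag=16):
    """Return (ccf, ild) for one binaural frame.

    Hypothetical helper: computes a normalised cross-correlation function
    restricted to lags in [-max_lag, +max_lag] and a broadband ILD in dB.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    if left.shape != right.shape or left.ndim != 1:
        raise ValueError("left and right must be 1-D arrays of equal length")
    if max_lag >= len(left):
        raise ValueError("max_lag must be smaller than the frame length")

    # Remove the mean so the correlation is not dominated by a DC offset.
    l = left - left.mean()
    r = right - right.mean()

    # Full linear cross-correlation; zero lag sits at index len(l) - 1.
    full = np.correlate(l, r, mode="full")
    norm = np.sqrt(np.sum(l ** 2) * np.sum(r ** 2)) + 1e-12
    zero = len(l) - 1
    ccf = full[zero - max_lag: zero + max_lag + 1] / norm

    # ILD as the left/right energy ratio in decibels (assumed definition).
    eps = 1e-12
    ild = 10.0 * np.log10((np.sum(left ** 2) + eps) / (np.sum(right ** 2) + eps))
    return ccf, ild
```

With this convention, a right-ear signal that lags the left by d samples produces a CCF peak at lag -d, and a right-ear signal at half the amplitude gives an ILD of roughly 6 dB. In the approach summarised above, features of this kind would feed both the CNN branch (front-back) and the DNN branch (azimuth); the network and the angle-loss function are not sketched here because the abstract does not specify them.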