Deep and CNN fusion method for binaural sound source localisation

Jiang, Shilong; Wu, Lulu; Yuan, Peipei; Sun, Yongheng; Liu, Hong

Title	Deep and CNN fusion method for binaural sound source localisation
Authors	Jiang, Shilong; Wu, Lulu; Yuan, Peipei; Sun, Yongheng; Liu, Hong
Affiliation	PKU KUST Shenzhen Hong Kong Inst, Shenzhen, Peoples R China; Peking Univ, Shenzhen Grad Sch, Key Lab Machine Percept, Shenzhen, Peoples R China
Keywords	MODEL
Issue Date	Jul-2020
Publisher	JOURNAL OF ENGINEERING-JOE
Abstract	In binaural sound source localisation, front-back confusion is often a challenging problem when localising sources in noisy or reverberant environments. Hence, a novel algorithm fusing a deep neural network (DNN) and a convolutional neural network (CNN) is proposed to address this issue. First, joint features, which consist of interaural level differences (ILDs) and the cross-correlation function (CCF) within a lag range, are extracted from binaural signals. Second, with the extracted CCF-ILD features, the CNN is used for the front-back classification task, while the DNN is used for the azimuth classification task. The front-back features extracted by the CNN can be leveraged as additional information for the sound source localisation task. Also, an angle-loss function is designed to avoid overfitting and to improve the generalisation ability of the method in adverse acoustic conditions. Finally, the two branches are concatenated and followed by an output layer, which generates the posterior probability of azimuth angles; the azimuth corresponding to the maximum posterior probability is chosen as the direction of the sound source. Experimental results demonstrate the effectiveness of the authors' method for front-back decision and azimuth estimation in noisy and reverberant environments.
URI	http://hdl.handle.net/20.500.11897/591589
DOI	10.1049/joe.2019.1207
Indexed	CCR ESCI IC
Appears in Collections:	Shenzhen Graduate School (pending claim); Key Laboratory of Machine Perception and Intelligence (Ministry of Education)
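
The joint CCF-ILD feature described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it works on a single broadband frame rather than per frequency band, and the function name `ccf_ild_features`, the 16 kHz sampling rate, and the 1 ms lag range are all assumptions chosen for illustration.

```python
import numpy as np

def ccf_ild_features(left, right, fs=16000, max_lag_ms=1.0):
    """Return [normalised CCF within +/- max_lag, ILD in dB] for one frame.

    Hypothetical helper: fs and max_lag_ms are illustrative defaults,
    not values taken from the paper.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    max_lag = int(round(fs * max_lag_ms / 1000.0))

    # Full cross-correlation; index len(right) - 1 corresponds to zero lag.
    full = np.correlate(left, right, mode="full")
    zero = len(right) - 1
    ccf = full[zero - max_lag: zero + max_lag + 1]

    # Normalise by the frame energies so values lie roughly in [-1, 1].
    eps = 1e-12
    e_left = np.sum(left ** 2)
    e_right = np.sum(right ** 2)
    ccf = ccf / (np.sqrt(e_left * e_right) + eps)

    # Interaural level difference: energy ratio of the two ears in dB.
    ild = 10.0 * np.log10((e_left + eps) / (e_right + eps))

    return np.concatenate([ccf, [ild]])

# Synthetic frame: the right channel is the left channel delayed by
# 5 samples and attenuated by half (about 6 dB).
rng = np.random.default_rng(0)
x = rng.standard_normal(512)
delay = 5
left_frame = x
right_frame = 0.5 * np.concatenate([np.zeros(delay), x[:-delay]])

features = ccf_ild_features(left_frame, right_frame)
ccf_part, ild_db = features[:-1], features[-1]
peak_lag = int(np.argmax(ccf_part)) - (len(ccf_part) - 1) // 2
```

In this sketch the CCF peak position encodes the interaural time difference and the last element carries the level difference. A feature vector of this kind would then be fed to both the CNN branch and the DNN branch.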
