Title: Two-Stage Monaural Source Separation in Reverberant Room Environments using Deep Neural Networks
Authors: Sun, Yang
Wang, Wenwu
Chambers, Jonathon
Naqvi, Syed Mohsen
First Published: 17-Oct-2018
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Citation: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019, 27(1), pp. 125-139
Abstract: Deep neural networks (DNNs) have been used for dereverberation and separation in the monaural source separation problem. However, the performance of current state-of-the-art methods is limited, particularly when applied in highly reverberant room environments. In this paper, we propose a two-stage approach with two DNN-based methods to address this problem. In the first stage, the dereverberation of the speech mixture is achieved with the proposed dereverberation mask (DM). In the second stage, the dereverberant speech mixture is separated with the ideal ratio mask (IRM). To realize this two-stage approach, in the first DNN-based method, the DM is integrated with the IRM to generate an enhanced time-frequency (T-F) mask, namely the ideal enhanced mask (IEM), as the training target for a single DNN. In the second DNN-based method, the DM and the IRM are predicted with two individual DNNs. The IEEE and TIMIT corpora with real room impulse responses and noise from the NOISEX dataset are used to generate speech mixtures for evaluation. The proposed methods outperform state-of-the-art methods, particularly in highly reverberant room environments.
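The mask composition described in the abstract can be sketched numerically. The snippet below is a minimal illustration only, not the authors' implementation: it assumes common energy-ratio definitions for the DM and IRM from the mask-based separation literature, uses synthetic magnitude spectrograms in place of real STFTs, and composes the two masks element-wise to form an IEM-style training target.

```python
import numpy as np

# Hypothetical T-F magnitudes (freq bins x frames); real systems would
# compute these from STFTs of the direct-path, reverberant, and
# interfering signals.
rng = np.random.default_rng(0)
shape = (257, 100)
direct = rng.random(shape) + 1e-8   # |direct-path target speech|
reverb = rng.random(shape)          # |late-reverberation residual|
interf = rng.random(shape)          # |interfering source|

# Dereverberation mask (DM): assumed energy-ratio form, the fraction of
# reverberant-mixture energy attributable to the direct-path signal.
dm = direct**2 / (direct**2 + reverb**2)

# Ideal ratio mask (IRM): standard energy-ratio definition for
# separating the target from the interferer in the dereverberant mixture.
irm = direct**2 / (direct**2 + interf**2)

# IEM-style target: the two stages composed into a single mask,
# sketched here as an element-wise product (an assumption, not the
# paper's exact integration).
iem = dm * irm

# All three masks are bounded in [0, 1] per T-F unit.
print(iem.shape, float(iem.min()) >= 0.0, float(iem.max()) <= 1.0)
```

Applying such a mask to a mixture spectrogram (element-wise multiplication, then inverse STFT) yields the enhanced estimate; in the paper the DNNs are trained to predict these targets from mixture features.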
DOI Link: 10.1109/TASLP.2018.2874708
ISSN: 1558-7916
Version: Post-print
Status: Peer-reviewed
Type: Journal Article
Rights: Copyright © 2018, Institute of Electrical and Electronics Engineers (IEEE). Deposited with reference to the publisher’s open access archiving policy.
Appears in Collections:Published Articles, Dept. of Engineering

Files in This Item:
File: Final version Yang Sun.pdf
Description: Post-review (final submitted author manuscript)
Size: 2.54 MB
Format: Adobe PDF

Items in LRA are protected by copyright, with all rights reserved, unless otherwise indicated.