Steve Renals (School of Informatics, University of Edinburgh, United Kingdom)
Neural networks for distant speech recognition
Distant conversational speech recognition is a highly challenging task owing to the presence of multiple, overlapping talkers, additional non-speech acoustic sources, and the effects of reverberation. In this talk I’ll review work on distant speech recognition, with an emphasis on approaches which combine multichannel signal processing with acoustic modelling, and present some recent work on the use of hybrid neural network / hidden Markov model acoustic models for distant speech recognition of meetings recorded using microphone arrays. I’ll specifically focus on ways in which deep neural networks can learn suitable representations for distant speech recognition based on multichannel input.
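The hybrid approach mentioned above can be illustrated with a minimal sketch: a feed-forward DNN maps spliced, multichannel acoustic features to per-frame posteriors over tied HMM states. This is an illustrative toy, not the speaker's actual system; the dimensions (4 channels, 40 features per channel, a ±5-frame context window, 2000 HMM states), the `splice` and `dnn_posteriors` helpers, and the random weights are all assumptions made for the example.

```python
import numpy as np

# Hypothetical dimensions, for illustration only (not from the talk):
# 4 microphone channels, 40 features per channel, a +/-5 frame
# context window, and 2000 tied HMM states.
N_CHANNELS, N_FEATS, CONTEXT, N_STATES = 4, 40, 5, 2000
INPUT_DIM = N_CHANNELS * N_FEATS * (2 * CONTEXT + 1)

rng = np.random.default_rng(0)

def splice(frames, context=CONTEXT):
    """Stack each frame with its +/-context neighbours (edges padded by repetition)."""
    T = frames.shape[0]
    padded = np.pad(frames, ((context, context), (0, 0)), mode="edge")
    return np.stack([padded[t:t + 2 * context + 1].ravel() for t in range(T)])

def dnn_posteriors(x, weights):
    """Feed-forward pass: sigmoid hidden layers, softmax over HMM states."""
    h = x
    for W, b in weights[:-1]:
        h = 1.0 / (1.0 + np.exp(-(h @ W + b)))   # sigmoid hidden layer
    W, b = weights[-1]
    logits = h @ W + b
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)      # per-frame state posteriors

# Random weights stand in for a trained model.
dims = [INPUT_DIM, 512, 512, N_STATES]
weights = [(rng.standard_normal((a, b)) * 0.01, np.zeros(b))
           for a, b in zip(dims[:-1], dims[1:])]

# 100 frames of features, with the 4 channels concatenated per frame.
feats = rng.standard_normal((100, N_CHANNELS * N_FEATS))
post = dnn_posteriors(splice(feats), weights)
print(post.shape)  # (100, 2000): one posterior distribution per frame
```

In a full recogniser these posteriors would be converted to scaled likelihoods (divided by the state priors) and passed to an HMM decoder; feeding the network concatenated channels, as here, is only one of several ways to present multichannel input.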
Steve Renals is Professor of Speech Technology at the University of Edinburgh. His research interests are in speech and language technology, with over 200 publications in the area; his recent work covers neural network acoustic models, cross-lingual speech recognition, and meeting recognition. He leads the EPSRC-funded Natural Speech Technology programme in the UK, is a senior area editor of the IEEE Transactions on Audio, Speech, and Language Processing, and is a member of the ISCA Advisory Council. He was previously co-editor-in-chief of the ACM Transactions on Speech and Language Processing and an associate editor of IEEE Signal Processing Letters.