
IIIT Bangalore's video: Samvaad-Talk by Prof. V. Ramasubramanian and Shreekantha Nadig (December 7, 2020)

Title: Multi-task learning architectures for end-to-end Automatic Speech Recognition

Abstract: Automatic speech recognition (ASR) is the task of transcribing a speech signal using machines. Although the ASR problem has been studied for decades, the adoption of ASR systems has increased significantly in the past decade, fueled by research in deep neural networks and an increase in computational power. Traditionally, the ASR problem was solved using a cascaded system that contained an acoustic model (AM), a pronunciation model (PM), and a language model (LM). The end-to-end ASR architecture subsumes these in a single neural network. These models are a fraction of the size of their conventional counterparts and have been shown to perform comparably, if not better.

In this context, we present a couple of studies on multitask learning with end-to-end ASR architectures. The first is our work on a multi-target ASR system that can predict multiple target sequences (phone-level and grapheme-level) using a single model; we present how such an architecture can help in solving different ASR problems. The second is our study on the nature of neural attention in ASR systems and the drawbacks of how attention has been thought of in these architectures. We identify that the neural-attention model has been a byproduct of the ASR cost function and is never a target of optimisation. In this regard, we present our work on using frame-level alignments learned from a conventional HMM-based architecture, and how they can be used in a multitask learning framework to learn better alignment models in an end-to-end ASR architecture with attention. We show that such models converge faster and perform better than the baseline. We briefly present studies on multi-lingual ASR systems and some open problems in multitask learning that are of interest to end-to-end ASR.

Speaker Biography: Shreekantha A Nadig is currently working as a Speech Recognition Engineer 2 with Dialpad Inc. and is a part-time MS by Research student at IIIT-Bangalore under the supervision of Prof. Sachit Rao and Prof. V. Ramasubramanian. He was an EHRC and MINRO scholar while being a full-time MS student. His MS thesis is on multitask learning for end-to-end ASR systems with attention and on bringing external knowledge into these data-driven architectures. He has also worked on other ASR problems such as multi-lingual systems, keyword spotting, on-the-edge inference with small-footprint models, and semi-supervised learning with end-to-end architectures. He has co-authored three papers, and one of his papers on his MS thesis topic won the “Best Student Paper Award – Honorable Mention” at SPCOM 2020. Before joining IIITB, he was an SVT Engineer with Sonus Networks. He received a B.E. degree in Telecommunication Engineering from VTU. His current research interests are in multitask learning, end-to-end ASR systems for conversational and telephony speech, and ASR on the edge for low-powered devices.

V. Ramasubramanian has been a Professor at IIITB since 2017. Prior to this, he held various positions in academia and industry, notably at TIFR (Research Scholar, Fellow and Reader), Univ. Valencia (Valencia, Spain), ATR Labs (Kyoto, Japan), IISc, Siemens Corporate Research (Senior Member Technical Staff and Head of Speech Research at Bangalore), and PES-U (Professor). He has a Ph.D. from TIFR, Mumbai. His current research interests are in automatic speech recognition, machine learning, deep learning and associative memory formulations.
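
The abstract describes making attention an explicit optimisation target by adding an alignment objective to the usual ASR cost. The sketch below is a minimal illustration of that general idea in plain Python, not the speakers' implementation. It assumes one reference encoder frame per output token (real HMM alignments typically assign a span of frames), uses cross-entropy for both terms, and combines them with a hypothetical interpolation weight `weight`. All function and parameter names are made up for this example.

```python
import math


def cross_entropy(probs, target_index, eps=1e-12):
    """Negative log-probability that `probs` assigns to `target_index`."""
    return -math.log(probs[target_index] + eps)


def asr_loss(token_probs, target_tokens):
    """Mean cross-entropy of the decoder's per-step token distributions."""
    losses = [cross_entropy(p, t) for p, t in zip(token_probs, target_tokens)]
    return sum(losses) / len(losses)


def alignment_loss(attention, aligned_frames):
    """Mean cross-entropy between each decoder step's attention weights
    (a distribution over encoder frames) and a reference frame index.

    Assumption: one reference frame per output token, e.g. the centre of
    the span given by an external forced alignment.
    """
    losses = [cross_entropy(a, f) for a, f in zip(attention, aligned_frames)]
    return sum(losses) / len(losses)


def multitask_loss(token_probs, target_tokens, attention, aligned_frames,
                   weight=0.5):
    """Interpolate the ASR loss with the auxiliary alignment loss.

    `weight` is a hypothetical hyperparameter: 0.0 recovers the plain
    ASR loss, 1.0 trains on the alignment objective alone.
    """
    main = asr_loss(token_probs, target_tokens)
    aux = alignment_loss(attention, aligned_frames)
    return (1.0 - weight) * main + weight * aux


# Toy example: 2 output tokens, 3-symbol vocabulary, 4 encoder frames.
token_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
target_tokens = [0, 1]
aligned_frames = [0, 3]  # reference frame for each output token

# Attention that peaks on the reference frames versus attention that does not.
good_attention = [[0.9, 0.05, 0.05, 0.0], [0.0, 0.05, 0.05, 0.9]]
bad_attention = [[0.05, 0.05, 0.9, 0.0], [0.9, 0.05, 0.05, 0.0]]

good = multitask_loss(token_probs, target_tokens, good_attention, aligned_frames)
bad = multitask_loss(token_probs, target_tokens, bad_attention, aligned_frames)
```

With identical token predictions, the combined loss is lower when the attention weights concentrate on the reference frames, so under these assumptions the auxiliary term gives attention a training signal of its own rather than leaving it as a byproduct of the ASR cost.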

IIIT Bangalore: 6.2K subscribers, 712 total posts, 9.8K total views, 271.7 average views
This video was published on 2021-02-12 14:55:15 GMT by @IIIT-Bangalore on YouTube. IIIT Bangalore has 6.2K subscribers on YouTube and a total of 712 videos. This video has received 1 like, which is lower than the average number of likes that IIIT Bangalore gets. @IIIT-Bangalore receives an average of 271.7 views per video on YouTube. This video has received 0 comments, which is lower than the average number of comments that IIIT Bangalore gets. Overall, the views for this video were lower than the average for the profile. Hashtags: #IIITB #IIITBangalore #IIITBSamvaad #Samvaadtalks #IIITBResearch #IIITTalks
