Empowering Communication and the Role of Speech Recognition in Accessibility

Ms. Tejasree Mankenapalli

doi:https://doi.org/10.51976/jfsa.612305

Published Online: July 14, 2023

Author Details (* denotes corresponding author)

1. * Tejasree Mankenapalli, Student, CSE, KL University, Vijayawada, Andhra Pradesh (New), India (klucse2000030605@gmail.com)

This research paper explores advancements in speech recognition technology. Speech recognition, a pivotal area of artificial intelligence, converts spoken language into text or commands. The paper covers foundational techniques such as Hidden Markov Models (HMMs) and their evolution into modern deep learning approaches. It discusses the challenges posed by variations in accents, languages, and background noise, and showcases the integration of large datasets and sophisticated neural architectures. The study also emphasizes real-time applicability and improved human-machine interaction. Through this investigation, the paper contributes to the understanding of cutting-edge methods in speech recognition and their practical implications.

Keywords

Speech processing, Speech recognition, Communication, Deep learning, CNN
