ADTA 5940: CAPSTONE OCT-09-2022 Venkata Sai Palivela, Manideep Janjala, Vamsi Krishna Reddy Gujjula Capstone_Group1
CONCEPT PAPER
Speech Emotion Recognition
Introduction:
As humans, we express our emotions and feelings through facial expressions, sign language, and speech. This concept paper focuses on speech emotion recognition: the pitch and tone of spoken words can figuratively convey particular emotions. With more and more AI models coming into the picture, their usage and implementation have grown in parallel. Classification and prediction of this kind can pave the way for upcoming models that recognize emotion directly from voice commands. Using AI has become a daily routine: from morning to evening we rely on it for calling, texting, navigation, news, playing songs and updates, reading articles aloud, and much more. With this level of advancement, systems can now recognize the emotions of the user, which helps them better understand users' daily needs, interpret the speaker's tone and emotional state, and interact with them more effectively, for example by playing songs that fit the situation, suggesting videos based on the current mood, or showing relevant places nearby. This area has developed rapidly over the past few years, and we are taking it as the theme of our final capstone project.
Initial Analysis of the Dataset:
We are using an audio dataset from the Toronto Emotional Speech Set (TESS), which is available at
https://tspace.library.utoronto.ca/handle/1807/24487
The data was created in 2010 at the University of Toronto, Department of Psychology; the authors are Kate Dupuis and M. Kathleen Pichora-Fuller, and the collection is published under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license.
The dataset captures seven emotions in its audio files: happy, sad, fear, angry, disgust, pleasant surprise, and neutral. There are 200 target words, each spoken by two actresses (aged 26 and 64) in each of the seven emotions, giving 2,800 stimuli (audio files) in total.
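As a first look at the dataset, the emotion label of each stimulus can be read from its file name. The sketch below assumes a TESS-style naming convention in which the last underscore-separated token before the extension is the emotion (e.g. a file named like OAF_back_angry.wav); the specific file names shown are illustrative, not taken from the dataset itself.

```python
from pathlib import Path

def emotion_from_filename(path: str) -> str:
    """Extract the emotion label from a TESS-style file name.

    Assumes names like 'OAF_back_angry.wav', where the final
    underscore-separated token (minus the extension) is the emotion.
    """
    stem = Path(path).stem          # e.g. 'OAF_back_angry'
    return stem.split("_")[-1]      # e.g. 'angry'

# Illustrative file names following the assumed convention
print(emotion_from_filename("OAF_back_angry.wav"))   # angry
print(emotion_from_filename("YAF_dog_sad.wav"))      # sad
```

Iterating this function over every .wav file in the collection would produce the label column for the classification task.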
Objective:
Our project builds a classification model that analyzes the emotion of speech from the acquired audio files and predicts the emotional state according to the seven emotions discussed above (happy, sad, fear, angry, disgust, pleasant surprise, and neutral).
The objective of this project is to predict the emotional state from the voice signatures in the audio files, and to identify human emotions with respect to the speaker's gender.
Work Plan:
Firstly, we will explore the data and derive a speech signature for each kind of emotion, comparing the signatures across emotions to look for significant differences. We will then extract features from the audio files to serve as training data for our model, and afterwards test and validate the model to check its accuracy. Finally, we will use the model to predict the emotional state on the training and testing data for both genders' voices.
We plan to perform the classification using neural networks, in order to gain better insights from the dataset.
The libraries we plan to use for classification and prediction include 'librosa', 'numpy', 'matplotlib', 'pandas', 'tensorflow', and 'scikit-learn'. librosa supports the audio analysis and in turn helps us visualize the audio files; for feature extraction and the neural networks we will work with the Keras framework; and we will use a Python speech recognition module, which lets the model take voice as an input and helps in the final testing of the model.
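For the classification model itself, a minimal Keras sketch might look like the following. The layer sizes and dropout rate here are illustrative assumptions rather than settled design choices; the input is assumed to be a 40-dimensional feature vector per clip (matching the MFCC averaging described above), and the output is a softmax over the seven emotion classes.

```python
from tensorflow import keras

NUM_CLASSES = 7      # happy, sad, fear, angry, disgust, pleasant surprise, neutral
NUM_FEATURES = 40    # e.g. 40 averaged MFCCs per clip (our assumption)

model = keras.Sequential([
    keras.layers.Input(shape=(NUM_FEATURES,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dropout(0.3),                           # regularization; rate is illustrative
    keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",    # integer emotion labels 0..6
              metrics=["accuracy"])
print(model.output_shape)  # (None, 7)
```

Training would then be a call such as model.fit(X_train, y_train, validation_data=(X_test, y_test)), where X holds the extracted feature vectors and y holds integer emotion labels.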