Date of Award


Document Type


Degree Name

Doctor of Philosophy (PhD)


Department

Computer Engineering and Sciences

First Advisor

Veton Kepuska

Second Advisor

Samuel Kozaitis

Third Advisor

Josko Zec

Fourth Advisor

Maria Pozo De Fernandez


Emotion plays a vital role in humans' daily lives. Understanding emotions and recognizing how to react to others' feelings are fundamental to successful social interaction. Emotion recognition through facial expression and speech plays a significant role in human communication. The subject is gaining importance in academic research as new techniques, such as emotion recognition from speech context, help us understand how emotions relate to the content we utter. Demand for emotion recognition has grown considerably in recent years in applications such as video games, human-computer interaction, cognitive computing, and affective computing. Emotion can be recognized from many sources, including text, speech, hand and body gestures, and facial expressions. Most emotion recognition methods use only one of these sources. Human emotions change almost every second, and a single source may not reflect them correctly. The motivation for this research is my desire to understand and evaluate emotions through multiple channels, such as facial and speech expressions.
The topic of my dissertation is real-time facial expression and speech emotion recognition on a mobile phone using cloud computing. The proposed framework recognizes emotion from both facial expression and speech in real time and is embedded in an application developed for mobile phones, named Emotii. The design of the system has three parts: the facial emotion recognizer, the speech emotion recognizer, and the combined facial expression and speech recognizer, which runs on a smartphone using cloud computing. The combined part takes the results of the facial expression recognizer and the speech emotion recognizer and integrates them with a novel method; the final decision on the emotion is made after the fusion of those features. The application runs in real time on any mobile phone with the Android operating system and displays the recognized emotion. The result is reported as a percentage for each emotion, including neutral, happy, sad, angry, surprised, disgusted, and fearful. The experimental results demonstrate that facial and speech emotion recognition on a mobile phone is successful, achieving up to 97.26% accuracy as measured on two standard corpora: Cohn-Kanade (CK+) and the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS).
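As a rough illustration of what combining the two recognizers can look like, the sketch below applies a simple weighted average to two per-emotion score sets and reports the result as percentages. This is an assumed stand-in for illustration only, not the novel fusion method of the dissertation, which the abstract does not specify. The names `fuse_scores`, `top_emotion`, and `face_weight`, and the example scores, are hypothetical.

```python
# Illustrative sketch only: weighted-average late fusion of two recognizers.
# This is NOT the dissertation's fusion method, which the abstract does not describe.

EMOTIONS = ["neutral", "happy", "sad", "angry", "surprised", "disgusted", "fearful"]


def fuse_scores(face_probs, speech_probs, face_weight=0.5):
    """Combine per-emotion scores from face and speech into percentages.

    face_probs / speech_probs: dicts mapping emotion name -> score in [0, 1].
    face_weight: hypothetical weight given to the facial recognizer (0..1).
    """
    if not 0.0 <= face_weight <= 1.0:
        raise ValueError("face_weight must be between 0 and 1")

    fused = {}
    for emotion in EMOTIONS:
        face = face_probs.get(emotion, 0.0)      # missing emotions count as 0
        speech = speech_probs.get(emotion, 0.0)
        fused[emotion] = face_weight * face + (1.0 - face_weight) * speech

    total = sum(fused.values())
    if total <= 0.0:
        raise ValueError("at least one score must be positive")

    # Normalize so that the percentages sum to 100.
    return {emotion: 100.0 * score / total for emotion, score in fused.items()}


def top_emotion(percentages):
    """Return the emotion with the highest fused percentage."""
    return max(percentages, key=percentages.get)


if __name__ == "__main__":
    # Made-up example scores, not results from the dissertation.
    face = {"happy": 0.7, "neutral": 0.3}
    speech = {"happy": 0.4, "sad": 0.6}
    result = fuse_scores(face, speech)
    for emotion, pct in result.items():
        print(f"{emotion}: {pct:.1f}%")
    print("decision:", top_emotion(result))
```

An equal weighting is used here only for simplicity; a real system might tune the weight per modality or fuse at the feature level instead.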


Copyright held by author