Date of Award

12-2022

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Engineering and Sciences

First Advisor

Veton Z. Këpuska

Second Advisor

Marius C. Silaghi

Third Advisor

Georgios C. Anagnostopoulos

Fourth Advisor

Ivica N. Kostanic

Abstract

Efficient use of communication bandwidth starts at the data source. Features of a speech signal can be extracted and reconstructed to lower the Internet traffic of acoustic artificial agents and to increase the quality of automatic speech recognition systems. This work introduces the Speech Quefrency Transform (SQT) to enrich the communication space between artificial agents and humans. We describe the motivation, methodology, and deep learning approach in detail and apply the SQT to several applications: sharp pitch track extraction, real-time speech communications, and emotion recognition. The results were excellent. The work proves that acceleration is the unit of quefrency and advocates adopting a geometric scale for the cepstrum domain. It also proposes spectral banking, which models the quefrency filters by controlling spectral leakage. This dissertation shows how to generate, combine, and apply the filters.
