The Mathematics of Speech

Lauren Luecke

Professor Marek Rychlik

 

 

            For our research project, we plan to investigate the mathematics of speech generation and recognition.  To do so, we intend to use techniques such as Fourier analysis, cepstral analysis, and Takens embedding.  Fourier analysis represents a signal, in our case a speech sample, as a sum of sine waves of different frequencies, which are easier to analyze than arbitrary waveforms.  Cepstral analysis is more specific to speech: it works with the logarithm of the speech spectrum, in which the source and filter spectra are added together and can therefore be separated.  The source is the excitation that determines the pitch of the sound, and the filter reflects the shape of the vocal tract and the location of the formants, which are the resonant frequency regions in a sound spectrum.  Takens embedding rests on the result that, under suitable conditions, delayed copies of an observed signal can be used to reconstruct a state-space representation that is related by a one-to-one transformation to the actual state-space trajectory of the system.  The embedding of a signal is used to study qualitatively any nonlinearities of the system generating the signal.
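
            As a rough illustration of the three techniques named above, the following Python sketch computes a magnitude spectrum, a real cepstrum, and a delay-coordinate embedding for a short synthetic signal.  It is only a minimal sketch under our own assumptions, not a method taken from the references: the function names dft, real_cepstrum, and delay_embed, the two-sinusoid test signal standing in for a speech frame, and the embedding dimension and lag are all hypothetical choices made for illustration.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(x, floor=1e-12):
    """Real cepstrum: inverse DFT of the log magnitude spectrum.

    `floor` (an assumed safeguard) keeps the logarithm finite where the
    magnitude spectrum is numerically zero.
    """
    n = len(x)
    log_mag = [math.log(max(abs(c), floor)) for c in dft(x)]
    # Inverse DFT; only the real part is kept.
    return [
        sum(log_mag[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)).real / n
        for t in range(n)
    ]


def delay_embed(x, dim, lag):
    """Delay-coordinate vectors (x[i], x[i+lag], ..., x[i+(dim-1)*lag])."""
    count = len(x) - (dim - 1) * lag
    return [tuple(x[i + j * lag] for j in range(dim)) for i in range(count)]


# Synthetic stand-in for a speech frame (assumption): sinusoids at
# frequency bins 5 and 12 of a 64-sample window.
N = 64
signal = [
    math.sin(2 * math.pi * 5 * t / N) + 0.5 * math.sin(2 * math.pi * 12 * t / N)
    for t in range(N)
]

spectrum = [abs(c) for c in dft(signal)]          # Fourier analysis
peak_bin = max(range(N // 2), key=lambda k: spectrum[k])
cepstrum = real_cepstrum(signal)                  # cepstral analysis
points = delay_embed(signal, dim=3, lag=4)        # delay embedding
```

            Here peak_bin picks out the strongest sine component of the signal, cepstrum holds the transformed log spectrum in which source and filter contributions are additive, and points is the list of three-dimensional delay vectors that trace out the reconstructed trajectory.  For real speech, a fast Fourier transform from a numerical library would replace the naive dft, and the dimension and lag would need to be chosen from the data.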

 

 

 

References:

 

Cassidy, Steve. “Fundamentals of Speech Science.”

            <www.ling.mq.edu.au/units/slp801/acoustics/ch05s05.html>

 

Harrington, J. and Cassidy, S. Techniques in Speech Acoustics, Kluwer, 1999.

 

McCabe, David. “Fourier Analysis.”

            <http://sunlightd.virtualave.net/Fourier/Introduction.htm>

 

Smolenski, Brett Y. “Nonlinear State Space Embedding Features and Their

            Application to Robust Speech Segmentation.”

            <www.temple.edu/speech_lab/Smolenski_ICASSP_2004.pdf>