The Mathematics of Speech

Lauren Luecke

Professor Marek Rychlik

For our research project, we plan to investigate the mathematics of speech
generation and recognition. To do this, we intend to use techniques such as
Fourier analysis, cepstral analysis, and Takens embedding. Fourier analysis
represents a signal, in our case a speech sample, as a sum of sine waves of
different frequencies, amplitudes, and phases. Sine waves are easier to work
with than arbitrary waveforms, since each one is fully described by those
three numbers.
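
As a rough illustration of the Fourier idea, the sketch below builds a test
signal from three sine waves and uses NumPy's FFT to recover their
frequencies. The sample rate, the component frequencies, and the helper name
`dominant_frequencies` are assumptions chosen for the example, not part of the
project itself.

```python
import numpy as np


def dominant_frequencies(signal, sample_rate, count):
    """Return the `count` strongest frequencies (in Hz) of a real signal, ascending."""
    spectrum = np.fft.rfft(signal)                           # one-sided spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    strongest = np.argsort(np.abs(spectrum))[-count:]        # largest magnitudes
    return np.sort(freqs[strongest])


# Hypothetical test signal: one second sampled at 8 kHz, made of three sines.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = (
    1.00 * np.sin(2 * np.pi * 200 * t)
    + 0.50 * np.sin(2 * np.pi * 450 * t)
    + 0.25 * np.sin(2 * np.pi * 1200 * t)
)

peaks = dominant_frequencies(signal, sample_rate, count=3)
print(peaks)  # the three component frequencies: 200, 450, and 1200 Hz
```

A real speech sample would not fall exactly on the FFT frequency bins the way
this test signal does, so in practice the frame is usually windowed first.
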
Cepstral analysis builds on this by working with the logarithm of the speech
spectrum. Speech is modeled as a source passed through a filter, so the source
and filter spectra multiply; taking the logarithm turns that product into a
sum, which makes the two parts easier to separate. The source is the
excitation from the vocal folds, which determines the pitch of the sound. The
filter is the vocal tract, whose shape determines the location of the
formants, the frequency regions where the sound spectrum has peaks. Takens
embedding rests on the result that a state-space representation of a system
can be reconstructed from time-delayed copies of a single observed signal, and
that under suitable conditions the reconstructed trajectory is related to the
actual state-space trajectory by a one-to-one transformation. The embedding of
a signal is used to study qualitatively any nonlinearities of the system that
generates it.
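
The two techniques above can be sketched in a few lines of NumPy. This is a
minimal sketch under stated assumptions, not the method the project will
necessarily use. The function names `real_cepstrum` and `delay_embed`, the
synthetic frame, the 8 kHz sample rate, the pitch search range, and the
embedding dimension and delay are all hypothetical choices made for
illustration.

```python
import numpy as np


def real_cepstrum(frame):
    """Inverse FFT of the log-magnitude spectrum of a frame."""
    spectrum = np.fft.fft(frame)
    # Small constant avoids log(0) at bins where the spectrum vanishes.
    log_magnitude = np.log(np.abs(spectrum) + 1e-12)
    return np.fft.ifft(log_magnitude).real


def delay_embed(signal, dimension, delay):
    """Rows are delay vectors [x(t), x(t + delay), ..., x(t + (dimension-1)*delay)]."""
    signal = np.asarray(signal, dtype=float)
    n_points = len(signal) - (dimension - 1) * delay
    if n_points <= 0:
        raise ValueError("signal too short for this dimension and delay")
    return np.column_stack(
        [signal[i * delay : i * delay + n_points] for i in range(dimension)]
    )


# Synthetic "voiced" frame (assumed values): a damped 700 Hz resonance that
# repeats every 80 samples, i.e. a 100 Hz pitch at an 8 kHz sample rate.
sample_rate = 8000
period = 80
n = np.arange(period)
pulse = 0.9 ** n * np.cos(2 * np.pi * 700 * n / sample_rate)  # filter response
frame = np.tile(pulse, 10)                                    # periodic source

# The source shows up as a cepstral peak at the pitch period. Search only
# quefrencies of 20..120 samples (roughly 66 to 400 Hz), an assumed range.
cepstrum = real_cepstrum(frame)
low, high = 20, 120
pitch_period = low + int(np.argmax(cepstrum[low : high + 1]))
pitch_hz = sample_rate / pitch_period
print(pitch_period, pitch_hz)

# Delay embedding of the same frame into three dimensions.
trajectory = delay_embed(frame, dimension=3, delay=5)
print(trajectory.shape)
```

The test frame is built to be exactly periodic so that the cepstral peak is
clean. Recorded speech would normally be windowed, and the embedding dimension
and delay would have to be chosen from the data rather than fixed in advance.
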

References:

Cassidy, Steve. “Fundamentals of Speech Science.”
<www.ling.mq.edu.au/units/slp801/acoustics/ch05s05.html>

Harrington, J. and Cassidy, S. *Techniques in Speech Acoustics*, Kluwer, 1999.

McCabe, David. “Fourier Analysis.”
<http://sunlightd.virtualave.net/Fourier/Introduction.htm>

Smolenski, Brett Y. “Nonlinear State Space Embedding Features and Their
Application to Robust Speech Segmentation.”
<www.temple.edu/speech_lab/Smolenski_ICASSP_2004.pdf>