Fractal Speech Processing
by Marwan Al-Akaidi
Cambridge University Press, Cambridge,
UK, 2004
214 pp., illus. 63 b/w, Trade, $110.00
ISBN: 0-521-81458-8.
Reviewed by Stefaan Van Ryssen
Hogeschool Gent
Jan Delvinlaan 115, 9000 Gent, Belgium
stefaan.vanryssen@pandora.be
The field of digital speech processing
has several branches: text-to-speech
or speech synthesis, speech recognition,
and speaker identification are
the main ones. Each discipline has seen
spectacular advances in the past decades,
from the earliest synthetic voices of
the 1960s to very advanced real-time dictation
programs. The industry has been booming
with companies rising and disappearing
at a rate similar to, if not as dramatic as,
that of the internet bubble. Fiascos like the
fraudulent collapse of the Belgian company
Lernout and Hauspie have made newspaper headlines,
all but hiding many success stories
from the public eye. Now that media attention
has somewhat abated, it is easy to
forget that the underlying mathematical
theory of speech processing has advanced
at a steady pace.
Marwan Al-Akaidi, professor at De Montfort
University, UK, senior member of the Institute
of Electrical and Electronics Engineers
and Chairman of the IEEE UKRI Signal Processing
Society, is certainly the right person
to introduce the newly developing technique
of fractal analysis in digital speech
processing. Although fractal techniques
have been widely used in image processing,
the application of fractals in speech
processing is relatively new. This book
represents the fruit of research carried
out at De Montfort University.
The first half of the book (chapters one to three)
covers traditional techniques like the
fast Fourier transform, digital filtering,
and estimation algorithms. Written for
engineers and academics, the book moves at a
quick pace, with a focus on computational methods
rather than applications and practical
results. These chapters can easily be
skipped by readers who are well acquainted
with the field, since they only summarize
established knowledge. Chapters four and
five give a quick overview of the history
of fractals and the fundamentals of fractal
analysis, connecting the concepts of waveform,
Fourier transform, and fractal dimension.
This is where the book really starts.
In chapter six, 'Speech processing with
fractals', the basic techniques for the
use of fractals in speech processing (here
mainly recognition) are covered,
while chapter seven is about speech synthesis.
It is unclear from this book what the
advantages of fractal techniques for
speech synthesis actually are.
The field appears somewhat stalled, with
Al-Akaidi discussing the use of syllables,
demisyllables (initial and final parts
of syllables), phones and diphones (basically
transitional elements connecting two more
or less 'stable' sounds like the middle
of a vowel), but without pointing to any progress.
Finally, in chapter eight, Al-Akaidi discusses
some possible applications of fractal
signal processing in cryptology and chaos
theory.
All in all, this is a valuable reference
book for engineers and academics
who intend to contribute to
applied research. The range of examples
and applications Al-Akaidi points to is
impressive and may be a source of inspiration
for future developments in this adolescent
technology.