Natural Sounding Standard Malay Speech Synthesis Based on UTMK EBMT Architecture System
Date
2011-01
Authors
Tiun, Sabrina
Publisher
Universiti Sains Malaysia
Abstract
The main goal of this research is natural-sounding speech synthesis.
This goal was chosen because it matches the type of speech synthesis application
that the industrial market demands: limited-domain speech synthesis systems.
A limited-domain speech synthesis system has a restricted vocabulary (and is
therefore less flexible) but requires highly natural-sounding synthetic
speech. Based on the evolution of speech synthesis techniques, one can
conclude that using natural speech units without applying any signal processing is
the technique that produces the most natural-sounding synthetic speech. As such, we
opt for a synthesis technique that avoids (or reduces) concatenation points and
prosodic manipulation. The technique is implemented by using larger synthesis
units and by providing as many instances as possible of each type of
speech unit. However, the big question is how to choose the right instances of speech
units for the target sentence. In this thesis, we address the speech unit selection
problem with a corpus-based speech synthesis approach, in which we use the
example-based parser of the Unit Terjemahan Melalui Komputer (UTMK) Example-based
Machine Translation (EBMT) system and a speech corpus represented by a syntax-prosody
tree structure. This thesis makes three significant contributions:
the creation of a syntax-prosody speech corpus, the adaptation of the UTMK EBMT
system architecture to create a speech synthesiser, and a proposal for subword unit
concatenation with non-audible distortion. These contributions serve one goal:
to build a natural-sounding Malay speech synthesiser model with a small degree of
flexibility. We assess the performance of our speech synthesis approach
by conducting a modified mean opinion score (MOS) test, a prosodic-acoustic analysis
and a smoothness test. Statistical analysis using ANOVA and t-tests shows that our
synthesis output was perceived as significantly more natural than the output of other
Standard Malay speech synthesiser systems, but less natural than natural speech.
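
The idea of preferring larger synthesis units so that fewer concatenation points are needed can be illustrated with a minimal sketch. This is not the UTMK EBMT example-based parser or the thesis's selection method; it is a simple greedy longest-match over a hypothetical toy inventory of recorded word chunks, and the function name `select_units` and the sample data are assumptions made for illustration only.

```python
def select_units(words, inventory):
    """Greedily cover `words` with the longest recorded chunk at each position."""
    units = []
    i = 0
    while i < len(words):
        # Try the longest span starting at position i first, then shrink it.
        for j in range(len(words), i, -1):
            chunk = tuple(words[i:j])
            if chunk in inventory:
                units.append(chunk)
                i = j
                break
        else:
            # No recorded chunk starts with this word.
            raise ValueError(f"no recorded unit covers {words[i]!r}")
    return units


# Hypothetical inventory of recorded chunks (illustrative data only).
inventory = {
    ("selamat", "datang"),
    ("selamat",),
    ("datang",),
    ("ke",),
    ("ke", "kuala", "lumpur"),
    ("kuala", "lumpur"),
}

units = select_units("selamat datang ke kuala lumpur".split(), inventory)
joins = len(units) - 1  # number of concatenation points in the output
```

In this toy case the sentence is covered by two chunks with a single join, where word-by-word selection would need four. Greedy matching does not guarantee the fewest joins in general, and a real system would also have to choose among multiple recorded instances of the same chunk.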
Keywords
Natural sounding Standard Malay speech synthesis, based on UTMK EBMT architecture system