Natural Sounding Standard Malay Speech Synthesis Based on UTMK EBMT Architecture System

Date
2011-01
Authors
Tiun, Sabrina
Publisher
Universiti Sains Malaysia
Abstract
In this research, natural-sounding speech synthesis is the main goal. This goal was chosen in response to the type of speech synthesis application most in demand in the industrial market: limited-domain speech synthesis systems. A limited-domain speech synthesis system has a restricted vocabulary (and is therefore less flexible) but requires highly natural-sounding synthetic speech. From the evolution of speech synthesis techniques, one can conclude that using natural speech units without applying any signal processing produces the most natural-sounding synthetic speech. We therefore opt for a synthesis technique that avoids (or reduces) concatenation points and prosodic manipulation. The technique is implemented by using larger synthesis units and by providing as many instances as possible of each type of speech unit. The key question, however, is how to choose the right instances of speech units for the target sentence. In this thesis, we address the speech unit selection problem with a corpus-based speech synthesis approach, in which we use the example-based parser of the Unit Terjemahan Melalui Komputer (UTMK) Example-based Machine Translation (EBMT) system and a speech corpus represented by a syntax-prosody tree structure. Three significant pieces of research are conducted in this thesis: creating the syntax-prosody speech corpus, adapting the UTMK EBMT system architecture to create a speech synthesiser, and proposing subword unit concatenation with non-audible distortion. These contributions serve one goal: to build a natural-sounding Malay speech synthesiser model with a small degree of flexibility. We assess the performance of our speech synthesis approach with a modified MOS test, a prosodic-acoustic analysis and a smoothness test. Based on statistical analysis using ANOVA and t-tests, our synthesis output was perceived as significantly more natural than the output of other Standard Malay speech synthesiser systems, but less natural than natural speech.
Keywords
Natural sounding Standard Malay speech synthesis, based on UTMK EBMT architecture system