Indexing Of Bilingual Knowledge Bank Based On The Synchronous SSTC Structure

Date
2006-07
Authors
Ye, Hong Hoe
Abstract
The basic idea of Example-Based Machine Translation (EBMT) is to translate a sentence by using similar translation examples. To extend the memory of an EBMT system, one simply needs to add new translation examples to a database. However, as the example database grows, it becomes more difficult to retrieve suitable examples as references for a translation. In a previous work, translation examples were annotated with a flexible structure called the synchronous Structured String-Tree Correspondence (SSTC) and stored in a database called the Bilingual Knowledge Bank. However, its word-based indexing was not an efficient way of retrieving translation examples. Extending the previous work, we exploited the correspondences in the synchronous SSTC structure (which include mappings between the source and target parts of translation examples) to improve the retrieval of translation examples. In addition, we generalized translation examples to increase coverage of input text without enlarging the example database. Based on the correspondences and the generalization, two criteria, viz. word and structure, were used to index the translation examples. Indexing by words gave good coverage of input text, while indexing by structures may be used to produce well-formed translations. We classified structural indexes according to the different types and structures of examples. Structural indexes include a phrasal index and generalized indexes (which in turn include template indexes and a rule index). Besides indexing, we added some linguistic information (i.e. a bilingual lexicon and root forms) to our English-Malay EBMT system in order to further increase coverage of input text. In our translation process, given an input sentence, we first carried out lexical matching using a word index and a phrasal index. Then, we performed structural matching using the generalized indexes to find translation examples that are structurally close to the input sentence.
The effectiveness of our approach was evaluated against the previous work. Our system outperformed the previous work in terms of the well-formedness and accuracy of translation outputs.
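The lexical matching step described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the example data, the dictionary layout, the function names `build_indexes` and `lexical_match`, and the greedy longest-match strategy are all assumptions made for illustration, and the toy phrase pairs only stand in for the source-target correspondences recorded in a synchronous SSTC. Structural matching with the generalized indexes is not covered.

```python
from collections import defaultdict

# Hypothetical toy examples: (English phrase, Malay phrase) pairs standing in
# for the source-target correspondences of annotated translation examples.
EXAMPLES = [
    {"id": 1, "pairs": [("i", "saya"), ("eat", "makan"), ("fried rice", "nasi goreng")]},
    {"id": 2, "pairs": [("i", "saya"), ("eat", "makan"), ("rice", "nasi")]},
]


def build_indexes(examples):
    """Build a word index and a phrasal index from the toy examples."""
    word_index = defaultdict(set)     # source word -> ids of examples containing it
    phrasal_index = defaultdict(set)  # source phrase (tuple of words) -> target phrases
    for example in examples:
        for source, target in example["pairs"]:
            tokens = tuple(source.split())
            phrasal_index[tokens].add(target)
            for token in tokens:
                word_index[token].add(example["id"])
    return word_index, phrasal_index


def lexical_match(sentence, phrasal_index):
    """Greedy longest-match lookup of input spans in the phrasal index (assumed strategy)."""
    tokens = sentence.lower().split()
    max_len = max((len(key) for key in phrasal_index), default=1)
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in phrasal_index:
                matches.append((" ".join(span), sorted(phrasal_index[span])))
                i += n
                break
        else:
            # No indexed phrase starts here; record the word as uncovered.
            matches.append((tokens[i], []))
            i += 1
    return matches


word_index, phrasal_index = build_indexes(EXAMPLES)
print(lexical_match("I eat fried rice", phrasal_index))
```

In this sketch the word index only records which examples contain a given source word, which is the kind of lookup that gives broad coverage, while the phrasal index returns a target phrase for a multi-word span. Words with no indexed phrase are returned with an empty candidate list, which is where a bilingual lexicon or root-form lookup could be consulted.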
Keywords
Computer Sciences, The Synchronous SSTC Structure