Natural language generation in machine translation (MT) based on the string-tree correspondence grammar (STCG)
Date
1995-10
Authors
Ai Looi, Heng
Abstract
A Machine Translation (MT) system applies computers to the
translation of texts from one natural language into another. The String-Tree
Correspondence Grammar (STCG) is a grammar formalism used as a specification
language for writing linguistic programs in MT systems. The formalism derives its
properties from its predecessor, the Static Grammar (SG). It is a declarative bidirectional
grammar formalism made up of a set of grammar rules for defining a set of strings (a
language), a set of trees and the mapping (correspondence) between the two. A Structured
String-Tree Correspondence (SSTC) is a structure specified by an STCG rule, which is
used formally to record the explicit mappings between substrings in the string and
subtrees in the tree. Generation based on STCG is the process of producing all possible
resultant explicit SSTCs from an STCG grammar, beginning with the axiom rule(s). A
language (a set of strings) can then be obtained from the explicit SSTCs.
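
As a rough illustration of what such a structure records, the sketch below pairs a tokenized string with a tree and stores, for each tree node, the interval of the substring that its subtree corresponds to. The class names, the half-open interval encoding and the example sentence are assumptions made for illustration; they are not taken from the thesis.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """A tree node annotated with the substring its subtree corresponds to."""
    label: str
    span: Tuple[int, int]  # assumed encoding: half-open token interval [start, end)
    children: List["Node"] = field(default_factory=list)


@dataclass
class SSTC:
    """Hypothetical container: a string, a tree, and the mapping between them."""
    tokens: List[str]
    root: Node

    def substring(self, node: Node) -> str:
        """Return the substring recorded for the subtree rooted at `node`."""
        start, end = node.span
        return " ".join(self.tokens[start:end])

    def correspondences(self) -> List[Tuple[str, str]]:
        """List (subtree label, substring) pairs in pre-order."""
        pairs = []
        stack = [self.root]
        while stack:
            node = stack.pop()
            pairs.append((node.label, self.substring(node)))
            # Push children reversed so the leftmost child is visited first.
            stack.extend(reversed(node.children))
        return pairs


# Illustrative example only; the sentence and labels are made up.
example = SSTC(
    tokens=["he", "picks", "up", "the", "ball"],
    root=Node("S", (0, 5), [
        Node("NP", (0, 1)),
        Node("V", (1, 3)),
        Node("NP", (3, 5)),
    ]),
)

for label, text in example.correspondences():
    print(f"{label}: {text}")
```

Storing one contiguous interval per node is a simplification chosen to keep the sketch short; it is only meant to show the string, the tree and the mapping held side by side.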
In this thesis, we aim at building a generator in MT based on the STCG, which is
the more basic generation process for the STCG formalism as opposed to the synthesis
process in MT. We set the research in this direction so that the process may be better
understood not only in formal but also in algorithmic terms. We hope that the STCG
generator will serve as a good foundation for the synthesis process in MT.
A formal means of designing the generator is by using inference rules. It is based
on the approach by [Tang 94] for analysis in MT based on the STCG, which in turn was
inspired by [Shieber 89]. In the proposed design, the generator relies entirely on the set
of inference rules. We also propose a preprocessing phase for the generator for handling
cyclic grammars. In this case, the choices for applying the termination parts in
non-terminating cases caused by the presence of cyclic/recursive rules are put in formal terms.
The generator and the preprocessing for the generator are proposed to be implemented
using the C language in MPW (Macintosh Programmer's Workshop).
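
The abstract does not describe the preprocessing phase in detail. One plausible first step, sketched here in Python rather than the C proposed in the thesis, is to find which rules can reach themselves through rule references, since those are the rules that can make generation fail to terminate. The function name, the dictionary representation of a grammar and the toy grammar are all hypothetical.

```python
from typing import Dict, List, Set


def recursive_rules(refs: Dict[str, List[str]]) -> Set[str]:
    """Return the rules that can reach themselves through rule references."""

    def reachable(start: str) -> Set[str]:
        # Depth-first search over the "refers to" relation, starting from
        # the rules that `start` references directly.
        seen: Set[str] = set()
        stack = list(refs.get(start, []))
        while stack:
            rule = stack.pop()
            if rule not in seen:
                seen.add(rule)
                stack.extend(refs.get(rule, []))
        return seen

    return {rule for rule in refs if rule in reachable(rule)}


# Hypothetical toy grammar: each rule name maps to the rules it refers to.
toy_grammar = {
    "S": ["NP", "VP"],
    "NP": ["DET", "N", "PP"],
    "PP": ["P", "NP"],
    "VP": ["V", "NP"],
    "DET": [], "N": [], "P": [], "V": [],
}

cyclic = recursive_rules(toy_grammar)
print(sorted(cyclic))
```

In this toy grammar only NP and PP refer back to themselves (through each other), so a generator would need a termination choice for those two rules and could expand the rest exhaustively.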
Keywords
Language generation, Translation (mt) based, Correspondence grammar (STCG)