On-Line Computer Recognition Of Hand-Written Arabic Text
Date
1997-07
Authors
Al-Fakhri, Faris Hassan Fakhriddin
Abstract
A novel algorithm for the on-line recognition of hand-written Arabic text is presented.
The text is written on a graphics tablet and the computer recognizes the characters on-line.
The algorithm is based on a structural technique, and operates by segmenting the
words into a set of basic strokes. These strokes represent whole character shapes in most
cases, but some characters are constructed from a number of these strokes. There are 13
such basic strokes, which are divided into three groups according to where they occur, at
the beginning, middle or end of the sub-word. These basic strokes are derived from
basic pen movements that are characteristic of Arabic writing.
The choice of these basic strokes has simplified the segmentation process appreciably.
Compacted codes derived from six optimized directions were used to locate the basic
strokes. The latter are next verified by checking a number of features estimated for these
strokes. The algorithm was tested on data consisting of 125 handwritten words collected
from 42 users (24,486 characters in total). The correct segmentation rate achieved
was 94.5%, a figure that also covers the identification of the strokes.
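The idea of compacted direction codes can be illustrated with a minimal sketch. The abstract does not specify the six optimized directions or how the codes are compacted, so the code below assumes six equal 60-degree sectors (sector 0 centred on the positive x-axis) and treats compaction as collapsing consecutive repeated codes. The function names `direction_code` and `compact_codes` are hypothetical, and a real tablet may report y increasing downwards.

```python
import math

def direction_code(dx, dy, n_dirs=6):
    """Quantize a pen displacement into one of n_dirs equal angular sectors.

    Assumption: uniform sectors, with sector 0 centred on the positive x-axis.
    The thesis's "optimized" directions are not given in the abstract.
    """
    angle = math.atan2(dy, dx) % (2 * math.pi)
    sector = 2 * math.pi / n_dirs
    # Shift by half a sector so that sector 0 straddles the x-axis.
    return int((angle + sector / 2) // sector) % n_dirs

def compact_codes(points, n_dirs=6):
    """Return the direction codes of a pen trace with consecutive repeats collapsed."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (x0, y0) == (x1, y1):
            continue  # skip stationary samples, which have no direction
        code = direction_code(x1 - x0, y1 - y0, n_dirs)
        if not codes or codes[-1] != code:
            codes.append(code)
    return codes

# A trace that moves right, then diagonally, then left.
trace = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 2), (4, 2), (3, 2)]
print(compact_codes(trace))
```

A short code sequence of this kind could then be matched against stored patterns for candidate strokes before the feature checks are applied.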
The recognition process involves the association of the basic strokes with the
complementary characters. The overall recognition rate of the initial system was
70.64%. The main reason for the low recognition rate was found to be user-related
errors. The main type of these errors, which seriously affected the recognition
performance, is complementary character misplacement. The complementary characters
are secondary characters placed above or below the basic shape of characters to
differentiate between those characters that share the same basic shapes. A
complementary character may be a dot, a vertical line, a horizontal line or a zigzag.
There are stipulated rules in the language for placing these characters, but often they are
misplaced by the writer, and may occupy places that extend to the territories of
neighboring characters. This misplacement problem occurred in about 19.29% of the
test data. However, when these and other user-related errors (such as wrongly drawn
strokes) were discounted from the data, the recognition rate was 90.69%.
A look-ahead backtracking algorithm was developed to correct the misplacement problem.
This algorithm corrected about 61% of the misplaced characters (or about 12% of the
total characters in the test data).
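The abstract does not describe how the correction algorithm works internally. The following is only a generic backtracking sketch of the underlying idea, not the author's method: each complementary character is first tried on the horizontally nearest stroke, and if the remaining marks cannot then be placed so that every stroke ends up with an allowed combination, the choice is undone and the next nearest stroke is tried. The shape names (`bowl`, `tooth`), mark names, the `valid` table and the function `assign_marks` are all hypothetical.

```python
def assign_marks(strokes, marks, valid):
    """Assign complementary marks to strokes by backtracking search.

    strokes: list of (shape, x_min, x_max)
    marks:   list of (kind, x)
    valid:   dict mapping shape -> set of allowed mark tuples (sorted)
    Returns a list of mark lists, one per stroke, or None if no valid
    assignment exists. All names here are illustrative assumptions.
    """
    def distance(mark, stroke):
        _, x = mark
        _, lo, hi = stroke
        return 0 if lo <= x <= hi else min(abs(x - lo), abs(x - hi))

    assigned = [[] for _ in strokes]

    def consistent():
        # Every stroke must carry a mark combination allowed for its shape.
        return all(tuple(sorted(a)) in valid[s[0]]
                   for s, a in zip(strokes, assigned))

    def place(i):
        if i == len(marks):
            return consistent()
        # Try the nearest stroke first, then progressively farther ones.
        order = sorted(range(len(strokes)),
                       key=lambda j: distance(marks[i], strokes[j]))
        for j in order:
            assigned[j].append(marks[i][0])
            if place(i + 1):          # look ahead at the remaining marks
                return True
            assigned[j].pop()         # backtrack and try the next stroke
        return False

    return assigned if place(0) else None

# Hypothetical example: a dot drawn under the "bowl" really belongs to the
# neighbouring "tooth", because a bowl may not carry a dot below.
valid = {"bowl": {(), ("dot_above",)},
         "tooth": {("dot_above",), ("dot_below",)}}
strokes = [("bowl", 0, 10), ("tooth", 10, 20)]
marks = [("dot_below", 9)]
print(assign_marks(strokes, marks, valid))
```

The search is exhaustive in the worst case, which is acceptable for the handful of strokes in a sub-word. A practical recognizer would presumably restrict candidates to immediate neighbours.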
By including the correction algorithm in the recognizer, the recognition rate increased
from 70.64% to 82.55%. The remaining errors included those misplaced complementary
characters that can only be corrected by reading as well as context, and genuine user
errors in drawing the strokes. These user-related errors were marked and the algorithm
re-run. When the marked characters were ignored by the algorithm, the rate was
91.79%.
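The reported figures can be cross-checked with a few lines of arithmetic. The sketch below assumes that each corrected misplacement turns exactly one recognition error into a correct result, which the abstract does not state explicitly. The variable names are illustrative.

```python
# Figures quoted in the abstract (percentages of the test data).
initial_rate = 70.64        # recognition rate before correction
misplaced = 19.29           # share affected by misplaced complementary characters
corrected_fraction = 0.61   # "about 61%" of the misplaced characters corrected
final_rate = 82.55          # recognition rate with the correction algorithm

# Share of all characters recovered by the correction step.
recovered = corrected_fraction * misplaced

# Rate predicted if every corrected misplacement becomes a correct recognition.
predicted_rate = initial_rate + recovered

# Correction fraction implied by the two reported recognition rates.
implied_fraction = (final_rate - initial_rate) / misplaced

print(round(recovered, 2), round(predicted_rate, 2), round(implied_fraction, 3))
```

Under that assumption the recovered share comes to roughly 11.8% of the data, in line with the quoted "about 12%". The implied correction fraction is slightly above 61%, consistent with the rounding in the abstract.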