On-Line Computer Recognition Of Hand-Written Arabic Text

Date

1997-07

Authors

Al-Fakhri, Faris Hassan Fakhriddin

Abstract

A novel algorithm for the on-line recognition of hand-written Arabic text is presented. The text is written on a graphics tablet and the computer recognizes the characters on-line. The algorithm is based on a structural technique and operates by segmenting the words into a set of basic strokes. In most cases a stroke represents a whole character shape, but some characters are constructed from several strokes. There are 13 such basic strokes, divided into three groups according to where they occur: at the beginning, middle, or end of the sub-word. These basic strokes are derived from basic pen movements that are characteristic of Arabic writing. The choice of these basic strokes simplified the segmentation process appreciably. Compacted codes derived from six optimized directions were used to locate the basic strokes, which were then verified by checking a number of features estimated for them. The algorithm was tested on data consisting of 125 handwritten words collected from 42 users (thus totaling 24486 characters). The correct segmentation rate achieved was 94.5%, which also includes the identification of the strokes. The recognition process involves associating the basic strokes with the complementary characters. The overall recognition rate of the initial system was 70.64%. The main reason for this low rate was found to be user-related errors. The main type of these errors, which seriously affected recognition performance, is the misplacement of complementary characters. Complementary characters are secondary characters placed above or below the basic shape of a character to differentiate between characters that share the same basic shape. A complementary character may be a dot, a vertical line, a horizontal line, or a zigzag.
The language has stipulated rules for placing these characters, but writers often misplace them, so that they may extend into the territory of neighboring characters. This misplacement problem occurred in about 19.29% of the test data. However, when these and other user-related errors (such as wrongly drawn strokes) were discounted from the data, the recognition rate was 90.69%. A look-ahead backtracking algorithm was developed to correct the misplacement problem. This algorithm corrected about 61% of the misplaced characters (or about 12% of the total characters in the test data). Including the correction algorithm in the recognizer increased the recognition rate from 70.64% to 82.55%. The remaining errors included misplaced complementary characters that can only be corrected through reading and context, and genuine user errors in drawing the strokes. These user-related errors were marked and the algorithm was re-run. When the marked characters were ignored by the algorithm, the rate was 91.79%.
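
As an illustration of what "compacted codes derived from six directions" could look like, the following is a minimal sketch and not the thesis's actual method. The abstract does not say which six "optimized" directions were chosen or how the codes were compacted, so this sketch assumes six equal 60-degree sectors (sector 0 centred on the positive x-axis) and run-length compaction. The function names `direction_code` and `compact_codes` are hypothetical.

```python
import math
from itertools import groupby


def direction_code(dx, dy, n_directions=6):
    """Quantize a pen movement (dx, dy) into one of n equal angular sectors.

    ASSUMPTION: uniform sectors with sector 0 centred on the positive
    x-axis. The thesis's "optimized" directions are not given in the abstract.
    """
    angle = math.atan2(dy, dx) % (2 * math.pi)  # map angle into [0, 2*pi)
    sector = 2 * math.pi / n_directions
    # Shift by half a sector so that each sector is centred on its direction.
    return int((angle + sector / 2) // sector) % n_directions


def compact_codes(points, n_directions=6):
    """Turn a list of (x, y) pen samples into run-length-compacted codes.

    Returns a list of (direction_code, run_length) pairs. Repeated samples
    with no movement are skipped because they carry no direction.
    """
    codes = [
        direction_code(x1 - x0, y1 - y0, n_directions)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
        if (x0, y0) != (x1, y1)
    ]
    return [(code, len(list(run))) for code, run in groupby(codes)]
```

For example, `compact_codes([(0, 0), (1, 0), (2, 0), (3, 2), (4, 4), (3, 4)])` yields `[(0, 2), (1, 2), (3, 1)]`: two steps to the right, two steps along the roughly 60-degree direction, and one step to the left. A structural recognizer could then match such compacted sequences against stroke templates, although the matching and feature-verification steps described in the abstract are not sketched here.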

Keywords

A novel algorithm for the on-line recognition of hand-written Arabic text is presented

URI

http://hdl.handle.net/123456789/2774

Collections

Pusat Pengajian Kejuruteraan Elektrik dan Elektronik - Tesis
