Parallel suffix array construction algorithms for pc-cluster
Loading...
Date
2004
Authors
Jun Lee, Kok
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
Suffix array (lvlanber & Myers. 1993) is the most compact and simple jidl rexr indices.
It is widely used for fast and detail queries over large databases. The only draw back of
sutlix array is the time consumed during its construction phase. Due to this problem,
many specific algorithms have been designed to overcome the computation bottleneck.
Another option to solve this problem effectively is to utilize parallel processing.
The intent of this thesis is to design parallel suffix array construction algorithms
for low cost network PC-cluster. Two parallel algorithms are proposed and the
performance investigated.
Firstly, the effort focus on proposing an algorithm that always performs weii for
practical input string that is independent of bandwidth issues. A parallel suffix sorting
algorithm called Steady algorithm that involved no communication is proposed. In the
implementation of steady algorithm, some unrelated communication is involved. This
unrelated communication is created by message passing library for tasks spawning and
user I/0 transfer which are not the part of the algorithm. To the best of our knowledge,
this algorithm is the first suffix array constntction algorithm that involves no
communication. The experimental result we obtained shows the Steady algorithm
performs well under low bandwidth PC-cluster.
Our second attempt in this thesis is to improve the sorting time in the Steady
algorithm. Another parallel suffix sorting algorithm called Parallel Two-stage
algorithm is proposed. We adopt a specific suffix sorting technique of Itoh & Tanaka
(1999) Two-Stage algorithm. We experimentally show that this algorithm improved the
timing of Steady algorithm under a limited number of processors. Besides, we show that
Parallel Two-Stage algorithm obtained a satisfactory speed up using a limited number
of processors, which is not achieved by the specific algorithm of Kitajima & Navarro
(1999).
Description
Keywords
Array construction , Algorithms