BM-KMP hybrid algorithm for exact and subsequence string matching
Loading...
Date
2009
Authors
W. Mahmood, Ammar
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
String matching algorithm is a basic operation in computer science. Some of the
example applications of this operation are text searching, sequence alignment in
computational biology and internet search engine. The size of the databases of these
applications are growing very rapidly, having different types of data, therefore more
efficient string matching algorithm is required to solve this problem. This study focuses
in hybridizing two well-known exact string matching algorithms which are Boyer-
Moore and Knuth-Morris-Pratt string matching algorithms. The hybrid algorithm
employs main ideas of the two phases of Boyer-Moore and Knuth-Morris~Pratt
algorithms. The hybrid algorithm employs good prefix idea from Knuth-Morris-Pratt
algorithm and bad character shifting from Boyer-Moore algorithm. Then the proposed
hybrid algorithm was adapted for subsequence matching with some modifications in
preprocessing phase and search phase. Both the hybrid and enhanced algorithms were
tested using four types of alphabet which are binary alphabets, DNA alphabet, protein
alphabet and English alphabet. The results of these algorithms show better results
compared to Boyer-Moore and Knuth-Morris-Pratt algorithms in terms of time and the
number of characters compared in different sizes of alphabet. The two algorithms also
showed better results than Knuth-Morris-Pratt algorithm in all types of alphabet, and
better result than Boyer-Moore algorithm ·_in binary, DNA' ·and protein alphabet.
However, the algorithms perfonned worse than Boyer-Moore algorithm in English
alphabet.
Description
Keywords
Algorithm , String matching