BM-KMP hybrid algorithm for exact and subsequence string matching

Loading...
Thumbnail Image
Date
2009
Authors
W. Mahmood, Ammar
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
String matching algorithm is a basic operation in computer science. Some of the example applications of this operation are text searching, sequence alignment in computational biology and internet search engine. The size of the databases of these applications are growing very rapidly, having different types of data, therefore more efficient string matching algorithm is required to solve this problem. This study focuses in hybridizing two well-known exact string matching algorithms which are Boyer- Moore and Knuth-Morris-Pratt string matching algorithms. The hybrid algorithm employs main ideas of the two phases of Boyer-Moore and Knuth-Morris~Pratt algorithms. The hybrid algorithm employs good prefix idea from Knuth-Morris-Pratt algorithm and bad character shifting from Boyer-Moore algorithm. Then the proposed hybrid algorithm was adapted for subsequence matching with some modifications in preprocessing phase and search phase. Both the hybrid and enhanced algorithms were tested using four types of alphabet which are binary alphabets, DNA alphabet, protein alphabet and English alphabet. The results of these algorithms show better results compared to Boyer-Moore and Knuth-Morris-Pratt algorithms in terms of time and the number of characters compared in different sizes of alphabet. The two algorithms also showed better results than Knuth-Morris-Pratt algorithm in all types of alphabet, and better result than Boyer-Moore algorithm ·_in binary, DNA' ·and protein alphabet. However, the algorithms perfonned worse than Boyer-Moore algorithm in English alphabet.
Description
Keywords
Algorithm , String matching
Citation