Evaluating the suitability of normalized google similarity and individual match ratio average as measures for protein similari

dc.contributor.author: Lee, Jun Choi
dc.date.accessioned: 2016-10-31T02:02:13Z
dc.date.available: 2016-10-31T02:02:13Z
dc.date.issued: 2008-06
dc.description.abstract: Biological sequence comparison faces various challenges. Although dynamic programming is claimed to be the optimal solution for the comparison process, computational limitations and some fundamental challenges still make it inefficient for large-scale sequence comparison. Statistical methods explore the statistics of sequences through the frequency of words or partitions in the sequence; they not only provide a solution without loss of statistical information but also address some of the fundamental problems in sequence comparison. Normalized Google Distance is a way of finding semantic similarity in web pages, with significant related characteristics. In this study, the suitability of Normalized Google Similarity and Individual Match Ratio Average in representing the statistical significance of proteins in protein sequence comparison is studied. The potential of the proposed similarity measurements is evaluated through correlation coefficient and accuracy, with FASTA as the reference benchmark. This study shows that protein similarity measurement based on overlapping K-tuples gives better overall results than measurement based on non-overlapping K-tuples. Both Normalized Google Similarity and Individual Match Ratio Average show capability in representing protein sequence comparison. [en_US]
dc.identifier.uri: http://hdl.handle.net/123456789/2921
dc.subject: Normalized Google Distance is a way of [en_US]
dc.subject: finding semantic similarity in web pages [en_US]
dc.title: Evaluating the suitability of normalized google similarity and individual match ratio average as measures for protein similari [en_US]
dc.type: Thesis [en_US]
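
The abstract contrasts overlapping and non-overlapping K-tuples and refers to Normalized Google Distance (NGD). The following is a minimal sketch of those two ideas only, not the thesis's implementation: the function names `k_tuples` and `normalized_google_distance` are hypothetical, the NGD formula is the general one defined for occurrence counts, and how the thesis derives those counts from protein sequences (or how it defines Individual Match Ratio Average) is not stated in this record.

```python
import math


def k_tuples(sequence, k, overlapping=True):
    """Split a sequence into K-tuples (words of length k).

    Overlapping tuples advance one residue at a time; non-overlapping
    tuples advance k residues at a time and drop any incomplete tail.
    """
    step = 1 if overlapping else k
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, step)]


def normalized_google_distance(fx, fy, fxy, n):
    """General NGD from counts (assumed inputs, not the thesis's definition).

    fx, fy : occurrence counts of x and y
    fxy    : co-occurrence count of x and y
    n      : total number of items counted (must exceed min(fx, fy))
    """
    if fxy == 0:
        return math.inf  # x and y never co-occur
    log_fx, log_fy = math.log(fx), math.log(fy)
    return (max(log_fx, log_fy) - math.log(fxy)) / (math.log(n) - min(log_fx, log_fy))


# Hypothetical toy sequence, for illustration only.
print(k_tuples("MKVLAT", 2, overlapping=True))   # ['MK', 'KV', 'VL', 'LA', 'AT']
print(k_tuples("MKVLAT", 2, overlapping=False))  # ['MK', 'VL', 'AT']
```

Overlapping extraction yields more tuples from the same sequence (five versus three here), which is one plausible reason a frequency-based measure would have more information to work with; the record itself does not give the explanation.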