Publication: Classification Of CpG Island And Promoter Regions Using Rare K-Mer Motifs
Date
2015-07
Authors
Mohamed Hashim,Ezzeddin Kamil
Abstract
Empirical analysis of DNA k-mers has proven to be an effective means of discovering functional
elements in the human genome. Among such empirical studies, the “rare k-mer” (RKM) is a
particularly interesting subject because of its unique sequence properties. RKMs are
defined as DNA k-mers (k = 7 to 11) that have a low frequency in the k-mer frequency
distributions of mammalian genomes, yet whose mass at the lower end of the k-mer spectra
varies widely. Our objectives are, first, to discover RKM motifs in the human genome; second,
to identify potential computational applications of RKMs in biology; and third, to infer
representations for the identified applications. In short, the first goal was achieved by using
a comparative strategy and several bioinformatics tools (the UCSC browsers, EpiGRAPH, and
Galaxy), which correlated RKMs with several genomic features: CpG Islands (CGIs), promoters,
5’ Untranslated Regions (5’UTRs), and open chromatin regions; and by using intrinsic string
mining approaches, which identified several unique topological, compositional, and
clustering properties of RKMs in the correlated genomic features.
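
To make the notion of a low-frequency k-mer concrete, the following is a minimal sketch of counting overlapping k-mers in a sequence and keeping those at or below a count cutoff. It is not the method used in this work: the function names `count_kmers` and `rare_kmers`, the `max_count` cutoff, and the toy sequence are all assumptions chosen for illustration, and a real analysis would use k = 7 to 11 on whole-genome data.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping k-mer that consists only of A, C, G, T."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows containing N or any other non-ACGT symbol.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def rare_kmers(counts, max_count):
    """Return observed k-mers whose count is at most max_count.

    max_count is a hypothetical cutoff for this sketch; the abstract does
    not state the threshold used to call a k-mer rare.
    """
    return sorted(kmer for kmer, n in counts.items() if n <= max_count)


# Toy example with k = 2 so the result is easy to check by hand.
toy = "ACGTACGT"
counts = count_kmers(toy, k=2)
rare = rare_kmers(counts, max_count=1)
print(rare)  # ['TA']
```

In the toy sequence, AC, CG, and GT each occur twice and TA occurs once, so TA is the only k-mer at or below the cutoff. Note that this sketch only reports k-mers that are observed at least once; k-mers that are entirely absent from the sequence would need to be enumerated separately.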
Keywords
Classification Of CpG, Promoter Regions Using Rare K-Mer Motifs