Filter-Wrapper Methods For Gene Selection In Cancer Classification
Date
2018-09
Authors
Osama Ahmad Suleiman Alomari
Publisher
Universiti Sains Malaysia
Abstract
In microarray gene expression studies, finding the smallest subset of informative
genes from microarray datasets for clinical diagnosis and accurate cancer classification
is one of the most difficult challenges in machine learning. Many researchers have
devoted their efforts to address this problem by using a filter method, a wrapper method
or a combination of both approaches. A hybrid method combines the filter and wrapper
approaches: it benefits from the speed of the filter approach
and the accuracy of the wrapper approach. Several hybrid filter-wrapper methods have
been proposed to select informative genes. However, hybrid methods inherit the
limitations of the filter and wrapper approaches on which they are built. The gene
subset that is produced by filter approaches lacks predictiveness and robustness. The
wrapper approach encounters problems of complex interactions among genes and stagnation
in local optima. To address these drawbacks, this study investigates filter and
wrapper methods to develop effective hybrid methods for gene selection. This study
proposes new hybrid filter-wrapper methods based on Maximum Relevancy Minimum
Redundancy (MRMR) as a filter approach and an adapted bat-inspired algorithm (BA) as
a wrapper approach. First, MRMR hybridisation and BA adaptation are investigated
to resolve the gene selection problem. The proposed method is called MRMR-BA.
Second, the modification of the filter approach (i.e., MRMR) is examined. An ensemble
of filter approaches (i.e., ReliefF, Chi-Square and Kullback-Leibler) is hybridised
with the filtering mechanism of MRMR to increase its robustness, and this method is
referred to as rMRMR-BA. Third, the modification of the wrapper approach (i.e., BA)
is investigated. Additional optimisation operators, which are based on TRIZ inventive
solution, are introduced to further explore the interaction between genes. This method is referred to
as rMRMR-MBA. Finally, this study investigates BA hybridisation with a local search
algorithm (i.e., β-hill climbing) to enhance local exploitation capability. This method
is referred to as rMRMR-HBA. The results of this study are compared with
those of 10 other methods on 14 benchmark microarray datasets of different
sizes and complexity. The proposed rMRMR-HBA achieved the best results on 8 out
of the 14 datasets. Moreover, the proposed method yielded competitive results on the
remaining datasets.
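
The filter-then-wrapper idea described above can be illustrated with a minimal sketch. It is not the MRMR-BA implementation from this study: a simple class-separation score stands in for MRMR, greedy forward selection stands in for the bat-inspired algorithm, and a nearest-centroid classifier with leave-one-out accuracy serves as the wrapper's evaluation. All function names and the synthetic data are assumptions made for illustration only.

```python
import random
import statistics


def filter_rank(X, y, k):
    """Filter stage: keep the k genes with the largest class-mean separation.

    This simple score stands in for MRMR; unlike MRMR it ignores redundancy.
    """
    n_genes = len(X[0])
    scores = []
    for g in range(n_genes):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        spread = statistics.pstdev(a) + statistics.pstdev(b) + 1e-9
        scores.append(abs(statistics.mean(a) - statistics.mean(b)) / spread)
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]


def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on the given genes."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in sorted(set(y)):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroids[label] = [statistics.mean(r[g] for r in rows) for g in genes]
        point = [X[i][g] for g in genes]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == y[i]
    return correct / len(X)


def wrapper_select(X, y, candidates):
    """Wrapper stage: greedy forward selection (a stand-in for the BA search)."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        for g in candidates:
            if g in selected:
                continue
            acc = loo_accuracy(X, y, selected + [g])
            if acc > best:  # remember the gene that improves accuracy the most
                best, best_gene, improved = acc, g, True
        if improved:
            selected.append(best_gene)
    return selected, best


def make_toy_data(n_samples=20, n_genes=30, seed=0):
    """Synthetic two-class data in which only genes 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        row[0] += 3.0 * label
        row[1] -= 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
shortlist = filter_rank(X, y, k=5)               # cheap filter narrows the search space
genes, accuracy = wrapper_select(X, y, shortlist)  # wrapper searches only the shortlist
print(genes, accuracy)
```

The filter stage is cheap because it scores each gene once, whereas the wrapper stage retrains the classifier for every candidate subset; restricting the wrapper to the filter's shortlist is what gives a hybrid its speed advantage.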
Keywords
Biomedical engineering, Cancer genes