Filter-Wrapper Methods For Gene Selection In Cancer Classification

Date
2018-09
Authors
Osama Ahmad Suleiman Alomari
Publisher
Universiti Sains Malaysia
Abstract
In microarray gene expression studies, finding the smallest subset of informative genes for clinical diagnosis and accurate cancer classification is one of the most difficult challenges in machine learning. Many researchers have addressed this problem by using a filter method, a wrapper method or a combination of both. A hybrid method combines the two: it benefits from the speed of the filter approach and the accuracy of the wrapper approach. Several hybrid filter-wrapper methods have been proposed to select informative genes. However, hybrid methods inherit limitations associated with the filter and wrapper approaches. The gene subset produced by filter approaches lacks predictiveness and robustness, whereas the wrapper approach struggles with complex interactions among genes and stagnation in local optima. To address these drawbacks, this study investigates filter and wrapper methods to develop effective hybrid methods for gene selection. It proposes new hybrid filter-wrapper methods based on Maximum Relevancy Minimum Redundancy (MRMR) as the filter approach and an adapted bat-inspired algorithm (BA) as the wrapper approach. First, the hybridisation of MRMR with an adapted BA is investigated for the gene selection problem; the proposed method is called MRMR-BA. Second, the modification of the filter approach (i.e., MRMR) is examined: an ensemble of filter approaches (i.e., ReliefF, Chi-Square and Kullback-Leibler) is hybridised with the filtering mechanism of MRMR to increase its robustness. This method is referred to as rMRMR-BA. Third, the modification of the wrapper approach (i.e., BA) is investigated: additional optimisation operators, based on TRIZ inventive solutions, are introduced to further explore the interactions between genes. This method is referred to as rMRMR-MBA. Finally, this study investigates BA hybridisation with a local search algorithm (i.e., β-Hill Climbing) to enhance local exploitation capability. This method is referred to as rMRMR-HBA. The results are compared with those of 10 other methods on 14 benchmark microarray datasets of different sizes and complexity. The proposed rMRMR-HBA achieved the best results on 8 of the 14 datasets and yielded competitive results on the remaining datasets.
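The two-stage filter-wrapper idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation and rests on several assumptions: absolute Pearson correlation stands in for the relevance and redundancy measures of MRMR, a leave-one-out nearest-centroid classifier stands in for whatever classifier the thesis uses, and a simple bit-flip local search stands in for the bat-inspired algorithm. All function names (`mrmr_filter`, `loo_accuracy`, `wrapper_search`) and the synthetic data are hypothetical.

```python
import random
from statistics import mean

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def mrmr_filter(X, y, k):
    """Filter stage: greedily add the gene with the best relevance minus mean redundancy.
    Assumption: |Pearson r| replaces the measures used by MRMR proper."""
    n_genes = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_genes)]
    relevance = [abs(pearson(c, y)) for c in cols]
    selected = [max(range(n_genes), key=relevance.__getitem__)]
    while len(selected) < min(k, n_genes):
        def score(j):
            redundancy = mean(abs(pearson(cols[j], cols[s])) for s in selected)
            return relevance[j] - redundancy
        rest = [j for j in range(n_genes) if j not in selected]
        selected.append(max(rest, key=score))
    return selected

def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on the chosen genes."""
    if not genes:
        return 0.0
    hits = 0
    for i in range(len(X)):
        best_label, best_dist = None, float("inf")
        for label in sorted(set(y)):
            members = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            if not members:
                continue
            centroid = [mean(m[g] for m in members) for g in genes]
            dist = sum((X[i][g] - c) ** 2 for g, c in zip(genes, centroid))
            if dist < best_dist:
                best_label, best_dist = label, dist
        hits += best_label == y[i]
    return hits / len(X)

def wrapper_search(X, y, candidates, iters=50, seed=0):
    """Wrapper stage: bit-flip local search over the filtered genes.
    Assumption: this is a simple stand-in, not the bat-inspired algorithm."""
    rng = random.Random(seed)
    def fitness(mask):
        genes = [g for g, keep in zip(candidates, mask) if keep]
        # Reward accuracy first, then penalise subset size slightly.
        return loo_accuracy(X, y, genes) - 0.01 * len(genes)
    mask = [True] * len(candidates)
    best = fitness(mask)
    for _ in range(iters):
        trial = mask[:]
        pos = rng.randrange(len(trial))
        trial[pos] = not trial[pos]
        f = fitness(trial)
        if f >= best:  # accept moves that do not reduce fitness
            mask, best = trial, f
    return [g for g, keep in zip(candidates, mask) if keep]

# Synthetic data (hypothetical): genes 0 and 1 carry the class signal and are
# nearly duplicates of each other; genes 2-7 are pure noise.
rng = random.Random(1)
y = [0] * 10 + [1] * 10
X = []
for label in y:
    signal = label * 3.0 + rng.gauss(0, 0.3)
    X.append([signal, signal + rng.gauss(0, 0.05)] + [rng.gauss(0, 1) for _ in range(6)])

candidates = mrmr_filter(X, y, k=4)          # cheap filter narrows the gene pool
subset = wrapper_search(X, y, candidates)    # costlier wrapper refines it
print(candidates, subset, loo_accuracy(X, y, subset))
```

The division of labour mirrors the rationale given in the abstract: the filter scores genes without training a classifier, so it is fast, while the wrapper evaluates whole subsets with a classifier, so it is slower but tied directly to classification accuracy.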
Keywords
Biomedical engineering, Cancer genes