Artificial Bee Colony With Differential Evolution Algorithm For Feature Extraction And Selection Of Mass Spectrometry Data
Date
2016-05
Authors
Mohamed Yusoff, Syarifah Adilah
Abstract
Advances in mass spectrometry techniques for proteomic studies have accelerated
the discovery of biomarkers from quantitative proteomic patterns. High-throughput
data for a given molecule can give rise to a series of inter-related and
overlapping peaks in a mass spectrum. The spectrum suffers from high dimensionality
relative to the small sample size. Several studies have proposed statistical and machine
learning techniques such as Principal Component Analysis (PCA), Independent
Component Analysis (ICA) and wavelet coefficients to extract potential features.
However, none of these methods takes into account the huge number of features
relative to the small sample size. This study focused on two stages of mass spectrometry
analysis. Firstly, feature extraction methods extract peaks as potential features to infer
the biological meaning of the data. Shrinkage estimation of covariance was proposed to
assemble m/z windows and identify the correlation coefficients among peaks of mass
spectrometry data for feature extraction. Secondly, feature selection techniques search
for parsimonious features through a learning model that exhibits the most accurate results.
A computational technique that mimics survival and natural processes, known
as Artificial Bee Colony (ABC), integrated with a linear SVM classifier was proposed for
feature selection. Later, this was hybridised with Differential Evolution (DE) techniques
to form the deABC algorithm, in order to expand the exploration of basic ABC. The proposed
method was tested with several real-world high-resolution mass spectrometry datasets,
namely the ovarian cancer, liver (HCC) and drug-induced toxicity (TOX) datasets, to
evaluate discrimination power, accuracy, sensitivity and specificity. For feature extraction,
the analysis was compared with reported studies. Shrinkage estimation
performed better discriminative analysis on similar features. For feature selection,
comparisons were made with the Particle Swarm Optimisation (PSO) and Ant Colony
Optimisation (ACO) algorithms and with reported studies. The proposed deABC feature
selection algorithm achieved accuracies of 98.44, 88.89 and 93.75 percent on the ovarian
cancer, TOX and liver (HCC) datasets respectively and, on average, outperformed
PSO, ACO and a similar reported study.
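
The abstract does not say how the DE step is combined with ABC, so the following is only a minimal sketch of one plausible reading: a DE/rand/1-style mutation generates a candidate food source, which replaces the current one under ABC's greedy selection. All names (`de_candidate`, `to_mask`, `greedy_step`), the scale factor `f`, the 0.5 threshold and the toy fitness function are assumptions made for illustration. In the thesis the fitness would come from a linear SVM classifier, which is not reproduced here.

```python
import random


def de_candidate(population, i, f=0.5, rng=random):
    """Build a DE/rand/1-style candidate for food source i.

    Each food source is a list of floats in [0, 1], one per feature.
    Three distinct sources other than i are combined as a + f * (b - c),
    and the result is clipped back into [0, 1].
    """
    others = [k for k in range(len(population)) if k != i]
    r1, r2, r3 = rng.sample(others, 3)
    a, b, c = population[r1], population[r2], population[r3]
    return [min(1.0, max(0.0, a[j] + f * (b[j] - c[j])))
            for j in range(len(a))]


def to_mask(position, threshold=0.5):
    """Turn a continuous position into a feature mask (True = selected)."""
    return [p > threshold for p in position]


def greedy_step(population, scores, i, fitness, f=0.5, rng=random):
    """Keep the candidate only if it scores higher (ABC greedy selection)."""
    candidate = de_candidate(population, i, f, rng)
    score = fitness(to_mask(candidate))
    if score > scores[i]:
        population[i], scores[i] = candidate, score
    return scores[i]


def toy_fitness(mask):
    """Hypothetical stand-in for classifier accuracy.

    Rewards the first two features and penalises every other selected one.
    """
    return sum(mask[:2]) - 0.1 * sum(mask[2:])


if __name__ == "__main__":
    rng = random.Random(0)
    population = [[rng.random() for _ in range(6)] for _ in range(5)]
    scores = [toy_fitness(to_mask(p)) for p in population]
    for _ in range(20):
        for i in range(len(population)):
            greedy_step(population, scores, i, toy_fitness, rng=rng)
    print(max(scores))
```

Because of the greedy replacement, the score of each food source can only stay the same or improve from one step to the next. The onlooker and scout phases of ABC are omitted from this sketch.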