Microarray gene-expression data classification using less gene expressions by combining feature selection methods and classifiers

Authors: Aarti Bhalla, R. K. Agrawal

Journal: International Journal of Information Engineering and Electronic Business (IJIEEB)

Published in: Issue 5, Vol. 5, 2013.

Free access

Microarray data, often characterised by high dimensionality and small sample sizes, is used in cancer classification problems, where the given (tissue) samples are classified as diseased or healthy on the basis of their gene expression profiles. The goal of feature selection is to find the most relevant features among the thousands of related features in a particular problem domain. The focus of this study is a method that relaxes the maximum-accuracy criterion for feature selection: it selects the combination of feature selection method and classifier that, using a small subset of features, obtains an accuracy not statistically significantly different from the maximum accuracy. By selecting a classifier that employs a small number of features while still achieving good accuracy, the risk of overfitting (bias) is reduced. This has been corroborated empirically using some common attribute selection methods (ReliefF, SVM-RFE, FCBF, and Gain Ratio) and classifiers (3-Nearest Neighbour, Naive Bayes, and SVM) applied to 6 different microarray cancer data sets. We use hypothesis testing to compare several configurations and select particular configurations that perform well with few genes on these data sets.
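The selection rule described in the abstract can be illustrated with a minimal sketch. The abstract does not name the specific hypothesis test, so a paired t-test on per-fold accuracies is assumed here, with a two-sided critical value of 2.262 (10 folds, 9 degrees of freedom, 5% level). The function names, the dictionary layout, and all accuracy values below are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Paired t statistic for two equal-length lists of per-fold accuracies."""
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)
    if sd == 0:
        # All differences are equal: no difference at all, or a constant gap.
        return 0.0 if mean(diffs) == 0 else float("inf")
    return mean(diffs) / (sd / sqrt(len(diffs)))


def select_configuration(configs, t_critical=2.262):
    """Pick the configuration with the fewest genes whose accuracy is not
    significantly different from that of the most accurate configuration.

    t_critical is an assumed two-sided critical value (9 d.o.f., alpha = 0.05).
    """
    best = max(configs, key=lambda c: mean(c["fold_acc"]))
    candidates = [
        c for c in configs
        if c is best
        or abs(paired_t_statistic(best["fold_acc"], c["fold_acc"])) < t_critical
    ]
    return min(candidates, key=lambda c: c["n_genes"])


# Hypothetical per-fold accuracies, for illustration only.
configs = [
    {"name": "SVM-RFE + SVM", "n_genes": 200,
     "fold_acc": [0.95, 0.96, 0.94, 0.97, 0.95, 0.96, 0.95, 0.94, 0.96, 0.95]},
    {"name": "ReliefF + 3-NN", "n_genes": 20,
     "fold_acc": [0.96, 0.95, 0.94, 0.96, 0.95, 0.97, 0.94, 0.94, 0.96, 0.94]},
    {"name": "Gain Ratio + Naive Bayes", "n_genes": 10,
     "fold_acc": [0.85, 0.84, 0.86, 0.83, 0.85, 0.86, 0.84, 0.85, 0.83, 0.84]},
]

chosen = select_configuration(configs)
print(chosen["name"], chosen["n_genes"])
```

In this made-up example the 200-gene configuration has the highest mean accuracy, but the 20-gene configuration is not significantly worse under the assumed test, so it is preferred. The 10-gene configuration is rejected because its accuracy is significantly lower.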

Keywords: Microarrays, Feature Selection, Hypothesis testing, Classification with less genes

Short URL: https://sciup.org/15013212

IDR: 15013212

Research article