FEATURE SELECTION AND THE CHESSBOARD PROBLEM

Authors

  • Mariusz Kubus

DOI:

https://doi.org/10.18778/0208-6018.311.03

Keywords:

chessboard problem, feature selection, feature relevance

Abstract

Feature selection methods are usually classified into three groups: filters, wrappers and embedded methods. A second important criterion for classifying them is whether feature relevance is evaluated individually or in a multivariate way. The chessboard problem is an illustrative example in which two variables that have no individual influence on the dependent variable are nevertheless essential for separating the classes. Classifiers that deal well with such a data structure are sensitive to irrelevant variables: their generalization error increases with the number of noisy variables. We discuss feature selection methods in the context of a chessboard-like structure in data with numerous irrelevant variables.
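The two effects described in the abstract can be illustrated with a small simulation. The sketch below is not the author's experiment: the function names, the 4 x 4 board, the sample sizes, the cell-majority score (a stand-in for a relevance measure) and the choice of a 1-nearest-neighbour classifier are all assumptions made for illustration only.

```python
import numpy as np


def make_chessboard(n_samples, n_noise, n_cells=4, seed=0):
    """Uniform points on [0, 1)^d. The class is the colour of the chessboard
    square given by the first two coordinates; the rest are pure noise."""
    rng = np.random.default_rng(seed)
    X = rng.random((n_samples, 2 + n_noise))
    squares = np.floor(X[:, :2] * n_cells).astype(int)
    y = squares.sum(axis=1) % 2
    return X, y


def cell_majority_accuracy(X, y, features, n_cells=4):
    """Hypothetical relevance score: bin the chosen features into a grid and
    report the training accuracy of predicting the majority class per cell.
    A value near 0.5 means the feature subset carries almost no information."""
    bins = np.floor(X[:, features] * n_cells).astype(int)
    cell_id = np.zeros(len(y), dtype=int)
    for j in range(bins.shape[1]):
        cell_id = cell_id * n_cells + bins[:, j]
    n_total = n_cells ** bins.shape[1]
    ones = np.bincount(cell_id, weights=y, minlength=n_total)
    counts = np.bincount(cell_id, minlength=n_total)
    return np.maximum(ones, counts - ones).sum() / len(y)


def one_nn_test_error(n_noise, n_train=1000, n_test=500, seed=0):
    """Test error of a plain 1-nearest-neighbour rule (Euclidean distance)
    on chessboard data with n_noise irrelevant variables."""
    X, y = make_chessboard(n_train + n_test, n_noise, seed=seed)
    X_tr, y_tr = X[:n_train], y[:n_train]
    X_te, y_te = X[n_train:], y[n_train:]
    # Squared distances via ||a||^2 + ||b||^2 - 2 a.b
    d2 = ((X_te ** 2).sum(axis=1)[:, None]
          + (X_tr ** 2).sum(axis=1)[None, :]
          - 2.0 * X_te @ X_tr.T)
    pred = y_tr[d2.argmin(axis=1)]
    return float((pred != y_te).mean())


if __name__ == "__main__":
    X, y = make_chessboard(n_samples=2000, n_noise=10)
    # Individual evaluation versus joint evaluation of feature subsets
    for feats in ([0], [1], [2], [0, 2], [0, 1]):
        print(feats, round(cell_majority_accuracy(X, y, feats), 3))
    # Effect of irrelevant variables on a distance-based classifier
    for n_noise in (0, 5, 20):
        print(n_noise, one_nn_test_error(n_noise))
```

Under these assumptions, each of the two relevant variables scores close to 0.5 when evaluated on its own, much like a noise variable, while the pair [0, 1] scores exactly 1.0 because every square is pure. The 1-nearest-neighbour error should be small without noise variables and should approach chance level as they are added, which is the sensitivity the abstract refers to.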

Published

2016-01-07

How to Cite

Kubus, M. (2016). FEATURE SELECTION AND THE CHESSBOARD PROBLEM. Acta Universitatis Lodziensis. Folia Oeconomica, 1(311). https://doi.org/10.18778/0208-6018.311.03

Section

MSA2015