UDK 519.6
ABOUT BOOSTED LEARNING OF NONPARAMETRIC ESTIMATORS
E. S. Mangalova*, O. V. Shesterneva
Reshetnev Siberian State Aerospace University, 31 Krasnoyarsky Rabochy Av., Krasnoyarsk, 660037, Russian Federation. *E-mail: e.s.mangalova@hotmail.com
The versatility of identification methods makes it possible to apply them in many technical areas (including the aerospace industry), as well as in medicine, economics, and elsewhere. In recent years, ensemble learning has become one of the most common approaches to identification. Ensemble methods train multiple learners and then combine them. One of the main goals of combining multiple models of the same type is to eliminate certain drawbacks of the individual models. This paper deals with some peculiar properties of the Nadaraya–Watson kernel estimator. These properties are related to the existence of sparse areas in the space of input variables (regions that contain few observations from the training set) and to the behavior of the Nadaraya–Watson kernel estimator near the boundary of the input variable space. The ensemble learning approach proposed by the authors is based on boosted learning of nonparametric estimators. A formalized approach to ensemble building with configurable parameters is given, together with guidelines for choosing these parameters. The numerical research shows that the proposed boosted ensemble is significantly more accurate than a single Nadaraya–Watson kernel estimator, both in sparse areas of the input variable space and in areas with a large number of training observations. The numerical research also demonstrates the high accuracy of the proposed boosted ensemble near the boundary of the input variable space and shows the possibility of using the boosted ensemble in the extrapolation problem.
Keywords: regression, ensemble learning, Nadaraya–Watson estimator, bandwidth.
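The approach described in the abstract — repeatedly fitting a Nadaraya–Watson smoother to the residuals of the current ensemble — can be illustrated with a minimal sketch. This is the generic L2-boosting scheme for kernel smoothers, not the authors' exact algorithm; the Gaussian kernel, the fixed bandwidth, and the `shrinkage` step size are all assumptions made here for illustration.

```python
import numpy as np

def nw_estimate(x_train, y_train, x_query, bandwidth):
    """Nadaraya-Watson kernel regression with a Gaussian kernel (1-D inputs)."""
    # Scaled distances between every query point and every training point
    d = (x_query[:, None] - x_train[None, :]) / bandwidth
    w = np.exp(-0.5 * d ** 2)                  # kernel weights
    return (w * y_train).sum(axis=1) / w.sum(axis=1)

def boosted_nw(x_train, y_train, bandwidth, n_rounds, shrinkage=0.5):
    """L2-boosting: each round smooths the residuals of the current fit.

    Returns a prediction function for new query points.
    """
    fit = np.zeros_like(y_train, dtype=float)
    stage_residuals = []
    for _ in range(n_rounds):
        resid = y_train - fit                  # what the ensemble still misses
        stage_residuals.append(resid.copy())
        fit += shrinkage * nw_estimate(x_train, resid, x_train, bandwidth)

    def predict(x_query):
        pred = np.zeros(len(x_query))
        for resid in stage_residuals:
            pred += shrinkage * nw_estimate(x_train, resid, x_query, bandwidth)
        return pred

    return predict
```

Because each round targets the residual left by the previous rounds, the ensemble can reduce the bias that a single oversmoothed Nadaraya–Watson estimator exhibits, including near the boundary of the input domain.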
References

1.  Hastie T., Tibshirani R., Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition. Springer, 2009.

2.  Polikar R. Ensemble Based Systems in Decision Making. IEEE Circuits and Systems Magazine, third quarter 2006, P. 21–45.

3.  Kuncheva L. I. Combining Pattern Classifiers: Methods and Algorithms. New York, NY, Wiley Interscience, 2005, 360 p.

4.  Mangalova E. S., Agafonov E. D. [Generation of a variety of individual model ensembles for the identification problem]. Trudy XII Vserossiyskogo soveshchaniya po problemam upravleniya [Proceedings of the XII All-Russian Meeting on Control Problems]. Available at: http://vspu2014.ipu.ru/proceedings/prcdngs/3214.pdf (In Russ.).

5.  Breiman L., Friedman J. H., Olshen R. A., Stone C. J. Classification and Regression Trees. Wadsworth Inc, Belmont, 1984.

6.  Breiman L. Random Forests. Machine Learning, 2001, No. 45 (1), P. 5–32.

7.  Friedman J. H. Greedy Function Approximation: A Gradient Boosting Machine. Available at: http://www-stat.stanford.edu/~jhf/ftp/trebst.pdf (accessed 10.1.2015).

8.  Friedman J. H. Stochastic Gradient Boosting. Available at: http://www-stat.stanford.edu/~jhf/ftp/stobst.pdf (accessed 10.1.2015).

9.  Nadaraya E. A. Neparametricheskie otsenki plotnosti veroyatnosti i krivoj regressii [Non-parametric estimation of the probability density and the regression curve]. Tbilisi, Tbilisi University Publ., 1983, 194 p. (In Russ.).

10.  Barsegyan A. A., Kupriyanov M. S., Holod I. I. Analiz dannykh i protsessov [Analysis of data and processes]. St. Petersburg, BHV Publ., 2009, 512 p. (In Russ.).

11.  Medvedev A. V. [Data analysis in the identification problem]. Komp'yuternyi analiz dannykh modelirovaniya. 1995, Vol. 2, P. 201–206 (In Russ.).

12.  Korneeva A. A., Sergeeva N. A., Chzhan E. A. [Nonparametric data analysis in the identification problem]. Vestnik TGU, 2013, Vol. 1 (22), P. 86–96 (In Russ.).

13.  Medvedev A. V. Neparametricheskie sistemy adaptacii [Nonparametric adaptation systems]. Novosibirsk, Nauka Publ., 1983, 174 p. (In Russ.).

14.  Hardle W. Prikladnaya neparametricheskaya regressiya [Applied nonparametric regression]. Moscow, Mir Publ., 1993, 349 p. (In Russ.).

15.  Schapire R. E. The strength of weak learnability. Machine Learning, 1990, Vol. 5, No. 2, P. 197–227.


Mangalova Ekaterina Sergeevna – postgraduate student, Reshetnev Siberian State Aerospace University. E-mail: e.s.mangalova@hotmail.com.

Shesterneva Olesya Viktorovna – Cand. Sc., Docent, Reshetnev Siberian State Aerospace University. E-mail: kuznetcova_o@mail.ru.