UDK 519.854.33
DETECTION OF PATTERNS IN DATA FOR RECOGNITION OF OBJECTS AS A CONDITIONAL PSEUDO-BOOLEAN OPTIMIZATION PROBLEM
A. N. Antamoshkin, I. S. Masich*
Siberian State Aerospace University named after academician M. F. Reshetnev 31, Krasnoyarsky Rabochy Av., Krasnoyarsk, 660014, Russian Federation E-mail: i-masich@yandex.ru
Creating and using logical classification algorithms are based on revealing patterns in the input data, and a decision function is formed from a set of these patterns. Search for patterns can be viewed as a combinatorial optimization problem. To produce the most effective solutions, the choice of the optimization algorithm should be made on the basis of the characteristic properties inherent in the considered optimization problem. In this paper we consider some properties of optimization problems that are solved in the course of finding logical patterns in the data. We consider the recognition problem for objects described by binary attributes and divided into two classes. Regularities are the elementary blocks for construction logical recognition algorithms. The problem of finding the maximum patterns can be written in the form of a constrained pseudo-Boolean optimization problem. We study the properties of the optimization model, which describes the search for logical patterns in the data. Research results show that in the search space there is a set of constancy of the objective function. This hampers performance of the optimization algorithms, which begin the search from a feasible point and leading it to neighboring points, since the calculation of the objective function in the system of neighborhoods consisting of neighboring points, gives no information on best search direction. While solving practical problems of large dimension, this set of constancy may own most of the points of the feasible region. We consider the possibilities of improving algorithms of search for patterns. Experimental investigations were conducted on the real-world recognition problems. Experimental results show that the use of information about the proximity of the sample objects to patterns can overcome the difficulties associated with the characteristic features of optimization problem solved and manifested in the presence of constancy sets. And this allows finding the best patterns in the data to use them in solving the recognition problems.
classification, logical patterns, pseudo-Boolean optimization.
References
  1. Dupuis C., Gamache M., Páge J. F. Logical analysis of data for estimating passenger show rates in the airline industry, Journal of Air Transport Management 18, 2012. P. 78–81.
  2. Hammer P. L., Bonates T. O. Logical analysis of data - An overview: From combinatorial optimization to medical applications. Annals of Operations Research 148, 2006.
  3. Esmaeili S. Development of equipment failure prognostic model based on logical analysis of data, Master of Applied Science Thesis, Dalhousie University, Halifax, Nova Scotia, July 2012.
  4. Boros E., Hammer P. L., Ibaraki T., Kogan A., Mayoraz E., Muchnik I. An implementation of logical analysis of data, IEEE Trans. on Knowledge and Data Engineering 12, 2000. P. 292–306.
  5. Hammer P. L. The Logic of Cause-effect Relationships. Lecture at the International Conference on Multi-Attribute Decision via Operations Research-based Expert Systems. Passau, Germany : Universitat Passau, 1986.
  6. Hammer P. L., Kogan A., Simeone B., Szedmák S. Pareto-optimal patterns in logical analysis of data. Discrete Applied Mathematics, 2004, vol. 144, no. 1, p. 79–102.
  7. Alexe G., Hammer P. L. Spanned patterns for the logical analysis of data, Discrete Appl. Math. 2006, vol. 154, p. 1039–1049.
  8. Guoa C., Ryoo H. S. Compact MILP models for optimal and Pareto-optimal LAD patterns, Discrete Applied Mathematics, 2012, vol. 160, p. 2339–2348.
  9. Eckstein J., Hammer P. L., Liu Y., Nediak M., Simeone B. The maximum box problem and its application to data analysis. Computational Optimization and Applications. 2002, vol. 23, p. 285–298.
  10. Bonates T. O., Hammer P. L., Kogan A. Maximum Patterns in Datasets. Discrete Applied Mathematics, 2008, vol. 156, no. 6, p. 846–861.
  11. Antamoshkin A. N. Regulyarnaya optimizatsiya psevdobulevykh funktsiy. [Regular optimization of pseudo-functions]. Krasnoyarsk, Izd-vo Krasnoyarskogo un-ta Publ., 1989, 284 p.
  12. Antamoshkin A. N., Masich I. S. Pseudo-Boolean optimization in case of unconnected feasible sets. Models and Algorithms for Global Optimization. Series: Springer Optimization and Its Applications. Springer, 2007, vol. 4, p. 111–122.
  13. Antamoshkin A. N., Masich I. S. [Efficient algorithms for constrained optimization pseudo-monotone functions]. Vestnik SibGAU. 2003, no. 4, p. 60–67 (In Russ.).
  14. Masich I. S. [The heuristic algorithms of boundary points search for an constraint pseudo-boolean optimization problem.] Vestnik SibGAU. 2006, no. 1(8), p. 39–43. (In Russ.)
  15. Bache K., Lichman M. UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science, 2013. Available at: http://archive.ics.uci.edu/ml.
  16. Golovenkin S. Е. et al. Oslozhneniya infarkta miokarda: baza dannykh dlya aprobatsii sistem raspoznavaniya i prognoza [Complications of myocardial infarction: a database for testing recognition systems and forecasting]. Krasnoyarsk, Vychislitel'nyy tsentr SO RAN, Preprint no. 6, 1997 (In Russ.).
  17. Golovenkin S. Е. et al. [Model of logical analysis for solving problem of prognosis of myocardial infarction complication]. Vestnik SibGAU. 2010, no. 4(30), p. 68–73 (In Russ.).
  18. Masich I. S., Kuzmich R. I., Kraeva E. M. Logicheskiy analiz dannykh v zadachakh klassifikatsii. Svidetel'stvo gosudarstvennoy registratsii programmy dlya EVM №2011612265 [Logical analysis of data in classification problems. Certificate of state registration of the computer № 2011612265]. SibGAU, 2011 (In Russ.).

Antamoshkin Alexander Nikolaevich – Dr. Sc., professor, Siberian State Aerospace University named after academician M. F. Reshetnev. E-mail: oleslav@mail.ru

Masich Igor Sergeevich – Cand. Sc., Docent, Siberian State Aerospace University named after academician M. F. Reshetnev. E-mail: i-masich@yandex.ru