A Global Optimization Algorithm for Tuning Hyperparameters of Machine Learning Methods

Authors: Usova M.A., Lebedev I.G., Shtanyuk A.A., Barkalov K.A.

Journal: Problems of Informatics @problem-info

Section: Applied Information Technologies

Issue: 4 (69), 2025.

Free access

The paper addresses the problem of finding the best combination of hyperparameters for machine learning and artificial intelligence methods. The efficiency (in a given metric) of the resulting solution can vary substantially across different hyperparameter values. A significant issue in such problems is that the investigated artificial intelligence and machine learning methods may fail to operate correctly within certain (a priori unknown) subregions of the hyperparameter search domain. Furthermore, the computational cost of tuning makes manual or exhaustive search impractical. These characteristics have motivated the development of various intelligent automatic hyperparameter optimization methods. From a mathematical point of view, the task can be stated as finding the global minimum of a function given as a “black box” and computable only in part of the search domain. Each evaluation of the objective function at a point of the feasible domain may require significant computing resources. The objective function is assumed to satisfy the Lipschitz condition. The existence of subdomains where the objective function is undefined can be interpreted as the presence of hidden, a priori unknown constraints of the problem. The authors propose an approach to solving problems of this type that extends the information-statistical global search algorithm (GSA) to account for undefined objective function values at some points. The algorithm partitions the search space with trial points and evaluates the characteristics of subregions based on the objective function values computed at their boundaries. If the function value at a point is unknown, the algorithm substitutes an estimate for it that takes into account the size of the subregion under investigation.
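The problem setting above can be illustrated with a minimal sketch (this is a hypothetical example, not the authors' code or test functions): a black-box objective that satisfies the Lipschitz condition where it is defined, but is non-computable inside an a priori unknown subregion — a hidden constraint — signalled here by returning NaN.

```python
import math

def hidden_region(x, y):
    # Hidden constraint (unknown to the optimizer): an ellipse
    # inside which the function cannot be computed.
    return ((x - 0.3) / 0.2) ** 2 + ((y + 0.1) / 0.3) ** 2 < 1.0

def objective(x, y):
    if hidden_region(x, y):
        return float("nan")  # trial fails: value undefined at this point
    # Multiextremal but Lipschitz-continuous where defined.
    return math.sin(3.0 * x) + (y - 0.5) ** 2

print(objective(0.3, -0.1))   # nan: centre of the hidden ellipse
print(objective(-0.5, 0.0))   # an ordinary, computable trial
```

The optimizer sees only a value or a failure signal per trial; it never observes the shape of the hidden region itself.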
To minimize the number of redundant trials in subdomains where the function is not defined, a method parameter α is introduced that regulates the number of trial points in regions of non-computability. Multidimensional problems are solved by reducing them to one-dimensional optimization problems using space-filling (Peano) curves. The article provides a detailed description and a flowchart of the proposed search algorithm. The implementation of the global search algorithm for the case of a not-everywhere-computable objective function (GSA-N) is based on iOpt, an open-source framework of intelligent optimization methods. To carry out the experiments, GKLS-HC, a generator of test problems with hidden constraints, was developed. It is based on the GKLS generator, which produces multiextremal functions with specified properties (number of minima, their regions of attraction, etc.). In GKLS-HC, these functions are corrupted with regions of non-computability in the form of ellipsoids whose centers and radii are generated randomly. The experimental results presented in the paper, obtained on a series of GKLS-HC test problems, demonstrate the reliability of global search and the efficiency of the developed algorithm. By adjusting the parameter α, GSA-N was able to match the performance of the basic GSA. The paper also examines the behavior of global optimization algorithms from the scipy.optimize library on problems with non-computable domains. In this study, the differential evolution and brute-force methods showed the worst results, failing to solve problems of this type at all. The DIRECT and SHGO methods solved the problems but were less effective than the developed GSA-N. Experiments were also conducted on hyperparameter tuning problems in which undefined values of the quality metric arise.
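The one-dimensional core of an information-statistical global search can be sketched as follows. This is a heavily simplified Strongin-type illustration, not the paper's GSA-N: in particular, the substitution rule for non-computable points (current best finite value plus a term scaled by α) is an assumption made here for illustration, whereas the paper's estimate accounts for the size of the subregion under investigation, and the real method handles multidimensional problems via Peano curves.

```python
import math

def gsa_1d(f, a, b, r=2.0, alpha=1.0, n_trials=50):
    """Simplified Strongin-type global search on [a, b].

    Trials where f returns NaN (non-computable points) are replaced by a
    crude surrogate value: the current best finite value plus alpha. A
    larger alpha makes intervals bordering non-computable points look
    less promising, reducing redundant trials there.
    """
    xs = [a, b]
    raw = [f(a), f(b)]
    for _ in range(n_trials):
        best = min(v for v in raw if not math.isnan(v))
        zs = [best + alpha if math.isnan(v) else v for v in raw]
        order = sorted(range(len(xs)), key=lambda i: xs[i])
        x = [xs[i] for i in order]
        z = [zs[i] for i in order]
        # Estimate of the Lipschitz constant from adjacent trials.
        M = max(abs(z[i] - z[i - 1]) / (x[i] - x[i - 1])
                for i in range(1, len(x)))
        m = r * M if M > 0 else 1.0
        # Characteristic of each interval; the largest is subdivided next.
        def R(i):
            d = x[i] - x[i - 1]
            return m * d + (z[i] - z[i - 1]) ** 2 / (m * d) \
                - 2.0 * (z[i] + z[i - 1])
        t = max(range(1, len(x)), key=R)
        # New trial point strictly inside the chosen interval.
        x_new = 0.5 * (x[t] + x[t - 1]) - (z[t] - z[t - 1]) / (2.0 * m)
        xs.append(x_new)
        raw.append(f(x_new))
    return min((v, p) for v, p in zip(raw, xs) if not math.isnan(v))

# Hypothetical test function: a quadratic with a NaN "hole" over (0.2, 0.4).
def f(x):
    return float("nan") if 0.2 < x < 0.4 else (x - 0.7) ** 2

best_value, best_x = gsa_1d(f, 0.0, 1.0)
print(best_value, best_x)  # near 0.0 at x ≈ 0.7
```

Note how the surrogate value keeps the characteristics of intervals adjacent to the NaN hole finite, so the search can skirt the non-computable region instead of stalling in it.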
In these tuning problems, certain hyperparameter combinations caused the underlying method to return infinite values of the objective function. The LinearSVC classification algorithm was successfully tuned: GSA-N solved the problem effectively, identifying a better hyperparameter combination than standard scikit-learn algorithms. A time series prediction method from the FEDOT framework was also tuned; in this experiment GSA-N was compared with tuning algorithms from the popular Optuna framework. GSA-N achieved comparable performance in terms of the target metric value obtained, while significantly surpassing Optuna in solution time.
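The failure-handling pattern behind these experiments can be illustrated with a stand-in objective (a hypothetical sketch: `train_score` and its hyperparameters `C` and `tol` are invented here, not scikit-learn's LinearSVC or the FEDOT pipeline): hyperparameter combinations for which training raises an error or yields an infinite metric are mapped to +inf, so any search method can treat them uniformly as non-computable points.

```python
import itertools
import math

def train_score(C, tol):
    # Stand-in for a real training-and-validation run.
    if C <= 0 or tol <= 0:            # invalid combination: method fails
        raise ValueError("bad hyperparameters")
    return (math.log10(C) - 1) ** 2 + (math.log10(tol) + 3) ** 2

def safe_objective(params):
    try:
        value = train_score(**params)
        return value if math.isfinite(value) else math.inf
    except (ValueError, ArithmeticError):
        return math.inf                # undefined metric: hidden constraint

grid = [{"C": C, "tol": tol}
        for C, tol in itertools.product([0.0, 0.1, 1.0, 10.0],
                                        [1e-5, 1e-3, 1e-1])]
best = min(grid, key=safe_objective)
print(best)  # → {'C': 10.0, 'tol': 0.001}
```

Here a plain grid search stands in for the optimizer; GSA-N, scipy.optimize, or Optuna would consume the same `safe_objective` wrapper.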


Machine learning, hyperparameter tuning, global optimization, black-box functions, partially defined functions

Short address: https://sciup.org/143185319

IDR: 143185319   |   UDC: 519.853.4   |   DOI: 10.24412/2073-0667-2025-4-52-72