On nonparametric control of a dynamic system
Authors: Agafonov E.D., Shishkina A.V.
Journal: Siberian Aerospace Journal @vestnik-sibsau
Section: Mathematics, mechanics, computer science
Published in: issue 4, vol. 18, 2017.
The paper considers the problem of dual control of an inertia-free object whose input is affected by a control variable and an observable but uncontrollable variable. The idea of dual control belongs to A. Feldbaum and was developed on the basis of the Bayesian approach. In that setting, the probability densities of the interference and of the input and output variables are known. In particular, the case of Gaussian probability densities was investigated. As a result, dual control algorithms for the simplest objects of the inertia-free class were obtained. For combined control systems, these studies were carried out by Feldbaum's followers. Further development of dual control theory is due to Y. Tsypkin. There the probability density of the interference was unknown, but the structure of the control device and the equation describing the object still had to be selected. Tsypkin's works give the corresponding parametric dual control algorithms, in which the coefficients of the model and of the regulator are estimated simultaneously by the method of stochastic approximation. Later, nonparametric dual control algorithms were proposed. It is this approach to control design that is discussed in the present paper. Neither a parametric model of the object nor a parametric structure of the controller is assumed in the problem statement. It is known a priori, however, that the characteristic of the object is one-to-one in the control. Below we consider nonparametric control algorithms that combine control with the study of an object whose structure is unknown up to its parameters, but for which the number of delayed values of the output variable is given a priori. In other words, the memory depth of the controlled object is known.
In this case, a nonparametric dual control algorithm can operate under both passive and active accumulation of information. A technique for representing a one-dimensional dynamic system as a multidimensional static (inertia-free) one is presented. Some results of a numerical investigation of nonparametric dual control algorithms are also given.
Keywords: object with memory, dual control, combined system, nonparametric algorithms, bandwidth parameter, parameter setting
Short URL: https://sciup.org/148177752
IDR: 148177752
Introduction. The problem addressed in the paper is the design and implementation of nonparametric control algorithms. The object under control is assumed to be inertia-free and is described by an equation whose structure is unknown up to its parameters. Together with the control action, the object is influenced by an uncontrollable but observable input. The nonparametric dual control algorithms under consideration were investigated for various strategies of choosing the bandwidth parameters at each time step. The concept of dual control was created by A. Feldbaum [1] and developed by Y. Tsypkin [2]. It was originally intended for the Bayesian control problem statement with an inertia-free object under control. The main idea of the concept is simultaneous control and learning of the object. This involves procedures of parametric identification [3] together with control methods in the parametric formulation [4–7]. It should be noted that a dual control system is an example of a control device with memory.
Problem statement. The following notation is introduced: let $x = (x_1, \ldots, x_n) \in R^n$ be the output of the object, $u = (u_1, \ldots, u_k) \in R^k$ the controlled input, $\mu = (\mu_1, \ldots, \mu_m) \in R^m$ the uncontrollable but observable input, and $x^* = (x_1^*, \ldots, x_n^*) \in R^n$ the reference for the object under control (fig. 1) [8].
In fig. 1 the control device is denoted by ‘yy’, the object under control by ‘O’, and the random stationary noise acting on the object itself and on its measurement channels by $\xi_t$, $h_t$. We assume these noise effects to be zero-mean with bounded variance. The unknown characteristic of the object $x = f(u, \mu)$ is assumed to be one-to-one with respect to the control $u \in \Omega(u) \subset R^k$ for any fixed vector $\mu \in \Omega(\mu) \subset R^m$, within the feasible domain $\Omega(u)$.

Fig. 1. The scheme of a nonparametric control system, where (t) denotes continuous time and the subscript t indicates the discrete time moments of measurement
Nonparametric combined control algorithms. The nonparametric control algorithm is based on the Nadaraya–Watson [9] nonparametric regression estimate. For the single-input case, the nonparametric dual control algorithm in the form of [10] is given by the expression:
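$$u_{t+1} = \frac{\sum_{i=1}^{t} u_i \,\Phi\left(\frac{x^*_{t+1} - x_i}{c_t^{x}}\right) \Phi\left(\frac{\mu_{t+1} - \mu_i}{c_t^{\mu}}\right)}{\sum_{i=1}^{t} \Phi\left(\frac{x^*_{t+1} - x_i}{c_t^{x}}\right) \Phi\left(\frac{\mu_{t+1} - \mu_i}{c_t^{\mu}}\right)} + \Delta u_{t+1}, \quad i = \overline{1, t}. \quad (1)$$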
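The kernel-weighted part of algorithm (1), without the search additive, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the triangular kernel, the function name `control_estimate`, and the choice to return `None` when no past observation falls inside both kernel windows are assumptions made here. The estimate is a weighted average of past controls, where a past point counts more when its output was close to the reference and its observed input was close to the current one.

```python
from typing import Optional, Sequence


def kernel(z: float) -> float:
    """Triangular kernel (an assumed choice): 1 - |z| for |z| < 1, else 0."""
    return max(0.0, 1.0 - abs(z))


def control_estimate(
    x_ref: float,
    mu_next: float,
    xs: Sequence[float],
    us: Sequence[float],
    mus: Sequence[float],
    c_x: float,
    c_mu: float,
) -> Optional[float]:
    """Kernel-weighted average of the past controls u_i.

    A past point gets a large weight when its output x_i is close to the
    reference x_ref and its observed input mu_i is close to mu_next.
    Returns None when every weight is zero (hypothetical convention).
    """
    num = 0.0
    den = 0.0
    for x_i, u_i, mu_i in zip(xs, us, mus):
        w = kernel((x_ref - x_i) / c_x) * kernel((mu_next - mu_i) / c_mu)
        num += u_i * w
        den += w
    return num / den if den > 0.0 else None


# Three past observations of a toy object x = u + mu with mu fixed at 0.5.
xs, us, mus = [1.0, 2.0, 3.0], [0.5, 1.5, 2.5], [0.5, 0.5, 0.5]
u_star = control_estimate(2.0, 0.5, xs, us, mus, c_x=1.5, c_mu=1.0)
```

With these symmetric sample points the weighted average lands on the control that produced the output 2.0 in the data. Narrower bandwidths `c_x`, `c_mu` use fewer neighbours and may leave the estimate undefined, which is why the empty case is handled explicitly.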
For the multiple-input object a nonparametric dual control algorithm can be expressed by
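$$u_{t+1} = \frac{\sum_{i=1}^{t} u_i \prod_{j=1}^{n} \Phi\left(\frac{x^*_{j,t+1} - x_{j,i}}{c_t^{x_j}}\right) \prod_{j=1}^{m} \Phi\left(\frac{\mu_{j,t+1} - \mu_{j,i}}{c_t^{\mu_j}}\right)}{\sum_{i=1}^{t} \prod_{j=1}^{n} \Phi\left(\frac{x^*_{j,t+1} - x_{j,i}}{c_t^{x_j}}\right) \prod_{j=1}^{m} \Phi\left(\frac{\mu_{j,t+1} - \mu_{j,i}}{c_t^{\mu_j}}\right)} + \Delta u_{t+1}, \quad (2)$$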
$$c_t = l_2 \, |\mu_{t+1} - \mu_0|, \quad (5)$$
where $l_2$ is a coefficient to be found experimentally, $l_2 > 1$, and $\mu_0$ is found from the optimization $\mu_0 = \min |\mu - \mu_i|$, $i = \overline{1, p}$. This finally yields a new reduced data set $\{x_i, u_i, \mu_i\}$, $i = \overline{1, s}$, whose size $s$ satisfies the inequality $s < p$.
In our experience, the algorithm (1) is insensitive to the order in which its two bandwidth parameters are fitted.
Let us turn to the learning process of the nonparametric dual control algorithm. It was discussed in [10] and is represented by the iterative scheme:
$$u_{t+1} = u_t^* + \Delta u_{t+1}. \quad (6)$$
Information about the object under control is contained in $u_t^*$, and learning is carried out by the search additive $\Delta u_{t+1}$:
$$\Delta u_{t+1} = \alpha \, (x^*_{t+1} - x_{t+1}), \quad (7)$$
where $\alpha$ is a coefficient defining the search amplitude, which has to be tuned. We require that $\Delta u_{t+1} \to 0$ as $t$ grows.
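The split of the control into a learned part and a search part can be illustrated with a short Python sketch. It is only a hedged illustration of the scheme (6), (7): the function names, the fallback to the previous control when no learned estimate is available yet, and the use of the most recently measured output `x_last` in the error term are assumptions made here, so that the step can be computed before the new control is applied.

```python
from typing import Optional


def search_additive(alpha: float, x_ref: float, x_last: float) -> float:
    """Search step: alpha times the current tracking error."""
    return alpha * (x_ref - x_last)


def dual_control(
    u_star: Optional[float],
    u_prev: float,
    alpha: float,
    x_ref: float,
    x_last: float,
) -> float:
    """Learned control plus search additive.

    u_star is the data-based estimate; when it is None (no usable data yet,
    a hypothetical convention) the previously applied control is used instead.
    """
    base = u_prev if u_star is None else u_star
    return base + search_additive(alpha, x_ref, x_last)


# Learned estimate available, output still 0.2 below the reference.
u_next = dual_control(u_star=1.5, u_prev=0.0, alpha=0.5, x_ref=2.0, x_last=1.8)

# No estimate yet and the output already on the reference: control is unchanged.
u_hold = dual_control(u_star=None, u_prev=1.0, alpha=0.5, x_ref=2.0, x_last=2.0)
```

Once the output sits on the reference the additive is zero, which is consistent with the requirement that the search component vanish as learning proceeds.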
It is a well-known fact that the output of a discrete dynamic object can be represented as follows [3]:
$$x_t = f(x_{t-1}, \ldots, x_{t-k}, u_t). \quad (8)$$
In this case $x_{t-1}, \ldots, x_{t-k}$ can be interpreted as additional uncontrollable inputs in terms of the static object modeling scheme introduced above. Fig. 2 illustrates the approach.
In fig. 2 the following notation is used: $x_t^*$ is the reference output of the object; $t$ in round brackets is continuous time; the subscript $t$ denotes discrete time indices; $h_t^u$, $h_t^x$ are random noise in the measurement channels of the corresponding variables of the object; $\xi(t)$ is an unobservable random effect.
Thus, treating the observable but uncontrollable variables $x_{t-i}$, $i = \overline{1, k}$, as inputs of the object, the algorithm (1) can be rewritten in the following form [15]:
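$$u_{t+1} = \frac{\sum_{i=1}^{t} u_i \,\Phi\left(\frac{x^*_{t+1} - x_i}{c_t^{x}}\right) \prod_{j=1}^{k} \Phi\left(\frac{x_{t+1-j} - x_{i-j}}{c_t^{x_j}}\right)}{\sum_{i=1}^{t} \Phi\left(\frac{x^*_{t+1} - x_i}{c_t^{x}}\right) \prod_{j=1}^{k} \Phi\left(\frac{x_{t+1-j} - x_{i-j}}{c_t^{x_j}}\right)} + \Delta u_{t+1}, \quad i = \overline{1, t}. \quad (9)$$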
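The idea of treating delayed outputs as extra observable inputs can be sketched in Python as follows. This is an illustrative reading of algorithm (9), not the authors' code: the triangular kernel, the single shared bandwidth `c_lag` for all delays, and the function name `control_estimate_dynamic` are assumptions made here. Each past time step `i` is compared with the present one both by its output and by the `k` outputs that preceded it.

```python
from typing import Optional, Sequence


def kernel(z: float) -> float:
    """Triangular kernel (an assumed choice): 1 - |z| for |z| < 1, else 0."""
    return max(0.0, 1.0 - abs(z))


def control_estimate_dynamic(
    x_ref: float,
    xs: Sequence[float],  # measured outputs x_0 .. x_t
    us: Sequence[float],  # applied controls u_0 .. u_t, aligned with xs
    k: int,               # memory depth of the object
    c_x: float,
    c_lag: float,
) -> Optional[float]:
    """Kernel-weighted average of past controls for an object with memory.

    Past step i is weighted by how close x_i is to the reference and how
    close its delayed outputs x_{i-1}..x_{i-k} are to the delayed outputs
    x_t..x_{t+1-k} seen by the next step. Returns None if all weights vanish.
    """
    t = len(xs) - 1
    num = 0.0
    den = 0.0
    for i in range(k, t + 1):
        w = kernel((x_ref - xs[i]) / c_x)
        for j in range(1, k + 1):
            w *= kernel((xs[t + 1 - j] - xs[i - j]) / c_lag)
        num += us[i] * w
        den += w
    return num / den if den > 0.0 else None


# Data generated by a toy object x_t = 0.5 * u_t + 0.5 * x_{t-1} with x_0 = 0.
xs = [0.0, 1.0, 1.0]
us = [0.0, 2.0, 1.0]
narrow = control_estimate_dynamic(1.0, xs, us, k=1, c_x=0.5, c_lag=0.5)
wide = control_estimate_dynamic(1.0, xs, us, k=1, c_x=0.5, c_lag=2.0)
```

With the narrow delay bandwidth only the step that started from the same previous output contributes. The wide bandwidth also mixes in a step that started from a different state, which pulls the estimate away from the control that is appropriate for the current state.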
A general theory of control for objects of this kind is presented in [2]. Implementations of the corresponding dual control algorithms can be found in [16; 17]. Below we focus on numerical experiments with the nonparametric dual control algorithm (9). In the experiments the object under control is replaced by either an inertia-free (memory-free) operator or a dynamic operator.
It should be noted that the control algorithm has no information about the equation (operator) of the object under control except for the type of the operator.
Numerical experiments. In the first batch of numerical experiments the object was simulated by the expression
$$x_{t+1} = u_{t+1} + \mu_{t+1},$$
where x(t) is the output variable, u(t) is the controllable input, and μ(t) is the observable but uncontrollable input, taken as the process $\mu_t = 0.5 + 0.3\sin(0.2t)$.
The control procedure starts with the first point $(x_1, u_1, \mu_1)$. Further data accumulation results in active learning of the control algorithm. As a consequence, the object can be driven to a reference state, or made to follow a reference trajectory, more effectively.
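A closed-loop run of this kind of experiment can be sketched in Python as follows. It is a simplified illustration under stated assumptions, not a reproduction of the reported experiments: the object is read as x = u + μ, the reference is held constant at 2.0, the initial control is zero, the kernel is triangular, the bandwidths and the search coefficient are fixed by hand, and the previous control is reused whenever no stored point lies inside the kernel windows. The controller only sees measured triples (x, u, μ), never the formula inside `plant`.

```python
import math
from typing import List, Optional, Tuple

Sample = Tuple[float, float, float]  # (x, u, mu) measured at one time step


def kernel(z: float) -> float:
    """Triangular kernel (an assumed choice)."""
    return max(0.0, 1.0 - abs(z))


def mu(t: int) -> float:
    """Observable but uncontrollable input."""
    return 0.5 + 0.3 * math.sin(0.2 * t)


def plant(u: float, mu_t: float) -> float:
    """Stand-in for the object; hidden from the controller."""
    return u + mu_t


def estimate(x_ref: float, mu_next: float, data: List[Sample],
             c_x: float, c_mu: float) -> Optional[float]:
    """Kernel-weighted average of stored controls, or None without neighbours."""
    num = 0.0
    den = 0.0
    for x_i, u_i, mu_i in data:
        w = kernel((x_ref - x_i) / c_x) * kernel((mu_next - mu_i) / c_mu)
        num += u_i * w
        den += w
    return num / den if den > 0.0 else None


def run(steps: int = 200, x_ref: float = 2.0, alpha: float = 0.5,
        c_x: float = 0.3, c_mu: float = 0.2) -> List[float]:
    """Simulate the loop and return the tracking errors x_ref - x_t."""
    u = 0.0
    x = plant(u, mu(1))
    data: List[Sample] = [(x, u, mu(1))]
    errors = [x_ref - x]
    for t in range(1, steps):
        mu_next = mu(t + 1)
        u_star = estimate(x_ref, mu_next, data, c_x, c_mu)
        base = u if u_star is None else u_star      # fallback: keep last control
        u = base + alpha * (x_ref - x)              # learned part + search step
        x = plant(u, mu_next)
        data.append((x, u, mu_next))
        errors.append(x_ref - x)
    return errors
```

Calling `run()` returns the list of tracking errors, which can be plotted or summarised to see how the error evolves as data accumulate. The bandwidths and `alpha` are hand-picked illustration values; other choices change the transient and the residual error.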

Fig. 2. A control scheme for an object with memory (dynamic object)
The operation of the nonparametric control algorithm is illustrated in fig. 3 and 4. An experiment was conducted to demonstrate the ability to follow a stepwise reference trajectory $x^*(t)$.
Fig. 4, a shows on an enlarged scale the reference $x^*(t)$ and the controlled output $x(t)$ depicted in fig. 4, b. The nonparametric dual control algorithm can solve the control problem effectively even at a noticeable noise level. The case of random noise with amplitude up to 3 % of the output value is presented in fig. 5.
Let the reference for the control process be given by the expression $x^* = 2 + \sin(0.1t)$. The corresponding control process is depicted in fig. 6.
Besides stepwise and continuous reference functions, other references can be constructed, including random ones. To illustrate the ability of the control algorithm (1) to follow a random reference, the following experiment was carried out (fig. 7). Here $x_t^*$ is defined as a succession of a sine function and a purely random signal uniformly distributed on the interval [0.5; 2.5].
Fig. 3. Uncontrollable effect μ(t)
Fig. 4. Control process for a stepwise reference value
Fig. 5. Control process in case of 3 % random noise applied to the object output
Fig. 6. Control process for continuous reference
The experiments show that the control algorithm is able to deal even with random references. In contrast, standard P, PI and PID controllers cannot reach this level of control quality, because they are not based on data accumulation and analysis. Moreover, their settling time is expected to be much longer.
The control process in fig. 7 is of satisfactory quality, which demonstrates the exceptional capability of the control algorithm (1). It should be noted that none of the existing controllers can reach the same level of control accuracy and speed.
Let us now consider another case, in which the object is represented by the dynamic operator (8). The equation is taken in the form of a first-order discrete operator:
$$x_t = f(x_{t-1}, u_t). \quad (12)$$
In particular, the linear first-order object is described by
$$x_t = P_1 u_t + P_2 x_{t-1}, \quad (13)$$
where $P_1$ and $P_2$ are finite constants.
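For orientation, equation (13) can be inverted by hand. Assuming $P_1 \neq 0$ (an assumption added here; for a linear object it is what the one-to-one requirement on the control amounts to), the control that places the output exactly on a reference value $x_t^*$ is:

```latex
u_t = \frac{x_t^* - P_2 \, x_{t-1}}{P_1}
```

The dual control algorithm does not know $P_1$ and $P_2$; it has to recover this dependence of $u_t$ on $x_t^*$ and $x_{t-1}$ from the accumulated measurements.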
Let us describe the peculiarities of the control procedure in this case. The learning process begins with a pair of measurements, namely $(x_0, u_0)$ and $(x_1, u_1)$. The initial phase of control is devoted to the data accumulation needed to bring the object to the target state. After that, the time needed to reach the target decreases considerably.
Let us demonstrate the operation of the algorithm (9). Let the control reference be a stepwise function. The control process for this reference is depicted in fig. 8.
Fig. 9 shows the operation of the algorithm when the reference is a combination of a sine function and a random function. Again, the control process can be characterised as highly effective.

Fig. 7. Control procedure for combined reference containing random noise
Fig. 8. Control process in case of stepwise reference
Fig. 9. A random reference test for dynamic system dual control
Thus, the algorithm (9) is able to control a dynamic object with memory, providing good quality due to data accumulation and proper model-based control synthesis.
Conclusion. The paper discusses the problem of controlling a dynamic system under nonparametric uncertainty. After re-designation of the object variables, this problem can be reformulated as control of a multidimensional inertia-free object. Bandwidth determination techniques for both controllable and uncontrollable inputs are proposed. Two variants of learning for the nonparametric control algorithm are discussed. The numerical experiments with the algorithm indicate that it can be used in various computer-aided systems of adaptive control. The key feature of the algorithm is its capability to control continuous production processes with discrete-time measurement equipment.
References
1. Feldbaum A. A. Fundamentals of the Theory of Optimal Automatic Systems. Moscow: Fizmatgiz, 1963. 552 p. (In Russ.)
2. Tsypkin Ya. Z. Adaptation and Learning in Automatic Systems. Moscow: Nauka, 1968. 400 p. (In Russ.)
3. Eykhoff P. Fundamentals of Control System Identification. Moscow: Mir, 1975. 683 p. (In Russ.)
4. Voronov A. A. Fundamentals of Automatic Control Theory. Part 1. Linear Single-Variable Control Systems. Moscow; Leningrad: Energiya, 1965. 396 p. (In Russ.)
5. Voronov A. A. Fundamentals of Automatic Control Theory. Part 2. Special Linear and Nonlinear Single-Variable Automatic Control Systems. Moscow; Leningrad: Energiya, 1966. 364 p. (In Russ.)
6. Methods of Classical and Modern Automatic Control Theory: textbook. In 5 vols. Vol. 3. Synthesis of Controllers for Automatic Control Systems. 2nd ed., rev. and enl. Moscow: Bauman MSTU, 2004. 616 p. (In Russ.)
7. Methods of Classical and Modern Automatic Control Theory. In 5 vols. Vol. 4. Theory of Optimization of Automatic Control Systems. 2nd ed., rev. and enl. Moscow: Bauman MSTU, 2004. 742 p. (In Russ.)
8. Medvedev A. V. Adaptation under nonparametric uncertainty. In: Adaptive Systems and Their Applications. Novosibirsk: Nauka, Siberian Branch of the USSR Academy of Sciences, 1978. P. 4-34. (In Russ.)
9. Nadaraya E. A. Nonparametric Estimation of Probability Density and Regression Curve. Tbilisi: Tbilisi University Press, 1983. (In Russ.)
10. Medvedev A. V. Fundamentals of the Theory of Adaptive Systems. Krasnoyarsk, 2015. 526 p. (In Russ.)
11. Vasiliev V. A., Dobrovidov A. V., Koshkin G. M. Nonparametric Estimation of Functionals of Distributions of Stationary Sequences. Moscow: Nauka, 2004. (In Russ.)
12. Bannikova A. V., Medvedev A. V. On nonparametric control algorithms for a dynamic system // Problems of Control and Modeling in Complex Systems: Proc. XVI Int. Conf. Samara, 2014. P. 15-21. (In Russ.)
13. Medvedev A. V. Theory of nonparametric systems. Control-I // Vestnik SibGAU. 2013. Iss. 2 (48). P. 57-63. (In Russ.)
14. Medvedev A. V. Theory of nonparametric systems. Control-II // Vestnik SibGAU. 2013. Iss. 3 (49). P. 85-90. (In Russ.)
15. Medvedev A. V. Elements of the theory of nonparametric control systems // Topical Problems of Informatics, Applied Mathematics and Mechanics. In 3 parts. Part 3. Informatics. Novosibirsk; Krasnoyarsk: SB RAS Publ., 1996. P. 87-112. (In Russ.)
16. Wenk C. J., Bar-Shalom Y. A multiple model adaptive dual control algorithm for stochastic systems with unknown parameters // IEEE Transactions on Automatic Control. 2003. Vol. 25, iss. 4. P. 703-710.
17. Tse E., Bar-Shalom Y. An actively adaptive control for linear systems with random parameters via the dual control approach // IEEE Transactions on Automatic Control. 2003. Vol. 18, iss. 2. P. 109-117.