
Dantzig method. The transport-type problem is a special case of a linear programming problem

1. Which statements are incorrect? The Dantzig method

Answer: can be classified as a gradient method

2. Which of the following statements are true:

Answer: An LP problem with an inconsistent system of constraints is called open

3. Which of the listed methods are not active

Answer: golden section

4. Which of the following statements are true:

Answer: the transport-type problem is a special case of the linear programming problem

5. Which of the following statements are true: the least squares method

Answer: ultimately comes down to solving a system of n linear equations when approximating the results by an nth-order polynomial

6. Which of the following methods are not gradient methods

Answer: simplex method (Nelder-Mead method)

7. Which of these methods allow you to find the global extremum of a multimodal function

Answer: scanning

8. Which methods among those listed are coordinate search methods

Answer: tangent

9. Check the correct statements

Answer: the brute force method cannot be used when finding an extremum according to the Gauss-Seidel procedure

10. State the true statement

Answer: any feasible solution of the problem is called a plan

11. State the incorrect statement

Answer: a plane containing at least one corner point of a convex polyhedron is called a reference plane of this polyhedron

12. Indicate the numbers of the correct statements

Answer: transport-type problems cannot be solved by the Dantzig method, since they belong to discrete programming problems (1). The initial plan in the simplex method is obtained when all basic variables are equal to zero (3)

13. Identify the correct statement:

Answer: the basic solution to the LP problem is degenerate if at least one of the free variables is equal to zero

14. Which of the following is incorrect:

Answer: any point on a line is a convex linear combination of two points through which this line is drawn

15. Which of the statements below is true?

Answer: The traveling salesman problem belongs to the field of discrete programming.

16. Which of the following is true:

Answer: one of the main optimization problems is the “dimensionality problem”

17. What is incorrect in the above statements?

Answer: If the objective function of a LP problem reaches an extremum at several points, then it reaches the same value at any point that is a convex linear combination of these points.

18. Which of the following statements is incorrect?

Answer: The LP problem can be solved by a procedure for orderly transition from one plan to another.

19. Which of the following is true?

Answer: there cannot be an extremum within the region of feasible solutions to the LP problem

20. Which of the following is false?

Answer: To find the extremum of a linear objective function by the simplex method, it is necessary to perform n − m iterations, where n is the number of unknowns of the problem and m is the number of general constraints

Gradient methods of searching for the optimum of the objective function are based on the use of two basic properties of the gradient of a function.

1. The gradient of a function is a vector that, at each point of the domain of definition of the function, is directed along the normal to the level surface drawn through this point.

2. The projections of the gradient onto the coordinate axes are equal to the partial derivatives of the function with respect to the corresponding variables, i.e.

grad f(X) = (∂f/∂x₁, ∂f/∂x₂, …, ∂f/∂xₙ). (2.4)

Gradient methods include: relaxation method, gradient method, steepest descent method and a number of others.

Let's look at some of the gradient methods.

Gradient method

In this method, descent is carried out in the direction of the fastest change in the objective function, which naturally speeds up the process of finding the optimum.

The search for the optimum is carried out in two stages. At the first stage, the values of the partial derivatives with respect to all independent variables are found, which determine the direction of the gradient at the point in question. At the second stage, a step is taken in the direction opposite to the direction of the gradient (when searching for the minimum of the objective function).

When a step is executed, the values of all independent variables change simultaneously. Each of them receives an increment proportional to the corresponding component of the gradient along the given axis.

The algorithm can be written in formula form as

x_i^(k+1) = x_i^(k) − h·∂f(X^(k))/∂x_i, i = 1, 2, …, n. (2.5)

With this rule, at a constant value of the parameter h, the step size h·|∂f/∂x_i| changes automatically as the gradient value changes and decreases as the search approaches the optimum.

Another formula for the algorithm is

X^(k+1) = X^(k) − h·grad f(X^(k))/|grad f(X^(k))|. (2.6)

This algorithm uses a normalized gradient vector that indicates only the direction of the fastest change in the objective function, but does not indicate the rate of change in this direction.
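As an illustration, the two update rules can be sketched in a few lines of Python (the test function, point, and step value below are illustrative assumptions, not taken from the text):

```python
import math

def grad_step(x, grad, h):
    # Rule (2.5): each variable gets an increment proportional to the
    # corresponding gradient component, so the actual displacement
    # h*|grad f| shrinks automatically as the gradient vanishes.
    return [xi - h * gi for xi, gi in zip(x, grad)]

def grad_step_normalized(x, grad, h):
    # Rule (2.6): move exactly the distance h along the unit antigradient;
    # only the direction of fastest change is used, not its magnitude.
    norm = math.sqrt(sum(gi * gi for gi in grad))
    return [xi - h * gi / norm for xi, gi in zip(x, grad)]
```

For f(x) = x₁² + x₂² at the point (3, 4) the gradient is (6, 8): rule (2.5) with h = 0.1 moves to (2.4, 3.2), while rule (2.6) moves a fixed distance 0.1, to (2.94, 3.92).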

The step-change strategy in this case uses the fact that the gradients grad f(X^(k)) and grad f(X^(k+1)) differ in direction. The search step is changed in accordance with rule (2.7): the step is increased while the rotation angle of the gradient remains below its lower permissible limit and is decreased when it exceeds the upper one. Here the angle of rotation of the gradient at the k-th step, φ_k, is determined by the expression

cos φ_k = (grad f(X^(k)), grad f(X^(k+1))) / (|grad f(X^(k))|·|grad f(X^(k+1))|),

and φ₁, φ₂ are the permissible limits of the gradient rotation angle.
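Since the displayed rule itself is lost in the source, the sketch below implements one plausible reading of (2.7): enlarge the step while the gradient barely turns, shrink it when the turn exceeds the permissible limit. The thresholds and scale factors are illustrative assumptions:

```python
import math

def turn_angle(g_prev, g_curr):
    # Angle between successive gradients via the dot-product formula.
    dot = sum(a * b for a, b in zip(g_prev, g_curr))
    na = math.sqrt(sum(a * a for a in g_prev))
    nb = math.sqrt(sum(b * b for b in g_curr))
    return math.acos(max(-1.0, min(1.0, dot / (na * nb))))

def adapt_step(h, angle, phi1=math.pi / 6, phi2=math.pi / 3,
               grow=1.5, shrink=0.5):
    # Small turn: the search is on course, take larger steps.
    if angle < phi1:
        return h * grow
    # Large turn: the search overshot, cut the step.
    if angle > phi2:
        return h * shrink
    return h
```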

The nature of the search for the optimum in the gradient method is shown in Fig. 2.1.

The moment of ending the search can be found by checking at each step the relation

|grad f(X^(k))| ≤ ε,

where ε is the specified calculation error.

Fig. 2.1. The nature of the movement towards the optimum in the gradient method with a large step size

The disadvantage of the gradient method is that it can detect only a local minimum of the objective function. To find other local minima of the function, the search must be repeated from other starting points.

Another disadvantage of this method is the significant volume of calculation: at each step, the values of all partial derivatives of the function being optimized with respect to all independent variables must be determined.
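Both points can be seen in a compact sketch: each step costs one full set of partial derivatives, and which minimum is found depends on the starting point. The test function and all parameter values are illustrative assumptions:

```python
def numeric_grad(f, x, eps=1e-6):
    # All n partial derivatives per step -- the computational cost the
    # text mentions -- estimated by central differences.
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((f(xp) - f(xm)) / (2 * eps))
    return g

def gradient_method(f, x, h=0.05, tol=1e-6, max_iter=20000):
    # Descend against the gradient; stop when |grad f| <= tol.
    for _ in range(max_iter):
        g = numeric_grad(f, x)
        if sum(gi * gi for gi in g) ** 0.5 <= tol:
            break
        x = [xi - h * gi for xi, gi in zip(x, g)]
    return x

# A multimodal function with minima at x = -1 and x = +1: which one is
# found depends entirely on the starting point.
def f(x):
    return (x[0] ** 2 - 1) ** 2
```

Starting from 0.5 the method converges to the minimum at +1; starting from −0.5 it converges to the one at −1, illustrating why restarts from other points are needed.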

Method of steepest descent

When applying the gradient method, at each step it is necessary to determine the values ​​of the partial derivatives of the function being optimized with respect to all independent variables. If the number of independent variables is significant, then the volume of calculations increases significantly and the time required to find the optimum increases.

Reducing the amount of computation can be achieved using the steepest descent method.

The essence of the method is as follows. After the gradient of the optimized function has been found at the initial point and the direction of its fastest decrease at the specified point has been determined, a descent step is taken in this direction (Fig. 2.2).

If the value of the function decreases as a result of this step, another step is taken in the same direction, and so on until a minimum is found in this direction, after which the gradient is calculated and a new direction of the fastest decrease of the objective function is determined.

Fig. 2.2. The nature of the movement towards the optimum in the steepest descent method (—) and the gradient method (····)

Compared to the gradient method, the steepest descent method is more advantageous due to the reduction in the amount of calculations.

An important feature of the steepest descent method is that, when it is applied, each new direction of movement towards the optimum is orthogonal to the previous one. This is explained by the fact that movement in a given direction continues until that direction becomes tangent to a level line.
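This orthogonality is easy to verify numerically. The sketch below, with an assumed elongated quadratic test function and a golden-section line search, checks that successive gradients are perpendicular:

```python
import math

def f(x):
    # Elongated quadratic: level lines are stretched ellipses.
    return x[0] ** 2 + 10 * x[1] ** 2

def grad_f(x):
    return [2 * x[0], 20 * x[1]]

def argmin_1d(phi, lo=0.0, hi=1.0, tol=1e-12):
    # Golden-section search for the step minimizing f along the ray.
    r = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c, d = b - r * (b - a), a + r * (b - a)
        if phi(c) < phi(d):
            b = d
        else:
            a = c
    return (a + b) / 2

def steepest_descent(x, steps):
    # Returns the final point and the gradient at each visited point.
    grads = []
    for _ in range(steps):
        g = grad_f(x)
        grads.append(g)
        alpha = argmin_1d(lambda a: f([x[0] - a * g[0], x[1] - a * g[1]]))
        x = [x[0] - alpha * g[0], x[1] - alpha * g[1]]
    return x, grads
```

Because each one-dimensional minimum is found where the ray touches a level line, the dot product of consecutive gradients comes out (numerically) zero.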

The same condition as in the method discussed above can be used as a criterion for ending the search.

In addition, the search termination condition can be taken in the form of the relation

|X_e − X_s| ≤ ε,

where X_s and X_e are the coordinates of the starting and ending points of the last segment of descent. The same criterion can be used in combination with monitoring of the objective function values at those points:

|f(X_e) − f(X_s)| ≤ ε.

The joint application of search termination conditions is justified in cases where the function being optimized has a clearly defined minimum.

Fig. 2.3. Determining the moment of search termination in the steepest descent method

As a strategy for changing the descent step, the methods outlined above (2.7) can be used.

Let us consider the problem of unconstrained minimization of a differentiable function of several variables. Let x^(k) be an approximation to the minimum point. As already noted above, in a small neighborhood of a point the direction of fastest decrease of the function is given by the antigradient; this property is used substantially in a number of minimization methods. In the gradient method considered below, the antigradient is chosen directly as the direction of descent from the point x^(k). Thus, the gradient method has the form

x^(k+1) = x^(k) − α_k f′(x^(k)), α_k > 0. (10.22)

There exist various ways of choosing the step α_k, each of which specifies a particular variant of the gradient method.

1. Method of steepest descent.

Let us consider f(x^(k) − α f′(x^(k))) as a function of the single scalar variable α and choose as α_k the value for which the equality

f(x^(k) − α_k f′(x^(k))) = min over α ≥ 0 of f(x^(k) − α f′(x^(k))) (10.23)

holds. This method, proposed in 1847 by O. Cauchy, is now commonly called the steepest descent method.

Fig. 10.5 shows a geometric illustration of this method for minimizing a function of two variables. From the starting point, descent is carried out perpendicular to the level line, and it continues until the minimum value of the function along the ray is reached. At the point found, this ray touches a level line. Then descent is carried out from that point in the direction perpendicular to the new level line, until the corresponding ray touches the level line passing through the next point, and so on.

Note that at each iteration the choice of the step involves solving the one-dimensional minimization problem (10.23). Sometimes this operation can be performed analytically, for example for a quadratic function.

Let us apply the steepest descent method to minimize the quadratic function

f(x) = (Ax, x)/2 − (b, x) (10.24)

with a symmetric positive definite matrix A.

According to formula (10.8), in this case f′(x) = Ax − b, and therefore formula (10.22) looks like this:

x^(k+1) = x^(k) − α_k(Ax^(k) − b). (10.25)

Notice that

f(x^(k) − α(Ax^(k) − b))

is a quadratic function of the parameter α and reaches a minimum at the value α_k for which its derivative with respect to α vanishes. Thus, as applied to minimizing the quadratic function (10.24), the steepest descent method is equivalent to calculation by formula (10.25), where

α_k = (r_k, r_k)/(Ar_k, r_k), r_k = Ax^(k) − b. (10.26)

Remark 1. Since the minimum point of function (10.24) coincides with the solution of the system Ax = b, the steepest descent method (10.25), (10.26) can also be used as an iterative method for solving systems of linear algebraic equations with symmetric positive definite matrices.
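Since the displayed formulas above are partly lost, the sketch below assumes the standard forms f(x) = (Ax, x)/2 − (b, x), residual r_k = Ax^(k) − b, and step α_k = (r_k, r_k)/(Ar_k, r_k); it uses the method exactly as Remark 1 suggests, as an iterative solver for Ax = b:

```python
def matvec(A, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def steepest_descent_spd(A, b, x, iters=200):
    # Minimizes f(x) = (Ax, x)/2 - (b, x); the minimizer solves Ax = b.
    for _ in range(iters):
        r = [ai - bi for ai, bi in zip(matvec(A, x), b)]  # r = Ax - b
        rr = dot(r, r)
        if rr == 0.0:
            break
        alpha = rr / dot(matvec(A, r), r)  # exact 1-D minimizer (10.26)
        x = [xi - alpha * ri for xi, ri in zip(x, r)]
    return x
```

For A = diag(2, 4) and b = (2, 8) the iterates approach the solution (1, 2) of the linear system.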

Remark 2. Note that α_k = 1/R(r_k), where R is the Rayleigh quotient (see § 8.1).

Example 10.1. Let us apply the steepest descent method to minimize a quadratic function whose minimum point we know exactly in advance. We write this function in the form (10.24), with the corresponding matrix A and vector b.

Let us take an initial approximation and carry out the calculations using formulas (10.25), (10.26).

I iteration.

II iteration.

It can be shown that analogous values are obtained at all subsequent iterations.

Thus, the sequence obtained by the steepest descent method converges with the speed of a geometric progression.

Fig. 10.5 shows exactly the descent trajectory that was obtained in this example.

For the case of minimizing a quadratic function, the following general result holds.

Theorem 10.1. Let A be a symmetric positive definite matrix, and let the quadratic function (10.24) be minimized. Then for any choice of the initial approximation x^(0) the steepest descent method (10.25), (10.26) converges, and the following error estimate holds:

‖x^(k) − x̄‖_A ≤ q^k ‖x^(0) − x̄‖_A, q = (M − m)/(M + m), (10.27)

where ‖x‖_A = √(Ax, x).

Here m and M are the minimum and maximum eigenvalues of the matrix A.

Note that this method converges with the speed of a geometric progression whose denominator is q = (M − m)/(M + m): if m and M are close, then q is small and the method converges quite quickly, as in Example 10.1. If, however, M ≫ m, then q is close to 1, and slow convergence of the steepest descent method should be expected.
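The effect of the denominator q = (M − m)/(M + m) can be observed directly on a diagonal quadratic whose Hessian eigenvalues are m and M (the starting point and eigenvalues below are illustrative assumptions):

```python
def sd_errors(m, M, steps=20):
    # Exact-line-search steepest descent on f(x) = (m*x1^2 + M*x2^2)/2,
    # whose Hessian eigenvalues are m and M; the minimum is the origin.
    x = [1.0, 1.0]
    errs = []
    for _ in range(steps):
        g = [m * x[0], M * x[1]]
        # Exact step (10.26) for this diagonal matrix.
        alpha = (g[0] ** 2 + g[1] ** 2) / (m * g[0] ** 2 + M * g[1] ** 2)
        x = [x[0] - alpha * g[0], x[1] - alpha * g[1]]
        errs.append(max(abs(x[0]), abs(x[1])))
    return errs
```

For m = 1, M = 3 the denominator is q = 0.5, and the recorded errors indeed decay at least as fast as the geometric progression with denominator q (checked over two-step windows, since the max-norm error oscillates within a step).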

Example 10.2. Application of the steepest descent method to the minimization of another quadratic function, with the given initial approximation, produces a sequence of approximations whose descent trajectory is shown in Fig. 10.6.

The sequence here converges with the speed of a geometric progression, but significantly more slowly than in the previous example. The result obtained agrees well with estimate (10.27).

Remark 1. We formulated the theorem on convergence of the steepest descent method for the case when the objective function is quadratic. In the general case, if the function being minimized is strictly convex and has a minimum point x̄, then, regardless of the choice of the initial approximation, the sequence obtained by this method also converges to x̄. In this case, after entering a sufficiently small neighborhood of the minimum point, the convergence becomes linear, and the denominator of the corresponding geometric progression is estimated from above by the quantity q = (M − m)/(M + m), where m and M are now the minimum and maximum eigenvalues of the Hessian matrix.

Remark 2. For the quadratic objective function (10.24), the solution of the one-dimensional minimization problem (10.23) can be found in the form of the simple explicit formula (10.26). However, for most other nonlinear functions this cannot be done, and to implement the steepest descent method one has to apply numerical methods of one-dimensional minimization of the kind discussed in the previous chapter.

2. The problem of "ravines".

From the above discussion it follows that the gradient method converges quite quickly if the level surfaces of the function being minimized are close to spheres (in the two-dimensional case, if the level lines are close to circles); for such functions the ratio M/m is close to 1. Theorem 10.1, Remark 1, and the result of Example 10.2 indicate that the rate of convergence drops sharply as M/m grows. Indeed, it is known that the gradient method converges very slowly if the level surfaces of the function being minimized are strongly elongated in some directions. In the two-dimensional case the relief of the corresponding surface resembles terrain with a ravine (Fig. 10.7). Therefore such functions are usually called gully (ravine) functions. Along the directions characterizing the “bottom of the ravine” the gully function changes only slightly, while in the directions characterizing the “slope of the ravine” a sharp change of the function occurs.

If the starting point falls on a “slope of the ravine”, then the direction of gradient descent turns out to be almost perpendicular to the “bottom of the ravine”, and the next approximation falls on the opposite “slope”. The next step towards the “bottom” returns the approximation to the original “slope”. As a result, instead of moving along the “bottom of the ravine” towards the minimum point, the descent trajectory makes zigzag jumps across the “ravine”, hardly approaching the goal at all (Fig. 10.7).

To speed up the convergence of the gradient method when minimizing gully functions, a number of special “gully” methods have been developed. Let us give an idea of one of the simplest techniques. From two close starting points, gradient descent is carried out to the “bottom of the ravine”. A straight line is drawn through the points found, and a large “gully” step is taken along it (Fig. 10.8). From the point found in this way, one step of gradient descent is again taken, and then a second “gully” step is taken along the straight line passing through the two new points. As a result, movement along the “bottom of the ravine” towards the minimum point is significantly accelerated.
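A minimal sketch of this two-point technique on an assumed ravine-shaped quadratic (all constants are illustrative):

```python
def f_ravine(x):
    # Nearly flat along the bottom (x1), extremely steep across it (x2).
    return 0.0001 * x[0] ** 2 + 100.0 * x[1] ** 2

def grad_ravine(x):
    return [0.0002 * x[0], 200.0 * x[1]]

def descend(x, h=0.004, steps=200):
    # Plain gradient descent: drops quickly to the "bottom of the ravine"
    # but barely moves along it.
    for _ in range(steps):
        g = grad_ravine(x)
        x = [x[0] - h * g[0], x[1] - h * g[1]]
    return x

def gully_step(p, q, length):
    # Large step from q along the line through the two bottom points.
    d = [q[0] - p[0], q[1] - p[1]]
    n = (d[0] ** 2 + d[1] ** 2) ** 0.5
    return [q[0] + length * d[0] / n, q[1] + length * d[1] / n]
```

Two descents from the nearby points (10, 1) and (9, 1) land on the bottom; the large step along the line through them then advances far towards the minimum at the origin, which ordinary descent would reach only very slowly.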

More detailed information about the problem of “ravines” and “gully” methods can be found in the literature.

3. Other approaches to determining the descent step.

As is easy to understand, at each iteration it would be desirable to choose a direction of descent close to the direction along which movement leads from the point x^(k) to the minimum point x̄. Unfortunately, the antigradient is, as a rule, an unsuccessful direction of descent; this is especially pronounced for gully functions. Therefore, doubt arises about the advisability of a thorough search for the solution of the one-dimensional minimization problem (10.23), and there is a desire to take, in the chosen direction, only a step that would ensure a “significant decrease” of the function. Moreover, in practice one is sometimes content with determining a value that simply ensures a decrease in the value of the objective function.

Gauss-Seidel method

The method consists in alternately finding partial extrema of the objective function with respect to each factor: at each stage, (k − 1) factors are held fixed and only one, the i-th, factor is varied.

Calculation procedure: in a local region of the factor space, on the basis of preliminary experiments, a point corresponding to the best result of the process is selected, and from it movement towards the optimum begins. The step of movement for each factor is set by the researcher. First all factors are fixed at the same level and one factor is changed until the increase (decrease) in the response function (Y) stops; then another factor is changed while the others are held fixed, and so on, until the desired result (Y) is obtained. The main thing is to choose the step of movement for each factor correctly.

This method is the simplest and most intuitive, but movement towards the optimum takes a long time and the method rarely leads to the optimal point. At present it is sometimes used in machine experiments.
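A simplified sketch of this coordinate-wise search: one trial step per factor per pass, with the step halved when no factor improves the response. The objective function and step value are illustrative assumptions:

```python
def coordinate_search(f, x, step, tol=1e-6):
    # Vary one factor at a time while the others are held fixed,
    # shrinking the step when no coordinate move improves f.
    h = step
    while h > tol:
        improved = False
        for i in range(len(x)):
            for s in (h, -h):
                trial = list(x)
                trial[i] += s
                if f(trial) < f(x):
                    x = trial
                    improved = True
                    break
        if not improved:
            h /= 2
    return x
```

On a smooth bowl-shaped function this slowly works its way to the minimum one axis at a time, which is exactly the staircase-like trajectory the text describes.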

These methods ensure movement towards the optimum along a straight line perpendicular to the lines of equal response, i.e. in the direction of the gradient of the response function.

Gradient methods have several varieties, differing in the rules for choosing the stages of variation and working steps at each stage of movement towards the extremum.

The essence of all the methods is as follows: initially, a base point is selected on the basis of preliminary experiments. Then, at each stage, trial experiments are organized around the current base point, from whose results a new estimate of the gradient direction is obtained, after which one working step is taken in that direction.

The gradient method (usual) is carried out according to the following scheme:

a) select a base point;

b) select movement steps for each factor;

c) determine the coordinates of the test points;

d) conduct experiments at the trial points; as a result, the values of the optimization parameter (Y) at each point are obtained.

e) based on the results of the experiments, estimates of the components of the gradient vector at the point M are calculated for each i-th factor as the ratio of the observed change in the optimization parameter to the corresponding step:

∂Y/∂X_i ≈ ΔY_i / H_i,

where H_i is the step of movement along X_i, and X_i are the coordinates of the previous operating point.

g) the coordinates of this operating point are taken as a new base point, around which experiments at trial points are again carried out; the gradient is calculated, and so on, until the desired optimization parameter (Y) is reached. The direction of movement is corrected after each working step.
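The core of steps d)–g) can be sketched as follows, with a simulated response function standing in for the real experiments (the response and step values are illustrative assumptions):

```python
def estimate_gradient(response, base, steps):
    # Trial experiments around the base point M: perturb each factor by
    # its own step H_i and form a difference quotient.
    y0 = response(base)
    grad = []
    for i, h in enumerate(steps):
        trial = list(base)
        trial[i] += h
        grad.append((response(trial) - y0) / h)
    return grad

def working_step(base, grad, steps, maximize=True):
    # One working step in the estimated gradient direction (or against
    # it when the minimum of Y is sought).
    s = 1.0 if maximize else -1.0
    return [b + s * h * g for b, h, g in zip(base, steps, grad)]
```

With a simulated response peaking at (2, 3), the estimated gradient at the origin points towards the peak, and the working step improves Y.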

Advantages of the method: simplicity, higher speed of movement towards the optimum.

Disadvantages: high sensitivity to interference. If the response curve has a complex shape, the method may not lead to the optimum. If the response curve is flat, the method is ineffective. The method does not provide information about the interaction of factors.

a) Steep ascent method (Box - Wilson).

b) Making decisions after a steep climb.

c) Simplex optimization method.

d) Advantages and disadvantages of methods.

5.7.3 Steep ascent method (Box-Wilson)

This method is a synthesis of the best features of gradient methods, the Gauss-Seidel method, and the PFE and DFE methods as means of obtaining a mathematical model of the process. The optimization problem is solved by this method so that stepwise movement is carried out in the direction of the fastest increase (or decrease) of the optimization parameter. Unlike in gradient methods, the direction of movement is corrected not after each step, but upon reaching a partial extremum of the objective function. Next, at the point of the partial extremum, a new factorial experiment is carried out, a new mathematical model is determined, and the steep ascent is repeated until the global optimum is reached. Movement along the gradient begins from the zero point (the center of the plan).

The steep ascent method involves moving towards the optimum along the gradient

grad Y = i·∂Y/∂X₁ + j·∂Y/∂X₂ + k·∂Y/∂X₃,

where i, j, k are unit vectors in the directions of the corresponding coordinate axes.

Calculation procedure.

The initial data is a mathematical model of the process obtained by any method (PFE, DFE, etc.).

Calculations are carried out in the following order:

a) it is best to translate the regression equation into natural form using the variable coding formulas

x_i = (X_i − X_iC)/l_i,

where x_i is the coded value of the variable; X_i is the natural value of the variable; X_iC is the central level of the factor in natural form; l_i is the interval of variation of the factor X_i in natural form.

b) calculate the steps of movement towards the optimum for each factor.

To do this, the products of the regression equation coefficients in natural form and the corresponding variation intervals are calculated:

B_i · l_i.

Then the product of maximum modulus is selected, and the factor corresponding to it is taken as the base factor (B_a·l_a). For the base factor a movement step l_a′ should be set, which is recommended to be smaller than or equal to the variation interval of the base factor:

l_a′ ≤ l_a.

The sign of the movement step l_a′ must coincide with the sign of the regression equation coefficient corresponding to the base factor (B_a). The steps for the other factors are calculated in proportion to the base one by the formula

l_i′ = l_a′ · (B_i·l_i)/(B_a·l_a).
The signs of the movement steps must also coincide with the signs of the corresponding coefficients of the regression equation.

c) the response function is calculated at the center of the plan, i.e., for factor values equal to the central levels of the factors, since movement towards the optimum begins from the center of the plan.

Next, the optimization parameter is calculated, increasing the values of the factors by the corresponding movement step if Y max is to be obtained. Otherwise, if it is necessary to obtain Y min, the values of the factors are reduced by the movement step.

The procedure is repeated, successively increasing the number of steps, until the desired value of the optimization parameter (Y) is reached. After g steps each of the factors will have the value:

X_i = X_iC + g·l_i′ if Y → max,

X_i = X_iC − g·l_i′ if Y → min. (5.36)
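The step calculation of item b) can be sketched as follows; the proportionality rule l_i′ = l_a′·(B_i·l_i)/(B_a·l_a) is the assumed reading of the formula lost in the source, and the coefficients below are illustrative:

```python
def steep_ascent_steps(b, intervals, base_step=None):
    # b[i]: regression coefficients B_i in natural form;
    # intervals[i]: variation intervals l_i of the factors.
    products = [bi * li for bi, li in zip(b, intervals)]
    # The factor with the largest |B_i * l_i| becomes the base factor a.
    a = max(range(len(products)), key=lambda i: abs(products[i]))
    # The base step defaults to the variation interval of the base
    # factor and must carry the sign of the coefficient B_a.
    la = intervals[a] if base_step is None else base_step
    la = abs(la) if products[a] > 0 else -abs(la)
    # Assumed proportionality rule: l_i' = la * (B_i*l_i)/(B_a*l_a).
    return [la * p / products[a] for p in products]
```

Note that every returned step automatically carries the sign of its own coefficient B_i, matching the sign rule stated above.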

The gradient method and its varieties are among the most common methods of searching for the extremum of functions of several variables. The idea of the gradient method is, in the process of searching for an extremum (say, a maximum), to move each time in the direction of the greatest increase of the objective function.

The gradient method involves calculating the first derivatives of the objective function with respect to its arguments. Like the previous ones, it is an approximate method and, as a rule, allows one not to reach the optimum point exactly, but only to approach it in a finite number of steps.

Fig. 4.11.

Fig. 4.12.

(two-dimensional case)

First, a starting point is selected. If in the one-dimensional case (see subsection 4.2.6) it was possible to move only to the left or to the right (see Fig. 4.9), then in the multidimensional case the number of possible directions of movement is infinitely large. Fig. 4.11, illustrating the case of two variables, shows arrows emanating from the starting point A in various possible directions. Movement along some of them gives an increase in the value of the objective function relative to the point A (for example, directions 1-3), while movement in other directions leads to its decrease (directions 5-8). Since the position of the optimum point is unknown, the direction in which the objective function increases fastest is considered the best. This direction is called the gradient of the function. Note that at each point of the coordinate plane the direction of the gradient is perpendicular to the tangent to the level line drawn through the same point.

In mathematical analysis it is proven that the components of the gradient vector of the function y = f(x₁, x₂, …, x_n) are its partial derivatives with respect to the arguments, i.e.

grad f(x₁, x₂, …, x_n) = (∂y/∂x₁, ∂y/∂x₂, …, ∂y/∂x_n). (4.20)

Thus, when searching for a maximum by the gradient method, at the first iteration the gradient components are calculated by formulas (4.20) for the starting point, and a working step is taken in the direction found, i.e., a transition is made to the new point x^(1) with coordinates

x_i^(1) = x_i^(0) + λ·∂f(x^(0))/∂x_i, (4.21)

or, in vector form,

x^(1) = x^(0) + λ·grad f(x^(0)),

where λ is a constant or variable parameter determining the length of the working step, λ > 0. At the second iteration the gradient vector is calculated again, now for the new point x^(1), after which, by the analogous formula, a transition is made to the point x^(2), and so on (Fig. 4.12). For an arbitrary k-th iteration we have

x^(k+1) = x^(k) + λ·grad f(x^(k)). (4.22)

If not the maximum but the minimum of the objective function is sought, then at each iteration a step is taken in the direction opposite to the direction of the gradient. This is called the direction of the antigradient. Instead of formula (4.22) we then have

x^(k+1) = x^(k) − λ·grad f(x^(k)).

There are many varieties of the gradient method, differing in the choice of the working step. One can, for example, pass to each subsequent point with a constant value of λ; then the length of the working step, i.e., the distance between adjacent points x^(k) and x^(k+1), will be proportional to the magnitude of the gradient vector. One can, on the contrary, at each iteration choose λ so that the length of the working step remains constant.

Example. We need to find the maximum of the function

y = 110 − 2(x₁ − 4)² − 3(x₂ − 5)².

Of course, using the necessary condition of an extremum, we immediately obtain the desired solution: x₁ = 4; x₂ = 5. However, this simple example conveniently demonstrates the gradient method algorithm. Let us calculate the gradient of the objective function:

grad y = (∂y/∂x₁; ∂y/∂x₂) = (4(4 − x₁); 6(5 − x₂))

and select the starting point

x^(0) = (x₁^(0) = 0; x₂^(0) = 0).

The value of the objective function at this point, as is easily calculated, is y(x^(0)) = 3. Let λ = const = 0.1. The value of the gradient at the point x^(0) is grad y(x^(0)) = (16; 30). Then at the first iteration we obtain, by formulas (4.21), the coordinates of the point

x₁^(1) = 0 + 0.1·16 = 1.6; x₂^(1) = 0 + 0.1·30 = 3;

y(x^(1)) = 110 − 2(1.6 − 4)² − 3(3 − 5)² = 86.48.

As can be seen, this is significantly larger than the previous value. At the second iteration we have, by formulas (4.22):

x₁^(2) = 1.6 + 0.1·4(4 − 1.6) = 2.56;
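The arithmetic of this example is easy to check with a short script (which also completes the second iteration: x₂^(2) = 3 + 0.1·6(5 − 3) = 4.2):

```python
def y(x1, x2):
    # Objective function of the example, maximum at (4, 5).
    return 110 - 2 * (x1 - 4) ** 2 - 3 * (x2 - 5) ** 2

def grad_y(x1, x2):
    # Gradient components from formula (4.20).
    return (4 * (4 - x1), 6 * (5 - x2))

lam = 0.1  # constant working-step parameter lambda
pts = [(0.0, 0.0)]  # starting point x^(0)
for _ in range(2):
    g = grad_y(*pts[-1])
    # Maximum search: step along the gradient, formula (4.22).
    pts.append((pts[-1][0] + lam * g[0], pts[-1][1] + lam * g[1]))
```

Running it reproduces the values above: y(x^(0)) = 3, x^(1) = (1.6; 3), y(x^(1)) = 86.48, and x^(2) = (2.56; 4.2).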