
Central limit theorem for dummies

In addition to the theorems relating to the law of large numbers, there is another group of theorems that together form the so-called central limit theorem. This group of theorems determines the conditions under which the normal distribution law arises. Such conditions are encountered quite often in practice, which explains why the normal law is the one most frequently observed in random phenomena. The various forms of the central limit theorem differ in the conditions imposed on the sum of the random variables under consideration. The most important place among all these forms belongs to Lyapunov's theorem.

Lyapunov's theorem. If X1, X2, …, Xn are independent random variables with finite mathematical expectations and variances, and none of them differs sharply in value from all the others, i.e. each has a negligible effect on their sum, then as the number of random variables n increases without bound, the distribution law of their sum approaches the normal law.

Corollary. If all the random variables X1, X2, …, Xn are identically distributed, then, as the number of terms increases without bound, the distribution law of their sum approaches the normal law.

Lyapunov's theorem is of great practical importance. It has been found empirically that the approach to the normal law is quite fast: under the conditions of Lyapunov's theorem, the distribution law of a sum of as few as ten terms can already be considered approximately normal.
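This empirical observation is easy to check by simulation. The sketch below is only an illustration and assumes NumPy and SciPy are available; the choice of ten Uniform(0, 1) summands and the sample size are arbitrary, not taken from the text.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n_terms, n_samples = 10, 100_000

# Sum of 10 independent Uniform(0, 1) variables: exact mean 5, variance 10/12.
sums = rng.uniform(0.0, 1.0, size=(n_samples, n_terms)).sum(axis=1)
z = (sums - n_terms * 0.5) / np.sqrt(n_terms / 12.0)

# Empirical tail of the standardized sum vs. the standard normal tail.
for x in (1.0, 2.0, 3.0):
    print(x, (z > x).mean(), 1 - norm.cdf(x))

The empirical and normal tail probabilities agree to two or three decimal places, which is exactly the fast approach to normality described above.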

There is a more complicated and more general form of Lyapunov's theorem.

General Lyapunov theorem. If X1, X2, …, Xn are independent random variables with mathematical expectations a_i, variances σ_i², absolute central moments of the third order m_i = M|X_i − a_i|³, and

(m_1 + m_2 + … + m_n) / (σ_1² + σ_2² + … + σ_n²)^(3/2) → 0 as n → ∞,   (2.1)

then the distribution law of the sum X1 + X2 + … + Xn approaches, as n → ∞, the normal law with expectation a_1 + a_2 + … + a_n and variance σ_1² + σ_2² + … + σ_n².

The meaning of condition (2.1) is that the sum of the random variables should not contain a single term whose influence on the dispersion of the sum is overwhelmingly large compared with the influence of all the others. In addition, there should not be a large number of terms whose influence on the dispersion of the sum is negligibly small compared with the total influence of the rest.

One of the earliest forms of the central limit theorem was Laplace's theorem.

Laplace's theorem. Let n independent experiments be performed, in each of which an event A appears with probability p. Then for large n the approximate equality

P(α ≤ Y_n ≤ β) ≈ F((β − np)/√(npq)) − F((α − np)/√(npq))   (2.2)

holds, where Y_n is the number of occurrences of the event A in n experiments; q = 1 − p; F(x) is the Laplace function.

Laplace's theorem makes it possible to find approximately the probabilities of values of binomially distributed random variables for large values of n. At the same time, the probability p should be neither too small nor too large.

For practical problems, formula (2.2) is often written in another form, namely

P(Y_n < x) ≈ 0.5 + F((x − np)/√(npq)).   (2.3)

Example 2.1. A machine produces n = 1000 items per shift, of which on average 3% are defective. Find approximately the probability that at least 950 good (defect-free) items will be produced during a shift, if the items turn out good independently of one another.

Solution. Let Y be the number of good items. By the conditions of the problem, p = 1 − 0.03 = 0.97; the number of independent experiments is n = 1000, so np = 970 and √(npq) = √(1000 · 0.97 · 0.03) ≈ 5.39. We apply formula (2.3):

P(Y ≥ 950) = 1 − P(Y < 950) ≈ 0.5 − F((950 − 970)/5.39) = 0.5 + F(3.71) ≈ 0.9999.

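The normal approximation in Example 2.1 can be checked against the exact binomial distribution. A minimal sketch, assuming SciPy is available (the function calls are standard SciPy, not part of the text):

from scipy.stats import binom, norm
import math

n, p = 1000, 0.97
mean, sd = n * p, math.sqrt(n * p * (1 - p))

# Normal approximation of P(Y >= 950), as in formula (2.3), and the exact binomial tail.
approx = 1 - norm.cdf((950 - mean) / sd)
exact = binom.sf(949, n, p)          # P(Y >= 950) = P(Y > 949)
print(approx, exact)

Both values are very close to one, confirming the answer.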
Example 2.2. Under the conditions of the previous example, find how many good items k a box must hold so that the probability of its overflowing during one shift does not exceed 0.02.

Solution. It is clear from the condition that P(Y > k) ≤ 0.02. Let us find the number k from this condition. We have

P(Y > k) = 1 − P(Y ≤ k) ≈ 0.5 − F((k − 970)/5.39) ≤ 0.02, i.e. F((k − 970)/5.39) ≥ 0.48.

From the table of the Laplace function, the value 0.48 corresponds to the argument 2.07. We get

(k − 970)/5.39 ≥ 2.07, i.e. k ≥ 970 + 2.07 · 5.39 ≈ 981.2, so the box must hold k = 982 items. ■
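The value of k can also be found numerically, both from the normal approximation and from the exact binomial distribution. A sketch assuming SciPy (the variable names are illustrative):

from scipy.stats import binom, norm
import math

n, p, alpha = 1000, 0.97, 0.02
mean, sd = n * p, math.sqrt(n * p * (1 - p))

# Normal approximation: smallest k with 0.5 - F((k - np)/sqrt(npq)) <= alpha.
k_normal = math.ceil(mean + norm.ppf(1 - alpha) * sd)

# Exact: smallest k with P(Y <= k) >= 1 - alpha, i.e. P(Y > k) <= alpha.
k_exact = int(binom.ppf(1 - alpha, n, p))
print(k_normal, k_exact)

The two answers agree to within a couple of items, which is typical of a normal approximation to a binomial tail.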

Example 2.3. In a bank, 20 people are standing in line at a certain cash desk to receive certain sums of money. The desk currently holds 4000 monetary units. The sums X_i that must be paid to each of the 20 people are random variables with mathematical expectation m = 160 monetary units and standard deviation σ = 70 monetary units. Find the probability that the cash desk will not have enough money to pay everyone in the line.

Solution. We apply Lyapunov's theorem for identically distributed random variables. The value n = 20 can be considered sufficiently large, so the total amount of payments Y = X1 + X2 + … + X20 can be considered a random variable distributed according to the normal law with mathematical expectation m_Y = nm = 20 · 160 = 3200 and standard deviation σ_Y = σ√n = 70√20 ≈ 313. The money will not suffice if Y > 4000, and

P(Y > 4000) ≈ 0.5 − F((4000 − 3200)/313) = 0.5 − F(2.56) ≈ 0.005.
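A short numerical check of Example 2.3, assuming SciPy (the normal approximation itself is the one justified by Lyapunov's theorem above):

import math
from scipy.stats import norm

n, m, sigma, cash = 20, 160, 70, 4000

mean_total = n * m                    # 3200
sd_total = sigma * math.sqrt(n)       # about 313

# Probability that the total payments exceed the cash in the desk.
p_short = 1 - norm.cdf((cash - mean_total) / sd_total)
print(round(p_short, 4))              # about 0.005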

Limit theorems of probability theory

Chebyshev's inequality

Let us consider a number of statements and theorems from a large group of so-called limit theorems of probability theory, which establish a connection between the theoretical and experimental characteristics of random variables when the number of trials on them is large. They form the basis of mathematical statistics. Limit theorems are conventionally divided into two groups. The first group of theorems, called the law of large numbers, establishes the stability of mean values: with a large number of trials, their average result ceases to be random and can be predicted with sufficient accuracy. The second group of theorems, called central limit theorems, establishes the conditions under which the distribution law of the sum of a large number of random variables approaches the normal law without bound.

First, consider the Chebyshev inequality, which can be used to: a) roughly estimate the probabilities of events associated with random variables whose distribution is unknown; b) prove a number of theorems of the law of large numbers.

Theorem 7.1. If the random variable X has mathematical expectation MX and variance DX, then the Chebyshev inequality holds:

P(|X − MX| ≥ ε) ≤ DX / ε².   (7.1)

Note that the Chebyshev inequality can be written in another form:

P(|X − MX| < ε) ≥ 1 − DX / ε².   (7.2)

For the frequency m/n of an event in n independent trials, in each of which it can occur with probability p, and whose variance is D(m/n) = pq/n, the Chebyshev inequality has the form

P(|m/n − p| ≥ ε) ≤ pq / (nε²).   (7.5)

Inequality (7.5) can be rewritten as

P(|m/n − p| < ε) ≥ 1 − pq / (nε²).   (7.6)

Example 7.1. Using the Chebyshev inequality, estimate the probability that the deviation of a random variable X from its mathematical expectation will be less than three standard deviations, i.e. less than 3σ.

Solution. Setting ε = 3σ in formula (7.2), we obtain

P(|X − MX| < 3σ) ≥ 1 − σ²/(9σ²) = 1 − 1/9 ≈ 0.889.

This estimate is called the three sigma rule.
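Chebyshev's bound holds for any distribution, so it is necessarily rough for particular ones. The sketch below (NumPy assumed; the normal and exponential distributions are arbitrary illustrative choices) compares the guaranteed 8/9 with the actual probabilities:

import numpy as np

rng = np.random.default_rng(1)

# Chebyshev guarantees P(|X - MX| < 3*sigma) >= 1 - 1/9 for any distribution;
# for a specific distribution the true probability is usually much higher.
for name, sample in (("normal", rng.normal(0, 1, 200_000)),
                     ("exponential", rng.exponential(1.0, 200_000))):
    m, s = sample.mean(), sample.std()
    inside = (np.abs(sample - m) < 3 * s).mean()
    print(name, round(inside, 4), ">= 0.889")

For the normal distribution the empirical value is about 0.997 (the usual three sigma rule), and for the exponential one it is about 0.98, both well above the distribution-free bound 8/9 ≈ 0.889.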

Chebyshev's theorem

The main statement of the law of large numbers is contained in Chebyshev's theorem. In it and other theorems of the law of large numbers, the concept of "convergence of random variables in probability" is used.

The random variables X1, X2, …, Xn, … converge in probability to a value A (random or non-random) if for any ε > 0 the probability of the event |Xn − A| < ε tends to unity as n → ∞, i.e.

lim_{n→∞} P(|Xn − A| < ε) = 1 (or, equivalently, P(|Xn − A| ≥ ε) → 0). Convergence in probability is written symbolically as Xn → A (in probability).

It should be noted that convergence in probability requires only that the inequality |Xn − A| < ε hold for the overwhelming majority of members of the sequence (whereas convergence in mathematical analysis requires it for all n > N, where N is some number); that is, for large n almost all members of the sequence must fall into the ε-neighborhood of A.

Theorem 7.3 (Law of large numbers in the form of P. L. Chebyshev). If the random variables X1, X2, …, Xn are independent and there is a number C > 0 such that DX_i ≤ C for every i, then for any ε > 0

lim_{n→∞} P( |(X1 + X2 + … + Xn)/n − (MX1 + MX2 + … + MXn)/n| < ε ) = 1,   (7.7)

i.e. the arithmetic mean of these random variables converges in probability to the arithmetic mean of their mathematical expectations:

(X1 + X2 + … + Xn)/n → (MX1 + MX2 + … + MXn)/n (in probability).

Proof. Since the random variables are independent and DX_i ≤ C,

D((X1 + X2 + … + Xn)/n) = (DX1 + DX2 + … + DXn)/n² ≤ nC/n² = C/n.

Then, applying the Chebyshev inequality (7.2) to the random variable (X1 + X2 + … + Xn)/n, we have

P( |(X1 + X2 + … + Xn)/n − (MX1 + MX2 + … + MXn)/n| < ε ) ≥ 1 − C/(nε²),

and the right-hand side tends to unity as n → ∞, which proves (7.7). ■

Corollary. If the independent random variables X1, X2, …, Xn have the same mathematical expectation a and their variances are bounded by the same constant C, then for any ε > 0

lim_{n→∞} P( |(X1 + X2 + … + Xn)/n − a| < ε ) = 1,   (7.9)

i.e. the arithmetic mean of the random variables converges in probability to the mathematical expectation a:

(X1 + X2 + … + Xn)/n → a (in probability).

Proof. Since

M((X1 + X2 + … + Xn)/n) = (MX1 + MX2 + … + MXn)/n = a

and the variances of the random variables are bounded, applying Chebyshev's theorem (7.7) we obtain assertion (7.9). ■

The corollary of Chebyshev's theorem justifies the "arithmetic mean" principle for random variables X_i that is constantly used in practice. Indeed, suppose n independent measurements are made of some quantity whose true value is a (it is unknown). The result of each measurement is a random variable X_i. According to the corollary, as an approximate value of the quantity a one can take the arithmetic mean of the measurement results:

a ≈ (X1 + X2 + … + Xn)/n.

This equality is the more accurate, the larger n is.
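A simulation of this "arithmetic mean" principle, assuming NumPy (the true value 10.0 and the spread 0.5 are hypothetical, chosen only for illustration):

import numpy as np

rng = np.random.default_rng(2)
true_value, sigma = 10.0, 0.5     # hypothetical measured quantity and instrument spread

# Average absolute error of the mean of n unbiased measurements, over 10000 repetitions.
for n in (5, 50, 500):
    means = rng.normal(true_value, sigma, size=(10_000, n)).mean(axis=1)
    print(n, round(np.abs(means - true_value).mean(), 4))

The average error of the arithmetic mean shrinks roughly like 1/√n as the number of measurements grows.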

The sampling method widely used in statistics also rests on Chebyshev's theorem; its essence is that the quality of a large amount of homogeneous material can be judged from a small sample of it.

Chebyshev's theorem confirms the connection between randomness and necessity: the average value of a large number of random variables practically does not differ from a non-random quantity.

Bernoulli's theorem

Bernoulli's theorem is historically the first and simplest form of the law of large numbers. It theoretically substantiates the stability property of the relative frequency.

Theorem 7.4 (Law of large numbers in the form of J. Bernoulli). If the probability of occurrence of an event A in a single trial is p, and the number of occurrences of this event in n independent trials is m, then for any number ε > 0 we have

lim_{n→∞} P(|m/n − p| < ε) = 1,   (7.10)

i.e. the relative frequency of the event A converges in probability to the probability p of the event A: m/n → p (in probability).

Proof. Introduce the random variables X1, X2, …, Xn as follows: X_i = 1 if the event A occurred in the i-th trial, and X_i = 0 if it did not. Then the number m of occurrences of the event A (the number of successes) can be represented as

m = X1 + X2 + … + Xn.

The mathematical expectation and variance of the random variables X_i are MX_i = p, DX_i = pq. The distribution law of the random variables X_i has the form

X_i:  0   1
p:    q   p

for any i. Thus, the random variables X_i are independent, and their variances are bounded by the same number 1/4, since

DX_i = pq = p(1 − p) ≤ 1/4.

Therefore, Chebyshev's theorem can be applied to these random variables:

lim_{n→∞} P( |(X1 + X2 + … + Xn)/n − (MX1 + MX2 + … + MXn)/n| < ε ) = 1.

But (X1 + X2 + … + Xn)/n = m/n and (MX1 + MX2 + … + MXn)/n = p.

Hence, lim_{n→∞} P(|m/n − p| < ε) = 1. ■

Bernoulli's theorem theoretically substantiates the possibility of approximately calculating the probability of an event from its relative frequency. For example, the probability of the birth of a girl can be taken equal to the relative frequency of this event, which, according to statistical data, is approximately 0.485.

The Chebyshev inequality (7.2) for the random variable m/n = (X1 + X2 + … + Xn)/n — the frequency of the event A in n independent trials, in each of which it can occur with probability p_i — takes the form

P( |m/n − (p1 + p2 + … + pn)/n| < ε ) ≥ 1 − (p1·q1 + p2·q2 + … + pn·qn)/(n²ε²),   (7.11)

where p_i is the probability of the event A in the i-th trial and q_i = 1 − p_i.

Example 7.2. The probability of a typographical error on one page of a manuscript is 0.2. Estimate the probability that in a manuscript containing 400 pages the frequency of occurrence of a misprint differs from the corresponding probability by less than 0.05 in absolute value.

Solution. We use formula (7.11). In this case n = 400, p = 0.2, q = 0.8, ε = 0.05 (the probability of a misprint is the same for every page, so p_i = p). We have

P(|m/n − 0.2| < 0.05) ≥ 1 − (400 · 0.2 · 0.8)/(400² · 0.05²) = 1 − 0.16 = 0.84,

i.e. the required probability is at least 0.84.
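The bound 0.84 can be compared with the exact probability for the binomial model of Example 7.2. A sketch assuming SciPy:

from scipy.stats import binom

n, p, eps = 400, 0.2, 0.05

# Chebyshev bound (7.11): P(|m/n - p| < eps) >= 1 - pq/(n*eps^2) = 0.84.
bound = 1 - p * (1 - p) / (n * eps ** 2)

# Exact probability that the frequency stays within 0.05 of 0.2,
# i.e. 60 < m < 100 for m ~ Binomial(400, 0.2).
exact = binom.cdf(99, n, p) - binom.cdf(60, n, p)
print(bound, round(exact, 4))

The exact probability is around 0.98-0.99, which again shows how conservative the Chebyshev estimate is.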

Central limit theorem

The central limit theorem refers to the second group of limit theorems, which establish a connection between the distribution law of a sum of random variables and its limiting form, the normal distribution law.

Let us formulate the central limit theorem for the case when the terms of the sum have the same distribution. This theorem is most often used in practice. In mathematical statistics, sample random variables have the same distributions, since they are obtained from the same general population.

Theorem 7.5. Let the random variables X1, X2, …, Xn, … be independent, identically distributed, and have finite mathematical expectation MX_i = a and variance DX_i = σ². Then the distribution function of the centered and normalized sum of these random variables,

Z_n = (X1 + X2 + … + Xn − na)/(σ√n),

tends, as n → ∞, to the distribution function of the standard normal random variable.

Since many random variables encountered in applications are formed under the influence of a large number of weakly dependent random factors, their distribution is considered normal. In this case the condition must be observed that none of the factors is dominant. The central limit theorems justify the application of the normal distribution in these cases.
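A direct illustration of Theorem 7.5, assuming NumPy and SciPy (exponential summands with a = σ = 1 are an arbitrary, deliberately skewed choice): the maximum distance between the distribution function of the centered and normalized sum and the standard normal distribution function shrinks as n grows.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

def max_cdf_gap(n, reps=50_000):
    # Centered and normalized sums of n Exponential(1) variables (a = 1, sigma = 1).
    z = (rng.exponential(1.0, size=(reps, n)).sum(axis=1) - n) / np.sqrt(n)
    grid = np.linspace(-3, 3, 121)
    emp = np.searchsorted(np.sort(z), grid, side="right") / reps   # empirical CDF on the grid
    return np.max(np.abs(emp - norm.cdf(grid)))

for n in (2, 10, 50, 200):
    print(n, round(max_cdf_gap(n), 4))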


    Let there be an infinite sequence X1, X2, … of independent identically distributed random variables with finite mathematical expectation and variance; denote these by μ and σ², respectively. Let also S_n = X1 + X2 + … + Xn. Then

    (S_n − μn)/(σ√n) → N(0, 1) in distribution as n → ∞,

    where N(0, 1) is the normal distribution with zero mathematical expectation and standard deviation equal to one. Denoting by X̄_n the sample mean of the first n quantities, that is, X̄_n = (1/n) Σ_{i=1}^{n} X_i, we can rewrite the result of the central limit theorem in the following form:

    √n (X̄_n − μ)/σ → N(0, 1) in distribution as n → ∞.

    The rate of convergence can be estimated using the Berry–Esseen inequality.

    Remarks

    • Informally speaking, the classical central limit theorem states that the sum S_n of n independent identically distributed random variables has a distribution close to N(nμ, nσ²). Equivalently, X̄_n has a distribution close to N(μ, σ²/n).
    • Since the distribution function of the standard normal distribution is continuous, convergence to this distribution is equivalent to pointwise convergence of the distribution functions to the distribution function of the standard normal distribution. Putting Z_n = (S_n − μn)/(σ√n), we get F_{Z_n}(x) → Φ(x) for all x ∈ R, where Φ(x) is the distribution function of the standard normal distribution.
    • The classical formulation of the central limit theorem is proved by the method of characteristic functions (Lévy's continuity theorem).
    • Generally speaking, convergence of the densities does not follow from convergence of the distribution functions. Nevertheless, in this classical case it does hold.

    Local central limit theorem

    Under the assumptions of the classical formulation, suppose in addition that the distribution of the random variables X1, X2, … is absolutely continuous, that is, it has a density. Then the distribution of Z_n is also absolutely continuous, and moreover

    f_{Z_n}(x) → (1/√(2π)) e^(−x²/2) as n → ∞,

    where f_{Z_n}(x) is the density of the random variable Z_n, and on the right-hand side is the density of the standard normal distribution.
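    The convergence of densities stated by the local theorem can be visualized with a histogram estimate of the density of Z_n. A minimal sketch assuming NumPy and SciPy (Uniform(0, 1) summands are an illustrative choice):

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
n, reps = 30, 200_000

# Histogram estimate of the density of Z_n = (S_n - n*mu)/(sigma*sqrt(n))
# for Uniform(0, 1) summands, compared with the standard normal density.
z = (rng.uniform(0, 1, size=(reps, n)).sum(axis=1) - n * 0.5) / np.sqrt(n / 12.0)
hist, edges = np.histogram(z, bins=60, range=(-3, 3), density=True)
centers = (edges[:-1] + edges[1:]) / 2
print(round(np.max(np.abs(hist - norm.pdf(centers))), 4))   # already small for n = 30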

    Generalizations

    The result of the classical central limit theorem is valid for situations much more general than complete independence and equal distribution.

    Lindeberg central limit theorem

    Let the independent random variables X1, …, Xn, … be defined on the same probability space and have finite mathematical expectations and variances: M[X_i] = μ_i, D[X_i] = σ_i².

    Let S_n = X1 + X2 + … + Xn.

    Then M[S_n] = m_n = μ_1 + μ_2 + … + μ_n, D[S_n] = s_n² = σ_1² + σ_2² + … + σ_n².

    And let the Lindeberg condition be satisfied: for every ε > 0,

    lim_{n→∞} Σ_{i=1}^{n} M[ ((X_i − μ_i)²/s_n²) · 1{|X_i − μ_i| > ε s_n} ] = 0,

    where 1{|X_i − μ_i| > ε s_n} is the indicator function.

    Then (S_n − m_n)/s_n → N(0, 1) in distribution as n → ∞.

    Lyapunov central limit theorem

    Let the basic assumptions of the Lindeberg central limit theorem be fulfilled, and let the random variables X_i have finite third moments. Define the sequence

    r_n³ = Σ_{i=1}^{n} M[ |X_i − μ_i|³ ].

    If the limit

    lim_{n→∞} r_n / s_n = 0 (the Lyapunov condition)

    holds, then (S_n − m_n)/s_n → N(0, 1) in distribution as n → ∞.

    Central limit theorem for martingales

    Let the process (X_n), n ∈ N, be a martingale with bounded increments. In particular, assume that

    M[X_{n+1} − X_n | X_1, …, X_n] = 0, n ∈ N, X_0 ≡ 0,

    and that the increments are uniformly bounded, that is, there exists C > 0 such that |X_{n+1} − X_n| ≤ C for all n ∈ N. Introduce the conditional variances of the increments σ_i² = M[(X_{i+1} − X_i)² | X_1, …, X_i] and the stopping times

    τ_n = min{ k : σ_1² + σ_2² + … + σ_k² ≥ n }.

    Then X_{τ_n} / √n → N(0, 1) in distribution as n → ∞.

    Charles Wheelan. Chapter from a book
    Publishing house "Mann, Ivanov and Ferber"

    Finally, it is time to sum up what has been said. Because the sample means are normally distributed (thanks to the central limit theorem), we can take advantage of the rich potential of the bell curve. We expect that approximately 68% of the means of all samples will be within one standard error of the population mean; 95% - at a distance not exceeding two standard errors; and 99.7% - at a distance not exceeding three standard errors.

    Now let's return to the deviation (scatter) in the example with the missing bus; this time, however, we will call on numbers rather than intuition for help. (In itself this example remains absurd; in the next chapter we will look at many more realistic cases.) Suppose that the organizers of the Americans' Changing Lives study invited all of its participants to Boston for a weekend to have fun and at the same time provide some missing data. The participants are randomly assigned to buses and taken to the test center, where they will be weighed, measured for height, and so on. To the dismay of the organizers, one of the buses goes missing somewhere on the way to the test center. At about the same time, returning in your car from the Sausage Lovers' Festival, you notice a broken-down bus on the side of the road; it looks as if its driver was forced to swerve sharply to avoid a moose that suddenly appeared on the road. From such a sharp maneuver all the passengers lost consciousness or lost the gift of speech, although, fortunately, none of them received serious injuries. (I make this assumption solely for the sake of clarity of the example, and the hope that the passengers were not seriously injured is due to my innate philanthropy.) The ambulance doctors who promptly arrived on the scene tell you that the average weight of the 62 passengers on the bus is 194 pounds. In addition, it turns out (to the great relief of all animal lovers) that the moose the bus driver tried to dodge was practically uninjured (apart from a slight bruise on a hind leg), but it too lost consciousness from fright and is lying next to the bus.

    Fortunately, you know the average weight of the bus passengers, as well as the standard deviation for the entire population of the Americans' Changing Lives study. In addition, we have a general understanding of the central limit theorem and we know how to give first aid to an injured animal. The average weight of the participants in the Americans' Changing Lives study is 162 pounds; the standard deviation is 36 pounds. Based on this information, you can calculate the standard error for a sample of 62 people (the number of bus passengers who passed out): 36/√62 ≈ 4.6 pounds.

    The difference between this sample mean (194 pounds) and the population mean (162 pounds) is 32 pounds, well over three standard errors. You know from the central limit theorem that 99.7% of the means of all samples will lie within three standard errors of the population mean. Thus, it is highly unlikely that the bus you encountered is carrying a group of participants in the Americans' Changing Lives study. As a prominent community activist, you call the organizers of the event to report that some other group of people is most likely on the bus; in this case, however, you can rely on statistical results rather than on your "intuitive guesses." You tell the organizers that you reject, at a 99.7% confidence level, the hypothesis that the bus you found is the one they are looking for. Since you are talking to people who are familiar with statistics, you can be sure they understand that you are right. (It's always nice to deal with smart people!)
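    The arithmetic behind this conclusion fits in a few lines. A sketch assuming SciPy (the numbers are the ones used in the story):

import math
from scipy.stats import norm

pop_mean, pop_sd, n = 162, 36, 62      # study mean, standard deviation, bus passengers
bus_mean = 194

se = pop_sd / math.sqrt(n)             # standard error, about 4.6 pounds
z = (bus_mean - pop_mean) / se         # about 7 standard errors

# Probability of a sample mean at least this far from 162 by pure chance.
p_two_sided = 2 * (1 - norm.cdf(z))
print(round(se, 2), round(z, 1), p_two_sided)

    With the observed mean about seven standard errors away, the chance of such a sample arising from the Changing Lives population is vanishingly small, far below the 0.3% threshold mentioned above.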

    Your findings receive further support when paramedics take blood samples from the bus passengers and find that their mean cholesterol level is five standard errors higher than the mean cholesterol level of the participants in the Americans' Changing Lives study. The unconscious passengers turn out to be participants of the Sausage Lovers' Festival (this was later irrefutably proven).

    [This story had a happy ending. When the bus passengers regained consciousness, the organizers of the Americans' Changing Lives study advised them to consult nutritionists about the dangers of eating foods high in saturated fat. After such consultations, many of the sausage lovers decided to break with their shameful past and return to a healthier diet. The injured moose was treated at a local veterinary clinic and released back into the wild to the approving exclamations of members of the local Society for the Protection of Animals. True, for some reason history is silent about the fate of the bus driver; perhaps because statistics does not deal with the fates of individual people. A moose is quite another matter: its fate could not be hushed up!]

    In this chapter, I have tried to talk only about the basics. You may have noticed that the central limit theorem only applies in cases where the sample size is large enough (usually at least 30). Also, we need a relatively large sample if we are going to assume that its standard deviation will be about the same as the population standard deviation.

    There are quite a few statistical corrections that can be applied if these conditions are not met, but it's all like icing on a cake (and maybe even chocolate chips that are sprinkled on top of this icing). The "big picture" here is simple and extremely effective.

    1. If you form large (by volume) random samples based on any population, then their averages will be distributed according to the normal law near the average of the corresponding population (whatever the distribution of the original population).
    2. Most of the sample means will be located close enough to the population mean (what exactly should be considered "close enough" in this or that case is determined by the standard error).
    3. The central limit theorem tells us about the probability that the sample mean will be within a certain distance of the population mean. It is relatively unlikely that the sample mean will be more than two standard errors away from the population mean, and it is extremely unlikely that the sample mean will be more than three standard errors away from the population mean.
    4. The less likely it is that some outcome was purely random, the more we can be sure that it was not without the influence of some other factor.

    This, by and large, is the essence of statistical inference. The central limit theorem basically makes all this possible. And until LeBron James wins as many NBA championships as Michael Jordan (six), the central limit theorem will impress us much more than the famous basketball player.

    LeBron Raymone James is an American professional basketball player who plays small forward and power forward for the NBA's Cleveland Cavaliers. (Translator's note.)

    Note the very ingenious use of false precision in this case.

    When the standard deviation of the corresponding population is calculated from a smaller sample, the formula given above is somewhat modified: s = √( Σ(x_i − x̄)² / (n − 1) ). This helps to account for the fact that the dispersion in a small sample may "underestimate" the dispersion of the entire population. This has little bearing on the more general propositions discussed in this chapter.

    My colleague at the University of Chicago, Jim Sally, made a very important critique of the missing bus examples. He pointed out that a missing bus is extremely rare these days. Therefore, if we have to look for some missing bus, then any bus that we meet, which turns out to be missing or broken, will most likely be the bus that interests us, whatever the weight of the passengers in this bus. Maybe Jim is right. (To use this analogy, if you lost your child in a supermarket and the store's management radios that someone's lost child is standing near checkout number six, then you will probably immediately decide that it is your child.) Therefore, we have no choice but to add another element of absurdity to our examples, believing that the loss of the bus is a completely ordinary event.

    Plan:

    1. The concept of the central limit theorem (Lyapunov's theorem)

    2. Law of large numbers, probability and frequency (theorems of Chebyshev and Bernoulli)

    1. The concept of the central limit theorem.

    The normal probability distribution is of great importance in probability theory. The normal law governs, for example, errors when shooting at a target, errors of measurement, and so on. In particular, it turns out that the distribution law of the sum of a sufficiently large number of independent random variables with arbitrary distribution laws is close to the normal distribution. This fact is called the central limit theorem, or Lyapunov's theorem.

    It is known that normally distributed random variables are widely encountered in practice. What explains this? This question was answered by A. M. Lyapunov.

    Central limit theorem. If a random variable X is the sum of a very large number of mutually independent random variables, the influence of each of which on the whole sum is negligible, then X has a distribution close to the normal distribution.

    Example. Let some physical quantity be measured. Any measurement gives only an approximate value of the measured quantity, since many independent random factors (temperature, instrument fluctuations, humidity, etc.) influence the measurement result. Each of these factors generates a negligible "partial error". However, since the number of these factors is very large, their cumulative effect generates an already noticeable "total error".

    Considering the total error as the sum of a very large number of mutually independent partial errors, we can conclude that the total error has a distribution close to the normal distribution. Experience confirms the validity of this conclusion.

    Consider the conditions under which the central limit theorem is satisfied.

    Let X1, X2, …, Xn be a sequence of independent random variables,

    M(X1), M(X2), …, M(Xn) their finite mathematical expectations, respectively equal to M(Xk) = a_k, and

    D(X1), D(X2), …, D(Xn) their finite variances, respectively equal to D(Xk) = b_k².

    We introduce the notation: S = X1 + X2 + … + Xn;

    A = M(X1) + M(X2) + … + M(Xn) = Σ a_k;  B² = D(X1) + D(X2) + … + D(Xn) = Σ b_k².

    We write the distribution function of the normalized sum:

    F_n(x) = P( (S − A)/B < x ).

    The central limit theorem is said to be applicable to the sequence X1, X2, …, Xn if, for any x, the distribution function of the normalized sum tends, as n → ∞, to the normal distribution function:

    lim_{n→∞} P( (S − A)/B < x ) = (1/√(2π)) ∫_{−∞}^{x} e^(−t²/2) dt.


    Consider a discrete random variable X given by a distribution table.

    Let us set ourselves the task of estimating the probability that the deviation of the random variable from its mathematical expectation does not exceed, in absolute value, a positive number ε.

    If ε is small enough, we will thus estimate the probability that X takes values sufficiently close to its mathematical expectation. P. L. Chebyshev proved an inequality that allows us to give the estimate of interest.

    Chebyshev's lemma. Let a random variable X take only non-negative values and have mathematical expectation M(X). Then for any number α > 0,

    P(X ≥ α) ≤ M(X) / α.

    Chebyshev's inequality. The probability that the deviation of a random variable X from its mathematical expectation is less than a positive number ε in absolute value is not less than 1 − D(X)/ε²:

    P(|X − M(X)| < ε) ≥ 1 − D(X) / ε².
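    Both bounds are distribution-free and therefore usually very rough, as the following sketch shows (NumPy assumed; the exponential distribution with M(X) = 2, D(X) = 4 is an arbitrary illustrative choice):

import numpy as np

rng = np.random.default_rng(5)
x = rng.exponential(2.0, 500_000)      # non-negative variable with M(X) = 2, D(X) = 4

alpha, eps = 6.0, 3.0
markov_bound = x.mean() / alpha                 # Chebyshev's lemma: P(X >= alpha) <= M(X)/alpha
chebyshev_bound = 1 - x.var() / eps ** 2        # P(|X - M(X)| < eps) >= 1 - D(X)/eps^2

print((x >= alpha).mean(), "<=", round(markov_bound, 3))
print((np.abs(x - x.mean()) < eps).mean(), ">=", round(chebyshev_bound, 3))

    Here the actual tail probability is about 0.05 against the guaranteed bound of about 0.33, and the actual concentration probability is about 0.92 against the guaranteed 0.56, which is exactly the kind of roughness mentioned in the comment that follows.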

    Comment. Chebyshev's inequality is of limited practical value, since it often gives a rough and sometimes trivial (of no interest) estimate.

    The theoretical significance of the Chebyshev inequality is very large. Below we will use this inequality to derive the Chebyshev theorem.

    2.2. Chebyshev's theorem

    If X1, X2, …, Xn, … are pairwise independent random variables and their variances are uniformly bounded (do not exceed a constant number C), then, no matter how small the positive number ε, the probability of the inequality

    |(X1 + X2 + … + Xn)/n − (M(X1) + M(X2) + … + M(Xn))/n| < ε

    will be arbitrarily close to unity if the number of random variables is large enough:

    lim_{n→∞} P( |(X1 + X2 + … + Xn)/n − (M(X1) + M(X2) + … + M(Xn))/n| < ε ) = 1.

    Thus, Chebyshev's theorem states that if we consider a sufficiently large number of independent random variables with uniformly bounded variances, then the event that the deviation of their arithmetic mean from the arithmetic mean of their mathematical expectations is arbitrarily small in absolute value may be regarded as practically certain.

    When formulating Chebyshev's theorem, we assumed that the random variables have different mathematical expectations. In practice it often happens that random variables have the same mathematical expectation. Obviously, if we again assume that the variances of these variables are bounded, then Chebyshev's theorem will be applicable to them.

    Let us denote the mathematical expectation of each of the random variables by a.

    In the case under consideration the arithmetic mean of the mathematical expectations, as is easy to see, is also equal to a.

    One can formulate Chebyshev's theorem for the particular case under consideration.

    "If X1, X2, …, Xn, … are pairwise independent random variables having the same mathematical expectation a, and if the variances of these variables are uniformly bounded, then, no matter how small the number ε > 0, the probability of the inequality

    |(X1 + X2 + … + Xn)/n − a| < ε

    will be arbitrarily close to unity if the number of random variables is large enough."

    In other words, under the conditions of the theorem

    lim_{n→∞} P( |(X1 + X2 + … + Xn)/n − a| < ε ) = 1.

    2.3. Essence of Chebyshev's theorem

    Although individual independent random variables may take values far from their mathematical expectations, the arithmetic mean of a sufficiently large number of random variables with high probability takes values close to a certain constant number, namely

    (M(X1) + M(X2) + … + M(Xn))/n, or, in the particular case considered, the number a.

    In other words, individual random variables may have a considerable spread, while their arithmetic mean has a small one.

    Thus, one cannot confidently predict what possible value each of the random variables will take, but one can predict what value their arithmetic mean will take.

    So, the arithmetic mean of a sufficiently large number of independent random variables (the variances of which are uniformly limited) loses the character of a random variable.

    This is explained by the fact that the deviations of each of the quantities from their mathematical expectations can be both positive and negative, and in the arithmetic mean they cancel each other out.

    Chebyshev's theorem is valid not only for discrete, but also for continuous random variables; it is an example confirming the validity of the doctrine of the connection between chance and necessity.

    2.4. Significance of Chebyshev's theorem for practice

    Let us give examples of the application of the Chebyshev theorem to the solution of practical problems.

    Usually, to measure a certain physical quantity, several measurements are made and their arithmetic mean is taken as the desired value. Under what conditions can this method of measurement be considered correct? The answer to this question is given by Chebyshev's theorem (its particular case).

    Indeed, consider the results of each measurement as random variables

    X1, X2, ..., Xn

    To these quantities, the Chebyshev theorem can be applied if:

    1) they are pairwise independent;

    2) they have the same mathematical expectation;

    3) their variances are uniformly bounded.

    The first requirement is satisfied if the result of each measurement does not depend on the results of the others.

    The second requirement is met if the measurements are made without systematic (same-sign) errors. In this case the mathematical expectations of all the random variables are the same and equal to the true value a.

    The third requirement is met if the device provides a certain measurement accuracy. Although the results of individual measurements are different, their scattering is limited.

    If all these requirements are met, we have the right to apply Chebyshev's theorem to the measurement results: for sufficiently large n the probability of the inequality

    |(X1 + X2 + … + Xn)/n − a| < ε

    is arbitrarily close to unity.

    In other words, with a sufficiently large number of measurements, it is almost certain that their arithmetic mean differs arbitrarily little from the true value of the measured quantity.

    Chebyshev's theorem indicates the conditions under which the described method of measurement can be applied. However, it is a mistake to think that, by increasing the number of measurements, one can achieve an arbitrarily high accuracy. The fact is that the device itself gives readings only with an accuracy of ± α, therefore, each of the measurement results, and hence their arithmetic mean, will be obtained only with an accuracy not exceeding the accuracy of the device.

    The sampling method widely used in statistics is based on Chebyshev's theorem; its essence is that a relatively small random sample is used to judge the entire population (general population) of the objects under study.

    For example, the quality of a bale of cotton is judged by a small bundle consisting of fibers randomly selected from different parts of the bale. Although the number of fibers in a bundle is much less than in a bale, the bundle itself contains a fairly large number of fibers, numbering in the hundreds.

    As another example, one can point to the determination of grain quality from a small sample. And in this case, the number of randomly selected grains is small compared to the entire mass of the grain, but in itself it is quite large.

    Already from the examples cited, one can conclude that for practice Chebyshev's theorem is of inestimable importance.

    2.5. Bernoulli's theorem

    Let n independent trials be performed (trials, not events). In each of them the probability of occurrence of an event A is p.

    The question arises: what will the relative frequency of occurrence of the event be? This question is answered by the theorem proved by Bernoulli, which was called the "law of large numbers" and laid the foundation of probability theory as a science.

    Bernoulli's theorem. If in each of n independent trials the probability p of occurrence of an event A is constant, then the probability that the deviation of the relative frequency from the probability p will be arbitrarily small in absolute value is arbitrarily close to unity if the number of trials is large enough.

    In other words, if ε > 0 is an arbitrarily small number, then under the conditions of the theorem we have the equality

    lim_{n→∞} P(|m/n − p| < ε) = 1.

    Comment. It would be wrong, on the basis of Bernoulli's theorem, to conclude that with an increase in the number of trials the relative frequency steadily tends to the probability p; in other words, Bernoulli's theorem does not imply the equality lim_{n→∞} (m/n) = p.

    The theorem deals only with the probability that, with a sufficiently large number of trials, the relative frequency will differ arbitrarily little from the constant probability of occurrence of the event in each trial.

    Task 7-1.

    1. Estimate the probability that in 3600 rolls of a die the number of occurrences of a six will be at least 900.

    Solution. Let X be the number of occurrences of a six in 3600 rolls of the die. The probability of rolling a six in one roll is p = 1/6, so M(X) = 3600 · 1/6 = 600. We use Chebyshev's lemma with α = 900:

    P(X ≥ 900) ≤ 600/900 = 2/3.

    Answer: 2/3.

    2. 1000 independent trials were carried out with p = 0.8. Find the probability that the number of occurrences of event A in these trials deviates from its mathematical expectation by less than 50 in absolute value.

    Solution. Let X be the number of occurrences of event A in n = 1000 trials.

    M(X) = 1000 · 0.8 = 800. D(X) = 1000 · 0.8 · 0.2 = 160.

    We use the Chebyshev inequality with ε = 50:

    P(|X − M(X)| < ε) ≥ 1 − D(X)/ε²,

    P(|X − 800| < 50) ≥ 1 − 160/50² = 1 − 160/2500 = 0.936.

    Answer: 0.936.
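    Tasks 1 and 2 can be checked against the exact binomial distributions; a sketch assuming SciPy:

from scipy.stats import binom

# Task 1: X ~ Binomial(3600, 1/6). Chebyshev's lemma gives P(X >= 900) <= 2/3;
# the exact tail probability is vanishingly small.
print(2 / 3, binom.sf(899, 3600, 1 / 6))

# Task 2: X ~ Binomial(1000, 0.8). Chebyshev gives P(|X - 800| < 50) >= 0.936;
# the exact probability of 750 < X < 850 is much closer to 1.
exact = binom.cdf(849, 1000, 0.8) - binom.cdf(750, 1000, 0.8)
print(0.936, round(exact, 4))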

    3. Using the Chebyshev inequality, estimate the probability that |X − M(X)| < 0.1 if D(X) = 0.001. Answer: P ≥ 0.9.

    4. Given: P(|X − M(X)| < ε) ≥ 0.9; D(X) = 0.004. Using the Chebyshev inequality, find ε. Answer: 0.2.

    Control questions and tasks

    1. Purpose of the central limit theorem

    2. Conditions for applicability of Lyapunov's theorem.

    3. The difference between the lemma and Chebyshev's theorem.

    4. Conditions for the applicability of the Chebyshev theorem.

    5. Conditions for the applicability of Bernoulli's theorem (the law of large numbers)

    Requirements for knowledge and skills

    The student must know the general semantic formulation of the central limit theorem. Be able to formulate partial theorems for independent identically distributed random variables. Understand the Chebyshev inequality and the law of large numbers in the Chebyshev form. Have an idea about the frequency of an event, the relationship between the concepts of "probability" and "frequency". Have an understanding of the law of large numbers in the form of Bernoulli.

    A. M. Lyapunov (1857-1918), an outstanding Russian mathematician.