
Independent random variables. Operations on random variables

To solve many practical problems, it is necessary to know the set of conditions under which the combined effect of a large number of random factors is almost independent of chance. These conditions are described in several theorems collectively called the law of large numbers. In the Bernoulli scheme considered below, the random variable ξk equals 1 or 0 according to whether the k-th trial results in success or failure; thus Sn = ξ1 + … + ξn is the sum of n mutually independent random variables, each of which takes the values 1 and 0 with probabilities p and q.

The simplest form of the law of large numbers is Bernoulli's theorem, which states that if the probability of an event is the same in all trials, then as the number of trials increases, the frequency of the event tends to the probability of the event and ceases to be random.

Poisson's theorem states that the frequency of an event in a series of independent trials (in which the probability of the event may vary from trial to trial) tends to the arithmetic mean of these probabilities and ceases to be random.

The de Moivre-Laplace theorems, limit theorems of probability theory, explain the nature of the stability of the frequency of occurrence of an event: the limiting distribution of the number of occurrences of an event, as the number of trials increases without bound (provided the probability of the event is the same in all trials), is the normal distribution.

The central limit theorem explains the widespread occurrence of the normal distribution. It states that whenever a random variable is formed by adding a large number of independent random variables with finite variances, the distribution law of this random variable turns out to be practically normal.

Lyapunov's theorem explains the wide occurrence of the normal distribution law and the mechanism of its formation. It allows us to assert that whenever a random variable is formed by adding a large number of independent random variables whose variances are small compared with the variance of the sum, the distribution law of this random variable turns out to be practically normal. And since random variables are always generated by a great number of causes, and most often none of these causes has a variance comparable to the variance of the resulting random variable itself, most random variables encountered in practice obey the normal distribution law.

The qualitative and quantitative statements of the law of large numbers are based on Chebyshev's inequality, which gives an upper bound on the probability that the deviation of a random variable from its mathematical expectation exceeds some given number. It is remarkable that Chebyshev's inequality provides such an estimate for a random variable whose distribution is unknown, when only its mathematical expectation and variance are known.

Chebyshev's inequality. If a random variable ξ has a finite variance, then for any ε > 0 the inequality

P(|ξ − Mξ| ≥ ε) ≤ Dξ/ε²

holds, where Mξ and Dξ are the mathematical expectation and the variance of the random variable ξ.
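As a quick numerical illustration (a minimal sketch in Python; the exponential distribution and the sample size are arbitrary choices made here, not taken from the text):

```python
import random

# Empirical check of Chebyshev's inequality P(|xi - M xi| >= eps) <= D xi / eps^2
# for an exponential random variable with rate 1 (M xi = 1, D xi = 1).
random.seed(0)
n_samples = 200_000
mean, var = 1.0, 1.0          # exact M xi and D xi for Exp(1)
samples = [random.expovariate(1.0) for _ in range(n_samples)]

for eps in (1.0, 2.0, 3.0):
    empirical = sum(abs(x - mean) >= eps for x in samples) / n_samples
    bound = var / eps ** 2
    print(f"eps={eps}: empirical={empirical:.4f}  Chebyshev bound={bound:.4f}")
```

The empirical frequencies stay below the bound, which is what the inequality guarantees (the bound is usually far from tight).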

Bernoulli's theorem. Let ξn be the number of successes in n Bernoulli trials and p the probability of success in a single trial. Then for any ε > 0

P(|ξn/n − p| < ε) → 1 as n → ∞.

Lyapunov's theorem. Let ξ1, ξ2, …, ξn, … be an unbounded sequence of independent random variables with mathematical expectations m1, m2, …, mn, … and variances σ1², σ2², …, σn², …. Denote

Sn = ξ1 + ξ2 + … + ξn,  An = m1 + m2 + … + mn,  Bn² = σ1² + σ2² + … + σn²,

and suppose that the Lyapunov condition holds: (Σ_{k=1}^{n} M|ξk − mk|³)/Bn³ → 0 as n → ∞.

Then P(a ≤ (Sn − An)/Bn ≤ b) → Ф(b) − Ф(a) for any real numbers a and b, where Ф(x) is the distribution function of the standard normal law.

Let discrete random variables ξ1, ξ2, …, ξn be given. Consider the dependence of the number of successes Sn on the number of trials n. With each trial Sn increases by 1 or by 0. This statement can be written as

Sn = ξ1 + ξ2 + … + ξn. (1.1)

Law of large numbers. Let (ξk) be a sequence of mutually independent random variables with identical distributions. If the mathematical expectation μ = M(ξk) exists, then for any ε > 0, as n → ∞,

P(|Sn/n − μ| < ε) → 1. (1.2)

In other words, the probability that the mean Sn/n differs from the mathematical expectation μ by less than an arbitrarily given ε tends to one.
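The statement can be observed directly by simulation; below is a small sketch (Python), with a Bernoulli distribution for the ξk chosen arbitrarily for illustration:

```python
import random

# Law of large numbers: S_n / n approaches mu = M(xi_k) as n grows.
random.seed(1)
p = 0.3                      # xi_k takes the value 1 with probability p, 0 otherwise
mu = p                       # M(xi_k) = p

s = 0
for n in range(1, 100_001):
    s += 1 if random.random() < p else 0
    if n in (10, 100, 1_000, 10_000, 100_000):
        print(f"n={n:>6}: S_n/n = {s / n:.4f}   (mu = {mu})")
```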

Central limit theorem. Let (ξk) be a sequence of mutually independent random variables with identical distributions. Assume that μ = M(ξk) and σ² = D(ξk) exist, and let Sn = ξ1 + … + ξn. Then for any fixed a

P(|Sn − nμ| < aσ√n) → Ф(a) − Ф(−a). (1.3)

Here Ф(x) is the normal distribution function. This theorem was formulated and proved by Lindeberg; Lyapunov and other authors had proved it earlier under more restrictive conditions. It should be borne in mind that the theorem stated above is only a very special case of a much more general theorem, which in turn is closely related to many other limit theorems. Note that (1.3) is much stronger than (1.2), since (1.3) estimates the probability that the deviation of Sn/n from μ exceeds a quantity of order σ/√n. On the other hand, the law of large numbers (1.2) holds even if the random variables ξk do not have finite variance, so it applies in a more general setting than the central limit theorem (1.3). We illustrate the last two theorems with examples.

Examples. a) Consider a sequence of independent throws of a symmetric die. Let ξk be the number of points obtained on the k-th toss. Then

M(ξk) = (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5,

D(ξk) = (1² + 2² + 3² + 4² + 5² + 6²)/6 − (3.5)² = 35/12,

and Sn/n is the average number of points obtained in n rolls.

The law of large numbers states that for large n this average is likely to be close to 3.5. The central limit theorem establishes that the probability of the inequality |Sn − 3.5n| < a(35n/12)^(1/2) is close to Ф(a) − Ф(−a). For n = 1000 and a = 1 we find that the probability of the inequality 3450 < Sn < 3550 is approximately 0.68. Choosing for a the value a0 = 0.6744, which satisfies Ф(a0) − Ф(−a0) = 1/2, we find that the chances for Sn to lie inside or outside the interval 3500 ± 36 are approximately equal.
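The figures in this example can be reproduced with a short calculation (a sketch; Ф is evaluated through the error function, and a Monte Carlo check is added):

```python
import math, random

def Phi(x):
    # Standard normal distribution function via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

n = 1000
var = 35.0 / 12.0
sigma_sum = math.sqrt(n * var)            # standard deviation of S_n, about 54.0

for a in (1.0, 0.6744):
    print(f"a = {a}: P(|S_n - 3500| < {a * sigma_sum:.1f}) ~ {Phi(a) - Phi(-a):.3f}")

# Monte Carlo check for a = 1.
random.seed(2)
trials = 5_000
hits = sum(abs(sum(random.randint(1, 6) for _ in range(n)) - 3500) < sigma_sum
           for _ in range(trials))
print(f"simulated: {hits / trials:.3f}")
```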

b) Sampling. Suppose that a general population consists of N families, of which Nk families have exactly k children (k = 0, 1, …; ΣNk = N). If a family is chosen at random, the number of children in it is a random variable that takes the value k with probability pk = Nk/N. In sampling with replacement, a sample of size n can be regarded as a collection of n independent random variables, or "observations", ξ1, …, ξn, all having the same distribution; Sn/n is the sample mean. The law of large numbers states that for a sufficiently large random sample its mean is likely to be close to μ, the population mean. The central limit theorem allows one to estimate the likely size of the discrepancy between these means and to determine the sample size required for a reliable estimate. In practice μ and σ are usually unknown; however, in most cases a preliminary estimate of σ is easy to obtain, and σ can always be confined within reliable bounds. If we want the sample mean Sn/n to differ from the unknown population mean μ by less than σ/10 with probability 0.99 or more, the sample size must be taken such that

The root x of the equation Ф(x) − Ф(−x) = 0.99 is x = 2.57…, and therefore n must be such that √n/10 ≥ 2.57, i.e. n > 660. A careful preliminary estimate of σ thus makes it possible to find the required sample size.
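The sample-size calculation can be verified as follows (a sketch; the quantile is found by simple bisection rather than taken from a table):

```python
import math

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Find x with Phi(x) - Phi(-x) = 0.99 by bisection.
target = 0.99
lo, hi = 0.0, 10.0
while hi - lo > 1e-10:
    mid = (lo + hi) / 2
    if Phi(mid) - Phi(-mid) < target:
        lo = mid
    else:
        hi = mid
x = (lo + hi) / 2
print(f"x = {x:.4f}")            # about 2.576 (the text rounds to 2.57)

# Requirement |S_n/n - mu| < sigma/10 with probability 0.99: sqrt(n)/10 >= x.
n_required = math.ceil((10 * x) ** 2)
print(f"n >= {n_required}")      # about 664; with the rounded 2.57 one gets n > 660
```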

c) Poisson distribution.

Assume that the random variables ξk have a Poisson distribution p(k; λ). Then Sn has a Poisson distribution with mean and variance equal to nλ.

Writing λ in place of nλ, we conclude that as λ → ∞

Σ_k e^(−λ) λ^k / k! → Ф(x), (1.5)

where the summation is over all k from 0 to λ + x√λ. Formula (1.5) also holds when λ → ∞ in an arbitrary manner, not only through the values nλ.
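Formula (1.5) can be checked numerically for moderate values of λ (a sketch; the Poisson probabilities are accumulated directly, so very large λ would underflow and is avoided here):

```python
import math

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def poisson_sum(k_max, lam):
    # Accumulates e^(-lam) * lam^k / k! for k = 0..k_max.
    term = math.exp(-lam)
    total = term
    for k in range(1, k_max + 1):
        term *= lam / k
        total += term
    return total

x = 1.0
for lam in (25, 100, 400):
    k_max = int(lam + x * math.sqrt(lam))
    # Values approach Phi(1) = 0.8413 slowly, because of the discreteness of the sum.
    print(f"lambda = {lam:>3}: sum = {poisson_sum(k_max, lam):.4f},  Phi({x}) = {Phi(x):.4f}")
```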

Let the standard deviations of several mutually independent random variables be known. How can one find the standard deviation of the sum of these variables? The answer to this question is given by the following theorem.

Theorem. The standard deviation of the sum of a finite number of mutually independent random variables is equal to the square root of the sum of the squared standard deviations of these variables.

Proof. Denote by X the sum of the mutually independent variables under consideration:

X = X1 + X2 + … + Xn.

The variance of the sum of several mutually independent random variables is equal to the sum of the variances of the terms (see § 5, Corollary 1), so

D(X) = D(X1) + D(X2) + … + D(Xn),

or, finally,

σ(X) = √(σ²(X1) + σ²(X2) + … + σ²(Xn)).
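A quick simulation check of the theorem (a sketch with arbitrarily chosen independent distributions):

```python
import math, random, statistics

# sigma(X1 + X2 + X3) should equal sqrt(sigma1^2 + sigma2^2 + sigma3^2)
# for mutually independent X1, X2, X3.
random.seed(3)
n = 200_000
x1 = [random.gauss(0.0, 2.0) for _ in range(n)]       # sigma1 = 2
x2 = [random.uniform(0.0, 6.0) for _ in range(n)]     # sigma2 = 6/sqrt(12)
x3 = [random.expovariate(0.5) for _ in range(n)]      # sigma3 = 2

sums = [a + b + c for a, b, c in zip(x1, x2, x3)]
theoretical = math.sqrt(2.0**2 + (6.0 / math.sqrt(12.0))**2 + 2.0**2)
print(f"simulated   sigma(X) = {statistics.pstdev(sums):.3f}")
print(f"theoretical sigma(X) = {theoretical:.3f}")
```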

Equally distributed mutually independent random variables

It is already known that the numerical characteristics of a random variable can be found from its distribution law. It follows that if several random variables have identical distributions, then their numerical characteristics are the same.

Consider n mutually independent random variables X1, X2, …, Xn that have the same distributions and, consequently, the same characteristics (mathematical expectation, variance, etc.). Of greatest interest is the study of the numerical characteristics of the arithmetic mean of these variables, which is what we do in this section.

Let us denote the arithmetic mean of the random variables under consideration by X̄:

X̄ = (X1 + X2 + … + Xn)/n.

The following three propositions establish a relationship between the numerical characteristics of the arithmetic mean X̄ and the corresponding characteristics of each individual variable.

1. The mathematical expectation of the arithmetic mean of identically distributed mutually independent random variables is equal to the mathematical expectation a of each of the variables:

M(X̄) = a.

Proof. Using the properties of the mathematical expectation (a constant factor can be taken out of the expectation sign; the expectation of a sum is equal to the sum of the expectations of the terms), we have

M(X̄) = M((X1 + X2 + … + Xn)/n) = (1/n)[M(X1) + M(X2) + … + M(Xn)].

Taking into account that, by hypothesis, the mathematical expectation of each of the variables is equal to a, we get

M(X̄) = na/n = a.

2. The variance of the arithmetic mean of n identically distributed mutually independent random variables is n times smaller than the variance D of each of the variables:

D(X̄) = D/n. (*)

Proof. Using the properties of the variance (a constant factor can be taken out of the variance sign by squaring it; the variance of a sum of independent variables is equal to the sum of the variances of the terms), we have

D(X̄) = D((X1 + X2 + … + Xn)/n) = (1/n²)[D(X1) + D(X2) + … + D(Xn)].


Taking into account that, by hypothesis, the variance of each of the variables is equal to D, we obtain

D(X̄) = nD/n² = D/n.

3. The standard deviation of the arithmetic mean of n identically distributed mutually independent random variables is √n times smaller than the standard deviation σ of each of the variables:

σ(X̄) = σ/√n. (**)

Proof. Since D(X̄) = D/n, the standard deviation of X̄ is

σ(X̄) = √(D(X̄)) = √(D/n) = √D/√n = σ/√n.
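The three propositions can be checked together by simulation (a sketch; the normal distribution of the individual variables is an arbitrary choice):

```python
import math, random, statistics

# For the arithmetic mean of n i.i.d. variables with expectation a and
# variance D: M(mean) = a, D(mean) = D/n, sigma(mean) = sigma/sqrt(n).
random.seed(4)
a, sigma = 10.0, 3.0
D = sigma ** 2
n = 25
trials = 50_000

means = [statistics.fmean(random.gauss(a, sigma) for _ in range(n))
         for _ in range(trials)]
print(f"M(mean)     = {statistics.fmean(means):7.3f}   (expected {a})")
print(f"D(mean)     = {statistics.pvariance(means):7.3f}   (expected {D / n:.3f})")
print(f"sigma(mean) = {statistics.pstdev(means):7.3f}   (expected {sigma / math.sqrt(n):.3f})")
```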

The general conclusion from formulas (*) and (**): recalling that the variance and the standard deviation serve as measures of the dispersion of a random variable, we conclude that the arithmetic mean of a sufficiently large number of mutually independent random variables has much less scattering than each individual variable.

Let us explain by an example the significance of this conclusion for practice.

Example. Usually, to measure a certain physical quantity, several measurements are made, and then the arithmetic mean of the obtained numbers is found, which is taken as an approximate value of the measured quantity. Assuming that the measurements are made under the same conditions, prove:

  • a) the arithmetic mean gives a more reliable result than individual measurements;
  • b) with an increase in the number of measurements, the reliability of this result increases.

Solution. a) It is known that individual measurements give different values of the measured quantity. The result of each measurement depends on many random factors (temperature changes, instrument fluctuations, etc.) that cannot be fully taken into account in advance.

Therefore, we have the right to consider the possible results of n individual measurements as random variables X1, X2, …, Xn (the index indicates the measurement number). These quantities have the same probability distribution (the measurements are made by the same technique and with the same instruments) and, consequently, the same numerical characteristics; in addition, they are mutually independent (the result of each individual measurement does not depend on the other measurements).

We already know that the arithmetic mean of such variables has less dispersion than each individual variable. In other words, the arithmetic mean turns out to be closer to the true value of the measured quantity than the result of a single measurement. This means that the arithmetic mean of several measurements gives a more reliable result than a single measurement.

b) We already know that as the number of individual random variables increases, the spread of the arithmetic mean decreases. This means that with an increase in the number of measurements, the arithmetic mean of several measurements differs less and less from the true value of the measured quantity. Thus, by increasing the number of measurements, a more reliable result is obtained.

For example, if the standard deviation of a single measurement is σ = 6 m and n = 36 measurements are made in total, then the standard deviation of the arithmetic mean of these measurements is only 1 m. Indeed,

σ(X̄) = σ/√n = 6/√36 = 1 m.

We see that the arithmetic mean of several measurements, as expected, turned out to be closer to the true value of the measured value than the result of a single measurement.


Equally distributed random variables


Earlier, the question of finding the probability density function (PDF) of a sum of statistically independent random variables was considered. In this section we again consider a sum of statistically independent variables, but our approach is different and does not depend on the individual PDFs of the random variables in the sum. In particular, suppose that the terms of the sum are statistically independent and identically distributed random variables, each of which has a finite mean and a finite variance.

Let Y be defined as the normalized sum, called the sample mean,

Y = (1/n) Σ_{i=1}^{n} X_i. (2.1.187)

First we determine upper bounds on the tail probability of Y, and then we prove a very important theorem that determines its distribution function in the limit as n tends to infinity.

The random variable Y defined by (2.1.187) often arises when the mean of a random variable is estimated from a series of observations X_i, i = 1, 2, …, n. In other words, the X_i can be viewed as independent sample realizations drawn from a common distribution, and Y is an estimate of the mean m_x.

The mathematical expectation of Y is

E(Y) = (1/n) Σ_{i=1}^{n} E(X_i) = m_x.

The variance of Y, since the X_i are independent, is

σ_Y² = (1/n²) Σ_{i=1}^{n} D(X_i) = σ_x²/n.

If Y is considered as an estimate of the mean m_x, we see that its mathematical expectation is equal to m_x and that its variance decreases with increasing sample size n. If n increases without bound, the variance tends to zero. An estimate of a parameter (in this case m_x) whose mathematical expectation tends to the true value of the parameter and whose variance tends to zero is called a consistent estimate.

The tail probability of the random variable Y can be bounded from above using the bounds given in Sec. 2.1.5. Chebyshev's inequality applied to Y has the form

P(|Y − m_x| ≥ δ) ≤ σ_Y²/δ²,

that is,

P(|Y − m_x| ≥ δ) ≤ σ_x²/(nδ²). (2.1.188)

In the limit as n → ∞, it follows from (2.1.188) that

lim_{n→∞} P(|Y − m_x| ≥ δ) = 0. (2.1.189)

Therefore, the probability that the estimate of the mean differs from the true value m_x by more than δ tends to zero as n grows indefinitely. This statement is a form of the law of large numbers. Since the upper bound converges to zero relatively slowly, namely inversely with n, expression (2.1.188) is called the weak law of large numbers.
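A small sketch showing how the bound in (2.1.188) decays as 1/n alongside the empirical tail probability (the Bernoulli distribution of the X_i is an arbitrary choice):

```python
import random

# Chebyshev bound for the sample mean: P(|Y - m_x| >= delta) <= sigma_x^2 / (n delta^2).
random.seed(5)
m_x, sigma2 = 0.5, 0.25      # mean and variance of a fair Bernoulli X_i
delta = 0.05
trials = 5_000

for n in (10, 100, 1000):
    exceed = sum(abs(sum(random.random() < 0.5 for _ in range(n)) / n - m_x) >= delta
                 for _ in range(trials))
    bound = min(sigma2 / (n * delta ** 2), 1.0)
    print(f"n = {n:>4}: empirical = {exceed / trials:.4f},  bound = {bound:.4f}")
```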

If we apply the Chernoff bound, which contains an exponential dependence on n, to the random variable Y, we obtain a tight upper bound on the probability of one tail. Following the procedure outlined in Sec. 2.1.5, we find that the upper-tail probability for Y is given by

P(Y ≥ δ) = P(Σ_{i=1}^{n} X_i ≥ nδ) ≤ E[exp(ν(Σ_{i=1}^{n} X_i − nδ))],

where ν ≥ 0 and δ > m_x. But X_1, X_2, …, X_n are statistically independent and identically distributed. Hence,

P(Y ≥ δ) ≤ e^(−νnδ) (E[e^(νX)])^n = (e^(−νδ) E[e^(νX)])^n, (2.1.191)

where X denotes any one of the X_i. The parameter ν that gives the tightest upper bound is obtained by differentiating (2.1.191) with respect to ν and equating the derivative to zero. This leads to the equation

E[X e^(νX)] − δ E[e^(νX)] = 0. (2.1.192)

Denote the solution of (2.1.192) by ν̂. Then the bound on the upper-tail probability is

P(Y ≥ δ) ≤ (e^(−ν̂δ) E[e^(ν̂X)])^n,  δ > m_x. (2.1.193)

Similarly, we find that the lower-tail probability has the bound

P(Y ≤ δ) ≤ (e^(−ν̂δ) E[e^(ν̂X)])^n,  δ < m_x. (2.1.194)

Example 2.1.7. Let X_i, i = 1, 2, …, n, be a series of statistically independent, identically distributed random variables whose common mean m_x is negative. We want to find a tight upper bound on the probability that the sum of the X_i is greater than zero. Since m_x < 0, the sum has a negative mathematical expectation (mean); therefore, we look for the probability of the upper tail. For δ = 0, (2.1.193) gives

P(Y ≥ 0) ≤ (E[e^(ν̂X)])^n, (2.1.195)

where ν̂ is the solution of the equation

E[X e^(ν̂X)] = 0. (2.1.196)

Solving (2.1.196) for ν̂ from the given distribution of X gives the value in (2.1.197); substituting it into (2.1.195) yields the final bound. We see that this upper bound decreases exponentially with n, as expected. In contrast, the Chebyshev bound on the same tail probability decreases only inversely with n.
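The computation in this example can be illustrated with a concrete distribution. The sketch below assumes X_i = +1 with probability p and X_i = −1 with probability 1 − p, p < 1/2 (an assumption made here for illustration; it need not be the distribution used in the original example). For this choice, equation (2.1.196) gives e^(2ν̂) = (1 − p)/p and E[e^(ν̂X)] = 2√(p(1 − p)), so the bound (2.1.195) becomes [4p(1 − p)]^(n/2).

```python
import random

# Chernoff bound for P(X_1 + ... + X_n > 0), ASSUMING X_i = +1 w.p. p and -1 w.p. 1-p
# (p < 1/2, chosen only to make the example concrete): bound = [4p(1-p)]^(n/2).
p = 0.3
n = 50

chernoff = (4 * p * (1 - p)) ** (n / 2)
m_x = 2 * p - 1                        # negative mean of each X_i
sigma2 = 1 - m_x ** 2                  # variance of each X_i
chebyshev = sigma2 / (n * m_x ** 2)    # Chebyshev bound on the same tail

random.seed(6)
trials = 100_000
hits = sum(sum(1 if random.random() < p else -1 for _ in range(n)) > 0
           for _ in range(trials))
print(f"simulated P(sum > 0) = {hits / trials:.2e}")
print(f"Chernoff bound       = {chernoff:.2e}")
print(f"Chebyshev bound      = {chebyshev:.2e}")
```

The simulated probability lies well below the Chernoff bound, which in turn is far tighter than the Chebyshev bound, illustrating the exponential versus 1/n decay discussed above.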

Central limit theorem. In this section we consider an extremely useful theorem concerning the distribution function (CDF) of a sum of random variables in the limit as the number of terms in the sum increases without bound. There are several versions of this theorem. Let us prove it for the case when the summed random variables X_i, i = 1, 2, …, n, are statistically independent and identically distributed, each having a finite mean m_x and a finite variance σ_x².

For convenience, we define the normalized random variables

U_i = (X_i − m_x)/σ_x,  i = 1, 2, …, n.

Thus each U_i has zero mean and unit variance.

Now let

Y = (1/√n) Σ_{i=1}^{n} U_i.

Since each term of the sum has zero mean and unit variance, the value Y, normalized by the factor 1/√n, has zero mean and unit variance. We want to determine the distribution function of Y in the limit as n → ∞.

The characteristic function of Y is

ψ_Y(jv) = E[exp(jvY)] = E[exp(jv Σ_{i=1}^{n} U_i/√n)] = [ψ_U(jv/√n)]^n, (2.1.200)

where ψ_U(jv) is the characteristic function of each U_i. Expanding ψ_U(jv/√n) in a Taylor series and using the fact that U_i has zero mean and unit variance, we get

ψ_U(jv/√n) = 1 − v²/(2n) + o(1/n),

so that

ln ψ_Y(jv) = n ln(1 − v²/(2n) + o(1/n)) → −v²/2 as n → ∞,

or, equivalently,

lim_{n→∞} ψ_Y(jv) = e^(−v²/2). (2.1.206)

But this is just the characteristic function of a Gaussian random variable with zero mean and unit variance. Thus we have an important result: the distribution of the normalized sum of statistically independent and identically distributed random variables with finite mean and variance approaches the Gaussian distribution as n → ∞. This result is known as the central limit theorem.
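A brief numerical illustration of the limit (2.1.206) (a sketch: the X_i are taken to be exponential, an arbitrary skewed choice, and the characteristic function of Y is estimated by Monte Carlo):

```python
import cmath, math, random

# Empirical characteristic function of Y = (1/sqrt(n)) * sum(U_i), with
# U_i = (X_i - m_x)/sigma_x and X_i ~ Exp(1) (m_x = sigma_x = 1), versus exp(-v^2/2).
random.seed(7)
trials = 50_000
v = 1.5

def psi_Y(v, n):
    acc = 0j
    for _ in range(trials):
        y = sum(random.expovariate(1.0) - 1.0 for _ in range(n)) / math.sqrt(n)
        acc += cmath.exp(1j * v * y)
    return acc / trials

for n in (1, 5, 30):
    est = psi_Y(v, n)
    print(f"n = {n:>2}: psi_Y({v}) ~ {est.real:+.3f}{est.imag:+.3f}j,"
          f"  exp(-v^2/2) = {math.exp(-v * v / 2):.3f}")
```

As n grows, the estimated characteristic function approaches the real value e^(−v²/2), in agreement with (2.1.206).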

Although we assumed that the random variables in the sum are identically distributed, this assumption can be weakened, provided that certain additional restrictions are imposed on the properties of the summed random variables. There is one version of the theorem, for example, in which the assumption of identically distributed random variables is abandoned in favor of a condition imposed on the third absolute moments of the random variables in the sum. For a discussion of this and other versions of the central limit theorem, the reader is referred to Cramér (1946).

The central limit theorem is a group of theorems devoted to establishing the conditions under which the normal distribution law arises, and whose violation leads to a distribution other than the normal one. Different forms of the central limit theorem differ from one another in the conditions imposed on the distributions of the random terms forming the sum. Let us prove one of the simplest forms of this theorem, namely, the central limit theorem for independent identically distributed terms.

Consider a sequence of independent identically distributed random variables ξ1, ξ2, … with mathematical expectation m = Mξk. Assume also that the variance σ² = Dξk exists. Introduce the notation Sn = ξ1 + ξ2 + … + ξn. The law of large numbers for this sequence can be represented in the following form:

Sn/n → m as n → ∞,

where the convergence can be understood both in the sense of convergence in probability (the weak law of large numbers) and in the sense of convergence with probability one (the strong law of large numbers).

Theorem (central limit theorem for independent identically distributed random variables). Let ξ1, ξ2, … be a sequence of independent identically distributed random variables with Mξk = m and Dξk = σ², 0 < σ² < ∞. Then the convergence

P((Sn − nm)/(σ√n) ≤ x) → Ф(x),  n → ∞,

holds uniformly in x (−∞ < x < ∞), where Ф(x) is the standard normal distribution function (with parameters 0 and 1):

Ф(x) = (1/√(2π)) ∫_{−∞}^{x} e^(−t²/2) dt.

If such convergence holds, the sequence (ξk) is called asymptotically normal.

Theorems of Lyapunov and Lindeberg

Consider now the case when the random variables are independent but have different distributions.

Theorem (Lindeberg). Let ξ1, ξ2, … be a sequence of independent random variables with mathematical expectations mk = Mξk and finite variances σk² = Dξk. If this sequence satisfies the Lindeberg condition: for every ε > 0

(1/Bn²) Σ_{k=1}^{n} M[(ξk − mk)² I{|ξk − mk| > εBn}] → 0 as n → ∞,

where Bn² = σ1² + … + σn² and I{·} denotes the indicator of the event in braces, then the central limit theorem holds for it.

Since it is difficult to directly verify the Lindeberg condition, we consider some other condition under which the central limit theorem holds, namely the condition of the Lyapunov theorem.

Theorem (Lyapunov). If for a sequence of independent random variables ξ1, ξ2, … the Lyapunov condition is satisfied: for some δ > 0

(1/Bn^(2+δ)) Σ_{k=1}^{n} M|ξk − mk|^(2+δ) → 0 as n → ∞,

then the sequence is asymptotically normal, i.e. the central limit theorem holds.

The fulfillment of the Lyapunov condition implies the fulfillment of the Lindeberg condition, and the central limit theorem follows from it.
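As a closing illustration, the Lyapunov fraction with δ = 1 can be evaluated numerically for a concrete non-identically distributed sequence, here ξk uniform on [−√k, √k] (an arbitrary choice); its decay to zero shows that this sequence is asymptotically normal:

```python
# Lyapunov fraction with delta = 1 for xi_k uniform on [-sqrt(k), sqrt(k)]:
# D(xi_k) = k/3 and M|xi_k|^3 = k^1.5 / 4 (m_k = 0), so
# L_n = (sum of third absolute moments) / B_n^3.
for n in (10, 100, 1_000, 10_000):
    B2 = sum(k / 3.0 for k in range(1, n + 1))               # B_n^2
    third_moments = sum(k ** 1.5 / 4.0 for k in range(1, n + 1))
    L = third_moments / B2 ** 1.5
    print(f"n = {n:>6}: L_n = {L:.4f}")
```

The printed values decrease roughly as 1/√n, so the Lyapunov condition is satisfied and, by the theorem above, the normalized sums of this sequence converge to the normal law.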