
Types of convergence of sequences of random variables.

CONVERGENCE IN PROBABILITY is the convergence of a sequence of random variables $X_1, X_2, \ldots, X_n, \ldots$, defined on some probability space, to a random variable $X$, defined as follows: $X_n \to X$ in probability if for any $\varepsilon > 0$
$$P\{|X_n - X| > \varepsilon\} \to 0, \quad n \to \infty.$$
In mathematical analysis this convergence is called convergence in measure. Convergence in probability implies convergence in distribution.
V. I. Bityutskov.

Mathematical Encyclopedia. Moscow: Soviet Encyclopedia, ed. I. M. Vinogradov, 1977-1985.



In probability theory, in contrast to mathematical analysis, several different types of convergence of sequences of functions (random variables) and of their distributions are considered. This is because in probability theory it is customary to neglect unlikely events, and this can be done in different ways. Pointwise convergence of random variables, almost sure convergence, and convergence of probability measures in variation have already been defined. We now give two more important definitions of convergence of random variables, convergence in probability and convergence in rms, and one definition of convergence of distributions, weak convergence.

Convergence in Probability

A sequence of random variables $X_n$ converges to a random variable $X$ in probability if for any $\varepsilon > 0$

$$P(|X_n - X| > \varepsilon) \to 0, \quad n \to \infty.$$

Convergence in probability is denoted as $X_n \xrightarrow{P} X$.

Convergence in RMS

A sequence of random variables $X_n$ converges to a random variable $X$ in rms (in $L^2$) if

$$\mathsf{E}(X_n - X)^2 \to 0, \quad n \to \infty.$$

Convergence in rms is denoted as $X_n \xrightarrow{L^2} X$.

Weak convergence of distributions

A sequence of random variables $X_n$ converges to a random variable $X$ weakly (in distribution) if

$$F_{X_n}(x) \to F_X(x), \quad n \to \infty,$$

at all points of continuity of the function $F_X$. Weak convergence is denoted as $X_n \Rightarrow X$.

The main difference between weak convergence and the other types of convergence is that the random variables are not required to be defined on the same probability space, since the convergence condition is formulated in terms of their distribution functions alone.
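As a quick illustration of these three definitions, here is a minimal Monte Carlo sketch (not part of the original text; the toy sequence $X_n = X + Z/n$, the sample size, and all names are illustrative assumptions) that estimates the deviation probability, the mean square error, and the distance between distribution functions:

```python
# Illustrative sketch: for X_n = X + Z/n (X, Z independent standard normals)
# all three convergences hold; we estimate each quantity by Monte Carlo.
import numpy as np

rng = np.random.default_rng(0)
m = 100_000                       # Monte Carlo sample size (arbitrary choice)
X = rng.standard_normal(m)
Z = rng.standard_normal(m)
eps = 0.1
grid = np.linspace(-3, 3, 61)     # grid for comparing distribution functions

for n in (1, 10, 100):
    Xn = X + Z / n
    p_dev = np.mean(np.abs(Xn - X) > eps)          # ~ P(|X_n - X| > eps)
    mse = np.mean((Xn - X) ** 2)                   # ~ E(X_n - X)^2
    F_n = np.mean(Xn[:, None] <= grid, axis=0)     # empirical CDF of X_n
    F = np.mean(X[:, None] <= grid, axis=0)        # empirical CDF of X
    print(f"n={n:3d}  P(|Xn-X|>eps)~{p_dev:.3f}  E(Xn-X)^2~{mse:.5f}"
          f"  sup|Fn-F|~{np.max(np.abs(F_n - F)):.4f}")
```

All three printed quantities shrink as $n$ grows, matching the scheme of implications discussed next.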

The relationship of different types of convergence

The relationship between the different types of convergence is as follows: almost sure convergence implies convergence in probability; convergence in rms also implies convergence in probability; and convergence in probability implies weak convergence.

Note that, generally speaking, none of these implications can be reversed; no two of these types of convergence are equivalent. Of practical importance are mainly weak convergence and convergence in mean square, because they allow approximate calculation of probabilities and mathematical expectations and the replacement of one mathematical model by another. The other types of convergence are used mainly in proving weak convergence or in studying qualitative properties of a model. We therefore study the relationship between these two types of convergence in more detail in what follows.

Let us first show that convergence in probability implies weak convergence.

Theorem (P->W). If $X_n \xrightarrow{P} X$, then $X_n \Rightarrow X$.

Proof.

Let $x$ be a point of continuity of the function $F_X$, and let $\varepsilon > 0$. From the inclusions
$$\{X \le x - \varepsilon\} \subseteq \{X_n \le x\} \cup \{|X_n - X| > \varepsilon\}, \qquad \{X_n \le x\} \subseteq \{X \le x + \varepsilon\} \cup \{|X_n - X| > \varepsilon\}$$
we obtain
$$F_X(x - \varepsilon) - P(|X_n - X| > \varepsilon) \le F_{X_n}(x) \le F_X(x + \varepsilon) + P(|X_n - X| > \varepsilon).$$

For small $\varepsilon$ and large $n$, the left and right sides of this inequality differ arbitrarily little from $F_X(x)$, which proves the theorem.

The proof is complete.

The converse theorem is true under an additional condition: the limiting random variable must be a constant.

Theorem (W->P). If $X_n \Rightarrow c$, where $c$ is a constant, then $X_n \xrightarrow{P} c$.

Proof.

The distribution function of the constant $c$ equals $0$ for $x < c$ and $1$ for $x \ge c$; it is continuous everywhere except at the point $x = c$. Hence, for any $\varepsilon > 0$,
$$P(|X_n - c| > \varepsilon) \le F_{X_n}(c - \varepsilon) + 1 - F_{X_n}(c + \varepsilon) \to 0, \quad n \to \infty.$$

The proof is complete.

Let us show that convergence in rms implies convergence in probability.

Theorem (L^2->P). If $X_n \xrightarrow{L^2} X$, then $X_n \xrightarrow{P} X$.

Proof.

We use the Markov inequality:
$$P(|X_n - X| > \varepsilon) = P\big((X_n - X)^2 > \varepsilon^2\big) \le \frac{\mathsf{E}(X_n - X)^2}{\varepsilon^2} \to 0, \quad n \to \infty.$$

The proof is complete.
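A small numeric check of the inequality used in this proof (an illustration, not part of the original text; the normal model for $X_n - X$ is an assumption chosen for convenience):

```python
# Compare the Markov/Chebyshev bound E(X_n - X)^2 / eps^2 with the exact
# deviation probability when X_n - X ~ N(0, 1/n).
from math import erf, sqrt

eps = 0.5
for n in (1, 4, 16, 64):
    var = 1.0 / n                          # E(X_n - X)^2 in this toy model
    bound = var / eps ** 2                 # right-hand side of the inequality
    exact = 1 - erf(eps / sqrt(2 * var))   # P(|N(0, var)| > eps)
    print(f"n={n:3d}  bound={bound:.4f}  exact={exact:.6f}")
```

Both columns tend to zero, and the bound always dominates the exact probability, as the inequality requires.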

The following theorem gives an example of applying the previous theorem to prove the convergence of the relative frequency of an event to its probability in the Bernoulli scheme.

Law of large numbers in Bernoulli form

Let $\mu_n$ be the number of successes in $n$ trials of the Bernoulli scheme with success probability $p$. Then
$$\frac{\mu_n}{n} \xrightarrow{P} p, \quad n \to \infty.$$

Proof.

Since $\mathsf{E}(\mu_n/n) = p$ and $\mathsf{E}(\mu_n/n - p)^2 = \mathsf{D}(\mu_n/n) = p(1-p)/n \to 0$, the relative frequency converges to $p$ in mean square, and hence, by the previous theorem, in probability.

The proof is complete.
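A simulation sketch of this law (illustrative; the values of $p$, $\varepsilon$ and the number of series are arbitrary assumptions):

```python
# Estimate P(|mu_n/n - p| > eps) in the Bernoulli scheme from many
# independent series of trials; the estimate should fall toward zero.
import numpy as np

rng = np.random.default_rng(1)
p, eps = 0.3, 0.02
trials = rng.random((2000, 10_000)) < p       # 2000 series of 10^4 trials
for n in (100, 1_000, 10_000):
    freq = trials[:, :n].mean(axis=1)         # relative frequency mu_n/n
    print(f"n={n:6d}  P(|mu_n/n - p| > eps) ~ {(np.abs(freq - p) > eps).mean():.3f}")
```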

Thus, to prove weak convergence, it suffices to prove convergence in probability or in mean square.

When proving theorems on weak convergence, the following important theorem is also used.

Theorem (Helly-Bray). Let $X_n \Rightarrow X$ and let $f$ be a continuous bounded function. Then
$$\mathsf{E}f(X_n) \to \mathsf{E}f(X), \quad n \to \infty.$$

Proof.

Any function continuous on the entire line can be approximated arbitrarily closely by a linear combination of step functions on any interval $[-A, A)$, $A > 0$. Choose $A$ so that the points $-A$, $A$ and the partition points are points of continuity of the distribution function $F_X$. Then the integrals
$$\int_{-A}^{A} f\,dF_{X_n} \quad \text{and} \quad \int_{-A}^{A} f\,dF_X$$
are expressed in the same way in terms of the values of the distribution functions $F_{X_n}$ and $F_X$ at the partition points, and so can be made arbitrarily close by choosing $n$ sufficiently large. Since the function $f$ is bounded, by choosing $A$ sufficiently large one can make the integrals over $\{|x| > A\}$ arbitrarily small.

The theorem has been proven.
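The statement can be watched numerically. In the sketch below (an illustration with assumed data, not from the original text) $X_n$ is uniform on the grid $\{0, 1/n, \ldots, (n-1)/n\}$, so $X_n \Rightarrow U(0,1)$, and $f(x) = \arctan x$ is bounded and continuous:

```python
# E f(X_n) for X_n uniform on {0, 1/n, ..., (n-1)/n} approaches
# E f(U) = integral of arctan(x) over [0,1] = pi/4 - ln(2)/2 ~ 0.438825.
import numpy as np

f = np.arctan
limit = np.pi / 4 - np.log(2) / 2
for n in (5, 50, 500):
    grid = np.arange(n) / n                 # support points of X_n
    print(f"n={n:4d}  E f(X_n) = {f(grid).mean():.6f}  (limit {limit:.6f})")
```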

The converse theorem is also true.

Theorem (inverse Helly-Bray theorem). Let $\mathsf{E}f(X_n) \to \mathsf{E}f(X)$ for any continuous bounded function $f$. Then $X_n \Rightarrow X$.

Proof.

The idea of the proof is similar to that of the previous theorem and is based on the possibility of approximating the step function $\mathbf{1}_{(-\infty,\,x]}$ by continuous functions. Indeed, again choosing appropriate points of continuity and approximating the indicator from above and from below by continuous bounded functions, we see that the integrals
$$\int f\,dF_{X_n} \quad \text{and} \quad \int f\,dF_X,$$
which are close to each other, can be made arbitrarily close to $F_{X_n}(x)$ and $F_X(x)$, respectively.

The theorem has been proven.

Taken together, the last two theorems give necessary and sufficient conditions for weak convergence in terms of the convergence of mathematical expectations of continuous bounded functions.

Theorem (f(W)). Let $X_n \Rightarrow X$ and let $f$ be a continuous function. Then
$$f(X_n) \Rightarrow f(X).$$

Proof.

Since the composition of a continuous function with a bounded continuous function is again a bounded continuous function, the proof of this theorem follows directly from the Helly-Bray theorems.

The theorem has been proven.

It is easy to show that the following theorem is also true.

Theorem (f(P)). Let $X_n \xrightarrow{P} X$ and let $f$ be a continuous function. Then
$$f(X_n) \xrightarrow{P} f(X).$$

Prove this and the next two theorems on your own as exercises.

Theorem (W+P->W). If $X_n \Rightarrow X$ and $Y_n \xrightarrow{P} 0$, then $X_n + Y_n \Rightarrow X$.

Theorem (W*P->W). If $X_n \Rightarrow X$ and $Y_n \xrightarrow{P} c$, where $c$ is a constant, then $X_n Y_n \Rightarrow cX$.

In what follows we shall have to operate extensively with derivatives and integrals of random processes. Both operations, differentiation and integration, presuppose, as is well known, the convergence of a certain sequence of quantities to a limit. But for random variables, which are specified not deterministically but by their probability distributions, the concept of convergence to a limit (and hence the concepts of continuity, differentiability, and integrability of random functions) cannot have the same meaning as in analysis. For a sequence of random variables only a probabilistic definition of convergence to a limit is possible, which, incidentally, opens up more diverse possibilities in the choice of the definition itself. Probabilistic convergence is also essential for the so-called ergodic properties of random functions, to which we turn in the next section.

Let us start, for simplicity, by considering different types of convergence of a sequence of random variables $\xi_1, \xi_2, \ldots, \xi_N, \ldots$ to a (non-random) number $a$.

One type of probabilistic convergence is convergence in the mean square (m.s.), understood as the vanishing of the mean square deviation from the number $a$:
$$\lim_{N \to \infty} \mathsf{E}\{(\xi_N - a)^2\} = 0, \tag{19.1}$$
which is written as
$$\operatorname*{l.i.m.}_{N \to \infty} \xi_N = a.$$
The designation l.i.m. is made up of the initial letters of the English name of this limit (limit in the mean square). The use of this type of convergence is most expedient in cases where one deals with quadratic combinations of random variables (in particular, those having the meaning of energy).

Equality (19.1) obviously presupposes the finiteness of the mean square $\mathsf{E}\{\xi_N^2\}$, and hence of the mean value $\langle \xi_N \rangle$. Subtracting and adding $\langle \xi_N \rangle$ in the brackets in (19.1), we rewrite this equality differently:
$$\lim_{N \to \infty}\left[(\langle \xi_N \rangle - a)^2 + \mathsf{D}\{\xi_N\}\right] = 0.$$
But the limit of a sum of two non-negative quantities can equal zero only if the limits of both terms equal zero, i.e.
$$\lim_{N \to \infty} \langle \xi_N \rangle = a, \qquad \lim_{N \to \infty} \mathsf{D}\{\xi_N\} = 0.$$
Thus, $a$ is the limit of the sequence of mean values, and the limit of the variances is zero.
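This decomposition is easy to confirm numerically (a sketch with arbitrary illustrative parameters, not from the original text):

```python
# Check E{(xi - a)^2} = (<xi> - a)^2 + D{xi} on a sample:
# with mean 3, variance 0.25 and a = 1 both sides should be close to 4.25.
import numpy as np

rng = np.random.default_rng(2)
a = 1.0
xi = 3.0 + 0.5 * rng.standard_normal(1_000_000)
lhs = np.mean((xi - a) ** 2)
rhs = (xi.mean() - a) ** 2 + xi.var()
print(f"E(xi-a)^2 = {lhs:.4f},  (mean-a)^2 + var = {rhs:.4f}")
```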

Another kind of probabilistic convergence to $a$, convergence in probability (in prob.), is defined as follows:
$$\lim_{N \to \infty} P\{|\xi_N - a| \ge \varepsilon\} = 0, \tag{19.2}$$
where $\varepsilon$ is, as usual, an arbitrarily small positive number. In this case one writes
$$\xi_N \xrightarrow{P} a, \quad N \to \infty.$$
Equality (19.2) means that the probability of $\xi_N$ falling anywhere outside an arbitrarily narrow interval $(a - \varepsilon, a + \varepsilon)$ vanishes in the limit. In view of the arbitrary smallness of $\varepsilon$, this in turn means that the probability density of the random variable $\xi_N$ goes over into $\delta(x - a)$ as $N \to \infty$. However, it by no means follows from this that $a$ is the limit of the sequence of means $\langle \xi_N \rangle$ or that $\mathsf{D}\{\xi_N\}$ tends to zero. Moreover, the mean and variance can increase indefinitely with increasing $N$, or even be infinite for every $N$. Let, for example, $\xi_N$ be non-negative and distributed according to the Cauchy law:
$$w_N(x) = \frac{2}{\pi}\,\frac{N}{1 + N^2 x^2}, \qquad x \ge 0.$$

For any $x > 0$ the limit of $w_N(x)$ as $N \to \infty$ is equal to zero, while at $x = 0$ the limit does not exist. However, the normalization condition is always satisfied:
$$\int_0^\infty w_N(x)\,dx = 1,$$
so that $w_N(x)$ tends to $\delta(x)$, i.e. $\xi_N \xrightarrow{P} 0$. At the same time, it is easy to verify that $\langle \xi_N \rangle$ and $\mathsf{D}\{\xi_N\}$ are infinite for every $N$.
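A simulation view of this counterexample (illustrative; half-Cauchy samples are generated as $\tan(\pi U/2)/N$ for uniform $U$, an assumption consistent with the density above):

```python
# xi_N is half-Cauchy with scale 1/N: deviation probabilities vanish,
# but sample means stay erratic because the theoretical mean is infinite.
import numpy as np

rng = np.random.default_rng(3)
eps, m = 0.1, 200_000
for N in (1, 10, 100):
    xi = np.tan(0.5 * np.pi * rng.random(m)) / N
    print(f"N={N:4d}  P(xi_N > eps) ~ {(xi > eps).mean():.4f}"
          f"  sample mean = {xi.mean():10.2f}")
```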

Convergence in probability is often called convergence in the sense of the law of large numbers. Random variables $\xi_N$ are said to be limit constant if there exists a sequence of constants $a_N$ such that
$$\lim_{N \to \infty} P\{|\xi_N - a_N| \ge \varepsilon\} = 0.$$
If all the $a_N$ are the same (equal to $a$), this equality goes over into (19.2); that is, it means that $\xi_N$ converges in probability to $a$, or that the difference $\xi_N - a$ converges in probability to zero.

Convergence in probability should be clearly distinguished from ordinary convergence
$$\lim_{N \to \infty} \xi_N = a.$$
Indeed, nothing can be proved mathematically about the behavior of empirical numbers, the measured values of $\xi_N$. Only statements concerning theoretical concepts, including the concept of probability as defined by the original axioms, can be proved. In convergence in probability we are not asserting that $\xi_N \to a$ as $N \to \infty$, but that the probability of the event $|\xi_N - a| < \varepsilon$ tends to unity. The connection of this statement with experiment lies in the "axiom of measurement", according to which the probability is measured by the relative frequency of occurrence of the random event in question in a sufficiently long series of trials, in a sufficiently extensive ensemble of systems, and so on.

For a better understanding of this fundamental aspect of the issue, let us dwell on some limit theorems of probability theory, united under the general name of the law of large numbers, namely, on theorems related to the case when (19.2) contains the arithmetic mean of $N$ random variables:
$$\eta_N = \frac{1}{N}\sum_{n=1}^{N} \xi_n. \tag{19.3}$$

We perform a series of $N$ trials, take their results $\xi_1, \ldots, \xi_N$, and calculate the mean (19.3). Then we look to see whether the event (call it $B_N$)
$$|\eta_N - a| < \varepsilon$$
has occurred. In order to measure the probability of the event $B_N$, we must carry out a very large number $M$ of series of $N$ trials each; we must have a collective of such series. The law of large numbers (19.2) states that the longer the series forming the collective (the larger $N$), the closer $P\{B_N\}$ is to unity, i.e., by the "axiom of measurement", the greater the proportion of series in which $B_N$ occurs (in the limit, practically all):
$$\lim_{N \to \infty} P\{B_N\} = \lim_{N \to \infty} P\{|\eta_N - a| < \varepsilon\} = 1.$$

Thus, this is a quite meaningful statement, but it becomes such only when the mathematical concept of probability is explicitly matched with the empirical concept of relative frequency. Without this, the law of large numbers remains merely a theorem, logically following from a certain system of axioms for the quantity $P$, defined as a completely additive, non-negative function of a domain, normalized to unity.

Often this question, which we have already touched upon in § 1, is presented rather confusedly in the educational literature, without a clear indication that the "axiom of measurement", which connects the concepts of probability theory with real phenomena, with experiment and practice, is not contained in the mathematical theory as such. One can come across statements that the foundation for the success of probability theory in various problems of natural science and technology is laid precisely in the law of large numbers. If that were the case, it would mean that the foundation of practical success is a logical consequence of certain abstract axioms, and that these mathematical axioms themselves prescribe how empirical quantities must behave.

In principle, it would be possible to proceed from other axioms and construct another theory of probability, whose conclusions, while different from those of the existing theory, would be just as logically flawless and just as non-binding for real phenomena. The situation here is the same as with the various possible geometries. But as soon as a mathematical theory is supplemented by definite methods of measuring the quantities with which it operates, and thus becomes a physical theory, the situation changes. The correctness or incorrectness of the theory then ceases to be only a matter of its logical consistency and becomes a matter of its correspondence to real things and phenomena. The question of the truth of the axioms themselves acquires content, since it can now be subjected to experimental and, in general, practical verification.

However, even before such a verification, an internal correspondence between the two parts of a physical theory is necessary: the established methods for measuring quantities must not conflict with the equations to which these quantities are subjected by the mathematical part of the theory. For example, Newton's equations of motion assume that force is a vector and are therefore incompatible with a way of measuring force that would characterize it only by absolute magnitude. Perhaps in reality force is not a vector but, say, a tensor; but that is another question, concerning how well the physical theory as a whole reflects objective reality. We are speaking now only of the fact that a contradiction between the mathematical and the measurement parts of a physical theory makes it untenable even before any verification of its consequences in experiment.

From this point of view, the law of large numbers differs from other, logically equivalent, theorems of probability theory only in that, as will be seen from what follows, it shows especially clearly the compatibility of the mathematical definition of probability with the frequency method of measuring it. It shows that the frequency "axiom of measurement" does not contradict the mathematical theory; but the latter, of course, does not and cannot replace this "axiom".

The proof of various theorems expressing the law of large numbers usually uses Chebyshev's inequality, proved in his dissertation in 1846. Let a random variable $\xi$ have a finite variance $\mathsf{D}\{\xi\}$. Chebyshev's inequality

states that, for any $\varepsilon > 0$,
$$P\{|\xi - \langle \xi \rangle| \ge \varepsilon\} \le \frac{\mathsf{D}\{\xi\}}{\varepsilon^2}. \tag{19.4}$$

If, in particular, $\varepsilon = k\sqrt{\mathsf{D}\{\xi\}}$, then inequality (19.4) takes the form
$$P\left\{|\xi - \langle \xi \rangle| \ge k\sqrt{\mathsf{D}\{\xi\}}\right\} \le \frac{1}{k^2}. \tag{19.5}$$

Although inequalities (19.4) and (19.5) give only a very rough estimate of the probability (a more accurate estimate can be obtained if the distribution law is known), they are very useful and important in theoretical constructions.
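The roughness of the estimate is easy to see numerically (an illustrative sketch; the exponential distribution is an arbitrary choice, not from the original text):

```python
# Compare the Chebyshev bound 1/k^2 for P(|xi - <xi>| >= k*sigma) with the
# actual probability for an exponential random variable (mean 1, sigma 1).
import numpy as np

rng = np.random.default_rng(4)
xi = rng.exponential(1.0, 1_000_000)
mu, sigma = xi.mean(), xi.std()
for k in (2, 3, 5):
    actual = np.mean(np.abs(xi - mu) >= k * sigma)
    print(f"k={k}  Chebyshev bound = {1 / k**2:.4f}   actual ~ {actual:.5f}")
```

The bound overstates the true probability by an order of magnitude here, yet it needs no knowledge of the distribution beyond its variance.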

When Chebyshev's inequality is applied to the arithmetic mean (19.3) of $N$ random variables, inequality (19.5) allows us to prove Chebyshev's theorem, which is a fairly general expression of the law of large numbers. Namely, if $\xi_1, \xi_2, \ldots$ is a sequence of pairwise independent random variables with uniformly bounded variances ($\mathsf{D}\{\xi_n\} \le C$), then
$$\lim_{N \to \infty} P\left\{\left|\frac{1}{N}\sum_{n=1}^{N}\xi_n - \frac{1}{N}\sum_{n=1}^{N}\langle \xi_n \rangle\right| < \varepsilon\right\} = 1. \tag{19.6}$$

Indeed,
$$\mathsf{D}\{\eta_N\} = \frac{1}{N^2}\sum_{n=1}^{N} \mathsf{D}\{\xi_n\} \le \frac{C}{N}.$$

According to Chebyshev's inequality,
$$P\{|\eta_N - \langle \eta_N \rangle| \ge \varepsilon\} \le \frac{C}{N\varepsilon^2},$$

whence, passing to the opposite event, theorem (19.6) follows, i.e., the convergence of $\eta_N - \langle \eta_N \rangle$ in probability to zero.

A special case of Chebyshev's theorem is Poisson's theorem. Let $\xi_n$ be indicator variables recording the outcome of the $n$-th trial: $\xi_n = 1$ or $0$ according to whether event $A$ occurs or not in the trial, in which $P\{\xi_n = 1\} = p_n$. Then
$$\langle \xi_n \rangle = p_n, \qquad \mathsf{D}\{\xi_n\} = p_n(1 - p_n) \le \tfrac{1}{4},$$
and Chebyshev's theorem gives
$$\frac{1}{N}\sum_{n=1}^{N}\xi_n - \frac{1}{N}\sum_{n=1}^{N}p_n \xrightarrow{P} 0, \quad N \to \infty. \tag{19.7}$$

This is Poisson's theorem. A still more special case is $p_n \equiv p$. Then we arrive at Bernoulli's theorem, one of the first formulations of the law of large numbers:
$$\frac{\mu_N}{N} \xrightarrow{P} p, \quad N \to \infty, \tag{19.8}$$
where $\mu_N = \sum_{n=1}^{N} \xi_n$ is the number of occurrences of $A$ in $N$ trials.

Let us dwell on this simplest form of the law. Theorem (19.8) shows that as the number of trials $N$ grows, the relative frequency $\mu_N/N$ of the event $A$, an empirical quantity, converges in probability to $p$, the probability of the event $A$. If this were not so, it would be pointless to measure the probability by the relative frequency. But since it is so, the frequency method of measuring probabilities, both $p$ (by the relative frequency of occurrence of $A$ in a series of $N$ trials) and $P\{B_N\}$ (by the relative frequency of occurrence of $B_N$ in a collective of $M$ series of trials), can be adopted as a supplement to the mathematical theory, because it does not contradict it. After that one can both pose and test by experiment the question of whether the resulting physical theory reflects real statistical regularities.
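A sketch of theorem (19.7) with varying success probabilities (the particular sequence $p_n$ below is an arbitrary illustration, not from the original text):

```python
# Poisson's form of the law of large numbers: the relative frequency tracks
# the running average of the individual probabilities p_n.
import numpy as np

rng = np.random.default_rng(5)
N = 100_000
p = 0.5 + 0.4 * np.sin(np.arange(1, N + 1))   # arbitrary varying p_n in (0,1)
x = rng.random(N) < p                          # indicator outcomes xi_n
for n in (100, 10_000, 100_000):
    print(f"n={n:6d}  freq - mean(p_1..p_n) = {x[:n].mean() - p[:n].mean():+.5f}")
```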

Curiously, for theorem (19.8) to hold for all values of $p$, i.e., for the convergence in probability
$$\frac{\mu_N}{N} \xrightarrow{P} p,$$
it suffices to require that this convergence take place only for unlikely events (the relative frequency of unlikely events must be small).

We now write Chebyshev's theorem for the case when all the $\langle \xi_n \rangle$ are the same and equal to $a$. Then
$$\langle \eta_N \rangle = a,$$
and the theorem becomes
$$\eta_N = \frac{1}{N}\sum_{n=1}^{N}\xi_n \xrightarrow{P} a, \quad N \to \infty,$$
which is the basis of the arithmetic-mean rule in measurements. Individual values $\xi_n$ can deviate strongly from $a$, but with probability arbitrarily close to unity $\eta_N$ is close to $a$ for large $N$. This happens because in calculating the mean value the random deviations of the individual terms compensate one another, and in the vast majority of cases the total deviation turns out to be very small.

The deviations from $a$ may be random measurement errors. But if the reading accuracy of the instrument itself is limited, i.e., there is a systematic error associated with the scale division value, then the accuracy of $\eta_N$ is limited by the same amount for any $N$. It is therefore pointless, appealing to the law of large numbers, to try to obtain the measured value with an error smaller than the systematic one. It is a rather widespread misconception that taking the arithmetic mean allows one to surpass this lower bound on accuracy and obtain, say, with a panel ammeter, a current reading accurate to microamperes.

Another situation is also possible: the measured quantity itself can be random (a noise current, etc.). Then we can be sure that $\eta_N \xrightarrow{P} \langle \xi \rangle$ as $N \to \infty$, i.e., the arithmetic mean tends to the mathematical expectation of the random variable.

The condition of mutual independence of the results of measuring a random variable requires, generally speaking, that the measurements be made at sufficiently large time intervals. However, for the validity of the law of large numbers this independence condition itself is not necessary, since Chebyshev's inequality requires only that $\mathsf{D}\{\eta_N\} \to 0$ as $N \to \infty$. We shall not dwell on more general theorems or on the necessary and sufficient conditions under which the law of large numbers holds for the arithmetic mean, since those conditions concern the quantity $\eta_N$ itself and are therefore of less practical interest than narrower conditions relating to the individual terms $\xi_n$.

In 1909 E. Borel (then, in more general form, F. P. Cantelli, and later A. N. Kolmogorov) proved a statement stronger than the law of large numbers. According to Bernoulli's theorem,
$$\lim_{N \to \infty} P\left\{\left|\frac{\mu_N}{N} - p\right| < \varepsilon\right\} = 1.$$

According to Borel (the strong law of large numbers),
$$P\left\{\lim_{N \to \infty} \frac{\mu_N}{N} = p\right\} = 1, \tag{19.9}$$

i.e., with certainty or, as one says, "almost surely", the relative frequency has the probability as its limit. This is an even stronger basis for measuring probability by relative frequency.

Based on (19.9), one can introduce another type of probabilistic convergence, convergence in the sense of the strong law of large numbers, which is also called convergence with probability 1 or almost sure convergence:
$$P\left\{\lim_{N \to \infty} \xi_N = a\right\} = 1. \tag{19.10}$$

Briefly, this can be written as
$$\xi_N \xrightarrow{\text{a.s.}} a, \quad N \to \infty.$$

Sometimes, in connection with definition (19.10), confusion arises because it contains the ordinary limit of a sequence of random variables. One gets the impression that we are departing here from the statement made above that the convergence of random variables can have only a probabilistic meaning. But that is precisely what is involved in this case as well. Among the various realizations of the sequence $\xi_1, \xi_2, \ldots$ there are possible realizations that converge to $a$ in the ordinary sense. It can be shown that the set of such realizations has a definite probability. Almost sure convergence means that this probability, i.e., the probability of a random event, is equal to one. In other words, the realizations converging to $a$ in the ordinary sense "almost exhaust" the set of all possible realizations of the sequence. Thus, in (19.10) we do not depart at all from the probabilistic definition of convergence, although now we mean not the limit of a probability (as in convergence in probability), but the probability of a limit.

We present two of the conditions for convergence to $a$ almost surely. One of them is necessary and sufficient:
$$\lim_{N \to \infty} P\left\{\sup_{n \ge N} |\xi_n - a| > \varepsilon\right\} = 0 \quad \text{for any } \varepsilon > 0.$$

However, this condition can hardly ever be verified in practice. Another, stronger, sufficient condition is

that for any $\varepsilon > 0$ the series
$$\sum_{n=1}^{\infty} P\{|\xi_n - a| > \varepsilon\}$$
must converge.
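The necessary and sufficient condition can at least be watched in simulation (an illustrative sketch for the Bernoulli frequency; the horizon, parameters, and number of repetitions are all assumptions, and the supremum is taken over a finite horizon only):

```python
# Estimate P{ sup_{n>=N} |mu_n/n - p| > eps }: for almost sure convergence
# this probability must go to zero as N grows.
import numpy as np

rng = np.random.default_rng(6)
p, eps, horizon, reps = 0.5, 0.05, 10_000, 500
cutoffs = (100, 1_000, 5_000)
exceed = dict.fromkeys(cutoffs, 0)
for _ in range(reps):
    x = rng.random(horizon) < p
    freq = np.cumsum(x) / np.arange(1, horizon + 1)   # running frequencies
    dev = np.abs(freq - p)
    for N in cutoffs:
        exceed[N] += dev[N - 1:].max() > eps          # sup over finite horizon
for N in cutoffs:
    print(f"N={N:5d}  P(sup deviation > eps) ~ {exceed[N] / reps:.3f}")
```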

Other sufficient conditions and, in general, a detailed mathematical discussion of questions concerning probabilistic convergence can be found in the books cited (Chapter 3 and Chapter 1, respectively).

Convergence in mean square entails (by Chebyshev's inequality) convergence in probability; and if all the $\xi_N$ are almost surely uniformly bounded in absolute value, then, conversely, convergence in probability implies convergence in mean square. Almost sure convergence also entails convergence in probability, but not convergence in mean square; at the same time, mean-square convergence does not imply almost sure convergence.


Limit theorems of probability theory

Convergence of sequences of random variables and probability distributions

1.1.1.1 Convergence of random variables

Let there be given a probability space $(\Omega, \mathcal{F}, P)$ and, defined on it, a system of random variables $\xi_1, \xi_2, \ldots$ and a random variable $\xi$. In probability theory the following types of convergence of sequences of random variables are considered.

A sequence of random variables $\xi_n$ converges in probability to a random variable $\xi$ if for any $\varepsilon > 0$
$$P\{|\xi_n - \xi| > \varepsilon\} \to 0, \quad n \to \infty.$$

This type of convergence is denoted $\xi_n \xrightarrow{P} \xi$.

A sequence of random variables $\xi_n$ converges to a random variable $\xi$ with probability 1 (or almost surely) if
$$P\{\omega : \xi_n(\omega) \to \xi(\omega)\} = 1,$$

that is, if $\xi_n(\omega) \to \xi(\omega)$ as $n \to \infty$ for all $\omega$ except, perhaps, those from some set $N \subset \Omega$ of zero probability ($P(N) = 0$). Convergence with probability 1 is denoted $\xi_n \xrightarrow{\text{a.s.}} \xi$. Convergence with probability 1 is convergence almost everywhere with respect to the probability measure.

Note that the set on which $\xi_n$ converges to $\xi$ is an event from the $\sigma$-algebra $\mathcal{F}$, since it can be represented as
$$\{\omega : \xi_n \to \xi\} = \bigcap_{m=1}^{\infty} \bigcup_{N=1}^{\infty} \bigcap_{n \ge N} \left\{\omega : |\xi_n - \xi| \le \tfrac{1}{m}\right\}.$$

Let us formulate some theorems establishing criteria for almost sure convergence.

Theorem 1.1. $\xi_n \xrightarrow{\text{a.s.}} \xi$ if and only if for any $\varepsilon > 0$
$$P\left\{\sup_{k \ge n} |\xi_k - \xi| > \varepsilon\right\} \to 0, \quad n \to \infty,$$

or, which is the same,
$$P\left\{\bigcup_{k \ge n} \{|\xi_k - \xi| > \varepsilon\}\right\} \to 0, \quad n \to \infty. \tag{1.1}$$

Theorem 1.2. If the series
$$\sum_{n=1}^{\infty} P\{|\xi_n - \xi| > \varepsilon\}$$

converges for any $\varepsilon > 0$, then $\xi_n \xrightarrow{\text{a.s.}} \xi$.

It can be shown that almost sure convergence entails convergence in probability (this follows from (1.1)). The converse assertion is not true in general, but the following theorem holds.

Theorem 1.3. If $\xi_n \xrightarrow{P} \xi$, then there exists a subsequence $\{\xi_{n_k}\}$ such that $\xi_{n_k} \xrightarrow{\text{a.s.}} \xi$ as $k \to \infty$.

The connection between almost sure convergence and the convergence of mathematical expectations is established by the following theorems.

Theorem 1.4 (Levi, on monotone convergence). Let there be a monotone sequence of non-negative random variables $0 \le \xi_1 \le \xi_2 \le \ldots$ having finite mathematical expectations bounded by one value: $\mathsf{E}\xi_n \le C$. Then the sequence $\xi_n$ converges with probability 1 to some random variable $\xi$ with $\mathsf{E}\xi \le C$, and
$$\mathsf{E}\xi_n \to \mathsf{E}\xi, \quad n \to \infty.$$

Theorem 1.5 (Lebesgue, on dominated convergence). Let $\xi_n \xrightarrow{\text{a.s.}} \xi$ and $|\xi_n| \le \eta$ for all $n$, where $\eta$ is a non-negative random variable with finite mathematical expectation. Then the random variable $\xi$ also has a finite mathematical expectation and
$$\mathsf{E}\xi_n \to \mathsf{E}\xi, \quad n \to \infty.$$

A sequence of random variables $\xi_n$ converges to a random variable $\xi$ in mean of order $p$ if
$$\mathsf{E}|\xi_n - \xi|^p \to 0, \quad n \to \infty.$$

We denote such convergence $\xi_n \xrightarrow{L^p} \xi$. When $p = 2$ one speaks of convergence in mean square and writes $\xi_n \xrightarrow{L^2} \xi$. By virtue of the generalized Chebyshev inequality, convergence in mean of order $p$ implies convergence in probability. Convergence in mean of order $p$ follows neither from convergence in probability nor, all the more, from almost sure convergence. Thus, convergence in probability is the weakest of the three types of convergence we have considered.

A sequence $\xi_n$ is said to be fundamental in probability (almost surely, in mean of order $p$) if for any $\varepsilon > 0$
$$P\{|\xi_n - \xi_m| > \varepsilon\} \to 0 \quad \text{as } n, m \to \infty$$
(and analogously for the other two senses).

Theorem 1.6 (Cauchy convergence criterion). For a sequence $\xi_n$ to converge in the corresponding sense (in probability, almost surely, in mean of order $p$) it is necessary and sufficient that it be fundamental in that sense.

1.1.1.2 Weak convergence of distributions

It is said that the probability distributions of random variables $\xi_n$ converge weakly to the distribution of a random variable $\xi$ if for any continuous bounded function $f$
$$\mathsf{E}f(\xi_n) \to \mathsf{E}f(\xi), \quad n \to \infty. \tag{1.2}$$

Weak convergence is denoted $\xi_n \Rightarrow \xi$. Note that convergence in probability implies weak convergence. The converse is not true, but for $\xi = c = \text{const}$ weak convergence implies convergence in probability.

Condition (1.2) can be rewritten using the Lebesgue integral with respect to the distributions $P_{\xi_n}$, $P_\xi$ as follows:
$$\int f\,dP_{\xi_n} \to \int f\,dP_{\xi}, \quad n \to \infty.$$

For random variables with probability densities $p_n(x)$ and $p(x)$, weak convergence means that
$$\int f(x)\,p_n(x)\,dx \to \int f(x)\,p(x)\,dx$$
for any continuous bounded function $f$.

If we speak of the distribution functions $F_n$ and $F$ corresponding to $\xi_n$ and $\xi$, then weak convergence means that
$$F_n(x) \to F(x), \quad n \to \infty,$$
at every point $x$ of continuity of $F$.


S. Ya. Shatskikh. Lectures on Probability Theory. Types of convergence of sequences of random variables (draft).

Convergence in probability. We shall assume that all random variables of interest to us are defined on the same probability space $(\Omega, \mathcal{A}, P)$. Let us recall the definition of convergence of random variables in probability, which we met when studying the law of large numbers in the form of P. L. Chebyshev.

Definition 1. A sequence of random variables $\{X_n(\omega)\}$ is said to converge to a random variable $X(\omega)$ in probability if for any $\varepsilon > 0$
$$P\{\omega : |X_n(\omega) - X(\omega)| > \varepsilon\} \to 0, \quad n \to \infty.$$
Notation: $X_n \xrightarrow{P} X$. Convergence in probability is a complete analogue of the convergence in measure considered in courses on functional analysis and the Lebesgue integral.

Theorem. If $X_n \xrightarrow{P} X$ and $X_n \xrightarrow{P} Y$ as $n \to \infty$, then $P\{\omega : X(\omega) = Y(\omega)\} = 1$ (the limit is unique almost surely).

Theorem. If $X_n \xrightarrow{P} X$ and $Y_n \xrightarrow{P} Y$ as $n \to \infty$, then ($a, b$ constants)
1. $aX_n + bY_n \xrightarrow{P} aX + bY$;
2. $X_n Y_n \xrightarrow{P} XY$.

Theorem. For random variables $X(\omega)$, $Y(\omega)$ the functional
$$d(X, Y) = \mathsf{E}\left\{\frac{|X - Y|}{1 + |X - Y|}\right\}$$
defines a metric in the space of random variables (random variables coinciding almost surely are identified). Convergence in this metric is equivalent to convergence in probability.

Example. Take $\Omega = [0, 1]$, $\mathcal{A} = \mathcal{B}([0,1])$ the Borel $\sigma$-algebra of the interval, and $P$ the Lebesgue measure. Set
$$X_n^k(\omega) = \mathbf{1}_{A_n^k}(\omega), \qquad A_n^k = \left[\frac{k-1}{n}, \frac{k}{n}\right], \quad k = 1, \ldots, n,$$
and consider the sequence of random variables
$$X_1^1,\; X_2^1, X_2^2,\; X_3^1, X_3^2, X_3^3, \ldots \tag{6}$$
It is clear that for any $\omega$ the constructed sequence contains infinitely many zeros and infinitely many ones. Therefore at no point $\omega$ does this sequence have a limit, and its set of convergence is empty. On the other hand, for any $\varepsilon \in (0, 1)$
$$P\{\omega : X_n^k(\omega) > \varepsilon\} = \frac{1}{n}, \qquad k = 1, \ldots, n,$$
so sequence (6) converges in probability to (identically) zero. Although almost sure convergence does not follow from convergence in probability, the following theorem nevertheless holds.

Theorem 4 (F. Riesz). If $X_n \xrightarrow{P} X$ as $n \to \infty$, then there exists a subsequence $\{n_k\}$ such that $X_{n_k} \xrightarrow{\text{a.s.}} X$ as $k \to \infty$.

Proof. First we construct the required subsequence $\{n_k\}$. Set $n_0 = 1$ and then, for $k \in \mathbb{N}$, define by induction $n_k$ as the smallest natural number for which
$$n_k > n_{k-1}, \qquad P\left\{\omega : |X_{n_k}(\omega) - X(\omega)| > \frac{1}{k}\right\} < \frac{1}{2^k}.$$
Such a number exists by virtue of convergence in probability. Now we establish the convergence $X_{n_k} \xrightarrow{\text{a.s.}} X$ as $k \to \infty$. In view of the inclusion
$$\left\{\omega : \sup_{k \ge m} |X_{n_k}(\omega) - X(\omega)| > \varepsilon\right\} \subseteq \bigcup_{k=m}^{\infty} \{\omega : |X_{n_k}(\omega) - X(\omega)| > \varepsilon\},$$
for any $\varepsilon > 0$ there is a natural $M_\varepsilon$ with $1/M_\varepsilon < \varepsilon$, and for $m > M_\varepsilon$, by the choice of the $n_k$,
$$P\left\{\sup_{k \ge m} |X_{n_k} - X| > \varepsilon\right\} \le \sum_{k=m}^{\infty} P\left\{|X_{n_k} - X| > \frac{1}{k}\right\} \le \sum_{k=m}^{\infty} \frac{1}{2^k}.$$
Passing to the limit as $m \to \infty$, in view of the convergence of the geometric series we obtain
$$\lim_{m \to \infty} P\left\{\sup_{k \ge m} |X_{n_k} - X| > \varepsilon\right\} = 0,$$
which, by the criterion of almost sure convergence, proves the theorem.
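The metric $d(X, Y) = \mathsf{E}\{|X - Y|/(1 + |X - Y|)\}$ from the theorem above can be estimated by simulation (an illustrative sketch, not from the lectures; the sequence $X_n = X + Z/n$ and all parameters are assumptions):

```python
# d(X_n, X) should decrease to 0 exactly when X_n converges to X in
# probability; here X_n = X + Z/n with X, Z independent standard normals.
import numpy as np

rng = np.random.default_rng(7)
Z = rng.standard_normal(200_000)
for n in (1, 10, 100):
    diff = np.abs(Z) / n                       # |X_n - X|
    print(f"n={n:4d}  d(X_n, X) ~ {(diff / (1 + diff)).mean():.5f}")
```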

The question of the metrizability of almost sure convergence. Let us consider whether almost sure convergence can be metrized. As we shall see, in general the answer is negative: in contrast to convergence in probability, almost sure convergence is non-metrizable. However, some remarks must be made here. There exist probability spaces on which convergence in probability is equivalent to almost sure convergence; on such spaces every sequence of random variables converging in probability necessarily converges almost surely, and almost sure convergence is then metrizable owing to the metrizability of convergence in probability. Otherwise, as the following theorem shows, metrization of almost sure convergence is impossible.

Theorem 5. If on some probability space the notions of convergence with probability one and convergence in probability do not coincide for a given set of random variables, then for this set of random variables there is no metric in which convergence is equivalent to almost sure convergence.

Proof. Assume the contrary, i.e., on the set of random variables there is a metric $\rho(\cdot, \cdot)$ corresponding to almost sure convergence:
$$X_n \xrightarrow{\text{a.s.}} X \iff \rho(X_n, X) \to 0, \quad n \to \infty.$$
Consider a sequence of random variables $\{X_n\}$ which converges to $X$ in probability but not almost surely (an example of such a sequence was discussed above). On the one hand, for some $\delta > 0$ there exists a subsequence $\{n_k\}$ along all members of which
$$\rho(X_{n_k}, X) > \delta. \tag{*}$$
On the other hand, convergence in probability persists: $X_{n_k} \xrightarrow{P} X$ as $k \to \infty$. By virtue of Theorem 4, the subsequence $\{n_k\}$ has a further subsequence $\{n_{k_m}\}$ for which $X_{n_{k_m}} \xrightarrow{\text{a.s.}} X$ as $m \to \infty$. Hence
$$\lim_{m \to \infty} \rho(X_{n_{k_m}}, X) = 0,$$
which contradicts (*). The theorem has been proven.

We now give examples of probability spaces on which convergence in probability is equivalent to almost sure convergence. First, recall the definition of an atomic probability space (see the Encyclopedia of Probability Theory and Mathematical Statistics edited by Yu. V. Prokhorov, and J. Neveu, "Mathematical Foundations of the Calculus of Probability"). Roughly speaking, an atomic probability space consists of a finite or countable set of points, each of which has positive probability; an example of a finite atomic space is the Bernoulli scheme.

Definition. A probability space $(\Omega, \mathcal{A}, P)$ is called atomic if there exists a finite or countable partition of $\Omega$ into atoms $A_i \in \mathcal{A}$:
1. $\Omega = \bigcup_{i \in I} A_i$, $A_i \cap A_j = \varnothing$ ($i \ne j$), the index set $I$ being finite or countable;
2. $P\{A_i\} > 0$ for every $i \in I$;
3. for any $B \in \mathcal{A}$, each atom $A_i$ has one of two properties: either $P\{B \cap A_i\} = 0$ or $P\{B \cap A_i\} = P\{A_i\}$;
4. $P\left\{\bigcup_{i \in I} A_i\right\} = \sum_{i \in I} P\{A_i\} = 1$.

Theorem 6. For an atomic probability space, convergence with probability one is equivalent to convergence in probability.

Proof. On an atomic probability space, convergence in probability implies convergence on each atom. Indeed, if for each $\varepsilon > 0$
$$P\{\omega : |X_n(\omega) - X(\omega)| \ge \varepsilon\} \to 0, \quad n \to \infty,$$
then for any $i \in I$
$$P\{\omega \in A_i : |X_n(\omega) - X(\omega)| \ge \varepsilon\} \to 0,$$
and since on an atom every event has probability either $0$ or $P\{A_i\} > 0$, this probability is zero for all sufficiently large $n$. Therefore the convergence set $\{\omega : X_n(\omega) \to X(\omega)\}$ contains all the atoms, and hence its probability equals one. Using Theorem 1.1, we obtain the proof of our theorem.

Comment. The converse assertion is also true: if on some probability space the notions of convergence with probability one and convergence in probability coincide, then the space is atomic (see Neveu, p. 37; Prokhorov A. V., Ushakov V. G., Ushakov N. G., Collection of Problems in Probability Theory, Problem 5.25, p. 107). For our elementary course in probability theory the proof of this assertion is too technical.

Convergence in mean. Definition 4. A sequence of random variables $\{X_n(\omega)\}$ is said to converge in mean of order $p > 0$ to a random variable $X(\omega)$ if
$$\mathsf{E}\{|X_n(\omega) - X(\omega)|^p\} \to 0, \quad n \to \infty.$$
For $p = 2$ one speaks of mean-square convergence. Of course, speaking of convergence in mean of order $p$, we assume the finiteness of the mathematical expectations $\mathsf{E}\{|X_n|^p\} < \infty$, $\mathsf{E}\{|X|^p\} < \infty$. In the next theorem we establish that convergence in probability is a necessary condition for convergence in mean of order $p > 0$.

Theorem. If for some $p > 0$
$$\mathsf{E}\{|X_n(\omega) - X(\omega)|^p\} \to 0, \quad n \to \infty,$$
then $X_n \xrightarrow{P} X$.

Proof. It suffices to pass to the limit as $n \to \infty$ in the P. L. Chebyshev inequality
$$P\{|X_n - X| > \varepsilon\} = P\{|X_n - X|^p > \varepsilon^p\} \le \frac{\mathsf{E}\{|X_n - X|^p\}}{\varepsilon^p}.$$

The following simple example shows that convergence in probability is not a sufficient condition for convergence in mean. Example. Let $\Omega = [0, 1]$, $\mathcal{A} = \mathcal{B}([0,1])$, and let $P = \lambda$ be the Lebesgue measure on $[0, 1]$. Set
$$X(\omega) \equiv 0, \qquad X_n(\omega) = \begin{cases} n, & \omega \in [0,\, 1/n], \\ 0, & \omega \in (1/n,\, 1]. \end{cases}$$
Then for any $\varepsilon > 0$
$$P\{|X_n - X| > \varepsilon\} = \lambda[0,\, 1/n] = \frac{1}{n} \to 0, \quad n \to \infty.$$
However, for $p \ge 1$
$$\mathsf{E}\{|X_n - X|^p\} = n^p \cdot \frac{1}{n} = n^{p-1} \ge 1 \quad \text{for all } n \in \mathbb{N}.$$
The absence of convergence in mean in this example is caused by "area escaping to infinity". In the following theorem an important role is played by the condition of uniform boundedness of the random variables, which prevents such an escape.
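A direct numeric view of this example (illustrative, not from the lectures; a uniform sample of points of $[0,1]$ stands in for the Lebesgue measure):

```python
# X_n = n on [0, 1/n] and 0 elsewhere: P(X_n > eps) = 1/n -> 0,
# yet E X_n = 1 for every n, so there is no convergence in mean.
import numpy as np

rng = np.random.default_rng(8)
omega = rng.random(500_000)                   # uniform points of [0, 1]
for n in (10, 100, 1000):
    Xn = np.where(omega <= 1.0 / n, float(n), 0.0)
    print(f"n={n:5d}  P(X_n > 0.5) ~ {(Xn > 0.5).mean():.4f}   E X_n ~ {Xn.mean():.3f}")
```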

Theorem. If for a sequence of random variables $\{X_n(\omega)\}$ there exists a real number $0 < C < +\infty$ such that $P\{\omega : |X_n(\omega)| \le C\} = 1$ for every $n \in \mathbb{N}$, and $X_n \xrightarrow{P} X$ as $n \to \infty$, then $\mathsf{E}\{|X(\omega)|\} \le C$ and
$$\lim_{n \to \infty} \mathsf{E}\{|X_n(\omega) - X(\omega)|\} = 0.$$

Proof. First we show that the uniform boundedness of the $\{X_n\}$ with probability one implies the boundedness of the limit random variable with probability one: $P\{\omega : |X(\omega)| \le C\} = 1$. Indeed, convergence in probability implies almost sure convergence along some subsequence:
$$P\{\omega : X_{n(m)}(\omega) \to X(\omega)\} = 1, \quad m \to \infty.$$
By the properties of limits, if $\omega \in \{\omega : X_{n(m)}(\omega) \to X(\omega)\}$, then $|X(\omega)| \le C$. Hence
$$\{\omega : X_{n(m)}(\omega) \to X(\omega)\} \subseteq \{\omega : |X(\omega)| \le C\}, \qquad P\{\omega : |X(\omega)| \le C\} = 1.$$
From this we obtain the existence and boundedness of the mathematical expectation, $\mathsf{E}\{|X(\omega)|\} \le C$. It is now easy to verify that $P\{\omega : |X_n(\omega) - X(\omega)| \le 2C\} = 1$. Further, by the properties of mathematical expectations,
$$|\mathsf{E}\{X_n\} - \mathsf{E}\{X\}| \le \mathsf{E}\{|X_n - X|\} = \int_{\{|X_n - X| \le \varepsilon\}} |X_n - X|\,dP + \int_{\{|X_n - X| > \varepsilon\}} |X_n - X|\,dP \le \varepsilon + 2C\,P\{|X_n - X| > \varepsilon\}.$$
Passing to the limit as $n \to \infty$, in view of the arbitrariness of $\varepsilon$, we obtain the proof of our theorem.

In the following theorem, instead of the condition of uniform boundedness by a constant, a weaker condition of domination by a (non-negative) integrable random variable is considered.

Lebesgue's theorem on dominated convergence. If for a sequence of random variables $\{X_n(\omega)\}$ there exist random variables $X(\omega)$ and $Y(\omega)$ such that
1. $X_n \xrightarrow{P} X$, $n \to \infty$;
2. $|X_n(\omega)| \le Y(\omega)$ for all $n$, almost surely;
3. $\mathsf{E}\{Y\} < \infty$;
then $\mathsf{E}\{|X|\} \le \mathsf{E}\{Y\} < \infty$ and
$$\mathsf{E}\{|X_n - X|\} \to 0, \quad n \to \infty.$$

Proof. First we establish the inequality $|X(\omega)| \le Y(\omega)$ almost surely. Convergence in probability implies almost sure convergence along some subsequence: $X_{n(m)} \xrightarrow{\text{a.s.}} X$ as $m \to \infty$; in other words, the probability of the convergence set is equal to unity. Passing to the limit ($m \to \infty$) in the inequality $|X_{n(m)}(\omega)| \le Y(\omega)$ for any $\omega$ from the convergence set, we obtain the existence of $\mathsf{E}\{X\}$ and the estimate $\mathsf{E}\{|X|\} \le \mathsf{E}\{Y\}$. Therefore also $|X_n - X| \le 2Y$ almost surely, whence $\mathsf{E}\{|X_n - X|\} \le 2\,\mathsf{E}\{Y\}$. Estimate the quantity
$$\mathsf{E}\{|X_n - X|\} = \int_{\{|X_n - X| \le \varepsilon\}} |X_n - X|\,dP + \int_{\{|X_n - X| > \varepsilon\}} |X_n - X|\,dP \le \varepsilon + 2\int_{\{|X_n - X| > \varepsilon\}} Y\,dP. \tag{*}$$
By condition 1 of the theorem (convergence in probability), for any $\varepsilon > 0$, $P\{|X_n - X| > \varepsilon\} \to 0$ as $n \to \infty$. Therefore, using the lemma on the integral over a set of small probability, we can assert that
$$\lim_{n \to \infty} \int_{\{|X_n - X| > \varepsilon\}} Y\,dP = 0.$$
Passing to the limit in inequality (*), we will have
$$0 \le \lim_{n \to \infty} \mathsf{E}\{|X_n - X|\} \le \varepsilon,$$
whence, in view of the arbitrariness of $\varepsilon > 0$, we obtain the proof of the theorem. (As Feller remarks, dominated convergence is the one place in Lebesgue's theory of integration where naive formal manipulation can lead to an incorrect result; see Feller, vol. 2.)

Comment. The proof of this theorem is treated in detail in the course on the Lebesgue integral. A slightly different version of the proof can be found in the book [Shiryaev, "Probability"]. Let us present without proof two more classical results of real analysis, which are often used in the analysis of convergence in mean.

Theorem on monotone convergence. If a non-decreasing sequence of non-negative random variables $\{X_n(\omega)\}$, $X_n(\omega) \le X_{n+1}(\omega)$, $n = 1, 2, \ldots$, converges almost surely to a random variable $X(\omega)$, then
$$\mathsf{E}\{X_n\} \to \mathsf{E}\{X\}, \quad n \to \infty.$$
Comment. If the mathematical expectation $\mathsf{E}\{X\}$ is finite, then (owing to monotonicity) the mathematical expectations of all the $X_n$ are finite, and we have convergence of the monotone sequence $\mathsf{E}\{X_n\}$ to the finite limit $\mathsf{E}\{X\}$. If $\mathsf{E}\{X\}$ is infinite, then, assuming the $\mathsf{E}\{X_n\}$ finite, we obtain convergence of the monotone sequence $\mathsf{E}\{X_n\}$ to the infinite limit $+\infty$.

Fatou's lemma. Any sequence of non-negative random variables $\{X_n(\omega)\}$ satisfies the inequality
$$\varliminf_{n \to \infty} \mathsf{E}\{X_n\} \ge \mathsf{E}\left\{\varliminf_{n \to \infty} X_n\right\}.$$
Comment. Fatou's lemma shows that the strict inequality between $\lim \mathsf{E}\{X_n\}$ and $\mathsf{E}\{\lim X_n\}$ which occurred in the example above is a manifestation of a general regularity.

Problem. If $\mathsf{E}\{|X_n - X|^p\} \to 0$ as $n \to \infty$, then $\mathsf{E}\{|X_n|^p\} \to \mathsf{E}\{|X|^p\}$.
Solution. Using G. Minkowski's inequality, we can write
$$(\mathsf{E}\{|X_n|^p\})^{1/p} = (\mathsf{E}\{|X_n - X + X|^p\})^{1/p} \le (\mathsf{E}\{|X_n - X|^p\})^{1/p} + (\mathsf{E}\{|X|^p\})^{1/p}.$$
Passing to the upper limit as $n \to \infty$, and using the continuity and monotonicity of the power function, we get
$$\varlimsup_{n \to \infty} \mathsf{E}\{|X_n|^p\} \le \mathsf{E}\{|X|^p\}. \tag{**}$$
On the other hand, arguing similarly from the inequality
$$(\mathsf{E}\{|X|^p\})^{1/p} \le (\mathsf{E}\{|X - X_n|^p\})^{1/p} + (\mathsf{E}\{|X_n|^p\})^{1/p},$$
we obtain
$$\mathsf{E}\{|X|^p\} \le \varliminf_{n \to \infty} \mathsf{E}\{|X_n|^p\}.$$
Combining the two inequalities, we obtain the solution of our problem.

Theorem. If $\mathsf{E}\{|X_n - X|^p\} \to 0$ as $n \to \infty$, then for any $q \in (0, p)$
$$\mathsf{E}\{|X_n - X|^q\} \to 0.$$
Proof. It suffices to pass to the limit as $n \to \infty$ in A. M. Lyapunov's inequality (see [Shiryaev A. N., Probability])
$$(\mathsf{E}\{|X_n - X|^q\})^{1/q} \le (\mathsf{E}\{|X_n - X|^p\})^{1/p}, \qquad 0 < q \le p.$$
For $p = 2$ and $q = 1$ the proof of the theorem can also be obtained with the help of the following variant of the Cauchy-Bunyakovsky inequality:
$$\mathsf{E}\{|X_n - X|\} = \mathsf{E}\{|X_n - X| \cdot 1\} \le (\mathsf{E}\{|X_n - X|^2\})^{1/2} (\mathsf{E}\{1^2\})^{1/2} = (\mathsf{E}\{|X_n - X|^2\})^{1/2}.$$

The space $L^p(\Omega, \mathcal{A}, P)$. Consider the space $L^p(\Omega, \mathcal{A}, P)$, i.e., the set of all random variables $X(\omega)$ defined on $\Omega$, measurable with respect to the $\sigma$-algebra $\mathcal{A}$, and such that
$$\mathsf{E}\{|X|^p\} = \int_{\Omega} |X(\omega)|^p\,dP < \infty.$$
This space is entirely analogous to the linear space $L^p[0, 1]$ known from the course of functional analysis, which consists of all functions $y = f(x)$ defined on the segment $[0, 1]$, measurable in the sense of Lebesgue and integrable with exponent $p$ with respect to the Lebesgue measure:
$$\int_0^1 |f(x)|^p\,dx < \infty.$$

Without giving detailed proofs, we formulate several statements about the space $L^p(\Omega, \mathcal{A}, P)$ which are analogous to the corresponding statements about $L^p[0, 1]$. The functional $\|X\|_p := (\mathsf{E}\{|X|^p\})^{1/p}$ defines a norm in the space of random variables (random variables coinciding almost surely are identified, since by the definition of the norm $\|X\|_p = 0 \iff X \stackrel{\text{a.s.}}{=} 0$):
1. $\|X\|_p \ge 0$;
2. $\|cX\|_p = |c|\,\|X\|_p$, $c = \text{const}$;
3. $\|X + Y\|_p \le \|X\|_p + \|Y\|_p$ (Minkowski's inequality).
Note that the linearity of the set $L^p(\Omega, \mathcal{A}, P)$ follows immediately from the properties of the norm. Moreover, with respect to convergence in the norm, $\|X_n - X\|_p \to 0$ (i.e., convergence in mean of order $p$), the space $L^p(\Omega, \mathcal{A}, P)$ is complete. In our case the definition of completeness is as follows: if a sequence of random variables $\{X_n\} \subset L^p(\Omega, \mathcal{A}, P)$ is fundamental in the norm, $\|X_n - X_m\|_p \to 0$ as $n, m \to \infty$, then there exists a random variable $X \in L^p(\Omega, \mathcal{A}, P)$ such that $\|X_n - X\|_p \to 0$ as $n \to \infty$. So $L^p(\Omega, \mathcal{A}, P)$ is a complete linear normed space, i.e., a Banach space. For $p = 2$, the space $L^2(\Omega, \mathcal{A}, P)$ is a Hilbert space with the scalar product
$$\langle X, Y \rangle := \mathsf{E}\{XY\} = \int_{\Omega} X(\omega)Y(\omega)\,dP$$
(we are dealing with real-valued random variables, so the sign of complex conjugation over the second factor can be omitted). For the scalar product so introduced, the Cauchy-Bunyakovsky inequality $|\langle X, Y \rangle| \le \|X\|_2\,\|Y\|_2$ holds.

Convergence in distribution. Let us introduce notation for the distribution functions of the random variables $X_n(\omega)$ and $X(\omega)$:
$$F_n(x) = P\{\omega : X_n(\omega) \le x\}, \qquad F(x) = P\{\omega : X(\omega) \le x\}.$$
In addition, through $C_F$ we denote the set of points of continuity of the function $F$:
$$C_F := \left\{x \in \mathbb{R} : \lim_{x' \to x} F(x') = F(x)\right\}.$$

Definition 4. A sequence of random variables $\{X_n(\omega)\}$ is said to converge in distribution to a random variable $X(\omega)$ if
$$F_n(x) \to F(x), \quad n \to \infty, \text{ at each point } x \in C_F. \tag{11}$$
Notation: $X_n \xrightarrow{d} X$.

Definition 5. If
$$F_n(x) \to F(x), \quad n \to \infty, \text{ at each point } x \in C_F, \tag{12}$$
then we say that the sequence of distribution functions $\{F_n(x)\}$ converges weakly to the distribution function $F(x)$ (weak convergence is sometimes called "convergence in the main"). Notation: $F_n \xrightarrow{w} F$.

Comment. If the distribution function $F(x)$ is continuous on the entire real axis ($C_F = (-\infty, \infty)$), then relations (11) and (12) amount to pointwise convergence. Moreover, it can be shown (see Problem 3 below) that in this case the convergence $F_n(x) \to F(x)$ is uniform on the entire real axis.

Note. If $F_n \xrightarrow{w} F$, then for $x \notin C_F$ only the inequalities
$$F(x - 0) \le \varliminf_{n \to \infty} F_n(x) \le \varlimsup_{n \to \infty} F_n(x) \le F(x)$$
hold.

Example. Consider the sequence of distribution functions
$$F_n(x) = \begin{cases} 0, & x \in (-\infty, -1/n); \\ \dfrac{nx + 1}{2}, & x \in [-1/n,\, 1/n]; \\ 1, & x \in (1/n, \infty). \end{cases}$$
It is easy to see that for any $x \in (-\infty, \infty)$
$$\lim_{n \to \infty} F_n(x) = F(x) = \begin{cases} 0, & x \in (-\infty, 0); \\ 1/2, & x = 0; \\ 1, & x \in (0, \infty). \end{cases}$$
Since the limit function $F(x)$ is not right-continuous, it cannot be a distribution function. But since Definition 5 of weak convergence deals with convergence to distribution functions, in this example we cannot assert that $F_n \xrightarrow{w} F$. However, after a slight change of the limit function one obtains the distribution function of the random variable $X(\omega) \equiv 0$,
$$F^*(x) = \begin{cases} 0, & x \in (-\infty, 0); \\ 1, & x \in [0, \infty), \end{cases}$$
to which the functions $F_n$ do converge weakly: the single point where pointwise convergence fails, $x = 0$, is not a point of continuity of $F^*$.

Example (convergence in distribution without convergence in probability). Let $\Omega = [0, 1]$, $\mathcal{A} = \mathcal{B}([0,1])$ the Borel $\sigma$-algebra of the interval, $P$ the Lebesgue measure. Denote by $\Phi^{-1}$ the function inverse to the standard Gaussian distribution function
$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-\frac{u^2}{2}\right) du.$$
Set
$$X_{2k}(\omega) = \Phi^{-1}(\omega), \qquad X_{2k-1}(\omega) = -\Phi^{-1}(\omega), \qquad \omega \in [0, 1], \quad k = 1, 2, \ldots$$
Then $P\{\omega : X_n(\omega) \le x\} = \Phi(x)$ for all natural $n$, so the sequence $\{X_n\}$ (trivially) converges in distribution. However, it is easy to see that there is no convergence in probability. Indeed, since $|X_{2k}(\omega) - X_{2m-1}(\omega)| = 2\,|\Phi^{-1}(\omega)|$ for any $k, m$,
$$P\{\omega : |X_{2k} - X_{2m-1}| > \varepsilon\} = P\left\{\omega : |\Phi^{-1}(\omega)| > \frac{\varepsilon}{2}\right\} = 2\left[1 - \Phi\left(\frac{\varepsilon}{2}\right)\right] > 0.$$

There is another, equivalent, definition of weak convergence. This variant is better suited for defining the weak convergence of multivariate distribution functions, and even for defining

weak convergence of distributions on more complex infinite-dimensional metric spaces.

Theorem 6. For the sequence of distribution functions $\{F_n(x)\}$ to converge weakly to the distribution function $F(x)$, it is necessary and sufficient that
$$\lim_{n \to \infty} \int_{-\infty}^{\infty} \varphi(x)\,dF_n(x) = \int_{-\infty}^{\infty} \varphi(x)\,dF(x) \tag{15}$$
for any continuous and bounded function $\varphi(x)$ on the real axis $\mathbb{R}$.

Proof. Let us first show that the weak convergence (12) implies equality (15); this implication bears the name of the Helly-Bray theorem. For any $\varepsilon > 0$ there exists a positive $A(\varepsilon) \in C_F$ such that, by the properties of the distribution function (the set of continuity points of a distribution function is everywhere dense on the real axis),
$$\int_{\{x : |x| > A(\varepsilon)\}} dF(x) = 1 - \int_{-A(\varepsilon)}^{A(\varepsilon)} dF(x) < \varepsilon, \tag{16}$$
and, in addition, there is a natural $N(\varepsilon, A(\varepsilon))$ such that for all $n > N(\varepsilon, A(\varepsilon))$
$$|F_n(A(\varepsilon)) - F(A(\varepsilon))| < \varepsilon, \qquad |F_n(-A(\varepsilon)) - F(-A(\varepsilon))| < \varepsilon.$$
Then for all $n > N(\varepsilon, A(\varepsilon))$
$$\int_{\{x : |x| > A(\varepsilon)\}} dF_n(x) < 3\varepsilon. \tag{17}$$
Let $\varphi(x)$ be a continuous function bounded on the real axis $\mathbb{R}$; we assume $|\varphi(x)| \le C = \text{const}$ for all real $x$. By the existence of the Riemann-Stieltjes integral of a continuous function with respect to an integrating distribution function, and by the definition of this integral as the limit of integral sums, for any $\varepsilon > 0$ there exists $\delta > 0$ such that for any partition of the segment $[-A(\varepsilon), A(\varepsilon)]$ whose diameter is less than $\delta$, the inequalities
$$\left|\int_{-A(\varepsilon)}^{A(\varepsilon)} \varphi\,dF - S(\delta)\right| < \varepsilon, \qquad \left|\int_{-A(\varepsilon)}^{A(\varepsilon)} \varphi\,dF_n - S_n(\delta)\right| < \varepsilon \tag{18}$$
hold, where
$$S_n(\delta) = \sum_{i=0}^{k-1} \varphi(t_i)\,\Delta_i F_n, \qquad S(\delta) = \sum_{i=0}^{k-1} \varphi(t_i)\,\Delta_i F.$$
Take a partition of the segment $[-A(\varepsilon), A(\varepsilon)]$,
$$-A(\varepsilon) = x_0 < x_1 < \ldots < x_k = A(\varepsilon),$$
assuming that all division points $x_i \in C_F$ and that the diameter of the partition is less than $\delta$. In addition, for the chosen $k$ (the number of partition points) take the number $N(\varepsilon, A(\varepsilon))$ so large that for all $n > N(\varepsilon, A(\varepsilon))$
$$|F_n(x_i) - F(x_i)| < \frac{\varepsilon}{k}, \qquad i = 0, \ldots, k.$$
Then
$$|\Delta_i F_n - \Delta_i F| = |F_n(x_{i+1}) - F_n(x_i) - F(x_{i+1}) + F(x_i)| < \frac{2\varepsilon}{k},$$
and therefore
$$|S_n(\delta) - S(\delta)| \le \sum_{i=0}^{k-1} |\varphi(t_i)|\,|\Delta_i F_n - \Delta_i F| \le C\,k\,\frac{2\varepsilon}{k} = 2C\varepsilon. \tag{19}$$
From inequalities (18) and (19) we get
$$\left|\int_{-A(\varepsilon)}^{A(\varepsilon)} \varphi\,dF - \int_{-A(\varepsilon)}^{A(\varepsilon)} \varphi\,dF_n\right| < 2\varepsilon + 2C\varepsilon, \tag{20}$$
and from inequalities (16) and (17) we will have
$$\left|\int_{\{|x| > A(\varepsilon)\}} \varphi\,dF - \int_{\{|x| > A(\varepsilon)\}} \varphi\,dF_n\right| < 4C\varepsilon. \tag{21}$$
Collecting together inequalities (20) and (21), we can assert that for any $\varepsilon > 0$ there exists a natural $N(\varepsilon, A(\varepsilon))$ such that for all $n > N(\varepsilon, A(\varepsilon))$
$$\left|\int \varphi\,dF - \int \varphi\,dF_n\right| < 6C\varepsilon + 2\varepsilon.$$
Equality (15) is proved.

Let us now show that equality (15) implies the weak convergence (12). Take $x_0 \in C_F$ and consider two auxiliary functions. The function $f_\varepsilon^{(1)}(x)$ is continuous on the whole number axis, equal to one for $x \le x_0 - \varepsilon$, to zero for $x \ge x_0$, and linear on the segment $[x_0 - \varepsilon, x_0]$. The function $f_\varepsilon^{(2)}(x) := f_\varepsilon^{(1)}(x - \varepsilon)$. It is easy to see that
$$F_n(x_0) = \int_{-\infty}^{x_0} dF_n(x) \le \int f_\varepsilon^{(2)}(x)\,dF_n(x).$$
Using condition (15) and the properties of $f_\varepsilon^{(2)}$, we pass to the limit as $n \to \infty$:
$$\varlimsup_{n \to \infty} F_n(x_0) \le \int f_\varepsilon^{(2)}(x)\,dF(x) \le \int_{-\infty}^{x_0 + \varepsilon} dF(x) = F(x_0 + \varepsilon).$$
Arguing similarly, we will have
$$F_n(x_0) \ge \int f_\varepsilon^{(1)}(x)\,dF_n(x), \qquad \varliminf_{n \to \infty} F_n(x_0) \ge \int f_\varepsilon^{(1)}(x)\,dF(x) \ge \int_{-\infty}^{x_0 - \varepsilon} dF(x) = F(x_0 - \varepsilon).$$
So we have obtained the inequality
$$F(x_0 - \varepsilon) \le \varliminf_{n \to \infty} F_n(x_0) \le \varlimsup_{n \to \infty} F_n(x_0) \le F(x_0 + \varepsilon).$$
Passing to the limit in this inequality as $\varepsilon \to 0$ and taking into account that $x_0 \in C_F$, we get
$$\lim_{n \to \infty} F_n(x_0) = F(x_0).$$
Equality (12), and with it the theorem, is proved.

Remark on the Riemann-Stieltjes and Lebesgue-Stieltjes integrals. Note that the Riemann-Stieltjes integral
$$\int_{L}^{N} I_{(-\infty,\,x_0]}(x)\,dF_n(x), \qquad x_0 \in (L, N), \tag{**}$$
does not exist if the distribution function $F_n(x)$ has a discontinuity at the point $x_0$. The standard proof of this fact is as follows. Considering the Riemann-Stieltjes sums for the integral (**) over partitions for which $x_0$ lies strictly inside a partition interval $[x_{i_0}, x_{i_0+1})$ (partitions of the segment with this property can have an arbitrarily small diameter), the term
$$I_{(-\infty,\,x_0]}(\xi_{i_0})\,[F_n(x_{i_0+1}) - F_n(x_{i_0})]$$
equals $F_n(x_{i_0+1}) - F_n(x_{i_0})$ when $\xi_{i_0} < x_0$ is chosen, and $0$ when $\xi_{i_0} > x_0$ is chosen. If $F_n(x_{i_0+1}) - F_n(x_{i_0}) > 0$ does not become small, such integral sums cannot have a limit as the partition diameter tends to zero. Therefore the integral (**) does not exist in the sense of Riemann-Stieltjes, and, strictly speaking, inequality (21) cannot be obtained by integrating indicators in the Riemann-Stieltjes sense. Nevertheless, inequality (21) can be obtained using the Riemann-Stieltjes integral. Indeed, since the function $f_\varepsilon^{(1)}(x)$ is continuous, the Riemann-Stieltjes integral $\int f_\varepsilon^{(1)}\,dF_n$ exists; and since $f_\varepsilon^{(1)}(x) = 1$ for $x \le x_0 - \varepsilon$ and $f_\varepsilon^{(1)}(x) = 0$ for $x \ge x_0$,
$$F_n(x_0 - \varepsilon) \le \int f_\varepsilon^{(1)}(x)\,dF_n(x) \le F_n(x_0);$$
similarly, by the properties of the function $f_\varepsilon^{(2)}$,
$$F_n(x_0) \le \int f_\varepsilon^{(2)}(x)\,dF_n(x) \le F_n(x_0 + \varepsilon).$$
If instead we consider the integral (**) as a Lebesgue-Stieltjes integral for $x_0 \in (L, N)$, then, since the indicator $I_{(-\infty,\,x_0]}(x)$ is a simple function, by the definition of the Lebesgue-Stieltjes integral we will have
$$\int_{L}^{N} I_{(-\infty,\,x_0]}(x)\,dF_n(x) = F_n(x_0) - F_n(L), \qquad \int_{-\infty}^{\infty} I_{(-\infty,\,x_0]}(x)\,dF_n(x) = F_n(x_0).$$

Let us give a new formulation of Theorem 6. To do this, denote by $C(\mathbb{R})$ the space of continuous functions bounded on the real axis. Next, using an arbitrary distribution function $G(x)$, define on the space $C(\mathbb{R})$ the linear functional
$$G(\varphi) := \int_{-\infty}^{\infty} \varphi(x)\,dG(x), \qquad \varphi \in C(\mathbb{R}).$$

Using the new notation, Theorem 6 can be reformulated as follows.

Theorem 6'. Weak convergence $F_n \xrightarrow{w} F$ is equivalent to the convergence of the linear functionals $F_n(\varphi) \to F(\varphi)$ on the space $C(\mathbb{R})$.

Metrization of weak convergence. The P. Levy metric. For a pair of arbitrary distribution functions $F(x)$ and $G(x)$ on the real line, consider the functional
$$L(F, G) = \inf\{h > 0 : F(x - h) - h \le G(x) \le F(x + h) + h \text{ for all } x\},$$
which is called the P. Levy distance between the distributions $F$ and $G$.

Theorem 7. The functional $L(\cdot, \cdot)$ defines a metric on the set of distribution functions on the real line. Convergence in this metric is equivalent to weak convergence:
$$F_n \xrightarrow{w} F \iff L(F_n, F) \to 0, \quad n \to \infty.$$

Problem 1. If $X_n \xrightarrow{d} c = \text{const}$, then $X_n \xrightarrow{P} c$.
Solution. The distribution function of the constant $c$,
$$F(x) = \begin{cases} 1, & x \ge c, \\ 0, & x < c, \end{cases}$$
is continuous at all points of the real axis except the point $x = c$. Therefore in this problem weak convergence means the following:
$$\lim_{n \to \infty} F_n(x) = \begin{cases} 1, & x > c, \\ 0, & x < c. \end{cases}$$
For any $\varepsilon > 0$
$$P\{|X_n - c| > \varepsilon\} = P\{X_n < c - \varepsilon\} + P\{X_n > c + \varepsilon\}.$$
Using the obvious relations
$$P\{X_n < c - \varepsilon\} \le P\{X_n \le c - \varepsilon\} = F_n(c - \varepsilon), \qquad P\{X_n > c + \varepsilon\} = 1 - F_n(c + \varepsilon),$$
we obtain the inequality
$$P\{|X_n - c| > \varepsilon\} \le F_n(c - \varepsilon) + 1 - F_n(c + \varepsilon).$$
Passing to the limit in this inequality as $n \to \infty$, we obtain the solution of the problem.

Problem 2. If $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{P} 0$, then $X_n + Y_n \xrightarrow{d} X$.
Solution. Let $F(x) := P\{X \le x\}$. Choosing $\varepsilon > 0$ so that $x - \varepsilon, x + \varepsilon \in C_F$, it is easy to establish the inclusions
$$\{X_n + Y_n \le x\} \subseteq \{X_n \le x + \varepsilon\} \cup \{|Y_n| > \varepsilon\}, \qquad \{X_n \le x - \varepsilon\} \subseteq \{X_n + Y_n \le x\} \cup \{|Y_n| > \varepsilon\}.$$
Therefore, denoting $F_n(x) := P\{X_n \le x\}$, we have
$$F_n(x - \varepsilon) - P\{|Y_n| > \varepsilon\} \le P\{X_n + Y_n \le x\} \le F_n(x + \varepsilon) + P\{|Y_n| > \varepsilon\}.$$
Passing in this inequality to the limit as $n \to \infty$, taking into account that $x - \varepsilon, x + \varepsilon \in C_F$, we obtain
$$F(x - \varepsilon) \le \varliminf_{n \to \infty} P\{X_n + Y_n \le x\} \le \varlimsup_{n \to \infty} P\{X_n + Y_n \le x\} \le F(x + \varepsilon).$$
Now we pass to the limit, letting $\varepsilon \to 0$, with $x \in C_F$:
$$\lim_{n \to \infty} P\{X_n + Y_n \le x\} = F(x).$$
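Returning to the Levy distance $L(F, G)$ defined above: it can be approximated by a coarse grid search (an illustrative sketch, not from the lectures; the logistic distribution functions are an arbitrary example chosen for their elementary formulas):

```python
# Smallest h (on a grid) with F(x - h) - h <= G(x) <= F(x + h) + h for all x.
import numpy as np

def levy(F, G, xs=np.linspace(-20, 20, 4001), hs=np.arange(5e-4, 1.0, 5e-4)):
    Gx = G(xs)
    for h in hs:
        if np.all((F(xs - h) - h <= Gx) & (Gx <= F(xs + h) + h)):
            return h
    return np.inf

F = lambda x: 1 / (1 + np.exp(-x))            # logistic CDF
G = lambda x: 1 / (1 + np.exp(-(x - 0.5)))    # the same CDF shifted by 0.5
print(f"L(F, G) ~ {levy(F, G):.3f}")           # roughly 0.1 for this pair
```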

Problem 3. If a sequence of distribution functions $\{F_n(x)\}$ converges weakly to a distribution function $F(x)$ that is continuous on the entire real axis, then this convergence is uniform on the entire real axis:
$$F_n \xrightarrow{w} F, \; F \in C(-\infty, \infty) \implies F_n(x) \rightrightarrows F(x) \text{ on } \mathbb{R}.$$
Solution. For an arbitrary $\varepsilon > 0$, take a natural number $m > 1/\varepsilon$. Since the function $F$ is continuous, there are points $x_1 < \ldots < x_{m-1}$ such that
$$F(x_i) = \frac{i}{m}, \qquad i = 1, \ldots, m - 1. \tag{!}$$
In view of weak convergence, at these points, for all $n$ starting from some one, the inequalities
$$|F_n(x_i) - F(x_i)| < \varepsilon, \qquad i = 1, \ldots, m - 1, \tag{!!}$$
will be fulfilled. By the non-decrease of distribution functions, together with properties (!!) and (!), we obtain the following inequalities: for $x \in [x_i, x_{i+1}]$, $i = 1, \ldots, m - 2$,
$$F_n(x) - F(x) \le F_n(x_{i+1}) - F(x_i) \le F(x_{i+1}) + \varepsilon - F(x_i) = \frac{1}{m} + \varepsilon < 2\varepsilon,$$
and similarly $F_n(x) - F(x) \ge -2\varepsilon$; the same estimates hold for $x \in (-\infty, x_1]$ and $x \in [x_{m-1}, \infty)$. Hence $\sup_x |F_n(x) - F(x)| < 2\varepsilon$ for all sufficiently large $n$, which proves the uniform convergence.

Problem 7. If a sequence of distribution functions $\{F_n(x)\}$ converges to a distribution function $F(x)$ at all points of some set that is everywhere dense on the real line, then
$$F_n \xrightarrow{w} F.$$
Solution. To solve this problem, we need to prove that
$$\lim_{n \to \infty} F_n(x) = F(x) \quad \text{for all } x \in C_F. \tag{*}$$
Let $x \in C_F$; then for any $\varepsilon > 0$ there is $\delta_1(\varepsilon) > 0$ such that as soon as $x' \in S(x, \delta_1(\varepsilon)) = \{x' : |x' - x| < \delta_1(\varepsilon)\}$, then $|F(x') - F(x)| < \varepsilon$. Consider the set $A$, everywhere dense on the real line, such that $\lim_n F_n(x) = F(x)$ for all $x \in A$. Then there exists a pair of points $x', x'' \in A$ such that
$$x - \delta_1(\varepsilon) < x' < x < x'' < x + \delta_1(\varepsilon).$$
Since convergence holds at the points $x', x''$, for any $\varepsilon > 0$ there exists $N_\varepsilon \in \mathbb{N}$ such that as soon as $n > N_\varepsilon$, then
$$|F_n(x') - F(x')| < \varepsilon, \qquad |F_n(x'') - F(x'')| < \varepsilon,$$
and consequently
$$|F_n(x') - F(x)| < 2\varepsilon, \qquad |F_n(x'') - F(x)| < 2\varepsilon.$$
Hence, in view of the monotonicity of the function $F_n$, i.e., $F_n(x') \le F_n(x) \le F_n(x'')$, for all $n > N_\varepsilon$ we obtain the inequality
$$|F_n(x) - F(x)| < 2\varepsilon.$$
Convergence (*) is proved.

Remark. Since the set of discontinuity points of a distribution function is (in view of monotonicity) at most countable, the set of its continuity points is everywhere dense on the real axis.



A. Yu. Pirkovsky Functional Analysis Lecture 4 4.1. Banach spaces Recall that a sequence (x n) in a metric space (, ρ) is called the fundamental sequence (or the Cauchy sequence),

LECTURES 8 9 Hille Yosida's theorem S 3. Definition and elementary properties of maximal monotone operators Throughout these two lectures, the symbol H denotes a Hilbert space with a scalar

V.V. Zhuk, A.M. Kamachkin 5 Functional sequences and series. Uniform convergence, the possibility of permutation of limit transitions, integration and differentiation of series and sequences.

Chapter 28 GENERALIZED FUNCTIONS 28.1. Spaces D, D of basic and generalized functions The concept of a generalized function generalizes the classical concept of a function and makes it possible to express in mathematical form such

21. Compactness Compactness is an extremely important technical concept of topology and analysis. Let's start with a definition. Definition 21.1. A topological space X is said to be compact if it has

Federal Agency for Education Federal State Educational Institution of Higher Professional Education SOUTHERN FEDERAL UNIVERSITY R. M. Gavrilova, G. S. Kostetskaya Methodical

1. Definition and basic properties of the Riemann integral Definition of a partition A partition of a segment [, b] is a set of points = x 1< x 2 < < x n+1 = b. Разбиение обозначают буквой P. Разбиение может быть

LABORATORY WORK 7 COMPLETE AND COMPACT IN METRIC SPACES. BASIC CONCEPTS AND THEOREMS Definition. Let X be a mapping: X X R that puts each pair (x y) X X into

Seminar Lecture 3 ABSOLUTELY CONTINUOUS FUNCTIONS 1. Definitions and properties Recall the definition given in the lecture. Definition 1. The function f(x) is called absolutely continuous on the segment [; b] if for

Measure theory, lecture 4: Lebesgue measure Mischa Verbitsky March 14, 2015 NMU 1 Boolean rings (review) DEFINITION: A Boolean ring is a ring all of whose elements are idempotents. NOTE: In a boolean ring

CHAPTER STABILITY OF LINEAR SYSTEMS In this chapter the stability of the simplest class of differential systems of linear systems is studied. In particular, it is established that for linear systems with constants

TOPIC V FOURIER SERIES LECTURE 6 Expansion of a periodic function in a Fourier series Many processes occurring in nature and technology have the properties to repeat at certain intervals Such processes

Functions continuous on a segment (theorems of Bolzano-Cauchy, Weierstrass, Kantor). Functionals are continuous on a compact set.. Theorem on intermediate values ​​Theorem. (Bolzano-Cauchy) Let the function f be continuous

DEFINITE INTEGRAL. Integral Sums and Definite Integral Let a function y = f () defined on the segment [, b ], where< b. Разобьём отрезок [, b ] с помощью точек деления на n элементарных

Moscow State University named after MVLomonosov Faculty of Chemistry Manual for preparing for the exam in mathematical analysis for students of the general flow Third semester Numerical series Differential

LECTURE 4A Metric spaces 1 1. Examples and counterexamples

Lecture 5 TOPOLOGICAL SPACES. 1. Definition of a topological space Definition 1. An arbitrary set X with a distinguished system of subsets τ of the set X is called a topological space

A. Yu. Pirkovsky Functional Analysis Lecture 23 23.1. Compact operators in a Hilbert space We already know quite a lot about compact operators in Banach spaces (see Lectures 18

2. Degree with a rational indicator; exponential In addition to what was said in the previous lecture, we also show how the concept of limit can be reduced to the concept of continuity. Namely, the following obvious

V.V. Zhuk, A.M. Kamachkin 7 Hilbert space. Definition. The simplest properties of the scalar product. Main theorem. Fourier series in Hilbert space. 7.1 Definition of a Hilbert space.

FOREWORD The manual is a continuation of . It was created on the basis of well-known textbooks on mathematical analysis [6]. It is based on the lectures of V. V. Zhuk, which were repeatedly read

13. Exponent and logarithm To complete the proof of Proposition 12.8, it remains for us to give one definition and prove one proposition. Definition 13.1. A series a i is called absolutely convergent if

LECTURE N Properties of infinitesimal and infinitely large functions Remarkable limits Continuity of functions Properties of infinitesimals Signs of the existence of a limit 3Properties of infinitely large 4First

S. S. Platonov Elements of harmonic analysis Part I. Fourier series f(x) = n= c n e inx Petrozavodsk 2010 Federal Agency for Education State educational institution of higher professional

Kolodiy A.M., Kolodiy N.A. Lectures on the theory of probability for students of the specialty "Mathematical support and administration of information systems" 4. Limit theorems 4. Law of large numbers.

ADDITIONAL CHAPTERS OF PROBABILITY THEORY EA Baklanov MMF NSU, 2012 CHAPTER 1 Probability inequalities 1. Exponential inequalities. Throughout this section, X 1,..., X n are independent random

INTRODUCTION TO MATHEMATICAL ANALYSIS Topic: Limit and continuity of a function Lecture 7 Limit of a function CONTENTS: Limit of a function at a point Limit of a function at infinity Basic theorems on the limits of functions Infinite