
Find the confidence interval for estimating the mathematical expectation. Point and interval estimates of specific gravity

CONFIDENCE INTERVAL FOR MATHEMATICAL EXPECTATION

1. Let the random variable X obey the normal law with unknown mean μ and known variance σ²: X ~ N(μ, σ²). The confidence probability β is specified. From the sample x₁, x₂, …, xₙ we must construct an interval I_β(θ) (here θ = μ) that satisfies condition (13), that is, covers θ with probability β.

The sample mean x̄ (also called the sample average) obeys the normal law with the same center μ but a smaller variance: x̄ ~ N(μ, D), where D = σ_x̄² = σ²/n.

We will need the number K_β, defined for ξ ~ N(0,1) by the condition

P(−K_β < ξ < K_β) = β.

In words: between the points −K_β and K_β of the abscissa axis lies an area under the density curve of the standard normal law equal to β.

For example, K_0.90 = 1.645 is the quantile of level 0.95 of ξ; K_0.95 = 1.96; K_0.997 = 3.

In particular, laying off 1.96 standard deviations to the right and the same to the left of the center of any normal law captures an area under the density curve equal to 0.95, so K_0.95 is the quantile of level 0.95 + 1/2 · 0.05 = 0.975 of the standard normal law.
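The link between K_β and the quantile of level (1 + β)/2 can be checked numerically. The sketch below uses Python's standard `statistics.NormalDist`; the helper name `k_beta` is introduced here only for illustration.

```python
from statistics import NormalDist

def k_beta(beta: float) -> float:
    """Half-width, in standard deviations, of the central interval that
    holds probability beta under the standard normal law."""
    # A central area beta leaves (1 - beta)/2 in each tail,
    # so K_beta is the quantile of level (1 + beta)/2.
    return NormalDist().inv_cdf((1 + beta) / 2)

for beta in (0.90, 0.95, 0.997):
    print(f"beta = {beta}: K = {k_beta(beta):.3f}")
```

The first two values agree with the table values 1.645 and 1.96; the third comes out slightly below 3, which the text rounds up.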

The required confidence interval for the general mean μ is I_β(μ) = (x̄ − δ, x̄ + δ),

where δ = K_β σ/√n. (15)

Let's give a rationale:

According to the above, the random variable x̄ falls into the interval J = μ ± δ with probability β (Fig. 9). In that case x̄ deviates from the center μ by less than δ, and the random interval x̄ ± δ (with a random center and the same width as J) covers the point μ. That is, x̄ ∈ J ⇔ μ ∈ I_β, and therefore P(μ ∈ I_β) = P(x̄ ∈ J) = β.

So the interval I_β, constructed from the sample, contains the mean μ with probability β.

Clearly, the larger n, the smaller σ_x̄ = σ/√n and the narrower the interval; and the larger the guarantee β, the wider the confidence interval.

Example 21.

From a sample of size n = 16 of a normal variable with known variance σ² = 64, the sample mean x̄ = 200 was found. Construct a confidence interval for the general mean (in other words, for the mathematical expectation) μ, taking β = 0.95.

Solution. I_β(μ) = x̄ ± δ, where δ = K_β σ/√n = 1.96 · 8/√16 = 3.92 ≈ 4.

I_0.95(μ) = 200 ± 4 = (196; 204).
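The arithmetic of Example 21 can be reproduced with a few lines of Python. This is a minimal sketch; the function name `z_interval` is an assumption made for the illustration, not something taken from the source.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(mean, sigma, n, beta):
    """Two-sided interval mean ± K_beta * sigma / sqrt(n) for known sigma."""
    k = NormalDist().inv_cdf((1 + beta) / 2)  # K_beta
    delta = k * sigma / sqrt(n)               # half-width of the interval
    return mean - delta, mean + delta

# Data of Example 21: n = 16, sigma^2 = 64, sample mean 200, beta = 0.95
low, high = z_interval(mean=200, sigma=8, n=16, beta=0.95)
print(round(low, 2), round(high, 2))
```

Without rounding δ to 4, the bounds come out near 196.08 and 203.92, which the example rounds to (196; 204).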

Concluding that with a guarantee of β=0.95 the true average belongs to the interval (196,204), we understand that an error is possible.

Out of 100 confidence intervals I 0.95 (μ), on average 5 do not contain μ.
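The statement about 5 intervals out of 100 can be illustrated by simulation. The sketch below repeatedly draws samples with the parameters of Example 21 and counts how often the interval covers μ. The true mean of 200, the number of trials and the seed are assumptions chosen only for the demonstration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, beta, trials, seed=0):
    """Share of simulated samples whose interval xbar ± delta covers mu."""
    rng = random.Random(seed)
    delta = NormalDist().inv_cdf((1 + beta) / 2) * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if abs(xbar - mu) < delta:  # same event as "mu lies in xbar ± delta"
            hits += 1
    return hits / trials

share = coverage(mu=200, sigma=8, n=16, beta=0.95, trials=20_000)
print(f"intervals covering mu: {share:.3f}")
```

The printed share should be close to 0.95, so roughly 5 intervals in 100 miss μ.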

Example 22.

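Under the conditions of Example 21, what sample size n is needed to halve the confidence interval? Since δ = K_β σ/√n is inversely proportional to √n, to have 2δ = 4 instead of 2δ = 8 we must take n four times larger: n = 4 · 16 = 64.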
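The required sample size can also be obtained by solving δ = K_β σ/√n for n. The sketch below does this for the data of Example 21; the helper name `sample_size` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size(sigma, delta, beta):
    """Smallest n for which K_beta * sigma / sqrt(n) does not exceed delta."""
    k = NormalDist().inv_cdf((1 + beta) / 2)
    return ceil((k * sigma / delta) ** 2)

n_needed = sample_size(sigma=8, delta=2, beta=0.95)
half_width_at_64 = NormalDist().inv_cdf(0.975) * 8 / sqrt(64)
print(n_needed, round(half_width_at_64, 2))
```

Because the half-width in Example 21 is 3.92 rather than exactly 4, the strict requirement δ ≤ 2 is already met a little below n = 64; quadrupling the sample to 64 halves the interval exactly.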

In practice, one-sided confidence intervals are often used. If high values of μ are useful or harmless while low values are undesirable, as with strength or reliability, it is reasonable to construct a one-sided interval by raising its upper limit as far as possible. If we construct, as in Example 21, a two-sided confidence interval for a given β and then extend it as far as possible at one of its boundaries, we obtain a one-sided interval with a greater guarantee β′ = β + (1 − β)/2 = (1 + β)/2; for example, if β = 0.90, then β′ = 0.90 + 0.10/2 = 0.95.

For example, suppose we are dealing with the strength of a product and raise the upper limit of the interval to ∞. Then for μ in Example 21 we obtain the one-sided confidence interval (196, ∞) with lower limit 196 and confidence probability β′ = 0.95 + 0.05/2 = 0.975.
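A short sketch of the one-sided construction, using the data of Example 21. The function name `one_sided_lower` is an assumption made for the illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_lower(mean, sigma, n, beta):
    """Lower limit of the two-sided beta-interval and the guarantee of the
    one-sided interval (lower limit, infinity) built from it."""
    k = NormalDist().inv_cdf((1 + beta) / 2)
    lower = mean - k * sigma / sqrt(n)
    # Dropping the upper limit returns the upper tail (1 - beta)/2 to the interval.
    beta_one_sided = beta + (1 - beta) / 2
    return lower, beta_one_sided

lower, beta_one_sided = one_sided_lower(mean=200, sigma=8, n=16, beta=0.95)
print(round(lower, 2), beta_one_sided)
```

The lower limit is the same as in the two-sided interval; only the stated guarantee changes.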

A practical disadvantage of formula (15) is that it is derived under the assumption that the variance D = σ² (and hence σ_x̄² = σ²/n) is known, which rarely happens in practice. The exception is a large sample, say with n in the hundreds or thousands; then σ² can in practice be replaced by its estimate s².

Example 23.

Suppose that a sample survey of residents' living conditions in a large city produced the following data table (an example from the literature).

Table 8

Source data for example

It is natural to assume that X, the total (usable) area per person in m², obeys the normal law. The mean μ and the variance σ² are unknown. A 95% confidence interval must be constructed for μ. To find the sample mean and variance from the grouped data, we compile the following table of calculations (Table 9).

Table 9

Calculating x̄ and s from grouped data

Group No.   Total area per person, m²   Number of residents r_j   Midpoint x_j   r_j·x_j    r_j·x_j²
1           Up to 5.0                   8                         2.5            20.0       50.0
2           5.0-10.0                    95                        7.5            712.5      5343.75
3           10.0-15.0                   204                       12.5           2550.0     31875.0
4           15.0-20.0                   270                       17.5           4725.0     82687.5
5           20.0-25.0                   210                       22.5           4725.0     106312.5
6           25.0-30.0                   130                       27.5           3575.0     98312.5
7           more than 30.0              83                        32.5 *         2697.5     87668.75
Total       -                           1000                      -              19005.0    412250.0

In this auxiliary table, the first and second initial statistical moments a₁ and a₂ are calculated by formula (2): a₁ = Σ r_j x_j / n = 19005/1000 ≈ 19.0 and a₂ = Σ r_j x_j² / n = 412250/1000 = 412.25.

Although the variance σ² is unknown here, the large sample size allows us to apply formula (15) in practice, putting σ ≈ s = √(a₂ − a₁²) = √(412.25 − 19²) ≈ 7.16.

Then δ = K_0.95 σ/√n = 1.96 · 7.16/√1000 ≈ 0.44.

The confidence interval for the general mean at β = 0.95 is I_0.95(μ) = x̄ ± δ = 19 ± 0.44 = (18.56; 19.44).

Consequently, with a guarantee of 0.95 the average area per person in this city lies in the interval (18.56; 19.44).
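The grouped-data calculation of Example 23 can be sketched as follows. The group sizes in `counts` are an assumption: they are not printed in the table as extracted and were recovered here by dividing each product r_j·x_j by the midpoint x_j. They sum to n = 1000.

```python
from math import sqrt
from statistics import NormalDist

midpoints = [2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5]
counts = [8, 95, 204, 270, 210, 130, 83]  # assumed: recovered as (r_j * x_j) / x_j

n = sum(counts)
a1 = sum(r * x for r, x in zip(counts, midpoints)) / n      # first initial moment
a2 = sum(r * x * x for r, x in zip(counts, midpoints)) / n  # second initial moment
s = sqrt(a2 - a1 ** 2)                                      # estimate of sigma

delta = NormalDist().inv_cdf(0.975) * s / sqrt(n)           # half-width for beta = 0.95
print(n, round(a1, 3), round(s, 2), round(delta, 2))
print(round(a1 - delta, 2), round(a1 + delta, 2))
```

With the unrounded mean 19.005 the estimate s is about 7.15 instead of 7.16, and direct evaluation gives a half-width near 0.44.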



2. Confidence interval for the mathematical expectation μ of a normal variable with unknown variance σ².

For a given guarantee β this interval is constructed by the formula

I_β(μ) = x̄ ± t_{β,ν} · s/√n, (16)

where ν = n − 1 and s² = Σ(x_i − x̄)²/(n − 1) is the unbiased sample estimate of the variance.

The coefficient t_{β,ν} has the same meaning for the t distribution with ν degrees of freedom as K_β has for the distribution N(0,1), namely:

P(−t_{β,ν} < t_ν < t_{β,ν}) = β.

In other words, the random variable t_ν falls into the interval (−t_{β,ν}; +t_{β,ν}) with probability β. The values of t_{β,ν} are given in Table 10 for β = 0.95 and β = 0.99.

Table 10.

Values of t_{β,ν}

Returning to Example 23, we see that its confidence interval was in effect constructed by formula (16) with the coefficient t_{β,ν} ≈ K_0.95 = 1.96, since for n = 1000 the t distribution practically coincides with the normal law.
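How quickly t_{β,ν} approaches K_β can be checked without tables. The sketch below is one possible way to do it with the standard library only: it integrates the Student density by Simpson's rule and finds the coefficient by bisection. The function names, the step count and the search bound of 100 are assumptions of this illustration; the bound is adequate for ν ≥ 2 and the usual values of β.

```python
from math import exp, lgamma, pi, sqrt
from statistics import NormalDist

def t_pdf(x, nu):
    """Density of the Student distribution with nu degrees of freedom."""
    log_c = lgamma((nu + 1) / 2) - lgamma(nu / 2)  # log-gamma avoids overflow
    return exp(log_c) / sqrt(nu * pi) * (1 + x * x / nu) ** (-(nu + 1) / 2)

def central_prob(t, nu, steps=1000):
    """P(-t < T < t) by Simpson's rule on [0, t], doubled by symmetry."""
    h = t / steps
    total = t_pdf(0.0, nu) + t_pdf(t, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 2 * total * h / 3

def t_coeff(beta, nu):
    """Coefficient t_{beta,nu}: solves central_prob(t, nu) = beta by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if central_prob(mid, nu) < beta:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

k = NormalDist().inv_cdf(0.975)
for nu in (5, 30, 999):
    print(nu, round(t_coeff(0.95, nu), 3), round(k, 3))
```

For small ν the coefficient is noticeably larger than 1.96, while for ν = 999 the difference appears only in the third decimal place.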



Confidence intervals: theory and problems

Understanding Confidence Intervals
Let us briefly introduce the concept of a confidence interval, which
1) estimates some parameter of a numerical sample directly from the data of the sample itself,

2) covers the value of this parameter with probability γ.

A confidence interval for the parameter X (with probability γ) is an interval of the form (X₁, X₂) such that P(X₁ < X < X₂) = γ, where the values X₁ and X₂ are calculated in some way from the sample. In applied problems the confidence probability is usually taken equal to γ = 0.9; 0.95; 0.99.

Consider a sample of size n drawn from a general population that is presumed to follow the normal distribution law. Let us show which formulas are used to find confidence intervals for the distribution parameters: the mathematical expectation and the variance (standard deviation).

Confidence interval for mathematical expectation

Case 1. The variance of the distribution is known and equal to σ². Then the confidence interval for the parameter a has the form x̄ − tσ/√n < a < x̄ + tσ/√n, where x̄ is the sample mean and the parameter t is determined from the Laplace function table by the relation 2Φ(t) = γ.

Case 2. The variance of the distribution is unknown, and its point estimate s² is calculated from the sample. Then the confidence interval for the parameter a has the form x̄ − t·s/√n < a < x̄ + t·s/√n, where x̄ is the sample mean calculated from the sample and the parameter t is determined from the Student distribution table.

Example. From 7 measurements of a certain quantity, the mean of the measurement results was found to be 30 and the sample variance to be 36. Find the boundaries within which the true value of the measured quantity lies with a reliability of 0.99.

Solution. From the Student distribution table, for γ = 0.99 and n − 1 = 6 degrees of freedom we find t = 3.707. Then the confidence limits of the interval containing the true value of the measured quantity can be found using the formula x̄ − t·s/√n < a < x̄ + t·s/√n, where x̄ is the sample mean and s² is the sample variance. Substituting all the values, we get 30 − 3.707 · 6/√7 < a < 30 + 3.707 · 6/√7, that is, 21.59 < a < 38.41.
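A minimal sketch of the calculation for this example. Two assumptions are made here and should be checked against the original problem: the value 36 is treated as the corrected (unbiased) sample variance, and the Student coefficient 3.707 is the usual table value for γ = 0.99 and 6 degrees of freedom.

```python
from math import sqrt

n = 7
mean = 30.0
variance = 36.0  # assumed to be the corrected (unbiased) sample variance
t = 3.707        # Student table value for gamma = 0.99, n - 1 = 6 (assumed)

delta = t * sqrt(variance) / sqrt(n)  # half-width t * s / sqrt(n)
low, high = mean - delta, mean + delta
print(round(low, 2), round(high, 2))
```

If 36 were instead the uncorrected variance, s would have to be rescaled by √(n/(n − 1)) first and the interval would be somewhat wider.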

Confidence interval for variance

We assume that, generally speaking, the mathematical expectation is unknown and only the unbiased point estimate s² of the variance is known. Then the confidence interval for the variance is expressed through this estimate and the quantiles of the χ² distribution, which are determined from tables.

Example. From the data of 7 tests, the estimate of the standard deviation was found to be s = 12. Find, with probability 0.9, the width of the confidence interval constructed to estimate the variance.

Solution. Substituting the data into the formula for the confidence interval of the unknown population variance gives the bounds 71.708 and 465.589.

Then the width of the confidence interval is 465.589 − 71.708 = 393.881.

Confidence interval for probability (proportion)

Let the sample size n and the sample share (relative frequency) w be known in the problem. Then the confidence interval for the general share (true probability) p has the form

w − t·√(w(1 − w)/n) < p < w + t·√(w(1 − w)/n),

where the parameter t is determined from the Laplace function table using the relation 2Φ(t) = γ.

If the total size N of the population from which the sample was taken is also known, the confidence interval for the general share (true probability) can be found using the adjusted formula

w − t·√(w(1 − w)/n · (1 − n/N)) < p < w + t·√(w(1 − w)/n · (1 − n/N)).
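The two formulas for the general share can be sketched as follows. The numbers used (sample share 0.3, sample size 100, population size 1000, γ = 0.95) and the names `share_interval`, `w` and `population` are hypothetical and serve only as an illustration; they are not the data of the example in the text.

```python
from math import sqrt
from statistics import NormalDist

def share_interval(w, n, gamma, population=None):
    """Normal-approximation interval for a general share.
    If the population size is given, the finite-population factor
    (1 - n/N) is applied to the variance."""
    # Laplace relation 2*Phi(t) = gamma <=> standard normal CDF(t) = 0.5 + gamma/2
    t = NormalDist().inv_cdf(0.5 + gamma / 2)
    var = w * (1 - w) / n
    if population is not None:
        var *= 1 - n / population
    half = t * sqrt(var)
    return w - half, w + half

plain = share_interval(0.3, 100, 0.95)
adjusted = share_interval(0.3, 100, 0.95, population=1000)
print([round(b, 3) for b in plain], [round(b, 3) for b in adjusted])
```

The adjusted interval is always the narrower of the two, and the difference fades as n/N becomes small.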

Example. It is known that … Find the boundaries within which the general share is contained with the given probability.

Solution. We use the formula for the confidence interval of the general share, find the parameter t from the condition 2Φ(t) = γ, and substitute it into the formula.



Let a sample be taken from a general population that follows the normal distribution law, X ~ N(m; σ). This basic assumption of mathematical statistics rests on the central limit theorem. Let the general standard deviation σ be known, while the mathematical expectation m of the theoretical distribution (the mean value) is unknown.

In this case the sample mean x̄ obtained in the experiment (section 3.4.2) is also a random variable, x̄ ~ N(m; σ/√n). Then the "normalized" deviation u = (x̄ − m)·√n/σ ~ N(0;1) is a standard normal random variable.

The task is to find an interval estimate for m. Let us construct a two-sided confidence interval for m such that the true mathematical expectation belongs to it with a given probability (reliability) γ.

To set such an interval for the value u means to find the maximum value u_γ and the minimum value −u_γ of this quantity, which are the boundaries of the critical region: P(−u_γ < u < u_γ) = γ.

Since this probability equals 2Φ(u_γ), the root u_γ of the equation 2Φ(u_γ) = γ can be found from tables of the Laplace function (Table 3, Appendix 1).

Then with probability γ it can be asserted that |x̄ − m|·√n/σ < u_γ, that is, the desired general mean belongs to the interval

x̄ − u_γ·σ/√n < m < x̄ + u_γ·σ/√n. (3.13)

The quantity

Δ = u_γ·σ/√n (3.14)

is called the accuracy of the estimate.

The number u_γ, a quantile of the normal distribution, can be found as the argument of the Laplace function (Table 3, Appendix 1) from the relation 2Φ(u) = γ, i.e. Φ(u) = γ/2.

Conversely, from a given deviation Δ one can find the probability with which the unknown general mean belongs to the interval (x̄ − Δ; x̄ + Δ). To do this, compute

γ = 2Φ(Δ·√n/σ). (3.15)
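The inverse problem, finding the reliability from a given deviation, can be sketched as follows. The numbers (deviation 5, standard deviation 15, sample size 25) are illustrative assumptions, and the Laplace function is expressed through the standard normal distribution function as Φ(z) = F(z) − 0.5.

```python
from math import sqrt
from statistics import NormalDist

def reliability(deviation, sigma, n):
    """Reliability gamma = 2 * Phi(deviation * sqrt(n) / sigma),
    where Phi is the Laplace function."""
    z = deviation * sqrt(n) / sigma
    laplace = NormalDist().cdf(z) - 0.5  # Laplace function via the normal CDF
    return 2 * laplace

gamma = reliability(deviation=5, sigma=15, n=25)
print(round(gamma, 3))
```

With these inputs the reliability comes out slightly above 0.90.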

Let a random sample be drawn from the general population by repeated selection (sampling with replacement). From the equation Δ = u_γ·σ/√n one can find the minimum sample size n needed for the half-width of the confidence interval with a given reliability γ not to exceed the preset value Δ. The required sample size is estimated by the formula

n ≥ (u_γ·σ/Δ)². (3.16)

Let us examine the accuracy of the estimate Δ = u_γ·σ/√n:

1) As the sample size n increases, Δ decreases, and therefore the accuracy of the estimate increases.

2) As the reliability γ of the estimate increases, the argument u increases (because Φ(u) increases monotonically), and therefore Δ increases. Thus an increase in reliability reduces the accuracy of the estimate.

The estimate

Δ = t·σ/√n (3.17)

is called classical (where t is a parameter depending on γ and n), because it characterizes the most frequently encountered distribution laws.

3.5.3 Confidence intervals for estimating the mathematical expectation of a normal distribution with an unknown standard deviation 

Let it be known that the population follows the normal distribution law, X ~ N(m; σ), where the standard deviation σ is unknown.

To construct a confidence interval for the general mean in this case, the statistic T = (x̄ − m)·√n/s is used, which has a Student distribution with k = n − 1 degrees of freedom. This follows from the fact that (x̄ − m)·√n/σ ~ N(0;1) (see section 3.5.2), that (n − 1)s²/σ² ~ χ²(n − 1) (see section 3.5.3), and from the definition of the Student distribution (part 1, section 2.11.2).

Let us find the accuracy of the classical estimate for the Student distribution, i.e. find t from formula (3.17). Let the probability of the inequality |T| < t be given by the reliability γ:

P(|T| < t) = γ. (3.18)

Because T ~ St(n − 1), it is obvious that t depends on γ and n, so it is usually written t_{γ,n−1}:

F_{n−1}(t_{γ,n−1}) − F_{n−1}(−t_{γ,n−1}) = γ, (3.19)

where F_{n−1} is the Student distribution function with n − 1 degrees of freedom.

Solving the inequality |T| < t_{γ,n−1} for m, we get the interval x̄ − t_{γ,n−1}·s/√n < m < x̄ + t_{γ,n−1}·s/√n, which covers the unknown parameter m with reliability γ.

The quantity t_{γ,n−1}, used to determine the confidence interval of the random variable T(n − 1) distributed according to Student's law with n − 1 degrees of freedom, is called Student's coefficient. It is found from the given values of n and γ in the tables "Critical points of the Student distribution" (Table 6, Appendix 1), which contain the solutions of equation (3.19).

As a result, we obtain the following expression for the accuracy of the confidence interval for the mathematical expectation (general mean) when the variance is unknown:

Δ = t_{γ,n−1}·s/√n. (3.20)

Thus there is a general formula for constructing confidence intervals for the mathematical expectation of the population:

x̄ − Δ < m < x̄ + Δ,

where the accuracy Δ of the confidence interval is found from formula (3.14) when the variance is known and from formula (3.20) when it is unknown.

Problem 10. Some tests were carried out, the results of which are listed in the table:

x i

It is known that they obey the normal distribution law with a known standard deviation σ. Find the estimate m* of the mathematical expectation m and construct a 90% confidence interval for it.

Solution:

So, m ∈ (2.53; 5.47).

Problem 11. The depth of the sea is measured by a device whose systematic error is 0 and whose random errors are distributed according to the normal law with standard deviation σ = 15 m. How many independent measurements must be made to determine the depth with an error of no more than 5 m at a confidence level of 90%?

Solution:

By the conditions of the problem, X ~ N(m; σ), where σ = 15 m, Δ = 5 m, γ = 0.9. Let us find the sample size n.

1) For the given reliability γ = 0.9 we find from Table 3 (Appendix 1) the argument of the Laplace function u = 1.65.

2) Knowing the specified accuracy Δ = uσ/√n = 5, we find √n = uσ/Δ = 1.65 · 15/5 = 4.95, so n = 4.95² ≈ 24.5. Therefore the number of tests is n ≥ 25.
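The calculation of Problem 11 can be sketched in a few lines. The tabulated value u = 1.65 is taken from the text; the variable names are chosen for this illustration.

```python
from math import ceil

u = 1.65      # argument of the Laplace function for gamma = 0.9 (from the table)
sigma = 15.0  # standard deviation of a single measurement, m
delta = 5.0   # required accuracy, m

n_real = (u * sigma / delta) ** 2  # solves u * sigma / sqrt(n) = delta for n
n = ceil(n_real)                   # the sample size must be a whole number
print(round(n_real, 2), n)
```

Rounding up rather than to the nearest integer ensures that the required accuracy is not exceeded.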

Problem 12. A sample of temperatures t for the first 6 days of January is presented in the table:

Find the confidence interval for the mathematical expectation m of the population with the given confidence probability γ and estimate the general standard deviation s.

Solution:


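1) The sample means are x̄₁ = −175/6 ≈ −29.2 and x̄₂ = −192/6 = −32.

2) The unbiased estimate is found by the formula s² = Σ(x_i − x̄)²/(n − 1):

Σx_i = −175, Σ(x_i − x̄₁)² = 234.84, s₁ = √(234.84/5) ≈ 6.85;

Σx_i = −192, Σ(x_i − x̄₂)² = 116, s₂ = √(116/5) ≈ 4.8.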

3) Since the general variance is unknown but its estimate is known, to estimate the mathematical expectation m we use the Student distribution (Table 6, Appendix 1) and formula (3.20).

Since n₁ = n₂ = 6, the number of degrees of freedom is k = 5 in both cases. For x̄₁ = −29.2 and s₁ = 6.85, formula (3.20) gives Δ₁ = t_{γ,5}·s₁/√6 ≈ 4.1, hence −29.2 − 4.1 < m₁ < −29.2 + 4.1.

Therefore −33.3 < m₁ < −25.1.

Similarly, for x̄₂ = −32 and s₂ = 4.8 we have Δ₂ ≈ 2.9, so

−34.9 < m₂ < −29.1. The confidence intervals then take the form m₁ ∈ (−33.3; −25.1) and m₂ ∈ (−34.9; −29.1).

In applied sciences, for example, in construction disciplines, confidence interval tables are used to assess the accuracy of objects, which are given in the relevant reference literature.

Let the random variable X of the population be normally distributed, and let the variance D and the standard deviation σ of this distribution be known. It is required to estimate the unknown mathematical expectation a from the sample mean x̄. The task then reduces to finding a confidence interval for the mathematical expectation with reliability β. Given the value of the confidence probability (reliability) β, the probability that the interval covers the unknown mathematical expectation can be written using formula (6.9a):

P(|x̄ − a| < ε) = 2Φ(t) = β, with ε = tσ/√n,

where Φ(t) is the Laplace function (5.17a).

As a result, we can formulate an algorithm for finding the boundaries of the confidence interval for the mathematical expectation when the variance D = σ² is known:

  1. Set the reliability value β.
  2. From (6.14) express Φ(t) = 0.5·β. Select the value of t from the table of the Laplace function for this value of Φ(t) (see Appendix 1).
  3. Calculate the deviation ε using formula (6.10).
  4. Write down the confidence interval using formula (6.12), such that with probability β the inequality holds:

x̄ − ε < a < x̄ + ε.

Example 5.

The random variable X has a normal distribution. Find the confidence interval for estimating, with reliability β = 0.96, the unknown mathematical expectation a, given:

1) the general standard deviation σ = 5;

2) the sample mean x̄ = 30;

3) the sample size n = 49.

In formula (6.15) for the interval estimate of the mathematical expectation a with reliability β, all quantities except t are known. The value of t can be found using (6.14): β = 2Φ(t) = 0.96, so Φ(t) = 0.48.

Using the table in Appendix 1 for the Laplace function, for Φ(t) = 0.48 we find the corresponding value t = 2.06. Hence ε = tσ/√n = 2.06 · 5/√49 ≈ 1.47. Substituting the calculated value of ε into formula (6.12) gives the confidence interval 30 − 1.47 < a < 30 + 1.47.

The required confidence interval for estimating the unknown mathematical expectation with reliability β = 0.96 is 28.53 < a < 31.47.
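The four steps of the algorithm, applied to Example 5, can be sketched as follows. The Laplace table lookup is replaced here by the standard normal quantile, using the fact that Φ(t) = β/2 for the Laplace function corresponds to the level 0.5 + β/2 of the ordinary distribution function; the variable names are assumptions of this illustration.

```python
from math import sqrt
from statistics import NormalDist

beta = 0.96   # step 1: reliability
sigma = 5.0   # general standard deviation
n = 49        # sample size
mean = 30.0   # sample mean

# Step 2: Phi(t) = beta / 2 for the Laplace function,
# i.e. the standard normal quantile of level 0.5 + beta / 2.
t = NormalDist().inv_cdf(0.5 + beta / 2)

# Step 3: deviation eps = t * sigma / sqrt(n)
eps = t * sigma / sqrt(n)

# Step 4: the confidence interval mean ± eps
low, high = mean - eps, mean + eps
print(round(t, 3), round(eps, 2), round(low, 2), round(high, 2))
```

The computed t is slightly below the table value 2.06 used in the text, but after rounding to two decimals the interval is the same.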

To begin with, recall the following definition:

Consider the following situation. Let the population values have a normal distribution with mathematical expectation $a$ and standard deviation $\sigma$. The sample mean is then treated as a random variable. When the quantity $X$ is normally distributed, the sample mean is also normally distributed, with the parameters

\[M\left(\overline{x}\right)=a,\quad \sigma\left(\overline{x}\right)=\frac{\sigma}{\sqrt{n}}\]

Let us find a confidence interval that covers the value $a$ with reliability $\gamma$.

To do this, we need the equality

\[P\left(\left|\overline{x}-a\right|<\delta\right)=2\Phi\left(\frac{\delta\sqrt{n}}{\sigma}\right)=2\Phi\left(t\right)=\gamma\]

From it we get

\[\Phi\left(t\right)=\frac{\gamma}{2}\]

From here we can easily find $t$ from the table of values of the function $\Phi\left(t\right)$ and, as a consequence, find $\delta$.

Let us recall the table of values of the function $\Phi\left(t\right)$:

Figure 1. Table of values of the function $\Phi\left(t\right)$.

Confidence interval for estimating the mathematical expectation for an unknown $\sigma$

In this case we use the corrected variance $S^2$. Replacing $\sigma$ with $S$ in the above formula, we get:

\[\delta=\frac{St}{\sqrt{n}}\]

Example problems for finding a confidence interval

Example 1

Let the quantity $X$ have a normal distribution with standard deviation $\sigma =4$. Let the sample size be $n=64$ and the reliability be $\gamma =0.95$. Find the confidence interval for estimating the mathematical expectation of this distribution.

We need to find the interval $(\overline{x}-\delta ,\overline{x}+\delta)$.

As we saw above

\[\delta =\frac{\sigma t}{\sqrt{n}}=\frac{4t}{\sqrt{64}}=\frac{t}{2}\]

The parameter $t$ can be found from the formula

\[\Phi\left(t\right)=\frac{\gamma }{2}=\frac{0.95}{2}=0.475\]

From Table 1 we find that $t=1.96$.
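The half-width for this example can be checked numerically. The sketch below replaces the table lookup with the standard normal quantile of level 0.5 + γ/2, which corresponds to $\Phi(t)=\gamma/2$ for the Laplace function. Since the sample mean is not given in this part of the example, only $\delta$ is computed.

```python
from math import sqrt
from statistics import NormalDist

sigma = 4.0   # standard deviation
n = 64        # sample size
gamma = 0.95  # reliability

t = NormalDist().inv_cdf(0.5 + gamma / 2)  # solves Phi(t) = gamma / 2
delta = sigma * t / sqrt(n)                # here equal to t / 2
print(round(t, 2), round(delta, 2))
```

For any sample mean the interval is then the mean minus and plus this half-width, which is about 0.98.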