
Methods of information processing and forecasting, for students of the specialty "Management of Organizations". Tabular values of the Irwin criterion for the extreme elements of the variation series (V.V. Zalyazhnykh).

Let x_1, ..., x_n be the observed sample and x_(1) <= ... <= x_(n) the variation series constructed from it. The hypothesis H0 to be tested is that all x_i belong to the same population (no outliers). The alternative hypothesis H1 is that there are outliers in the observed sample.

According to the Chauvenet criterion, an element of a sample of size n is an outlier if the probability of a deviation from the mean at least as large as its own is not greater than 1/(2n).

The Chauvenet statistic is compiled as

Z = |x_i - x̄| / S,   (1.1)

where x̄ is the sample mean,

x̄ = (1/n) Σ x_i,   (1.2)

and S² is the sample variance,

S² = (1/(n - 1)) Σ (x_i - x̄)².   (1.3)

Let us determine the distribution of this statistic when the hypothesis H0 holds. To do this, we assume that even at small n the random variables x_i - x̄ and S are independent; the density of the statistic then takes a known closed form (the formula itself is not reproduced here).

The values of this distribution function can be calculated in the Maple 14 mathematical package by substituting the obtained values for the unknown parameters.

If the statistic exceeds its critical value, the corresponding value x_i should be recognized as an outlier. Critical values are given in the table (see Appendix A). To check for outliers, the extreme values of the variation series are substituted into formula (1.1).
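As a sketch of the rule above in Python (the function name `chauvenet_outliers` and the data are invented for the illustration; the rejection rule is taken in the common form n·P < 0.5, with P the two-sided normal tail probability of the standardized deviation (1.1)):

```python
import math

def chauvenet_outliers(xs):
    """Flag elements by Chauvenet's criterion: reject x when the expected
    number of observations deviating from the mean at least as far as x
    (n times the two-sided normal tail probability) is below 1/2."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    flagged = []
    for x in xs:
        z = abs(x - mean) / s            # statistic (1.1)
        p = math.erfc(z / math.sqrt(2))  # two-sided normal tail probability
        if n * p < 0.5:
            flagged.append(x)
    return flagged
```

For example, in the sample [10.1, 10.2, 9.9, 10.0, 10.15, 9.95, 10.05, 25.0] only the value 25.0 is flagged.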

Irwin's criterion

This criterion is used when the distribution variance is known in advance.

A sample of size n is taken from a normal general population, and the variation series (the sample sorted in ascending order) is compiled. The same hypotheses H0 and H1 are considered as for the previous criterion.

If the Irwin statistic (the gap between the two extreme members of the variation series, in units of the standard deviation) exceeds the critical value, the largest (smallest) value is recognized as an outlier at the chosen significance level. Critical values are listed in the table.
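The Irwin statistic for the largest member of the variation series can be sketched as follows (a hypothetical helper; the statistic is taken as the gap between the two largest order statistics in units of the standard deviation, as in formula (4) later in the text, with the sample estimate substituted when the general sigma is unknown):

```python
import math

def irwin_lambda(xs, sigma=None):
    """Irwin statistic for the largest element: the gap between the two
    top order statistics divided by the standard deviation. When the
    general sigma is unknown, the sample estimate is used (strictly,
    separate percentage points apply in that case)."""
    v = sorted(xs)
    n = len(v)
    if sigma is None:
        mean = sum(v) / n
        sigma = math.sqrt(sum((x - mean) ** 2 for x in v) / (n - 1))
    return (v[-1] - v[-2]) / sigma
```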

Grubbs criterion

Let a sample be extracted and the variation series built from it. The hypothesis H0 to be tested is that all x_i (i = 1, ..., n) belong to the same general population. When checking whether the largest sample value is an outlier, the alternative hypothesis is that all observations except the largest belong to one law, while the largest belongs to some other law significantly shifted to the right. In this case the statistic of the Grubbs test has the form

G = (x_(n) - x̄) / S,

where x̄ is calculated by formula (1.2) and S by (1.3).

When testing whether the smallest sample value is an outlier, the alternative hypothesis assumes that it belongs to some other law significantly shifted to the left. In this case the calculated statistic takes the form

G' = (x̄ - x_(1)) / S,

where x̄ is calculated by formula (1.2) and S by (1.3).

Statistics based on the general standard deviation σ are applied when the variance is known in advance; statistics based on the sample estimate S, when the variance is estimated from the sample by relation (1.3).

The maximum or minimum element of the sample is considered an outlier if the value of the corresponding statistic exceeds the critical value for the specified significance level α. Critical values are given in summary tables (see Appendix A). Under the null hypothesis, the statistics of this test have the same distribution as the statistic of the Chauvenet test.
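A minimal sketch of the two Grubbs statistics (assuming the standard forms G = (x_(n) - x̄)/S and G' = (x̄ - x_(1))/S, since the formula images are not reproduced in the text):

```python
import math

def grubbs_statistics(xs):
    """Grubbs statistics for the largest and smallest sample values:
    G_max = (x_(n) - mean)/S and G_min = (mean - x_(1))/S, with S the
    sample standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    v = sorted(xs)
    return (v[-1] - mean) / s, (mean - v[0]) / s
```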

For n > 25, one can use approximations for the critical values,

where u_α is the quantile of the standard normal distribution.

The quantile itself is approximated as follows:

If the variance (σ²) and the mathematical expectation (µ, the mean value) of the extracted sample are known, then the following statistics are used:

The critical values of these statistics are also listed in the tables. If a statistic exceeds its critical value, the outlier is considered significant and the alternative hypothesis is accepted.


Tasks for self-study.

Exercise 1. In accordance with your variant, simulate a set of empirical data obtained by measuring a one-dimensional attribute. To do this, tabulate the function

y_t = f(t) + ε_t,   t = 1, 2, ...,

and obtain 15-20 consecutive data points. Here f(t) is the systematic component (it reflects the main trend of the attribute), and ε_t is the measurement noise (errors) resulting from various random effects.

Initial data options:

Detect the anomalous levels of the data series obtained by tabulating the function, and smooth them:

a) by Irwin's method, according to the formula

λ_t = |y_t - y_{t-1}| / S_y,   t = 2, 3, ..., n,

where S_y is the standard deviation of the series.

The calculated values ​​are compared with the tabular values ​​of the Irwin criterion:

Irwin's test table

The table shows the values of the Irwin test for the significance level α = 0.05 (i.e., with a 5% error).

b) by checking the difference of mean levels: break the time series into two approximately equal parts and calculate the mean and the variance of each part. Then check the equality of the variances of the two parts using the Fisher test. If the hypothesis of equal variances is accepted, test the hypothesis of the absence of a trend using Student's t-test, calculating the empirical value of the statistic as

t = |ȳ₁ - ȳ₂| / S_d,

where S_d is the standard deviation of the difference of the mean levels, computed from the pooled variance of the two parts:

S_d = S_p · sqrt(1/n₁ + 1/n₂).

Compare the calculated value of the statistic with the tabular one.
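The check of item b) can be sketched as follows (the function name and the equal-halves split are illustrative; the returned F and t values are then compared with the Fisher and Student tables):

```python
import math

def trend_check(series):
    """Split the series into two halves and return (F, t): F is the ratio
    of the larger half variance to the smaller one (Fisher statistic);
    t is Student's statistic for the difference of the half means, using
    the pooled standard deviation (valid once equal variances are accepted)."""
    m = len(series) // 2
    a, b = series[:m], series[m:]

    def stats(x):
        n = len(x)
        mean = sum(x) / n
        var = sum((v - mean) ** 2 for v in x) / (n - 1)
        return n, mean, var

    n1, m1, v1 = stats(a)
    n2, m2, v2 = stats(b)
    F = max(v1, v2) / min(v1, v2)
    sp = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    t = abs(m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return F, t
```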

c) by the Foster-Stuart method.

2. Perform mechanical smoothing of the levels of the series:

a) by the simple moving average method;

b) by the weighted moving average method;

c) by the exponential smoothing method.
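The three smoothing methods of item 2 can be sketched as follows (window sizes and weights are illustrative choices):

```python
def simple_ma(y, w):
    """Simple moving average with window w (valid positions only)."""
    return [sum(y[i:i + w]) / w for i in range(len(y) - w + 1)]

def weighted_ma(y, weights):
    """Weighted moving average; the weights should sum to 1."""
    w = len(weights)
    return [sum(c * v for c, v in zip(weights, y[i:i + w]))
            for i in range(len(y) - w + 1)]

def exp_smooth(y, alpha):
    """Exponential smoothing: S_t = alpha*y_t + (1 - alpha)*S_{t-1}."""
    s = [y[0]]
    for v in y[1:]:
        s.append(alpha * v + (1 - alpha) * s[-1])
    return s
```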

Task 2. In the table of economic indicators, a time series of monthly volumes of transportation of agricultural goods (tied to a certain region), in conventional units, is given.

Apply the Chetverikov method to extract the components of the time series:

a) smooth the empirical series using a centered moving average with the given smoothing period;

b) subtract the obtained preliminary trend estimate from the initial empirical series;

c) calculate for each year (row by row) the standard deviation of the deviations using the given formula;

d) find the preliminary value of the average seasonal wave;

e) obtain a series free of the seasonal wave;

f) smooth the resulting series using a simple moving average with a smoothing interval of five, and obtain a new trend estimate;

g) calculate the deviations of the new trend estimate from the original empirical series;

h) process the resulting deviations in accordance with items c) and d) to identify new values of the seasonal wave;

i) calculate the intensity factor of the seasonal wave according to the given formulas.

The intensity factor is not calculated for the first and last years.

j) using the intensity factor, calculate the final values of the seasonal component of the time series.
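The first step of the scheme, the centered moving average of item a), can be sketched for monthly data (an even period, 12, is assumed for the example):

```python
def centered_ma(y, period):
    """Centered moving average for an even smoothing period: the average
    of two adjacent simple means, so the estimate is aligned with an
    actual observation (the usual first step of the Chetverikov scheme)."""
    half = period // 2
    out = []
    for i in range(half, len(y) - half):
        first = sum(y[i - half:i + half]) / period
        second = sum(y[i - half + 1:i + half + 1]) / period
        out.append((first + second) / 2)
    return out
```

For a purely linear series the centered average reproduces the central values exactly, which is a convenient sanity check.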

Task 3. The time series is given in the table:

1. Make a preliminary selection of the best growth curve:

a) by the method of finite differences (Tintner);

b) by the method of growth characteristics.

2. For the original series, construct a linear model, determining its parameters by the least squares method.

3. For the initial time series, build adaptive Brown models with the given smoothing parameters, and choose the best Brown model for the lead time k (the number of steps ahead).

4. Assess the adequacy of the models by examining:

a) the closeness of the mathematical expectation of the residual component to zero; accept the critical value of Student's statistic for the confidence level 0.70;

b) the randomness of deviations of the residual component by the peaks (turning points) criterion; perform the calculations using the given relation;

c) the independence (absence of autocorrelation) of the levels of the residual series, either by the Durbin-Watson test (use the given levels as critical ones) or by the first autocorrelation coefficient (take the given critical level);

d) the normality of the distribution law of the residual component by the RS criterion (accept the interval (2.7, 3.7) as the critical levels).

5. Evaluate the accuracy of the models using the standard deviation and the mean relative error of approximation.

6. Based on a comparative analysis of the adequacy and accuracy of the models, choose the best model and use it to build point and interval forecasts two steps ahead (k = 2). Show the forecasting results graphically.
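Brown's adaptive model of item 3 can be sketched via double exponential smoothing (a common formulation of the Brown linear model; initializing both smoothed series with the first level is an assumption):

```python
def brown_forecast(y, alpha, k=2):
    """Brown's linear model (double exponential smoothing): forecast
    y_{t+k} = a_t + b_t*k, where a_t and b_t are rebuilt from the two
    smoothed series s1 and s2. Both series start at y[0] (an assumption)."""
    s1 = s2 = y[0]
    for v in y:
        s1 = alpha * v + (1 - alpha) * s1
        s2 = alpha * s1 + (1 - alpha) * s2
    a = 2 * s1 - s2
    b = alpha / (1 - alpha) * (s1 - s2)
    return a + b * k
```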

Task 4. The processors of 10 local-network workstations are evaluated; the workstations are built on machines of approximately the same type but from different manufacturers (which implies some deviation of the machine parameters from the base model). To test the operation of the processors, an ICOMP 2.0-type benchmark mix was used, based on two main tests:

1. 125.turb3D, a turbulence simulation test in a cubic volume (application software);

2. NortonSI32, an engineering program of the AutoCAD type,

and an auxiliary test, SPECint_base95, for normalizing the data processing time. The processors were evaluated by the weighted execution time of the mix, normalized by the efficiency of the base processor, in accordance with formula (1):

where T_i is the execution time of the i-th test;

w_i is the weight of the test;

E_i is the efficiency of the base processor on the i-th test.

Taking the logarithm of expression (1), we get:

and after renaming the variables:

y is the logarithm of the base-test (SPECint_base95) processing time;

x1 is the logarithm of the processing time of the first test;

x2 is the logarithm of the processing time of the second test;

b1 is the regression coefficient obtained in the estimation (the weight of the first test);

b2 is the regression coefficient, the weight of the integer-arithmetic (base) test.

1. Using the measurement data given in the table, build the regression (empirical) function, estimate the regression coefficients, and check the model for adequacy (calculate the covariance matrix, the pair correlation coefficients, and the coefficient of determination).
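The estimation of item 1 can be sketched without libraries via the normal equations; the data below are invented so that the exact coefficients (1, 2, 3) are recoverable:

```python
def ols_two_factor(x1, x2, y):
    """Least-squares estimates for y = b0 + b1*x1 + b2*x2, obtained by
    forming the normal equations (X^T X) b = X^T y for the columns
    [1, x1, x2] and solving the 3x3 system by Gaussian elimination."""
    n = len(y)
    cols = [[1.0] * n, list(x1), list(x2)]
    A = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(3)]
         for i in range(3)]
    b = [sum(c * v for c, v in zip(cols[i], y)) for i in range(3)]
    for col in range(3):  # forward elimination with partial pivoting
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        beta[r] = (b[r] - sum(A[r][c] * beta[c] for c in range(r + 1, 3))) / A[r][r]
    return beta
```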

Data options:

Option 1.

Option 2.

Option 3.

Option 4.


  • Gross errors (misses) are among the errors that change randomly with repeated observations. They clearly exceed in their value the errors justified by the conditions of the experiment. The miss is understood as the value of the error, the deviation of which from the distribution center significantly exceeds the value justified by the objective conditions of measurement. Therefore, from the point of view of probability theory, the occurrence of a miss is unlikely.

    Gross errors can be caused by uncontrolled changes in measurement conditions, malfunction, operator errors, etc.

    To eliminate gross errors, the apparatus for testing statistical hypotheses is used.

    In metrology, statistical hypotheses are used, which are understood as hypotheses about the form of an unknown distribution, or about the parameters of known distributions.

    Examples of statistical hypotheses:

    the considered sample (or an individual result from it) belongs to the general population;

    the general population is distributed according to the normal law;

    the variances of two normal populations are equal to each other.

    The first two hypotheses make an assumption about the type of an unknown distribution and about whether individual (suspicious) results belong to a distribution of this type; the third, about the parameters of two known distributions. Along with the hypothesis put forward, a hypothesis that contradicts it is also considered. The hypothesis put forward is called the null (basic) hypothesis; the competing (alternative) hypothesis is the one that contradicts the null.

    When proposing and accepting a hypothesis, the following four cases can occur:

    the hypothesis is accepted, and in fact it is correct;

    the hypothesis is true, but it is wrongly rejected. The resulting error is called an error of the first kind, and the probability of its occurrence is called the significance level and is denoted q (α);

    the hypothesis is rejected, and in reality it is incorrect;

    The hypothesis is incorrect, but is erroneously accepted. The error that occurs in this case is called an error of the second kind, and the probability of its occurrence is denoted by β .

    The value 1 - β, i.e. the probability that the hypothesis will be rejected when it is wrong, is called the power of the test.

    It should be noted that in the regulatory documentation on statistical product quality control and textbooks on quality management, the probability of recognizing a batch of good products as unusable (i.e., making a mistake of the first kind) is called the “manufacturer's risk”, and the probability of accepting an unusable batch is called “consumer risk” .

    All statistical criteria are random variables taking certain values (given in tables of critical values). The area of acceptance of the hypothesis (the area of allowed values) is the set of criterion values under which the hypothesis is accepted. The critical area is the set of criterion values at which the null hypothesis is rejected. The area of acceptance of the hypothesis and the critical area are separated by critical points, which are the tabular values of the criteria.

    The area of ​​rejection of the hypothesis, as shown in Figure 1, can be one-sided (right-sided or left-sided) and two-sided.

    A right-sided critical region is defined by the inequality K_obs > k_cr, where k_cr is a positive number (Figure 1, a).

    A left-sided critical region is defined by the inequality K_obs < k_cr, where k_cr is a negative number (Figure 1, b).

    A two-sided critical region is defined by the inequalities K_obs < k_1, K_obs > k_2, where k_2 > k_1.

    If the critical points are symmetric with respect to zero, the two-sided critical region is determined by the inequalities K_obs < -k_cr, K_obs > k_cr, or by the equivalent inequality |K_obs| > k_cr (Figure 1, c).

    Figure 1 - Graphical interpretation of the distribution of the area of ​​acceptance of the hypothesis

    The basic principle of testing statistical hypotheses is formulated as follows: if the observed (experimental) value of the criterion belongs to the critical region, the hypothesis is rejected; if the observed value of the criterion belongs to the acceptance area of ​​the hypothesis, the hypothesis is accepted.

    Statistical hypothesis testing is carried out for the accepted significance level q (taken equal to 0.1, 0.05, 0.01, etc.). Thus, an accepted significance level q = 0.05 means that the null statistical hypothesis put forward can be accepted with confidence P = 0.95, or, equivalently, that the probability of rejecting this hypothesis when it is true (making an error of the first kind) is q = 0.05.

    The null statistical hypothesis asserts that the tested "suspicious" result of measurement (observation) belongs to the given group of measurements.

    The formal criterion for an anomalous observation result (and, consequently, the basis for accepting the competing hypothesis that the "suspicious" result does not belong to the given group of measurements) is a boundary spaced from the distribution center by the value tS, i.e.:

    (1)

    where x_i is the observation result checked for the presence of a gross error; t is a coefficient depending on the type of distribution law, the sample size, and the significance level; S is the RMS (standard deviation).

    Thus, the margins of error depend on the type of distribution, the sample size, and the chosen confidence level.

    When processing already available observation results, one should not arbitrarily discard individual results, as this may lead to a fictitious increase in the accuracy of the measurement result. A group of measurements (a sample) may contain several gross errors; they are eliminated sequentially, one at a time.

    All methods for eliminating gross errors (misses) can be divided into two main types:

    Exclusion methods with a known general RMS;

    Exclusion methods with an unknown general RMS.

    In the first case, the estimate of the distribution center and the RMS are calculated from the results of the entire sample; in the second case, the suspicious results are removed from the sample before the calculation.

    In the case of a limited number of observations and (or) difficulty in estimating the parameters of the distribution law, it is recommended to exclude gross errors using approximate coefficients of the distribution type. This excludes the values x_i < x_r- and x_i > x_r+, where x_r-, x_r+ are the miss limits determined by the expressions:

    (2),(3)

    where A is a coefficient whose value is selected, depending on the specified confidence probability, in the range from 0.85 to 1.30 (it is recommended to take the maximum value A = 1.3); γ is the counter-kurtosis, whose value depends on the form of the distribution law of the quantity.

    After the elimination of misses, the operation to determine the estimates of the distribution center and the standard deviation of the results of observations and measurements must be repeated.

    Since in practice measurements with an unknown RMS (a limited number of observations) are more common, the manual considers the following criteria for checking suspicious (in terms of errors) observation results: Irwin, Romanovsky, variation range, Dixon, Smirnov, and Chauvenet.

    Since the criterion requirements (coefficients) that determine the boundary beyond which observation results are considered "gross" differ from author to author, the check should be performed simultaneously by several criteria (it is recommended to use at least three of those considered below). The final conclusion on whether the "suspicious" results belong to the considered set of observations should be made by the majority of the criteria. In addition, the criterion for detecting gross errors should be chosen after constructing a histogram of the observation results: from the shape of the histogram, a preliminary identification of the type of distribution law is made (normal, close to normal, or different from it).

    Irwin's criterion. For the experimental data obtained, the coefficient is determined by the formula:

    (4)

    where x_{n+1} and x_n are the largest values of the random variable; S is the standard deviation calculated over all sample values.

    Then this coefficient is compared with the table value λ q, the possible values ​​of which are given in Table 1.

    Table 1 - Irwin's criterion λ q.

    If λ > λ_q, then the null hypothesis is not confirmed, i.e. the result is erroneous, and it should be excluded from further processing of the observation results.

    Romanovsky criterion. The competing hypothesis about the presence of gross errors in suspicious results is confirmed if the following inequality is true:

    (5)

    where t_p is the quantile of Student's distribution for the given confidence probability with the number of degrees of freedom k = n - k_n (k_n is the number of suspicious observations). A fragment of the table of Student distribution quantiles is presented in Table 2.

    The point estimate of the distribution center and the RMS S of the observation results are calculated without the k_n suspicious observations.

    Table 2 - Student's criterion t_p (Student quantiles)

    Criterion of variation range. This is one of the simplest methods for excluding a gross measurement error (miss). To use it, determine the range of the variation series, i.e. of the ordered set of observations (x_1 ≤ x_2 ≤ ... ≤ x_k ≤ ... ≤ x_n):

    If any member of the variation series, for example x k , differs sharply from all others, then a check is made using the following inequality:

    (7)

    where X̄ is the sample arithmetic mean, computed after excluding the suspected miss; z is the criterion value.

    The null hypothesis (the absence of a gross error) is accepted if the indicated inequality holds. If x_k does not satisfy condition (7), then this result is excluded from the variation series.

    The coefficient z depends on the number of members of the variation series n; its values are presented in Table 3.

    Table 3 - Variation range criterion

    Dixon's criterion. The criterion is based on the assumption that the measurement errors obey the normal law (a histogram of the observation results should first be built and the hypothesis of belonging to the normal distribution law tested). When using the criterion, the Dixon coefficient (the observed value of the criterion) is calculated for the largest or smallest extreme value, depending on the number of measurements. Table 4 shows the formulas for calculating the coefficients. The coefficients r10 and r11 are applied when there is one outlier, and r21 and r22 when there are two. A preliminary ordering of the measurement results (the variation series) is required. The criterion may be applied when the sample contains more than one gross error.

    Table 4 - Dixon coefficient formulas

    The values of the Dixon coefficients r calculated for the sample by these formulas are compared with the accepted (tabular) value of the Dixon criterion r_q (Table 5).

    The null hypothesis of the absence of a gross error is accepted if the inequality r < r_q holds. If r > r_q, then the result is recognized as a gross error and is excluded from further processing.

    Table 5 - Critical values of the Dixon coefficients (at the accepted significance level q)
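For a single suspected outlier at the top of the ordered sample, the r10 coefficient of Table 4 has the well-known form sketched below (the r11, r21, and r22 variants differ only in which order statistics enter the numerator and denominator):

```python
def dixon_r10(xs):
    """Dixon r10 statistic for the largest value of an ordered sample:
    r10 = (x_n - x_{n-1}) / (x_n - x_1)."""
    v = sorted(xs)
    return (v[-1] - v[-2]) / (v[-1] - v[0])
```

For the sample [1, 2, 3, 4, 10] this gives 6/9 ≈ 0.667, to be compared with the tabular r_q.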

    The three-sigma rule. This is one of the simplest checks for results obeying the normal distribution law. The essence of the rule: if a random variable is normally distributed, the absolute value of its deviation from the mathematical expectation does not exceed three times the standard deviation.

    In practice, the three-sigma rule is applied as follows: if the distribution of the random variable under study is unknown but the condition given in the rule is met, there is reason to assume that the variable is normally distributed; otherwise it is not. For this purpose, the distribution center and the estimate of the standard deviation of the observation results are calculated for the sample (including the suspicious result). A result whose deviation from the distribution center exceeds 3S is considered to contain a gross error and is removed, after which the previously calculated distribution characteristics are refined.

    Similar to this criterion is Wright's criterion, based on the fact that if the residual error is greater than four sigma, then the measurement result is a gross error and should be excluded from further processing. Both criteria are reliable when the number of measurements exceeds 20-50. Their use is legitimate when the general standard deviation (S) is known.

    It may turn out that, for the new values of the center and S, other results fall into the anomalous category.
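The three-sigma screening described above can be sketched as follows (the sample standard deviation is used as the estimate S):

```python
import math

def three_sigma_outliers(xs):
    """Three-sigma rule: flag values whose deviation from the mean
    exceeds three sample standard deviations."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return [x for x in xs if abs(x - mean) > 3 * s]
```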

    Smirnov's criterion. The Smirnov criterion is used for sample sizes n ≥ 25, or when the general mean and the general RMS are known. It sets less rigid bounds on the gross error. To implement this criterion, the actual values of the distribution quantiles (the observed value of the criterion) are calculated by the formula:

    (8)

    The found value is compared with the criterion value β_k given in Table 6.

    Table 6 - Distribution quantiles β k

    Chauvenet's criterion. The Chauvenet criterion is used for laws that do not contradict the normal one and is based on determining the expected number n_exp of observation results that have errors as large as the suspicious one. The hypothesis of the presence of a gross error is accepted if the following condition is met:

    The procedure for testing the hypothesis is as follows:

    1) the arithmetic mean and the standard deviation S of the observation results are calculated for the entire sample;

    2) from the table of the normalized normal distribution (Appendix 1, the integral function of the normalized normal distribution), using the value of the normalized deviation,

    the probability of occurrence of the suspicious result in the general population of n numbers is determined:

    (9)

    3) the expected number of results n_exp is determined by the formula:

    The above criteria often turn out to be "hard". In that case it is recommended to use the gross-error criterion k, which depends on the sample size n and the accepted confidence level P.

    Table 7 - Dependence of the gross-error criterion k on the sample size n and the confidence level P

    For distributions other than normal (such classes as round-topped bimodal compositions of normal and discrete distributions with kurtosis ε = 1.5-3.0; peaked bimodal compositions of a discrete two-valued distribution and a Laplace distribution with kurtosis ε = 1.5-6.0; compositions of a uniform distribution with an exponential distribution with kurtosis ε = 1.8-6.0; and the class of exponential distributions within the kurtosis range ε = 1.8-6.0), the limit of gross error is determined by the value ±(t_gr·σ) or ±(t_gr·S), where:

    (11)

    where γ is the counter-kurtosis;

    (12)

    The errors in determining the estimates S and t_gr are negatively correlated: an increase in the standard deviation S is accompanied by a decrease in t_gr. Therefore, determining the boundaries of gross error for laws other than normal with kurtosis ε < 6 using the criterion t_gr is sufficiently accurate and can be widely used in practice.

    The estimates of the distribution center, S, and ε should be calculated after excluding the suspicious results from the sample. After the gross-error boundaries have been calculated, the observation results lying inside the boundaries are returned, and the previously found distribution characteristics are refined.

    For a uniform distribution, the value ±1.8·S may be taken.

    Consider an example of applying the criteria to eliminate gross errors in measurements of the shock wave velocity. The results are presented in Table 8.

    Table 8 - Results of observations

    It is required to determine whether the observation result V = 3.50 km/s contains a gross error.

    For a graphical determination of the form of the distribution law, we construct a histogram. In its construction, the division into intervals is carried out so that the measured values fall at the midpoints of the intervals, as shown in Figure 2.

    The Irwin criterion is used to assess questionable sample values for gross errors. The order of its application is as follows.

    Find the calculated value of the criterion λ_calc = |x_k - x_{k-1}| / σ,

    where x_k is the questionable value; x_{k-1} is the previous value in the variation series if x_k is assessed among the maximum values of the series, or the next value if x_k is assessed among the minimum values (in the general case, Irwin used the term "first value"); σ is the general standard deviation (SD) of a continuous normally distributed random variable.

    If λ_calc > λ_tab, then x_k is a gross error. Here λ_tab is the tabular value (percentage point) of the Irwin criterion.

    The questions that arise in this case are discussed in the source cited. In particular, in the original article the tabular values of the criterion were calculated for a normally distributed random variable with a known general standard deviation (SD) σ. Since σ is most often unknown, Irwin proposed using in the calculations, instead of σ, the sample standard deviation s determined by the formula

    where n is the sample size, x_i are the elements of the sample, and x̄ is the sample mean.

    This approach is usually used in practice. However, the acceptability of using the sample standard deviation, and with it percentage points computed for the general standard deviation, had not been confirmed.

    This article presents tabular values (percentage points) of the Irwin criterion calculated by statistical computer simulation using the sample standard deviation, for the maximum value of the variation series with a standard normal distribution of the random variable (with other parameters of the normal distribution, and for the minimum value of the variation series, the same results are obtained). For each sample size n, 10^6 samples were simulated. Preliminary calculations showed that the difference between parallel determinations of a percentage-point value can reach 0.003. Since the values were rounded to 0.01, in doubtful cases 2 to 4 parallel determinations were performed.
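The simulation scheme described here can be sketched in Python (with far fewer trials than the 10^6 used for the published table, so the estimate is rough; the function name and seed are arbitrary):

```python
import math
import random

def irwin_percentage_point(n, alpha, trials=20000, seed=1):
    """Monte Carlo estimate of the Irwin critical value for the maximum
    of a variation series, using the sample standard deviation: simulate
    standard normal samples, compute (x_(n) - x_(n-1))/s for each, and
    take the (1 - alpha) empirical quantile."""
    random.seed(seed)
    stats = []
    for _ in range(trials):
        xs = sorted(random.gauss(0.0, 1.0) for _ in range(n))
        mean = sum(xs) / n
        s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
        stats.append((xs[-1] - xs[-2]) / s)
    stats.sort()
    return stats[int((1 - alpha) * trials)]
```

With enough trials, the value for n = 10 and alpha = 0.05 should approach the tabulated 1.44.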

    In addition, tabular values of the Irwin criterion for a known general SD were calculated from the published data and compared with those given in the literature.

    Since in the practical application of the Irwin criterion difficulties often arise because tabular values of the criterion for some sample sizes are absent from the literature, some of the missing tabular values were calculated by the same method of statistical computer simulation.

    It is clear that with a sample size of 2 the application of the criterion with the sample standard deviation makes no sense. This is confirmed by the fact that simplifying the expression for the calculated value of the criterion with the sample standard deviation yields the square root of two for any such sample, which shows plainly that applying the criterion with a sample size of 2 and a sample standard deviation is meaningless.

    The results are shown in Table 1.

    Table 1 - Tabular values of the Irwin criterion for the extreme elements of the variation series.

    Sample size | By the general SD: 0,1  0,05  0,01 | By the sample SD: 0,1  0,05  0,01
    2 2,33* 2,77* 3,64* - - -
    3 1,79* 2,17* 2,90* 1,62 1,68 1,72
    4 1,58 1,92 2,60 1,55 1,70 1,88
    5 1,45 1,77 2,43 1,45 1,64 1,93
    6 1,37 1,67 2,30 1,38 1,60 1,94
    7 1,31 1,60 2,22 1,32 1,55 1,93
    8 1,26 1,55 2,14 1,27 1,51 1,92
    9 1,22 1,50 2,09 1,23 1,47 1,90
    10 1,18* 1,46* 2,04* 1,20 1,44 1,88
    11 1,15 1,43 2,00 1,17 1,42 1,87
    12 1,13 1,40 1,97 1,15 1,39 1,85
    13 1,11 1,38 1,94 1,13 1,37 1,83
    14 1,09 1,36 1,91 1,11 1,35 1,82
    15 1,08 1,34 1,89 1,09 1,33 1,80
    20 1,03* 1,27* 1,80* 1,03 1,27 1,75
    25 0,99 1,23 1,74 0,99 1,22 1,70
    30 0,96* 1,20* 1,70* 0,96 1,19 1,66
    35 0,93 1,17 1,66 0,94 1,16 1,63
    40 0,91* 1,15* 1,63* 0,92 1,14 1,61
    45 0,89 1,13 1,61 0,90 1,12 1,59
    50 0,88* 1,11* 1,59* 0,89 1,10 1,57
    60 0,86* 1,08* 1,56* 0,87 1,08 1,54
    70 0,84* 1,06* 1,53* 0,85 1,06 1,52
    80 0,83* 1,04* 1,51* 0,83 1,04 1,50
    90 0,82* 1,03* 1,49* 0,82 1,03 1,48
    100 0,81* 1,02* 1,47* 0,81 1,02 1,46
    200 0,75* 0,95* 1,38* 0,75 0,95 1,38
    300 0,72* 0,91* 1,33* 0,72 0,91 1,33
    500 0,69* 0,88* 1,28* 0,69 0,88 1,28
    1000 0,65* 0,83* 1,22* 0,65 0,83 1,22
    Note: the values marked with an asterisk were calculated from published data and, where necessary, adjusted by statistical computer simulation. The remaining values were calculated by statistical computer simulation.

    If we compare the percentage points for the known general SD given in Table 1 with the corresponding percentage points given in the literature, they differ in several cases by 0.01 and in one case by 0.02. Apparently, the percentage points given in this article are more accurate, since in doubtful cases they were checked by statistical computer simulation.

    Table 1 shows that the percentage points of the Irwin criterion obtained with the sample standard deviation differ markedly, at relatively small sample sizes, from those obtained with the general standard deviation. Only at substantial sample sizes, around 40, do the percentage points become close. Thus, when using the Irwin criterion, one should take the percentage points from Table 1 according to whether the calculated value of the criterion was obtained with the general or the sample standard deviation.

    LITERATURE

    1. Irwin J.O. On a criterion for the rejection of outlying observations // Biometrika. 1925. V. 17. P. 238-250.

    2. Kobzar A.I. Applied Mathematical Statistics. Moscow: FIZMATLIT, 2006. 816 p.

    © V.V. Zalyazhnykh