
Average values in statistics

Average values are generalizing statistical indicators that give a summary characteristic of mass social phenomena, since they are built from a large number of individual values of a varying attribute. To clarify the essence of the average value, it is necessary to consider how the attribute values of the phenomena being averaged are formed.

The units of any mass phenomenon have numerous attributes. Whichever attribute we take, its values differ across individual units; as statisticians say, they vary from one unit to another. For example, an employee's salary is determined by qualifications, the nature of the work, length of service and a number of other factors, and therefore varies over a very wide range. The cumulative influence of all these factors determines each employee's earnings; nevertheless, we can speak of the average monthly wage of workers in different sectors of the economy. Here we operate with a typical, characteristic value of a varying attribute, expressed per unit of a large population.

The average reflects what is common and typical for all units of the studied population. At the same time, it balances the influence of all factors acting on the attribute values of individual units, as if mutually canceling them. The level (or size) of any social phenomenon is determined by two groups of factors. Some are general and principal: they operate constantly, are closely related to the nature of the phenomenon or process being studied, and form what is typical for all units of the population, which is reflected in the average value. Others are individual: their action is less pronounced, episodic and random. They act in the opposite direction, cause differences between the quantitative characteristics of individual units, and tend to change the constant value of the characteristics being studied. The action of individual factors is extinguished in the average value. The fundamental law of large numbers manifests itself in this cumulative influence of typical and individual factors, which is balanced and mutually canceled out in generalizing characteristics.

In the aggregate, the individual values of the attribute merge into a common mass and, as it were, dissolve. Hence the average value acts as an "impersonal" quantity, which may deviate from the individual values of the attribute and need not coincide quantitatively with any of them. The average value reflects what is general, characteristic and typical for the entire population because random, atypical differences between the attribute values of individual units cancel each other out in it; its value is determined, as it were, by the common resultant of all causes.

However, in order for the average value to reflect the most typical value of a feature, it should not be determined for any populations, but only for populations consisting of qualitatively homogeneous units. This requirement is the main condition for the scientifically based application of averages and implies a close connection between the method of averages and the method of groupings in the analysis of socio-economic phenomena. Therefore, the average value is a general indicator that characterizes the typical level of a variable trait per unit of a homogeneous population in specific conditions of place and time.

Determining, thus, the essence of average values, it must be emphasized that the correct calculation of any average value implies the fulfillment of the following requirements:

  • qualitative homogeneity of the population on which the average value is calculated. This means that the calculation of average values ​​should be based on the grouping method, which ensures the selection of homogeneous, same-type phenomena;
  • exclusion of the influence on the calculation of the average value of random, purely individual causes and factors. This is achieved when the calculation of the average is based on sufficiently massive material in which the operation of the law of large numbers is manifested, and all accidents cancel each other out;
  • when calculating the average value, it is important to establish the purpose of the calculation and the so-called defining indicator (property) to which it should be oriented.

The defining indicator can be the sum of the values of the averaged attribute, the sum of their reciprocals, the product of the values, etc. The relationship between the defining indicator and the average value is as follows: if all values of the averaged attribute are replaced by the average value, then their sum or product leaves the defining indicator unchanged. On the basis of this relationship, an initial quantitative relation is constructed for the direct calculation of the average value. The ability of averages to preserve the properties of statistical populations is called the defining property.

The average value calculated for the population as a whole is called the general average; average values calculated for each group are called group averages. The general average reflects the general features of the phenomenon under study, while a group average describes the phenomenon as it develops under the specific conditions of that group.

The calculation methods can be different, therefore, in statistics, several types of average are distinguished, the main of which are the arithmetic average, the harmonic average and the geometric average.

In economic analysis, the use of averages is the main tool for assessing the results of scientific and technological progress, social measures, and the search for reserves for economic development. At the same time, it should be remembered that excessive focus on averages can lead to biased conclusions when conducting economic and statistical analysis. This is due to the fact that average values, being generalizing indicators, cancel out and ignore those differences in the quantitative characteristics of individual units of the population that really exist and may be of independent interest.

Types of averages

In statistics, various types of averages are used, which are divided into two large classes:

  • power averages (harmonic mean, geometric mean, arithmetic mean, mean square, mean cubic);
  • structural averages (mode, median).

To calculate power means, all available values of the attribute must be used. The mode and median are determined only by the structure of the distribution, which is why they are called structural, or positional, averages. The median and mode are often used as the average characteristic in populations where calculating a power mean is impossible or impractical.

The most common type of average is the arithmetic mean. The arithmetic mean is the value of an attribute that each unit of the population would have if the total of all values were distributed evenly among all units. Its calculation reduces to summing all values of the varying attribute and dividing the sum by the total number of population units. For example, five workers completed an order for the manufacture of parts: the first produced 5 parts, the second 7, the third 4, the fourth 10, and the fifth 12. Since each variant occurs only once in the initial data, the average output per worker is found with the simple arithmetic mean formula:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n},$$

i.e., in our example, the average output of one worker is equal to

$$\bar{x} = \frac{5 + 7 + 4 + 10 + 12}{5} = \frac{38}{5} = 7.6 \text{ parts}.$$

Along with the simple arithmetic mean, the weighted arithmetic mean is also used. For example, let us calculate the average age of students in a group of 20 people aged 18 to 22, where $x_i$ are the variants of the averaged attribute and $f_i$ is the frequency, which shows how many times the i-th value occurs in the population (Table 5.1).

Table 5.1

Average age of students

The average age is found by applying the weighted arithmetic mean formula to the data in Table 5.1:

$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}.$$


There is a rule for choosing the weighted arithmetic mean: if there is a series of data on two indicators, for one of which the average must be calculated, and the numerical values of the denominator of its logical formula are known while the values of the numerator are unknown but can be found as the product of these indicators, then the average should be calculated using the weighted arithmetic mean formula.
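The two arithmetic means described above can be sketched in a few lines of Python. The simple mean uses the worker-output figures from the text. The age distribution is hypothetical, since the data of Table 5.1 is not reproduced here; it was chosen only so that 20 students fall between 18 and 22 years of age.

```python
# Simple arithmetic mean: each value occurs once (worker output from the text).
parts = [5, 7, 4, 10, 12]
simple_mean = sum(parts) / len(parts)

# Weighted arithmetic mean: x_i are ages, f_i are frequencies.
# NOTE: this age distribution is hypothetical, not the data of Table 5.1.
ages = [18, 19, 20, 21, 22]
freqs = [2, 11, 5, 1, 1]


def weighted_mean(values, weights):
    """Return sum(x_i * f_i) / sum(f_i)."""
    return sum(x * f for x, f in zip(values, weights)) / sum(weights)


average_age = weighted_mean(ages, freqs)
print(simple_mean)   # 7.6
print(average_age)   # 19.4
```

With all weights equal, `weighted_mean` gives the same result as the simple mean, which is why the simple formula is enough when every variant occurs once.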

In some cases the nature of the initial statistical data is such that the arithmetic mean loses its meaning, and the only suitable generalizing indicator is another type of average, the harmonic mean. At present, the computational properties of the arithmetic mean have lost their relevance for calculating generalizing statistical indicators because of the widespread use of computers. The harmonic mean, which also comes in simple and weighted forms, has acquired great practical importance. If the numerical values of the numerator of the logical formula are known and the values of the denominator are unknown but can be found as the quotient of one indicator by another, then the average is calculated by the weighted harmonic mean formula.

For example, suppose a car traveled the first 210 km at 70 km/h and the remaining 150 km at 75 km/h. The average speed over the entire 360 km cannot be determined with the arithmetic mean formula. The variants are the speeds on the individual sections, $x_1 = 70$ km/h and $x_2 = 75$ km/h, and the weights ($f_i$) are the corresponding segments of the path, so the products of variants and weights have neither physical nor economic meaning. Here it makes sense to divide the path segments by the corresponding speeds (variants $x_i$), which gives the time spent on each section ($f_i / x_i$). If the path segments are denoted by $f_i$, the entire path is $\sum f_i$ and the time spent on the entire path is $\sum f_i / x_i$. The average speed is then the total distance divided by the total time:

$$\bar{x} = \frac{\sum f_i}{\sum \dfrac{f_i}{x_i}}.$$

In our example, we get:

$$\bar{x} = \frac{210 + 150}{\dfrac{210}{70} + \dfrac{150}{75}} = \frac{360}{3 + 2} = 72 \text{ km/h}.$$

If the weights of all variants ($f$) are equal, the simple (unweighted) harmonic mean can be used instead of the weighted one:

$$\bar{x} = \frac{n}{\sum \dfrac{1}{x_i}},$$

where $x_i$ are the individual variants and $n$ is the number of variants of the averaged attribute. In the speed example, the simple harmonic mean could be applied if the segments of the path traveled at different speeds were equal.
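As a minimal sketch, the car example above can be checked in Python. The function names are illustrative; the speeds and distances are those given in the text.

```python
def weighted_harmonic_mean(values, weights):
    """Return sum(f_i) / sum(f_i / x_i)."""
    return sum(weights) / sum(f / x for x, f in zip(values, weights))


def harmonic_mean(values):
    """Return n / sum(1 / x_i): the special case of equal weights."""
    return len(values) / sum(1 / x for x in values)


speeds = [70, 75]        # km/h, the variants x_i
distances = [210, 150]   # km, the weights f_i

avg_speed = weighted_harmonic_mean(speeds, distances)
print(avg_speed)  # 72.0
```

Here the total distance (360 km) divided by the total time (3 h + 2 h) gives the average speed; `harmonic_mean` would apply only if both segments had the same length.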

Any average value should be calculated so that when it replaces each variant of the averaged feature, the value of some final, generalizing indicator, which is associated with the averaged indicator, does not change. So, when replacing the actual speeds on individual sections of the path with their average value (average speed), the total distance should not change.

The form (formula) of the average is determined by the nature (mechanism) of the relationship between this final indicator and the averaged one. For this reason the final indicator, whose value must not change when the variants are replaced by their average, is called the defining indicator. To derive the formula of the average, one composes and solves an equation based on the relationship between the averaged indicator and the defining one. This equation is constructed by replacing the variants of the averaged attribute (indicator) with their average value.

In addition to the arithmetic mean and the harmonic mean, other types (forms) of the mean are also used in statistics. All of them are special cases of the power mean. If all types of power means are calculated for the same data, their values will not be the same; the rule of majorance of means applies here: as the exponent of the mean increases, so does the mean itself. The formulas most commonly used in practical research for calculating the various types of power means are presented in Table 5.2.

Table 5.2


The geometric mean is applied when there are n growth factors; the individual values of the attribute are then, as a rule, relative values of dynamics constructed as chain values, i.e. as the ratio of each level of the time series to the previous level. The mean thus characterizes the average growth rate. The simple geometric mean is calculated by the formula

$$\bar{x} = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = \sqrt[n]{\prod_{i=1}^{n} x_i}.$$

The weighted geometric mean formula has the following form:

$$\bar{x} = \sqrt[\sum f_i]{\prod_{i} x_i^{f_i}}.$$

The above formulas are identical, but one is applied at current coefficients or growth rates, and the second - at the absolute values ​​of the levels of the series.
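On one reading of the remark above, the average growth factor can be obtained either from the chain growth factors or directly from the absolute levels of the series. The sketch below illustrates that reading; the series levels are hypothetical and the function name is illustrative.

```python
import math


def geometric_mean(factors):
    """Return the n-th root of the product of n growth factors."""
    return math.prod(factors) ** (1 / len(factors))


# Hypothetical levels of a time series (not taken from the text).
levels = [100, 110, 121, 133.1]

# Chain growth factors: each level divided by the previous one.
factors = [b / a for a, b in zip(levels, levels[1:])]

avg_growth = geometric_mean(factors)

# The same value computed from the first and last absolute levels only.
avg_growth_from_levels = (levels[-1] / levels[0]) ** (1 / (len(levels) - 1))

print(avg_growth, avg_growth_from_levels)
```

The two results agree up to floating-point rounding, because the product of chain factors telescopes to the ratio of the last level to the first.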

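The quadratic mean (root mean square) is used when the calculation involves squared values of the attribute; in particular, it is used to measure the degree of fluctuation of individual values around the arithmetic mean in a distribution series. It is calculated by the formula

$$\bar{x}_{\text{quad}} = \sqrt{\frac{\sum x_i^{2}}{n}}.$$

The weighted quadratic mean is calculated by the formula

$$\bar{x}_{\text{quad}} = \sqrt{\frac{\sum x_i^{2} f_i}{\sum f_i}}.$$

The cubic mean is used when the calculation involves cubed values of the attribute and is calculated by the formula

$$\bar{x}_{\text{cub}} = \sqrt[3]{\frac{\sum x_i^{3}}{n}};$$

the weighted cubic mean is

$$\bar{x}_{\text{cub}} = \sqrt[3]{\frac{\sum x_i^{3} f_i}{\sum f_i}}.$$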

All the above averages can be represented by a general formula:

$$\bar{x} = \sqrt[k]{\frac{\sum x_i^{k}}{n}},$$

where $\bar{x}$ is the average value; $x_i$ is an individual value; $n$ is the number of units of the studied population; $k$ is the exponent, which determines the type of average.
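The general power-mean formula can be sketched as a single Python function. Treating k = 0 as the geometric mean (the limiting case) is the standard convention and is handled separately here; the function and variable names are illustrative, and the data are the worker-output figures from earlier in the text. The sketch assumes all values are positive.

```python
import math


def power_mean(values, k):
    """Power mean of degree k; k = 0 is taken as the geometric-mean limit."""
    n = len(values)
    if k == 0:
        return math.exp(sum(math.log(x) for x in values) / n)
    return (sum(x ** k for x in values) / n) ** (1 / k)


data = [5, 7, 4, 10, 12]  # worker output from the text
names = {-1: "harmonic", 0: "geometric", 1: "arithmetic", 2: "quadratic", 3: "cubic"}

means = {k: power_mean(data, k) for k in sorted(names)}
for k, value in means.items():
    print(f"k={k:>2} {names[k]:<10} {value:.3f}")
```

For these data the printed values increase with k, which is the behaviour the general formula implies for any data set whose values are not all equal.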

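For the same source data, the larger k is in the general power mean formula, the larger the mean value. It follows that there is a regular relationship between the values of the power means:

$$\bar{x}_{\text{harm}} \le \bar{x}_{\text{geom}} \le \bar{x}_{\text{arith}} \le \bar{x}_{\text{quad}} \le \bar{x}_{\text{cub}}.$$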

The averages described above give a generalized idea of the population under study, and from this point of view their theoretical, applied and cognitive significance is indisputable. However, the value of the average may not coincide with any of the variants that actually exist. Therefore, in addition to the averages considered, statistical analysis also uses the values of specific variants that occupy a well-defined position in an ordered (ranked) series of attribute values. Among these quantities, the most commonly used are the structural, or descriptive, averages: the mode (Mo) and the median (Me).

Mode (Mo) is the value of the attribute that occurs most often in the population. In a variational series, the mode is the most frequently occurring value of the ranked series, i.e., the variant with the highest frequency. The mode can be used to determine the most visited stores or the most common price for a product. It shows the size of the attribute characteristic of a significant part of the population. For an interval series it is determined by the formula

$$Mo = x_0 + h \cdot \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})},$$

where $x_0$ is the lower limit of the modal interval; $h$ is the interval width; $f_m$ is the frequency of the modal interval; $f_{m-1}$ is the frequency of the previous interval; $f_{m+1}$ is the frequency of the next interval.
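A minimal Python sketch of the standard interval-mode formula, which the symbols defined above ($x_0$, $h$, $f_m$, $f_{m-1}$, $f_{m+1}$) correspond to. The grouped data are hypothetical, intervals are assumed to be of equal width, and treating a missing neighbouring interval as having frequency 0 is an assumption of this sketch.

```python
def interval_mode(lower_bounds, width, freqs):
    """Mode of an equal-width interval series.

    Mo = x0 + h * (f_m - f_prev) / ((f_m - f_prev) + (f_m - f_next))
    Assumes a single modal interval (one strictly largest frequency).
    """
    m = max(range(len(freqs)), key=freqs.__getitem__)    # index of modal interval
    f_m = freqs[m]
    f_prev = freqs[m - 1] if m > 0 else 0                # assumption: 0 if no previous
    f_next = freqs[m + 1] if m + 1 < len(freqs) else 0   # assumption: 0 if no next
    return lower_bounds[m] + width * (f_m - f_prev) / ((f_m - f_prev) + (f_m - f_next))


# Hypothetical grouped data: intervals 10-20, 20-30, 30-40, 40-50.
lower_bounds = [10, 20, 30, 40]
freqs = [4, 10, 6, 2]

print(interval_mode(lower_bounds, 10, freqs))  # 26.0
```

The mode falls inside the modal interval 20-30 and is pulled toward the neighbouring interval with the higher frequency.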

Median (Me) is the variant located in the center of the ranked series. The median divides the series into two equal parts, so that the same number of population units lies on either side of it. In one half of the population units the value of the varying attribute is less than the median, and in the other half it is greater. In other words, the median is the element whose value is greater than or equal to one half of the elements of the distribution series and, at the same time, less than or equal to the other half. The median gives a general idea of where the values of the attribute are concentrated, that is, where their center lies.

The descriptive nature of the median is manifested in the fact that it characterizes the quantitative boundary of the values ​​of the varying attribute, which are possessed by half of the population units. The problem of finding the median for a discrete variational series is solved simply. If all units of the series are given serial numbers, then the serial number of the median variant is defined as (n + 1) / 2 with an odd number of members n. If the number of members of the series is an even number, then the median will be the average value of two variants with serial numbers n/ 2 and n / 2 + 1.

When determining the median in an interval variation series, the interval containing it (the median interval) is determined first. This interval is characterized by the fact that its accumulated sum of frequencies equals or exceeds half the sum of all frequencies of the series. The median of an interval variation series is calculated by the formula

$$Me = x_0 + h \cdot \frac{\tfrac{1}{2}\sum f - S_{m-1}}{f_m},$$

where $x_0$ is the lower limit of the median interval; $h$ is the interval width; $f_m$ is the frequency of the median interval; $\sum f$ is the number of members of the series; $S_{m-1}$ is the accumulated sum of the frequencies of the intervals preceding the median one.
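Both median rules described above, the serial-number rule for a discrete series and the interval rule, can be sketched in Python as follows. The interval function implements the standard grouped-median formula that matches the symbols listed above; the grouped data are hypothetical and the function names are illustrative.

```python
def discrete_median(values):
    """Median of ungrouped data using the serial-number rule."""
    ranked = sorted(values)
    n = len(ranked)
    if n % 2 == 1:
        return ranked[(n + 1) // 2 - 1]                # serial number (n + 1) / 2
    return (ranked[n // 2 - 1] + ranked[n // 2]) / 2   # serial numbers n/2 and n/2 + 1


def interval_median(lower_bounds, width, freqs):
    """Median of an equal-width interval series.

    Me = x0 + h * (sum(f) / 2 - S_prev) / f_m
    """
    half = sum(freqs) / 2
    cumulative = 0                       # accumulated frequency before the interval
    for x0, f_m in zip(lower_bounds, freqs):
        if cumulative + f_m >= half:     # first interval that reaches half the total
            return x0 + width * (half - cumulative) / f_m
        cumulative += f_m
    raise ValueError("empty series")


print(discrete_median([5, 7, 4, 10, 12]))                    # 7
print(interval_median([10, 20, 30, 40], 10, [4, 10, 6, 2]))  # 27.0
```

In the grouped example the total frequency is 22, so the median interval is the first one whose accumulated frequency reaches 11, namely 20-30.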

Along with the median, for a more complete characterization of the structure of the studied population, other values ​​​​of options are used, which occupy a quite definite position in the ranked series. These include quartiles and deciles. Quartiles divide the series by the sum of frequencies into 4 equal parts, and deciles - into 10 equal parts. There are three quartiles and nine deciles.

The median and mode, in contrast to the arithmetic mean, do not extinguish individual differences in the values ​​of a variable attribute and, therefore, are additional and very important characteristics of the statistical population. In practice, they are often used instead of the average or along with it. It is especially expedient to calculate the median and mode in those cases when the studied population contains a certain number of units with a very large or very small value of the variable attribute. These values ​​of options, which are not very characteristic for the population, while affecting the value of the arithmetic mean, do not affect the values ​​of the median and mode, which makes the latter very valuable indicators for economic and statistical analysis.

Variation indicators

The purpose of a statistical study is to identify the main properties and patterns of the studied statistical population. In the process of summarizing statistical observation data, we build distribution series. There are two types of distribution series, attributive and variational, depending on whether the attribute taken as the basis of the grouping is qualitative or quantitative.

A variational series is a distribution series built on a quantitative attribute. The values of quantitative attributes for individual units of the population are not constant and differ from one another to a greater or lesser extent. This difference in the values of an attribute is called variation. The individual numerical values of the attribute occurring in the studied population are called variants. Variation across the units of the population arises because a large number of factors influence the level of the attribute. Studying the nature and degree of variation of attributes across the units of the population is the most important issue of any statistical study. Variation indicators are used to describe the measure of variability of an attribute.

Another important task of statistical research is to determine the role of individual factors or groups of factors in the variation of particular attributes of the population. To solve this problem, statistics uses special methods for studying variation, based on a system of indicators that measure variation. In practice, the researcher faces a large number of variants of the attribute values, which in raw form gives no idea of how the units are distributed by the value of the attribute. To obtain such an idea, all variants are arranged in ascending or descending order. This process is called ranking the series. The ranked series immediately gives a general idea of the values that the attribute takes in the population.

The insufficiency of the average value for an exhaustive characterization of the population makes it necessary to supplement the average values ​​with indicators that make it possible to assess the typicality of these averages by measuring the fluctuation (variation) of the trait under study. The use of these indicators of variation makes it possible to make the statistical analysis more complete and meaningful, and thus to better understand the essence of the studied social phenomena.

The simplest indicators of variation are the minimum and maximum, i.e. the smallest and largest values of the attribute in the population. The number of repetitions of an individual variant is called its frequency. Denoting the frequency of the i-th value by $f_i$, the sum of the frequencies, equal to the size of the studied population, is

$$\sum_{i=1}^{k} f_i = n,$$

where $k$ is the number of variants of the attribute. It is often convenient to replace frequencies with relative frequencies $w_i$. A relative frequency can be expressed as a fraction of one or as a percentage and makes it possible to compare variation series with different numbers of observations. Formally we have:

$$w_i = \frac{f_i}{\sum_{i=1}^{k} f_i}, \qquad \sum_{i=1}^{k} w_i = 1.$$

To measure the variation of a trait, various absolute and relative indicators are used. The absolute indicators of variation include the mean linear deviation, the range of variation, variance, standard deviation.

The range of variation (R) is the difference between the maximum and minimum values of the attribute in the studied population: $R = x_{\max} - x_{\min}$. This indicator gives only the most general idea of the fluctuation of the attribute, since it shows the difference only between the extreme values of the variants. It is completely unrelated to the frequencies in the variational series, that is, to the nature of the distribution, and its dependence only on the extreme values of the attribute can give it an unstable, random character. The range of variation provides no information about the features of the studied populations and does not allow us to assess how typical the obtained averages are. The scope of this indicator is limited to fairly homogeneous populations; the variation of an attribute is characterized more precisely by indicators that take into account the variability of all its values.

To characterize the variation of an attribute, it is necessary to generalize the deviations of all values from some value typical of the population under study. Indicators of variation such as the mean linear deviation, the variance and the standard deviation are based on the deviations of the attribute values of individual units of the population from the arithmetic mean.

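The mean linear deviation is the arithmetic mean of the absolute values of the deviations of individual variants from their arithmetic mean:

$$\bar{d} = \frac{\sum |x_i - \bar{x}|}{n}, \qquad \bar{d} = \frac{\sum |x_i - \bar{x}|\, f_i}{\sum f_i},$$

where $|x_i - \bar{x}|$ is the absolute value (modulus) of the deviation of a variant from the arithmetic mean and $f_i$ is the frequency.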

The first formula is applied if each of the options occurs in the aggregate only once, and the second - in series with unequal frequencies.

There is another way to average the deviations of options from the arithmetic mean. This method, which is very common in statistics, is reduced to calculating the squared deviations of options from the mean value with their subsequent averaging. In this case, we get a new indicator of variation - the variance.

Variance ($\sigma^2$) is the average of the squared deviations of the variants from their mean value:

$$\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n}, \qquad \sigma^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i}.$$

The second formula is used if the variants have their own weights (or frequencies of the variation series).

In economic and statistical analysis, the variation of an attribute is most often evaluated using the standard deviation. The standard deviation ($\sigma$) is the square root of the variance:

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}.$$

The mean linear and mean square deviations show how much the value of the attribute fluctuates on average for the units of the population under study, and are expressed in the same units as the variants.
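The absolute indicators of variation described above can be sketched in Python for ungrouped data (each variant occurring once). The data are the worker-output figures from earlier in the text; the variable names are illustrative, and the variance is the population variance (division by n), as in the verbal definition above.

```python
import math

data = [5, 7, 4, 10, 12]  # worker output from the text
n = len(data)
mean = sum(data) / n

value_range = max(data) - min(data)                      # R = x_max - x_min
mean_linear_dev = sum(abs(x - mean) for x in data) / n   # average of |x_i - mean|
variance = sum((x - mean) ** 2 for x in data) / n        # average squared deviation
std_dev = math.sqrt(variance)                            # square root of the variance

print(value_range, round(mean_linear_dev, 2), round(variance, 2), round(std_dev, 2))
# 8 2.72 9.04 3.01
```

The mean linear deviation and the standard deviation are both in the same units as the data (parts), while the variance is in squared units.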

In statistical practice, it often becomes necessary to compare the variation of various features. For example, it is of great interest to compare variations in the age of personnel and their qualifications, length of service and wages, etc. For such comparisons, indicators of the absolute variability of signs - the average linear and standard deviation - are not suitable. It is impossible, in fact, to compare the fluctuation of work experience, expressed in years, with the fluctuation of wages, expressed in rubles and kopecks.

When comparing the variability of different attributes in a population, it is convenient to use relative indicators of variation. These indicators are calculated as the ratio of an absolute indicator to the arithmetic mean (or median). Using the range of variation, the mean linear deviation and the standard deviation as the absolute indicator, one obtains the following relative indicators of fluctuation: the coefficient of oscillation, the relative linear deviation and the coefficient of variation:

$$V_R = \frac{R}{\bar{x}} \cdot 100\%, \qquad V_{\bar{d}} = \frac{\bar{d}}{\bar{x}} \cdot 100\%, \qquad V_{\sigma} = \frac{\sigma}{\bar{x}} \cdot 100\%.$$

The coefficient of variation is the most commonly used indicator of relative variability and characterizes the homogeneity of the population. A population is considered homogeneous if the coefficient of variation does not exceed 33% for distributions close to normal.
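As a minimal sketch, the coefficient of variation and the 33% homogeneity threshold mentioned above can be applied to two attributes measured in different units. Both samples are hypothetical, and the function name is illustrative.

```python
import math


def coefficient_of_variation(values):
    """Standard deviation as a percentage of the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sigma / mean * 100


# Hypothetical samples: length of service (years) and monthly wages (money units).
tenure_years = [2, 4, 5, 7, 12]
wages = [410, 450, 480, 520, 540]

for name, sample in [("tenure", tenure_years), ("wages", wages)]:
    v = coefficient_of_variation(sample)
    verdict = "homogeneous" if v <= 33 else "heterogeneous"
    print(f"{name}: V = {v:.1f}% ({verdict})")
```

Because the coefficient is dimensionless, the two results can be compared directly even though years and money units cannot.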

The average value is the most valuable from an analytical point of view and a universal form of expression of statistical indicators. The most common average - the arithmetic average - has a number of mathematical properties that can be used in its calculation. At the same time, when calculating a specific average, it is always advisable to rely on its logical formula, which is the ratio of the volume of the attribute to the volume of the population. For each mean, there is only one true reference ratio, which, depending on the data available, may require different forms of means. However, in all cases where the nature of the averaged value implies the presence of weights, it is impossible to use their unweighted formulas instead of the weighted average formulas.

The average value is the most characteristic value of the attribute for the population and the size of the attribute of the population distributed in equal shares between the units of the population.

The attribute for which the average value is calculated is called the averaged attribute.

The average value is an indicator calculated by comparing absolute or relative values. The average value is

The average value reflects the influence of all factors affecting the phenomenon under study and is their resultant. In other words, by canceling out individual deviations and eliminating the influence of chance, the average value reflects the general measure of the results of this influence and acts as a general pattern of the phenomenon under study.

Conditions for the use of averages:

  • homogeneity of the studied population. If some elements of the population, affected by a random factor, have values of the studied attribute that differ significantly from the rest, these elements will affect the size of the average for the population, and the average will not express the most typical value of the attribute. If the phenomenon under study is heterogeneous, it must be broken down into groups of homogeneous elements. Group averages are then calculated, expressing the most characteristic value of the phenomenon in each group, and the overall average for all elements, characterizing the phenomenon as a whole, is calculated as the average of the group means weighted by the number of population elements in each group;

  • a sufficient number of units in the population;

  • the maximum and minimum values of the attribute in the studied population.
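The first condition above says that the overall average is the average of the group means weighted by group size. A short Python sketch with hypothetical groups shows that this weighting reproduces the mean of the pooled data, while an unweighted average of the group means does not.

```python
# Hypothetical groups (illustrative values only).
groups = {
    "group_a": [18, 20, 22],
    "group_b": [40, 42, 44, 46, 48],
}

group_means = {name: sum(v) / len(v) for name, v in groups.items()}
group_sizes = {name: len(v) for name, v in groups.items()}

# Overall mean: group means weighted by the number of elements in each group.
overall = sum(group_means[g] * group_sizes[g] for g in groups) / sum(group_sizes.values())

# For comparison: mean of the pooled data and the plain average of group means.
pooled = [x for v in groups.values() for x in v]
pooled_mean = sum(pooled) / len(pooled)
unweighted = sum(group_means.values()) / len(group_means)

print(overall, pooled_mean, unweighted)  # 35.0 35.0 32.0
```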

Average value (indicator) - a generalized quantitative characteristic of an attribute in a statistical population under specific conditions of place and time.

In statistics, the following forms (types) of averages are used, called power and structural:

  • arithmetic mean (simple and weighted);

simple

The most common form of statistical indicators is the average value, which is a generalized quantitative characteristic of a feature in a statistical population under specific conditions of place and time. The indicator in the form of an average value expresses typical features and gives a generalized description of the same type of phenomena according to one of the varying signs. The widespread use of averages is explained by the fact that they have a number of positive properties that make them an indispensable tool for analyzing phenomena and processes in the economy.

The most important property of the average value is that it reflects the common that is inherent in all units of the population under study. The values ​​of the attribute of individual units of the population fluctuate in one direction or another under the influence of many factors, among which there can be both basic and random. For example, the stock price of a corporation is mainly determined by the financial results of its activities. At the same time, on certain days and on certain stock exchanges, due to the prevailing circumstances, these shares may be sold at a higher or lower rate. The essence of the average lies in the fact that it cancels out the deviations of the values ​​of the attribute of individual units of the population, due to the action of random factors, and takes into account the changes caused by the action of the main factors. This allows the average to reflect the typical level of the attribute and abstract from the individual characteristics of individual units.

The typicality of the average is directly related to the homogeneity of the population. The average value will reflect the typical level of the attribute only when it is calculated from a qualitatively homogeneous population. So, if we calculate the average rate for the shares of all enterprises sold on a given day on a given exchange, we get a fictitious average. This will be explained by the fact that the population used for the calculation is extremely heterogeneous. In this and similar cases, the average method is used in combination with the grouping method: if the population is heterogeneous, the general averages must be replaced or supplemented by group averages, i.e. averages calculated for qualitatively homogeneous groups.



The following conventions are used in the theory of averages.

1. The sign by which the average is determined is called averaged feature and is denoted.

2. The value of the averaged attribute for each unit of the population is called its individual value and is denoted.

3. The repeatability of individual values ​​is called frequency and is denoted f .

4. The total value of the feature is denoted W .

Any quantitative attribute of a statistical population has one single mean value. It can be calculated in various ways depending on the form of expression of the averaged feature (absolute, relative and average) and the available information. Depending on the degree k various types of averages are obtained.

1. Simple arithmetic mean - the most common type of mean (k = 1):

$$\bar{x} = \frac{\sum x_i}{n}$$

2. Weighted arithmetic mean - used if the individual values of the attribute and their frequencies f are known. Each variant is "weighted" by its frequency, i.e. multiplied by it. The frequencies f are called statistical weights, or simply the weights of the mean:

$$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$$

Example. Based on the available data, we calculate the average work experience of employees

3. Simple harmonic mean - used if the sum of the reciprocals of the individual values of the attribute must remain unchanged during averaging (k = -1):

$$\bar{x} = \frac{n}{\sum \dfrac{1}{x_i}},$$

where $\sum \dfrac{1}{x_i}$ is the sum of the reciprocal values of the attribute.

Example. A loaded car drove from the enterprise to the warehouse at 40 km/h and returned empty at 60 km/h. What is its average speed over both trips?

Let the one-way distance be $S$ km; $S$ plays no role in calculating the average speed. When the individual speeds are replaced by the average, the total time spent on both trips must remain unchanged; otherwise the average speed could be anything, from the speed of a turtle to the speed of light. So

$\frac{S}{x_1} + \frac{S}{x_2} = \frac{S}{\bar{x}} + \frac{S}{\bar{x}}$

Dividing every term by $S$ gives $\frac{1}{x_1} + \frac{1}{x_2} = \frac{2}{\bar{x}}$, i.e. the condition of the harmonic mean is satisfied. Substituting $x_1 = 40$ and $x_2 = 60$, we get

$\bar{x} = \frac{2}{\frac{1}{40} + \frac{1}{60}} = 48$ km/h.

The arithmetic mean of 50 km/h is not correct, because it implies a travel time different from the actual one. If the distance is 96 km, the actual travel time is

$\frac{96}{40} + \frac{96}{60} = 2.4 + 1.6 = 4$ h,

which the harmonic mean reproduces ($2 \cdot 96 / 48 = 4$ h), whereas 50 km/h would give $2 \cdot 96 / 50 = 3.84$ h.
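The car example can be checked numerically. The following is a minimal Python sketch, not part of the original text: `average_speed` is a hypothetical helper that computes total distance over total time, and the result is compared with `statistics.harmonic_mean` from the standard library.

```python
from statistics import harmonic_mean

def average_speed(distance_km, speeds_kmh):
    """Average speed over equal-length legs: total distance / total time."""
    total_time_h = sum(distance_km / v for v in speeds_kmh)
    return len(speeds_kmh) * distance_km / total_time_h

speeds = [40, 60]                            # km/h, from the example
by_definition = average_speed(96, speeds)    # 96 km is the distance used in the text
by_formula = harmonic_mean(speeds)           # harmonic mean of the two speeds
arithmetic = sum(speeds) / len(speeds)       # the misleading 50 km/h

# Actual time for the round trip vs. the time implied by each "average"
actual_time_h = 96 / 40 + 96 / 60
print(round(by_definition, 6), round(by_formula, 6), arithmetic)
print(round(actual_time_h, 6), round(2 * 96 / by_formula, 6), round(2 * 96 / arithmetic, 6))
```

The distance cancels out of `average_speed`, so any positive value gives the same result; only the harmonic mean preserves the total travel time.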

In statistical practice, the harmonic weighted average is more often used.

4. Weighted harmonic mean - used when the individual values of the attribute $x_i$ and the total values of the attribute $W_i = x_i f_i$ are known, but the frequencies are not:

$\bar{x} = \frac{\sum W_i}{\sum \frac{W_i}{x_i}}$

Example

5. Aggregate mean - used when the total values of the attribute $W_i$ and their frequencies $f_i$ are known:

$\bar{x} = \frac{\sum W_i}{\sum f_i}$

Example. Determine the average cost of production, if known
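The weighted arithmetic, weighted harmonic and aggregate means are three routes to the same number; which one applies depends on which two of $x$, $f$ and $W = x f$ are available. The sketch below illustrates this in Python. The function names and the work-experience figures are assumptions made for illustration, not the data of the examples in the text.

```python
def weighted_arithmetic(x, f):
    """Known: individual values x and frequencies f."""
    return sum(xi * fi for xi, fi in zip(x, f)) / sum(f)

def weighted_harmonic(x, w):
    """Known: individual values x and totals w = x * f (frequencies unknown)."""
    return sum(w) / sum(wi / xi for xi, wi in zip(x, w))

def aggregate(w, f):
    """Known: totals w and frequencies f."""
    return sum(w) / sum(f)

# Hypothetical data: years of experience and number of employees in each group
x = [2, 5, 10]
f = [3, 5, 2]
w = [xi * fi for xi, fi in zip(x, f)]   # total person-years per group

results = (weighted_arithmetic(x, f), weighted_harmonic(x, w), aggregate(w, f))
print(results)
```

All three calls return the same average, because each formula is the ratio of the total volume of the attribute to the number of units, written in terms of different known quantities.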

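6. Root mean square ($k = 2$) - used to calculate the standard deviation, which is a measure of variation, and also in engineering:

$\bar{x} = \sqrt{\frac{\sum x_i^2}{n}}$

Weighted root mean square:

$\bar{x} = \sqrt{\frac{\sum x_i^2 f_i}{\sum f_i}}$

7. Geometric mean ($k = 0$) - used to calculate the average growth rate from chain (period-to-period) growth rates:

$\bar{x} = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$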

For $k = 1$ we get the arithmetic mean, for $k = 2$ the quadratic mean, for $k = 3$ the cubic mean, for $k = 0$ (in the limit) the geometric mean, and for $k = -1$ the harmonic mean. The higher the exponent $k$, the larger the mean value; if all the initial values of the attribute are equal, all the means coincide with that common value. This gives the following relation, known as the rule of majorance of means:

$\bar{x}_{harm} \le \bar{x}_{geom} \le \bar{x}_{arith} \le \bar{x}_{quad} \le \bar{x}_{cub}$

Using this rule, statistics can, depending on the mood and wishes of its "expert", either "sink" or "rescue" a student who received grades of 2 and 5 in an exam session. What is the student's average grade?

Judged by the arithmetic mean, the average grade is 3.5. But if the dean wants to "sink" the unfortunate student and calculates the harmonic mean, $\frac{2}{\frac{1}{2} + \frac{1}{5}} \approx 2.86$, the student remains a failing student on average, short of a grade of 3.

However, the student council may object to the dean and present the cubic mean, $\sqrt[3]{\frac{2^3 + 5^3}{2}} \approx 4.05$. The student now looks "good" and can even apply for a scholarship.
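The rule of majorance can be seen directly by computing the power mean of the two grades for several degrees $k$. This is a minimal Python sketch; `power_mean` is a hypothetical helper, and the case $k = 0$ is handled as the geometric mean, which is the limit of the power mean as $k$ tends to zero.

```python
import math

def power_mean(values, k):
    """Power mean of degree k for positive values; k = 0 gives the geometric mean."""
    n = len(values)
    if k == 0:
        # Limit case: exponential of the average logarithm
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** k for v in values) / n) ** (1 / k)

grades = [2, 5]   # the student's grades from the text
means = {k: power_mean(grades, k) for k in (-1, 0, 1, 2, 3)}

for k, m in means.items():
    print(f"k = {k:>2}: {m:.3f}")
```

The printed values increase with $k$, from the harmonic mean (about 2.86) through the arithmetic mean (3.5) to the cubic mean (about 4.05).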

Structural averages - the mode and the median - unlike power means, which are largely an abstract characteristic of the population, are specific quantities that coincide with well-defined values in the population. This makes them indispensable in solving practical problems.

Mode - the most frequently occurring value of the attribute among the units of the population. For a discrete distribution series, the mode is determined without calculation, by scanning the frequency column: it is the attribute value with the highest frequency. In example No. 1, the highest frequency is $f = 20$, which corresponds to the 4th tariff category, so $Mo = 4$.

For an interval distribution series, the mode is determined by the formula

$Mo = x_0 + h \cdot \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})}$

where $x_0$ is the lower limit of the modal interval;

$h$ is the width of the modal interval;

$f_{m-1}$, $f_m$, $f_{m+1}$ are the frequencies of the interval preceding the modal one, of the modal interval, and of the interval following it, respectively.

The modal interval is the interval with the highest frequency.

Let us calculate the mode for example No. 2. The modal interval is 130-140. For it, $x_0 = 130$, $h = 140 - 130 = 10$, $f_m = 20$.

The most common level of fulfilment of the output quota is 134%; in other words, the plan is most often overfulfilled by 34%.
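The interval formula for the mode can be sketched in Python as follows. The function `grouped_mode` and the frequency table are assumptions: the table of example No. 2 is not reproduced here, so the frequencies below are made-up values chosen only so that the modal interval is 130-140 with a frequency of 20, as in the text.

```python
def grouped_mode(intervals, freqs):
    """Mode of an interval series.

    intervals: list of (lower, upper) bounds; freqs: matching frequencies.
    Assumes a single modal interval whose frequency exceeds at least one neighbour.
    """
    m = max(range(len(freqs)), key=freqs.__getitem__)    # index of the modal interval
    x0, upper = intervals[m]
    h = upper - x0
    f_m = freqs[m]
    f_prev = freqs[m - 1] if m > 0 else 0                # nothing before the first interval
    f_next = freqs[m + 1] if m + 1 < len(freqs) else 0   # nothing after the last interval
    return x0 + h * (f_m - f_prev) / ((f_m - f_prev) + (f_m - f_next))

# Hypothetical table: percent of output quota fulfilled -> number of workers
intervals = [(100, 110), (110, 120), (120, 130), (130, 140), (140, 150)]
freqs = [6, 10, 10, 20, 4]

mode = grouped_mode(intervals, freqs)
print(round(mode, 1))
```

With these assumed frequencies the mode comes out just under 134, close to the figure quoted in the text; other neighbouring frequencies would shift it within the 130-140 interval.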

Median - the value of the attribute that lies in the middle of the ranked series and divides it in half. A ranked series is a series arranged in ascending or descending order of the attribute. For a discrete variation series, the median is not calculated but found by inspecting the series. For example, five workers produce 10, 12, 15, 16 and 18 parts per day, respectively. $Me$ is the output of the third worker and equals 15 parts. With an even number of values, the median is taken as half the sum of the two values in the middle positions; with 10 values, for example, it is half the sum of the 5th and 6th values.
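For a discrete series the same result can be obtained with `statistics.median` from the Python standard library, which sorts the data and applies the half-sum rule for an even number of values. In this sketch the five-worker series is from the text, while the ten-value series is a made-up illustration.

```python
from statistics import median

daily_output = [10, 12, 15, 16, 18]    # five workers, from the text
print(median(daily_output))            # the 3rd (middle) value of the ranked series

# Hypothetical series of 10 values: the median is the half-sum of the 5th and 6th
ten_values = [3, 5, 6, 8, 9, 11, 12, 14, 15, 20]
print(median(ten_values))
```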

For an interval series, the median is determined by the formula

$Me = x_0 + h \cdot \frac{\frac{\sum f}{2} - S_{m-1}}{f_m}$

where $x_0$ is the lower limit of the median interval;

$h$ is the width of the median interval;

$\frac{\sum f}{2}$ is half the volume of the variation series;

$S_{m-1}$ is the cumulative frequency of the interval preceding the median interval;

$f_m$ is the frequency of the median interval.

The median interval is the interval that contains half the volume of the series. To find it, frequencies are accumulated until the cumulative total reaches or exceeds half the volume of the series.

Let us calculate the median for example No. 2. The median interval is 120-130, because its cumulative frequency is the first to cover half the volume of the series.

Half of the workers fulfil less than 129% of the output quota, and the other half more than 129%.
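The search for the median interval and the interval formula can be sketched in Python. The function `grouped_median` and the frequency table are assumptions: the table of example No. 2 is not reproduced here, so made-up frequencies are used, chosen so that the median interval is 120-130 as in the text.

```python
from itertools import accumulate

def grouped_median(intervals, freqs):
    """Median of an interval series given (lower, upper) bounds and frequencies."""
    half = sum(freqs) / 2
    cumulative = list(accumulate(freqs))
    # Median interval: the first one whose cumulative frequency reaches half the total
    m = next(i for i, c in enumerate(cumulative) if c >= half)
    x0, upper = intervals[m]
    s_prev = cumulative[m - 1] if m > 0 else 0   # cumulative frequency before the interval
    return x0 + (upper - x0) * (half - s_prev) / freqs[m]

# Hypothetical table: percent of output quota fulfilled -> number of workers
intervals = [(100, 110), (110, 120), (120, 130), (130, 140), (140, 150)]
freqs = [6, 10, 10, 20, 4]

med = grouped_median(intervals, freqs)
print(med)
```

With these assumed frequencies the cumulative totals are 6, 16, 26, 46 and 50, so the interval 120-130 is the first to reach 25, and the formula gives 129.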

The average value reflects what is common to all units of the population under study. At the same time, it balances the influence of all the factors that act on the attribute values of individual units, as if cancelling them against one another.

However, for the average to reflect the most typical value of an attribute, it should not be determined for just any population, but only for populations consisting of qualitatively homogeneous units. This requirement is the main condition for the scientifically sound use of averages and implies a close link between the method of averages and the method of groupings in the analysis of socio-economic phenomena.

The average value is a generalizing indicator that characterizes the typical level of a varying attribute per unit of a homogeneous population under specific conditions of place and time.

The average calculated for the population as a whole is called the general average; the averages calculated for each group are called group averages. The general average reflects the general features of the phenomenon under study, while a group average characterizes the size of the phenomenon as it develops under the specific conditions of that group.

In statistics, various types of averages are used, which are divided into two large classes:

1) power averages (harmonic mean, geometric mean, arithmetic mean, mean square, mean cubic);

2) structural averages (mode, median).

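The most common type of average is the arithmetic mean. Simple arithmetic mean formula:

$\bar{x} = \frac{\sum x_i}{n}$

Weighted arithmetic mean:

$\bar{x} = \frac{\sum x_i f_i}{\sum f_i}$

where $x_i$ are the values of the averaged attribute and $f_i$ is the frequency, which shows how many times the $i$-th value occurs in the population.

Simple harmonic mean formula:

$\bar{x} = \frac{n}{\sum \frac{1}{x_i}}$

where $x_i$ are the individual values and $n$ is the number of values of the averaged attribute.

The simple geometric mean is calculated by the formula:

$\bar{x} = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$

Weighted geometric mean formula:

$\bar{x} = \sqrt[\sum f_i]{x_1^{f_1} \cdot x_2^{f_2} \cdots x_n^{f_n}}$

Root mean square formula:

$\bar{x} = \sqrt{\frac{\sum x_i^2}{n}}$

Weighted root mean square formula:

$\bar{x} = \sqrt{\frac{\sum x_i^2 f_i}{\sum f_i}}$

Cubic mean formula:

$\bar{x} = \sqrt[3]{\frac{\sum x_i^3}{n}}$

Weighted cubic mean:

$\bar{x} = \sqrt[3]{\frac{\sum x_i^3 f_i}{\sum f_i}}$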
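All the weighted formulas above are instances of one weighted power mean. The sketch below is a hypothetical Python helper, not a formula from the text: it takes the degree $k$ as a parameter and treats $k = 0$ as the weighted geometric mean. The values and frequencies are made up for illustration.

```python
import math
from statistics import geometric_mean

def weighted_power_mean(values, freqs, k):
    """Weighted power mean of degree k for positive values; k = 0 is geometric."""
    total = sum(freqs)
    if k == 0:
        # Weighted geometric mean via the frequency-weighted average logarithm
        return math.exp(sum(f * math.log(x) for x, f in zip(values, freqs)) / total)
    return (sum(f * x ** k for x, f in zip(values, freqs)) / total) ** (1 / k)

# Hypothetical series: value 2 occurs once, 4 occurs twice, 8 occurs once
values = [2, 4, 8]
freqs = [1, 2, 1]

table = {k: weighted_power_mean(values, freqs, k) for k in (-1, 0, 1, 2, 3)}
for k, m in table.items():
    print(f"k = {k:>2}: {m:.4f}")

# Cross-check: weighting by frequency equals repeating each value f times
expanded = [2, 4, 4, 8]
print(geometric_mean(expanded))
```

Weighting by frequency is equivalent to writing each value out $f$ times and applying the simple formula, which is what the final cross-check illustrates for the geometric mean.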

3. Structural averages: mode and median

The mode is the value of an attribute that occurs most often in a given population. For a variation series, the mode is the most frequently occurring value of the ranked series. It shows the size of the attribute that is characteristic of a significant part of the population. For an interval series it is determined by the formula:

$Mo = x_0 + h \cdot \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})}$

where $x_0$ is the lower limit of the modal interval;

$h$ is the width of the interval;

$f_m$ is the frequency of the modal interval;

$f_{m-1}$ is the frequency of the previous interval;

$f_{m+1}$ is the frequency of the next interval.

The median is the variant located in the center of the ranked series. The median divides the series into two equal parts in such a way that on both sides of it there is the same number of population units. At the same time, in one half of the population units, the value of the variable feature is less than the median, while in the other half it is greater.

The descriptive nature of the median is manifested in the fact that it characterizes the quantitative boundary of the values ​​of the varying attribute, which are possessed by half of the population units.

When determining the median of an interval variation series, the interval that contains it (the median interval) is found first. This is the first interval whose cumulative frequency equals or exceeds half the sum of all the frequencies of the series. The median of an interval variation series is calculated by the formula:

$Me = x_0 + h \cdot \frac{\frac{\sum f}{2} - S_{m-1}}{f_m}$

where $x_0$ is the lower limit of the median interval;

$h$ is the width of the interval;

$f_m$ is the frequency of the median interval;

$\sum f$ is the number of members of the series;

$S_{m-1}$ is the cumulative frequency of the intervals preceding the median interval.

Along with the median, other values that occupy a definite position in the ranked series are used to characterize the structure of the population more fully. These include quartiles and deciles. Quartiles divide the series, by the sum of frequencies, into four equal parts, and deciles into ten equal parts. There are three quartiles and nine deciles.
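For ungrouped data, quartiles and deciles can be obtained with `statistics.quantiles` from the Python standard library. This is a minimal sketch on a made-up ranked series; note that the function's default interpolation method is only one of several conventions, so its results need not match those of the interval formula exactly.

```python
from statistics import quantiles

data = list(range(1, 12))          # hypothetical ranked series: 1, 2, ..., 11

quartiles = quantiles(data, n=4)   # cut points dividing the series into 4 parts
deciles = quantiles(data, n=10)    # cut points dividing the series into 10 parts

print(quartiles)
print(len(quartiles), len(deciles))
```

The call returns the cut points only, which is why four parts yield three quartiles and ten parts yield nine deciles; the second quartile coincides with the median.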

The median and mode, unlike the arithmetic mean, do not cancel out individual differences in the values of a varying attribute and are therefore additional, very important characteristics of a statistical population. In practice they are often used instead of the mean or alongside it. Calculating the median and mode is especially useful when the population contains some units with very large or very small values of the attribute.

Relative values of structure are the ratio of the size of a part to the size of the whole. They characterize the composition, or structure, of the population and are expressed as shares (specific weights) or percentages. The relative values of structure sum to 1 or 100%. The difference between the corresponding shares of two populations is expressed in percentage points.

Absolute values ​​in statistics are the number of units and the sums for groups and for the whole population, which are the direct result of summarizing and grouping data.

Absolute values are named numbers, that is, they have units of measurement (for example, pieces, tons, hryvnias). Absolute indicators are divided into indicators of population size (for example, the number of enterprises) and indicators of the volume of an attribute (output, profit). There are three groups of units of measurement: natural, labor and value (cost) units.

Natural units reflect the physical properties inherent in phenomena (measures of weight, length, time). Sometimes combined units are used, which are the product of quantities of different dimensions (for example, electricity production in kWh).

It is not always possible to obtain absolute values directly by summing the attribute values of individual units. In that case, the individual terms are first brought to a comparable form, often using conditionally natural units. For example, when calculating the amount of fuel consumed, the different types of fuel are converted, according to their calorific value, into units of standard fuel, whose calorific value is 7000 kcal/kg.
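The conversion to standard fuel can be sketched in Python as a simple rescaling by calorific value. The reference value of 7000 kcal/kg is the conventional definition of standard fuel; the fuel names, masses and calorific values below are made-up figures for illustration only.

```python
STANDARD_KCAL_PER_KG = 7000   # calorific value of standard fuel

def to_standard_fuel(mass_kg, kcal_per_kg):
    """Mass of standard fuel with the same heat content as the given fuel."""
    return mass_kg * kcal_per_kg / STANDARD_KCAL_PER_KG

# Hypothetical consumption: (fuel, mass in kg, calorific value in kcal/kg)
consumption = [
    ("fuel A", 1000, 3500),
    ("fuel B", 500, 10500),
]

total_standard_kg = sum(to_standard_fuel(m, q) for _, m, q in consumption)
print(total_standard_kg)
```

The raw masses (1000 kg and 500 kg) cannot be meaningfully added, whereas their standard-fuel equivalents (500 kg and 750 kg) can.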

Labor units (man-hours, man-shifts) are used to measure the labor spent on producing output or performing particular jobs, to determine labor productivity, and to measure labor resources.

Value (cost) units make it possible to generalize and compare diverse phenomena. They are used to determine such important indicators as turnover, profit and capital investment.

Often, the absolute value of the indicator is calculated according to a certain rule based on other indicators. For example, gross profit is calculated as the difference between gross income and gross costs.

Many absolute values ​​are presented in the form of a balance, which provides for the calculation of the indicator in two sections: by sources of formation (revenue part of the balance sheet) and by directions of use (expenditure part). It is also possible to present absolute indicators in a dynamic balance form. For example, the increase in the number of pieces of equipment in an enterprise for a year can be represented as the difference in the number of pieces of equipment at the end and beginning of the year, or it can be represented as the difference between the number of units of newly introduced and retired equipment.



Chapter 4.3. Relative values.

Relative values ​​reflect the quantitative relationships of socio-economic phenomena. Their algebraic form is the quotient of the division of two similar or dissimilar quantities. The denominator of the ratio is considered as the basis of comparison or the basis of the relative value.

The comparison base can be taken as 100, 1,000, 10,000 or 100,000 units. The relative value is then expressed, respectively, as a percentage (%), per mille (‰), per ten thousand (‱, prodecimille) or per hundred thousand (procentimille).

Relative values ​​of different content and nature are used.

The ratio of absolute values with different units gives a relative value of intensity. This is a named quantity that combines the units of the numerator and the denominator, for example output per capita. Relative values of intensity characterize how widespread or developed a phenomenon is in a particular environment. They also include demographic rates (birth rate, death rate, intensity of migration flows), calculated as the ratio of the number of events (births, deaths) over a period to the average population for the same period.

Comparing values with the same units makes it possible to distinguish the following types of relative values: structure, coordination, dynamics, plan target, plan fulfilment, and comparison of object characteristics.

Relative values of coordination are ratios between the individual parts of a whole, with one part taken as the base of comparison. Examples: the number of urban residents per 100 rural residents; the number of women per 100 men. These values are expressed as percentages, per mille or multiples (for example, 114 women per 100 men).

To assess the intensity of development, the relative value of dynamics is used, calculated as the ratio of the levels of the phenomenon under study in two periods.

Relative values of comparison are calculated as ratios of like indicators that characterize different objects or territories and refer to the same period or point in time.

Some processes are planned, and plan targets are set for the indicators that reflect them. Comparing the planned and actual values of these indicators gives two relative values: the plan target and plan fulfilment.

If the actual level of the current period is denoted y1, the base level y0 and the planned level ypl, then the relative values are:

1) dynamics

Kd = y1 / y0,

2) plan target

Kpz = ypl / y0,

3) plan fulfilment

Kvp = y1 / ypl.
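It follows directly from these three definitions that Kd = Kpz × Kvp, since ypl cancels in the product. A minimal Python sketch with made-up levels (the function name and the numbers are assumptions for illustration):

```python
def relative_values(y0, y_plan, y1):
    """Relative values of dynamics, plan target and plan fulfilment."""
    return {
        "dynamics": y1 / y0,             # Kd
        "plan_target": y_plan / y0,      # Kpz
        "plan_fulfilment": y1 / y_plan,  # Kvp
    }

# Hypothetical levels: base period 200, planned 220, actual 231
k = relative_values(y0=200, y_plan=220, y1=231)
for name, value in k.items():
    print(f"{name}: {value:.3f}")

# The product of plan target and plan fulfilment reproduces the dynamics
print(round(k["plan_target"] * k["plan_fulfilment"], 3))
```

In this illustration a planned growth of 10% that is overfulfilled by 5% corresponds to actual growth of 15.5%.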

Chapter 4.4. Types and forms of average values.

The average value is a statistical indicator that gives a generalized description of a varying attribute of homogeneous units of a population under specific conditions of place and time. The average characterizes the entire population with respect to one given attribute.

The average value reflects what is common to all units of the population under study.

So, for example, the average wage gives a generalizing quantitative characteristic of the level of pay of the group of workers under consideration.

The essence of the average is that random deviations of the attribute values cancel out in it, while the changes caused by the main factors are retained.

Statistical processing by the method of averages consists in replacing the individual values of a varying attribute with a single balanced average value $\bar{x}$.

For example, the individual daily output of 5 tellers of a commercial bank was 136, 140, 154, 158 and 162 transactions. To get the average number of transactions per day performed by one teller, add up these individual values and divide the total by the number of tellers:

$\bar{x} = \frac{136 + 140 + 154 + 158 + 162}{5} = \frac{750}{5} = 150$ transactions.

As the example shows, the average number of transactions does not match any of the individual values, since no teller performed exactly 150 transactions. But if we imagine that each teller performed 150 transactions, their total would not change and would still equal 750. This brings us to the main property of averages: the sum of the individual values of the attribute equals the sum of the average values.
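The main property of the average can be verified in a few lines of Python. Of the five outputs used below, the value 158 is inferred from the stated total of 750 and should be treated as an assumption; the other four are quoted from the example.

```python
outputs = [136, 140, 154, 158, 162]   # 158 is inferred from the stated total of 750

n = len(outputs)
mean = sum(outputs) / n
print(mean)

# Replacing every individual value by the mean leaves the total unchanged
replaced = [mean] * n
print(sum(outputs), sum(replaced))

# Equivalently, deviations from the mean cancel out
print(sum(o - mean for o in outputs))
```

The mean of 150 is not among the individual values, yet five tellers at 150 transactions each give the same total of 750.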

This property once again emphasizes that the average value is a generalizing characteristic of the entire statistical population.

Mean values ​​are divided into two large classes:

Power averages:

Arithmetic

Harmonic

Geometric

Quadratic

Structural averages:

Mode

Median

The most common type of average is the arithmetic mean:

Simple arithmetic mean

Weighted arithmetic mean

Arithmetic mean for an interval series.

The simple arithmetic mean is the average term obtained when the total volume of a given attribute in the data set is distributed equally among all the units of the set.

Thus, the average annual output per worker is the volume of output that would fall to each employee if the entire output were distributed equally among all employees of the organization. The simple arithmetic mean is calculated by the formula.