
About statistics and statistical data. General concept of variation

1. The general concept of statistics. Subject of statistics.

The term statistics has several meanings. First, statistics is the planned and systematic accounting carried out throughout the country by state statistics bodies headed by the State Committee of the Russian Federation on Statistics.

Second, statistics is the numerical data published in special reference books and mass media.

Third, statistics is a special scientific discipline.

The subject and content of statistical science were debated for a long time. To resolve these issues, special meetings were held in 1954 and 1968 with the involvement of a wide range of scientists and practitioners, not only statisticians but also specialists from related sciences. In addition, the subject of statistics was discussed in the specialized literature until the mid-1970s. Three main points of view on the subject of statistics emerged from these discussions:

1. Statistics is a universal science that studies the mass phenomenon of nature and society.

2. Statistics is a methodological science that does not have its own subject of knowledge, but is a doctrine of the method used by the social sciences.

3. Statistics is a social science that has its own subject, methodology and studies the quantitative patterns of social development.

As a result of the meetings and discussions held in statistical science, the first two points of view were rejected by the majority of scientists and practitioners, and the third one was basically accepted, supplemented and clarified.

The subject of statistics is the quantitative side of mass socio-economic phenomena, studied in inextricable connection with their qualitative side under specific conditions of place and time. From this definition follow the main features of the subject of statistical science:

1. Statistics is a social science.

2. Unlike other social sciences, statistics studies the quantitative side of social phenomena.

3. Statistics studies mass phenomena.

4. Statistics studies the quantitative side of phenomena in close connection with their qualitative side, and this is embodied in the existence of a system of statistical indicators.

5. Statistics studies the quantitative side of phenomena in specific conditions of place and time.

2. Method of statistics and statistical methodology.

Statistical methodology is understood as a system of principles, and of methods for implementing them, aimed at studying the quantitative patterns that manifest themselves in the structure, relationships and dynamics of socio-economic phenomena. The most important constituent elements of the statistical method and statistical methodology are mass statistical observation, summary and grouping, and the use and analysis of generalizing statistical indicators.

The essence of the first element of statistical methodology is the collection of primary data about the object under study. For example, in the course of a population census, data are collected on each person living in the country's territory and entered in a special form.

The second element, summary and grouping, is the division of the totality of data obtained at the observation stage into homogeneous groups according to one or more characteristics. For example, as a result of grouping, the census materials are divided into groups (by sex, age, population, education, etc.).

The essence of the third element of statistical methodology consists in the calculation and socio-economic interpretation of generalizing statistical indicators:

1. Absolute

2. Relative

3. Average

4. Indicators of variation

5. Indicators of dynamics

The three main elements of statistical methodology also constitute the three stages of any statistical study.

3. Law of large numbers and statistical regularity.

The law of large numbers plays an important role in statistical methodology. In its most general form, it can be formulated as follows:

The law of large numbers is a general principle by virtue of which the cumulative action of a large number of random factors leads, under certain general conditions, to a result almost independent of chance.

The law of large numbers is generated by special properties of mass phenomena. On the one hand, such phenomena differ from one another owing to their individuality; on the other hand, they have something in common that determines their belonging to a certain class.

A single phenomenon is more susceptible to the influence of random and insignificant factors than a mass of phenomena as a whole. Under certain conditions, the value of a feature of an individual unit can be regarded as a random variable, since it not only obeys a general pattern but is also formed under the influence of conditions that do not depend on this pattern. It is for this reason that statistics makes wide use of averages, which characterize the entire population with one number. Only with a large number of observations do random deviations from the main direction of development balance out and cancel each other, so that the statistical regularity manifests itself more clearly. Thus, the essence of the law of large numbers is that in the numbers summarizing the results of mass statistical observation, the pattern of development of socio-economic phenomena is revealed more clearly than in a small statistical study.
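The effect described above can be illustrated with a small simulation. This is only a sketch under assumed conditions: the uniform distribution on [0, 1], the fixed seed and the function name deviation_of_sample_mean are choices made for this illustration and do not come from the text.

```python
import random
import statistics


def deviation_of_sample_mean(sample_size, seed=0):
    """Absolute gap between the sample mean and the true mean 0.5.

    Assumption: individual values are independent draws from the
    uniform distribution on [0, 1], whose true mean is 0.5.
    """
    rng = random.Random(seed)
    sample = [rng.random() for _ in range(sample_size)]
    return abs(statistics.fmean(sample) - 0.5)


# The gap is typically much smaller for the large samples.
for size in (10, 1_000, 100_000):
    print(size, deviation_of_sample_mean(size))
```

With only ten observations the sample mean can be far from 0.5; with a hundred thousand, random deviations largely cancel out.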

4. Branches of statistics.

In the process of historical development, as part of statistics as a single science, the following branches emerged and gained a certain independence:

1. The general theory of statistics, which develops the concepts, categories and methods for measuring the quantitative patterns of social life.

2. Economic statistics, which studies the quantitative patterns of reproduction processes at various levels.

3. Social statistics, which studies the quantitative side of the development of the social infrastructure of society (health care, education, culture, moral and judicial statistics, etc.).

4. Industry statistics (statistics of industry, agro-industrial complex, transport, communications, etc.).

All branches of statistics, developing and improving their methodology, contribute to the development of statistical science as a whole.

5. Basic concepts and categories of statistical science in general.

A statistical population (aggregate) is a set of elements of the same type that are similar to each other in some respects and differ in others. For example: the set of sectors of the economy, the set of universities, the set of employees of a design bureau, etc.

The individual elements of a statistical population are called its units. In the examples above, the units of the population are, respectively, a single sector, a single university and a single employee.

Units of the population usually have many features.

A feature (sign) is a property of the units of the population that expresses their essence and is able to vary, i.e. change. Features that take different values in individual units of the population are called varying, and the values themselves are called variants.

Varying features are subdivided into attributive (qualitative) and quantitative ones. A feature is called attributive, or qualitative, if its individual values (variants) are expressed as a state or property inherent in the phenomenon. Variants of attributive features are expressed in verbal form. Economic features can serve as examples.

A feature is called quantitative if its individual values are expressed as numbers. For example: salary, scholarship, age, PF size.

According to the nature of variation, quantitative signs are divided into discrete and continuous.

Discrete features are quantitative features that can take only well-defined, as a rule integer, values.

Continuous features are those that, within certain limits, can take both integer and fractional values. For example: the GNP of a country, etc.

There are also primary and secondary features.

Primary features characterize the main content and essence of the phenomenon or process being studied.

Secondary features provide additional information and are not directly related to the inner content of the phenomenon.

Depending on the goals of a particular study, the same features may be primary in some cases and secondary in others.

A statistical indicator is a category that reflects the dimensions and quantitative ratios of the features of socio-economic phenomena, together with their qualitative certainty, under specific conditions of place and time. It is necessary to distinguish between the content of a statistical indicator and its specific numerical expression. The content, i.e. the qualitative certainty, lies in the fact that indicators always characterize socio-economic categories (population, economy, financial institutions, etc.). The quantitative dimensions of statistical indicators, i.e. their numerical values, depend primarily on the time and place of the object subjected to statistical research.

Socio-economic phenomena, such as the standard of living of the population, as a rule cannot be characterized by any single indicator. A scientifically substantiated system of statistical indicators is necessary for a comprehensive characterization of the phenomena under study. Such a system is not permanent: it is constantly being improved in line with the needs of social development.

6. Tasks of statistical science and practice in the conditions of market economy development.

The main tasks of statistics in the context of the development of market relations in Russia are the following:

1. Improving accounting and reporting and reducing the document flow on this basis.

2. Strengthening control over the reliability of the statistical information provided by enterprises, institutions and organizations of all sectors of the economy and forms of ownership.

3. Improving the timeliness of statistical information, both that received by statistical agencies and that which they provide to the bodies of state power and administration.

4. Deepening the analytical functions, developing statistical data, and shaping the subject matter of the statistical work being conducted in accordance with the current tasks of the socio-economic development of the country.

5. Further development and improvement of statistical methodology based on the ever-wider introduction of PC practice and ... statistical analysis was not predicted.

A statistical summary is a method of scientific processing of the statistical data collected during observation, in which the information relating to individual units is aggregated and then characterized by analytical indicators and a system of tables. The summary yields statistical data that characterize the entire population. At this stage, a transition is made from the individual characteristics of the units of the population to generalizing indicators that characterize the population as a whole.

A summary is understood in a narrow and in a broad sense. In the narrow sense, a summary is a technical operation for calculating totals. In the broad sense, a summary comprises grouping the information obtained during observation, compiling a system of indicators to characterize the typical groups, presenting these indicators in tables, and calculating general and group totals.

2.1. General concept of groupings.

Grouping is a method of studying socio-economic phenomena in which the statistical population is divided into homogeneous groups that reveal the state and development of the entire population.

Grouping is the most important stage of statistical research, linking the collection of primary information about the object of study with the analysis of this information on the basis of generalizing statistical indicators.

Grouping methods are varied. This diversity is due, on the one hand, to a huge variety of features subjected to statistical research, and, on the other hand, to a variety of tasks that are solved on the basis of groupings.

2.2. The most important problems that arise when grouping.

The most important problem in building a grouping is the choice of the grouping feature, or the basis of the grouping.

A grouping feature is a varying feature by which the units of the population are combined into groups.

According to the nature of variation, features are divided, as is known, into attributive and quantitative ones. This division determines how the second problem of grouping is solved, namely determining the number of groups to be distinguished. When certain attributive features are chosen as grouping features, only a strictly defined number of groups can be distinguished. In particular, when the population is grouped by sex, one can distinguish ...

When grouping enterprises by profit, 3 groups can be distinguished.

For many attributive features, stable groupings are developed, called classifications. For example: classification of economic sectors, classification of occupations of the population, etc.

When grouping by a quantitative feature, the number of groups and their boundaries should be decided based on the essence of the socio-economic phenomenon under study. One should take into account the range of variation: the greater the range of variation, the more groups are formed, and vice versa. It is also necessary to take into account the number of units in the population on which the grouping is built. With a small population, it is not advisable to form a large number of groups, because the groups will then not contain enough units to reveal statistical patterns.

An essential issue in grouping by quantitative features is the definition of intervals. The number of groups and the width of the intervals are inversely related: the wider the intervals, the fewer groups are required, and vice versa.

The width of an interval is the difference between its upper and lower bounds.

According to their width, intervals are divided into equal and unequal. Equal intervals are used when the grouping attribute changes evenly within the population. The width of an equal interval is calculated by the formula:

h = (Xmax - Xmin) / k

h - width of the interval

k - number of groups

Xmax, Xmin - respectively, the largest and smallest values of the grouping attribute.
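As an illustration of equal-interval grouping, the sketch below computes the interval width as (Xmax - Xmin) / k and counts the units falling into each group. The profit figures and the function names are hypothetical, and the rule that a value lying exactly on a boundary goes to the higher group (except for Xmax, which stays in the last group) is an assumption of this example.

```python
def interval_width(values, k):
    """Width of an equal interval: (Xmax - Xmin) / k."""
    return (max(values) - min(values)) / k


def group_counts(values, k):
    """Count units per group for k equal intervals.

    Assumption: a value on an inner boundary is assigned to the
    higher group; the maximum value is kept in the last group.
    The values must not all be equal (the width would be zero).
    """
    x_min = min(values)
    h = interval_width(values, k)
    counts = [0] * k
    for x in values:
        index = min(int((x - x_min) / h), k - 1)
        counts[index] += 1
    return counts


# Hypothetical profits of ten enterprises.
profits = [10, 12, 15, 18, 20, 22, 25, 28, 30, 40]
print(interval_width(profits, 3))  # 10.0
print(group_counts(profits, 3))    # [4, 4, 2]
```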

If the distribution of the grouping attribute within the population is uneven, unequal intervals are used. Unequal intervals can be progressively increasing or progressively decreasing. Often, so-called specialized intervals are used in grouping, i.e. those determined by the purpose of the study and the essence of the phenomenon. For example, when grouping in order to characterize the able-bodied population of the country, five-year age intervals are used.

The third problem in constructing groupings is the designation of interval boundaries. When intervals are formed for discrete quantitative features, their boundaries should be designated so that the lower limit of each subsequent interval differs from the upper limit of the previous one by one.

When grouping by a continuous quantitative feature, the boundaries are designated so that the groups are clearly separated from one another. This is achieved by supplementing the numerical boundaries of the intervals with an indication of the group to which a unit should be assigned when its value of the grouping feature exactly coincides with an interval boundary. Such explanations are usually expressed by the words “more”, “less”, “over”, etc.

2.3. Grouping types.

Depending on the tasks solved with the help of groupings, the following types are distinguished:

Typological

Structural

Analytical

The main task of typological grouping is to classify socio-economic phenomena by identifying groups that are qualitatively homogeneous.

In this case, qualitative homogeneity is understood in the sense that, with respect to the property under study, all units of the population obey the same law of development. For example: grouping enterprises by branch of the economy.

Absolute and relative values.

An absolute value is an indicator that expresses the dimensions of a socio-economic phenomenon.

A relative value in statistics is an indicator that expresses the quantitative relationship between phenomena. It is obtained by dividing one absolute value by another. The value with which the comparison is made is called the base, or comparison base.

Absolute values are always named values, i.e. they are expressed in units of measurement.

Relative values are expressed as coefficients, percentages, per mille, etc.

A relative value shows how many times, or by how many percent, the compared value is greater or less than the comparison base.
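A minimal sketch of how a relative value is obtained from two absolute values is given below. The figures and the function name relative_value are hypothetical; the scale argument (1 for a coefficient, 100 for percent, 1000 for per mille) is a convenience assumed for this example.

```python
def relative_value(compared, base, scale=1):
    """Ratio of the compared value to the comparison base.

    scale=1 gives a coefficient, scale=100 a percentage,
    scale=1000 a per mille value.
    """
    return compared * scale / base


# Hypothetical output of 150 units against a base of 120 units.
print(relative_value(150, 120))       # 1.25 (coefficient)
print(relative_value(150, 120, 100))  # 125.0 (percent)
```

Here the compared value is 1.25 times the base, i.e. 25 percent greater.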

In statistics, there are 8 types of relative values:

1. Essence and meaning of average values.

Averages are among the most common generalizing statistical indicators. Their purpose is to characterize with one number a statistical population consisting of a multitude of units. Averages are closely related to the law of large numbers. The essence of this relationship is that, with a large number of observations, random deviations from the general pattern cancel each other out and the statistical regularity manifests itself more clearly in the average.

Using the method of averages, the following main tasks are solved:

1. Characteristics of the level of development of phenomena.

2. Comparison of two or more levels.

3. The study of the relationship of socio-economic phenomena.

4. Analysis of the distribution of socio-economic phenomena in space.

To solve these problems, statistical methodology has developed various types of averages.

2. Arithmetic mean.

To clarify the methodology for calculating the arithmetic mean, we use the following notation:

X - the feature being averaged

Xi (X1, X2, ..., Xn) - variants of the feature

n - number of population units

X̄ - the average value of the feature

Depending on the initial data, the arithmetic mean can be calculated in two ways:

1. If the statistical observation data are not grouped, or the grouped variants have the same frequencies, then the simple arithmetic mean is calculated:

X̄ = (X1 + X2 + ... + Xn) / n = ΣXi / n

2. If the data are grouped and the frequencies of the variants differ, then the weighted arithmetic mean is calculated:

X̄ = Σ(Xi*fi) / Σfi

fi - number (frequency) of occurrences of variant Xi

Σfi - sum of frequencies

The arithmetic mean is calculated differently in discrete and interval variation series.

In discrete series, feature variants are multiplied by frequencies, these products are summed up, and the resulting sum of products is divided by the sum of frequencies.

Consider an example of calculating the arithmetic mean in a discrete series:

Table columns: salary, rub. (Xi); number of employees, people (fi); product of the variant and its weight (frequency), Xi*fi.
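The calculation for a discrete series can be sketched as follows. The salary and headcount figures are hypothetical, chosen only to show the mechanics; the function name weighted_mean is likewise an assumption of this example.

```python
def weighted_mean(variants, frequencies):
    """Weighted arithmetic mean: sum(Xi * fi) / sum(fi)."""
    total = sum(x * f for x, f in zip(variants, frequencies))
    return total / sum(frequencies)


# Hypothetical discrete series: salary (rub.) and number of employees.
salaries = [1000, 1200, 1500]
employees = [2, 5, 3]

# Products Xi * fi: 2000, 6000, 4500; their sum is 12500.
print(weighted_mean(salaries, employees))  # 1250.0
```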

In interval series, the value of a feature is given, as is known, in the form of intervals, therefore, before calculating the arithmetic mean, it is necessary to switch from an interval series to a discrete one.

The midpoints of the corresponding intervals are used as the variants Xi. They are defined as half the sum of the lower and upper bounds.

If an interval has no lower bound, its midpoint is defined as the upper bound minus half the width of the following interval. If an interval has no upper bound, its midpoint is defined as the lower bound plus half the width of the previous interval. After the transition to a discrete series, further calculations are carried out by the method discussed above.
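The midpoint rules above, including the treatment of open intervals, can be sketched as follows. The representation of an open bound as None and the sample intervals are assumptions of this illustration; the code also assumes that only the first and the last interval may be open.

```python
def interval_midpoints(intervals):
    """Midpoints of (lower, upper) intervals; None marks an open bound.

    Assumption: only the first interval may lack a lower bound and
    only the last may lack an upper bound.
    """
    midpoints = []
    for i, (lower, upper) in enumerate(intervals):
        if lower is None:
            next_lower, next_upper = intervals[i + 1]
            midpoints.append(upper - (next_upper - next_lower) / 2)
        elif upper is None:
            prev_lower, prev_upper = intervals[i - 1]
            midpoints.append(lower + (prev_upper - prev_lower) / 2)
        else:
            midpoints.append((lower + upper) / 2)
    return midpoints


# Hypothetical series: "up to 100", 100-200, 200-300, "over 300".
series = [(None, 100), (100, 200), (200, 300), (300, None)]
print(interval_midpoints(series))  # [50.0, 150.0, 250.0, 350.0]
```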

If the weights fi are given not in absolute but in relative terms, the formula for calculating the arithmetic mean is as follows:

X̄ = Σ(Xi*pi) / 100

pi - relative values of the structure, showing what percentage of the sum of all frequencies is accounted for by the frequency of each variant.

If the relative values of the structure are given not in percentages but in shares, the arithmetic mean is calculated by the formula:

X̄ = Σ(Xi*pi)

3. Average harmonic.

The harmonic mean is a transformed (inverse) form of the arithmetic mean. It is calculated when the weights fi are not given directly but enter as a factor into one of the available indicators. Like the arithmetic mean, the harmonic mean can be simple or weighted.

Simple (unweighted) harmonic mean:

X̄ = n / Σ(1/Xi)

Weighted harmonic mean:

X̄ = ΣWi / Σ(Wi/Xi)

Wi - product of the variants and their frequencies (Wi = Xi*fi)

When calculating averages, it must be remembered that every intermediate calculation, in both the numerator and the denominator, must yield indicators that make economic sense.
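A short sketch of both forms of the harmonic mean follows, assuming the usual textbook formulas n / Σ(1/Xi) and ΣWi / Σ(Wi/Xi). The price and turnover figures are hypothetical: the turnover Wi plays the role of the product Xi*fi, so the quantities sold (the weights fi) are not given directly.

```python
def simple_harmonic_mean(variants):
    """Unweighted harmonic mean: n / sum(1 / Xi)."""
    return len(variants) / sum(1 / x for x in variants)


def weighted_harmonic_mean(variants, products):
    """Weighted harmonic mean: sum(Wi) / sum(Wi / Xi), Wi = Xi * fi."""
    return sum(products) / sum(w / x for x, w in zip(variants, products))


# Hypothetical data: prices (rub. per unit) and turnover (rub.).
prices = [20, 25]
turnover = [2000, 5000]

# Quantities sold are 2000/20 = 100 and 5000/25 = 200 units,
# so the average price is 7000 rub. / 300 units.
print(weighted_harmonic_mean(prices, turnover))
print(simple_harmonic_mean([2, 4, 4]))
```

Both the numerator (total turnover) and the denominator (total quantity) have a clear economic meaning here.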

4. Structural average.

The structural averages characterize the composition of the statistical population according to one of the varying features. These averages are the mode and the median.

The mode is the value of the feature that has the highest frequency in a given distribution series.

In discrete distribution series, the mode is determined visually: first the highest frequency is found, and the modal value of the feature is then read off from it. In interval series, the following formula is used to calculate the mode:

Mo = XMo + iMo * (fMo - fMo-1) / ((fMo - fMo-1) + (fMo - fMo+1))

XMo - the lower limit of the modal interval (the interval of the series with the highest frequency)

iMo - width of the modal interval

fMo - frequency of the modal interval

fMo-1 - frequency of the interval preceding the modal one

fMo+1 - frequency of the interval following the modal one
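The mode of an interval series can be sketched as below, assuming the commonly used formula Mo = XMo + i * (fMo - fMo-1) / ((fMo - fMo-1) + (fMo - fMo+1)) and intervals of equal width. The data and the function name are hypothetical; treating a missing neighbouring interval as having frequency 0 is an assumption of this example.

```python
def interval_mode(lower_bounds, width, frequencies):
    """Mode of an interval series with equal interval width.

    Assumptions: the first interval with the highest frequency is
    taken as modal, and a missing neighbour has frequency 0.
    """
    m = frequencies.index(max(frequencies))
    f_mo = frequencies[m]
    f_prev = frequencies[m - 1] if m > 0 else 0
    f_next = frequencies[m + 1] if m < len(frequencies) - 1 else 0
    shift = (f_mo - f_prev) / ((f_mo - f_prev) + (f_mo - f_next))
    return lower_bounds[m] + width * shift


# Hypothetical series: intervals 10-20, 20-30, 30-40.
bounds = [10, 20, 30]
freqs = [5, 12, 8]
print(interval_mode(bounds, 10, freqs))  # 20 + 10 * 7 / 11
```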

The median is the value of the feature that divides the distribution series into two parts with equal total frequencies. The median is calculated differently in discrete and interval series.

1. If the distribution series is discrete and consists of an even number of members, the median is defined as the average of the two middle values of the ranked series.

2. If the discrete distribution series has an odd number of members, the median is the middle value of the ranked series.

In interval series, the median is determined by the formula:

Me = XMe + iMe * (Σf/2 - SMe-1) / fMe

XMe - the lower bound of the median interval (the interval for which the accumulated frequency exceeds half the sum of frequencies for the first time)

iMe - width of the median interval

Σf - the sum of the frequencies of the series

SMe-1 - the sum of the accumulated frequencies preceding the median interval

fMe - frequency of the median interval
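The median of an interval series can be sketched in the same way, assuming the commonly used formula Me = XMe + i * (Σf/2 - SMe-1) / fMe and equal interval widths. The data and the function name are hypothetical.

```python
def interval_median(lower_bounds, width, frequencies):
    """Median of an interval series with equal interval width.

    The median interval is the first one whose accumulated
    frequency exceeds half the sum of all frequencies.
    """
    half = sum(frequencies) / 2
    accumulated = 0
    for lower, f in zip(lower_bounds, frequencies):
        if accumulated + f > half:
            return lower + width * (half - accumulated) / f
        accumulated += f
    raise ValueError("at least one frequency must be positive")


# Hypothetical series: intervals 10-20, 20-30, 30-40.
median_bounds = [10, 20, 30]
median_freqs = [5, 12, 8]

# Half of 25 is 12.5; accumulated frequencies are 5, 17, 25,
# so the median interval is 20-30.
print(interval_median(median_bounds, 10, median_freqs))  # 26.25
```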

1. General concept of variation.

Variation is the difference in the values of the feature among individual units of the population.

Variation arises because the individual values of the feature are formed under the influence of a large number of interrelated factors. These factors often act in opposite directions, and their joint action forms the value of the feature in a particular unit of the population. The need to study variation stems from the fact that the average value, which summarizes the data of statistical observation, does not show how the individual values of the feature fluctuate around it. Variation is inherent in the phenomena of nature and society. At the same time, changes in society occur faster than similar changes in nature. Objectively, variation also exists in space and in time.

Variations in space show the difference in statistical indicators related to various administrative-territorial units.

Variations in time show the difference in indicators depending on the period or point in time to which they refer.

2. Measures of variations.

Measures of variation include the following indicators:

1. range of variation

2. average linear deviation

3. standard deviation

4. variance (dispersion)

5. coefficient of variation

1. The range of variation is its simplest measure. It is defined as the difference between the maximum and minimum values of the feature: R=Xmax-Xmin. The disadvantage of this indicator is that it depends only on the two extreme values of the feature (min, max) and does not characterize the variation within the population.

2. The average linear deviation is the mean of the absolute values of the deviations from the arithmetic mean. It is determined by the formulas:

Simple: d = Σ|Xi - X̄| / n

Weighted: d = Σ(|Xi - X̄|*fi) / Σfi

Deviations are taken in absolute value because otherwise, owing to the mathematical properties of the mean, their sum would always be zero.

4. The variance, or dispersion (mean square of deviations), is the most widely used measure of variability in statistics.

The variance is determined by the formulas:

Simple: σ² = Σ(Xi - X̄)² / n

Weighted: σ² = Σ((Xi - X̄)²*fi) / Σfi

example: page 36

The variance is a named indicator. It is measured in units corresponding to the square of the units of measurement of the feature under study. In this case, it shows that the mean squared deviation of the profit of 50 enterprises from the average profit is 1.48.

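The variance can also be determined by the formula:

σ² = mean(X²) - (X̄)², i.e. the mean of the squared values minus the square of the mean.

3. The standard deviation is defined as the square root of the variance: σ = sqrt(σ²).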

According to the initial data given above, the standard deviation is:

5. The coefficient of variation is defined as the ratio of the standard deviation to the average value of the feature, expressed as a percentage:

V = σ / X̄ * 100%

It characterizes the quantitative homogeneity of the statistical population. If this coefficient is less than 50%, the statistical population is considered homogeneous. If the population is not homogeneous, any statistical study can be carried out only within the homogeneous groups identified in it.
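The five measures of variation can be computed together, as in the sketch below for ungrouped data. The data set and the function name are hypothetical, and the formulas are the simple (unweighted) population versions, with division by n.

```python
import math


def variation_measures(values):
    """Range, average linear deviation, variance, standard deviation
    and coefficient of variation (in percent) for ungrouped data."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    std_dev = math.sqrt(variance)
    return {
        "range": max(values) - min(values),
        "linear_deviation": sum(abs(x - mean) for x in values) / n,
        "variance": variance,
        # Shortcut form: mean of squares minus square of the mean.
        "variance_shortcut": sum(x * x for x in values) / n - mean ** 2,
        "std_dev": std_dev,
        "cv_percent": std_dev * 100 / mean,
    }


# Hypothetical values of a feature; their mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
measures = variation_measures(data)
print(measures)
```

For these data the range is 7, the average linear deviation 1.5, the variance 4, the standard deviation 2 and the coefficient of variation 40%, which is below the 50% threshold named above.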

3. Variance of an alternative feature.

Alternative features are features with 2 mutually exclusive values, i.e. features that each individual unit of the population either possesses or does not possess. The presence of an alternative feature is usually denoted by 1, and its absence by 0. The proportion of units that have the feature is denoted by p, and the proportion of units that do not have it by q. In this case, p+q=1.

The variance of an alternative feature is determined by the formula:

σ² = p*q = p*(1 - p)
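For a feature coded as 1 (present) and 0 (absent), the variance equals the product p*q. The sketch below checks this on hypothetical data; the function name and the data are assumptions of this illustration.

```python
def alternative_variance(indicators):
    """Return (p, q, variance) for a 0/1-coded alternative feature."""
    p = sum(indicators) / len(indicators)
    q = 1 - p
    return p, q, p * q


# Hypothetical population of 10 units, 3 of which have the feature.
units = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
p, q, variance = alternative_variance(units)

# The same value follows from the general variance formula.
direct = sum((x - p) ** 2 for x in units) / len(units)
print(p, q, variance, direct)
```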

4. Types of variance. The rule for their addition.

If the statistical population under study is divided into groups, then group means and variances can be determined for each of them. These variances characterize the variation of the studied feature within each individual group. On this basis, the mean of the within-group variances can be determined:

mean(σi²) = Σ(σi²*ni) / Σni

σi² - variance within group i

ni=fi - number of units in the individual groups

This variance characterizes the random variation of the feature, which does not depend on the factor underlying the grouping.

The between-group variance is also calculated:

δ² = Σ((X̄i - X̄)²*ni) / Σni

X̄i and ni=fi are, respectively, the mean and the number of units of the individual groups; X̄ is the overall mean.

This variance characterizes the variation due to the influence of the grouping feature. The sum of the mean of the within-group variances and the between-group variance gives the total variance:

σ² = mean(σi²) + δ²

This equality is called the rule for adding variances.
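The rule for adding variances can be verified numerically. In the sketch below the mean of the within-group variances and the between-group variance are both weighted by group size; the two groups of values and the function name are hypothetical.

```python
import statistics


def variance_decomposition(groups):
    """Return (mean within-group variance, between-group variance,
    total variance), all as population variances (division by n)."""
    all_values = [x for group in groups for x in group]
    n_total = len(all_values)
    overall_mean = statistics.fmean(all_values)
    within = sum(
        statistics.pvariance(group) * len(group) for group in groups
    ) / n_total
    between = sum(
        (statistics.fmean(group) - overall_mean) ** 2 * len(group)
        for group in groups
    ) / n_total
    total = statistics.pvariance(all_values)
    return within, between, total


# Hypothetical data split into two groups by some grouping feature.
within, between, total = variance_decomposition([[1, 2, 3], [5, 7, 9]])
print(within, between, total)
```

The total variance computed directly from all six values coincides with the sum of the other two components.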

; , i.e. there is a close relationship between the manufacture of parts and other indicators.

If the values of the characteristic under study are expressed in shares or coefficients, then the rule for adding variances is expressed by the following formulas:

ni - number of units in separate groups

pi - share of the trait under study in the whole population

average of within-group variances for proportions of features

1. Types and forms of dependence between socio-economic phenomena.

The variety of relationships among socio-economic phenomena gives rise to the need to classify them.

By type, functional and correlation dependences are distinguished.

A functional dependence is one in which each value of the factor attribute X corresponds to one strictly defined value of the resulting attribute Y.

Unlike a functional dependence, a correlation dependence is a relationship between socio-economic phenomena in which one value of the factor attribute X can correspond to several values of the resulting attribute Y.

By direction, direct and inverse relationships are distinguished.

A direct relationship is one in which the values of the factor attribute X and the resulting attribute Y change in the same direction. That is, as X increases, the values of Y increase on average, and as X decreases, Y decreases.

A relationship between the factor and resulting attributes is inverse if they change in opposite directions.

2. Statistical methods for studying relationships.

An important place in the statistical study of relationships is occupied by the following methods:

1. Method of reduction of parallel data.

2. Method of analytical groupings.

3. Graphical method.

4. Balance method.

6. Correlation-regression.

1. The essence of the parallel data reduction method is as follows:

The initial data for attribute X are arranged in ascending or descending order, and the corresponding values of attribute Y are recorded alongside them. By comparing the values of X and Y, a conclusion is drawn about the presence and direction of the dependence.

3. The essence of the graphical method is a visual representation of the presence and direction of relationships between features. To do this, the values of the factor attribute X are plotted along the abscissa axis, and the values of the resulting attribute Y along the ordinate axis. From the joint arrangement of the points on the graph, a conclusion is drawn about the presence and direction of the dependence. The following cases are possible:

Figure: three scatter plots, (a) points scattered at random, (b) points around a rising straight line, (c) points around a falling straight line.

If the points on the graph are arranged randomly (a), there is no relationship between the studied features.

If the points on the graph are concentrated around a rising straight line (b), the relationship between the features is direct.

If the points are concentrated around a falling straight line (c), this indicates an inverse relationship.

Based on the method of parallel data and the graphical method, indicators can be calculated that characterize the degree of closeness of the correlation dependence.

The simplest of them is the Fechner sign coefficient. It is calculated by the formula:

KF = (C - H) / (C + H)

C - the number of coinciding signs of the deviations of the individual values of the attributes from their means

H - the number of mismatching signs

This coefficient varies within [-1; 1].

The value KF=0 indicates the absence of dependence between the studied features.

If KF=±1, this indicates a functional dependence, direct (+) or inverse (-). If |KF| > 0.6, it is concluded that there is a strong direct (inverse) relationship between the features. In addition, based on the initial data on the factor and resulting attributes, the Spearman rank correlation coefficient can be calculated, which is determined by the formula:

ρ = 1 - 6*Σd² / (n*(n² - 1))

d² - squares of the rank differences, d = R2 - R1

n - number of pairs of ranks

This coefficient varies within the same limits as KF and has the same economic interpretation.
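Both coefficients can be sketched in a few lines, assuming the usual formulas KF = (C - H) / (C + H) and ρ = 1 - 6Σd² / (n(n² - 1)). The data and function names are hypothetical. Two simplifications are assumed: pairs in which a deviation from the mean is exactly zero are skipped in the Fechner coefficient, and the Spearman coefficient is computed only for data without tied values.

```python
def fechner_coefficient(xs, ys):
    """Fechner sign coefficient (C - H) / (C + H).

    Assumption: pairs with a zero deviation are skipped.
    """
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    matches = mismatches = 0
    for x, y in zip(xs, ys):
        product = (x - mean_x) * (y - mean_y)
        if product > 0:
            matches += 1
        elif product < 0:
            mismatches += 1
    return (matches - mismatches) / (matches + mismatches)


def ranks(values):
    """Ranks 1..n in ascending order; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_coefficient(xs, ys):
    """Spearman rank correlation for data without ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical factor (X) and resulting (Y) values.
factor = [1, 2, 3, 4, 5, 6]
result = [2, 1, 4, 3, 6, 5]
print(fechner_coefficient(factor, result))   # (4 - 2) / 6
print(spearman_coefficient(factor, result))  # 1 - 36 / 210
```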

In cases where X or Y contains identical values (tied ranks), the rank correlation coefficient is calculated using the following formula:

tj - the number of identical ranks in the j-th series

If the relationship between three or more ranked features is being investigated, the concordance coefficient is used, which is determined by the formula:

W = 12*S / (m²*(n³ - n))

m - number of factors

n - number of observations

S - the sum of squared deviations of the rank sums from their mean

3. Studying the relationship between qualitative features.

To study the relationship between qualitative alternative features that take only 2 mutually exclusive values, the coefficients of association and contingency are used. To calculate them, the so-called fourfold (2x2) table is built, and the coefficients themselves are calculated by the formulas:

Ka = (a*d - b*c) / (a*d + b*c)

Kk = (a*d - b*c) / sqrt((a + b)*(b + d)*(a + c)*(c + d))

Fourfold table: the units are cross-classified by the groups of feature Y and the groups of feature X, with cell frequencies a and b in the first row and c and d in the second.

If the association coefficient is >= 0.5, and the contingency coefficient is >= 0.3, then we can conclude that there is a significant relationship between the studied features.
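The two coefficients can be sketched as follows, assuming the usual formulas Ka = (ad - bc) / (ad + bc) and Kk = (ad - bc) / sqrt((a + b)(b + d)(a + c)(c + d)), where a and b are the cell frequencies of the first row of the fourfold table and c and d those of the second. The cell values and function names are hypothetical.

```python
import math


def association_coefficient(a, b, c, d):
    """Association coefficient for a fourfold (2x2) table."""
    return (a * d - b * c) / (a * d + b * c)


def contingency_coefficient(a, b, c, d):
    """Contingency coefficient for a fourfold (2x2) table."""
    denominator = math.sqrt((a + b) * (b + d) * (a + c) * (c + d))
    return (a * d - b * c) / denominator


# Hypothetical fourfold table:
#        X = yes   X = no
# Y yes     30       10
# Y no      10       30
print(association_coefficient(30, 10, 10, 30))  # 0.8
print(contingency_coefficient(30, 10, 10, 30))  # 0.5
```

Both values exceed the thresholds of 0.5 and 0.3 named above, so for these hypothetical data the relationship would be judged significant.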

If the features have 3 or more gradations, the Pearson and Chuprov coefficients are used to study the relationships. They are calculated according to the formulas:

$$ C = \sqrt{\frac{\varphi^2}{1 + \varphi^2}}, \qquad K = \sqrt{\frac{\varphi^2}{\sqrt{(K_1 - 1)(K_2 - 1)}}}, \qquad \varphi^2 = \sum_i \sum_j \frac{f_{ij}^2}{m_i n_j} - 1 $$

C - Pearson coefficient

K - Chuprov coefficient

φ² - indicator of mutual contingency

K1 - number of values (groups) of the first feature

K2 - number of values (groups) of the second feature

fij - frequencies of the corresponding cells of the table

mi - column totals of the table

nj - row totals of the table

To calculate the Pearson and Chuprov coefficients, an auxiliary table is compiled in which the groups of feature Y are cross-classified against the groups of feature X.
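The calculation from such a table can be sketched as below. The sketch assumes the standard forms: the mutual contingency indicator is the sum over all cells of f^2 / (row total * column total), minus 1; the Pearson coefficient is sqrt(phi2 / (1 + phi2)); the Chuprov coefficient is sqrt(phi2 / sqrt((rows - 1) * (cols - 1))). The function name and the table are hypothetical.

```python
from math import sqrt


def pearson_chuprov(table):
    """Return (phi2, Pearson C, Chuprov K) for a table of cell frequencies."""
    rows, cols = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(rows)) for j in range(cols)]
    # Assumes every row and column total is positive.
    phi2 = sum(
        table[i][j] ** 2 / (row_totals[i] * col_totals[j])
        for i in range(rows)
        for j in range(cols)
    ) - 1
    phi2 = max(phi2, 0.0)  # guard against tiny negative rounding errors
    pearson = sqrt(phi2 / (1 + phi2))
    chuprov = sqrt(phi2 / sqrt((rows - 1) * (cols - 1)))
    return phi2, pearson, chuprov


# Hypothetical 3x3 table with a perfect one-to-one correspondence of groups.
phi2, c, k = pearson_chuprov([[10, 0, 0], [0, 10, 0], [0, 0, 10]])
```

In this extreme case the Chuprov coefficient reaches 1, while the Pearson coefficient stays below 1; this is the usual argument for reporting both.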

When qualitative features are ranked in order to study their relationship, the Kendall rank correlation coefficient is used:

$$ \tau = \frac{2S}{n(n - 1)}, \qquad S = P + Q $$

n - number of observations

S - the sum of the differences between the number of sequences and the number of inversions by the second feature.

P - the number of ranks that follow a given rank and exceed it, summed over all observations.

Q - the number of ranks that follow a given rank and are less than it, summed over all observations (taken with the "-" sign).
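A minimal sketch of the Kendall coefficient for data without tied ranks, assuming the standard form 2 * S / (n * (n - 1)) with S = P + Q. The function name and the data are hypothetical.

```python
def kendall_tau(x, y):
    """Kendall rank correlation for untied data."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    # Arrange the second feature in ascending order of the first.
    y_ordered = [yv for _, yv in sorted(zip(x, y))]
    p = q = 0
    for i in range(n):
        for j in range(i + 1, n):
            if y_ordered[j] > y_ordered[i]:
                p += 1  # a following value exceeds the given one
            elif y_ordered[j] < y_ordered[i]:
                q -= 1  # a following value is smaller (counted with "-")
    s = p + q
    return 2 * s / (n * (n - 1))


# Hypothetical data: 8 concordant and 2 discordant pairs out of 10.
tau = kendall_tau([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])
```

The double loop is quadratic in n, which is acceptable for the small samples typical of rank methods.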

In the presence of tied ranks, the formula for the Kendall coefficient is as follows:

Vx and Vy are determined separately for the ranks of X and Y by the formula:

5. Methods for identifying the main trend of the time series.

The levels of a time series are formed under the influence of 3 groups of factors:

1. Factors determining the main direction, i.e. the development trend of the phenomenon under study.

2. Factors acting periodically, i.e. seasonal fluctuations by weeks of the month, months of the year, etc.

3. Factors acting in different, sometimes opposite, directions and having no significant impact on the level of a given time series (random factors).

The main task of the statistical study of dynamics is to identify trends.

The main methods for identifying trends in time series are:

Interval enlargement method

Moving average method

Analytical alignment method

1. The essence of the interval enlargement method is as follows:

The original time series is transformed and replaced by another series whose levels relate to enlarged periods or points in time.

For example, a monthly series of the profit of a small enterprise for 1997 is replaced by a series by quarters of the same year. The levels of the series for enlarged periods or points in time can be either total or average indicators. In either case, the levels calculated in this way reveal the trend more clearly, since seasonal and random fluctuations cancel out when sums or averages are taken.
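As a sketch of interval enlargement, the function below collapses a monthly series into quarterly totals or quarterly averages. The function name, its parameters and the profit figures are hypothetical.

```python
def enlarge_intervals(levels, size, how="sum"):
    """Replace a series by totals or averages over blocks of `size` levels."""
    if size <= 0 or len(levels) % size != 0:
        raise ValueError("series length must be a multiple of the block size")
    blocks = [levels[i:i + size] for i in range(0, len(levels), size)]
    if how == "sum":
        return [sum(block) for block in blocks]
    if how == "mean":
        return [sum(block) / size for block in blocks]
    raise ValueError("how must be 'sum' or 'mean'")


# Hypothetical monthly profit figures for one year.
monthly = [10, 12, 11, 13, 15, 14, 16, 18, 17, 19, 21, 20]
quarterly_totals = enlarge_intervals(monthly, 3)
quarterly_means = enlarge_intervals(monthly, 3, how="mean")
```

The monthly figures move up and down from one month to the next, while both quarterly series rise steadily, which is the effect the method relies on.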

2. The moving average method, like the previous one, involves transforming the original time series. To identify a trend, an interval consisting of a fixed number of levels is formed. Each subsequent interval is obtained by shifting the previous one by 1 level. For the intervals thus formed, the sums are determined first, and then the averages. It is technically more convenient to compute moving averages over an odd interval. In this case, the calculated average refers to a specific level of the time series, i.e. to the middle of the moving interval.

When the moving average is determined over an even interval, the calculated average falls between two levels and thus loses its economic meaning. This requires an additional centering step: the simple arithmetic mean of two adjacent non-centered averages is taken.
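Both cases can be sketched as follows: a plain moving average for an odd interval, and an extra centering step for an even one. The function names and the series are hypothetical.

```python
def moving_average(levels, window):
    """Simple moving averages over every run of `window` adjacent levels."""
    if not 1 <= window <= len(levels):
        raise ValueError("window must be between 1 and the series length")
    return [
        sum(levels[i:i + window]) / window
        for i in range(len(levels) - window + 1)
    ]


def centered_moving_average(levels, window):
    """Moving averages that each refer to a specific level of the series."""
    averages = moving_average(levels, window)
    if window % 2 == 1:
        return averages  # already centered on the middle level
    # Even interval: average each pair of adjacent non-centered values.
    return [(a + b) / 2 for a, b in zip(averages, averages[1:])]


# Hypothetical series of six levels.
series = [2, 4, 6, 8, 10, 12]
odd = centered_moving_average(series, 3)   # refers to levels 2..5
even = centered_moving_average(series, 4)  # refers to levels 3..4
```

Note that smoothing shortens the series: an interval of 3 loses one level at each end, and a centered interval of 4 loses two.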

Statistics is a science that studies the quantitative side of mass socio-economic phenomena and processes in inseparable unity with their qualitative side in the specific conditions of place and time.

In the natural sciences, the concept of "statistics" means the analysis of mass phenomena based on the application of methods of probability theory.

Statistics develops a special methodology for the study and processing of materials: mass statistical observations, the method of groupings, averages, indices, the balance method, the method of graphic images.

The methodological features of statistics are the study of the mass nature of phenomena and of qualitatively homogeneous features of a phenomenon in dynamics.

Statistics includes a number of sections, among them the general theory of statistics, economic statistics and sectoral statistics (industrial, agricultural, transport, medical).

11. Groups of indicators for assessing the health status of the population.

The health of the population is characterized by three groups of main indicators:

A) medical and demographic - reflect the state and dynamics of demographic processes:

    Population statics (density, distribution, social composition, composition by sex and age, literacy, education, nationality, language, culture)

    Population dynamics (mechanical: emigration and immigration; natural: birth rate, death rate, natural increase)

    Marital status (marriage rate, divorce rate, average length of marriage)

    Reproduction processes (total fertility rate, gross and net reproduction rates)

    Average life expectancy

    Mortality (structure of mortality, mortality rates depending on the cause, nature of morbidity and age.)

B) indicators of morbidity and injury (primary morbidity, prevalence, cumulative morbidity, pathological damage, health index, mortality, injuries, disability.)

C) indicators of physical development:

    Anthropometric (height, body weight, circumference of the chest, head, shoulder, forearm, lower leg, thigh)

    Physiometric (vital capacity of the lungs, muscle strength of the hands, back strength)

    Somatoscopic (physique, muscle development, degree of fatness, shape of the chest, shape of the legs, feet, severity of secondary sexual characteristics.)

    Medical statistics, its sections, tasks. The role of the statistical method in studying the health of the population and the activities of the health care system.

Medical (sanitary) statistics - studies the quantitative side of the phenomena and processes associated with medicine, hygiene and health care.

There are 3 sections of medical statistics:

1. Population health statistics studies the health status of the population as a whole or of its individual groups (by collecting and statistically analyzing data on the size and composition of the population, its reproduction, natural movement, physical development, the prevalence of various diseases, life expectancy, etc.). Health indicators are assessed against generally accepted levels, against levels obtained for other regions, and in dynamics.

2. Healthcare statistics deals with collecting, processing and analyzing information about the network of healthcare institutions (their location, equipment, activities) and personnel (the number of doctors, middle and junior medical personnel, their distribution by specialty, length of service, retraining, etc.). When the activities of medical institutions are analyzed, the data obtained are compared with the normative levels, as well as with the levels obtained in other regions and in dynamics.

3. Clinical statistics is the use of statistical methods in processing the results of clinical, experimental and laboratory studies; it makes it possible to assess quantitatively the reliability of the results of a study and to solve a number of other problems (determining the required number of observations in a sample study, forming experimental and control groups, studying correlation and regression relationships, eliminating the qualitative heterogeneity of groups, etc.).

The tasks of medical statistics are:

1) study of the state of health of the population, analysis of the quantitative characteristics of public health.

2) identification of links between health indicators and various factors of the natural and social environment, assessment of the impact of these factors on the levels of public health.

3) study of the material and technical base of healthcare.

4) analysis of the activities of medical institutions.

5) evaluation of the effectiveness (medical, social, economic) of ongoing therapeutic, preventive, anti-epidemic measures and health care in general.

6) the use of statistical methods in the conduct of clinical and experimental biomedical research.

Medical statistics is a method of social diagnostics, since it allows assessing the health status of the population of a country, region and, on this basis, developing measures aimed at improving public health. The most important principle of statistics is its application to study not individual, single, but mass phenomena, in order to identify their common patterns. These patterns are manifested, as a rule, in the mass of observations, that is, in the study of the statistical population.

In medicine, statistics is the leading method, because it:

1) allows you to quantify the health indicators of the population and the performance of medical institutions

2) determines the strength of the influence of various factors on the health of the population

3) determines the effectiveness of treatment and recreational activities

4) allows you to evaluate the dynamics of health indicators and allows you to predict them

5) allows you to obtain the necessary data for the development of health care norms and standards.

    Statistical population. Definition, types, properties. Features of the study of the statistical population.

The object of any statistical study is a statistical population.

A statistical population is a group consisting of a set of relatively homogeneous elements taken together within known boundaries of space and time and possessing signs of similarity and difference.

Population Properties: 1) homogeneity of units of observation 2) certain boundaries of space and time of the phenomenon under study

The object of statistical research in medicine and health care can be various contingents of the population (the population as a whole or its separate groups, sick, dead, born), medical institutions, etc.

There are two types of statistical population:

a) general population

b) sample population

Features of conducting a statistical study on a sample population:

1. The sample population is formed in such a way as to give all elements of the original population an equal opportunity to be covered by observation.

2. The sample must be representative, i.e. accurately and fully reflect the phenomenon and give the same idea of it as if the whole general population had been studied.


    The sampling set, the requirements for it. Principles and methods of forming a sample population.

There are two types of statistical population:

a) general population - a set consisting of all units of observation that can be attributed to it in accordance with the purpose of the study. When studying public health, the general population is often considered within specific territorial boundaries or may be limited by other characteristics (gender, age, etc.), depending on the purpose of the study.

b) sample population - part of the general population, selected by a special (sampling) method and intended to characterize the general population.

Features of conducting a statistical study on a sample population:

1. The sample population is formed in such a way as to provide an equal opportunity for all elements of the original population to be covered by observation.

2. The sample must be representative, i.e. accurately and fully reflect the phenomenon and give the same idea of it as if the whole general population had been studied.

Sample population - part of the general population, selected by a special (sampling) method and intended to characterize the general population.

Sample requirements:

1) it must be representative, i.e. accurately and fully reflect the phenomenon and give the same idea of it as if the whole general population had been studied; for this it must:

a. be sufficient in number

b. have the main features of the general population (in the selected part, all elements must be presented in the same ratio as in the general population)

2) when forming it, the basic principle of sampling must be observed: an equal opportunity for each unit of observation to enter the study.

Ways to form a sample population:

1) random selection - selection of units of observation by drawing lots using a table of random numbers, etc. At the same time, each unit has an equal opportunity to be included in the sample.

2) mechanical selection - units of the general population, sequentially arranged according to some feature (alphabetically, by dates of visiting a doctor, etc.), are divided into equal parts; every 5, 10 or n-th observation unit is selected from each part in a predetermined order in such a way as to provide the required sample size.

3) typical (typological) selection - involves the mandatory preliminary division of the general population into separate qualitatively homogeneous groups (types) with subsequent sampling of observation units from each group according to the principles of random or mechanical selection.

4) serial (cluster, nested) selection - involves sampling from the general population not individual units but entire series (organized sets of observation units, for example, organizations, regions, etc.)

5) combined methods - a combination of various methods of forming a sample.
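The four basic selection methods can be sketched with Python's standard `random` module. All function names, parameters and data below are hypothetical, and the proportional allocation used in the typical selection is only one possible choice.

```python
import random


def random_selection(units, k, seed=None):
    """Random selection: every unit has an equal chance of being drawn."""
    return random.Random(seed).sample(units, k)


def mechanical_selection(units, k, start=0):
    """Mechanical selection: split the ordered list into k equal parts and
    take the unit at the same position `start` from each part."""
    step = len(units) // k
    if step == 0 or not 0 <= start < step:
        raise ValueError("need k <= len(units) and 0 <= start < step")
    return [units[start + i * step] for i in range(k)]


def typical_selection(groups, fraction, seed=None):
    """Typical selection: random draw inside each homogeneous group,
    here in proportion to the group size (an assumption)."""
    rng = random.Random(seed)
    sample = []
    for members in groups.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


def serial_selection(series, k, seed=None):
    """Serial selection: draw whole series and keep all of their units."""
    rng = random.Random(seed)
    chosen = rng.sample(list(series), k)
    return [unit for name in chosen for unit in series[name]]


# Hypothetical general population of 100 numbered units.
population = list(range(100))
simple = random_selection(population, 10, seed=1)
every_tenth = mechanical_selection(population, 10, start=4)
by_type = typical_selection(
    {"urban": population[:60], "rural": population[60:]}, 0.1, seed=1
)
by_series = serial_selection(
    {"A": [1, 2, 3], "B": [4, 5], "C": [6, 7, 8, 9]}, 2, seed=1
)
```

A combined method would simply chain these functions, for example by drawing series first and then selecting units mechanically inside each chosen series.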

To obtain data on the state of society, a whole complex of sciences is used. One of them is statistics. What is it?

What is statistics?

This is the name of the branch of knowledge that deals with general questions of collecting, measuring and analyzing mass (quantitative or qualitative) data. Statistics also studies the quantitative side of mass social phenomena in numerical form. The word comes from the Latin status, which means "state of affairs." Initially, this science was called "State Studies".

The term "statistics" was first used in 1746, and this moment marked the beginning of statistics as an academic discipline and science. True, it cannot be said that its practical use began then, since the accounting, measurement and analysis of data were carried out much earlier. The mode is an important parameter. Something similar may be remembered from geometry, but it is not quite the same. In statistics, the mode is the value in a data series that occurs most often.

Examples

Let's talk about something closer to reality. What are website page statistics? This parameter can be the number of users who accessed the resource and had the opportunity to view its content. True, from this point of view it will be difficult to answer the question of what VKontakte statistics are.

Separate information for each page is not collected. But the number of users who come in a day, a month is counted - in general, constantly. This is the answer to the question, what is statistics in practice in information technology.

Grouping types

Within the framework of a scientific discipline, one population is divided into separate groups that are homogeneous in a certain respect. To calculate the number of intervals when there are no clear boundaries, the Sturges formula is often used:

CHI = 1 + 3.322 * lg CHN, where

  • CHI - number of intervals;
  • lg - decimal logarithm;
  • CHN - number of observations.
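The formula is easy to check numerically. In the sketch below the function name is hypothetical, and rounding the result to the nearest whole number is an assumption, since the text does not say how the value should be rounded.

```python
import math


def sturges_intervals(n_observations):
    """Number of grouping intervals by the Sturges formula (not rounded)."""
    if n_observations < 1:
        raise ValueError("the number of observations must be at least 1")
    return 1 + 3.322 * math.log10(n_observations)


raw = sturges_intervals(100)  # 1 + 3.322 * 2
# Assumption: round to the nearest whole number of intervals.
intervals = round(raw)
```

Because the logarithm grows slowly, a tenfold increase in the number of observations adds only about three intervals.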

Depending on the goals, there are three types of groupings:


A typical group should differ from the others as much as possible and be as homogeneous as possible internally. Groupings are primary and secondary. The first are formed during statistical observation; secondary groupings are made on the basis of the data already received.

Classification of statistical methods

Statistical methods have found application almost everywhere. Therefore, it is logical to assume that there is no universal tool. Depending on how specific the problem is and how deeply the method is tied to it, the following kinds of data analysis work are distinguished:

  • Development and research of general purpose tools that do not take into account the specifics of the application area.
  • Creation and use of statistical models of some real phenomenon or process in a certain field of activity.
  • Development and use of methods and tools to analyze specific data to solve applied problems.

Applied Statistics

This branch of science deals with the processing of data of an arbitrary nature. Probability theory also serves as the mathematical basis of applied statistics and its methods of analysis. It all starts with a description of the type of data received, as well as the mechanism of their origin. For this, probabilistic and deterministic methods are used. The latter can be applied only in cases where the researcher has enough data at his disposal (an example is the reports of state statistical bodies, which are based on information provided by enterprises). But you can transfer the result to a larger scale and evaluate the prospects only using

In the simplest situation, the available data act as the value of a certain feature that is characteristic of the object under study. The parameters here are quantitative or indicative (depending on the category to which they belong). The second option usually speaks of a qualitative characteristic. What if we take several of them? Or add quantitative? Then we can say that the vector of the object has been obtained. It is regarded as new. In large-scale studies, samples are drawn from several sets of vectors. It is important to clarify and double-check the information received. For this, resampling is used.

Conclusion

As you can see, statistics allows you to structure significant amounts of data that are necessary to be able to provide information about the state of affairs in certain areas. Thus, it plays an important role for investors, as it makes it possible to observe the dynamics of the growth of the economies of states. Statistics are also of interest to citizens and authorities, telling them about the processes in the country: demographic growth or crisis, increase in welfare or its fall, and so on.


Yes, most Americans are aware that the economy is not in the best shape right now, but most also believe that this is just a temporary recession. The media tells us that the recovery has either begun or is about to begin.

But is it true?

1. According to Rep. Betty Sutton, America has been losing an average of 15 industrial plants a day over the past 10 years.

2. Even worse, this trend seems to be accelerating. During 2010, an average of 23 industrial plants closed daily in the US.

3. Since 2001, America has lost over 56 thousand industrial plants.

4. There are too few jobs in America and now the average time for an unemployed person to find a job is a whopping 39 weeks.

5. Only 48 percent of unemployed Americans now receive unemployment benefits from the government. Just a year ago, that figure was 75 percent.

6. There are no signs that the labor market is about to improve. One recent study found that 77 percent of small businesses in the US have no plans to hire additional workers at all.

7. Without enough decent jobs, millions of Americans are losing their homes. Over the past 4 years, in Las Vegas alone, 100,000 houses have been seized for mortgage debts.

8. New home sales are also under stress. In 2011, another all-time record for the lowest new home construction was set.

9. As household budgets dwindle, Americans may be saving less money, and a significant number say they don't have extra money for unnecessary spending. The US savings rate in September was the lowest since December 2007, and according to one recent study, one third of Americans say they don't have any money to spare right now.

10. According to one recent poll, one in three Americans say they would not be able to make their current mortgage payment or pay their rent if they unexpectedly lost their current job.

11. Extreme poverty is now at its highest level since the government began keeping statistics. Now more than one in seven Americans lives below the poverty line, and about 20 million of them live in extreme poverty.

12. State and local governments are experiencing huge debt problems. At the moment, the municipal bond market in the US is bursting at the seams. The following is an excerpt from a recent article that appeared on biggovernment.com:

Moody's has just announced a worrying downward trend in municipal debt credit ratings, which is declining at the fastest pace since the Lehman crash in 2008. The data shows that the municipal bond credit rating was downgraded 5.3 times more than it was upgraded.

13. Today, more Americans than ever rely on government to survive. A staggering 48.5 percent of all Americans live in families that receive public assistance through some form of social program. In 1983 this figure was below 30 percent.

14. In such an economy, young people are particularly affected. Incredibly, 37 percent of households headed by young people under 35 have net worth equal to or below zero.

15. The wealth gap between younger and older Americans continues to widen. According to the Census Bureau, the average net worth of households headed by people aged 65 and over is 47 times the net worth of families headed by people under 35.

Most citizens are unhappy with what is happening. According to a recent Fox News poll, 76 percent of Americans are "dissatisfied with the way things are going in the country." At the beginning of the year, this figure was only 61 percent.

Fuels are set on the basis of an analysis of statistical data on actual specific fuel consumption, as well as factors affecting changes in normal operating conditions. Multiple regression models are used as a mathematical apparatus.

Analysis of publications on the evaluation of the economic efficiency of new technology and their own research allowed the authors to draw a number of conclusions. First of all, the impact of individual factors on increasing the economic efficiency of production when using new equipment in oil product pipeline transport can be identified on the basis of extensive material from actual observations and analysis of statistical data. When determining indicators for assessing economic efficiency, the quantitative values of the measures should reflect the conditions in force in a given period. The standards used in the calculations should fully reflect the existing costs, with the cost of producing and using the equipment indexed for inflation.

The history of the development of mankind has shown that without statistical data it is impossible to govern the state, develop individual industries and sectors of the economy, and ensure optimal proportions between them. The need to collect and summarize a lot of data on the country's population, enterprises, banks, farms, etc. leads to the existence of special statistical services - state statistics institutions. Depending on which industry the collection, processing and analysis of statistical data is organized, there are statistics of the population, industry, agriculture, capital construction, finance, etc. All these sections of statistics are designed to develop methods for collecting and summarizing data, constructing summary indicators to reflect the processes in the relevant industry. Statistics also calculates general economic indicators - gross national product, gross domestic product, total social product, national income, etc.

The word statistics is used in several senses, primarily as a synonym for the word data. It is in this sense that we can speak of the statistics of births and deaths in Russia or the statistics of crimes. Statistics is a branch of knowledge that combines the principles and methods of working with numerical data characterizing mass phenomena. Statistics is also the name of the branch of practical activity aimed at collecting, processing and analyzing statistical data.

An analysis of the causes of the emergence and course of inflation in the Russian Federation shows their uniqueness and a significant predominance of cost-push inflation over demand-push inflation. Therefore, Western anti-inflationary theories are not very suitable for Russian conditions. A domestic, harmonious, complete theory has not yet been created, just as there are no thick Russian textbooks on the fight against inflation. Bits of much-needed knowledge are scattered across hundreds of newspapers and magazines. The task is, on the one hand, to clear up the clots of non-payments, which in some cases have already led to the paralysis of production, on the other hand, to prevent precipitous inflation. Difficult tasks, but they must be solved. Based on the analysis of statistical data for the last seven years, the study of publications of leading domestic economists, the author proposes his own solutions to problems.

The task is, on the one hand, to clean up the clots of non-payments, which in some cases have already led to paralysis, and on the other hand, to prevent precipitous inflation. It's time to start suppressing inflation in the normal way - by increasing the output of products that are in demand in every possible way. The most difficult tasks, but they must be solved if we want to survive as a world power, and not as a raw materials appendage. Based on the analysis of statistical data and familiarization with the publications of leading domestic economists, the author proposes his own solutions to problems.

Thus, in models with variable parameters, a differentiated approach is needed to establish the ranges of variation of the selection coefficients, based on the analysis of statistical data, the type of technological processes and quality indicators of flows.

Forecasting tax revenues based on macroeconomic indicators determines the strategy for generating tax revenues for the next year and the medium term, but does not solve all the problems of tax planning. Therefore, a necessary component of tax planning is the processing and analysis of statistical data on the accumulation of taxes to the budget over the past period, as well as information on possible changes in tax legislation.

It is necessary to organize a systematic collection and analysis of statistical data characterizing the dynamics by years of operation of the volume of products and work performed using the introduced equipment, as well as the cost, labor intensity and material consumption.

Along with the determination by the main selected parameter, the calculation of the need for certain types of machinery and equipment is adjusted for a number of other factors: changes in the balance of consumption of machinery and equipment by sectors of the national economy, changes in the structure of output, changes in the product range planned in rubles due to the introduction of more progressive, reliable and durable designs, changes associated with the development of specialization and cooperation that affect the total volume of output, etc.

There is a very close relationship between employment indicators and other important indicators of economic development. Thus, the relationship between unemployment and GDP change is characterized by Okun's law, discovered empirically from the analysis of statistical data for the United States (for the period from the 1950s to the 1980s) and later substantiated theoretically in macroeconomic studies. In its original form, as applied to the United States, Okun's law reads

For all positive values of x the function increases; at x = b/2 the curve has an inflection point, with accelerated growth at x < b/2 and slow growth at x > b/2. Functions of this type are used in the analysis of statistical data on consumer budgets, where a hypothesis is put forward about the existence of an asymptotic level of expenditure, about a change in the marginal propensity to consume a product, and about the existence of a threshold level of income. In this case, for x → ∞, y → e^a (Fig. 2.5).

This formula was applied to analyze statistical data,

All sales forecasts are based on the use of three types of information obtained from studying what people say, what people do, and what people have done. Obtaining the first type of information is based on the study of the opinions of consumers and buyers, sales agents and intermediaries. Methods of sociological research and expert methods are used here. Learning what people are doing involves doing market testing. Studying what people have done involves analyzing the statistics of the purchases they have made.

Let us consider the distribution of oil and gas production departments (OGPDs) by the nature of changes in production volumes: with growing, stable and declining production. As of January 1, 1972, out of 104 oil and gas production departments of the industry, 43 (or 41.4%) were growing and 61 were stable or falling. Analysis of statistical data for 1970, carried out by the authors for 76 OGPDs, made it possible to identify some common characteristics of various subgroups of OGPDs, which are given in Table 15.