
Ermolaev mathematical statistics for psychologists. Methods of mathematical statistics in psychology

Mathematical methods in psychology are used to process research data and establish patterns between the studied phenomena. Even the simplest research is not complete without mathematical data processing.

Data processing can be carried out manually or with special software. The final result may take the form of a table; mathematical methods in psychology also make it possible to display the obtained data graphically. Different assessment tools are used for different types of data (quantitative, qualitative and ordinal).

Mathematical methods in psychology include both methods that establish numerical dependencies and methods of statistical processing. Let's take a closer look at the most common of them.

In order to measure data, it is first necessary to determine the measurement scale. For this, psychology uses such mathematical methods as registration and scaling, which consist in expressing the studied phenomena in numerical terms. There are several types of scales, but only some of them are suitable for mathematical processing. This is mainly the quantitative scale, which makes it possible to measure the degree to which specific properties are expressed in the objects under study and to express the difference between them numerically. The simplest example is the measurement of the intelligence quotient. The quantitative scale makes it possible to rank data (see below). When data from a quantitative scale are ranked and grouped, they are converted to an ordinal scale (for example, a low, medium or high value of the indicator); the reverse transition is not possible.

Ranking is the arrangement of data in descending (or ascending) order of the feature being evaluated. A quantitative scale is used for this. Each value is assigned a rank (the indicator with the minimum value gets rank 1, the next value gets rank 2, and so on), after which the values can be transferred from the quantitative scale to an ordinal one. For example, suppose the measured indicator is the level of anxiety. One hundred people are tested, the results are ranked, and the researcher sees how many people have a low, average or high score. However, this way of presenting data entails a partial loss of information about each respondent.
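The ranking procedure described above can be sketched in Python. This is only an illustration: the eight scores and the cut-offs for the "low", "medium" and "high" categories are hypothetical, and tied values are not handled.

```python
# Hypothetical anxiety scores for eight respondents (illustrative data only).
scores = [12, 30, 7, 22, 18, 41, 25, 9]

# Rank 1 goes to the smallest value, rank 2 to the next, and so on.
# Ties are not handled here; tied values would normally share a rank.
order = sorted(range(len(scores)), key=lambda i: scores[i])
ranks = [0] * len(scores)
for rank, idx in enumerate(order, start=1):
    ranks[idx] = rank


def level(score, low_cut=15, high_cut=28):
    """Map a score to an ordered category; the cut-offs are assumed."""
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "medium"
    return "high"


levels = [level(s) for s in scores]
counts = {name: levels.count(name) for name in ("low", "medium", "high")}

print(ranks)   # rank of each respondent, in the original order
print(counts)  # how many respondents fall into each category
```

Once only the category counts are kept, the individual scores can no longer be recovered, which is the loss of information mentioned above.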

Correlation analysis establishes the relationship between phenomena. It measures how one indicator changes when a related indicator changes. Correlation is considered in two respects: strength and direction. It can be positive (as one indicator increases, the second also increases) or negative (as the first increases, the second decreases: for example, the higher an individual's level of anxiety, the less likely he or she is to take a leading position in the group). The relationship can be linear or, more commonly, curvilinear. The relationships that correlation analysis helps to establish may not be obvious at first glance when other methods of mathematical processing are used in psychology; this is its main merit. Its disadvantages include high labor intensity, owing to the considerable number of formulas and the careful calculations required.
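As a rough sketch of how the strength and direction of a linear relationship can be computed, the Pearson correlation coefficient is one common choice (the text does not name a specific coefficient, so this is an assumption). The anxiety and leadership scores below are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical data: anxiety level and leadership rating for five people.
anxiety = [1, 2, 3, 4, 5]
leadership = [9, 8, 6, 5, 2]

r = pearson_r(anxiety, leadership)
print(round(r, 2))  # the sign gives the direction, the magnitude the strength
```

A value close to -1 indicates a strong negative relationship, a value close to +1 a strong positive one, and a value near 0 the absence of a linear relationship.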

Factor analysis is another method; it makes it possible to predict the likely influence of various factors on the process under study. All factors are initially assumed to have equal weight, and the degree of their influence is then calculated mathematically. Such an analysis makes it possible to establish a common cause of the variability of several phenomena at once.

To display the obtained data, tabulation methods (creating tables) and graphic construction (diagrams and graphs that not only give a visual representation of the results obtained, but also allow predicting the course of the process) can be used.

The main conditions under which the above mathematical methods in psychology ensure the reliability of the study are the presence of a sufficient sample, the accuracy of measurements and the correctness of the calculations made.

O. A. SHUSHERINA

Mathematical statistics for psychologists

Tutorial

Krasnoyarsk 2012

Part 1. Descriptive statistics

Topic 1. General population. Sample. Selection

Topic 2. Variational and statistical series

Topic 3. Numerical characteristics of the sample

Part 2. Statistical estimates of the distribution parameters of the general population

Topic 1. Point estimates of the parameters of the general population

Topic 2. Interval estimates of the parameters of the general population

Part 3. Testing statistical hypotheses

Topic 1. Basic concepts of the theory of statistical decision making

Topic 2. Testing hypotheses about the difference in the level of manifestation of the studied trait (Mann-Whitney criterion)

Topic 3. Testing the hypothesis of equality of general means (independent samples)

Topic 4. Testing the hypothesis of equality of general means (dependent samples)

Part 4. Correlation analysis

Topic 1. Correlation and its statistical study

Topic 2. Significance of the sample linear correlation coefficient

Topic 3. Rank correlation coefficients and associations

Literature

Appendices. Tables


Part 1. Descriptive statistics

Topic 1. General population. Sample. Selection.

Mathematical statistics is a science that develops methods for recording, describing and analyzing observational and experimental data in order to obtain probabilistic-statistical models of the phenomena under study. Its methods are applicable to the processing of observations and experiments of any nature.

Methods of mathematical and statistical processing cause considerable difficulty for students of humanities faculties, including psychology, and as a result they fear these methods and doubt that they can master them. However, as practice shows, these fears are unfounded.

In modern psychology, in the practical work of a psychologist at any level, conclusions drawn without the apparatus of mathematical statistics may be perceived as subjective to some degree.

1. Problems of mathematical statistics

The main purpose of mathematical statistics is to obtain and process data that provide statistically significant support for decision making, for example, when solving problems of planning, management and forecasting.

The task of mathematical statistics is the study of mass phenomena in society, nature, technology by the methods of probability theory and their scientific substantiation.

In probability theory, knowing the nature of a certain phenomenon, we find out how the characteristics we study, which can be observed in experiments, will behave.

In mathematical statistics, on the contrary, the starting point is experimental data (observations of random variables), and it is required to make a judgment about the nature of the phenomenon under study.

The main tasks of mathematical statistics are:

§ Estimation of numerical characteristics or distribution parameters of a random variable according to experimental data.

§ Testing statistical hypotheses about the properties of the random phenomenon under study.

§ Determining the empirical relationship between variables describing a random phenomenon based on experimental data.

Consider the typical research scheme used in solving these problems. Such studies fall naturally into two parts.

Part 1. First, through observations and experiments, the statistical data that make up the sample are collected and recorded; these are numbers, also called sample data. Then they are ordered and presented in a compact, visual or functional form. Various kinds of average values characterizing the sample are calculated. The part of mathematical statistics that covers this work is called descriptive statistics.

Part 2. The second part of the researcher's work is to obtain, on the basis of the information found about the sample, sufficiently substantiated conclusions about the properties of the random phenomenon under study. This part of the work is carried out using statistical methods that make up inferential statistics.

2. Sampling research method

Psychological research is a type of activity that requires high professional competence and often a lot of time to work with each subject. The sampling research method comes to the rescue: a limited number of objects are randomly selected from the entire population and studied.

The general population is a set of objects (any group of people) that a psychologist studies on the basis of a sample. In theory, the size of the general population is considered unlimited. In practice, it is considered limited, depending on the object of observation and the problem being solved.

From the entire set of people, which is called the general population, a limited number of people (subjects, respondents) are randomly selected. A set of objects randomly selected for study is called the sample population, or simply the sample.

The sample size is the number of people in the sample; it is denoted by the letter n. It can vary but cannot be less than two respondents. In statistics, samples are distinguished by size:

small sample ();

average sample ();

large sample ().

The process of forming a sample is called selection.

Selection can be carried out in the following ways:

1) after a subject is selected and studied, he or she is "returned" to the general population; such a sample is called repeated (sampling with replacement). A psychologist often has to test the same subjects several times using the same technique, but each time the subjects will differ because of the functional and age-related variability inherent in every person;

2) after a subject is selected and studied, he or she is not returned to the general population; such a sample is called non-repeated (sampling without replacement).
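The two ways of selecting subjects can be sketched with Python's standard random module. The population of 100 numbered subjects, the sample size of 10 and the fixed seed are assumptions made only for this illustration.

```python
import random

rng = random.Random(0)            # fixed seed so the sketch is reproducible
population = list(range(1, 101))  # hypothetical population of 100 subject IDs

# Repeated sample: a selected subject is "returned" to the population
# and may therefore be drawn again.
repeated = rng.choices(population, k=10)

# Non-repeated sample: a selected subject is not returned,
# so each subject can appear at most once.
non_repeated = rng.sample(population, k=10)

print(repeated)
print(non_repeated)
```

In both cases every subject has the same probability of being drawn at each step, which is the condition for random selection.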

A sample must meet requirements determined by the goals and objectives of the study.

1. The sample must be representative, that is, it must reproduce the main features of the general population correctly, in the same proportions and with the same frequencies. The sample will be representative if it is formed at random: each subject is selected randomly from the general population, and all objects have the same probability of being included in the sample. A representative sample is a smaller but accurate model of the population.

In scientific research, a part (a separate sample) can never fully characterize the whole (the general population). The errors that arise when the results obtained from a separate sample are generalized to the entire population are called representativeness errors.

2. The sample must be homogeneous, i.e., each subject must have the characteristics that serve as criteria for the study: age, gender, education, and so on. The conditions under which the experiments are conducted should not change, and the sample should be drawn from a single general population.

Samples are called independent (unrelated) if the experimental procedure and the results of measuring a certain property in the subjects of one sample do not affect the course of the same experiment or the results of measuring the same property in the subjects of another sample.

Samples are called dependent (related) if the experimental procedure and the results of measuring a certain property in one sample affect the results of measuring the same property in another experiment. Note that the same group of subjects examined twice (even if different psychological qualities, signs or characteristics are measured) is considered a dependent, or related, sample.

The main stage in a psychologist's work with a sample is obtaining the results of statistical analysis and generalizing the findings to the entire population.

Selecting the most appropriate sample size depends on:

1) the degree of homogeneity of the phenomenon under study (the more homogeneous the phenomenon, the smaller the sample size can be);

2) statistical methods used by the psychologist. Some methods require a large number of subjects (more than 100 people), others allow a small number (5-7 people).

Stages of statistical research

1. Collection of empirical data: the sampling research method.

2. Primary processing of observation results: variation series, empirical distribution, frequency polygon, frequency histogram.

3. Mathematical processing of statistical data: estimation of distribution parameters; methods of correlation, factor and regression analysis.

Test questions

1. What are the main tasks of mathematical statistics?

2. What is called the general and sample populations for the random variable under study?

3. What is the essence of the sampling method?

4. What sample is called representative, homogeneous?

1. Tables of grouped data

Processing of the experimental material begins with systematizing and grouping the results by some attribute.

Tables. The main content of a table should be reflected in its title.

A simple table is a list of individual test units with a quantitative or qualitative characteristic. Grouping by a single attribute (for example, by gender) is used.

A complex table is used to clarify cause-and-effect relationships between attributes; it makes it possible to identify a trend and to detect different aspects of the relationships between attributes.

No. of subjects

Points received for the task

2. Discrete statistical series

A sequence of data arranged in the order in which they were obtained in the experiment is called a statistical series.

The results of observations, which in general form an unordered series of numbers, must be ordered (ranked). Ranking can be done in ascending or descending order. After ranking, the experimental data can be grouped so that in each group the attribute takes the same value, which is called a variant (denoted $x_i$).

The number of elements in each group is called the frequency of the variant ($n_i$). The frequency shows how many times a given value occurs in the original set. The sum of all frequencies is equal to the sample size: $\sum_i n_i = n$.

An ordered distribution series in which the frequency of each variant in a given set is indicated is called a variation series.
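A variation series can be built from raw observations in a few lines of Python. The ten scores below are hypothetical; the sketch ranks them, groups equal values into variants and counts the frequency of each variant.

```python
from collections import Counter

# Hypothetical statistical series: scores in the order they were obtained.
raw = [3, 5, 4, 5, 2, 4, 5, 3, 4, 5]

ranked = sorted(raw)                  # ranking in ascending order
frequencies = Counter(ranked)         # variant -> frequency
variation_series = sorted(frequencies.items())

for variant, frequency in variation_series:
    print(variant, frequency)

# The frequencies must add up to the sample size.
total = sum(frequency for _, frequency in variation_series)
print(total == len(raw))
```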

Variants (characteristic values)

Statistical calculations for psychology papers can be done manually. The corresponding formulas and calculation algorithms are easy to find in textbooks or on the Internet. However, for a psychology student, statistics is not an end in itself but only a tool for analysis, for discovering new patterns and new psychological knowledge. Evidently for this reason, most modern psychology departments and faculties allow statistical calculations to be carried out with special statistical programs.

The best-known and most widely used computer programs for calculating statistical criteria in a term paper, diploma thesis or master's thesis in psychology are:

  • Microsoft Excel spreadsheets.
  • Statistical package STATISTICA.
  • SPSS program.

Statistical calculations with Excel spreadsheets

Excel spreadsheet is a program that allows you to perform various operations on tabular data. Its field is a regular table in which you can enter a table of initial data obtained after testing subjects using psychodiagnostic methods.

Each row in this table corresponds to a subject, and each column corresponds to an indicator on a scale of the psychological test. In Excel tables, statistical calculations can be performed both by columns and by rows.

In Excel, you can also build graphs that reflect the severity of psychological indicators in groups, and then transfer them to the text of the thesis, designed in the Word program.

Calculations of statistical criteria using statistical packages STATISTICA and SPSS

STATISTICA and SPSS programs are designed for statistical data processing and are used in various sciences. In psychology, these programs allow processing the results of empirical research when writing term papers, theses and master's theses.

The main field of the STATISTICA and SPSS packages is a table where you need to enter the test results of the subjects (table of initial data).

Then, using the options of the top menu, you can perform various calculations on the data columns. In the STATISTICA and SPSS programs, you can calculate the entire range of statistical criteria needed when writing a diploma thesis in psychology, from descriptive statistics to factor analysis.

Which program for statistical calculations to choose

Psychology students who start statistical processing of test results often face the question: “What calculation program should I use?”. Many people are very worried about this, because it seems to them that the “wrong choice” of the program will distort the results, lead to errors, etc.

It is important to understand that all statistical data analysis programs work according to the same, even identical algorithms. They are programmed with the same mathematical formulas. Therefore, to say that the choice of a statistical data analysis program in a psychology degree can affect the result is like thinking that the calculation of arithmetic expressions depends on the choice of brand of calculator.

According to the rules, tables with data directly from the statistical program cannot be entered into the text of a thesis in psychology. The tables produced by the statistical program often contain additional parameters that are not needed.

Therefore, you need to copy the calculation results from the statistical program and paste them into tables created by the Word program. That is, only figures remain in the term paper or thesis, reflecting the degree of statistical significance of relationships or differences between psychological indicators. Thus, from the point of view of the final result, it is completely indifferent with the help of which statistical program the calculations in the diploma in psychology were carried out.

However, in some universities students are specifically taught to work in a particular statistical program. They may then be required to present the results of the calculation exactly in the form produced by that program. In this case, these tables are placed in the appendix, and the text of the work itself presents the data in Word-format tables.


The word "statistics" is often associated with the word "mathematics", and this intimidates students who associate this concept with complex formulas that require a high level of abstraction.

However, as McConnell says, statistics is primarily a way of thinking, and all you need in order to use it is a little common sense and a knowledge of the basics of mathematics. In our daily life, without realizing it, we are constantly engaged in statistics. Whether we want to plan a budget, calculate a car's gasoline consumption, estimate the effort required to master a certain course given the marks obtained so far, predict the likelihood of good or bad weather from a weather report, or estimate how a particular event will affect our personal or collective future, we constantly have to select, classify and organize information and connect it with other data so that we can draw conclusions that allow us to make the right decision.

All these activities differ little from the operations that underlie scientific research: summarizing the data obtained for various groups of objects in a particular experiment, comparing them in order to find the differences between them, correlating them in order to identify indicators that change in the same direction and, finally, predicting certain facts on the basis of the conclusions to which the results lead. This is precisely the purpose of statistics in the sciences in general, and especially in the humanities. There is nothing absolutely certain in the latter, and without statistics the conclusions would in most cases be purely intuitive and could not form a solid basis for interpreting the data obtained in other studies.

In order to appreciate the enormous benefits that statistics can provide, we will try to follow the progress of deciphering and processing the data obtained in the experiment. Thus, based on the specific results and the questions that they pose to the researcher, we will be able to understand the various methods and simple ways to apply them. However, before embarking on this work, it will be useful for us to consider in the most general terms the three main branches of statistics.

1. Descriptive statistics, as the name suggests, allows you to describe, summarize and reproduce in the form of tables or graphs the data of a given distribution, and to calculate the mean of that distribution, its range and its variance.

2. The task of inductive statistics is to check whether the results obtained on a given sample can be extended to the entire population from which the sample was taken. In other words, the rules of this branch of statistics make it possible to find out to what extent a regularity discovered in a limited group of objects during an observation or experiment can be generalized, by induction, to a larger number of objects. Thus, with the help of inductive statistics, conclusions and generalizations are made on the basis of the data obtained from the sample.

3. Finally, measuring correlation allows us to know how closely two variables are related, so that we can predict the possible values of one of them if we know the other.

There are two types of statistical methods, or tests, for making generalizations or calculating the degree of correlation. The first and most widely used type is parametric methods, which use parameters such as the mean or variance of the data. The second type is nonparametric methods, which are invaluable when the researcher is dealing with very small samples or with qualitative data; these methods are very simple in terms of both calculation and application. As we become familiar with the various ways of describing data and move on to their statistical analysis, we will look at both types.

As already mentioned, in order to try to understand these various areas of statistics, we will try to answer the questions that arise in connection with the results of a particular study. As an example, we will take one experiment, namely, the study of the effect of marijuana consumption on oculomotor coordination and reaction time. The methodology used in this hypothetical experiment, as well as the results we could get from it, are presented below.

If you wish, you can replace some specific details of this experiment with others (for example, marijuana use with alcohol consumption or sleep deprivation) or, even better, replace these hypothetical data with data that you actually obtained in your own research. In any case, you will have to accept the "rules of our game" and perform the calculations that are required of you here; only under this condition will the essence of the subject "reach" you, if it has not already done so.

Important note. In the sections on descriptive and inductive statistics, we will consider only those experimental data that are relevant to the dependent variable “targets hit”. As for such an indicator as the reaction time, we will turn to it only in the section on calculating the correlation. However, it goes without saying that from the very beginning, the values ​​of this indicator should be treated in the same way as the variable “targets hit”. We leave it to the reader to do this on their own with pencil and paper.

Some basic concepts. Population and sample

One of the tasks of statistics is to analyze data obtained from a part of a population in order to draw conclusions about the population as a whole.

A population in statistics does not necessarily mean a group of people or a natural community; the term refers to all the beings or objects that make up the overall set under study, whether they are atoms or students visiting a particular cafe.

A sample is a small number of elements selected using scientific methods so that it is representative, i.e. reflects the population as a whole.

(In the domestic literature, the terms “general population” and “sample population”, respectively, are more common. - Note. transl.)

Data and its varieties

Data in statistics are the basic elements to be analyzed. Data can be any quantitative results, properties inherent in certain members of the population, or a place in a particular sequence: in general, any information that can be classified or categorized for the purpose of processing.

"Data" should not be confused with the "values" that data can take. In order to always distinguish between them, Chatillon (1977) recommends remembering the following phrase: “Data often takes on the same values” (so if we take, for example, six data - 8, 13, 10, 8, 10 and 5, they take only four different values ​​- 5, 8, 10 and 13).
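The distinction between data and the values they take can be checked directly on the six data from the example above; this is only a small illustrative sketch.

```python
data = [8, 13, 10, 8, 10, 5]     # six data, as in the example above

values = sorted(set(data))       # the distinct values the data take

print(len(data), "data")         # number of observations
print(len(values), "values:", values)
```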

Constructing a distribution is the division of the primary data obtained from a sample into classes or categories in order to obtain a generalized, ordered picture that allows the data to be analyzed.

There are three types of data:

1. Quantitative data, obtained from measurements (for example, data on weight, dimensions, temperature, time, test results, etc.). They can be distributed on a scale with equal intervals.

2. Ordinal data, corresponding to the places of these elements in the sequence obtained by placing them in ascending order (1st, ..., 7th, ..., 100th, ...; A, B, C. ...) .

3. Qualitative data, representing some properties of the elements of the sample or population. They cannot be measured, and their only quantitative assessment is the frequency of occurrence (the number of persons with blue or green eyes, smokers and non-smokers, tired and rested, strong and weak, etc.).

Of all these types of data, only quantitative data can be analyzed using methods based on parameters (such as the arithmetic mean). But even for quantitative data, such methods can be applied only if there are enough data to show a normal distribution. So, in principle, three conditions are necessary for the use of parametric methods: the data must be quantitative, their number must be sufficient, and their distribution must be normal. In all other cases, it is recommended to use nonparametric methods.

As is well known, the relationship between psychology and mathematics has become ever closer and more complex in recent years. Current practice shows that a psychologist should not only be able to use the methods of mathematical statistics but also view the subject of his or her science from the standpoint of the "queen of sciences"; otherwise he or she will merely administer tests that produce ready-made results, without understanding them.

Mathematical methods is the general name for a complex of mathematical disciplines brought together to study social and psychological systems and processes.

Basic mathematical methods recommended for teaching to psychology students:
Methods of mathematical statistics. These include correlation analysis, one-way analysis of variance, two-way analysis of variance, regression analysis and factor analysis.
Mathematical modeling.
Methods of information theory.
The systems method.

Psychological measurements

The application of mathematical methods and models in any science rests on measurement. In psychology, the objects of measurement are properties of the psyche as a system or of its subsystems, such as perception, memory, personality orientation, abilities, and so on.
Measurement is the assignment of numerical values to objects, reflecting the degree to which a given object possesses a property.

Let us name the three most important properties of psychological measurements.
1. The existence of a family of scales that allow different groups of transformations.
2. The strong influence of the measurement procedure on the value of the measured quantity.
3. The multidimensionality of the measured psychological quantities, i.e. their essential dependence on a large number of parameters.

STATISTICAL ANALYSIS OF EXPERIMENTAL DATA

Questions:
1. Methods of primary statistical processing of the results of the experiment

2. Methods of secondary statistical processing of the results of the experiment

METHODS OF PRIMARY STATISTICAL PROCESSING OF EXPERIMENTAL RESULTS

Methods of statistical processing of experimental results are the mathematical techniques, formulas and methods of quantitative calculation by means of which the indicators obtained during an experiment can be generalized, systematized, and examined for the patterns hidden in them.

Some methods of mathematical and statistical analysis make it possible to calculate the so-called elementary mathematical statistics that characterize the sample distribution of the data, for example:
*sample mean,
*sample variance,
*mode,
*median, and a number of others.


Other methods of mathematical statistics, for example:
analysis of variance,
regression analysis,
make it possible to judge the dynamics of change in individual sample statistics.


With the help of the third group of methods:
correlation analysis,
factor analysis,
methods for comparing sample data,
one can reliably judge the statistical relationships that exist between the variables investigated in a given experiment.


All methods of mathematical and statistical analysis are conventionally divided into primary and secondary.
Primary methods are those that yield indicators directly reflecting the results of the measurements made in the experiment.
Secondary methods of statistical processing are those that reveal, on the basis of the primary data, the statistical patterns hidden in them.

13. Methods for calculating elementary mathematical statistics

The sample mean, as a statistic, is the average assessment of the psychological quality studied in the experiment.
The sample mean is determined by the following formula, where n is the sample size and x_k are the individual values:

$$\bar{x} = \frac{1}{n}\sum_{k=1}^{n} x_k$$


Example. Suppose that, after applying psychodiagnostic methods to assess some psychological property in ten subjects, we obtained the following individual indicators of the development of this property:
x1=5, x2=4, x3=5, x4=6, x5=7, x6=3, x7=6, x8=2, x9=8, x10=4.

$$\bar{x} = \frac{1}{10}\sum_{k=1}^{10} x_k = \frac{50}{10} = 5.0$$
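The same calculation can be reproduced in Python as a quick check; the list below contains the ten indicators from the example.

```python
# The ten individual indicators from the example above.
x = [5, 4, 5, 6, 7, 3, 6, 2, 8, 4]

n = len(x)            # sample size
total = sum(x)        # sum of all values
mean = total / n      # sample mean: (1/n) * sum of x_k

print(n, total, mean)
```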


Variance, as a statistic, characterizes how far individual values deviate from the mean in a given sample.
The greater the variance, the greater the deviations, or scatter, of the data.

$$S^2 = \frac{1}{n}\sum_{k=1}^{n}(x_k - \bar{x})^2$$

16. STANDARD DEVIATION

Sometimes, instead of the variance, a quantity derived from it, called the standard deviation, is used to describe the scatter of individual values around the mean. It equals the square root of the variance and is denoted by the same symbol as the variance, only without the square:

$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{k=1}^{n}(x_k - \bar{x})^2}{n}}$$
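As an illustration, the variance and the standard deviation can be computed for the ten indicators from the sample mean example. The sketch divides by n, as the formulas here do; many textbooks divide by n - 1 when estimating the variance of a population from a sample, which gives a slightly larger value.

```python
import math

# The ten individual indicators from the sample mean example.
x = [5, 4, 5, 6, 7, 3, 6, 2, 8, 4]

n = len(x)
mean = sum(x) / n

# Variance: mean squared deviation from the sample mean (divisor n).
variance = sum((xk - mean) ** 2 for xk in x) / n

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(variance, round(std_dev, 3))
```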

17. MEDIAN

The median is the value of the studied attribute that divides the sample, ordered by the magnitude of this attribute, in half. In the ordered series, the same number of values lies to the left and to the right of the median.
For example, for the sample 2, 3, 4, 4, 5, 6, 7, 8, 9 the median is 5, since four values lie to its left and four to its right.
If the series contains an even number of values, the median is taken as half the sum of the two central values of the series. For the series 0, 1, 1, 2, 3, 4, 5, 5, 6, 7 the median is 3.5.
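The rule for odd and even series can be written as a short Python function. The first sample is read here as the nine values 2, 3, 4, 4, 5, 6, 7, 8, 9, which is an assumption about the intended data; the second series is taken as given.

```python
def median(values):
    """Median of a list: middle value, or half the sum of the two central values."""
    ordered = sorted(values)      # the series must be ordered first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                # odd number of values: the middle one
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even: half the sum


odd_series = [2, 3, 4, 4, 5, 6, 8, 7, 9]
even_series = [0, 1, 1, 2, 3, 4, 5, 5, 6, 7]

print(median(odd_series))   # 5
print(median(even_series))  # 3.5
```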

18. MODE

The mode is the quantitative value of the attribute under study that occurs most frequently in the sample.
For example, in the sequence of values 1, 2, 5, 2, 4, 2, 6, 7, 2 the mode is 2, since it occurs more often than the other values: four times.
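A minimal Python sketch for finding the mode of the sequence from the example, using a frequency count. If several values were equally frequent, this sketch would report only one of them.

```python
from collections import Counter

values = [1, 2, 5, 2, 4, 2, 6, 7, 2]

# most_common(1) returns the value with the highest frequency and its count.
mode, frequency = Counter(values).most_common(1)[0]

print(mode, frequency)  # 2 occurs four times
```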

19. INTERVAL

An interval is a group of values, ordered by the magnitude of the attribute, that is replaced by its mean in the course of calculations.
Example. Consider the following series of individual values: 0, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 11, 11, 11. This series contains 30 values.
Let us divide the series into six subgroups of five values each and calculate the mean for each of the six subgroups. The means are, respectively, 1.2; 3.4; 5.2; 6.8; 8.6; 10.6.
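The grouping into intervals can be checked with a short Python sketch that splits the ordered series from the example into consecutive subgroups of five values and replaces each subgroup by its mean.

```python
series = [0, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 5, 6,
          6, 6, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 11, 11, 11]

size = 5  # number of values per subgroup, as in the example

# Consecutive subgroups of five values each.
groups = [series[i:i + size] for i in range(0, len(series), size)]

# Each subgroup is replaced by its mean.
means = [sum(group) / len(group) for group in groups]

print(len(groups))
print(means)
```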

20. Control task

For the following series, calculate the mean,
mode, median and standard deviation:
1) {3, 4, 5, 4, 4, 4, 6, 2}
2) {10, 40, 30, 30, 30, 50, 60, 20}
3) {15, 15, 15, 15, 10, 10, 20, 5, 15}.

21. METHODS FOR SECONDARY STATISTICAL PROCESSING OF EXPERIMENTAL RESULTS

Secondary methods of statistical processing of experimental data are used to directly test, confirm or refute hypotheses related to the experiment.
These methods tend to be more complex than the methods of primary statistical processing and require the researcher to have good training in elementary mathematics and statistics.


Regression calculus is a method of mathematical statistics that makes it possible to reduce individual, scattered data to a line graph that approximately reflects their interrelationship, and thus to estimate the probable value of one variable from the value of another.
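One common way to obtain such a line is ordinary least squares; the text does not name a specific fitting method, so this choice is an assumption. The sketch below fits a straight line to hypothetical data and uses it to estimate the value of one variable from the other.

```python
def fit_line(x, y):
    """Least-squares straight line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired measurements of two variables.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

slope, intercept = fit_line(x, y)

# Estimate the probable value of y for a new value of x.
predicted = slope * 6 + intercept
print(slope, intercept, predicted)
```

With real data the points rarely lie exactly on a line, so the prediction is only approximate, as the definition above says.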