
Population and sample. The sampling method

Population (general population) - the set of all objects (units) about which a researcher intends to draw conclusions when studying a specific problem.

The population consists of all objects that are subject to study. Its composition depends on the objectives of the study. Sometimes the general population is the entire population of a certain region (for example, when studying the attitude of potential voters towards a candidate), but most often several criteria are specified that define the object of the study, for example, men 30-50 years old who use a certain brand of razor at least once a week and have an income of at least $100 per family member.

Sample, or sample population - a set of cases (subjects, objects, events, specimens) selected from the general population by a certain procedure to participate in the study.

Sample characteristics:

· Qualitative characteristics of the sample - who exactly we choose and what sampling methods we use for this.

· Quantitative characteristics of the sample - how many cases we select, in other words, sample size.

Necessity of sampling

· The object of study is very extensive, for example, the consumers of a global company's products, spread across a great number of geographically dispersed markets.

· There is a need to collect primary information.

Sample size

Sample size - the number of cases included in the sample population. For statistical reasons, it is recommended that the number of cases be at least 30 to 35.

Dependent and independent samples

When two (or more) samples are compared, an important parameter is their dependence. If a one-to-one correspondence (a homomorphic pair) can be established between the samples, that is, each case in sample X corresponds to one and only one case in sample Y and vice versa, and this basis for pairing is relevant to the trait being measured, such samples are called dependent. Examples of dependent samples:

· pairs of twins,

· two measurements of any characteristic before and after experimental influence,

· husbands and wives

· and so on.

If there is no such relationship between the samples, they are considered independent, for example:

· men and women,

· psychologists and mathematicians.

Accordingly, dependent samples always have the same size, while the size of independent samples may differ.

Samples are compared using various statistical tests:

· Student's t-test

· Wilcoxon test

· Mann-Whitney U test

· Sign test

· and so on.
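The practical difference between dependent and independent samples shows up in how a test statistic is computed. The sketch below uses only the Python standard library; the function names and the scores are hypothetical illustrations, not taken from the source. It computes a paired t statistic (on pairwise differences) and Welch's t statistic (for independent samples) on the same numbers.

```python
import math
import statistics


def paired_t(x, y):
    """t statistic for dependent samples: a one-sample t-test on the pairwise differences."""
    if len(x) != len(y):
        raise ValueError("dependent samples must have the same size")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


def welch_t(x, y):
    """t statistic for independent samples (Welch's form; the sizes may differ)."""
    vx, vy = statistics.variance(x), statistics.variance(y)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(vx / len(x) + vy / len(y))


# Hypothetical scores measured before and after an intervention on the same five people.
before = [12, 15, 11, 14, 13]
after = [14, 17, 12, 17, 15]
print(round(paired_t(after, before), 3))
print(round(welch_t(after, before), 3))
```

On these numbers the paired statistic is much larger than the independent one, because pairing removes the variation between subjects. Treating dependent samples as independent would therefore discard information.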

Representativeness

The sample may be considered representative or non-representative.

Example of a non-representative sample

In the USA, one of the most famous historical examples of a non-representative sample is believed to have occurred during the 1936 presidential election. The Literary Digest, which had successfully predicted the outcomes of several previous elections, got its prediction wrong after sending out ten million test ballots to its subscribers, as well as to people selected from telephone books nationwide and from car registration lists. In the 25% of ballots that were returned (almost 2.5 million), the votes were distributed as follows:

· 57% preferred Republican candidate Alf Landon

· 40% chose the incumbent Democratic president, Franklin Roosevelt

In the actual election, as is known, Roosevelt won with more than 60% of the vote. The Literary Digest's mistake was this: wanting to increase the representativeness of the sample (they knew that most of their subscribers considered themselves Republicans), they expanded the sample to include people selected from telephone books and registration lists. However, they did not take into account the realities of their time and in fact recruited even more Republicans: during the Great Depression, it was mainly members of the middle and upper classes who could afford telephones and cars (that is, mostly Republicans rather than Democrats).

Types of plan for constructing groups from samples

There are several main types of group building plans:

1. A study with experimental and control groups, which are placed in different conditions.

2. A study with experimental and control groups using a pairwise selection strategy.

3. A study using only one group - an experimental group.

4. A study using a mixed (factorial) design - all groups are placed in different conditions.

Sampling types

Samples are divided into two types:

· probabilistic

· non-probabilistic

Probability samples

1. Simple probability sampling:

· Simple sampling with replacement. This method rests on the assumption that each respondent is equally likely to be included in the sample. Cards with respondent numbers are made up from the list of the general population. They are put in a deck and shuffled, a card is drawn at random, its number is written down, and the card is returned to the deck. The procedure is repeated as many times as the required sample size. Disadvantage: the same unit may be selected more than once.

The procedure for constructing a simple random sample includes the following steps:

1. obtain a complete list of the members of the population and number it. Such a list, recall, is called a sampling frame;

2. determine the expected sample size, that is, the expected number of respondents;

3. draw from a table of random numbers as many numbers as there are sample units required. If there should be 100 people in the sample, 100 random numbers are taken from the table. These random numbers can also be generated by a computer program;

4. select from the base list those observations whose numbers correspond to the random numbers written down.
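The steps above can be sketched in a few lines of Python. This is a minimal illustration using the standard `random` module; the function name `simple_random_sample`, the frame of 1,000 numbered respondents and the seed are assumptions made for the example.

```python
import random


def simple_random_sample(frame, n, replace=False, seed=None):
    """Draw n units from a sampling frame (a numbered list of the population)."""
    rng = random.Random(seed)
    if replace:
        # Sampling with replacement: a unit may be drawn more than once.
        return [rng.choice(frame) for _ in range(n)]
    # Sampling without replacement: every unit appears at most once.
    return rng.sample(frame, n)


# Hypothetical frame: respondents numbered 1..1000.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 100, seed=42)
print(len(sample), len(set(sample)))  # 100 100
```

Passing `replace=True` reproduces the card-deck procedure in which each card is returned before the next draw, so repeated units become possible.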

Simple random sampling has obvious advantages. The method is extremely easy to understand. The results of the study can be generalized to the population being studied. Most approaches to statistical inference assume that the data were collected by simple random sampling. However, the simple random sampling method has at least four significant limitations:

1. It is often difficult to create a sampling frame that would allow simple random sampling.

2. A simple random sample can turn out to be very large or spread over a large geographical area, which significantly increases the time and cost of data collection.

3. The results of simple random sampling are often less precise, with a larger standard error, than the results of other probabilistic methods.

4. Simple random sampling (SRS) may produce a non-representative sample. Although samples obtained by simple random sampling represent the population adequately on average, some of them misrepresent it badly. This is especially likely when the sample size is small.

· Simple non-repetitive sampling. The sampling procedure is the same, only the cards with respondent numbers are not returned to the deck.

2. Systematic probability sampling. This is a simplified version of simple probability sampling. Respondents are selected from the list of the general population at a fixed interval (K). The value of K is determined randomly. The most reliable result is achieved with a homogeneous population; otherwise the step size may coincide with some internal cyclic pattern in the list and bias the sample. Disadvantages: the same as for a simple probability sample.

3. Serial (cluster) sampling. The sampling units are statistical series (a family, a school, a team, etc.). The selected units are examined in full. The units can be selected by random or systematic sampling. Disadvantage: the sample may be more homogeneous than the general population.

4. Regional (stratified) sampling. If the population is heterogeneous, it is recommended to divide it into homogeneous parts before applying probability sampling with any selection technique; such a sample is called regionalized. The groups can be natural formations (for example, city districts) or can be defined by any characteristic underlying the study. The characteristic on which the division is based is called the stratification (zoning) characteristic.

5. "Convenience" sample. The "convenience" sampling procedure consists of establishing contacts with "convenient" sampling units: a group of students, a sports team, friends and neighbors. If information is needed about people's reactions to a new concept, such a sample is quite justified. Convenience sampling is often used to pretest questionnaires.
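Systematic and serial (cluster) selection from the list above can be sketched as follows. The function names and example data are hypothetical. The systematic variant fixes the interval as K = N // n and randomizes only the starting point, which is one common convention and an assumption of this sketch rather than something the text specifies.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Take every K-th unit of the frame, starting from a random offset."""
    if not 0 < n <= len(frame):
        raise ValueError("sample size must be between 1 and the frame size")
    rng = random.Random(seed)
    k = len(frame) // n          # sampling interval K (assumed: N // n)
    start = rng.randrange(k)     # random start within the first interval
    return frame[start::k][:n]


def cluster_sample(clusters, m, seed=None):
    """Choose m whole clusters at random and examine every unit in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)
    return [unit for name in chosen for unit in clusters[name]]


# Hypothetical data: a numbered frame and three "series" (schools) of pupils.
frame = list(range(1, 1001))
schools = {"A": [1, 2, 3], "B": [4, 5], "C": [6, 7, 8, 9]}
print(systematic_sample(frame, 100, seed=1)[:3])
print(cluster_sample(schools, 2, seed=1))
```

If the frame has a periodic structure whose period matches K, the systematic sample will over-represent one phase of the cycle, which is the risk noted for non-homogeneous populations.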

Non-probability samples

Selection in such a sample is carried out not according to the principles of randomness, but according to subjective criteria - availability, typicality, equal representation, etc.

1. Quota sampling - the sample is constructed as a model that reproduces the structure of the general population in the form of quotas (proportions) of the characteristics being studied. The number of sample elements with each combination of the studied characteristics is chosen to match its share (proportion) in the general population. For example, if the general population consists of 5,000 people, of whom 2,000 are women and 3,000 are men, then the quota sample will contain 20 women and 30 men, or 200 women and 300 men. Quota samples are most often based on demographic criteria: gender, age, region, income, education, and others. Disadvantages: such samples are usually not representative, because it is impossible to take several social parameters into account at once. Advantages: the material is readily available.

2. Snowball method. The sample is constructed as follows. Each respondent, starting with the first, is asked for the contact information of friends, colleagues and acquaintances who fit the selection conditions and could take part in the study. Thus, with the exception of the first step, the sample is formed with the participation of the research subjects themselves. The method is often used when it is necessary to find and interview hard-to-reach groups of respondents (for example, respondents with a high income, respondents belonging to the same professional group, respondents with similar hobbies or interests, etc.).

3. Spontaneous sampling – sampling of the so-called “first person you come across”. Often used in television and radio polls. The size and composition of spontaneous samples is not known in advance, and is determined only by one parameter - the activity of respondents. Disadvantages: it is impossible to establish which population the respondents represent, and as a result, it is impossible to determine representativeness.

4. Route survey - often used when the unit of study is the family. On the map of the settlement in which the survey will be carried out, all streets are numbered. Large numbers are selected using a table (or generator) of random numbers. Each large number is treated as consisting of three components: the street number (the first 2-3 digits), the house number and the apartment number. For example, in the number 14832, 14 is the street number on the map, 8 is the house number and 32 is the apartment number.

5. Regional sampling with selection of typical objects. If, after zoning, a typical object is selected from each group, i.e. an object that is close to the average in terms of most of the characteristics studied in the study, such a sample is called regionalized with the selection of typical objects.
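The quota arithmetic from the example above (5,000 people, 2,000 women and 3,000 men) can be checked with a short sketch. The function name `quota_allocation` is hypothetical. Quotas are rounded to whole respondents, so for awkward proportions they may not sum exactly to the requested sample size.

```python
def quota_allocation(population_counts, sample_size):
    """Split a sample across groups in proportion to their share of the population."""
    total = sum(population_counts.values())
    return {group: round(sample_size * count / total)
            for group, count in population_counts.items()}


population = {"women": 2000, "men": 3000}
print(quota_allocation(population, 50))   # {'women': 20, 'men': 30}
print(quota_allocation(population, 500))  # {'women': 200, 'men': 300}
```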

Group Building Strategies

Groups are selected for participation in a psychological experiment by means of various strategies, which are needed to ensure that internal and external validity are maintained to the greatest possible extent.

· Randomization (random selection)

· Pairwise selection

· Stratometric selection

· Approximate Modeling

· Attracting real groups

Randomization, or random selection, is used to create simple random samples. The use of such a sample is based on the assumption that each member of the population is equally likely to be included in the sample. For example, to make a random sample of 100 university students, you can put pieces of paper with the names of all university students in a hat, and then take 100 pieces of paper out of it - this will be a random selection (Goodwin J., p. 147).

Pairwise selection - a strategy for constructing sample groups in which the groups are made up of subjects who are equivalent in terms of secondary parameters that are significant for the experiment. This strategy is effective for experiments with experimental and control groups; the best option is to involve twin pairs (mono- and dizygotic), as it allows you to create...

Stratometric selection - randomization with the allocation of strata (or clusters). With this method of forming a sample, the general population is divided into groups (strata) with certain characteristics (gender, age, political preferences, education, income level, etc.), and subjects with the corresponding characteristics are selected from each.
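A minimal sketch of stratometric (stratified) selection follows. It assumes proportional allocation, that is, the same sampling fraction in every stratum; the text does not fix the allocation rule, and the function name and population are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)   # units to take from this stratum
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: (id, gender) pairs, 2,000 women and 3,000 men.
people = [(i, "F") for i in range(2000)] + [(i, "M") for i in range(2000, 5000)]
chosen = stratified_sample(people, stratum_of=lambda p: p[1], fraction=0.01, seed=0)
print(sum(1 for p in chosen if p[1] == "F"), sum(1 for p in chosen if p[1] == "M"))  # 20 30
```

Unlike quota sampling, the units inside each stratum are still chosen at random, so the result remains a probability sample.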

Approximate modeling - drawing limited samples and generalizing conclusions about such a sample to a wider population. For example, when 2nd year university students take part in a study, its data are extended to "people aged 17 to 21 years". The admissibility of such generalizations is extremely limited.

Approximate modeling is the formation of a model that, for a clearly defined class of systems (processes), describes its behavior (or desired phenomena) with acceptable accuracy.

A set of homogeneous objects is often studied with respect to some quantitative or qualitative characteristic of those objects.

For example, for a batch of parts, the quantitative characteristic may be the size of the part as specified by GOST, and the qualitative characteristic may be whether the part meets the standard.

If the parts must be checked for compliance with standards, a complete examination is sometimes carried out, but in practice this is extremely rare. For example, if the general population contains a huge number of objects, a complete survey is almost impossible. In this case, a certain number of objects (elements) are selected from the entire population and examined. Thus, there is a general population and a sample population.

The general population is the totality of all objects that are subject to inspection or study. As a rule, it contains a finite number of elements, but if this number is very large, it is assumed, in order to simplify the mathematical calculations, that the population consists of an infinite number of objects.

A sample, or sample population, is the part of the elements selected from the entire population. A sample can be drawn with or without replacement. In the first case, the selected element is returned to the general population before the next one is drawn; in the second, it is not. In practice, random selection without replacement is used more often.

The population and the sample must be related to each other by representativeness. In other words, in order to judge the characteristics of the entire population confidently from the characteristics of the sample, the sample elements must reflect them as accurately as possible; that is, the sample must be representative.

A sample will be more or less representative if each of its elements is drawn at random from the entire population. This follows from the so-called law of large numbers. In this case, all elements must have an equal probability of being included in the sample.

Various methods of selection are available. They can basically be divided into two options:

  • Option 1. Elements are selected when the population is not divided into parts. This option includes simple random repeated and non-repetitive selections.
  • Option 2. The general population is divided into parts and elements are selected. These include typical, mechanical and serial sampling.

Simple random selection is selection in which elements are drawn one at a time from the entire population at random.

Typical selection is selection in which elements are drawn not from the entire population but from each of its "typical" parts.

Mechanical selection is selection in which the entire population is divided into as many groups as there should be elements in the sample, and one element is selected from each group. For example, if 25% of the parts produced by a machine must be selected, every fourth part is taken; if 4% of the parts must be selected, every twenty-fifth part is taken, and so on. It must be said that mechanical selection sometimes may not provide sufficient representativeness.

Serial selection is selection in which elements are taken from the entire population not one at a time but in "series", each of which is examined completely. For example, when parts are manufactured by a large number of automatic machines, a complete examination is carried out only for the output of a few machines. Serial selection is used if the trait under study varies only slightly from series to series.

Estimates for the general population are made from the sample, and care is taken to reduce their error. Sampling control can be single-stage or multi-stage; the latter increases the reliability of the survey.

In the previous section, we were interested in the distribution of a characteristic in a certain set of elements. The set that unites all elements having this characteristic is called the general population. If the characteristic is a human one (nationality, education, IQ, etc.), then the general population is the entire population of the Earth. This is a very large population, that is, the number of its elements is large. The number of elements is called the size (volume) of the population. Populations can be finite or infinite. The general population of all people, although very large, is naturally finite. The general population of all stars is probably infinite.

If a researcher measures some continuous random variable X, each measurement result can be considered an element of a hypothetical unlimited population. In this population, the countless results are distributed according to probability under the influence of instrument errors, the inattention of the experimenter, random interference in the phenomenon itself, etc.

If we carry out n repeated measurements of a random variable X, that is, we obtain n specific different numerical values, then this experimental result can be considered a sample of volume n from a hypothetical general population of results of single measurements.

It is natural to take the arithmetic mean of the results as an estimate of the true value of the measured quantity. Such a function of the n measurement results is called a statistic; it is itself a random variable with a certain distribution, called the sampling distribution. Determining the sampling distribution of a particular statistic is the most important task of statistical analysis. Clearly, this distribution depends on the sample size n and on the distribution of the random variable X in the hypothetical population. The sampling distribution of a statistic is its distribution over the infinite population of all possible samples of size n from the original population.

You can also measure a discrete random variable.

Let the measurement of a random variable X be the throwing of a regular homogeneous triangular pyramid (a tetrahedron), on the faces of which the numbers 1, 2, 3, 4 are written. The discrete random variable X has a simple uniform distribution: $P(X = i) = 0.25$ for $i = 1, 2, 3, 4$.

The experiment can be performed an unlimited number of times. The hypothetical theoretical population is an infinite population containing equal shares (0.25 each) of four different elements, designated by the numbers 1, 2, 3, 4. A series of n repeated throws of the pyramid, or a simultaneous throw of n identical pyramids, can be considered a sample of size n from this general population. As a result of the experiment, we have n numbers. Certain functions of these quantities can be introduced, which are called statistics; they can be associated with particular parameters of the general distribution.

The most important numerical characteristics of the distribution are the probabilities $P_i$, the mathematical expectation $M$ and the variance $D$. The statistics for the probabilities $P_i$ are the relative frequencies $n_i / n$, where $n_i$ is the frequency of result $i$ ($i = 1, 2, 3, 4$) in the sample. The mathematical expectation $M$ corresponds to the statistic

$\bar{X} = \frac{1}{n} \sum_{k=1}^{n} x_k$,

which is called the sample mean. The sample variance

$S^2 = \frac{1}{n-1} \sum_{k=1}^{n} (x_k - \bar{X})^2$

corresponds to the general variance $D$.
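These statistics are straightforward to compute. The sketch below simulates throws of the four-sided pyramid with Python's `random` module and evaluates the relative frequencies, the sample mean and the sample variance. The divisor n - 1 in the variance is an assumption of this sketch (the unbiased form), and the function name and seed are hypothetical.

```python
import random
from collections import Counter


def sample_statistics(values):
    """Relative frequencies, sample mean and (n - 1)-divisor sample variance."""
    n = len(values)
    counts = Counter(values)
    rel_freq = {i: counts.get(i, 0) / n for i in (1, 2, 3, 4)}
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return rel_freq, mean, variance


rng = random.Random(0)
throws = [rng.randint(1, 4) for _ in range(10_000)]  # simulated tetrahedron throws
rel_freq, mean, variance = sample_statistics(throws)
print(rel_freq, mean, variance)
```

For a long series the relative frequencies should lie near 0.25, the sample mean near 2.5 and the sample variance near 1.25, which are the parameters of the uniform distribution on 1, 2, 3, 4.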

The frequency $n_i$ of any outcome ($i = 1, 2, 3, 4$) in a series of n repeated trials (or in samples of size n from the population) has a binomial distribution, and the relative frequency $n_i / n$ is this binomial variable divided by n.

The distribution of the relative frequency has a mathematical expectation of 0.25 (which does not depend on n) and a standard deviation of $\sqrt{0.25 \cdot 0.75 / n}$ (which decreases as n increases). It is the sampling distribution of the statistic, the relative frequency of any one of the four possible outcomes of a single pyramid throw in n repeated trials. If we were to select from an infinite general population, in which four different elements ($i = 1, 2, 3, 4$) have equal shares of 0.25, all possible samples of size n (their number is also infinite), we would obtain the so-called mathematical sample of size n. In this sample, the frequency of each of the elements ($i = 1, 2, 3, 4$) is distributed according to the binomial law.
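The mean and standard deviation of the relative frequency can be verified numerically from the exact binomial probabilities. The following sketch uses only the standard library; the function names are hypothetical. The last column printed is the closed form $\sqrt{0.25 \cdot 0.75 / n}$.

```python
import math


def frequency_pmf(n, p=0.25):
    """Exact binomial probabilities P(n_i = k) for k = 0..n."""
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def relative_frequency_moments(n, p=0.25):
    """Mean and standard deviation of the relative frequency n_i / n."""
    pmf = frequency_pmf(n, p)
    mean = sum((k / n) * q for k, q in enumerate(pmf))
    var = sum((k / n - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, math.sqrt(var)


for n in (4, 16, 64):
    mean, sd = relative_frequency_moments(n)
    print(n, round(mean, 4), round(sd, 4), round(math.sqrt(0.25 * 0.75 / n), 4))
```

The mean stays at 0.25 for every n, while the standard deviation halves each time n is multiplied by four.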

Suppose we threw the pyramid four times ($n = 4$) and the number two came up three times ($n_2 = 3$). We can find the probability of this outcome using the sampling distribution. It is equal to $P(n_2 = 3) = C_4^3 \cdot 0.25^3 \cdot 0.75 = 3/64 \approx 0.047$.

Our result was highly unlikely: in twenty series of four throws it occurs approximately once. In biology, such a result is usually considered practically impossible. In this case doubts arise: is the pyramid regular and homogeneous, does the equality $P_i = 0.25$ hold for a single throw, and are the distribution and, therefore, the sampling distribution correct?

To resolve the doubt, the pyramid must be thrown four more times. If the same result appears again, the probability of two such results, $(3/64)^2 \approx 0.002$, is very small. Clearly we would then have obtained an almost impossible result, so the assumed original distribution is incorrect. Obviously, if the second result turns out to be even more unlikely, there is even more reason to question this "regular" pyramid. If instead the repeated experiment gives a likely result, we can assume that the pyramid is regular and that the first result was also genuine, merely improbable.
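The probability discussed here can be computed exactly. The sketch assumes, as the phrase "four times again" suggests, that each series consists of four throws and that the observed outcome is three twos; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def prob_k_hits(n, k, p=Fraction(1, 4)):
    """Binomial probability of exactly k occurrences of one face in n throws."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


once = prob_k_hits(4, 3)
print(once, float(once))   # 3/64 0.046875
print(float(once**2))      # same outcome in two independent series
```

A probability of about 0.047 corresponds to roughly one occurrence in 21 series of four throws. Two such series in a row have a probability of about 0.002, which is why a repeated result casts serious doubt on the assumed distribution.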

We could skip checking the regularity and homogeneity of the pyramid and assume a priori that the pyramid is regular and homogeneous and, therefore, that the sampling distribution is correct. Next, we should find out what knowledge of the sampling distribution provides for studying the general population. But since establishing the sampling distribution is the main task of statistical research, the detailed description of the experiments with the pyramid can be considered justified.

We assume that the sampling distribution is correct. Then the experimental values of the relative frequency in different series of n throws of the pyramid will be grouped around the value 0.25, which is the center of the sampling distribution and the exact value of the estimated probability. In this case, the relative frequency is said to be an unbiased estimate. Since the variance of the sampling distribution tends to zero as n increases, the experimental values of the relative frequency will be grouped more and more closely around the mathematical expectation of the sampling distribution as the sample size increases. Therefore, the relative frequency is a consistent estimate of the probability.

If the pyramid turned out to be irregular and inhomogeneous, the sampling distributions for different i ($i = 1, 2, 3, 4$) would have different mathematical expectations (different $P_i$) and variances.

Note that the binomial sampling distributions obtained here are, for large n, well approximated by the normal distribution with mean 0.25 and variance $0.25 \cdot 0.75 / n$, which greatly simplifies the calculations.
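The quality of the normal approximation can be checked numerically. The sketch below works with the count $n_i$ rather than the relative frequency, so the normal parameters are $np$ and $np(1-p)$; dividing by n gives the mean 0.25 and variance $0.25 \cdot 0.75 / n$ of the relative frequency. The choice n = 400, the continuity correction and the function names are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(n_i <= k) for a binomial count."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def normal_cdf_approx(k, n, p):
    """Normal approximation with mean n*p, variance n*p*(1-p) and a continuity correction."""
    return NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p))).cdf(k + 0.5)


n, p = 400, 0.25
for k in (90, 100, 110):
    print(k, round(binomial_cdf(k, n, p), 4), round(normal_cdf_approx(k, n, p), 4))
```

For n = 400 the two columns differ by well under one percentage point, so the normal curve is an adequate substitute for the exact binomial sums.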

Let us continue the random experiment of throwing a regular, homogeneous triangular pyramid. The random variable X associated with this experiment has the uniform distribution $P(X = i) = 0.25$, $i = 1, 2, 3, 4$. The mathematical expectation here is $M = (1 + 2 + 3 + 4) / 4 = 2.5$.

Let us carry out n throws, which is equivalent to a random sample of size n from a hypothetical infinite population containing equal shares (0.25) of four different elements. We obtain n sample values of the random variable X ($x_1, x_2, \ldots, x_n$). Let us choose the statistic $\bar{X} = \frac{1}{n} \sum_{k=1}^{n} x_k$, the sample mean. The value $\bar{X}$ is itself a random variable whose distribution depends on the sample size and on the distribution of the original random variable X. It is the averaged sum of n identically distributed random variables. It is clear that $M(\bar{X}) = \frac{1}{n} \sum_{k=1}^{n} M(x_k) = M$.

Therefore, the statistic $\bar{X}$ is an unbiased estimate of the mathematical expectation. It is also a consistent estimate, because $D(\bar{X}) = D / n \to 0$ as $n \to \infty$.

Thus, the theoretical sampling distribution of $\bar{X}$ has the same mathematical expectation as the original distribution, while its variance is reduced by a factor of n.

Recall that the variance of the original distribution is equal to $D = \frac{1}{4} \sum_{i=1}^{4} (i - 2.5)^2 = 1.25$.

A mathematical, abstract infinite sample associated with samples of size n from the general population and with the chosen statistic will contain, in our case, $3n + 1$ distinct elements. For example, if $n = 4$, the mathematical sample will contain elements with the statistic values 1, 1.25, 1.5, ..., 4. There will be 13 distinct elements in total. The share of the extreme elements 1 and 4 in the mathematical sample will be minimal: the elementary outcomes are equally probable, and among the $4^4 = 256$ elementary outcomes of throwing the pyramid four times only one is favorable to each of them. As the statistic approaches the average value, the probabilities increase. For example, the value 1.25 is realized by four elementary outcomes (1112, 1121, 1211, 2111), the value 1.5 by ten, and so on. Accordingly, the share of the element 1.5 in the mathematical sample is larger.

The average value 2.5 will have the maximum probability. As n increases, the experimental results cluster more closely around the average value. The fact that the mathematical expectation of the sample mean equals the mean of the original population is often used in statistics.
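The sampling distribution of the mean for four throws is small enough to enumerate exactly. The sketch below lists all $4^4 = 256$ equally likely outcomes with `itertools.product` and tallies the sample mean as an exact fraction; the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sample_mean_distribution(n, faces=(1, 2, 3, 4)):
    """Exact distribution of the sample mean over all equally likely outcomes of n throws."""
    counts = Counter(Fraction(sum(outcome), n) for outcome in product(faces, repeat=n))
    total = len(faces) ** n
    return {value: Fraction(c, total) for value, c in sorted(counts.items())}


dist = sample_mean_distribution(4)
mean = sum(v * p for v, p in dist.items())
var = sum((v - mean) ** 2 * p for v, p in dist.items())
print(len(dist), mean, var)                      # 13 5/2 5/16
print(dist[Fraction(1)], dist[Fraction(3, 2)])   # 1/256 5/128
```

The enumeration shows 13 distinct values of the mean, an expectation of 5/2 and a variance of 5/16, which is the original variance 5/4 divided by n = 4. The most probable value is the central one, 2.5.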

If you carry out the probability calculations for the sampling distribution with $n = 4$, you can see that even for such a small n the sampling distribution looks approximately normal. It is symmetric, and the value 2.5 is simultaneously its median, mode and mathematical expectation. As n increases, it is well approximated by the corresponding normal distribution, even though the original distribution is rectangular. If the original distribution is normal, then the sample mean standardized by the sample standard deviation follows the Student distribution for any n.

To estimate the general variance, it is necessary to choose a more complex statistic, $S^2$, that provides an unbiased and consistent estimate. In the sampling distribution for $S^2$ the mathematical expectation is equal to $D$, and the variance tends to zero as n increases. For large sample sizes, the sampling distribution can be considered normal. For small n and a normal original distribution, the sampling distribution for $S^2$ is related to the $\chi^2$ distribution: $(n - 1) S^2 / D$ follows the $\chi^2$ distribution with $n - 1$ degrees of freedom.
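Unbiasedness of the variance estimate can be confirmed by exact enumeration for the pyramid. The sketch below averages the sample variance over all outcomes of n throws, once with the divisor n - 1 and once with the divisor n. The function name and the `ddof` parameter are hypothetical conventions of this example.

```python
from fractions import Fraction
from itertools import product


def expected_sample_variance(n, ddof, faces=(1, 2, 3, 4)):
    """Exact expectation of the sample variance over all outcomes of n throws."""
    total = Fraction(0)
    for outcome in product(faces, repeat=n):
        mean = Fraction(sum(outcome), n)
        total += sum((x - mean) ** 2 for x in outcome) / (n - ddof)
    return total / len(faces) ** n


print(expected_sample_variance(4, ddof=1))  # 5/4, equal to the population variance
print(expected_sample_variance(4, ddof=0))  # 15/16, biased low by the factor (n - 1)/n
```

Only the n - 1 divisor reproduces the population variance 1.25 on average, which is why the simpler divisor n is not used when an unbiased estimate is required.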

Above we tried to present the first steps of a researcher carrying out a simple statistical analysis of repeated experiments with a regular homogeneous triangular pyramid (tetrahedron). In this case, we know the original distribution. It is possible, in principle, to obtain theoretically the sampling distributions of the relative frequency, the sample mean and the sample variance as functions of the number of repeated experiments n. For large n, all these sampling distributions approach the corresponding normal distributions, since they are distribution laws of sums of independent random variables (the central limit theorem). So we know the expected results.

Repeated experiments or samples will provide estimates of the parameters of the sampling distributions. We argued that the experimental estimates would be correct. We did not perform these experiments, nor did we present experimental results obtained by other researchers. It should be emphasized that, when determining distribution laws, theoretical methods are used more often than direct experiments.

In mathematical statistics, there are two fundamental concepts: population and sample.
A population is a practically countable set of objects or elements of interest to the researcher.
A property of a population is a real or imaginary quality shared by some of its elements. The property may be random or non-random.
A population parameter is a property that can be quantified as a constant or a variable.
A simple population is characterized by:
· a single property (for example, all students in Russia);
· a single parameter in the form of a constant or a variable (for example, all female students);
· a system of non-overlapping (incompatible) properties (for example, all teachers and students of Vladivostok schools).
A complex population is characterized by:
· a system of at least partially overlapping properties (for example, students of the psychological and mathematical faculties of Far Eastern State University who graduated from school with a gold medal);
· a system of independent and dependent parameters taken together, as in a comprehensive study of personality.
A homogeneous population is one in which all the characteristics are inherent in each of its elements.
A heterogeneous population is one whose characteristics are concentrated in separate subsets of elements.
An important parameter is the size (volume) of the population, that is, the number of elements forming it. The size depends on how the population itself is defined and on which questions specifically interest us. Suppose we are interested in the emotional state of a particular 1st year student while taking a specific exam in the session. Then the population is exhausted within half an hour. If we are interested in the emotional state of all 1st year students, the population will be much larger, and larger still if we consider all 1st year students of the university, and so on. It is clear that large populations can only be studied selectively.
A sample is a certain part of the general population, something that is directly studied.
Samples are classified according to representativeness, size, selection method and test design.
Representative - a sample that adequately reflects the general population both qualitatively and quantitatively. The sample must adequately reflect the population, otherwise the results will not meet the objectives of the study.
Representativeness depends on the size: the larger the sample, the more representative it is.
By selection method, samples are divided as follows.
Random - the elements are selected at random. Since most methods of mathematical statistics are based on the concept of random sampling, the sample should naturally be random.
Non-random sampling:
· mechanical selection, when the entire population is divided into as many parts as there are units planned in the sample, and one element is then selected from each part;
· typical selection, when the population is divided into homogeneous parts and a random sample is taken from each;
· serial selection, when the population is divided into a large number of series of different sizes, and one particular series is then sampled;
· combined selection, when the types of selection above are combined at different stages.
According to the test design, samples can be independent and dependent. Based on sample size, samples are divided into small and large. Small samples include samples in which the number of elements n is 200 and the average sample satisfies the condition 30. Small samples are used for statistical control of known properties of already studied populations.
Large samples are used to establish unknown properties and parameters of the population.


Population - the totality of all objects (units) about which the researcher intends to draw conclusions when studying a specific problem. The population consists of all objects that are subject to study, and its composition depends on the objectives of the study. Sometimes the general population is the entire population of a certain region (for example, when studying the attitude of potential voters towards a candidate); more often, several criteria are specified that define the object of the study. For example, women 18-29 years old who use certain brands of hand cream at least once a week and have an income of at least $150 per family member.

Sample - a set of cases (subjects, objects, events, samples) selected from the general population by a certain procedure to participate in the study.

  1. Sample size;
  2. Dependent and independent samples;
  3. Representativeness:
    1. An example of a non-representative sample;
  4. Types of plan for constructing groups from samples;
  5. Group building strategies:
    1. Randomization;
    2. Pairwise selection;
    3. Stratometric selection;
    4. Approximate modeling.

Sample size - the number of cases included in the sample. For statistical reasons, it is recommended that the number of cases be at least 30-35.

Dependent and independent samples

When comparing two (or more) samples, an important parameter is their dependence. If a one-to-one (homomorphic) pairing can be established between the two samples, so that each case in sample X corresponds to one and only one case in sample Y and vice versa, and the basis of this pairing is relevant to the trait being measured, the samples are called dependent. Examples of dependent samples: pairs of twins, two measurements of a trait before and after an experimental intervention, husbands and wives, etc.

If there is no such relationship between the samples, then these samples are considered independent, for example: men and women, psychologists and mathematicians.

Accordingly, dependent samples always have the same size, while the size of independent samples may differ.

Samples are compared using various statistical tests:

  • Student's t-test;
  • Wilcoxon signed-rank test (T);
  • Mann-Whitney U test;
  • Sign test, etc.
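As an illustration of how a test for dependent samples uses the pairing, here is a minimal sketch of the two-sided sign test written with the Python standard library only. The function name `sign_test` and the "before" and "after" scores are hypothetical and invented for the example; ties are dropped, which is a common convention but an assumption here.

```python
from math import comb

def sign_test(before, after):
    """Two-sided sign test for paired measurements; pairs with no change are dropped."""
    if len(before) != len(after):
        raise ValueError("dependent samples must have the same size")
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    k = min(positives, n - positives)
    # Probability of seeing k or fewer of the rarer sign if both signs are equally likely.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, n, min(1.0, 2 * tail)

# Hypothetical scores of the same eight subjects before and after an intervention.
before = [12, 15, 11, 14, 13, 16, 12, 15]
after = [14, 17, 12, 15, 13, 18, 15, 16]

positives, n, p_value = sign_test(before, after)
print(positives, n, p_value)  # 7 7 0.015625
```

The size check at the top reflects the point made above: dependent samples always have the same size, because every case in one sample has a partner in the other.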

Representativeness

The sample may be considered representative or non-representative.

Example of a non-representative sample

In the United States, one of the best-known historical examples of a non-representative sample is the poll conducted before the 1936 presidential election. The Literary Digest magazine, which had correctly predicted the outcomes of several previous elections, got its forecast wrong after sending out ten million test ballots to its subscribers, to people selected from telephone books throughout the country, and to people on car registration lists. About 25% of the ballots were returned (almost 2.5 million), and the votes in them were distributed as follows:

57% preferred Republican candidate Alf Landon

40% chose the incumbent Democratic president, Franklin Roosevelt

In the actual election, Roosevelt won with more than 60% of the vote. The Literary Digest's mistake was this: wanting to make the sample more representative, since they knew that most of their subscribers considered themselves Republicans, they expanded it to include people selected from telephone books and car registration lists. However, they did not take into account the realities of their time and in fact recruited even more Republicans: during the Great Depression, it was mainly members of the middle and upper classes who could afford phones and cars (that is, mostly Republicans rather than Democrats).
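The mechanism behind this kind of error can be shown with a little arithmetic. The sketch below is not a reconstruction of the 1936 data: the two groups, their shares, their levels of support and their chances of appearing in the sampling frame are all invented numbers, chosen only to show how a frame that over-represents one group shifts the estimate.

```python
# Hypothetical electorate, NOT historical figures. Each group is described by:
# (share of all voters, support for candidate A, chance of being in the sampling frame)
groups = {
    "wealthier": (0.30, 0.35, 0.80),
    "poorer": (0.70, 0.70, 0.10),
}

def true_support(groups):
    """Support for candidate A among all voters."""
    return sum(share * support for share, support, _ in groups.values())

def frame_support(groups):
    """Support for candidate A among voters who are reachable through the sampling frame."""
    reached = sum(share * cover for share, _, cover in groups.values())
    supporters = sum(share * cover * support for share, support, cover in groups.values())
    return supporters / reached

true_share = true_support(groups)
frame_share = frame_support(groups)
print(round(true_share, 3), round(frame_share, 3))  # 0.595 0.429
```

With these made-up figures, candidate A has a clear majority among all voters but appears to be losing among those the frame can reach, and a larger sample drawn from the same frame would not remove the gap.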

Types of plan for constructing groups from samples

There are several main types of group building plans:

  1. A study with experimental and control groups, which are placed in different conditions;
  2. A study with experimental and control groups using a pairwise selection strategy;
  3. A study using only one group - experimental;
  4. A study using a mixed (factorial) design - all groups are placed in different conditions.

Group Building Strategies

Groups for participation in a psychological experiment are selected using various strategies, which are needed to preserve internal and external validity as far as possible:

  1. Randomization (random selection);
  2. Pairwise selection;
  3. Stratometric selection;
  4. Approximate modeling;
  5. Attracting real groups.

Randomization

Randomization, or random sampling, is used to create simple random samples. The use of such a sample is based on the assumption that each member of the population is equally likely to be included in the sample. For example, to draw a random sample of 100 university students, you can put pieces of paper with the names of all university students in a hat and then take 100 pieces of paper out of it. This is random selection.
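The hat procedure corresponds to sampling without replacement, which the Python standard library provides as `random.sample`. In the sketch below the roster of 5000 students and their identifiers are hypothetical, and the seed is fixed only to make the run repeatable.

```python
import random

rng = random.Random(42)  # fixed seed so the example is repeatable

# Hypothetical roster standing in for "all university students".
students = [f"student_{i:04d}" for i in range(1, 5001)]

# Draw 100 names without replacement; every student has the same chance of selection.
sample = rng.sample(students, 100)

print(len(sample), sample[:3])
```

Because the draw is without replacement, no student can appear in the sample twice, just as a slip of paper cannot be taken out of the hat twice.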

Pairwise selection

Pairwise selection is a strategy for constructing groups in which the groups are composed of subjects who are equivalent in terms of secondary parameters that are significant for the experiment. This strategy is effective for experiments with experimental and control groups, the best option being the involvement of twin pairs (mono- and dizygotic).
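One simple way to approximate pairwise selection is sketched below, assuming a single numeric secondary parameter (age, in this made-up data): subjects are sorted by that parameter, neighbours are paired, and each pair is split at random between the experimental and control groups. The function name and the subject list are hypothetical, and real matching usually involves several parameters at once.

```python
import random

def pairwise_groups(subjects, key, rng):
    """Pair subjects with the closest values of `key`, then split each pair at random."""
    ordered = sorted(subjects, key=key)
    experimental, control = [], []
    # Neighbours in the sorted order form a pair; an unpaired last subject is left out.
    for first, second in zip(ordered[0::2], ordered[1::2]):
        pair = [first, second]
        rng.shuffle(pair)
        experimental.append(pair[0])
        control.append(pair[1])
    return experimental, control

# Hypothetical subjects as (identifier, age).
subjects = [("s1", 19), ("s2", 34), ("s3", 20), ("s4", 52), ("s5", 33), ("s6", 50)]

experimental, control = pairwise_groups(subjects, key=lambda s: s[1], rng=random.Random(1))
print(experimental)
print(control)
```

Each subject in the experimental group has a counterpart of nearly the same age in the control group, so the two groups are dependent samples in the sense described earlier.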

Stratometric selection

Stratometric selection - randomization with the allocation of strata (or clusters). With this method of sampling, the general population is divided into groups (strata) with certain characteristics (gender, age, political preferences, education, income level, etc.), and subjects with the corresponding characteristics are selected.
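A minimal sketch of selection by strata follows, assuming proportional allocation (the same fraction is drawn at random from every stratum). The allocation rule, the function name and the population of 60 and 40 units in two strata are assumptions made for the example, not something the text specifies.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, rng):
    """Group units into strata and draw the same fraction at random from each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # At least one unit per stratum, so small strata are not lost entirely.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 units in stratum "f" and 40 units in stratum "m".
units = [("f", i) for i in range(60)] + [("m", i) for i in range(40)]

sample = stratified_sample(units, stratum_of=lambda u: u[0], fraction=0.1, rng=random.Random(7))
print(len(sample))  # 10
```

The sample keeps the 60:40 proportion of the strata exactly (six units from "f" and four from "m"), which simple random sampling only achieves on average.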

Approximate Modeling

Approximate modeling - drawing limited samples and generalizing conclusions about such a sample to a wider population. For example, when 2nd-year university students take part in a study, its results are extended to “people aged 17 to 21 years”. The admissibility of such generalizations is extremely limited.