
Validity of the methodology, types of validity. Correlation analysis as one of the methods for determining reliability and validity

A person uses various methods and tools to check or measure particular qualities. The extent to which a technique or tool actually measures what it is meant to measure is called its validity. What does this concept mean in psychology, and what types of validity exist? In psychology, this property usually applies to the tests and methods used by specialists.

What is validity?

The concept under consideration has many definitions. What is validity? It is the suitability and justification of applying a particular technique, or its results, in a given situation. In applied terms, the word denotes the degree to which results and methods correspond to the tasks set.

Validity also characterizes measurement itself. If a technique aims to measure a specific quality, such as intelligence, its validity indicates how well the technique actually captures that quality.

In other words, validity reflects trustworthiness. It characterizes tests and methods that measure certain psychological qualities: the better they measure the qualities they are designed to measure, the higher their validity.

Validity becomes important in two cases:

  1. When a certain technique is developed.
  2. When a certain technique shows results, and it is necessary to establish how good these results are.

Thus, validity is a characteristic that indicates the suitability of a particular technique for measuring some quality and the usefulness, quality, and effectiveness of this technique.

Typically, several types of validity are used to validate a particular test or technique, and the indicators given by different tools are compared. There are many ways to measure a given psychological quality or characteristic; psychologists tend to use the method that gives the more reliable results, which indicates its higher validity.

Together with validity, the concept of reliability is often considered. Methods and tests must be reliable, that is, consistent and stable. The experimenter must be sure that he is measuring exactly the quality he intends to measure. This is why a reliable method is not always valid, but a valid method is always reliable.

Validity in psychology

Validity is used in many areas of life, where various indicators are measured. In psychology, validity also becomes necessary, especially in experimental psychology. Validity in psychology is:

  • the experimenter's confidence that he is measuring the quality he needs;
  • reliability of indicators that measure this quality.

If the reader has ever taken psychological tests, he knows the inner desire to get a specific answer to the question posed. The validity of a test shows the experimenter how well the test achieves the specific result he is after: a specific task whose answer he must obtain after performing all the necessary actions.

Methods and tests should be useful and sound, and this is expressed by their validity.

There are three ways to check for validity:

  1. Assessment of content validity establishes whether the results of assessing the subject correspond to the real qualities as they appear in reality. The related concept of face validity means that a person must see a real connection between the content of the methodology, its results, and the reality in which the measured quality is manifested.
  2. Assessment of construct validity establishes that the technique measures theoretically grounded, predetermined constructs. Convergent validation compares several techniques that measure similar characteristics and should give consistent results for the quality in question. Discriminant validation rules out techniques that measure qualities not correlated with the desired quality.
  3. Assessment of criterion validity establishes the correspondence of the results to expected indicators obtained in other ways. Predictive validity, which helps forecast future behavior, belongs here.

Types of validity

There are several types of validity, which we will discuss below:

  1. External validity - the generalizability of conclusions to other situations, populations, and independent variables. It includes:
  • operational validity;
  • construct validity - an explanation of a person's behavior at the time of taking the test.
  2. Internal validity - confidence that changes observed during the experiment are produced by the experimental factors rather than by extraneous ones.
  3. Differential validity.
  4. Incremental validity.
  5. Ecological validity - an indicator that a person may perform actions successfully in one situation but not in another.

This classification is used by experimental psychology. Organizational psychology and psychodiagnostics use a different classification:

  1. Construct validity. It is divided into:
  • convergent validity;
  • divergent validity.
  2. Criterion (empirical) validity - the correlation of test scores with an external parameter chosen as a valid indicator. It is divided into:
  • current validity - the study of the parameter in the present;
  • retrospective validity - a state or event that happened in the past;
  • predictive validity - prediction of behavior or quality.
  3. Content validity - used in experiments where some interaction or activity is considered. It has a subspecies:
  • face (obvious) validity.
Other types of validity are:

  • A priori.
  • Congruent.
  • Related.
  • Constructive.
  • Consensus.
  • Factorial.
  • theoretical, etc.

What is test validity?

Many people take tests. There are special psychological tests used by psychologists, as well as popular magazine-style tests. What is the validity of a test, its important criterion? It is an indicator of how well the test corresponds to the characteristic, quality, or property it measures.

Tests differ. They are used to measure a person's psychophysiological parameters. Even the best tests rarely exceed a validity of about 80%. Tests become useful when they allow accurate data to be obtained on specific characteristics. There are several approaches to studying the validity of a test:

  1. Construct validity, which allows a deeper study of a person's qualities within a situation, activity, or system.
  2. Criterion validity - studying the parameter in the present and predicting it in the future.
  3. Content validity - the correspondence of the test content to the measured psychological constructs in all their diversity.
  4. Predictive validity - forecasting the development of a particular quality in the future, which is difficult because the quality can develop differently in different people.

Until the reliability and validity of the test are determined, it is not used in psychological practice. Much depends on the areas in which the tests are applied. There are educational, professional and other tests that are used in individual institutions to predict and identify the characteristics of applicants.


What is method validity?

What is the validity of a method? It is an indicator of whether the technique in question studies the quality or characteristic for which it is intended. At the same time, it must be kept in mind that the subject being tested may see and characterize himself differently. That is why the results do not always reflect the opinion of people, who may not notice certain characteristics in themselves.

Validation is the verification of the validity of a methodology. To determine the effectiveness, efficiency, and practicality of the methodology used, an external independent indicator is used - a quality observed in everyday life. There are four types of external indicators:

  1. The performance criterion is the time spent, the amount of work, the level of academic performance, the growth of professional skills, etc.
  2. Subjective criteria - the subject's opinions, views, preferences, and attitudes toward someone or something. Questionnaires, interviews, and surveys are used here.
  3. Physiological criteria - the influence of the outside world on the psyche and human body. It measures pulse, respiratory rate, symptoms of fatigue, etc.
  4. The criterion of randomness - is it possible, for example, to select persons who are not prone to accidents? Studying the impact of a particular case.

The theoretical approach to measuring the validity of methods makes it possible to establish whether the technique really studies the quality for which it was intended.

Validity also depends on how common the studied quality is: if it occurs frequently, the technique is all the more necessary and useful. Ethical and cultural changes in society also matter.

Outcome

In psychological practice, tests and techniques are often used to help study a person's personality, in particular internal parameters that are not visible to the eye. Qualities of character, demeanor, a possible forecast of what a person will be and what his life will be like - all this is studied by various tests and methods pursuing a single result: the study of the person.

The result of successfully establishing the validity of a particular instrument is reliable knowledge of each person, regardless of how he sees himself. People often do not notice certain qualities in themselves and rarely look at themselves soberly. Tests and techniques make it possible to reveal these individual parameters.

Valid tests and methods promise quick, high-quality knowledge of another person and the ability to help him solve psychological problems. This is not achieved quickly, but the available tools have already shown their effectiveness. Usually this question interests only those who determine the quality of tests and methods, but it is also useful for ordinary people to know which instruments should be trusted and which should not.

Ticket number 9

Questionnaires of motivation and their characteristics.

Questionnaires of motives are a group of questionnaires designed to diagnose the motivational-need sphere of a person, which makes it possible to establish what an individual's activity is aimed at (motives as reasons that determine the choice of a direction of behavior). In addition, the question of how the dynamics of behavior is regulated is of great importance; here one often resorts to measuring attitudes. The development of motive questionnaires in psychodiagnostics is largely associated with the need to assess the influence of the "social desirability" factor, which has an attitudinal nature and reduces the reliability of data obtained with personality questionnaires. The best-known motive questionnaire is the "List of Personal Preferences" developed by A. Edwards (1954), designed to measure the "strength" of needs borrowed from the list proposed by H. Murray for the Thematic Apperception Test. These needs include, for example, the need for achievement, respect, leadership, etc. The "strength" of each need is expressed not in absolute terms but relative to the "strength" of other needs, i.e., relative personal indicators are used. To study the role of the "social desirability" factor, A. Edwards (1957) proposed a special questionnaire. Other motive questionnaires are also widely used, for example, D. Jackson's Personality Research Form (1967), A. Mehrabian's questionnaires (1970), etc.

After reliability, another key criterion for assessing the quality of methods is validity. The question of a methodology's validity is decided only after its sufficient reliability has been established, since an unreliable methodology cannot be valid. But even the most reliable technique is practically useless if its validity is unknown.

It should be noted that until recently the issue of validity has remained one of the most difficult. The most entrenched definition of the concept is the one given in A. Anastasi's book: "The validity of a test is a concept that tells us what the test measures and how well it does so."

For this reason, there is no single universal approach to determining validity. Depending on which side of validity the researcher wants to consider, different methods of proof are also used. In other words, the concept of validity includes its different types, which have their own special meaning. Checking the validity of a technique is called validation.



Validity in its first sense is related to the methodology itself, that is, it is the validity of the measuring tool. This check is called theoretical validation. Validity in the second sense already refers not so much to the methodology as to the purpose of its use. This is pragmatic validation.

Summarizing, we can say the following:

in theoretical validation, the researcher is interested in the very property measured by the technique. This, in essence, means that the actual psychological validation is being carried out;

with pragmatic validation, the essence of the subject of measurement (psychological property) is out of sight. The main emphasis is on proving that something measured by the methodology has a connection with certain areas of practice.

Conducting theoretical validation is sometimes much more difficult than pragmatic validation. Without going into specific details for now, let us dwell in general terms on how pragmatic validity is checked: some external criterion independent of the methodology is selected that determines success in a particular activity (educational, professional, etc.), and the results of the diagnostic technique are compared with it. If the connection between them is recognized as satisfactory, a conclusion is drawn about the practical significance, efficiency, and effectiveness of the diagnostic technique.

For theoretical validity, it is much more difficult to find an independent criterion lying outside the methodology. Therefore, in the early stages of the development of testology, when the concept of validity was just taking shape, intuitive ideas about what a test measures prevailed:

1) the technique was called valid, since what it measures is simply obvious;

2) the proof of validity was based on the researcher's confidence that his method makes it possible to understand the subject;

3) the methodology was considered valid (i.e., the statement was accepted that such and such a test measures such and such a quality) only because the theory on the basis of which the methodology was built is very good.

Acceptance on faith of claims about the validity of a methodology could not last long. The first manifestations of truly scientific criticism debunked this approach, and the search for scientifically sound evidence began.

Thus, to conduct a theoretical validation of a methodology is to prove that the methodology measures exactly the property, quality that it, according to the researcher's intention, should measure.

So, for example, if a test was developed in order to diagnose the mental development of children, it is necessary to analyze whether it really measures this development, and not some other features (for example, personality, character, etc.). Thus, for theoretical validation, the cardinal problem is the relationship between psychological phenomena and their indicators, through which these psychological phenomena are trying to be known. This shows how much the author's intention and the results of the methodology coincide.

It is not so difficult to theoretically validate a new method if a method with proven validity for measuring the given property already exists. A correlation between the new method and a similar, already tested one indicates that the developed method measures the same psychological quality as the reference method. And if the new method is also more compact and economical to administer and process, psychodiagnosticians get the opportunity to use the new tool instead of the old one.

But theoretical validity is proved not only by comparison with related indicators, but also with those where, based on the hypothesis, there should not be significant relationships. Thus, to test theoretical validity, it is important, on the one hand, to establish the degree of connection with a related technique (convergent validity) and the absence of this relationship with methods that have a different theoretical basis (discriminant validity).
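A sketch of this double check on simulated scores (all data below are hypothetical; the point is that a new technique should correlate strongly with a validated reference test of the same property, and negligibly with a test of an unrelated one):

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n = 100  # number of subjects

# Latent quality the new technique is supposed to measure
trait = rng.normal(size=n)

# Hypothetical scores: the reference and the new test both tap the trait
# (plus measurement error); the unrelated test does not.
reference_test = trait + rng.normal(scale=0.5, size=n)
new_test = trait + rng.normal(scale=0.5, size=n)
unrelated_test = rng.normal(size=n)

r_conv, p_conv = pearsonr(new_test, reference_test)  # convergent check
r_disc, p_disc = pearsonr(new_test, unrelated_test)  # discriminant check

print(f"correlation with reference test: {r_conv:.2f} (p = {p_conv:.4f})")
print(f"correlation with unrelated test: {r_disc:.2f} (p = {p_disc:.4f})")
```

A high first coefficient together with a near-zero second one is exactly the pattern of convergent plus discriminant validity the passage describes.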

It is much more difficult to carry out theoretical validation of the method when such a way of verification is impossible. Most often, this is the situation faced by the researcher. In such circumstances, only the gradual accumulation of various information about the property under study, the analysis of theoretical premises and experimental data, and considerable experience in working with the technique make it possible to reveal its psychological meaning.

An important role in understanding what the methodology measures is played by the comparison of its indicators with practical forms of activity. But here it is especially important that the methodology be thoroughly worked out in theoretical terms, that is, that there be a solid, well-founded scientific basis. Then, when comparing the methodology with an external criterion taken from everyday practice, corresponding to what it measures, information can be obtained that reinforces theoretical ideas about its essence.

It is important to remember that if the theoretical validity is proven, then the interpretation of the obtained indicators becomes clearer and more unambiguous, and the name of the methodology corresponds to the scope of its application. As for pragmatic validation, it implies testing the methodology in terms of its practical effectiveness, significance, usefulness, since it makes sense to use a diagnostic technique only when it is proved that the property being measured is manifested in certain life situations, in certain types of activity. It is given great importance, especially where the question of selection arises.

If we turn again to the history of testology, we can single out a period (the 1920s-1930s) when the scientific content of tests and their theoretical grounding were of less interest. What mattered was that the test worked and helped to quickly select the most prepared people. The empirical criterion for evaluating test items was considered the only true guideline in solving scientific and applied problems.

The use of diagnostic methods with a purely empirical justification, without a clear theoretical basis, often led to pseudoscientific conclusions and unjustified practical recommendations. It was impossible to accurately name those features, qualities that the tests revealed. Essentially, they were blind trials.

This approach to the problem of test validity was typical until the early 1950s, not only in the USA but also in other countries. The theoretical weakness of empirical validation methods could not but draw criticism from scientists who, in test development, called for relying not only on bare empiricism and practice but also on a theoretical concept: practice without theory is blind, and theory without practice is dead. Currently, a combined theoretical and practical assessment of the validity of methods is seen as the most productive.

To conduct a pragmatic validation of a methodology, i.e. to assess its effectiveness, efficiency, practical significance, an independent external criterion is usually used - an indicator of the manifestation of the studied property in everyday life. Such a criterion can be both academic performance (for learning ability tests, achievement tests, intelligence tests), and production achievements (for professional orientation methods), and the effectiveness of real activity - drawing, modeling, etc. (for tests of special abilities), subjective assessments (for personality tests).

American researchers D. Tiffin and E. McCormick, after analyzing the external criteria used to prove validity, distinguish four types of them (cited in [31]):

1) performance criteria (they may include such as the amount of work performed, academic performance, time spent on training, the rate of growth of qualifications, etc.);

2) subjective criteria (they include various types of answers that reflect a person's attitude to something or someone, his opinion, views, preferences; usually subjective criteria are obtained through interviews, questionnaires, questionnaires);

3) physiological criteria (they are used in studying the influence of the environment and other situational variables on the human body and psyche; pulse rate, blood pressure, skin electrical resistance, symptoms of fatigue, etc. are measured);

4) randomness criteria (applied when the purpose of the study concerns, for example, the problem of selecting for work such persons who are less prone to accidents).

The external criterion must meet three basic requirements:

it must be relevant;

free from interference;

reliable.

Relevance refers to the semantic correspondence of a diagnostic tool to an independent real-life criterion. In other words, there must be confidence that the criterion involves precisely those features of the individual psyche that are also measured by the diagnostic technique. The external criterion and the diagnostic technique must be in internal semantic correspondence with each other and be qualitatively homogeneous in psychological essence. If, for example, a test measures individual characteristics of thinking, the ability to perform logical actions with certain objects and concepts, then the criterion should reflect the manifestation of precisely these skills. The same applies to professional activity, which has not one but several goals and tasks, each of which is specific and imposes its own conditions for implementation. This implies the existence of several criteria for the performance of professional activity. Therefore, success on diagnostic methods should not be compared with production efficiency in general; a criterion must be found that, by the nature of the operations performed, is comparable with the methodology.

If it is not known with respect to the external criterion whether it is relevant to the measured property or not, then comparing the results of the psychodiagnostic technique with it becomes practically useless. It does not allow to come to any conclusions that could assess the validity of the methodology.

The requirement of freedom from interference arises because, for example, educational or industrial success depends on two variables: on the person himself, his individual characteristics measured by the methods, and on the situation, the conditions of study or work, which can introduce interference and "contaminate" the applied criterion. To avoid this to some extent, groups of people in more or less identical conditions should be selected for research. Another method can also be used: correcting for the influence of interference. The adjustment is usually statistical in nature; for example, productivity should be taken not in absolute terms but relative to the average productivity of workers in similar conditions.
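A minimal sketch of this statistical adjustment, using hypothetical output figures for workers in two shops with different conditions:

```python
# Hypothetical raw output of workers in two shops with different conditions.
shop_a = [52, 48, 61, 55]  # favourable conditions
shop_b = [31, 35, 29, 33]  # unfavourable conditions

def relative(scores):
    """Express each score relative to the mean of its own group."""
    group_mean = sum(scores) / len(scores)
    return [round(s / group_mean, 2) for s in scores]

# A value of 1.0 now means "average for one's own conditions", so the
# shop-level interference no longer dominates comparisons between workers.
print(relative(shop_a))  # [0.96, 0.89, 1.13, 1.02]
print(relative(shop_b))  # [0.97, 1.09, 0.91, 1.03]
```

After this normalization, a worker producing 55 units in good conditions and one producing 33 in poor conditions look comparably above average, as the passage intends.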

When it is said that a criterion must have statistically significant reliability, this means that it must reflect the constancy and stability of the function under study.

The search for an adequate and easily identifiable criterion is one of the most important and difficult tasks of validation. In Western testology, many methods are disqualified only because they could not find a suitable criterion for testing them. For example, for most questionnaires, the data on their validity is questionable, since it is difficult to find an adequate external criterion that corresponds to what they measure.

The assessment of the validity of methods can be quantitative and qualitative.

To calculate a quantitative indicator - the coefficient of validity - the results obtained with the diagnostic technique are compared with data obtained on the external criterion for the same persons. Different types of correlation are used (Spearman's rank correlation, Pearson's linear correlation).

How many subjects are needed to calculate validity?

Practice has shown that there should be no fewer than 50 subjects, and preferably more than 200. The question often arises: what value must the coefficient of validity reach to be considered acceptable? In general, it is sufficient for the coefficient of validity to be statistically significant. A validity coefficient of about 0.20-0.30 is considered low, 0.30-0.50 medium, and over 0.60 high.
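For illustration, the coefficient of validity can be computed as follows (the scores below are invented, and a real validation would use at least the 50 subjects recommended above; 10 are shown for brevity):

```python
from scipy.stats import pearsonr, spearmanr

# Hypothetical scores of the same people on a diagnostic test and on an
# external criterion (say, grade point average).
test_scores = [12, 15, 9, 20, 17, 11, 14, 18, 8, 16]
criterion   = [3.1, 3.8, 2.9, 4.7, 4.2, 3.0, 3.6, 4.4, 2.5, 4.0]

r, p = pearsonr(test_scores, criterion)         # Pearson (linear)
rho, p_rho = spearmanr(test_scores, criterion)  # Spearman (rank)

def verdict(coefficient):
    """Interpret the validity coefficient on the scale given in the text."""
    if coefficient < 0.30:
        return "low"
    if coefficient < 0.50:
        return "medium"
    if coefficient > 0.60:
        return "high"
    return "moderate"

print(f"Pearson r = {r:.2f} ({verdict(r)}), p = {p:.4f}")
print(f"Spearman rho = {rho:.2f}")
```

Both the coefficient itself and its statistical significance (the p-value) should be reported, as the text notes.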

But, as A. Anastasi, K. M. Gurevich and others emphasize, it is not always right to use linear correlation to calculate the validity coefficient. This technique is justified only when it is proved that success in some activity is directly proportional to success in performing a diagnostic test. The position of foreign testologists, especially those involved in professional suitability and professional selection, most often comes down to the unconditional recognition that the one who completed the most tasks in the test is more suitable for the profession. But it may also be that in order to be successful in an activity, it is necessary to have a property at the level of 40% of the test solution. Further success in the test no longer matters for the profession. An illustrative example from the monograph by K. M. Gurevich: a postman must be able to read, but whether he reads at a normal speed or at a very high speed is no longer of professional importance. With such a correlation between the indicators of the methodology and the external criterion, the most adequate way to establish validity may be the criterion of differences.
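The "criterion of differences" mentioned here can be sketched as a comparison of group means, for example with Student's t-test (the groups and scores below are hypothetical):

```python
from statistics import mean
from scipy.stats import ttest_ind

# Hypothetical test scores of two groups singled out by the external
# criterion: professionally successful vs unsuccessful workers.
successful   = [14, 16, 13, 18, 15, 17, 14, 16]
unsuccessful = [10, 12, 9, 13, 11, 10, 12, 11]

# A significant difference in mean test scores between the two groups
# supports the validity of the technique even when the score-criterion
# relationship is not linear.
t, p = ttest_ind(successful, unsuccessful)
print(f"mean difference = {mean(successful) - mean(unsuccessful):.2f}")
print(f"t = {t:.2f}, p = {p:.4f}")
```

This design fits the postman example: what matters is whether successful and unsuccessful groups differ, not whether ever-higher scores keep predicting ever-better performance.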

Another case is also possible: a level of a property higher than the profession requires interferes with professional success. Thus, at the dawn of the 20th century, the American researcher F. Taylor found that the most intellectually developed factory workers had low labor productivity; their high level of mental development prevented them from working highly productively. In this case, analysis of variance or the calculation of correlation ratios would be more suitable for computing the coefficient of validity.

As the experience of foreign testologists has shown, not a single statistical procedure is able to fully reflect the diversity of individual assessments. Therefore, another model is often used to prove the validity of methods - clinical assessments. This is nothing more than a qualitative description of the essence of the studied property. In this case, we are talking about the use of techniques that are not based on statistical processing.

Types of validity

Validity in its essence is a complex characteristic, including, on the one hand, information about whether the technique is suitable for measuring what it was created for, and on the other hand, what is its effectiveness, efficiency, and practical usefulness.

Checking the validity of a technique is called validation.

Pragmatic validation again relies on an independent external criterion - an indicator of the manifestation of the studied property in everyday life; the four types of external criteria (performance, subjective, physiological, and randomness criteria) were described above.

Empirical validity.

If, in the case of content validity, the test is evaluated by experts (who establish the correspondence of test items to the content of the subject of measurement), empirical validity is always measured statistically: the correlation between two series of values is calculated - test scores and indicators on an external parameter chosen as the validity criterion.

Construct validity.

Construct validity relates to the theoretical construct itself and involves searching for factors that explain behavior when performing a test. As a special type, construct validity was formalized in an article by Cronbach and Meehl (1955). The authors applied this type of validity to all test studies not directly aimed at predicting some significant criterion, i.e., studies concerned with information about psychological constructs.

Validity "by content".

Content validity requires that every item, problem, or question belonging to the measured area have an equal chance of becoming a test item. Content validity assesses the correspondence of the test content (tasks, questions) to the measured area of behavior. One check is to have two independent development teams compile parallel versions of the test, administer them to the same sample of subjects, and correlate the scores obtained by splitting the items into two parts; the resulting coefficient serves as an index of content validity.
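The splitting procedure described above resembles the classical split-half computation; here is a sketch on invented item scores, assuming the Spearman-Brown correction is used to estimate the full-length coefficient (the correction is not named in the text):

```python
from scipy.stats import pearsonr

# Hypothetical item responses (1 = correct) of 6 subjects on an 8-item test.
responses = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0, 1, 0],
]

# Split the items into two halves (odd vs even positions), total each half.
odd_half  = [sum(row[0::2]) for row in responses]
even_half = [sum(row[1::2]) for row in responses]

r_half, _ = pearsonr(odd_half, even_half)
# Spearman-Brown correction: estimated coefficient for the full-length test.
r_full = 2 * r_half / (1 + r_half)
print(f"half-test r = {r_half:.2f}, full-test estimate = {r_full:.2f}")
```

The corrected coefficient is always higher than the raw half-test correlation, reflecting the greater consistency of a longer test.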

"Prognostic" validity.

"Predictive" validity is also determined by a fairly reliable external criterion, but information on it is collected some time after the test. The external criterion is usually the ability of a person, expressed in some assessments, to the type of activity for which he was selected based on the results of diagnostic tests. Although this technique is most appropriate for the task of diagnostic techniques - the prediction of future success, it is very difficult to apply it. The accuracy of the forecast is inversely related to the time given for such forecasting. The more time passes after the measurement, the more factors must be taken into account when assessing the prognostic significance of the technique. However, it is almost impossible to take into account all the factors that affect the prediction.

"Retrospective" validity.

It is determined on the basis of a criterion reflecting events or the state of a quality in the past. It can be used to quickly obtain information about the predictive capabilities of a technique. Thus, to test the extent to which good scores on an aptitude test correspond to rapid learning, one can compare past grades, past expert opinions, etc., in individuals with currently high and low diagnostic indicators.

Convergent and discriminant validity.

How the psychologist defines the diagnostic construct determines the strategy for including certain items in the test. If Eysenck defines the property "neuroticism" as independent of extraversion-introversion, then this means that in his questionnaire there should be approximately equal numbers of items that neurotic introverts and neurotic extraverts will agree with. If in practice it turns out that the test will be dominated by items from the "Neuroticism-Introversion" quadrant, then, from the point of view of Eysenck's theory, this means that the "neuroticism" factor turns out to be loaded with an irrelevant factor - "introversion". (Exactly the same effect occurs if there is a bias in the sample - if there are more neurotic introverts in it than neurotic extraverts.)

In order not to encounter such difficulties, psychologists would like to deal with such empirical indicators (points) that unambiguously inform only about one factor. But this requirement is never really fulfilled: any empirical indicator turns out to be determined not only by the factor that we need, but also by others - irrelevant to the measurement task.

Thus, in relation to factors that are conceptually defined as orthogonal to the one being measured (i.e., occurring with it in all combinations), the test compiler must apply a deliberate balancing strategy when selecting items.

Correspondence of items to the measured factor ensures the convergent validity of the test. The balance of items with respect to irrelevant factors provides discriminant validity. Empirically, it is expressed in the absence of a significant correlation with a test that measures a conceptually independent property.
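The empirical check just described amounts to a pair of correlations: the new scale should correlate substantially with a test of the same construct (convergent validity) and near zero with a test of a conceptually independent property (discriminant validity). A minimal sketch in Python; the scores and variable names are invented for illustration, not taken from the source:

```python
def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores of five subjects on three scales.
new_scale      = [10, 12, 14, 16, 18]   # the scale being validated
related_test   = [11, 13, 15, 17, 19]   # measures the same construct
unrelated_test = [5, 9, 4, 8, 6]        # conceptually independent construct

r_convergent   = pearson_r(new_scale, related_test)    # expected: high
r_discriminant = pearson_r(new_scale, unrelated_test)  # expected: near zero

print(round(r_convergent, 3), round(r_discriminant, 3))
```

A high first value and a near-zero second value would be read as evidence of convergent and discriminant validity respectively; the cutoffs a researcher applies in practice depend on the field and sample size.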

Types of validity

There are several types of validity, reflecting both the peculiarities of diagnostic methods and the temporal status of the external criterion. In the literature (Cherny, 1983; "General Psychodiagnostics", 1987, etc.) the following are most often named:

1. Validity "by content". This technique is mainly used in achievement tests. Usually, achievement tests do not include all the material that students have passed, but some small part of it (3-4 questions). Is it possible to be sure that the correct answers to these few questions testify to the assimilation of all the material. This is what the content validity check should answer. To do this, a comparison of success on the test with expert assessments of teachers (for this material) is carried out. Validity "by content" also applies to criteria-based tests. This technique is sometimes called logical validity.

2. "Simultaneity" validity, or current validity, is determined by an external criterion by which information is collected at the same time as the test method experiments. In other words, data is collected relating to the present time performance in the test period, performance in the same period, etc. The success results on the test are correlated with it.

3. "Predictive" validity (another name is "predictive" validity). It is also determined by a fairly reliable external criterion, but information on it is collected some time after the test. The external criterion is usually the ability of a person, expressed in some assessments, to the type of activity for which he was selected based on the results of diagnostic tests. Although this technique is most appropriate for the task of diagnostic techniques - the prediction of future success, it is very difficult to apply it. The accuracy of the forecast is inversely related to the time given for such forecasting. The more time passes after the measurement, the more factors must be taken into account when assessing the prognostic significance of the technique. However, it is almost impossible to take into account all the factors that affect the prediction.

4. "Retrospective" validity. It is determined on the basis of a criterion that reflects the events or state of quality in the past. It can be used to quickly obtain information about the predictive capabilities of the technique. Thus, to test the extent to which good scores on an aptitude test correspond to rapid learning, one can compare past grades, past expert opinions, and so on. in individuals with high and low diagnostic indicators at the moment.

Correlation

Correlation (correlation dependence) is a statistical relationship between two or more random variables (or variables that can be considered as such with some acceptable degree of accuracy), in which changes in the values of one or more of these quantities are accompanied by a systematic change in the values of the other quantity or quantities. A mathematical measure of the correlation of two random variables is the correlation ratio (η) or the correlation coefficient (r). If a change in one random variable does not lead to a regular change in another random variable, but does lead to a change in some other statistical characteristic of that variable, then such a relationship is not considered a correlation, although it is statistical.

The term "correlation" was first introduced into scientific circulation by the French paleontologist Georges Cuvier at the turn of the 18th-19th centuries. He developed the "law of correlation" of the parts and organs of living beings, with the help of which it is possible to reconstruct the appearance of a fossil animal from only a part of its remains. In statistics, the word "correlation" was first used by the English biologist and statistician Francis Galton at the end of the 19th century.

Some types of correlation coefficients can be positive or negative (it is also possible that there is no statistical relationship - for example, for independent random variables). If a strict order relation is assumed on the values of the variables, then a negative correlation is one in which an increase in one variable is associated with a decrease in another (the correlation coefficient is negative), and a positive correlation is one in which an increase in one variable is associated with an increase in another (the correlation coefficient is positive).
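The sign convention can be shown with three toy series (all values invented for illustration): as x grows, y_pos grows with it and y_neg falls, so the coefficient comes out +1 and -1 respectively.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

x     = [1, 2, 3, 4, 5]
y_pos = [2, 4, 6, 8, 10]   # increases together with x -> positive correlation
y_neg = [10, 8, 6, 4, 2]   # decreases as x increases  -> negative correlation

print(pearson_r(x, y_pos))
print(pearson_r(x, y_neg))
```

Perfectly linear relationships like these give the extreme values ±1; real data falls somewhere in between.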

Validity (from the English valid - "sound, suitable, applicable") is a complex characteristic of a methodology (test), including information about the area of the studied phenomena and the representativeness of the diagnostic procedure in relation to them.

In its simplest and most general formulation, the validity of a test is "a concept that tells us what the test measures and how well it does it." In the standard requirements for psychological and educational tests, validity is defined as a body of information about which groups of psychological properties of a person can be inferred using the methodology, and about the degree of validity of the conclusions drawn from specific test scores or other forms of assessment.

In psychodiagnostics, validity is an obligatory and most important part of the information about a methodology. It includes (along with the above) data on the degree of consistency of test results with other information about the person being studied, obtained from various sources (theoretical expectations, observations, expert assessments, results of other methods whose validity has been established, etc.), a judgment about the soundness of the forecast for the development of the quality under study, and the connection of the studied area of behavior or personality traits with certain psychological constructs. Validity also describes the specific orientation of the methodology (the contingent of subjects by age, level of education, socio-cultural affiliation, etc.) and the degree of validity of the conclusions in the specific conditions of using the test. The totality of information characterizing the validity of a test also covers the adequacy of the applied activity model in terms of how well it reflects the studied psychological characteristics, the degree of homogeneity of the tasks (subtests) included in the test, and their comparability in the quantitative assessment of the test results as a whole.

The most important component of validity - the definition of the area of the studied properties - is of fundamental theoretical and practical importance in choosing a research methodology and interpreting its data. The information contained in the name of a test is, as a rule, insufficient to judge its scope of application; it is merely a designation, the "name" of a particular research procedure.

Types of test validity. Methods for determining validity

According to the definition of the American psychologist A. Anastasi, "the validity of a test is a concept that tells us what the test measures and how well it does it." Validity indicates whether the technique is suitable for measuring certain qualities and features, and how effectively it does so. The most common way to establish the theoretical validity of a test (method) is convergent validation, that is, comparing the given technique with authoritative related methods and demonstrating significant relationships with them.

Comparison with methods that have a different theoretical basis, and a consistent absence of significant relationships with them, is called discriminant validation. Another type of validity - pragmatic validity - is the testing of a methodology in terms of its practical significance, efficiency, and usefulness. For such testing, so-called independent external criteria are usually used: an external source of information, independent of the test, about the manifestation of the measured mental property in people's real life and activity. Such external criteria may include academic performance, professional achievements, success in various activities, and subjective assessments (or self-assessments). If, for example, the methodology measures the development of professionally important qualities, then for the criterion it is necessary to find an activity or individual operations in which these qualities are realized.

To check the validity of a test, one can use the method of known (contrast) groups: people are invited about whom it is known which group they belong to according to the criterion (for example, a group of "highly successful, disciplined students" - high on the criterion - and a group of "poor, undisciplined students" - low on the criterion, with students of average standing not participating), testing is conducted, and the correspondence between the test results and the criterion is found.

The results can be summarized in a fourfold (2 x 2) table:

                       High on criterion    Low on criterion
    High on test              a                    b
    Low on test               c                    d

Here a is the number of subjects who fell into the high group both according to the test and according to the criterion, and c is the number of subjects who fell into the high group according to the criterion but have low test results. If the test were perfectly valid, the cells b and c would be equal to zero. The measure of agreement (correlation) between the extreme groups on the test and on the criterion is evaluated using the Guilford phi coefficient.

There are many different ways to prove the validity of a test. A test is said to be valid if it measures what it is intended to measure. External validity, in relation to psychodiagnostic methods, means the correspondence of the results of psychodiagnostics carried out by means of the given method to external signs, independent of the method, that relate to the subject of the examination. It means approximately the same thing as empirical validity, with the difference that here we are talking about the relationship between the indicators of the methodology and the most important, key external features related to the behavior of the subject. A psychodiagnostic technique is considered externally valid if, for example, it is used to evaluate the character traits of an individual and his externally observed behavior is consistent with the results of the testing.
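The Guilford phi coefficient for such a fourfold table can be computed directly from the four cell counts. A short sketch; the counts are invented for illustration:

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi coefficient for a 2x2 table:
    a = high on test & high on criterion, b = high on test & low on criterion,
    c = low on test & high on criterion,  d = low on test & low on criterion."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom

# Hypothetical counts: 40 subjects in each "agreeing" cell, 10 in each "disagreeing" one.
phi = phi_coefficient(a=40, b=10, c=10, d=40)
print(round(phi, 2))
```

With b = c = 0 the coefficient reaches 1.0, matching the statement above that a perfectly valid test leaves those cells empty.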

Internal validity, in relation to psychodiagnostic methods, means the correspondence of the tasks and subtests contained in a technique to its overall purpose; that is, the compliance of the results of psychodiagnostics carried out by means of the technique with the definition of the evaluated psychological property used in the technique itself. A methodology is considered internally invalid, or insufficiently internally valid, when all or part of the questions, tasks and subtests included in it do not measure what is required by this methodology. Apparent (face) validity describes the perception of the test that has formed in the subject: the test should be perceived by the subject as a serious tool for understanding his personality. Face validity is of particular importance in modern conditions, when the idea of tests in the public mind is formed by numerous publications in popular newspapers and magazines of what can be called quasi-tests, with the help of which the reader is invited to determine anything: from intelligence to compatibility with a future spouse.

Competitive (concurrent) validity is assessed by the correlation of the developed test with others whose validity with respect to the measured parameter has been established. P. Kline notes that data on concurrent validity are useful when only unsatisfactory tests exist for measuring some variables, and new ones are created in order to improve the quality of measurement. But the question arises: if an effective test already exists, why create the same test anew?

Predictive validity is established by the correlation between test scores and some criterion that characterizes the property being measured, but at a later time. For example, the predictive validity of an intelligence test can be shown by correlating its scores, obtained from a subject at age 10, with academic performance at the time of high school graduation. L. Cronbach considers predictive validity the most convincing evidence that a test measures exactly what it was intended to measure. The main problem faced by a researcher trying to establish the predictive validity of his test is the choice of an external criterion. This most often concerns the measurement of personality variables, where the selection of an external criterion is an extremely difficult task whose solution requires considerable ingenuity. The situation is somewhat simpler when determining an external criterion for cognitive tests; however, even in this case the researcher has to "turn a blind eye" to many problems. Thus, academic performance is traditionally used as an external criterion for the validation of intelligence tests, yet it is well known that academic achievement is far from the only evidence of high intelligence.

Incremental validity is of limited value and refers to the case where one test from a battery may have a low correlation with the criterion but not overlap with the other tests in the battery; in this case the test has incremental validity. This can be useful when conducting professional selection using psychological tests.

Differential validity can be illustrated using tests of interest. Interest tests usually correlate with academic performance, but in different ways for different disciplines. The value of differential validity, like that of incremental validity, is limited.

Content validity is defined as confirmation that the test items reflect all aspects of the area of behavior being studied. It is usually determined for achievement tests (where the meaning of the measured parameter is completely clear), which, as already mentioned, are not strictly psychological tests. In practice, to determine content validity, experts are selected who indicate which area of behavior is most important - for example, for musical ability - and test items are then generated on this basis and again evaluated by the experts. The construct validity of a test is demonstrated by describing, as completely as possible, the variable that the test is intended to measure. In effect, construct validity subsumes all the approaches to determining validity listed above. Cronbach and Meehl, who introduced the concept of construct validity into psychodiagnostics, tried to solve the problem of selecting criteria for test validation. They emphasized that in many cases no single criterion can serve to validate an individual test. We may assume that resolving the question of a test's construct validity amounts to answering two questions: 1) does the given property really exist? 2) does the given test reliably measure individual differences in this property? The problem of objectivity in interpreting the results of a study of construct validity is clearly related to construct validity, but it is a general psychological problem that goes beyond the question of validity itself.

In order for a psychological and pedagogical experiment to be a sufficiently reliable means of research and to yield fully trustworthy results, on the basis of which correct practical conclusions can be drawn, the psychodiagnostic methods used in it must be scientifically substantiated. Scientifically substantiated methods are those that meet the following requirements: validity, reliability, unambiguity, and accuracy.

Term "validity" literally means: "full", "suitable", "corresponding". Validity is inherently this is a complex characteristic, including, on the one hand, information about whether the technique is suitable for measuring what it was created for, and on the other hand, what is its effectiveness, efficiency. Testing the validity of a technique is called validation.

There are several varieties of validity, each of which should be considered and evaluated separately when it comes to determining the validity of a psychodiagnostic technique. Validity may be theoretical and practical (empirical), internal and external.

Theoretical validity is determined by the correspondence between the indicators of the quality under study obtained by the given method and the indicators obtained by other methods with which a theoretically justified relationship should exist. Theoretical validity is checked by the correlations of indicators of the same property obtained using different methods based on, or proceeding from, the same theory.

Empirical validity is checked by the correspondence of diagnostic indicators to the real behavior, observed actions, and reactions of the subject. If, for example, with the help of some methodology we evaluate the character traits of a given subject, then the methodology is considered practically or empirically valid when we establish that the person behaves in life exactly as the methodology predicts, i.e., in accordance with the assessed trait. By the criterion of empirical validity, a methodology is checked by comparing its indicators with real life behavior or with the results of people's practical activities.

Internal validity means the compliance of the tasks, subtests, judgments, etc. contained in the methodology with the overall goal and design of the methodology as a whole. A methodology is considered internally invalid, or insufficiently internally valid, when all or part of the questions, tasks or subtests included in it do not measure what is required of this methodology.

External validity is approximately the same as empirical validity, with the only difference that in this case we are talking about the relationship between the indicators of the methodology and the most important, key external features related to the behavior of the subject.


When creating a methodology, it is difficult to immediately assess its validity. Usually, the validity of a methodology is checked and refined in the course of its fairly long use, especially since we are talking about verification from at least the four sides described above.

There is no single universal approach to the definition of validity. Depending on which side of validity the researcher wants to consider, different methods of proof are also used. In other words, the concept of validity includes its different types, which have their own special meaning.

There are four types of external criteria used to prove validity:

1) performance criteria (these may include the amount of work performed, academic performance, time spent on training, and the rate of growth of qualifications);

2) subjective criteria (these include various kinds of responses that reflect a person's attitude to something or someone, his opinions, views, and preferences; subjective criteria are usually obtained using interviews, questionnaires, and surveys);

3) physiological criteria (they are used in studying the influence of the environment and other situational variables on the human body and psyche; the pulse rate, blood pressure, skin electrical resistance, fatigue symptoms, etc. are measured);

4) accident criteria (used when the purpose of the study concerns, for example, the problem of selecting for work those persons who are less prone to accidents).

The search for an adequate and easily identifiable criterion is one of the most important and difficult tasks of validation.

There are several types of validity, reflecting both the peculiarities of diagnostic methods and the temporal status of the external criterion:

1) Validity "by content". This technique is mainly used in achievement tests. Usually, achievement tests do not include all the material that students have passed, but some small part of it (3-4 questions). Is it possible to be sure that the correct answers to these few questions testify to the assimilation of all the material. This is what the content validity check should answer. To do this, a comparison of success on the test with expert assessments of teachers (for this material) is carried out. Validity "by content" also applies to criteria-based tests. Sometimes this technique is called logical validity.

2) Validity "by simultaneity", or current validity, is determined by an external criterion, according to which information is collected simultaneously with
experiments according to the tested method. In other words, data is collected that relates to the present (progress during the trial period, performance during the same period, etc.). They correlate with the results of success on the test.

3) "predictive validity . It is also determined by a fairly reliable external criterion, but information on it is collected some time after the test. The external criterion is usually the ability of a person, expressed in some assessments, to the type of activity for which he was selected based on the results of diagnostic tests. Although this technique is most appropriate for the task of diagnostic techniques - the prediction of future success, it is very difficult to apply it. The accuracy of the forecast is inversely related to the time given for such forecasting. The more time passes after the measurement, the more factors must be taken into account when assessing the prognostic significance of the technique. However, it is almost impossible to take into account all the factors that affect the prediction.

4) "Retrospective" validity. It is determined based on the criteria
reflecting an event or state of quality in the past. It can be used to quickly obtain information about the predictive capabilities of the technique. Thus, to test the extent to which good scores on an aptitude test correspond to rapid learning, one can compare past grades, past expert opinions, and so on. in individuals with high and low diagnostic indicators at the moment.

When presenting data on the validity of a developed methodology, it is important to specify exactly what type of validity is meant (by content, by simultaneity, etc.). It is also desirable to report information on the number and characteristics of the individuals on whom validation was carried out. Such information allows the researcher using the technique to decide how valid it is for the group to which he intends to apply it. As with reliability, it must be remembered that a technique may have high validity in one sample and low validity in another. Therefore, if the researcher plans to use the methodology on a sample of subjects that differs significantly from the one on which the validity test was carried out, he needs to repeat such a test.

In addition to the types of validity, it is important to know the validity criteria - the main signs by which one can judge in practice whether a given technique is valid or not. These criteria may be the following:

1. Behavioral indicators - reactions, actions and deeds of the subject in various life situations.

2. Achievements of the subject in various activities: educational, labor, creative and others.

3. Data indicating the performance of various control samples and tasks.

4. Data obtained using other methods, the validity of which or the relationship of which with the tested method is considered to be reliably established.

12. The concepts of validity and reliability in psychodiagnostics.

Reliability is one of the criteria of the quality of test methods. A. Anastasi, Cronbach, and Thorndike contributed to the development of this criterion.

Reliability is the relative constancy, stability, and consistency of test results across initial and repeated measurements on the same subjects. The measurement must be repeated on the same sample; differences are possible, but they should be minor. Thus, reliability speaks to the accuracy of the measurement and the stability of the results with respect to random factors.

The overall spread of scores can be the result of two groups of causes:

    The variability inherent in the trait itself.

    Environmental factors that may affect the test results.

Reliability calculation procedures:

    Administering the same form of the test twice (retest reliability) and calculating the correlation coefficient between the two administrations. The interval between testings is from one to several months.

    Administering parallel forms of the test. When conducting a study using an equivalent form of the test, the specialist verifies the correctness of the selected feature. For the test forms to be considered equivalent, both must contain the same number of tasks, the tasks must be unified and matched in difficulty, and the forms must have the same mean and standard deviation. Two approaches are used to calculate reliability using parallel forms of tests:

    The same subjects are examined first with one form of the test and then with the other; if the correlation coefficient is greater than 0.7, reliability is considered high.

    The subjects are divided into two groups, one group passes test A, the other test B, a week later - vice versa.

    Splitting the test and calculating the correlation coefficient. Subjects perform two equivalent parts of the test: all even-numbered tasks fall into one part, odd-numbered ones into the other. This procedure shows the internal consistency of the test and the adequacy of the selection of questions. The correlation coefficient between the two halves is then calculated.

The reliability coefficient corresponds to the Spearman or Pearson correlation coefficient.
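The odd/even splitting procedure can be sketched as follows. One detail worth noting: the half-test correlation is usually stepped up to full-test length with the Spearman-Brown formula r_full = 2r / (1 + r); that correction is standard psychometric practice rather than something stated in the text above, and the item scores below are invented:

```python
def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical item scores: one row per subject, one column per task (1 = correct).
items = [
    [1, 1, 0, 1, 0, 1],
    [1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
]

# Odd-numbered tasks (1st, 3rd, 5th, ...) vs even-numbered tasks (2nd, 4th, ...).
odd_totals  = [sum(row[0::2]) for row in items]
even_totals = [sum(row[1::2]) for row in items]

r_half = pearson_r(odd_totals, even_totals)
r_full = 2 * r_half / (1 + r_half)   # Spearman-Brown step-up to full length
print(round(r_half, 3), round(r_full, 3))
```

The stepped-up value is always at least as large as the half-test correlation, since halving a test shortens it and shorter tests are less reliable.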

Factor-variance reliability is a method of determining reliability based on the analysis of the variance of test results. The reliability of a test corresponds to the ratio of the true variance (i.e., the variance of the factor under study) to the empirically obtained variance. The latter is the sum of the true variance and the variance of the measurement error. The factor-analytic approach to defining reliability further decomposes the variance of the true indicator (J. P. Guilford, 1956).

The variance of the true indicator, in turn, may consist of the variance of the factor common to groups of similar tests, of specific factors peculiar to tests of a particular focus, and of factors inherent in the particular test method. Therefore, the total variance of the test is equal to the sum of the variances for the general, specific, and unique factors, plus the error variance.
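The decomposition just described can be expressed numerically. In the sketch below the component values are invented; reliability is taken, as the text states, as the share of true (non-error) variance in the total:

```python
# Hypothetical variance components of a test score (arbitrary units).
var_general  = 0.50   # factor common to a group of similar tests
var_specific = 0.20   # factors specific to tests of this focus
var_unique   = 0.10   # factors inherent in this particular method
var_error    = 0.20   # measurement error

var_true  = var_general + var_specific + var_unique
var_total = var_true + var_error

reliability = var_true / var_total
print(round(reliability, 2))
```

Here 80% of the total variance is attributable to true (systematic) sources, so the reliability estimate is 0.8.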

The factor-variance method for determining reliability is suitable for evaluating an already factorized test, but not for tests that measure a wide range of different parameters, since some of them may not fall within the established area of validity of the methodology.

Reliability and allowable measurement error:

Reliability is assessed through the criterion of measurement error. Error is a statistic that characterizes the degree of accuracy of individual measurements. It is assumed that for any trait each individual has a true score, and any indicator obtained in the test differs from its true value by some random error. If a person is tested several times, the obtained indicators scatter around the true value within certain limits.

This fluctuation may depend on systematic and random errors. The causes of systematic errors may be incorrect testing, non-compliance with the procedure, inaccuracy in processing, or low validity of the method. Random errors associated with the human factor are also likely; if such failures are not accounted for, the methodology cannot be considered accurate. With a large number of observations, individual estimates form a distribution from which the measurement errors can be revealed. The measurement error is determined by statistical methods as the standard deviation associated with the dispersion of the distribution of individual measurements. The error should not exceed 5%.
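In classical test theory, the spread of an individual's observed scores around the true value is summarized by the standard error of measurement, SEM = SD * sqrt(1 - r_xx), where r_xx is the reliability coefficient. This formula is standard psychometrics rather than something given in the text above, and the numbers below are illustrative:

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM: the standard deviation of random error around a true score."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical IQ-style scale: standard deviation 15, reliability coefficient 0.91.
sem = standard_error_of_measurement(sd=15, reliability=0.91)
print(round(sem, 2))
```

For these values the SEM is 4.5 points: assuming normally distributed error, roughly two thirds of repeated measurements of the same person would fall within ±4.5 points of the true score.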

Validity:

Validity is the ability of a test to measure what it is intended to measure. This concept refers not to the test itself but to its purpose. A test may be reliable but not valid; but if a test is valid, then it is reliable.

Sources of ideas of validity:

    The first ideas appeared even before the creation of tests. Often, researchers have linked research findings to future successes. For example, Pythagoras connected thinking and speech with the help of intuition.

    The idea of ​​the need for practical verification of the suitability of the test. Outside of practice, the problem of validity cannot exist.

    Philosophical ideas: truth is the correspondence of thought to reality. The criterion of truth is utility.

    The measurements we make are not obvious, they require a theoretical basis. Theoretical = empirical validity.

    Development of statistical science - correlation and factor analysis.

Five sources gave rise to five kinds of validity.

In the early 20th century, intuition played the leading role: if the creator of a test was a famous person, its validity was taken on faith, on the author's word.

In the 1920s and 1930s, the demands of practice grew, tests began to be created based on empirical sources. Three empirical approaches have been developed:

    All applicants for a job are tested. Over time, their productivity and efficiency are measured, and the indicators are then correlated. Thus tests were validated by their practical usefulness.

    First, those who are already working successfully are tested, and then the results of this testing are correlated with the results of candidates. If there is a relationship, the test is valid.

    The work of Binet and Simon. To make sure that their test measured intelligence, all test tasks were administered to two groups of children selected not by psychologists but by teachers: group 1 contained children with high mental abilities, group 2 children with unexpressed abilities. This method was called "expert". Testing was then carried out, and if in both groups the majority answered in accordance with the authors' expectations, the test was recognized as valid.

Thus, empirical methods of substantiating validity prevailed until the 1950s. Then it came to be believed that validity can be proved not only through practice, but also through analysis of the correspondence between theory and practice. Content validity involves comparing the test content with the study program; the comparison is easier when the program highlights the problem, the purpose, and the key concepts. Conceptual (construct) validity arises because psychologists are interested in relating scientific concepts to empirically observed facts.

In empirical methods of proving validity, a special role is played by external criteria that serve as evidence of validity. The American psychologists Tiffin and McCormick analyzed the use of external criteria and identified four types:

    Performance criterion - the amount of work performed, the rate of growth of skill

    Subjective criterion - the inclusion of various types of answers that reflect the attitude towards something.

    Physiological criterion - used in studying the influence of the environment on the human body and psyche.

    Accident criterion - applied when the goal of the study concerns, for example, selecting persons less prone to accidents.

External criteria must meet the requirements of relevance, freedom from interference, and reliability. Relevance is the semantic correspondence between the test and an independent, vitally important criterion. Freedom from interference (contamination) is important because performance is influenced both by the person and by working conditions. Reliability is the consistency of results.

Diagnostic (concurrent) validity reflects the ability of the test to differentiate subjects according to the trait being studied. The analysis of diagnostic validity involves establishing the correspondence between the test indicators and the real state of the subject's psychological characteristics at the time of the examination. An example of determining this type of validity is a study using the method of contrast groups. Administering an intelligence test to normally developing children and to their peers with intellectual disabilities reveals profound quantitative and qualitative differences in how the two groups perform the tasks. The reliability with which the test data differentiate the children of the first and second groups characterizes the diagnostic validity of the assessment of mental development obtained with this technique.
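
The contrast-groups method can be expressed numerically as a point-biserial correlation between group membership and test score: the higher the coefficient, the more reliably the test differentiates the groups. A sketch with invented scores for two hypothetical groups:

```python
from statistics import mean, pstdev

# Hypothetical contrast-group data: test scores of typically developing
# children (group 1) and peers with intellectual disabilities (group 2).
group1 = [28, 31, 27, 33, 30, 29]
group2 = [15, 18, 14, 20, 17, 16]

scores = group1 + group2
n = len(scores)
m1, m0 = mean(group1), mean(group2)
p = len(group1) / n          # proportion of subjects in group 1
s = pstdev(scores)           # population standard deviation of all scores

# Point-biserial correlation between group membership (1 vs 0) and score.
r_pb = (m1 - m0) / s * (p * (1 - p)) ** 0.5
print(f"group means: {m1:.1f} vs {m0:.1f}, point-biserial r = {r_pb:.2f}")
```

A coefficient near 1 means almost perfect separation of the groups; overlapping score distributions would pull it toward zero.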

Content validity (internal, logical) is a body of information about the representativeness of the test items with respect to the measured properties and features. One of the main requirements for validating a technique in this respect is that the test content reflect the key aspects of the psychological phenomenon under study. If the area of behavior or the feature is very complex, content validity requires that the test tasks represent all the most important constituent elements of the phenomenon under study.

Differential validity considers the internal relationships between the psychological factors diagnosed with a psychodiagnostic technique. Differential validity can be illustrated by tests of interests, which are usually moderately correlated with indicators of overall academic performance but are associated to varying degrees with performance in individual disciplines. Differential validity is especially important as an indicator of the diagnostic value of techniques used in professional selection.

Illusory (false) validity is the illusion that a conclusion based on the test results matches the personal characteristics of the subject. It arises from the use of extremely general formulations that apply to almost anyone examined, such as "reasonable in choosing a goal" or "strives for a better life". Such statements are accepted by almost all people as accurate descriptions of their personalities, which creates the basis for the activities of all kinds of fortune-tellers and soothsayers.

Incremental validity (from English incremental - increment, gain) is one of the components of criterion validity, the predictive validity of the test, reflecting the practical value of the technique in selection. Incremental validity can be quantified by means of a validity coefficient.

Consensual validity is a type of validity based on establishing a connection (correlation) between test data and data received from external experts who are well acquainted with the individuals tested. The concept and procedure of consensual validity were introduced by R. McCrae in 1982 to make it possible to validate personality questionnaires, which is often difficult (and sometimes impossible) because of the lack of the criteria needed to establish validity.

Construct validity is one of the main types of validity, reflecting the degree to which the studied psychological construct is represented in the test results. Practical or verbal intelligence, emotional instability, introversion, speech comprehension, attention switching, etc. can act as the construct. In other words, construct validity delineates the area of the theoretical structure of psychological phenomena measured by the test.

However, it should be noted that, unlike criterion validation, the analysis of construct validity does not require a high degree of connection between the results of the two tests. If it turns out that the new and reference tests are almost identical in content and results, and the developed technique has no advantages of brevity or ease of application, this amounts to mere duplication of the test, justified only from the point of view of creating a parallel test form. The point of the construct-validation procedure is to establish both the similarities and the differences between the psychological phenomena measured by the new test and those measured by the known one.

An important aspect of construct validity is internal consistency, which reflects the extent to which the individual items (tasks, questions) making up the test material are subordinate to the main direction of the test as a whole and focused on studying the same constructs. Internal consistency is analyzed by correlating the answers to each task with the overall test result. An important place in determining construct validity belongs to studying the dynamics of the measured construct; here one can rely on hypotheses about its age-related development, the impact of training, education, mastering a profession, etc.

Criterion validity is a set of characteristics that includes concurrent and predictive validity and reflects the correspondence of the diagnosis and prognosis to a certain range of criteria of the phenomenon being measured. In criterion validation, direct measures of the quality under study serve as criteria independent of the test results, such as the level of achievement in some activity, the degree of development of an ability, the severity of a certain personality trait, etc. When validating achievement tests, the measurement result is compared with teachers' opinions about the subjects' knowledge in a certain area, with academic assessments, control tests, etc. When validating career-guidance tests and techniques, test scores are compared both with expert assessments by colleagues and managers and with objective indicators of achievement in the professional field.

Obvious validity is the idea of a test, its scope, effectiveness, and predictive value that arises in the subject or another person who has no special information about the nature and goals of the technique. Obvious validity is not a component of objectively established validity. At the same time, high obvious validity is in most cases highly desirable: it acts as a factor that motivates subjects to undergo the examination and contributes to a more serious and responsible attitude both to completing the test tasks and to the conclusions formulated by the psychologist.

A sufficient level of obvious validity is especially significant for techniques used to examine adults. The ideas that subjects and users of psychodiagnostic information form about obvious validity are largely determined by the name of the technique, since this part of the information about the test is the most accessible to non-specialists. Obvious validity is significantly improved by the use of understandable formulations and terms, as well as tasks that are natural in content and take into account the age, gender, and professional specifics of the subjects. An inadequately overestimated obvious validity contributes to a more pronounced manifestation of the criterion-contamination effect.

Obvious validity is sometimes called external validity (face validity) or "confidence" validity (faith validity).

Validity by age differentiation is one of the components of construct validity, associated with the age dynamics of changes in the quality under study. Characterizing construct validity here consists in determining the correspondence of the test results to the theoretically expected and practically observed age-related changes in the given construct or property.

Predictive (prognostic) validity is information about the degree of accuracy and validity with which the technique (test) allows one to judge the diagnosed psychological quality a certain time after the measurement. Predictive validity reflects the time interval over which such a judgment remains justified. Information about predictive validity bears most directly on revealing the predictive power of the technique, on clarifying how well-founded its immediate and more distant forecasts are, and on analyzing the significance of the test indicators when extrapolating the results into the future.

Not only indicators of actual behavior but also expected results of activity, treatment, training, etc. can act as validation criteria. At the same time, the difference between the two types of criterion validity is associated not only with the time limits of the criterion comparisons: current validity is connected with the present state of the quality on the one hand, and predictive validity with the development of the quality or with success in the activity on the other.

The importance of predictive-validity indicators in analyzing test procedures aimed at selection is underlined by the introduction of the special concept of incremental validity. This indicator of predictive validity shows how much using the given test improves the selection procedure compared with the traditional one (based only on formal information about previous activities, analysis of personal files, and interviews).

The complex of content-validity information traditionally has the greatest value for tests that examine activities close to or coinciding with real ones (most often educational or professional). The activity being studied is, as a rule, synthetic in nature, consisting of many, sometimes heterogeneous, factors (manifestations of personal characteristics, a set of necessary knowledge and skills, specific abilities, etc.). Therefore, one of the most important tasks in creating an adequate model of the tested activity is selecting tasks that cover the main aspects of the phenomenon under study in the proportion in which they occur in the real activity as a whole.

Current (diagnostic, concurrent) validity is a characteristic of a test that reflects its ability to distinguish subjects on the basis of the diagnostic feature that is the object of study in the given technique. Levels of general abilities, aspirations, verbal intelligence, anxiety, etc. can serve as such features. In a narrower sense, current validity is the establishment of the correspondence of the results of a validated test to an independent criterion reflecting the state of the quality studied by the test at the time of the study.

A peculiar indicator of current validity is the body of information about how convenient and economical the test is compared with obtaining information about the quality under study from other sources (observation, analysis of objective data, expert assessment, etc.).

Ecological validity is the validity of the test in relation to the measured property in the context of a particular situation. Ecological validity is a property of the test manifested in the fact that its use in solving different practical problems leads to qualitatively different interpretations of the test results (V. N. Druzhinin, 1990).

Empirical validity is the set of test-validity characteristics obtained by a comparative statistical method of evaluation. It relates mainly to the field of criterion validity and its two types: current (concurrent) validity and predictive validity. Whereas content validity is assessed through various qualitative procedures (descriptive methods, expert judgments, and other sources of information used to judge whether the test tasks correspond to the content of what is measured), empirical validity is always measured by statistical correlation. A correlation analysis is carried out between two series of values: the test scores and the indicators of an external parameter of the studied property (or the results of another test whose validity is known).