The Logic behind the ACT
Steve Johnson
Intelligence test scores are always based on the correctness of the answers (right or wrong) and on the time it took to give them.
Correctness
There are three reasons why an answer may be incorrect:
1. Lack of knowledge and, in the case of multiple-choice items, incorrect guessing or copying (see van der Ven, 1992a and 1992b, and van der Ven & Ellis, 2000). A special case of lack of knowledge is mind set.
2. Inaccuracy due to working too fast.
3. Lack of time. This applies only to tests or items that are administered with a time limit.
Time
Time is obviously an important factor, in time-limit tests as well as in work-limit tests. In time-limit tests the total number of items correct equals the total number of items answered (or attempted) minus the total number of incorrect items. In work-limit tests, such as the Block Design subtest of the Wechsler test, the final test score depends partly on the time it took to complete the items.
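The bookkeeping for a time-limit test can be written out as a minimal sketch. The function name and the example counts are illustrative assumptions and are not taken from any test manual.

```python
def time_limit_score(attempted: int, incorrect: int) -> int:
    """Number of items correct in a time-limit test.

    Hypothetical helper: correct = attempted - incorrect, as stated in the text.
    """
    if attempted < 0 or incorrect < 0:
        raise ValueError("counts must be non-negative")
    if incorrect > attempted:
        raise ValueError("incorrect items cannot exceed attempted items")
    return attempted - incorrect


# Illustrative numbers only: 40 items attempted, 6 of them answered incorrectly.
score = time_limit_score(attempted=40, incorrect=6)
print(score)
```

Resting between items lowers the number attempted and therefore, other things being equal, the score.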
Knowledge
Knowledge, including mind set, should not play any part in intelligence tests. Whatever opinion one may have about what intelligence tests should measure, it can never be knowledge; measuring knowledge is the purpose of achievement tests. If knowledge is not allowed to play any part, then intelligence tests should use only very simple problems, which the testee will always solve when given unlimited time.
This has led to the introduction of so-called speed tests, in which the final test score depends only on accuracy and speed, and knowledge does not play a part. Parallel to this development, however, psychologists started to notice that the variability of performance might also be of great importance (see Supplement 1). Therefore, they introduced so-called (attention) concentration tests.
Speed and concentration tests
Both speed tests and concentration tests consist only of simple mental tasks that are easy to perform and do not require any specific knowledge. Mental set does not play a role either, as it is quite obvious what must be done and how it must be done. In speed tests, no attempt is made to time individual items or groups of items. Only gross measures are used, such as the number of items correct within a limited test time or the total time needed to complete the test. In concentration tests, the subject's performance is assessed from a time series: either the series of response times for individual items or groups of items, or the series of response counts in fixed periods of time. A well-known example of the former is the Bourdon-Vos test (Vos, 1988), a children's version of the Bourdon-Wiersma test (see Huiskamp and de Mare, 1947, and Kamphuis, 1962) used in the Netherlands. A well-known example of the latter is the Pauli test (see Arnold, 1964) used in Germany, a single-digit addition task in which the time series consists of the number of additions per minute during a thirty-minute period. The difference between speed tests and concentration tests lies not in test content or test instruction ("Work as quickly and as accurately as possible!") but in the way performance is registered. Speed tests use only a gross measure, such as the total number of correct items or the total time needed for the test. Concentration tests assume that the relevant information is contained in the time series.
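The difference in performance registration can be illustrated with a minimal sketch. The per-minute counts below are invented for two hypothetical testees on a shortened, Pauli-like addition task, and the range is used only as one simple assumed summary of a series.

```python
# Hypothetical per-minute addition counts for two testees (shortened, Pauli-like task).
steady = [53, 53, 53, 53, 53, 53, 53, 53]
fluctuating = [52, 55, 50, 57, 49, 56, 51, 54]


def speed_registration(counts):
    """Speed test: keep only one gross measure, the total number of responses."""
    return sum(counts)


def concentration_registration(counts):
    """Concentration test: keep the whole series; the range is one simple summary."""
    return {"series": list(counts), "range": max(counts) - min(counts)}


# Gross measure per testee.
print(speed_registration(steady), speed_registration(fluctuating))
# Spread of the series per testee.
print(concentration_registration(steady)["range"],
      concentration_registration(fluctuating)["range"])
```

In this constructed example both testees obtain the same gross score, so a speed test cannot tell them apart, whereas the retained series differ clearly.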
Practice
Another property of all intelligence tests is that practice during the test may also play a very important role in the final test score. Owing to previous experience, some testees may already be more used to the task than others (see Supplement 2). In addition, some testees may get used to the task quickly while others do so more slowly (see Supplement 2). Subjects who differ in their rate of learning may well show the same performance once learning has come to an end. If practice were allowed, those subjects would obtain the same test result. However, for all intelligence tests, including the speed and concentration tests mentioned above, pre-test practice is not allowed. In fact, everything is done to prevent people from familiarizing themselves with the test. As a consequence, subjects may get different scores merely because of differences in their rate of learning, whereas they would have obtained the same score given ample time to practice. Practice, like knowledge, should therefore not play a part in intelligence tests, which means that subjects should be given ample opportunity to practice the tests before actually taking them.
Reminiscence
In speed and concentration tests knowledge does not play a part, but practice still does. To circumvent the practice factor, one could propose using only the last part of the test for the final scoring. For example, in the Bourdon-Vos test, which has 33 lines in total, one could decide to use only, say, the last 10 lines, assuming that learning has come to an end by then. In that case, however, one ignores the effect of reminiscence. The study of reminiscence has a long history, which is briefly described in Eysenck and Frith (1977, chapter 1):
"Reminiscence is a technical term, coined by Ballard in
1913, denoting improvement in the performance of a partially
learned act that occurs while the subject is resting, that is, not
performing the act in question."
(Eysenck and Frith, 1977, p.3).
The reality of the phenomenon was first demonstrated experimentally by Oehrn (1895). In experiments on reminiscence the same task is always administered two or more times, and the main interest is in the effect of the rest periods between test administrations. Learning is apparent not only within tests but also, and very distinctly, across tests. These effects have been clearly demonstrated by van Breukelen et al. (1987, p. 187, Fig. 3) and Jansen (1990, p. 78, Fig. 8). Across test administrations, decreasing reaction time curves gradually gave way to increasing curves, and the average reaction time at the end of the task decreased. If one wants to cancel out all learning effects, one should therefore also take the possible reminiscence effect into account. This means that the testee should be given the opportunity, and should also be encouraged, to do the test several times until complete habituation is reached.
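The proposal discussed above, scoring only the last 10 of the 33 lines of a single administration, can be sketched as follows. The line times are invented (a gradual speed-up followed by a plateau) and the function name is hypothetical; only the line counts come from the text.

```python
from statistics import fmean

TOTAL_LINES = 33   # number of lines in the Bourdon-Vos test, as stated in the text
SCORED_LINES = 10  # "let us say the last 10 lines"

# Hypothetical line times in seconds: 23 lines of gradual speed-up, then a plateau.
line_times = [30.0 - 0.4 * i for i in range(TOTAL_LINES - SCORED_LINES)]
line_times += [20.0] * SCORED_LINES


def last_part_score(times, n_last):
    """Mean time per line over the final n_last lines only."""
    if not 0 < n_last <= len(times):
        raise ValueError("n_last must be between 1 and the number of lines")
    return fmean(times[-n_last:])


whole_test = fmean(line_times)
last_part = last_part_score(line_times, SCORED_LINES)
print(f"whole test: {whole_test:.2f} s/line, last {SCORED_LINES} lines: {last_part:.2f} s/line")
```

Such a score removes within-test learning only. It says nothing about the improvement that occurs across administrations, which is why repeated testing until habituation is still needed.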
Inaccuracy
It is well known that subjects are able to trade speed for accuracy. A task can be done faster at the cost of accuracy, or more accurately at the cost of speed. This phenomenon is referred to as the speed-accuracy trade-off and can be described by the so-called speed-accuracy trade-off function, which gives the probability of a correct answer as a function of the time needed to answer the item. It is a monotonically increasing function and it may vary from subject to subject. Depending on the subject, the function may be shifted to the right (a lower ability) or to the left (a higher ability). As long as one does not know the ability of the subject, one cannot tell whether the observed reaction times would be high or low once corrected for errors. Such a correction is possible only when the trade-off function of the subject in question is known, which is generally not the case. The whole problem can be circumvented by accepting only error-free results. The design of a speed-accuracy trade-off experiment and the statistical analysis of its results are discussed in Donders (1997). The author also reports some actual experiments using problems of the kind that occur in actual intelligence tests.
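The idea of a monotonically increasing trade-off function that is shifted per subject can be made concrete with a small sketch. The logistic shape, the parameter names and all numbers are assumptions chosen for illustration; the text does not specify a functional form.

```python
import math


def p_correct(time_s: float, shift_s: float, scale_s: float = 1.0) -> float:
    """Assumed logistic trade-off curve: probability of a correct answer
    as a function of the time spent on the item.

    shift_s moves the curve along the time axis: a larger shift (curve further
    to the right) stands for a lower ability, a smaller shift for a higher one.
    """
    return 1.0 / (1.0 + math.exp(-(time_s - shift_s) / scale_s))


# Two hypothetical subjects who both spend 4 seconds on an item.
higher_ability = p_correct(4.0, shift_s=2.0)  # curve shifted to the left
lower_ability = p_correct(4.0, shift_s=5.0)   # curve shifted to the right
print(round(higher_ability, 3), round(lower_ability, 3))
```

The same observed time corresponds to different accuracies for the two subjects, which is why reaction times cannot be corrected for errors without knowing the individual curve.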
Continuous work vs. discontinuous work
In concentration tests, as well as in speed tests with a time limit, the testee is supposed to respond in a self-paced, continuous manner. The person controls his or her own speed: the response to each part (or item) of the test releases the next one in the sequence. He or she is not supposed to take rest pauses between parts (or items). In time-limit tests, resting between successive items would reduce the number of items answered and therefore also the number of items correct. Especially in concentration tests, however, psychologists could have opted for a discontinuous format, in which rest periods are deliberately interposed, instead of the continuous format. This never was an option, probably because of the unspoken assumption that a self-paced, continuous form of the test would be sensitive to possible long-term fluctuations in attention, whereas a discontinuous format would not. In the experiments reported by van Breukelen et al. (1987) and Jansen (1990), several experimental conditions were run, among them a continuous work condition and a discontinuous work condition. Stimuli were presented in blocks, and only block reaction times were used for further analysis. In the continuous work condition, no rest pauses were given between blocks of stimuli. In the discontinuous work condition, rest pauses of three seconds were interposed between blocks of stimuli. After these three seconds the subjects could prolong the pause until they chose to resume the task by pressing a button. In both conditions the task was overlearned. When the interposed rest periods were sufficiently long, the result was a flat (no trend) reaction time curve. So it clearly matters a great deal whether one uses the continuous or the discontinuous format.
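Whether a series of block reaction times is "flat (no trend)" can be checked, for example, with the least-squares slope of time against block number. This is a minimal sketch of one possible check, not the analysis used in the cited experiments, and both series are invented.

```python
def trend_slope(block_times):
    """Least-squares slope of block reaction time against block index (0, 1, 2, ...)."""
    n = len(block_times)
    if n < 2:
        raise ValueError("at least two blocks are needed to estimate a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(block_times) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(block_times))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx


# Hypothetical block reaction times in seconds.
rising_series = [10.0, 10.4, 10.8, 11.2, 11.6]  # a series with an upward trend
flat_series = [10.1, 9.9, 10.0, 9.9, 10.1]      # a series without a trend

print(round(trend_slope(rising_series), 3), round(trend_slope(flat_series), 3))
```

A slope near zero indicates a flat curve; fluctuations around the level can still be present, as in the second series.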
Inhibition theory
As explained above, the ACT may be considered an experiment in which unintended factors that should not play a part in intelligence tests, such as knowledge, practice and inaccuracy, are cancelled out. This is typical of experimentation, which is the second most fundamental tool of science. What one finally has is a type of concentration test that consists of an overlearned, continuous response task, also referred to as an overlearned, prolonged work task. What one finally obtains is a time series of consecutive reaction times. This time series has properties that have to be explained by a theory. The theory concerned is known as Inhibition Theory and is described by Smit and van der Ven (see van der Ven & Smit, 1982; van der Ven, Smit & Jansen, 1989; Smit & van der Ven, 1995; and van der Ven, 2001). Theory is the third most important tool of science (observation being the first).
Many psychologists, however, are not aware of what exactly a theory is. Supplement 3 therefore gives a short description of how science works and what a theory amounts to in practice.
To be continued...
References
Arnold, W. (1975). Der Pauli-Test. New York: Springer-Verlag.
Breukelen, G.J. van, Jansen, R.W., Roskam, E.E., Ven, A.H. van der & Smit, J.C. (1987). Concentration, speed and precision in simple mental tasks. In E.E. Roskam & R. Suck (Eds.), Progress in mathematical psychology. Amsterdam: North-Holland.
Donders, A.R.T. (1997). The validity of basic assumptions underlying models for time limit tests. Doctoral Dissertation. University of Nijmegen, the Netherlands.
Eysenck, H.J. and Frith, C.D. (1977). Reminiscence, motivation and personality. London: Plenum Press.
Huiskamp, J. and Mare, H. de (1947). [Dutch Journal of Psychology], 2, 75-78.
Jansen, R.W.T.L. (1990). Mental Speed and Concentration. Doctoral Dissertation. University of Nijmegen, the Netherlands.
Kamphuis, G.H. (1962). Een bijdrage tot de geschiedenis van de Bourdon-test [A contribution to the history of the Bourdon test]. Nederlands Tijdschrift voor de Psychologie [Dutch Journal of Psychology], 17, 247-268.
Oehrn, A. (1896). Experimentelle Studien zur Individualpsychologie [Experimental research on the study of individual differences]. Psychologische Arbeiten, 1, 92-151.
Smit, J.C. & van der Ven, A.H.G.S. (1995). Inhibition in Speed and Concentration Tests: The Poisson Inhibition Model. Journal of Mathematical Psychology, 39, 265-273.
Spearman, C. (1927). The Abilities of Man. London: MacMillan.
Ven, A.H.G.S. van der & Smit, J.C. (1982). Serial Reaction Times in Concentration Tests and Hull's Concept of Reactive Inhibition. In Micko, H.C. and Schulz, U. (Eds.), Formalization of Psychological Theories. Proceedings of the 13th European Mathematical Psychology Group Meeting, Bielefeld. Report of the Universitaet Bielefeld, Schwerpunkt Mathematisierung, D-4800 Bielefeld, F.R. Germany.
Ven, A.H.G.S. van der, Smit, J.C. & Jansen, R.W. (1989). Inhibition in prolonged work tasks. Applied Psychological Measurement, 13, 177-191.
Ven, A.H.G.S. van der (1992a). Item Homogeneity in Verbal Tests: A Rasch Analysis of Amthauer's Verbal Tests. Educational and Psychological Measurement, 52, 623-639.
Ven, A.H.G.S. van der (1992b). Item Homogeneity in a Spatial Test: A Rasch Analysis of Amthauer's Cubes Test. European Journal of Psychological Assessment, 8, 189-199.
Ven, A.H.G.S. van der (1998). Inhibition Theory and the Concept of Perseveration. In Cornelia E. Dowling, Fred S. Roberts & Peter Theuns (Eds.), Recent Progress in Mathematical Psychology. London: Lawrence Erlbaum Associates.
Ven, A.H.G.S. van der & Ellis, J.L. (2000). A Rasch Analysis of Raven's Standard Progressive Matrices. Personality and Individual Differences, 29, 45-64.
Ven, A.H.G.S. van der (2001). A Theoretical Foundation of Speed and Concentration Tests. In Frank Columbus (Ed.), Advances in Psychology Research, Volume 4. Hauppauge, NY: Nova Science Publishers.
Vos, P. (1988). De Bourdon concentratietest voor kinderen [The Bourdon concentration test for children]. Lisse: Swets en Zeitlinger.
Supplement 1: Fluctuations in continuous work
Several authors, among them Binet (1900) and Godefroy (1915), stressed the importance of the fluctuation in speed and suggested the mean deviation as a measure of performance. In this connection a study by Hylan (1898) is also worth mentioning. In his experiment B he used a 27 single-digit addition task. He not only pointed to the importance of the fluctuation of reaction times, but was also the first to report gradually increasing (marginally decreasing) reaction time curves (Hylan, 1898, p. 15, Fig. 5). Many authors assumed that the relevant information was to be found in the short-term oscillation of the reaction times. Spearman even considered oscillation to be a separate universal factor, in addition to what he called the general factor (not further identified) and perseveration (Spearman, 1927, p. 327). A typical manifestation of this factor (oscillation)
"... is supplied by the fluctuations which always occur in any
person's continuous output of mental work, even when this is
so devised as to remain of approximately constant difficulty."
(Spearman, 1927, p. 320).
On the next page of his book, Spearman argues that
"... almost any kind of continuous work can be arranged so as
to manifest the same phenomenon. In all cases alike, the output
will throughout exhibit fluctuations that cannot be attributed to
the nature of the work, but only to the worker himself."
(Spearman, 1927, p. 321).
More recently, Jensen (1982), discussing his reaction time experiments, noted that trial-to-trial variability (measured by the standard deviation of the subject's reaction times) frequently surpassed response speed (measured by the mean of the subject's reaction times) as a predictor of intelligence.
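The measures named in this supplement (the mean as an index of speed, and the mean deviation or standard deviation as an index of fluctuation) can be computed as in the following minimal sketch. The reaction times are invented and the helper name is hypothetical.

```python
from statistics import fmean, stdev

# Hypothetical reaction times in milliseconds for one subject.
reaction_times = [500, 520, 480, 510, 490]


def mean_deviation(values):
    """Average absolute deviation from the mean (the 'mean deviation')."""
    centre = fmean(values)
    return fmean(abs(v - centre) for v in values)


speed = fmean(reaction_times)                     # response speed: mean reaction time
fluctuation_md = mean_deviation(reaction_times)   # fluctuation: mean deviation
fluctuation_sd = stdev(reaction_times)            # fluctuation: sample standard deviation
print(speed, fluctuation_md, round(fluctuation_sd, 2))
```

Two subjects with the same mean can differ widely on either fluctuation measure, which is the information a gross speed score discards.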
Supplement 2: The learning curve in continuous work
To be written ...
Supplement 3: The Concept of Scientific Theory
Below is a short outline of how science works in terms of experimentation and theory building. Exactly the same procedure is followed in this study.
The progress of science depends, among other things, on controlled observation and experimentation on the one hand, and on the development of models of structural information (obtained from the observations and experiments) on the other. Structural information means data that are available as patterns or structures. Experimentation and/or controlled observation is needed to eliminate the possible influence of unintended factors. The more such factors play a role in producing the final data, the more complex the way the data come about, and the more complex the models have to be to explain them. Structural information is needed to make predictions possible, and predictions are needed to check the theory empirically. Predictions are about properties of data structures, not about single data points. For example, in the development of Bohr's original atomic theory the data consisted of spectral lines, that is, positional patterns of spectral lines. Experimentation was needed to study the spectral lines of elements rather than of compounds of elements, such as molecules and mixtures of molecules. It is impossible to make an explanatory model for the spectral composition of a single element. The spectral compositions of many elements are needed to develop and test a model that explains the various spectral compositions. A theory (or model) is always about data (coded observed phenomena) and is intended to explain these data. Bohr's original atomic theory is used here to illustrate the relation between data and theory.
At the end of the 19th century, physicists had already assumed that electrons exist inside atoms and that the wiggling of these electrons gives off light and other electromagnetic radiation. But there was still a curious mystery to solve. Physicists would heat up different elements until they glowed, and then direct the light through a prism. If one does that with sunlight, one sees the whole rainbow, because the prism breaks the light into all of its separate colors. But when scientists looked at the light coming off just one element, hydrogen for instance, they did not see the whole rainbow. Instead they just got bright lines of certain colors. (Actually, "color" is not the right term, because only some of the lines were visible, but for now we will just talk about visible light.) Each type of atom gives off a unique set of colors. The colored lines (or spectral lines) are a kind of "signature" for the atoms.
Spectral lines were first seen in the sun's spectrum by William Wollaston in 1802. However, they were not systematically studied until 1814, when a German optician named Joseph von Fraunhofer observed and catalogued them. Fraunhofer carefully recorded the positions of the lines (recorded observations), but he did not attempt to explain (that is, to build a theory of) why they were there. In the late 1850s, the physicist Gustav Kirchhoff decided to investigate further, with the help of the chemist Robert Bunsen, the man who invented the Bunsen burner. They held various substances in the flame of a Bunsen burner. The light emitted from the heated elements was separated into spectra using a prism, and they found that each element had its own unique set of lines. A given element would always produce the same spectrum, which was different from that of any other element.
As already mentioned, each element has a different spectral "signature", and scientists can tell which elements they are looking at just by reading the lines. To explain (theory) the spectral line puzzle (structural information), Bohr came up with a radical model of the atom, which had electrons orbiting around a nucleus. It was already known that electrons could orbit around a positively charged nucleus. To explain the "signature colors", Bohr came up with an extraordinary rule the electrons had to follow: electrons can only be in "special" orbits. All other orbits were simply not possible. Electrons could "jump" between these special orbits, however, and when they jumped they would wiggle a little bit, which would cause radiation. When an electron jumps to a smaller orbit, a little burst of light goes shooting out. When an electron jumps to a higher orbit, a little bit of energy is needed to "bump" the electron up. These little bits of (electromagnetic) energy are called photons. Now one can see why the Bohr model was considered so radical: it said that energy could only change in little jumps. These jumps are called quanta, and that is why this kind of physics is called Quantum Mechanics. According to Bohr's original conception, atoms look like little solar systems with electrons making quantum jumps between special orbits. However, the idea of an electron actually flying around in little circles turned out to have lots of problems, and physicists were eventually forced to discard that model. The concept of "special orbits", however, was extremely useful; it was only the orbits themselves that physicists stopped using. Instead, they began to think of electrons as being in special energy levels.
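The kind of positional pattern that such a model has to reproduce can be made concrete for hydrogen with the Rydberg formula, 1/λ = R_H (1/n₁² − 1/n₂²), where n₁ and n₂ number the lower and upper level of a jump. The sketch below is a minimal illustration that assumes the approximate value R_H ≈ 1.09678 × 10^7 per metre; the function name is hypothetical.

```python
RYDBERG_HYDROGEN = 1.09678e7  # per metre; approximate value for hydrogen (assumed here)


def wavelength_nm(n_lower: int, n_upper: int) -> float:
    """Wavelength in nanometres of the light emitted when an electron jumps
    from level n_upper down to level n_lower (Rydberg formula for hydrogen)."""
    if not 0 < n_lower < n_upper:
        raise ValueError("levels must satisfy 0 < n_lower < n_upper")
    inverse_wavelength = RYDBERG_HYDROGEN * (1 / n_lower**2 - 1 / n_upper**2)
    return 1e9 / inverse_wavelength  # metres to nanometres


# Jumps that end on level 2 give hydrogen's lines in or near the visible range.
for n_upper in (3, 4, 5, 6):
    print(n_upper, round(wavelength_nm(2, n_upper), 1))
```

The point of the example is that the theory predicts a whole pattern of line positions from a single rule, not one isolated data point.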
A theory is often named after the most important latent (unobservable) quantity used to explain the data. In Bohr's case, for example, this was the concept of the "quantum" (Quantum Mechanics). One must understand very well that Quantum Mechanics is not about quanta (plural of quantum) but about spectral lines: to understand the empirical structure of these lines, a theory was developed in which the concept of the unobservable quantum played a central role. Another example is Newton's theory of gravitation. It is about the movement of the planets across the sky. The most important explanatory quantity was "gravitation", which is different for each planet and depends on its mass. Note that a latent quantity, such as Newton's gravitation, has meaning only within the theory itself. Within the theory no answer is given to such questions as what gravitation is or where it comes from. The existence of gravitation is assumed and NOT further explained. For example, when Newton was asked "What is gravitation?", his answer was: Hypotheses non fingo (I frame no hypotheses, that is, I will not make any assumptions about that). Further theory development is needed to answer such questions. In the case of Newton's gravitation, Einstein could answer the question many years later in terms of his theory of relativity. A similar argument holds for Bohr's concept of a quantum.
To be continued ...