Structural information theory
My research on perceptual organization starts from the conglomerate of ideas in structural information theory (SIT). SIT began as a coding model of visual pattern classification which, in interaction with empirical research, developed into a competitive theory of perceptual organization.
Central to SIT is the simplicity principle, which holds that the visual system prefers the simplest interpretation among all possible interpretations of a stimulus. To make quantifiable and verifiable predictions, the interpretations are represented by symbol strings, and the symbol string with the overall simplest code is taken to specify the preferred interpretation. A simplest code enables the reproduction of the stimulus by means of a minimum number of descriptive parameters. It is obtained by capturing a maximum amount of regularity, and it implies a hierarchical stimulus organization in terms of wholes and parts. These wholes and parts are then predicted to be the perceived objects.
My work on SIT focuses on the foundations of its conglomerate of ideas, and it includes:
- An improved complexity metric to predict preferred stimulus interpretations --- see below
- A Bayesian translation which supports the veridicality of simplicity in everyday life --- see below
- A mathematical characterization of visual regularity, which underlies SIT's coding rules --- see below
- Empirical support for this mathematical characterization of visual regularity --- see below
- A transparallel algorithm to compute guaranteed simplest codes of symbol strings --- see below
- Support for the neural plausibility of this transparallel algorithm --- see below
Spin-off:
- A research line concerning detection and detectability of visual regularities --- see below
- A research line concerning cognitive architecture and cognitive processing --- see below
- A research line concerning amodal completion and object perception --- see Rob van Lier
Information measurement
The Morse Code marks the beginning of the Information Age, in the mid-19th century. Since that time, an ongoing problem has been the quantification of the amount of information in objects --- where an object may be anything, including messages and perceptual interpretations. In communication theory, this problem was addressed to minimize the long-term burden on transmission channels. Shannon (1948) came up with a ground-breaking solution to minimize this burden. Following Nyquist (1924) and Hartley (1928), this solution involves a measure of probabilistic information which implies:
The more often an object occurs, the less information it contains.
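As a minimal sketch of this inverse relation, the standard Shannon measure assigns an object with probability of occurrence p an information content of -log2(p) bits. The function name and the example probabilities below are illustrative assumptions, not taken from the text.

```python
import math


def surprisal_bits(p: float) -> float:
    """Probabilistic information, in bits, of an object occurring with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)


# Hypothetical occurrence probabilities, chosen only for illustration.
for p in (0.5, 0.25, 0.01):
    print(f"p = {p}: {surprisal_bits(p):.2f} bits")
```

Frequent objects (large p) come out with few bits, rare objects with many, which is the relation stated above.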
In many domains, however, probabilities of occurrence are unknown, if not unknowable. This drawback prompted a rethinking of information, both in mathematics (most prominently by Kolmogorov, 1965) and in perception research (most prominently by Garner, 1962). In mathematics, this rethinking led to algorithmic information theory (AIT) and, in perception research, it led to structural information theory (SIT). Both AIT and SIT define the complexity of an object by the length of its shortest reconstruction recipe. This involves a measure of descriptive information which implies:
The more regularity an object exhibits, the less information it contains.
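The idea of a shortest reconstruction recipe can be illustrated with a toy measure. The sketch below is neither SIT's coding language nor its complexity metric: it is a hypothetical stand-in that allows only concatenation and repetition, and it counts the symbols that remain in the shortest such description (repeat counts are not charged, which is a simplifying assumption).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def toy_complexity(s: str) -> int:
    """Fewest symbols needed to describe s using only concatenation and repetition."""
    n = len(s)
    if n <= 1:
        return n
    best = n  # fall back on spelling the string out literally
    # Repetition: s is some unit repeated a whole number of times.
    for period in range(1, n // 2 + 1):
        if n % period == 0 and s == s[:period] * (n // period):
            best = min(best, toy_complexity(s[:period]))
    # Concatenation: describe a left part and a right part separately.
    for cut in range(1, n):
        best = min(best, toy_complexity(s[:cut]) + toy_complexity(s[cut:]))
    return best


for string in ("abababab", "aabb", "abcdefgh"):
    print(string, toy_complexity(string))
```

Strings of equal length get a lower score the more repetition they contain, so the regular string needs fewer descriptive symbols than the irregular one.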
In contrast to AIT, SIT quantifies information to differentiate between perceptual organizations, makes a distinction between metrical and structural information (following MacKay, 1950, taking the structural information to be decisive in perception), and does not consider any imaginable regularity but only visual regularities. At first, SIT used a complexity metric that performed well empirically but that was not very compelling theoretically. The mathematical characterization of visual regularities as being transparent holographic regularities (see below), however, paved the way for an improved metric which not only is theoretically compelling but also performs better empirically. Since 1990, this improved metric has been the standard in SIT.
Simplicity versus likelihood
Usually, vision is sufficiently reliable (i.e., veridical, or truthful) to guide action. But what makes vision so reliable? Around 1900, Helmholtz proposed the likelihood principle, which suggests that the visual system selects interpretations most likely to be true, that is, with the highest probability of occurrence in the world. This sounds attractive, but the question, thus far unsolved, is then: how can vision scientists, or the visual system for that matter, acquire knowledge about these probabilities? After all, the tool to assess these probabilities can be nothing other than the visual system itself, so that one would measure still-to-be-explained perceptual preferences rather than probabilities of occurrence in the world.
In the early 20th century, the Gestaltists argued conversely that the visual system follows its own internal rules of perceptual organization. Hochberg and McAlister (1953) proposed that one of these internal rules is the simplicity principle, which suggests that the visual system selects interpretations with the lowest descriptive complexity (see above).
Findings in mathematics and psychology show that descriptive simplicity has a stable quantification. Furthermore, SIT's empirically successful model of amodal completion (van Lier et al., 1994) shows that it is possible to integrate viewpoint-independent factors and viewpoint-dependent factors, both quantified in terms of complexities. A Bayesian translation of this integration, combined with findings in mathematics, suggests that simplicity and likelihood may be far apart for viewpoint-independent factors (Bayesian priors), but also that they are close for viewpoint-dependent factors (Bayesian conditionals), which seem decisive in the everyday perception by a moving observer. This implies that either principle may have guided the evolution of vision: the likelihood principle is a special-purpose principle in that it is highly adapted to one specific world, whereas the simplicity principle is a general-purpose principle in that it is fairly adaptive to many different worlds.
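One way to picture such a Bayesian translation, assumed here rather than quoted from the text, is to map each complexity C to an artificial probability proportional to 2^(-C). Adding a viewpoint-independent (prior) complexity and a viewpoint-dependent (conditional) complexity then corresponds to multiplying a prior by a conditional. The interpretation labels and complexity values below are hypothetical.

```python
def choose_by_simplicity(interps):
    """Pick the interpretation with the lowest summed complexity."""
    return min(interps, key=lambda name: sum(interps[name]))


def choose_by_posterior(interps):
    """Pick the interpretation with the highest posterior under p ~ 2 ** -complexity."""
    weights = {
        name: 2.0 ** -(c_prior + c_cond)  # prior times conditional, unnormalized
        for name, (c_prior, c_cond) in interps.items()
    }
    total = sum(weights.values())
    posterior = {name: w / total for name, w in weights.items()}
    return max(posterior, key=posterior.get), posterior


# Hypothetical (prior complexity, conditional complexity) pairs per interpretation.
interpretations = {"A": (3, 1), "B": (2, 6), "C": (5, 0)}

print(choose_by_simplicity(interpretations))
print(choose_by_posterior(interpretations))
```

Under this mapping, minimizing the summed complexity and maximizing the posterior select the same interpretation by construction. Whether these artificial probabilities are close to real-world probabilities is the separate question discussed above.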
Spin-off:
- Support for SIT's claim that veridicality is an emergent feature of simplicity
- Interaction with parallel research in mathematics --- see invited lectures NIPS 2001 and DIMACS 2003
Transparent holographic regularity
During the past century, formal descriptions of visual regularity relied on the transformational approach, which defines visual regularities as configurations that are invariant under motion (i.e., under rotations or translations). This traditional definition is adequate for object recognition but not for object perception. Therefore, a new formalization was developed, defining visual regularities as transparent holographic configurations; this refers to the following concepts:
- Holographic regularity. This concept singles out regularities that are invariant under growth. For instance, by analogy, a person may grow but retains a symmetrical body shape, and a queue of penguins may grow but remains a repetition of penguins. This illustrates that symmetry and repetition are invariant under growth, so that they qualify as holographic regularities. Among all imaginable regularities, only twenty are holographic.
- Transparent hierarchy. Whereas the concept of holographic regularity applies to single regularities, the concept of transparent hierarchy applies to combinations of regularities. It singles out regularities that, when combined hierarchically with other regularities, still specify hierarchical pattern organizations. Only three of the twenty holographic regularities allow for this and therefore qualify as transparent regularities.
The three transparent holographic regularities are symmetry, repetition, and so-called alternation. Alternation covers, among others, so-called Glass patterns, which consist of randomly positioned but coherently oriented dot pairs.
Spin-off:
- A mathematical foundation of SIT's coding rules --- see above
- An improved complexity metric to predict preferred pattern interpretations --- see above
- A research line concerning detection and detectability of visual regularities --- see below
- A transparallel algorithm to compute guaranteed simplest codes of symbol strings --- see below
- A research line concerning cognitive architecture and cognitive processing --- see below
The holographic approach to goodness
My empirical research focuses on the perceptual goodness (i.e., detectability) of visual regularities such as symmetry, repetition, and Glass patterns. The literature contains many empirical studies of symmetry, which indeed forms a perfect case for investigating general perceptual processes, but more so if it is contrasted with other visual regularities, as is done in the holographic approach.
The holographic approach was introduced in the mid-1990s. It adheres to the idea that insight into the actual detection process starts with insight into the structures to be detected. To this end, it builds on the mathematical characterization of visual regularities as being transparent holographic regularities (see above). This shared property implies that symmetry, repetition, and Glass patterns have different visual structures, namely, a point structure, a block structure, and a dipole structure, respectively.
This unique structural differentiation forms the heart of a quantitative model of the detectability of single, perturbed, and nested regularities. A faithful translation into a qualitative process model relates this quantitative model to general perceptual processes. The two models explain a wide range of phenomena; for instance, thus far, these are the only models explaining the key phenomenon that mirror symmetries and Glass patterns are about equally good and better than repetitions. This research line also extends to the role of visual regularities in perceptual organization, that is, in the formation of perceived objects.
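The ordering of goodness mentioned above can be made concrete with a small numerical sketch. The formulas are an assumption here: they are recalled from the published holographic model, in which goodness is a weight of evidence W = E/n (E identity relations among n elements), and they are not stated in this text. The function names are hypothetical.

```python
def w_symmetry(n: int) -> float:
    """Assumed weight of evidence for mirror symmetry: n/2 symmetry pairs among n elements."""
    return (n / 2) / n


def w_glass(n: int) -> float:
    """Assumed weight of evidence for a Glass pattern: n/2 dipoles give n/2 - 1 identities."""
    return (n / 2 - 1) / n


def w_repetition(n: int, repeats: int = 2) -> float:
    """Assumed weight of evidence for repetition: one identity per additional repeat."""
    return (repeats - 1) / n


n = 20  # hypothetical number of pattern elements
print(w_symmetry(n), w_glass(n), w_repetition(n))
```

Under these assumptions, symmetry and Glass patterns score close to each other for moderately large n, while twofold repetition scores much lower, which matches the qualitative ordering described above.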
Transparallel processing by hyperstrings
A serious problem for SIT's simplicity principle (see above) was the computation of simplest codes of symbol strings. This problem seemed unsolvable, because such a simplest code has to be selected from among a superexponential number of possible codes. Hence, even parallel processing of all possible codes would not be a realistic option. The mathematical characterization of visual regularities as being transparent holographic regularities (see above), however, paved the way for a previously uncharacterized form of processing, namely, transparallel processing:
Simultaneous processing of many items as if only one item were concerned.
In general, items can be processed serially by one processor or in
parallel by many processors. Compared to serial processing, parallel
processing reduces the time to finish a job but not the amount of work
to be done.
The amount of work can be reduced substantially if the items can be gathered in a distributed representation, which is a data structure that exploits the fact that many items share item parts (think of routes as represented in a road map). Then, to select an item that satisfies certain criteria, for instance, it may suffice to process (serially or in parallel) only the distinct item parts rather than every item. This form of processing is called distributed processing.
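The road-map analogy can be sketched as follows. This illustrates distributed processing only, not hyperstrings or transparallel processing. The layered map, its road lengths, and all function names are hypothetical: routes are the items, road segments are the shared item parts, and the shortest route is found by visiting each segment once instead of listing every route.

```python
from math import inf


def build_road_map(layers: int, width: int = 2):
    """Hypothetical layered road map; each node connects to every node in the next layer."""
    roads = {"start": {}, "goal": {}}
    order = ["start"]  # nodes in topological order
    previous = ["start"]
    for layer in range(layers):
        current = [f"n{layer}_{k}" for k in range(width)]
        for node in current:
            roads[node] = {}
        for i, src in enumerate(previous):
            for j, dst in enumerate(current):
                # Arbitrary deterministic road lengths, for illustration only.
                roads[src][dst] = 1 + (3 * layer + 5 * i + 7 * j) % 4
        order.extend(current)
        previous = current
    for i, src in enumerate(previous):
        roads[src]["goal"] = 1 + i
    order.append("goal")
    return roads, order


def count_routes(roads, order, start, goal):
    """Count the start-to-goal routes without enumerating them."""
    routes = {node: 0 for node in order}
    routes[start] = 1
    for node in order:
        for nxt in roads[node]:
            routes[nxt] += routes[node]
    return routes[goal]


def shortest_route(roads, order, start, goal):
    """Distributed approach: relax every road segment exactly once."""
    dist = {node: inf for node in order}
    prev = {}
    dist[start] = 0
    for node in order:
        for nxt, length in roads[node].items():
            if dist[node] + length < dist[nxt]:
                dist[nxt] = dist[node] + length
                prev[nxt] = node
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[goal], path[::-1]


def all_routes(roads, node, goal):
    """Item-by-item approach: enumerate every route explicitly."""
    if node == goal:
        yield [goal]
        return
    for nxt in roads[node]:
        for rest in all_routes(roads, nxt, goal):
            yield [node] + rest


def route_length(roads, route):
    return sum(roads[a][b] for a, b in zip(route, route[1:]))


roads, order = build_road_map(layers=10)
segments = sum(len(targets) for targets in roads.values())
print("routes:", count_routes(roads, order, "start", "goal"), "segments:", segments)
print("shortest:", shortest_route(roads, order, "start", "goal"))
```

With ten layers of two nodes each, the map holds 1024 routes but only 40 segments, so processing the segments does far less work than processing the routes one by one, and the gap widens with every added layer.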
Transparallel processing goes one step further, by using special distributed representations called hyperstrings. This graph-theoretical concept refers to a collection of strings that can be searched for regularities as if only one string were concerned. Transparent holographic regularities group by nature into hyperstrings, which enables a hierarchically recursive regularity search using hyperstrings. This has been implemented in an algorithm that computes guaranteed simplest codes of symbol strings.
Spin-off:
- Support for SIT's simplicity principle as a feasible selection criterion in perceptual organization
- Support for the relevance of "smart" algorithms in cognitive science --- see Smart processing
- A research line concerning cognitive architecture and cognitive processing --- see below
Human cognitive architecture
The brain is typically assumed to be attuned to relevant regularities in the world. Transparent holographic regularities are visually relevant regularities that lend themselves to transparallel processing (see above). Hence, just like parallel distributed processing, transparallel processing might well be a form of cognitive processing. This idea has been elaborated into a concrete picture of a flexible representational cognitive architecture implemented in the relatively rigid neural architecture of the brain.
To give a gist, in neuroscience, transient neural assemblies in the visual hierarchy in the brain, which signal their presence by way of firing synchronization of the neurons involved, are believed to bind similar features in an incoming stimulus. Because binding of similar features is also what hyperstrings do, and because hyperstrings allow for transparallel processing of these similar features, it might well be that the neuronal synchronization in those transient neural assemblies is a manifestation of transparallel feature processing mediated by those assemblies.
Hence, those temporarily synchronized neural assemblies can be conceived of as neural counterparts of hyperstrings or, in other words, as cognitive information processors which mediate transparallel processing of similar features. This suggests that they are the constituents of a flexible self-organizing cognitive architecture in between the relatively rigid level of neurons and the still elusive level of consciousness. They are therefore proposed to be called "gnosons", that is, fundamental particles of cognition.
Spin-off:
- Support for the neural plausibility of SIT's representational approach
- A pluralist integration of representational, connectionist, and dynamic-systems ideas --- see Marr's levels and Metaphors of cognition