Classical Cooperative Schemes
In the context of feature-based stereovision algorithms, cooperative schemes
were developed in order to find a reasonable solution to the ill-conditioned
matching problem inherent in this type of algorithm. Feature-based stereo
algorithms have the following two problems:
- Disparities can only be estimated where features are present. For all
other image areas, values have to be interpolated by some heuristic. The
sparseness of stable features makes the estimation of dense disparity maps
difficult.
- Any feature detected in the left image can potentially be matched
with every feature of the same class in the right image. The number
of possible matches explodes as the feature density increases.
The cooperative schemes are based on two assumptions, first formulated by D.
Marr and T. Poggio in a 1976 Science article:
- Uniqueness: in a fixed view-direction, only one object can be seen.
Clearly, this is not always true; transparent objects are a simple
counterexample. In any case, this constraint translates into allowing
only one match along each view-direction. In cooperative schemes,
network states with this property can be facilitated by introducing
inhibitory interactions among all disparity detectors responding to the same
view-direction.
- Continuity: object surfaces are continuous most of the time.
Therefore, the distance of objects varies smoothly with viewing direction,
except at object borders. This means that neighboring matches should
have similar disparity values. Such configurations can be favored by
introducing spatial facilitation between neighboring units.
Constraints like these define, in conjunction with the raw matches, a
complicated error measure which is minimized through cooperative network
dynamics. This process is iterative and therefore slow. Some of the
more common cooperative schemes are described below; in all the displays,
image data is fed diagonally into the network composed of horizontal
disparity layers. Due to this layout, planes of different depth are stacked
vertically above each other.
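How the uniqueness and continuity constraints act together in an iterative network can be sketched for a single scanline. The following is a minimal illustration in the spirit of a Marr-Poggio-style update, not a reproduction of any published implementation: the function name `cooperative_step`, the indexing of units by `(xl, xr)`, and the values of `radius`, `eps` and `theta` are all assumptions chosen for the example.

```python
def cooperative_step(active, initial, width, max_disp,
                     radius=2, eps=1.0, theta=2.0):
    """One synchronous update of a 1-D cooperative network (illustrative).

    Units are indexed by (xl, xr): a match between left-image position xl
    and right-image position xr, with disparity xl - xr.
    radius, eps and theta are assumed values, not taken from the literature.
    """
    new_active = set()
    for xl in range(width):
        for xr in range(width):
            if abs(xl - xr) > max_disp:
                continue
            # Continuity: support from spatial neighbors at the same disparity.
            excite = sum((xl + k, xr + k) in active
                         for k in range(-radius, radius + 1) if k != 0)
            # Uniqueness: competition along the left and right view-directions.
            inhibit = (sum((xl, j) in active for j in range(width) if j != xr)
                       + sum((i, xr) in active for i in range(width) if i != xl))
            # The raw matches enter as a constant bias in every iteration.
            drive = excite - eps * inhibit + ((xl, xr) in initial)
            if drive >= theta:
                new_active.add((xl, xr))
    return new_active


# Six consistent matches at disparity 0 plus one isolated false match.
initial = {(i, i) for i in range(6)} | {(2, 4)}
state = cooperative_step(initial, initial, width=6, max_disp=2)
```

In this toy input the isolated match at `(2, 4)` receives no support from neighbors and is inhibited along both view-directions, so it is switched off, while the run of matches at disparity 0 survives. Real inputs generally need several such iterations before the state settles.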
Overview of Cooperative Schemes
Two of the earliest cooperative schemes proposed as a basis
for stereo vision were those developed by Dev, 1975, and Nelson, 1975.
In both, each disparity unit has excitatory connections with its
spatial neighbors, while units at the same horizontal position but tuned to
different depths are connected by inhibitory links.
The above schemes were criticized by Marr & Poggio in 1976 for failing
to provide a satisfactory algorithm; they replaced the vertically running
inhibitory lines with lines running along the view-directions
of the left and right eye. Their algorithm was able to solve random-dot
stereograms.
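The difference between the two inhibition geometries can be made concrete with a small sketch. It assumes that a unit is identified by the pair `(xl, xr)` of left and right image positions it matches, so that its horizontal position in the network is proportional to `xl + xr` and its disparity is `xl - xr`. The helper names are hypothetical and exist only for this illustration.

```python
def all_units(width, max_disp):
    """All disparity units (xl, xr) on one scanline within the disparity range."""
    return [(xl, xr) for xl in range(width) for xr in range(width)
            if abs(xl - xr) <= max_disp]


def vertical_inhibitors(unit, units):
    """Dev/Nelson-style: same horizontal position, different depth."""
    xl, xr = unit
    return sorted(u for u in units
                  if u != unit and u[0] + u[1] == xl + xr)


def line_of_sight_inhibitors(unit, units):
    """Marr & Poggio-style: same view-direction of the left or the right eye."""
    xl, xr = unit
    return sorted(u for u in units
                  if u != unit and (u[0] == xl or u[1] == xr))


units = all_units(width=5, max_disp=2)
vertical = vertical_inhibitors((2, 2), units)
diagonal = line_of_sight_inhibitors((2, 2), units)
```

For the unit `(2, 2)`, the vertical set contains the units stacked directly above and below it, whereas the line-of-sight set contains every unit that shares its left or its right image position, which is exactly the set of matches competing under the uniqueness constraint.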
Earlier, Sperling had proposed a model without direct spatial
facilitation. The spatial spread of inhibitory connections in his model
leads to an indirect spatial facilitation by increasing the inhibition at
neighboring spatial positions lying at a different depth.
In 1985, Prazdny criticized the introduction of inhibition along the
direction of the depth coordinate. He realized that this would rule out the
perception of transparency. His computational model used only spatial
facilitation, but with some spreading in depth. By an argument
similar to Sperling's, this leads to an indirect inhibition in the depth direction.
One of the most evolved network implementations so far is the one by Pollard,
Mayhew & Frisby, 1985. This scheme uses a variety of rules to sort out invalid
matches. In addition, disparity units interact along cone-shaped regions.
This is very similar in spirit to a combination of the approaches of Sperling
and Prazdny.
Even though the network structure used for coherence-based stereo looks very
similar to the network structures of the cooperative algorithms described
above (one trivial reason: I used the same base drawing), coherence-based
stereo actually operates very differently from any feature-based stereo
algorithm.
In feature-based stereo, units simply mark whether or not features can be
matched at a certain position. To put it differently:
feature-detectors are primarily binary units.
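A minimal sketch of such binary units, assuming each scanline is given as a list of feature labels with `None` where no feature was detected. The function name and the labels are made up for the illustration.

```python
def candidate_matches(left, right, max_disp):
    """Binary match units: (xl, xr) is 'on' if both positions carry a
    feature of the same class and the disparity is within range."""
    return {(xl, xr)
            for xl, fl in enumerate(left)
            for xr, fr in enumerate(right)
            if fl is not None and fl == fr and abs(xl - xr) <= max_disp}


left = ["edge", None, "edge", "peak"]
right = ["edge", "edge", None, "peak"]
matches = candidate_matches(left, right, max_disp=2)
```

Each of the two left edges can be paired with each of the two right edges, so four of the five active units in this example come from only two edges per image. This is the ambiguity that the cooperative dynamics have to resolve.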
Since one can imagine many different types of features which might be
used as input to the cooperative algorithms described above, one quickly
arrives at the "bag of tricks" assumption sometimes put forward in this
context: there is no single unique type of feature used in
stereovision, but many, including edges, zero-crossings, local extrema,
texture boundaries, etc. (an argument against this rather
unsatisfactory view of utilizing multiple types of features is given elsewhere).
Coherence-based stereo, in contrast, uses disparity units which have to
supply a disparity estimate, i.e., these units have to output
a real number. Otherwise, coherence would occur trivially between units in a
single stack. The same is true if the units could output only a few discrete
values: coherence would occur trivially. Therefore, disparity units for
coherence-based stereo have to supply a disparity estimate with sub-pixel
precision.
Of course, the estimate of a single unit might be correct or wrong; it is
exactly the purpose of the coherence-detection network to sort out the
wrong ones.
The link structure which is used in the coherence-network looks
superficially similar to classical cooperative schemes, but it is really
more or less the inverse of these schemes: First, interactions between units
occur only in the direction of the depth axis - no spatial interaction is
taking place. Accordingly, no spatial spreading of activity occurs, which
makes it easy to detect sharp object boundaries. Second, the interactions
between units are excitatory and opportunistic (only
occurring with a certain coherence difference), and not inhibitory as in most
cooperative schemes. So several disparity estimates can be obtained in a
single view-direction, which is necessary for the perception of transparency. Finally, coherence-based stereo is
non-iterative, so it is much faster than the
cooperative schemes described above.
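The idea of detecting coherence within a single stack of units can be illustrated with a short sketch. It is only one simple way to find agreeing estimates, not the actual coherence-detection network: the function names, the tolerance `eps`, and the rule of reporting the mean of the largest agreeing group are assumptions made for the example.

```python
def largest_coherent_cluster(estimates, eps):
    """Largest group of real-valued estimates lying within eps of each other."""
    ordered = sorted(estimates)
    best_start, best_len = 0, 0
    start = 0
    for end in range(len(ordered)):
        # Shrink the window until all values in it agree within eps.
        while ordered[end] - ordered[start] > eps:
            start += 1
        if end - start + 1 > best_len:
            best_start, best_len = start, end - start + 1
    return ordered[best_start:best_start + best_len]


def coherent_disparity(estimates, eps=0.5):
    """Mean of the largest coherent cluster, or None if no two units agree."""
    cluster = largest_coherent_cluster(estimates, eps)
    if len(cluster) < 2:
        return None
    return sum(cluster) / len(cluster)


# One stack: three units agree near 2.2, the others are scattered.
stack = [2.1, -3.7, 2.3, 5.9, 2.2, 0.4]
result = coherent_disparity(stack)
```

The sketch needs only a single pass over the sorted estimates of one stack, with no spatial interaction and no iteration. It also shows why real-valued outputs matter: if the units could report only a few discrete values, many of them would agree by chance and the largest cluster would carry little information. Reporting several clusters per stack instead of only the largest would be one way to allow for transparency.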
© 1994-2003 - all rights reserved.