Classical Cooperative Schemes

In the context of feature-based stereovision algorithms, cooperative schemes were developed to find a reasonable solution to the ill-conditioned matching problem inherent in this type of algorithm. Feature-based stereo algorithms have the following two problems:
  • Disparities can only be estimated where features are present. For all other image areas, values have to be interpolated by some heuristic. The sparseness of stable features makes the estimation of dense disparity maps difficult.
  • Any feature detected in the left image can potentially be matched with every feature of the same class in the right image. The number of possible matches explodes as the feature-density increases.
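The growth in the number of candidate matches can be made concrete with a small count. The following is only an illustrative sketch: the function name and the representation of a scanline as a list of feature-class labels are assumptions made for this example, not part of any particular stereo algorithm.

```python
from collections import Counter

def count_candidate_matches(left_classes, right_classes):
    """Count all left/right feature pairs that share a feature class.

    Each argument is a list of class labels (e.g. "edge"), one label per
    feature detected on the same scanline of the left or right image.
    """
    left = Counter(left_classes)
    right = Counter(right_classes)
    # Every left feature of class c may pair with every right feature of class c.
    return sum(n * right[c] for c, n in left.items())

# 10 edge features per scanline give 100 candidate pairs,
# of which at most 10 can be correct matches.
print(count_candidate_matches(["edge"] * 10, ["edge"] * 10))
```

With n features of a single class in each image, the count grows as n squared, while the number of correct matches grows only linearly.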
The cooperative schemes are based on two assumptions, first formulated by D. Marr and T. Poggio in a 1976 Nature article:
  • Uniqueness: along a fixed view-direction, only one object can be seen. Clearly, this is not always true; transparent objects are a simple counterexample. In any case, this constraint translates into allowing only one match along each view-direction. In cooperative schemes, network states with this property can be favored by introducing inhibitory interactions among all disparity detectors responding to the same view-direction.
  • Continuity: object surfaces are continuous - most of the time. Therefore, the distance of objects varies smoothly with viewing direction, except at object borders. This means that neighboring matches should have similar disparity values. Such configurations can be favored by introducing spatial facilitation between neighboring units.
Constraints like these, in conjunction with the raw matches, define a complicated error measure that is minimized through cooperative network dynamics. This process is iterative and therefore slow. Some of the more common cooperative schemes are described below; in all the displays, image data is fed diagonally into a network composed of horizontal disparity layers. Due to this layout, planes of different depth are stacked vertically above each other.

Overview of Cooperative Schemes

Dev/Nelson Scheme: Two of the earliest cooperative schemes proposed as a basis for stereo vision were those developed by Dev, 1975, and Nelson, 1975. Each disparity unit was supposed to have excitatory connections with its spatial neighbors. Units at the same horizontal position but tuned to different depths were connected by inhibitory links.
Marr/Poggio Scheme: The above schemes were criticized by Marr & Poggio in 1976 for failing to provide a satisfactory algorithm; they switched the vertically running inhibitory lines to lines running along the view-directions of the left and right eye. Their algorithm was able to solve random-dot stereograms successfully.
Sperling's Scheme: Earlier, Sperling had proposed a model without direct spatial facilitation. The spatial spread of inhibitory connections in his model leads to an indirect spatial facilitation by increasing the inhibition at neighboring spatial positions lying at a different depth.
Prazdny's Scheme: In 1985, Prazdny criticized the introduction of inhibition along the direction of the depth coordinate. He realized that this would rule out the perception of transparency. His computational model used only spatial facilitation, but with some spreading in depth. By an argument similar to Sperling's, this leads to an indirect inhibition in the depth direction.
MFP Scheme: One of the most evolved network implementations so far is the one by Pollard, Mayhew & Frisby, 1985. This scheme uses a variety of rules to sort out invalid matches. In addition, disparity units interact along cone-shaped regions. This is very similar in spirit to a combination of the approaches of Sperling and Prazdny.
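To make the link structure of the Marr/Poggio variant described above more tangible, here is a minimal one-dimensional sketch of a cooperative update step. It is not the published algorithm: the array layout (position x in the left image, disparity d, with left pixel x compared to right pixel x - d), the function names, and the parameter values for radius, epsilon and theta are all assumptions chosen for illustration.

```python
import numpy as np

def initial_matches(left, right, max_disp):
    """Binary raw matches: c0[x, d] = 1 if left pixel x equals right pixel x - d."""
    left, right = np.asarray(left), np.asarray(right)
    n = len(left)
    c0 = np.zeros((n, max_disp + 1), dtype=int)
    for d in range(max_disp + 1):
        c0[d:, d] = (left[d:] == right[:n - d]).astype(int)
    return c0

def cooperative_step(c, c0, radius=2, epsilon=2.0, theta=3.0):
    """One synchronous update of all binary disparity units.

    Excitation comes from spatial neighbours in the same disparity layer
    (continuity); inhibition comes from units on the same left or right
    line of sight (uniqueness). Parameter values are illustrative only.
    """
    n, nd = c.shape
    new = np.zeros_like(c)
    for x in range(n):
        for d in range(nd):
            lo, hi = max(0, x - radius), min(n, x + radius + 1)
            excite = c[lo:hi, d].sum() - c[x, d]
            # Left line of sight: same x, all other disparities.
            inhibit = c[x, :].sum() - c[x, d]
            # Right line of sight: all units sharing the right pixel x - d.
            xr = x - d
            for d2 in range(nd):
                x2 = xr + d2
                if d2 != d and 0 <= x2 < n:
                    inhibit += c[x2, d2]
            new[x, d] = int(excite - epsilon * inhibit + c0[x, d] >= theta)
    return new

# A surface at disparity 1 plus one isolated false match at disparity 0.
state = np.zeros((8, 3), dtype=int)
state[:, 1] = 1
state[4, 0] = 1
print(cooperative_step(state, state))
```

In this toy configuration the isolated false match receives no support from its layer and is inhibited along both lines of sight, so it is switched off, while the units of the continuous layer keep each other active. Replacing the right line-of-sight term by inhibition along x only would give the Dev/Nelson style connectivity.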

Difference to Coherence-based Stereo

Even though the network structure used for coherence-based stereo looks very similar to the network structures of the cooperative algorithms described above (one trivial reason: I used the same base drawing), coherence-based stereo actually operates very differently from any feature-based stereo algorithm.

In feature-based stereo, units simply mark the possibility or non-possibility of matching features at a certain position. To put it differently: feature-detectors are primarily binary units.

Since one can imagine many different types of features which might be used as input to the cooperative algorithms described above, one quickly arrives at the "bag of tricks" assumption sometimes put forward in this context: there is no single type of feature used in stereovision, but many, including edges, zero-crossings, local extrema, texture boundaries, etc. (For an argument against this rather unsatisfactory view of utilizing multiple types of features, see here.)

Coherence-based Stereo: In contrast, this approach uses disparity units that have to supply a disparity estimate, i.e., these units have to output a real number. Otherwise, coherence would occur trivially between units in a single stack. The same is true if the units could output only a few discrete values. Therefore, disparity units for coherence-based stereo have to supply a disparity estimate with sub-pixel precision.

Of course, the estimate of a single unit might be correct or wrong; it is exactly the purpose of the coherence-detection network to sort out the correct estimates.

The link structure used in the coherence network looks superficially similar to classical cooperative schemes, but it is really more or less the inverse of these schemes. First, interactions between units occur only in the direction of the depth axis - no spatial interaction takes place. Accordingly, no spatial spreading of activity occurs, which makes it easy to detect sharp object boundaries. Second, the interactions between units are excitatory and opportunistic (only occurring with a certain coherence difference), and not inhibitory as in most cooperative schemes. Several disparity estimates can therefore be obtained in a single view-direction, which is necessary for the perception of transparency. Finally, coherence-based stereo is non-iterative and therefore much faster than the cooperative schemes.
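The idea of detecting coherence within a single stack can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation behind this page: it assumes that "coherent" means that estimates agree to within a tolerance epsilon, that the largest such group wins, and that its mean is reported; the function name and the value of epsilon are hypothetical.

```python
import numpy as np

def coherent_disparity(estimates, epsilon=0.3):
    """Find the largest group of real-valued estimates agreeing within epsilon.

    `estimates` holds the outputs of all disparity units in one stack
    (one view-direction). Returns (mean of the group, group size).
    """
    est = np.sort(np.asarray(estimates, dtype=float))
    if est.size == 0:
        raise ValueError("need at least one disparity estimate")
    best_start, best_len = 0, 0
    start = 0
    for end in range(len(est)):
        # Shrink the window until all its values lie within epsilon.
        while est[end] - est[start] > epsilon:
            start += 1
        if end - start + 1 > best_len:
            best_start, best_len = start, end - start + 1
    cluster = est[best_start:best_start + best_len]
    return cluster.mean(), best_len

# Three units agree near 2.1; the other three estimates are scattered.
disparity, support = coherent_disparity([2.1, 2.0, 2.2, -3.5, 5.7, 0.4])
print(disparity, support)
```

The computation is a single pass over one stack, with no interaction between neighboring image positions and no iteration. The sketch reports only the single largest group; reporting every sufficiently large group instead would allow several disparity values per view-direction, as required for transparency.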

© 1994-2003 - all rights reserved.