Segmentation in Scale Space

Rolf D. Henkel



Despite the wide variety of segmentation techniques [1, 2, 11], no general theory of segmentation exists. In this paper, the segmentation task is explored in a biological context. Specifically, we want to show how neurons with small and large receptive fields can cooperate to group visual data into disjoint classes. One might call this process early segmentation; its sole purpose is to group raw data into meaningful chunks. The grouping operation leads to a tremendous data reduction for higher cognitive functions, which we assume do not interfere with the segmentation process in question. Thus grouping is done only on the basis of intrinsic image characteristics, not on pre-learned knowledge about the data being processed.

Which image characteristic one should choose for early segmentation is not a trivial question. The possibility of finding one is closely connected to the fact that the visual signals in question are generated by only a few physical processes, which in turn create regularities in the sensory signals. One can exploit these regularities in two ways: by trying to invert the image formation processes, or by utilizing invariants of these processes. Simple examples of the first kind are the various shape-from-shading algorithms, which can be interpreted as trying to invert the physical processes leading to shadows on the surface of objects.

Grouping data on the basis of some invariant image feature is easier than the inversion type of processing. One obvious candidate for such a feature is the optical flow induced by the motion of objects in the visual field or by ego-motion. Using this feature for segmentation corresponds to utilizing the "common fate" paradigm of Gestalt theory. However, we will not be concerned here with which image features to choose for segmentation. At the moment, this seems to be a largely heuristic question. Instead, we ask how to segment a given feature set into a few correct object chunks.

Usually, the regions grouped together into one object are required to be uniform and homogeneous with respect to the chosen image feature. However, strictly uniform and homogeneous regions are simply not present in generic datasets. There are several reasons for this: some are connected with sensor noise, others with the interference of image formation processes, which cause shading and highlights to appear on the objects in question. Accordingly, segmentation based on a strict notion of uniform feature values across objects leads to regions full of small holes, with ragged borders or no borders at all. Relaxing the uniformity requirement helps, but also tends to merge regions corresponding to different objects.
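
The trade-off described above can be illustrated with a minimal sketch. It is not the scheme proposed in this paper, only a plain region-growing procedure chosen for illustration: 4-connected pixels are grouped whenever their feature values differ by at most a tolerance. The function name label_regions, the parameter tol, and the toy feature map are all assumptions made for this example.

```python
from collections import deque


def label_regions(grid, tol):
    """Group 4-connected pixels whose feature values differ by at most tol.

    Illustrative region growing only; returns (label map, number of regions).
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[-1] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # pixel already belongs to a region
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if (inside and labels[ny][nx] == -1
                            and abs(grid[ny][nx] - grid[y][x]) <= tol):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
            count += 1
    return labels, count


# Hypothetical feature map: two objects (values 10 and 13), each with one
# disturbed pixel standing in for noise or a highlight.
grid = [
    [10, 10, 10, 13, 13, 13],
    [10, 13, 10, 13, 10, 13],
    [10, 10, 10, 13, 13, 13],
]

_, strict_count = label_regions(grid, tol=0)
_, relaxed_count = label_regions(grid, tol=3)
print(strict_count)   # 4: two objects plus two single-pixel holes
print(relaxed_count)  # 1: holes are filled, but both objects merge
```

In this toy map no single tolerance yields the two intended objects: any value below 3 leaves the holes, and a value of 3 or more merges everything.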

A generic segmentation process has to deal with this problem. In the following, we propose a scheme which utilizes neurons with differently sized receptive fields to segment data given over a two-dimensional input field. The objects detected by the algorithm possess some nontrivial properties, such as being compact and enclosed by orientable borders.


© 1994-2003 - all rights reserved.