4.4 Spatial Parsing Versus Image Processing

Note that the visual communication objects under discussion are fundamentally symbolic objects, although the symbols are text-graphic. They are referred to as "images," but they are very different from the raster image produced by a TV camera scanning a diagram on a blackboard*. The entire lower level of interpretation called recognition is finessed when a human constructs a visual phrase within the vmacs graphics editor. Figure 4 presents the taxonomy of visual objects available to the vmacs user. Text-graphic objects are either lines or patterns: lines are visual atoms (single drawlines or textlines), and patterns are groups of text-graphic objects. Because the user employs vmacs to construct phrases, the phrases are input as drawn lines and pieces of text (i.e., atomic text-graphic symbols)** in spatial juxtaposition. Thus the focus of the research is on the recognition/parsing of the spatial arrangement of atomic visual elements serving as terminals in visual phrases. This is roughly analogous to language-understanding work that begins with a typed-in sentence as input rather than with an aural image.


A text-graphic object is either a line or a pattern


A line is a drawline or a character or a textline

A drawline is a vector chain drawn through one or more locations

A textline is one or more characters


A pattern is a group of zero or more lines and/or patterns


Figure 4. Taxonomy for graphic objects available in the vmacs graphics editor.
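The taxonomy in Figure 4 can be read as a small recursive data structure. The following is a minimal sketch in Python, not the vmacs implementation: the class names DrawLine, TextLine, and Pattern, the origin field, and the helper atoms are hypothetical names introduced for illustration, and a single character is assumed to be representable as a one-character textline.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

Point = Tuple[float, float]  # an (x, y) location; the coordinate type is an assumption


@dataclass
class DrawLine:
    """A vector chain drawn through one or more locations."""
    locations: List[Point]

    def __post_init__(self):
        if len(self.locations) < 1:
            raise ValueError("a drawline needs at least one location")


@dataclass
class TextLine:
    """One or more characters; a lone character is a one-character textline here."""
    text: str
    origin: Point = (0.0, 0.0)  # hypothetical placement field, not part of Figure 4

    def __post_init__(self):
        if len(self.text) < 1:
            raise ValueError("a textline needs at least one character")


@dataclass
class Pattern:
    """A group of zero or more lines and/or patterns."""
    members: List["TextGraphic"] = field(default_factory=list)


# A text-graphic object is either a line (drawline or textline) or a pattern.
TextGraphic = Union[DrawLine, TextLine, Pattern]


def atoms(obj: TextGraphic) -> List[Union[DrawLine, TextLine]]:
    """Flatten an object into its atomic lines, in depth-first order."""
    if isinstance(obj, Pattern):
        result = []
        for member in obj.members:
            result.extend(atoms(member))
        return result
    return [obj]
```

Under these assumptions, calling atoms on whatever the user has drawn yields the flat collection of drawlines and textlines that serve as terminals for spatial parsing, regardless of any grouping into patterns the user may or may not have created by hand.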



* Robert Futrelle has described an expert system under development which will parse X-Y plots in the form of digital binary images [Futrelle85].

** The point of spatial parsing is exactly that the user does not have to manually create higher-level pattern structures (even though vmacs offers that capability).