The detection of rectangular 3-D objects in the presence of other structures has many applications. For example, packing boxes, truck trailers, and rectangular buildings are a few of the objects that could be detected given the ability to detect 3-D (three-dimensional) rectangular solids. The detection of these objects may be applied in a variety of uses, such as security systems, surveillance systems, targeting systems in weapons, and other military, commercial, and consumer products. Detection of rectangular solids can also be used for monitoring trucks in a parking lot or on a highway, and for detecting buildings. Consequently, the detection of rectangular solids can be very beneficial in a variety of fields.
With regard to detecting buildings, one system finds roof tops (mostly piece-wise rectangular roof tops) using edge contours and line segments from vertical views of buildings (R. Mohan and R. Nevatia. Using perceptual organization to extract 3-D structures. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11 (11): 1121-1139, November 1989.). Another system uses both an edge finder and a region finder, and uses shadows to confirm evidence of a building (Y-T Liow and T. Pavlidis. Use of shadows for extracting buildings in aerial images. Computer Vision, Graphics, and Image Processing, 49: 242-277, 1990.). A third system has been developed to find piece-wise rectangular roofs in aerial images (P. Fua and A. J. Hanson. Objective functions for feature discrimination theory. In Proceedings of the DARPA Image Understanding Workshop, pages 443-460, Palo Alto, Calif., May 1989. Morgan Kaufmann Publishers.).
A non-stereo approach for building detection has also been developed (V. Venkateswar and R. Chellappa. A framework for interpretation of aerial images. In Proceedings, International Conference on Pattern Recognition, Volume 1, pages 204-206, Atlantic City, N.J., June 1990.; V. Venkateswar and R. Chellappa. A hierarchical approach to detection of buildings in aerial images. Technical Report CAR-TR-567, Center For Automation Research, University of Maryland, August 1991.). This approach is based on the correspondence between the building of interest and its shadow. The shadow, along with the position of the sun, is used to estimate the dimensions of the building. This approach is also based on edge contours (V. Venkateswar and R. Chellappa. A hierarchical approach to detection of buildings in aerial images. Technical Report CAR-TR-567, Center For Automation Research, University of Maryland, August 1991.). Other related systems have also been developed (Z. Aviad and D. M. McKeown, Jr. The generation of building hypothesis from monocular views. Technical Report CMU-TR-, Carnegie-Mellon University, 1991.; E. L. Walker, M. Herman, and T. Kanade. A framework for representing and reasoning about three-dimensional objects for vision. AI Magazine, 9(2): 47-58, Summer 1988.; M. Herman and T. Kanade. Incremental reconstruction of 3-D scenes from multiple, complex images. Artificial Intelligence, 30: 289-341, 1986.).
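The shadow-based estimate described above can be illustrated with simple trigonometry. The following is only a minimal sketch, not the method of the cited reports: it assumes flat ground, a vertical wall, and a shadow length measured along the ground in the direction away from the sun. The function name and parameters are hypothetical.

```python
import math


def building_height_from_shadow(shadow_length_m: float, sun_elevation_deg: float) -> float:
    """Estimate building height from its ground shadow.

    Assumes flat ground and a vertical wall (illustrative assumptions only).
    height = shadow_length * tan(sun_elevation)
    """
    if not 0.0 < sun_elevation_deg < 90.0:
        raise ValueError("sun elevation must be strictly between 0 and 90 degrees")
    if shadow_length_m < 0.0:
        raise ValueError("shadow length must be non-negative")
    return shadow_length_m * math.tan(math.radians(sun_elevation_deg))
```

Under these assumptions, a 10 m shadow with the sun 45 degrees above the horizon corresponds to a building roughly 10 m tall; a lower sun produces a longer shadow for the same height.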
An alternative approach could be to use template matching to detect boxes (J. Ooi and K. Rao. New insights into correlation-based template matching. In SPIE Conference on Applications of Artificial Intelligence IX, OE/Aerospace Sensing Symposium, Orlando, Fla., April 1991.; A. Margalit and A. Rosenfeld. Using probabilistic domain knowledge to reduce the expected computational cost of template matching. Computer Vision, Graphics, and Image Processing, 51(3): 219-234, September 1990.; X. Li, M. Ferdousi, M. Chen, and T. T. Nguyen. Image matching with multiple templates. In Proceedings of Computer Vision and Pattern Recognition Conference, pages 610-613, Miami Beach, Fla., June 1986.; A. Goshtasby, S. H. Gage, and J. F. Bartholic. A two-stage cross correlation approach to template matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(3): 374-378, May 1984.; S. L. Tanimoto. Template matching in pyramids. Computer Graphics and Image Processing, 16: 356-369, 1981.). These techniques, however, rely on a priori knowledge of the specific object and viewing conditions. In general, these methods are suitable for simple images in which the object size and image intensity stay almost constant. For these techniques to be successful for rectangular solid detection, the system would require a very large number of templates to capture different orientations, image sizes, and image intensities. Even with such a large number of templates, these techniques would be restricted to a particular kind of box.
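To make the limitation concrete, the following is a minimal sketch of correlation-based template matching using brute-force normalized cross-correlation. It is a generic illustration and not the algorithm of any of the cited references; the function name `ncc_match` is hypothetical, and grayscale images are assumed to be 2-D NumPy arrays.

```python
import numpy as np


def ncc_match(image: np.ndarray, template: np.ndarray):
    """Slide the template over the image and return (best_score, (row, col)).

    The score is the normalized cross-correlation in [-1, 1]; 1 means the
    window equals the template up to a positive gain and an offset in intensity.
    """
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()            # zero-mean template
    t_norm = np.sqrt((t ** 2).sum())
    best_score, best_pos = -1.0, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            w = image[r:r + th, c:c + tw]
            w = w - w.mean()                  # zero-mean window
            denom = np.sqrt((w ** 2).sum()) * t_norm
            if denom == 0:                    # flat window or flat template
                continue
            score = float((w * t).sum() / denom)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos
```

A single template scores close to 1 only where the object appears at the same size and orientation as the template. A rescaled or rotated copy of the same object generally scores lower, which is why one template per expected size and orientation would be needed.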
In related work, others have used line labeling and segmentation of scenes with polyhedral objects (A. Guzman. Decomposition of a visual scene into three-dimensional bodies. In AFIPS Proceedings Fall Joint Comp. Conf., volume 33, 1968.). This method, however, assumes that perfect edges are obtained from the object. In this method, the edges and the junctions are used to trace the full contour of the rectangular object. This technique performs satisfactorily only if a full view of the object is available. Other related work exists in the blocks world research (G. Falk. Interpretation of imperfect line data as a three-dimensional scene. Artificial Intelligence, 3: 101-144, 1972; Y. Shirai. Analyzing Intensity Arrays Using Knowledge About Scenes, chapter 3. McGraw-Hill Book Co., New York, 1972. Editor, P. H. Winston; D. L. Waltz. Generating Semantic descriptions from drawings of scenes with shadows, chapter 2. McGraw-Hill Book Co., New York, 1972. Editor, P. H. Winston; D. A. Huffman. Impossible Objects as Nonsense Sentences, volume 6, pages 295-323. Edinburgh University Press, Edinburgh, 1971. Edited by B. Meltzer and D. Michie; M. Clowes. On seeing things. Artificial Intelligence, 2(1): 79-116, 1971.).