1. Technical Field
This invention is directed towards a system and method for creating a geometric floor model of a room interactively, based on images of the room.
2. Background Art
Video cameras are sometimes used to monitor people in a room for security, safety, or assistance. These goals can be facilitated by a knowledge of the locations of the room's walls, doors, furniture, and devices. For instance, the positions of computer screens in a room can be used to decide where to display a message for a person in the room. One way to build a geometric model of the room is to manually measure the locations of the room's contents with, say, a tape measure. This is tedious, and the manual work must be repeated each time one of the objects in the room is moved.
The present invention overcomes the aforementioned limitations in prior room modeling systems by a system and method that allows the user to interactively generate a geometrically accurate floor map of the room without having to measure the locations of various objects in the room first. Based on images of the room, the system and method according to the present invention lets a user interactively build a floor plan of the room by dragging pictorial representations of the objects in the room on a computer monitor screen.
By way of overview, this system and method according to the present invention begins with one or more images of a room. These images preferably include views of calibration markers that were placed at known (x,y) locations on the floor of the room. It is necessary to have at least one view or image of everything the user desires to have modeled in the floor plan. However, more than one view of each item is not necessary. The calibration markers may be marked with an index number and possibly their (x,y) locations in a world coordinate system in each image. The user preferably runs a program that lets him or her select the centers of the calibration markers in each image displayed on a computer monitor screen to establish a correspondence between the absolute (x,y) coordinates on the floor and their image coordinates in each view. These correspondences are used to compute a "homography" for each image. The homography is in turn used to compute a warped or ground plane version of each image, showing what the room would look like if viewed from above. These warped or ground plane images are each preferably rendered with respect to the same origin. With the images warped to a common viewpoint and scale, they can be used as guides in a conventional drawing program such as MICROSOFT VISIO®. For example, the user can drag icons of furniture, walls, etc. on top of the warped images. The warped images are then deleted, leaving behind the room's objects in their correct locations to create an accurate floor plan of the room. The next paragraphs detail each of these steps.
For each image a homography is computed that is later used for warping. The homography is a mathematical mapping from physical ground plane (floor) coordinates (x,y) to pixel coordinates (u,v) in the image. To compute the homography for each image, there must be at least four (u,v) points in the image for which the corresponding absolute (x,y) coordinates on the ground plane are known. One way to get these known (x,y) points is to measure certain visible floor points in the room, like the corners of a rug, with a tape measure. Another method, as discussed previously, is to lay down calibration targets at known (x,y) locations on the floor.
As discussed above, an index number may be associated with each prescribed point or calibration target, and the system and method according to the present invention uses a previously generated data file that gives the (x,y) location of each target based on its index number. To determine the correspondence between the (x,y) locations in the room and their associated positions in each image, the user selects the calibration target centers in the image with a mouse or other input device and enters the corresponding index number for each. The point thus selected is taken to be the (u,v) image location of the target in pixels. This process of building up the data file could also be performed automatically, for example, by using character recognition. This is especially feasible if the index number and (x,y) locations of each target are indicated on each image. The program thus builds an array of corresponding (u,v) and (x,y) points for each image. When the user has entered four corresponding (u,v) and (x,y) points for a particular image, the homography can be calculated directly. If the user has entered more than four corresponding points, a least squares approach can be employed to compute a potentially more accurate homography. It is noted that once the homography has been generated, this data can be used without repeating the calibration process unless the camera used to generate the image employed to compute the homography is moved.
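As an illustration of the computation described above, the following is a minimal sketch of estimating the homography from the user-selected correspondences. It assumes NumPy and uses a direct linear transform solved by singular value decomposition, which is one common way to handle both the four-point case and the least squares case; the text does not state which solver is actually used. The function name estimate_homography and its argument names are hypothetical.

```python
import numpy as np

def estimate_homography(xy, uv):
    """Estimate a 3x3 matrix H mapping floor (x, y) to pixel (u, v).

    xy, uv: sequences of at least four matching points.
    Hypothetical helper for illustration only.
    """
    xy = np.asarray(xy, dtype=float)
    uv = np.asarray(uv, dtype=float)
    if xy.ndim != 2 or xy.shape[1] != 2 or xy.shape != uv.shape or len(xy) < 4:
        raise ValueError("need at least four matching (x, y) and (u, v) points")

    # Each correspondence contributes two linear equations in the
    # nine entries of H (H is only defined up to scale).
    rows = []
    for (x, y), (u, v) in zip(xy, uv):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    A = np.array(rows)

    # The right singular vector with the smallest singular value
    # minimizes |A h| subject to |h| = 1. With exactly four points in
    # general position this is the exact solution; with more points it
    # is a least squares fit.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)

    # Normalize so the bottom-right entry is 1 (assumes it is nonzero).
    return H / H[2, 2]
```

With exactly four points the result reproduces the selected correspondences; with additional points, selection errors made with the mouse tend to average out. If three or more of the selected floor points lie on one line, the system is degenerate and the result is unreliable.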
Given an (x,y) point on the floor or ground plane, the homography, H, gives the corresponding (u,v) pixel coordinates in an image. The homography is used to make an image of the ground plane from the camera image. The ground plane image starts with a 2-D array of ground plane points. This array of points extends from xmin to xmax in the x direction and from ymin to ymax in the y direction. The points are separated by Δx and Δy in the x and y directions, respectively.
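The mapping and the sampling grid described above can be sketched as follows. This assumes H is represented as a 3x3 matrix acting on homogeneous coordinates, which is the usual representation of a planar homography but is not spelled out in the text. The names floor_to_pixel and grid_axis are hypothetical.

```python
import numpy as np

def floor_to_pixel(H, x, y):
    """Map a floor point (x, y) to pixel coordinates (u, v) using H."""
    p = H @ np.array([x, y, 1.0])
    # Divide by the third homogeneous coordinate to get (u, v).
    return p[0] / p[2], p[1] / p[2]

def grid_axis(lo, hi, step):
    """Sample points lo, lo + step, ... up to and including hi if it fits."""
    # The small epsilon guards against floating point round-off.
    n = int(np.floor((hi - lo) / step + 1e-9)) + 1
    return lo + step * np.arange(n)
```

Calling grid_axis once with (xmin, xmax, Δx) and once with (ymin, ymax, Δy) gives the coordinates of the 2-D array of ground plane points.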
A ground plane image is made by taking each (x,y) point in the ground plane array, computing the corresponding (u,v) pixel location in the image (with rounding to get integer pixel coordinates), and placing that pixel (pixel color intensity) in that cell of the ground plane array. It is preferable to make each cell of the grid of the ground plane a size that corresponds to the area depicted in each pixel of the image of the room under consideration, to ensure a good resolution for accurate modeling. If the computed (u,v) falls outside the bounds of the image, then the pixel value of that cell is made black. The points (xmin, ymin) and (xmax, ymax) are selected to cover the region of the room being modeled, and (Δx, Δy) is selected such that ground plane images of about 500×500 pixels are achieved. To make the scale isotropic, Δx=Δy is typically selected.
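The warping procedure described above can be sketched as follows, assuming NumPy, an image stored as an array indexed [v, u] (row, column), a 3x3 homography matrix H, and Δx=Δy=step. The function name warp_to_ground_plane and the convention that output rows run in the direction of increasing y are assumptions made for this illustration.

```python
import numpy as np

def warp_to_ground_plane(image, H, xmin, xmax, ymin, ymax, step):
    """Build a ground plane image by sampling `image` through H.

    Output cell [j, i] holds the pixel seen at floor point (xs[i], ys[j]).
    Cells whose (u, v) falls outside the image stay black (zero).
    """
    # Ground plane sample coordinates (epsilon guards against round-off).
    xs = xmin + step * np.arange(int(np.floor((xmax - xmin) / step + 1e-9)) + 1)
    ys = ymin + step * np.arange(int(np.floor((ymax - ymin) / step + 1e-9)) + 1)
    gx, gy = np.meshgrid(xs, ys)

    # Project every ground plane point into the camera image.
    pts = np.stack([gx.ravel(), gy.ravel(), np.ones(gx.size)])
    proj = H @ pts
    height, width = image.shape[:2]
    with np.errstate(divide="ignore", invalid="ignore"):
        u = proj[0] / proj[2]
        v = proj[1] / proj[2]
        # Keep only points that round to a valid pixel index.
        inside = (
            np.isfinite(u) & np.isfinite(v)
            & (u >= -0.5) & (u < width - 0.5)
            & (v >= -0.5) & (v < height - 0.5)
        )

    # Round to integer pixel coordinates, as described in the text.
    ui = np.rint(u[inside]).astype(int)
    vi = np.rint(v[inside]).astype(int)

    out = np.zeros((len(ys), len(xs)) + image.shape[2:], dtype=image.dtype)
    flat = out.reshape((-1,) + image.shape[2:])  # view onto `out`
    flat[inside] = image[vi, ui]
    return out
```

Because each output cell looks up its own source pixel, the ground plane image has no holes, which would not be guaranteed if camera pixels were instead pushed forward onto the floor grid.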
A ground plane image is generated for each camera image using the above-generated array. These ground plane images are then imported one by one into a drawing program such as MICROSOFT VISIO®. The ground plane images all have the same scale because they are preferably all based on the same (Δx, Δy). The coordinate axes of the ground plane image may be drawn in the images to make the images easier to superimpose when importing the images into the drawing program.
The drawing program is then used to make the room model. Using a couch as an example, a scale model of the couch is first made in the drawing program, at the same scale as the ground plane images. The model's dimensions can come from manually measuring the actual couch in the room or from inferring the dimensions from the ground plane image. The couch model drawing is dragged into the appropriate location on top of the ground plane image. While the ground plane image of the couch is unrealistically distorted because the couch extends above the ground plane, the bottom of the couch where it rests on the floor is accurately represented in the image. If the model of the couch is lined up with the points where the real couch touches the floor, the drawing will be accurately placed in the room. The user may then continue dragging in other pieces of furniture, walls, and doors. The user repeats the dragging and placing process for each warped image, overlapping the images so that their x and y axes correspond. If the current image shows an object that was already placed based on one of the previous images, that object can be ignored in the current image. After the user has dragged in all the drawings, he or she deletes the images and is left with the floor plan. The user may also embellish the drawing with other elements to make a model of the room.