Video camera control and image display systems are used in a variety of applications including video conferencing, security systems, and plant monitoring systems. In a typical video camera system, images of locations, objects, or persons captured by one or more video cameras are displayed to a user on one or more remote computer display screens or television monitors. In these systems, the cameras are not always fixed. For example, cameras which are capable of panning, tilting, and zooming may be used. A user may pan or tilt a camera to monitor a different scene or may zoom a camera to observe a selected area of a scene in more detail.
One disadvantage of these video camera systems is that when a camera pans, tilts, or zooms, the user may find it difficult to determine what portion of a location or object he is observing. In other words, while the user may have gained a detailed view of a region of the location or object that is of interest to him, he may lose sight of the context within which that region is positioned.
One solution to this problem is to use multiple display screens. For example, a first display screen may show an image of an entire scene as captured by a first camera while a second display screen, spaced from the first, may show an expanded image of a region of the entire scene as captured by a second camera. However, this solution is costly as it requires the use of two display screens and the space necessary to locate them.
A different solution to this problem is to use a single display screen on which the images from two or more cameras may be displayed. In typical security systems, for example, a single monitor may display images from sixteen cameras in a four-by-four array. This is achieved by performing an image reduction process on each source image to reduce its size. As another example, in typical television monitors, picture-in-picture techniques may be used to display two video images simultaneously. However, these solutions are also costly as they require the use of relatively large display screens in order to present the multiple images at a size that is comfortable for user viewing.
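The four-by-four arrangement described above can be illustrated with a minimal sketch. It assumes grayscale images held as nested lists and uses simple pixel decimation as the reduction step; the text does not say which reduction process such systems use, and the names `reduce_image` and `tile_grid` are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale image stored as rows of pixel values


def reduce_image(image: Image, factor: int) -> Image:
    """Shrink an image by keeping every `factor`-th pixel in each direction.

    Decimation is an assumed stand-in for the unspecified reduction process.
    """
    return [row[::factor] for row in image[::factor]]


def tile_grid(images: List[Image], columns: int) -> Image:
    """Arrange equally sized images into one mosaic, `columns` per row."""
    if len(images) % columns != 0:
        raise ValueError("number of images must fill the grid exactly")
    mosaic: Image = []
    for start in range(0, len(images), columns):
        group = images[start:start + columns]
        # Join matching rows of the images that share a grid row.
        for r in range(len(group[0])):
            combined: List[int] = []
            for img in group:
                combined.extend(img[r])
            mosaic.append(combined)
    return mosaic


# Sixteen 8x8 source images, each filled with its own camera index.
sources = [[[cam] * 8 for _ in range(8)] for cam in range(16)]
reduced = [reduce_image(src, 4) for src in sources]  # each becomes 2x2
mosaic = tile_grid(reduced, 4)                       # 8x8 four-by-four array
```

In this sketch each source keeps only one sixteenth of its pixels, which makes the cost concrete: the screen must be large for each reduced image to remain comfortable to view.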
Thus, with a single display screen, the images displayed will be relatively small even though the display screen will generally be chosen to be relatively large, whereas with a number of separate display screens, the images will generally be relatively large but spaced from the user even though the individual display screens may be relatively small. Such solutions therefore generally involve the striking of a compromise between having the system as a whole small enough for all the images to be readily observable by the user without inconvenience, and having the individual images large enough for small details to be readily observable.
In any event, with either solution, the relationship between images of the full scene and regions-of-interest in that full scene may be lost to a user. This is often referred to as the “screen real estate problem”.
In addition, video camera systems generally provide for the remote control of video cameras. The terms pan, tilt, zoom, and focus are industry standards which define the four major axes along which a camera may be adjusted. Traditional video camera systems provide for rather rudimentary control of these camera functions. That is, the user has a control panel for manually controlling camera functions, such as buttons for up/down, left/right, zoom in/out, and focus. The user can also typically select one of several preset camera settings so that, by the press of a single button, a camera will automatically position and focus itself at some preselected target or region-of-interest. Of course, the preset function requires planning because the camera must be manually adjusted for the preset, and then the settings stored. The preset button then merely recalls these settings and adjusts the camera accordingly. If a location has not been preset, then the user must manually adjust the pan, tilt, zoom, and focus settings for that location.
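The store-and-recall behaviour of presets described above might be sketched as follows. This is only an illustration under stated assumptions: the `Camera` and `CameraSettings` classes, their default values, and the button numbering are hypothetical and do not correspond to any particular camera controller.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class CameraSettings:
    """The four adjustable axes named in the text (units are assumed)."""
    pan: float
    tilt: float
    zoom: float
    focus: float


class Camera:
    """Hypothetical camera with manual adjustment and stored presets."""

    def __init__(self) -> None:
        self.settings = CameraSettings(pan=0.0, tilt=0.0, zoom=1.0, focus=0.0)
        self._presets: Dict[int, CameraSettings] = {}

    def adjust(self, **changes: float) -> None:
        """Manual control: change one or more axes."""
        self.settings = replace(self.settings, **changes)

    def store_preset(self, button: int) -> None:
        """Save the current manually adjusted settings under a button."""
        self._presets[button] = self.settings

    def recall_preset(self, button: int) -> None:
        """Restore saved settings; fails if nothing was stored beforehand."""
        if button not in self._presets:
            raise KeyError(f"preset {button} has not been stored")
        self.settings = self._presets[button]


camera = Camera()
camera.adjust(pan=30.0, tilt=-5.0, zoom=4.0, focus=2.5)  # aim at a target
camera.store_preset(1)                                   # plan ahead
camera.adjust(pan=-90.0, zoom=1.0)                       # look elsewhere
camera.recall_preset(1)                                  # single button press
```

The `KeyError` branch mirrors the limitation noted in the text: a location that was never stored cannot be recalled and must be reached by manual adjustment.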
However, these controls are not intuitively obvious or easy to use, partly because the user may think that the camera should pan in one direction to center an object whereas, because of the position of the camera with respect to the user and the region-of-interest, the camera should actually move in the opposite direction. When the system has multiple cameras which are subject to control by the user, typical systems require the user to use buttons on the control keyboard to manually select the camera to be controlled, and/or assign separate keys to separate cameras. Frequently, the user will select or adjust the wrong camera. These problems are pronounced in two-camera systems when a user has lost the context of the region-of-interest as captured by one camera within the full scene as captured by a second camera.
A need therefore exists for a video display system that can effectively display both an image of a full scene (i.e., a context image) as captured by a first camera and an image of a region-of-interest (i.e., a detail image) within that full scene as captured by a second camera. A further need exists for a video camera control system that can effectively control cameras for capturing detail and context images. Consequently, it is an object of the present invention to obviate or mitigate at least some of the above-mentioned disadvantages.