1. Field of the Invention
The present invention relates to a video processing method and, in particular, to a video processing method that can reduce the viewing load on a viewer even in a case where all recorded images must be viewed because it is not clear whether an event has occurred.
2. Description of the Related Art
Conventionally, video surveillance systems are deployed in public facilities such as hotels, buildings, convenience stores, financial institutions, dams and roads for the purpose of suppressing crime and/or preventing accidents. Such a video surveillance system photographs a subject under surveillance with an imaging apparatus such as a camera and transmits the images to a surveillance center such as a management office or a security room. A surveillance person may monitor the images, remain alert to them and/or record or save the image or images, depending on the purpose or as required.
In many cases, a recent video surveillance system uses a random access medium, typically a hard disk drive (HDD), as a recording medium for images instead of a conventional video tape medium.
FIG. 16 shows a configuration example of the video surveillance system including a recording apparatus having an HDD as a recording medium.
The video surveillance system includes a recording apparatus 301, which is an apparatus generally called a digital video recorder, a camera 302, and a monitor 303 having a display unit 321.
The recording apparatus 301 includes a digital converting section 311, an analog converting section 312, a compressing section 313, a decompressing section 314, a recording unit 315, an operating unit 316 and a control section 317.
The camera 302 outputs a captured image as an analog electric signal.
The monitor 303 displays an input analog image on the display unit 321.
In the recording apparatus 301, an analog image input from the camera 302 is converted to a digital signal by the digital converting section 311. The digital signal undergoes data compression processing by the compressing section 313 and is recorded on the HDD by the recording unit 315. A user operation is detected by the operating unit 316, and, in response thereto, the subject image is loaded from the HDD by the recording unit 315, undergoes data decompression processing by the decompressing section 314, is converted to an analog signal by the analog converting section 312 and is output to the monitor 303. The processing in those steps is controlled by the control section 317.
Here, the control section 317 further includes a CPU (Central Processing Unit). The operating unit 316 may be a generic computer operating device such as a mouse or a keyboard, or may be a special control panel having buttons.
FIG. 17 shows another configuration example of the video surveillance system including a recording apparatus having an HDD as a recording medium.
The video surveillance system in this example has a recording apparatus 331, which is an apparatus generally called a network digital recorder, a network camera 332 and a surveillance terminal 333.
The recording apparatus 331 includes a network unit 341, a recording unit 342 and a control section 343.
The surveillance terminal 333 includes a network unit 351, a decompressing section 352, a display unit 353, an operating unit 354 and a control section 355.
The network camera 332 converts a captured image to a digital signal, performs data compression processing thereon and digitally outputs the result over an IP (Internet Protocol) network.
In the recording apparatus 331, a digital image input from the network camera 332 to the network unit 341 is recorded on the HDD by the recording unit 342. A request from the surveillance terminal 333 is received by the network unit 341, and, in response thereto, the subject image is loaded from the HDD by the recording unit 342 and is output through the network unit 341. The processing in the steps above is controlled by the control section 343.
Here, the control section 343 further includes a CPU.
In the surveillance terminal 333, a digital image input through the network unit 351 undergoes data decompression processing by the decompressing section 352, and the result is displayed on the terminal screen by the display unit 353. The operating unit 354 detects a user operation and, in response thereto, transmits a necessary request to the recording apparatus 331 through the network unit 351. The processing in the steps above is controlled by the control section 355.
Here, the control section 355 further includes a CPU. In many cases, the operating unit 354 is a generic computer operating device such as a mouse or a keyboard, and the display unit 353 is a generic computer display device such as a CRT (Cathode-Ray Tube) or an LCD (Liquid Crystal Display).
FIG. 18 shows an example of the operation screen of the surveillance terminal 333 (that is, details displayed on the display unit 353 of the surveillance terminal 333) in a case where the recording apparatus 331 having an HDD as a recording medium is used as described above.
Although the screen example is described here for the case using the recording apparatus 331 shown in FIG. 17, the functions available to a user are identical to those of the screen in the case using the recording apparatus 301 shown in FIG. 16. There is virtually no difference between the two except for slight differences in display formats and operation specifications due to the different types of devices used as the operating units and/or the display units. For those reasons, the case using the recording apparatus 331 will be described as an example here.
A video display unit 361 is an area displaying an image.
In a playback button group 362, a distinct playback type is assigned to each button. To give a new playback instruction for the image displayed on the video display unit 361, the user presses the button for the corresponding playback type.
In a camera switching button group 363, each button is assigned to one of the cameras being recorded, and pressing a button switches the recorded image displayed on the video display unit 361 to the recorded image of that camera. This function is generally called camera search.
A date-and-time search button group 364 allows specification (or input or selection) of an arbitrary time. By specifying a time and pressing a search button, the image at the specified time of the currently selected camera is displayed on the video display unit 361. This function is generally called date-and-time search.
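The date-and-time search described above can be illustrated with a minimal sketch. The source does not state how a recording apparatus locates the frame for a specified time, so the sorted timestamp index, the `recordings` dictionary and the function name `find_frame_index` below are all hypothetical and serve only to show one plausible lookup on a random access medium.

```python
from bisect import bisect_left
from datetime import datetime


def find_frame_index(timestamps, target):
    """Return the index of the first frame recorded at or after `target`.

    `timestamps` must be sorted in ascending order. Returns None when
    every recorded frame is older than `target`.
    """
    i = bisect_left(timestamps, target)
    return i if i < len(timestamps) else None


# Hypothetical per-camera index: camera name -> sorted frame timestamps.
recordings = {
    "camera-1": [
        datetime(2008, 1, 1, 12, 0, 0),
        datetime(2008, 1, 1, 12, 0, 1),
        datetime(2008, 1, 1, 12, 0, 3),
    ],
}

# Search the currently selected camera for the specified time.
index = find_frame_index(recordings["camera-1"], datetime(2008, 1, 1, 12, 0, 2))
print(index)  # 2
```

Because the lookup is a binary search over an index, the target frame is reached without scanning the preceding frames, which corresponds to the instant access that distinguishes an HDD from a video tape medium.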
An alarm recording list display section 365 displays, for each recording event, a list of the contents recorded by alarm recording for the currently selected camera.
Here, the term “alarm recording” refers to a recording type that records irregularly, that is, every time a recording event occurs, in contrast to normal recording, which records at all times or periodically according to a predetermined schedule. Various recording events may occur, including signal input from an external sensor to a contact terminal provided on a camera or a recording apparatus, a trigger based on an image recognition processing result, and the pressing of an emergency recording button by a surveillance person.
Each row of the alarm recording list may display, for example, the time of occurrence of a recording event (such as a starting time and an ending time), the type of recording event and/or a reduced image of the first image of the recording event. Each of the rows can be selected, and the image of the selected recording event is displayed on the video display unit 361. This function is generally called alarm search.
A function generally called marking search, not shown in FIG. 18, may also be available; it is similar to the alarm search. While the alarm recording performs recording for each recording event, the marking recording only performs marking on a recorded image upon occurrence of a recording event. The marking search displays a list of the markings, and the screens and operations may be similar to those of the alarm recording list display section 365.
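The difference between alarm recording and marking recording can be sketched as follows. This is only an illustration under assumptions: frames are modeled as hypothetical `(time, image)` pairs, recording events as hypothetical `(start, end, type)` tuples, and the function names are not taken from any actual recording apparatus.

```python
def alarm_recording(frames, events):
    """Store frames only while a recording event is in progress.

    Returns one clip per event, holding the frames inside its interval.
    """
    clips = []
    for start, end, kind in events:
        clip = [(t, img) for t, img in frames if start <= t <= end]
        clips.append({"start": start, "end": end, "type": kind, "frames": clip})
    return clips


def marking_recording(frames, events):
    """Store every frame and attach one marking per recording event."""
    markings = [{"time": start, "type": kind} for start, _end, kind in events]
    return list(frames), markings


# Ten frames at times 0..9 and one sensor event lasting from time 3 to 5.
frames = [(t, f"frame-{t}") for t in range(10)]
events = [(3, 5, "sensor")]

clips = alarm_recording(frames, events)
stored, markings = marking_recording(frames, events)

print(len(clips[0]["frames"]))  # 3
print(len(stored), markings)    # 10 [{'time': 3, 'type': 'sensor'}]
```

In both cases the list shown to the user has one row per recording event; what differs is whether the frames outside the events exist on the recording medium at all.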
Next, the playback of images will be described.
FIG. 19 shows a state of the playback of video frames.
More specifically, the horizontal axis is a time axis 372, on which the left side is older in time and the right side is newer in time. A series of images is shown, and reference numeral 371 denotes one frame in that series.
Conventionally, playback at a standard speed in the forward direction is processing that displays the frames one at a time, from left to right along the time axis 372, at predetermined time intervals.
Next, the degree of similarity of images (videos) will be described.
Image recognition technologies are among the technologies that have developed significantly in recent years as computers have become faster. They include technologies for calculating the degree of similarity.
The term “degree of similarity” refers to an indicator for evaluating the similarity between two images, and the expression “the degree of similarity is high” refers to the state that two images are similar. The degree of similarity is calculated based on the feature amounts of images to be compared. The feature amount of an image to be used may be a feature amount based on the color distribution or intensity gradient distribution in the spatial direction and may be selected according to the purpose, that is, according to the type of similarity, such as the similarity in color and similarity in composition, to be obtained.
For example, Non-Patent Document 1 discloses a method of calculating the degree of dissimilarity (that is, the inverted indicator of the degree of similarity) from the feature amounts of images.
More specifically, the similarity between images is defined based on the squared distance between feature amount vectors, and the degree of dissimilarity D(X,Y) of two images X and Y in a case where Nf types of image feature amount are defined is obtained by:
$$D(X, Y) = \sum_{i=1}^{N_f} w_i \, \lVert x_i - y_i \rVert^2 \qquad \text{[EQ 1]}$$

where $x_i$ and $y_i$ are feature amount vectors of X and Y, respectively, and $w_i$ is a weight for the feature amount. The feature amount may be a feature amount based on a color distribution or a feature amount based on an intensity gradient distribution, for example.
Although the degree of dissimilarity has been described here, a degree of similarity may be obtained from it, for example, as the result of subtracting the degree of dissimilarity from a predetermined value or as the inverse of the degree of dissimilarity. In other words, a high degree of dissimilarity is equivalent to a low degree of similarity, and a low degree of dissimilarity is equivalent to a high degree of similarity.
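The weighted sum of squared distances described above, and the two conversions to a degree of similarity, can be sketched as follows. This is a minimal illustration and not the implementation of Non-Patent Document 1: the feature values, the weights, the function names and the small constant `eps` (added only to avoid division by zero for identical images) are assumptions.

```python
def dissimilarity(x_feats, y_feats, weights):
    """Weighted sum of squared Euclidean distances between feature vectors.

    `x_feats` and `y_feats` hold one vector per feature type, and
    `weights` holds one weight per feature type.
    """
    if not (len(x_feats) == len(y_feats) == len(weights)):
        raise ValueError("expected one weight per feature type")
    total = 0.0
    for x, y, w in zip(x_feats, y_feats, weights):
        if len(x) != len(y):
            raise ValueError("feature vectors must have the same length")
        total += w * sum((a - b) ** 2 for a, b in zip(x, y))
    return total


def similarity_by_subtraction(d, max_value):
    """Degree of similarity as a predetermined value minus the dissimilarity."""
    return max_value - d


def similarity_by_inverse(d, eps=1e-9):
    """Degree of similarity as the inverse of the dissimilarity (eps is assumed)."""
    return 1.0 / (d + eps)


# Two feature types per image: a color-based vector and a gradient-based vector.
image_x = [[0.5, 0.5], [1.0, 0.0, 0.0]]
image_y = [[0.25, 0.75], [0.0, 1.0, 0.0]]
weights = [2.0, 0.5]

d = dissimilarity(image_x, image_y, weights)
print(d)                                   # 1.25
print(similarity_by_subtraction(d, 10.0))  # 8.75
```

Both conversions reverse the ordering, so the image pair with the smaller dissimilarity always receives the larger similarity. The subtraction form keeps the scale linear, while the inverse form emphasizes differences between nearly identical images.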
Patent Document 1: JP-A-7-254091
Non-Patent Document 1: Hiroike and Musha, “Daikibo na Gazou Shugo notameno Hyougen Moderu (Representation Model for Large Image Set)”, SPSTJ Journal No. 1, Volume 66, 2003, p. 93 to 101.
A recording apparatus having a random access medium, typically an HDD, as described above is highly convenient in that a target image can be output instantly when accessed, unlike a video tape medium, which requires waiting for an operation such as fast forwarding or rewinding to complete.
However, this convenience is realized only in a case where the place and/or the date and time of occurrence of an event are known, so that the camera and the date and time can be specified by using the camera search and the date-and-time search. On the other hand, in a case where it is not clear whether an event has occurred, or where the aim is to find out whether an event has occurred, these search functions cannot be used, and basically all of the recorded images must be played and viewed.
To address this problem, the alarm search and the marking search are highly effective functions because they display a list of delimiters that mark where events occur in a series of images. The information for the delimiters may be based on input from an external sensor or on the result of image recognition processing on an input image. However, such information is subject to limitations under various conditions, for example adverse conditions for image recognition processing such as wind, rain, snow and/or backlighting, and in reality is not 100% reliable. In a case requiring reliability, all of the recorded images must still be played and viewed.
The problem in playing and viewing all recorded images is the time required. One way to save time is to view the images by fast-forward playback. With this method, however, an image part containing an event to be watched passes by in an instant. Therefore, in order not to miss that image part, the viewer is forced to remain highly strained for a long period of time, which may in turn become a factor in missing an event to be watched.
In recent years, the capacities of HDDs have increased, and the amount (or time length) of recorded images has increased dramatically. These trends are expected to continue.
The invention was made in view of the conventional circumstances described above, and it is an object of the invention to provide a video processing method that can reduce the viewing load on a viewer even in a case where it is not clear whether an event has occurred and all of the recorded images must be viewed.