A multitude of systems is currently available for performing image interpretation tasks. Examples include security monitoring devices, road traffic monitors, people counters in lobbies and malls, and countless additional applications. These systems consist of a front-end having an image acquisition unit and possibly a computational device that performs tasks such as image compression, image formatting, or Internet access, and a backend that includes a computational device and/or a human interface mechanism. The computational device of the backend is responsible for most or all of the computations performed in the system.
FIG. 1 is a schematic block diagram showing the architecture of existing image acquisition and interpretation systems. Standard acquisition devices 100-103 are installed at the required site. These are most often image acquisition devices, but may also include sensors of other types. In the case of image acquisition devices, analog or digital video cameras, or other off-the-shelf cameras, are used. These deliver standard frame rate and resolution, usually in color. The end-units may sometimes include an image processing device, used either for image compression or for Internet connection. Communication channels 110 are then used to transmit the raw or compressed images to a backend computation device 120, which sometimes consists of multiple processing units 121-124. The communication channels 110 are most often cables, either analog or digital, but can sometimes be wireless. The processing unit 120 is sometimes near the acquisition device, as for example in a home security application, where the distance can range from a few meters to a few tens of meters; or it can be a long distance away, as in the case of highway traffic control systems, where the distances covered may be many miles. Depending on the system, the backend processor may include one or more of the following applications: image recording and storage 130, usually for regulatory and insurance requirements; image analysis, compression, motion and alert detection, or any other application performed on the main processing cabinet 120; an application workstation 140 that allows computerized and/or manual analysis and operation of additional parts of the system (opening and closing gates, illumination control and more); and a monitor-wall 150 with observers looking at the video streams. The entire system is connected by a local area network 125.
The person skilled in the art of modern computerized surveillance systems will appreciate that this is a basic configuration and that considerable variation exists among different systems. However, all these systems have in common the fact that image acquisition and image analysis are partitioned into two parts, front-end and backend, where the major part of the processing is performed in the backend and any front-end processing is limited to image compression, network access or format conversion.
In simple systems, only raw images are presented to the operator and/or stored in a storage device. In such systems, the computational part of the front-end may perform tasks of image compression, communication, Internet access etc., all of which are designed to facilitate the communication of the captured images to the backend. In more elaborate systems, there is some automatic analysis of images, performed by the backend, by the front-end, or by both. In such cases, the front-end may compare an image to a “standard” pre-stored image. However, in all prior art systems, either a large part of the computation required for interpretation and understanding of the image is performed by the backend, or the quality of the system's automatic interpretation is very low. This entails transferring a large volume of information from front-end to backend and a large expense in communication and computational means, and as a consequence a high price for the system.
All existing systems use a standard, off-the-shelf image acquisition device that provides too many pixels at a frame rate that is too high. They also use standard algorithms that perform expensive processing steps such as edge detection. As a consequence, they must rely on large, expensive hardware that cannot be integrated into a small independent unit.
Systems for image acquisition and interpretation are subject to several requirements. First, the system must compensate for varying levels of illumination, for example day and night, or a cloudy versus a bright day. This requires more than a simple change of shutter speed or other means of exposure compensation: comparing the illumination of a scene in the morning to that in the afternoon, for example, shows that the illumination in different parts of the scene changes differently, due to variations in color, angle, texture and additional factors. Second, the system must be able to disregard slow or repeating changes in the scene such as moving shadows, growing plants, tree limbs moving in the wind, falling snow etc. Third, the system must be able to discern automatically between areas that are very noisy (for example a street corner with heavy traffic) and areas that are quiet (for example the area behind a wall or fence), and be able to adapt itself for maximal detection relative to the objective conditions.
Most existing algorithms for object extraction use computation-intensive steps such as edge detection, object morphology, and template comparison. Additionally, systems that analyze video often require large memory storage space since a number of frames is stored in the memory to allow proper analysis.
JP8077487A2, assigned to Toshiba Corp., discloses an on-road obstacle detecting device. The detection is done by comparing an initial background image with incoming images, to detect a change between the two images.
JP2001126069, assigned to Matsushita Electronic Ind. Co. Ltd., discloses a picture recognition method, whereby an incoming image is compared with a pre-stored image by detecting a part where the difference in luminance is greater than a pre-defined threshold, thus reducing the area of investigation.
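The luminance-difference comparison described for JP2001126069 can be illustrated with a minimal sketch. It assumes grayscale images held as lists of rows of integer luminance values; the function name `changed_region` and the sample values are hypothetical and are not taken from the cited publication.

```python
def changed_region(reference, incoming, threshold):
    """Return the bounding box (top, left, bottom, right) of all pixels whose
    luminance differs from the pre-stored reference by more than `threshold`,
    or None when no pixel exceeds it. Images are lists of rows of ints."""
    changed = [
        (r, c)
        for r, (ref_row, new_row) in enumerate(zip(reference, incoming))
        for c, (ref_px, new_px) in enumerate(zip(ref_row, new_row))
        if abs(new_px - ref_px) > threshold
    ]
    if not changed:
        return None
    rows = [r for r, _ in changed]
    cols = [c for _, c in changed]
    return min(rows), min(cols), max(rows), max(cols)


# Hypothetical 4x4 scene: two pixels brighten well beyond the threshold.
reference = [[10] * 4 for _ in range(4)]
incoming = [row[:] for row in reference]
incoming[1][2] = 90
incoming[2][3] = 80

print(changed_region(reference, incoming, threshold=30))  # (1, 2, 2, 3)
```

Only the returned box would need further investigation, which is the sense in which the comparison reduces the area examined.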
U.S. Pat. No. 6,493,041 to Hanko et al. discloses a method and apparatus for detecting motion in incoming video frames. The pixels of each incoming digitized frame are compared to the corresponding pixels of a reference frame, and differences between incoming pixels and reference pixels are determined. If the pixel difference for a pixel exceeds an applicable pixel difference threshold, the pixel is considered to be “different”. If the number of “different” pixels for a frame exceeds an applicable frame difference threshold, motion is considered to have occurred, and a motion detection signal is emitted. In one or more other embodiments, the applicable frame difference threshold is adjusted depending upon the current average motion being exhibited by the most recent frames, thereby taking into account “ambient” motion and minimizing the effects of phase lag. In one or more embodiments, different pixel difference thresholds may be assigned to different pixels or groups of pixels, thereby making certain regions of a camera's field of view more or less sensitive to motion. In one or more embodiments of the invention, a new reference frame is selected when the first frame that exhibits no motion occurs after one or more frames that exhibit motion.
The system disclosed above does not attempt to discern any pattern in the detected changed pixels, and is therefore prone to false alarms, since a change in illumination and a change in the scene would both be considered a change. Moreover, the system uses a fixed threshold to define a change, so it cannot adapt to varying illumination conditions. The reference against which incoming images are compared is a single image of the scene, which diminishes the system's detection potential because of noise and other factors pertaining to one image taken under certain ambient conditions.
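The two-level thresholding described for U.S. Pat. No. 6,493,041 can be sketched as follows. This is a simplified illustration only: it assumes grayscale frames stored as lists of rows of integers, uses a single fixed pixel threshold and a single fixed frame threshold, and omits the adaptive frame threshold, per-region thresholds and reference-frame reselection. All names and values are hypothetical.

```python
def count_different_pixels(reference, frame, pixel_threshold):
    """Count pixels whose absolute difference from the reference frame
    exceeds the pixel difference threshold."""
    return sum(
        1
        for ref_row, row in zip(reference, frame)
        for ref_px, px in zip(ref_row, row)
        if abs(px - ref_px) > pixel_threshold
    )


def motion_detected(reference, frame, pixel_threshold, frame_threshold):
    """Report motion when the number of "different" pixels exceeds the
    frame difference threshold."""
    return count_different_pixels(reference, frame, pixel_threshold) > frame_threshold


# Hypothetical 3x4 reference frame (12 pixels, all at luminance 100).
reference = [[100] * 4 for _ in range(3)]

# An object covers three pixels.
moved = [row[:] for row in reference]
moved[0][0] = moved[0][1] = moved[1][0] = 200

# The whole scene brightens uniformly; nothing in the scene has moved.
brighter = [[px + 40 for px in row] for row in reference]

print(motion_detected(reference, moved, pixel_threshold=25, frame_threshold=2))     # True
print(motion_detected(reference, brighter, pixel_threshold=25, frame_threshold=2))  # True
```

The second call illustrates the weakness discussed above: a uniform change in illumination marks every pixel as "different", so it is reported as motion even though the scene itself is unchanged.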
Scene interpretation and image recognition systems have a wide variety of applications, some of which are listed below.
Security Systems: The wave of terror that has struck the world in the last 2 years has created the need to defend thousands of kilometers of strategic infrastructure such as electric lines (high voltage lines), railroads, water supply lines and public institutions, not to mention international borders. The existing solutions are expensive and rely heavily on manpower. This new field, sometimes called homeland defense, is growing in importance all around the world.
Providers of Camera Based Surveillance Systems:
American Security Systems Inc.; Vicon Industries, Inc.; CCS International Inc; Visor Tools Inc, Madrid, Spain; Mate-CCTV LTD, Israel; Sensus Technology Ltd.
Airports: Everyone is familiar with the rush of crowds at airports on any given day, as tens of thousands of people rush from point to point attempting to make connections, keep track of their family members and luggage, grab a bite to eat, and shop. There is a need for an inexpensive, reliable people traffic monitoring system, which will allow airport authorities, vendors, and others to plan effectively based on this flow of people. Given today's security threats, such a system should also enable better control during an emergency event, for example by providing the number of people in each wing, section, hall and room.
Transportation - Trains: High-volume commuter rail systems can benefit greatly from knowing the number of passengers that use their service. Ticket sales data provides information about paying passengers, types of tickets sold, etc.; however, ticket sales do not reveal the actual number of passengers using the train service on a specific trip. Moreover, the distribution of passengers among the train's carriages is important for optimizing the size of the train. During emergency events, knowing the number of people per carriage is critical.
Providers of People Traffic Monitors:
Sensus Technology Ltd., UK; International Communication & Electronics Group, USA (Traffic Pro); Acorel, France; CEM Systems Ltd.
Malls and Shopping Centers: There are important marketing needs that can benefit from monitoring people traffic. Questions such as how many people enter a shop or pass a display, how many customers do or do not make a purchase, what staffing levels are needed to handle the number of customers, and whether the walking spaces in the shop or display room are adequate for the pedestrian flow are of extreme importance for business planning and management. Knowing when and where a customer enters the store can vastly improve operating effectiveness. By integrating people counting systems with sales data, retailers can obtain conversion ratios or average spending per head and manage cost effectiveness better.
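As a simple illustration of combining people counts with sales data, the two figures mentioned above can be computed as below. The function names and the sample numbers are hypothetical; the sketch assumes the conversion ratio is taken as transactions divided by counted visitors.

```python
def conversion_ratio(transactions, visitors):
    """Fraction of counted visitors who made a purchase."""
    return transactions / visitors if visitors else 0.0


def average_spend_per_head(total_sales, visitors):
    """Total sales divided by the number of counted visitors."""
    return total_sales / visitors if visitors else 0.0


# Hypothetical day: 400 people counted at the entrance,
# 100 sales transactions totalling 2500.00 in local currency.
visitors = 400
transactions = 100
total_sales = 2500.0

print(conversion_ratio(transactions, visitors))       # 0.25
print(average_spend_per_head(total_sales, visitors))  # 6.25
```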
Providers of People Counters for Marketing Research:
Elmech Co., UK; Chamber Electronics, UK; Watchman Electronics, NZ; RCT Systems Inc., Chicago, USA; FootFall, UK.
Elevator Management: A great deal of innovation has been invested in optimizing the operation of a “fleet” of elevators in big buildings (3 elevators and up). Some solutions use queue management with algorithmic scheduling; others rely on artificial intelligence. Still, people in busy buildings can wait a long time for an elevator, only to find when it stops that it is full. Simply knowing how many people are in the elevator and how many are waiting in the elevator lobby on each floor can improve the service dramatically. The advantage is not only in service. Improving the efficiency of elevators can reduce operational and maintenance costs, and may help reduce the number of elevators in new buildings.
Industrial Management: Many industrial manufacturing processes require counting the products produced or overseeing the manufacturing process. Systems for this purpose must work at very high speeds and with high-resolution photography in order to count the various products correctly.
Providers of Industrial Sensors:
Omron Corporation (Omron Group).