The invention pertains to detection of people, and particularly to detection of occupants in vehicles. More particularly, it pertains to image fusion in the near-infrared band under various kinds of environmental conditions.
The gathering of usage statistics in the high occupancy vehicle (HOV) lane is desired by some government agencies. These statistics are crucial for construction planning. Currently, the gathering of data is performed manually. This approach is obviously laborious, inefficient, and prone to error.
There are compelling reasons for the existence of an automatic occupant counting system in the HOV lane. It would facilitate the gathering of statistical data for road construction planning. It would enable state authorities to charge a nominal fee to single occupant vehicles in HOV lanes. It would also help the state patrols to perform their monitoring tasks more effectively.
The occupant counting system needs to be reliable. In a sensing system, reliability is rarely achieved if the signal is corrupted with noise. The first concern in the present effort is to produce a signal with as distinct a signature for the vehicle occupant as possible. This goal can be achieved only through careful design and arrangement of the sensing elements.
If one manages to acquire a clear imaging signal through sensors, then even moderately powerful pattern recognition algorithms could accomplish the occupant detection task. If, however, the imaging signal were noisy, then even the most powerful pattern recognition algorithms could not accomplish the task.
Related efforts by others have involved the use of a near-infrared camera (0.55 to 0.90 micron) and a near-infrared illumination source in the same range of wavelengths. One reason for using near-infrared sensing was the ability to use non-distracting illumination at night. Illumination at nighttime enhances the quality of the image. However, it appears that this choice of wavelength range is not appropriate because of its close proximity to the visible spectrum. Psychophysical experiments have shown that the human eye has some sensitivity, however small, to this range of near-infrared wavelengths. This sensitivity may be sufficient to cause accidents under certain conditions. Another reason for this approach, according to others, was to bypass the problems caused by solar illumination during daytime, such as glare. Nevertheless, particularly in that range (i.e., 0.55 to 0.90 micron), solar illumination is still substantial, and the associated glare can be reduced only through the use of polarizing filters.
In more general terms, related art projects that involve imaging usually adopt visible spectrum cameras. The strong point of the visible spectrum approach is that the relevant imaging sensors are very advanced and at the same time the cheapest across the electromagnetic (EM) spectrum. Visible spectrum cameras have a particular advantage in terms of speed, which is an important consideration in the HOV lane, where vehicles are moving at speeds of 65 mph. These cameras can also have very high resolution, resulting in very clear images under certain conditions. Unfortunately, there are serious problems with the visible spectrum approach. For instance, some vehicles have heavily tinted window glass to reduce glare from solar illumination. This glass is nearly opaque to visible spectrum cameras. Also, visible spectrum cameras do not have operational capability during nighttime.
Many researchers adopt the visible spectrum as the spectrum of choice or, in rare cases, some other portion of the EM spectrum, based primarily on intuition. The result is that they usually end up with a non-discriminating signal that makes the detection problem appear more difficult than it actually is. They then try to address the difficulty by devising powerful pattern recognition algorithms, but often to no avail. The loss of information because of a poor choice of sensor, spectrum, and arrangement is usually irrevocable.
Visible spectrum or very near infrared detection of people in vehicles has not been very successful under most conditions. The glare and other problems caused by solar illumination, such as through vehicle windows, have prevented effective detection of vehicle occupants. Also, environmental conditions such as weather obscure detection. People appear to have darker or lighter faces, depending on the characteristics of the people being detected, and on the incident angle and intensity of deliberate or incidental illumination. Other wavelengths of the EM spectrum do not appear to offer inexpensive, compact, and high resolution sensing and detection of human beings in vehicles.
The short-wavelength end of the EM spectrum consists of gamma rays, x-rays, and radiation in the ultraviolet range. Radiation of such wavelengths is harmful. This radiation is typically used in a controlled manner in medical applications.
At the long-wavelength end of the EM spectrum, there is microwave and radio radiation. This range has only recently begun to be exploited for imaging purposes. Sensors in this range operate in an active or a passive mode. The major advantage of these longer wavelengths is that they can penetrate clouds, fog, and rain, producing weather-independent imaging results. The technology for these wavelengths is new and prohibitively expensive. Also, the sensors in this range of radiation are bulky and feature very low resolution. Useful application of these sensors is currently confined to the military and remote-sensing domains.
The present invention utilizes radiation in the middle region of the EM spectrum regarded as the infrared spectrum. This spectrum includes wavelengths from 0.7 to 100 microns. Within the infrared range, three bands of particular interest are the 0.7 to 3.0 micron, 3.0 to 5.0 micron, and 8.0 to 14 micron bands. The latter two bands are regarded as the thermal infrared band and the first band as the reflected infrared band. The reflected infrared band is associated with reflected solar radiation that contains no information about the thermal properties of materials. This radiation is for the most part invisible to the human eye. The thermal infrared band, on the other hand, is associated with the thermal properties of materials.
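The band boundaries stated above can be summarized in a short lookup, given here only as an illustrative sketch. The function name `infrared_band` and the returned labels are hypothetical. The text does not say which band the shared 3.0 micron boundary belongs to, nor does it name the 5.0 to 8.0 micron or 14 to 100 micron regions; the sketch assumes 3.0 microns falls in the thermal band and reports the unnamed regions as such.

```python
def infrared_band(wavelength_um: float) -> str:
    """Classify a wavelength (in microns) using the bands named in the text.

    Assumptions (not stated in the source): the 3.0 micron boundary is
    assigned to the thermal band, and band end points are inclusive.
    """
    if not 0.7 <= wavelength_um <= 100.0:
        return "outside infrared"
    if wavelength_um < 3.0:
        # 0.7 to 3.0 microns: reflected solar radiation, no thermal information
        return "reflected infrared"
    if wavelength_um <= 5.0 or 8.0 <= wavelength_um <= 14.0:
        # 3.0 to 5.0 and 8.0 to 14 microns: tied to thermal properties of materials
        return "thermal infrared"
    # Infrared wavelengths that the text does not assign to a named band
    return "infrared (band not named in the text)"


if __name__ == "__main__":
    for w in (0.5, 1.5, 4.0, 6.0, 10.0):
        print(w, "->", infrared_band(w))
```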
The thermal infrared band is significant for several reasons. First, the human body maintains a relatively constant temperature of about 37 degrees Celsius (C), irrespective of physical characteristics or illumination conditions. This indicates a consistent light color pattern for the faces of vehicle occupants subject to thermal infrared imaging. This consistency is lacking in the visible spectrum. Such consistency facilitates interpreting sensed images. Further, the thermal property serves as a differentiator between humans and dummies. Also, a sensor functioning in the thermal region is operational day and night without any need for an external illumination source.
However, one concern is the attenuation of thermal infrared radiation caused by glass, when detecting humans in a vehicle. The glass severely disrupts the transmission of infrared radiation at wavelengths greater than 2.8 microns. At 2.8 microns, thermal energy just begins to appear. To obtain an infrared image under such conditions, one needs a very sensitive mid-infrared camera in the range from 2.0 to 3.0 microns. Vehicle windows are not made from common glass for reasons of safety, energy efficiency, and visibility. Also, the composition of the front windshield differs significantly from the composition of the side windows of a vehicle. The side windows are more transparent to the transmission of thermal infrared radiation. However, detection with a near-infrared camera significantly reduces this problem of radiation attenuation.
A near-infrared camera, if it is restricted to the appropriate range, outputs similar imaging signals for various humans despite their having different colors of skin. However, this camera outputs a much different imaging signal for a dummy having the same visible color as the human skin.
One embodiment of the present invention has two cameras with different sensing wavelengths in the near-infrared band. These cameras are pointed toward a place where humans may be detected. A near-infrared lamp may be used to illuminate the scene. The two camera outputs are fused together with a weighted difference to produce an image having intensified contrast. The fused image goes to a post-processor, which performs binary thresholding on the pixels of the fused image. The result is an image in which each pixel is either black or white. The thresholded output undergoes such operations as fuzzy neural network or analytical processing. The thresholding diminishes all of the background of the viewed scene except human skin, such as faces. This approach is one embodiment of the human detector.
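The fusion and thresholding steps of this embodiment can be illustrated with a minimal sketch. It is not the implementation of the invention. The function names `fuse_weighted_difference` and `binary_threshold`, the direction of the subtraction, the weight, and the threshold value are all assumptions chosen for illustration, and the two camera images are assumed to be already registered pixel for pixel. The intensity values in the usage example are arbitrary and do not represent measured data.

```python
from typing import List

Image = List[List[float]]  # row-major grid of pixel intensities


def fuse_weighted_difference(first: Image, second: Image, weight: float) -> Image:
    """Pixel-wise weighted difference: first - weight * second.

    Which camera output is subtracted from which, and the value of the
    weight, are assumptions; the text only states that a weighted
    difference is used.
    """
    if len(first) != len(second):
        raise ValueError("images must have the same number of rows")
    fused: Image = []
    for row_a, row_b in zip(first, second):
        if len(row_a) != len(row_b):
            raise ValueError("images must have the same row length")
        fused.append([a - weight * b for a, b in zip(row_a, row_b)])
    return fused


def binary_threshold(image: Image, threshold: float) -> List[List[int]]:
    """Map each pixel to 1 (white) if it exceeds the threshold, else 0 (black)."""
    return [[1 if px > threshold else 0 for px in row] for row in image]


if __name__ == "__main__":
    # Arbitrary 2x2 example intensities for the two camera outputs.
    camera_a = [[8.0, 3.0], [5.0, 9.0]]
    camera_b = [[2.0, 3.0], [5.0, 1.0]]

    fused = fuse_weighted_difference(camera_a, camera_b, weight=1.0)
    mask = binary_threshold(fused, threshold=4.0)
    print(fused)
    print(mask)
```

In this sketch, pixels where the two camera outputs differ strongly survive the threshold as white, while pixels where the outputs are similar become black. The resulting binary image is what would be passed on to the subsequent processing stage described above.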