1. Field of the Invention
The present invention relates to a sensor system capable of measuring the relative position and attitude of moving and stationary objects. In particular, this sensor system can detect and track objects equipped with surfaces that act as retroreflectors in the visible and near-infrared part of the spectrum. Such surfaces are already available in the taillights of all cars, trucks, and motorcycles, as well as in roadway lane markers, and can be easily and cheaply added to objects in other environments, such as railroads, factories, and airports.
2. Description of the Prior Art
One of the emerging trends in technology development is the addition of autonomous capabilities to many new products in ground transportation (cars, buses, trucks, trains), in aviation (commercial aircraft, military drones) and in specialized applications (factory automation, airport service facilities). This trend is greatly facilitated by the miniaturization of electronic components, the rapidly decreasing cost of computing power, and the recent surge of technology transfer from military to commercial applications. These advances have not only made it technically feasible to build systems that would have been unthinkable a few years ago, but have also dramatically decreased the cost of their implementation, thus making them suitable for mass production and commercial deployment.
The motivation for this trend towards autonomous operation comes primarily from considerations of safety, comfort, and cost.
Safety is the main beneficiary in cases where unmanned drones (all-terrain vehicles, airplanes, helicopters) are used in hazardous environments. Examples of such applications include: searching for trapped people in burning buildings, collapsed structures, and spaces filled with poisonous gases; filming exclusive footage of natural disasters such as exploding volcanoes; and military operations for de-mining, reconnaissance, and surveillance behind enemy lines. The use of human-operated vehicles in these environments would endanger the health or even the lives of the operators, and would also impose minimum size restrictions that would make it impossible to explore small spaces where people may be trapped. Increased safety is also the main concern in on-board vehicle systems such as collision warning, collision avoidance, lane departure warning, and lane keeping. These systems warn the driver/operator with an audible and visible signal when the vehicle is about to collide with another object or when it is about to leave its current lane on the roadway, and, if so equipped, they automatically activate the brakes and/or the steering to reduce speed and/or change course to avoid a collision or to maintain the vehicle's current course.
In applications such as adaptive cruise control, where the speed of the vehicle is automatically adjusted to follow the preceding vehicle at a safe distance, or vehicle following, where the vehicle's speed and direction are adjusted to follow the course of the preceding vehicle, the main consideration is the comfort and convenience of the driver/operator, with increased safety being a secondary but very important benefit.
Finally, significant cost savings motivate future applications such as electronic towing, highway platooning, automated airport vehicles, and automated manufacturing robots. In electronic towing, two or more commercial vehicles are operated in tandem, with the first vehicle being manually driven by a human operator, and the following vehicles being "electronically towed" without drivers, thereby reducing the number of drivers and the associated cost by 50% or more. In highway platooning, traffic is segmented into "platoons", each composed of several cars that follow each other at very small distances of 1-2 m, driven not by their human occupants (who can resume manual operation once their car leaves the platoon), but by the on-board electronics that automate the steering, acceleration, and braking functions. This "automated highway system" has the potential of significantly increasing the traffic throughput of existing highways at a mere fraction of the cost of building new highways that would be able to handle the same additional traffic, while also improving the safety and comfort of the people who use this highway system for their transportation needs. While these applications may be several years away from their actual implementation, the same technology can be used in the near term to automate airport vehicles that carry baggage and goods between terminals and airplanes, at a much lower cost than human drivers. The same concept also applies to factory automation, where driverless vehicles can carry parts that are loaded and unloaded by automated robots.
These applications are currently in different stages of deployment. Collision warning, lane departure warning, and adaptive cruise control systems are already available as commercial products in high-end passenger cars and commercial trucks; unmanned drones are already used in military operations; and automated robots are already fully operational in many modern factories. Collision avoidance, lane keeping, vehicle following, and automated airport vehicles are still under development, but are approaching the point of commercial product release, while electronic towbars and automated highway systems are in the research stage, with several successful demonstrations already completed. The three major factors that differentiate these applications and influence the timeline of their deployment are: (1) whether their operation is autonomous or cooperative, (2) whether they operate in a controlled or uncontrolled environment, and (3) whether their role is passive or active. For example, collision warning systems are autonomous, because they rely only on measurements gathered by the host vehicle and do not require any special modifications to the surrounding cars and highway environment; they operate in the uncontrolled environment of public highways; and they passively warn the driver of an impending collision. Adaptive cruise control is also autonomous and operates in an uncontrolled environment, but it is an active system, since it actuates the throttle and brake to increase or decrease speed in order to maintain a safe distance from the preceding vehicle. 
Electronic towbar and automated highway systems are active (they actuate the steering in addition to the throttle and brake) and operate in an uncontrolled environment, but they are not autonomous since they rely on cooperation from their environment, namely from the preceding vehicle in the case of the electronic towbar, or from the other platoon members and the roadway infrastructure in the case of automated highways. Finally, airport and factory automation vehicles are active and cooperative systems, but they operate in a controlled environment where unexpected events can be kept to a minimum.
Despite their differences, all these applications share a common trait: they all need sensors that can provide accurate and reliable information about the surrounding environment. From collision warning to automated airport vehicles, and from adaptive cruise control to multi-car platooning, each of these systems depends critically on its "eyes", namely the ranging sensors that "see" other cars on the highway or other robots and obstacles on the factory floor, and provide crucial information about how far each of these objects is, which direction it is coming from, and how fast it is approaching.
The currently available sensor technologies can be classified into five main categories: radar (microwave or millimeter-wave), computer vision, time-of-flight laser, sonar, and GPS. These are detailed below in order of increasing utility for the applications discussed above.
Sonar sensors emit acoustic pulses and measure the time it takes for the pulse to bounce off the target and return to the sensor, usually called the "time of flight". Multiplying this time by the speed of sound yields the distance from the source to the target and back. This process provides very accurate and reliable measurements for targets that are less than 1 m away, but its performance drops off very quickly as the distance increases, and becomes unacceptable for obstacles more than 5 m away. Consequently, sonar is widely used in products whose operating range is up to approximately 3 m, such as systems that help the driver park in tight spaces by providing a visual or audible indication of the distance to the obstacles behind or in front of the host vehicle. In all of the applications discussed above, where the desired operating range is at least 20 m and up to 200 m, sonar is not a viable ranging technology.
Time-of-flight laser uses the same concept as sonar: an infrared laser emits pulses and the sensor measures the time it takes for each pulse to return. The two main differences are that (1) the energy of the laser beam is highly concentrated along a single direction, while the sonar pulses travel in all directions, and (2) the laser pulses travel at the speed of light, not at the speed of sound. The first difference implies that, in order to cover a reasonably wide field of view, the system needs either a lens that disperses the laser beam along the horizontal and vertical directions or a scanning mechanism that automatically points the laser beam at different directions. The advantage of the lens dispersion is that it is easy to implement; the disadvantage is that it makes it impossible to detect the specific direction of the target. One possible remedy for this problem is the use of several laser beams, each with its own small dispersion angle; the number of beams used is proportional to the desired resolution in terms of direction sensing, but also to the complexity and cost of implementation. The scanning mechanism, on the other hand, makes it very easy to detect the direction of the target (it is the same as the direction in which the beam was pointing when the pulse was emitted), but its construction and implementation is very complicated and very fragile, since it involves many moving or spinning parts that must be very accurately positioned with respect to each other. The second difference, namely the fact that the laser pulses travel at the speed of light, means that the time it takes for them to return to the source after being reflected off the target is about one million times shorter than for sonar. 
Therefore, the instruments that measure this time of flight must be extremely sensitive and accurate: in order to measure the distance to a target 30 m away with an error no larger than 1 m (a not very stringent requirement in the applications we are discussing), the sensor must be able to measure a time interval of 100 ns (30 m / (3×10⁸ m/s) = 10⁻⁷ s) with an error no larger than 3.3 ns. While it is entirely possible to measure signals with such accuracy, the corresponding hardware is very expensive. Currently available prototypes intended for mass production use less expensive hardware with lower resolution; as a result, their reported errors are in the order of several meters, which is not suitable for most of the applications we are discussing. Another problem with this technology is that it does not operate reliably in rain, fog, snow, or whenever the road is wet and the preceding vehicle creates "road spray". This problem is due to the fact that the laser energy reflected from airborne water particles or snowflakes confuses the sensor and results in "ghost images". This makes time-of-flight laser unsuitable for open-road applications.
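The timing figures discussed above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the disclosed apparatus; the function names are illustrative, and the speed of sound (343 m/s, dry air at about 20 °C) is an assumed value that does not appear in the text. Note that the 100 ns figure corresponds to the one-way travel time over 30 m; the round trip that a time-of-flight sensor actually times is twice as long, and the timing tolerance for a given range error doubles with it.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumption: dry air at about 20 °C
SPEED_OF_LIGHT_M_S = 3.0e8   # rounded value used in the text


def one_way_time_s(distance_m: float, speed_m_s: float) -> float:
    """Time for a pulse to cover distance_m once."""
    return distance_m / speed_m_s


def round_trip_time_s(distance_m: float, speed_m_s: float) -> float:
    """Time for a pulse to reach the target and come back."""
    return 2.0 * distance_m / speed_m_s


def range_from_round_trip_m(time_s: float, speed_m_s: float) -> float:
    """Target distance recovered from a measured round-trip time."""
    return speed_m_s * time_s / 2.0


if __name__ == "__main__":
    target_m = 30.0
    t_one_way = one_way_time_s(target_m, SPEED_OF_LIGHT_M_S)
    t_round = round_trip_time_s(target_m, SPEED_OF_LIGHT_M_S)
    print(f"laser, one-way time over {target_m} m: {t_one_way * 1e9:.1f} ns")
    print(f"laser, round-trip time over {target_m} m: {t_round * 1e9:.1f} ns")
    # Timing error equivalent to 1 m of one-way path length.
    print(f"time per metre: {one_way_time_s(1.0, SPEED_OF_LIGHT_M_S) * 1e9:.2f} ns")
    # How much shorter the laser time of flight is than the sonar one.
    print(f"light/sound speed ratio: {SPEED_OF_LIGHT_M_S / SPEED_OF_SOUND_M_S:,.0f}")
```

With the assumed speed of sound, the ratio printed on the last line is somewhat below one million, consistent with the "about one million times shorter" estimate.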
Millimeter-wave radar systems transmit a modulated waveform and measure the phase shift of the reflected signal to compute the distance of the target. Since they do not measure the time of flight, they are generally more accurate than time-of-flight laser. Furthermore, their operating frequency is in the order of 10-100 GHz, which means that their wavelength is in the order of 3-30 mm, which is several thousand times larger than the 800 nm wavelength of infrared lasers. The longer wavelength renders water particles, snowflakes, and the irregularities of most surfaces essentially invisible to radar. This has two direct results: First, radar can penetrate rain, fog, snow, and road spray, which makes it ideally suited for use in poor weather conditions. Second, radar waves are efficiently reflected by almost all surfaces and materials found in everyday objects, and therefore radar sensors can detect the presence of almost any obstacle around them. While this property is useful for avoiding potential collisions, it is also the source of the main problem with radar sensors, namely multiple returns. Almost every surface reflects the radar energy, so the returned wave contains the reflections from many different objects that are at different distances and different directions; since these returns are all added into one signal, it becomes very difficult to distinguish the objects that are real targets, such as cars ahead, from others that are not, such as the pavement of the road. This problem is dealt with at both the hardware and the software level with varying degrees of success. At the software level, the solutions include sophisticated algorithms that process the radar returns and attempt to isolate the signals that are produced by targets of interest; these algorithms can be tuned to correctly detect some types of targets, such as vehicles with metal sheet covering, but usually at the expense of not detecting others, such as low-profile fiberglass-bodied sports cars. 
At the hardware level, the solutions are similar to those employed in time-of-flight laser, including the use of multiple radar beams and scanning mechanisms. Scanning is usually implemented through the use of a multi-beam antenna array whose component antennas have electronically controlled relative phase; appropriate selection of the component phases yields a highly directional overall antenna whose direction of maximum sensitivity scans the desired field of view.
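The wavelength figures quoted for millimeter-wave radar follow from the relation wavelength = c / frequency. The short sketch below reproduces them under the rounded value c = 3×10⁸ m/s; the function name is illustrative only.

```python
SPEED_OF_LIGHT_M_S = 3.0e8        # rounded value
INFRARED_WAVELENGTH_M = 800e-9    # 800 nm infrared laser, as quoted in the text


def wavelength_m(frequency_hz: float) -> float:
    """Free-space wavelength for a given frequency."""
    return SPEED_OF_LIGHT_M_S / frequency_hz


if __name__ == "__main__":
    for freq_ghz in (10.0, 100.0):
        lam = wavelength_m(freq_ghz * 1e9)
        ratio = lam / INFRARED_WAVELENGTH_M
        print(f"{freq_ghz:5.0f} GHz: wavelength {lam * 1e3:.0f} mm, "
              f"{ratio:,.0f} times the infrared wavelength")
```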
Yet another significant disadvantage of radar is the sensitivity of its own measurements to other similar devices operating around it. The signal sent from the transmitter is reflected in all directions; hence, this reflected signal affects all other receivers operating nearby. As a result, when there are many similar devices operating in the surrounding environment, as would be the case in dense highway traffic, each object in the scene will produce many returns at different time instants, and all of these returns will show up in the signal measured by each receiver. This means that the scene becomes heavily cluttered with multiple returns, and that makes it very difficult to identify the separate targets and reliably compute their respective locations. The problem becomes even worse in the case of vehicles with similar devices traveling in opposing directions of traffic. In that case, the transmitted signal of the oncoming vehicle is much stronger than the reflections of the host signal from surrounding objects. Thus, oncoming vehicles can flood the host vehicle's receiver and render it momentarily blind.
Computer vision differs from all the above technologies in the sense that it does not transmit anything. In contrast to sonar, laser, and radar, vision is a completely passive sensing approach that simply records images, relying on existing visible light (natural or artificial) to provide the necessary illumination. These images are then processed to extract the information that is needed for the particular application, such as the existence and location of obstacles, or the curvature of the road ahead. The main advantages of computer vision are its high resolution and its ability to track many different targets at the same time. The fact that computer vision can at best detect the same obstacles as human vision means that these sensors do not operate reliably in bad weather and especially at night, if the artificial lighting is inadequate. But the main disadvantage of computer vision is the fact that, in order to realize its potential and provide reliable and accurate data, it has to process images at a rate fast enough for the corresponding application. The computational power required for such real-time image processing depends on the desired accuracy, since higher accuracy is achieved through higher image resolutions, and on the desired speed of response. For applications where the ambient scene is static or changes very slowly, such as a slow factory automation task, these requirements may be satisfied by an inexpensive microprocessor. But for the highly dynamic environment of a busy highway, where it may be necessary to process 20 frames per second and extract the necessary information from each frame in less than 50 ms, the corresponding computing power may be prohibitively expensive. 
In existing implementations, this obstacle is overcome through the use of specialized image processing techniques that exploit the prior knowledge of the structure of the specific application environment (highway, factory floor, airport) to significantly reduce the computational requirements.
Finally, GPS-based ranging relies on the signals from the satellites of the Global Positioning System. Each host vehicle is equipped with a GPS receiver that processes the available signals to produce a measurement of the vehicle's current position. An on-board transmitter then broadcasts this measurement to the neighboring vehicles, while a separate receiver receives the transmitted locations of the neighbors (who are assumed to be equipped with the same hardware). Thus, each vehicle knows its own location and the location of its neighbors. The advantages of this technology are (1) that the GPS signals are available everywhere on the planet, and (2) that the necessary on-board hardware is inexpensive. The main disadvantage is that this technology is completely dependent on transmissions from the neighboring vehicles. Since any object that is not equipped with this system cannot be detected by any of its neighbors, this approach can only be used in cooperative scenarios, such as electronic towing or automated airport vehicles, and is entirely unsuitable for any of the near-term autonomous applications, such as collision warning or adaptive cruise control. Another disadvantage is that the position computation based on the commercially available GPS signals is not accurate, with errors in the order of 10-100 m. This problem can be overcome through the use of a Differential GPS (D-GPS) system. In this system, secondary local transmitters at fixed known locations retransmit the GPS satellite signal along with their own position. This allows the D-GPS receiver on a moving vehicle to compute its relative position with respect to the fixed local transmitter, and thus its absolute position, with errors that are claimed to be as small as 2-5 cm. However, this solution amplifies the dependency problem described above, since it requires that not only the other vehicles but also the surrounding environment (roadway, airport) be equipped with GPS receivers and transmitters.
In summary, existing ranging technologies have significant drawbacks, which limit their utility in applications that involve dynamically changing environments. Many of these limitations can be overcome through known techniques, which, however, usually involve a substantial increase in the associated cost of the sensor. Since cost is one of the most important criteria in commercial applications, especially those involving mass markets such as the automotive industry, it would be desirable to develop a sensor technology that can provide accurate and reliable measurements at a reasonable cost.
The present invention discloses a new ranging method that eliminates many of the drawbacks of existing technologies, and does so through the use of low-cost components that are currently mass-produced and commercially available.
The corresponding apparatus has three primary components: (1) a fast on/off illuminator, i.e., a device that generates light and that can be switched on or off in less than 1 ms, such as an array of power Light-Emitting Diodes (LEDs) or a low-power laser, or even the gas-discharge or solid-state headlights used in many modern automobiles, (2) one or more imagers with on-board storage capability, i.e., devices that can record an image and store it on the device itself, protecting it from further exposure to light, such as Charge-Coupled Device (CCD) or Complementary Metal-Oxide-Semiconductor (CMOS) imaging chips, and (3) a microprocessor that operates the illuminator and the imagers automatically, and processes the data collected from the imagers to produce ranging information about objects in the imagers' field of view.
The apparatus can detect objects with retroreflective surfaces, such as those contained in the taillights of all cars, buses, trucks, and motorcycles. The detection of these objects is achieved through the process of image subtraction. The microprocessor first instructs the imager to record an image of the scene in front of it, while the illuminator is turned off; then, the microprocessor turns the illuminator on and instructs the imager to record a second image of the scene. The first image is then subtracted from the second, leaving only the returns of the retroreflective surfaces in the subtracted image. This sparse image is then stored in the microprocessor and processed with appropriate software algorithms whose function is to filter out the noise, identify the targets, and compute the distance and azimuth angle of each detected target through triangulation. The distance can be computed in terms of an absolute measure in meters, or in changes in relative distance, such as a percentage change in a given unit of time or a multiple of some measure of distance in the field of view, such as the distance between the taillights or between the detectors.
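The image subtraction step described above can be illustrated with a small sketch. It is an illustration only, not the disclosed implementation: images are modeled as lists of rows of integer pixel intensities, and the `threshold` parameter is an assumed, simple stand-in for the noise filtering that the text assigns to later software processing.

```python
from typing import List

Image = List[List[int]]


def subtract_images(lit: Image, dark: Image, threshold: int = 0) -> Image:
    """Return the pixel-wise difference (lit - dark).

    Differences at or below `threshold` (including negative ones) are set
    to zero, so only pixels that became clearly brighter under illumination
    survive in the subtracted image.
    """
    if len(lit) != len(dark) or any(len(a) != len(b) for a, b in zip(lit, dark)):
        raise ValueError("both images must have the same shape")
    result: Image = []
    for lit_row, dark_row in zip(lit, dark):
        out_row = []
        for lit_px, dark_px in zip(lit_row, dark_row):
            diff = lit_px - dark_px
            out_row.append(diff if diff > threshold else 0)
        result.append(out_row)
    return result


if __name__ == "__main__":
    # Illuminator off: ambient scene only.
    dark = [[10, 12, 11],
            [50, 52, 49],
            [10, 11, 10]]
    # Illuminator on: two pixels brighten strongly (retroreflector returns).
    lit = [[11, 12, 10],
           [51, 200, 50],
           [10, 11, 230]]
    for row in subtract_images(lit, dark, threshold=5):
        print(row)
```

In this toy example the ambient background cancels and only the two strongly brightened pixels remain, which is the sparse image the subsequent processing operates on.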
In order to guarantee that the subtraction process eliminates all returns except for the reflections of the illuminator's light from retroreflective surfaces, the two images have to be recorded in rapid succession. The present invention discloses a procedure for drastically reducing the elapsed time between the recordings of the two images. In the simplest embodiment, the bottom ⅔ of the surface of the imaging chip is covered by an opaque mask, which protects the pixels behind it from further exposure. The remaining top ⅓ is exposed and that is where both images are recorded using a four-step "expose-shift-expose-shift" process: first, the image with the illuminator off is recorded in the exposed part of the chip; second, the contents of the imager are shifted down by ⅓ the total number of rows, which means that the first image now occupies the top half of the area behind the opaque mask and is protected from further exposure; third, the image with the illuminator on is recorded in the exposed part of the chip; fourth, the contents of the imager are again shifted down by ⅓ the total number of rows, which means that the first image now occupies the bottom half and the second image the top half of the covered area, and that both of the pictures are protected from further exposure. Since the process of shifting the contents of the imager down by one row is about 100 times faster than the process of digitizing and reading out one row of data, the on-chip storage scheme renders the invention suitable for use in rapidly changing environments, such as highway traffic.
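The four-step process described above can be traced with a simple software model of the chip. The sketch below is a simulation under stated assumptions, not the imager's actual control logic: the chip is modeled as a list of rows with index 0 at the top, the exposed region is the top third, rows shifted past the bottom edge are discarded, and rows entering at the top are cleared to zero. All function names are illustrative.

```python
from typing import List, Tuple

Image = List[List[int]]


def expose(chip: Image, scene: Image) -> Image:
    """Record `scene` in the exposed top rows; masked rows are unchanged."""
    return [list(row) for row in scene] + chip[len(scene):]


def shift_down(chip: Image, n_rows: int) -> Image:
    """Shift the chip contents down by n_rows, clearing the top rows."""
    width = len(chip[0])
    cleared = [[0] * width for _ in range(n_rows)]
    return cleared + chip[:len(chip) - n_rows]


def expose_shift_expose_shift(scene_off: Image, scene_on: Image) -> Tuple[Image, Image, Image]:
    """Run the four steps and return (chip, stored_off, stored_on)."""
    n = len(scene_off)                      # rows in the exposed third
    width = len(scene_off[0])
    chip = [[0] * width for _ in range(3 * n)]
    chip = expose(chip, scene_off)          # step 1: illuminator off
    chip = shift_down(chip, n)              # step 2: move behind the mask
    chip = expose(chip, scene_on)           # step 3: illuminator on
    chip = shift_down(chip, n)              # step 4: move behind the mask
    stored_on = chip[n:2 * n]               # top half of the masked area
    stored_off = chip[2 * n:3 * n]          # bottom half of the masked area
    return chip, stored_off, stored_on


if __name__ == "__main__":
    off = [[1, 2], [3, 4]]
    on = [[5, 6], [7, 8]]
    chip, stored_off, stored_on = expose_shift_expose_shift(off, on)
    for row in chip:
        print(row)
```

After the fourth step the exposed third is empty, the illuminated image occupies the top half of the masked area, and the non-illuminated image occupies the bottom half, matching the description.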
In particular, the invention is embodied in an apparatus for ranging an object comprising an illuminator to illuminate a field of view potentially including the object and an imager to receive reflected signals from the field of view. The illuminator comprises an LED, a headlight, or a laser. The imager captures a first image having reflected signals from the field of view when the field of view is illuminated by the illuminator and a second image having reflected signals from the field of view when the field of view is not illuminated by the illuminator. A circuit is coupled to the imager to synchronously control the illuminator and the imager, and to generate a subtraction image of the field of view as a pixel difference between the first image and the second image. The imager captures one of the first image and the second image while the other one of the second image and the first image is still captured in the imager.
In one embodiment the illuminator comprises a first and a second illuminator. The first illuminator is arranged and configured to illuminate a near field of view and the second illuminator is arranged and configured to illuminate a far field of view.
The imager comprises an imaging pixel array in which the pixels of the array are organized into a two dimensional array comprised of pixel lines forming a first and a second group of pixel lines. The first group of pixel lines is unmasked and the second group of pixel lines is masked to prevent direct recording of imaged data therein. In a first exposure of the pixel array, the circuit records a first set of image data in the first group of pixel lines when the field of view is illuminated by the illuminator, and then shifts the first set of image data into the second group of pixel lines. In a second exposure of the pixel array the circuit records a second set of image data in the first group of pixel lines when the field of view is not illuminated by the illuminator, and then shifts the second set of image data into the second group of pixel lines. The second group of pixel lines then contains the stored values of the first and second sets of image data. Since a subtraction image is the goal of the foregoing process, it does not matter whether the illuminated or the non-illuminated image is recorded first. Hence, the first exposure can be taken when the field of view is not illuminated by the illuminator, and the second exposure can be taken when the field of view is illuminated by the illuminator.
In one embodiment, the second group of pixel lines comprises a contiguous subarray of pixel lines including two thirds of the pixel array. In a first version of this embodiment the pixel array comprises rows and columns of pixels and the contiguous subarray of pixel lines forming the second group of pixel lines forms a block of columns of the pixels. In a second version of this embodiment the contiguous subarray of pixel lines forming the second group of pixel lines forms a block of rows of the pixels.
In still another embodiment the first group of pixel lines comprises alternating pixel lines in a first half of the pixel array and the second group of pixel lines comprises all remaining pixel lines in the pixel array. In a first version of this embodiment, the pixel array comprises rows and columns of pixels and alternating pixel lines forming the first group of pixel lines forms a set of columns of the pixels. In a second version of this embodiment the alternating pixel lines forming the first group of pixel lines forms a set of rows of the pixels.
In yet another embodiment the first group of pixel lines comprises alternating pixels in each line in a first half of the pixel array, with each alternating pixel being offset from the alternating pixels in adjacent lines of pixels to form a checkerboard pattern. The second group of pixel lines comprises all remaining pixel lines in the pixel array. In a first version of this embodiment, the pixel array comprises rows and columns of pixels and the alternating pixel lines forming the first group of pixel lines form a set of columns of the alternating pixels. In a second version of this embodiment, the alternating pixel lines forming the first group of pixel lines form a set of rows of the alternating pixels.
In yet another embodiment the first group of pixel lines comprises contiguous pixel lines in a middle third of the pixel array, and the second group of pixel lines comprises all remaining pixel lines in the pixel array. In a first version of this embodiment the pixel array comprises rows and columns of pixels and the middle third of the pixel array forming the first group of pixel lines forms a contiguous block of columns of the pixels. In a second version of this embodiment the middle third of the pixel array forming the first group of pixel lines forms a contiguous block of rows of the pixels.
The first and second exposures are taken in time sequence without processing of the image data between each exposure. The first and second images are taken in time sequence separated by a time interval small enough to guarantee that no substantial changes occur between the first and second images of the field of view. The time interval is approximately 10 ms or less.
The circuit further determines distance to the object in the field of view, if any, from the imager. The circuit determines either absolute distance to the object or relative changes in distance to the object in the field of view, if any, from the imager.
In the illustrated embodiment, the illuminator has a substantially single or narrow frequency band. The imager is a camera and further comprises a bandpass filter interposed between the camera and field of view. The filter is centered on the single or narrow frequency band of illumination of the illuminator. The illuminator is modulated and the imager is locked to the modulation to receive reflected signals at the modulation.
In one embodiment, the circuit comprises a computer with a memory. The computer executes several software modules. A driver module activates the illuminator and the imager synchronously with each other to capture the first and second images. An image acquisition module transfers the first and second images from the imager to the circuit. An object detection module detects reflective images in the subtraction image. A ranging module computes the distance to the object.
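As an illustration of what an object detection module of this kind might do, the sketch below finds bright regions in a subtraction image by thresholding and 4-connected region growing, and reports an intensity-weighted centroid for each region. This particular technique, the function name, and the `threshold` parameter are assumptions made for illustration; the text does not specify the detection algorithm.

```python
from collections import deque
from typing import Dict, List

Image = List[List[int]]


def detect_blobs(image: Image, threshold: int) -> List[Dict[str, float]]:
    """Find 4-connected groups of pixels brighter than `threshold`.

    Returns one dict per group with its intensity-weighted centroid
    ("row", "col") and its pixel count ("area"). Assumes threshold >= 0.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first search over the neighbours of this bright pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            total = sum(image[y][x] for y, x in pixels)
            blobs.append({
                "row": sum(y * image[y][x] for y, x in pixels) / total,
                "col": sum(x * image[y][x] for y, x in pixels) / total,
                "area": len(pixels),
            })
    return blobs


if __name__ == "__main__":
    subtraction_image = [[0, 0, 0, 0, 0],
                         [0, 9, 9, 0, 0],
                         [0, 0, 0, 0, 7],
                         [0, 0, 0, 0, 7]]
    for blob in detect_blobs(subtraction_image, threshold=5):
        print(blob)
```

The centroids produced this way are the kind of image coordinates a ranging module would consume.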
In one embodiment there is a single imager that is coupled to the circuit, while in a second embodiment there are two imagers coupled to the circuit. The two imagers are separated from each other by a fixed predetermined distance.
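For the two-imager arrangement, one common way to triangulate range is from the horizontal offset (disparity) of the same target between the two images. The sketch below assumes an idealized pinhole model with two identical, parallel imagers; the focal length in pixels, the principal-point column, and the function names are hypothetical quantities introduced for illustration and are not specified in the text.

```python
import math


def stereo_range_m(focal_px: float, baseline_m: float,
                   col_left_px: float, col_right_px: float) -> float:
    """Range along the optical axis from stereo disparity.

    focal_px:     focal length expressed in pixels (assumed known)
    baseline_m:   fixed separation between the two imagers
    col_*_px:     column of the same target in the left and right images
    """
    disparity_px = col_left_px - col_right_px
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a target in front of the imagers")
    return focal_px * baseline_m / disparity_px


def azimuth_rad(focal_px: float, col_px: float, center_col_px: float) -> float:
    """Azimuth of a target relative to the optical axis of one imager."""
    return math.atan2(col_px - center_col_px, focal_px)


if __name__ == "__main__":
    # Hypothetical numbers: 800-pixel focal length, 0.5 m baseline.
    z = stereo_range_m(focal_px=800.0, baseline_m=0.5,
                       col_left_px=330.0, col_right_px=320.0)
    az = azimuth_rad(focal_px=800.0, col_px=330.0, center_col_px=320.0)
    print(f"range: {z:.1f} m, azimuth: {math.degrees(az):.2f} degrees")
```

Because range is inversely proportional to disparity, a larger baseline between the imagers yields finer range resolution at long distances, at the cost of a wider installation.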
The reflected signals indicative of the object are reflected signals from a retroreflective surface, such as taillight reflectors, on the object.
In one embodiment the circuit further comprises a sequence control circuit coupled to the imager for producing a stream of pixels from the imager corresponding to the first and second image. A subtraction circuit is coupled to the sequence control circuit for subtracting the second image from the first image on a pixel-by-pixel basis. An analog-to-digital converter is coupled to the subtraction circuit to generate a digitized subtraction image on a pixel-by-pixel basis. A processor is coupled to the analog-to-digital converter for generating ranging parameters.
Alternatively, the circuit comprises a sequence control circuit, an analog-to-digital converter, a field programmable gate array coupled to the analog-to-digital converter to generate a digitized subtraction image on a pixel-by-pixel basis, and a processor coupled to the field programmable gate array for generating ranging parameters.
Still further the circuit comprises a sequence control circuit, an analog-to-digital converter, an application-specific integrated circuit coupled to the analog-to-digital converter to generate a digitized subtraction image on a pixel-by-pixel basis, and a processor coupled to the application-specific integrated circuit for generating ranging parameters.
The invention is also described as a method for performing ranging as described in connection with the apparatus above. For example, the invention is a method for ranging comprising the steps of periodically or aperiodically illuminating a field of view with an illumination signal, which field of view potentially includes an object. Reflected signals are received from the field of view synchronously with the illumination and the absence of illumination of the field of view. A first image of the reflected signals is captured within an image array when the field of view is illuminated. A second image of the reflected signals is captured within the array when the field of view is not illuminated, while the first image is still held within the array. A subtraction image of the field of view is generated, which is the pixel difference between the first and second images captured in the array.
The invention can be better visualized by turning to the following drawings, which depict illustrated embodiments of the invention. The invention is expressly not to be understood as necessarily limited by the illustrated embodiments which are depicted.