The present inventor has developed a visual device as a device that controls a moving camera and carries out image processing (for example, refer to Published Unexamined Japanese Patent Application No. 2001-43385, Published Unexamined Japanese Patent Application No. 2001-101403, Published Unexamined Japanese Patent Application No. 2001-148006, Published Unexamined Japanese Patent Application No. 2001-148022, Published Unexamined Japanese Patent Application No. 2001-148024 and PCT Publication Number WO 00/16259). The visual device searches for an object and carries out image processing of the object by controlling a pan, tilt and zoom mechanism. The image processing that the visual device carries out is mostly local processing, and the local processing can be carried out in parallel by array operation units arranged in the shape of a lattice. The visual device, however, has four main problems. First, a figure/ground separation means needs huge computational complexity in order for nonlinear oscillators to separate at least one object area from a background area. Second, a pattern matching means needs many template images in order to correctly recognize a normalized image in which colors and patterns of the object are mixed. Third, a geometrical analysis means must carry out global processing such as the Fourier transform, the affine transform and the Hough transform in order to detect the rough form of the object in an image. Fourth, an area normalization means needs a processor comprising a divider for carrying out division by a natural number, or a look-up table for storing the reciprocal of the natural number, in order to complement between pixels of a normalized image. Therefore, these means have seriously hindered the manufacture of a high-performance image sensor comprising the visual device.
First, in a past figure/ground separation means, each nonlinear oscillator used, as its external noise, either a random number or the formed edge information constituting a pixel of an inputted formed edge-information image. Thus, there is no correlation between the external noise and the nonlinear oscillators within an object area or a background area segmented by the formed edge-information image. As a result, each nonlinear oscillator cannot shift its phase away from the phases of its neighboring nonlinear oscillators until the phases of these nonlinear oscillators reach a suitable combination. This is one factor that increases the computational complexity of the figure/ground separation means. By using a position/size detection means, however, the visual device can generate a redundant-information image representing the approximate position of the center of gravity and the size of an object area segmented by the formed edge-information image. Since each nonlinear oscillator inputs, as the random number, the redundant information constituting the corresponding pixel in the redundant-information image, the nonlinear oscillators within the object area shift their phases in order, from the approximate position of the center of gravity toward its circumference.
Considering these facts, a figure/ground separation means can separate an object area and a background area more quickly than the past one, because a position/size detection means detects the approximate position of the center of gravity and the size of the object area segmented by a formed edge-information image.
Next, in a past visual device, a pattern matching means needs a great number of template images in order to carry out pattern matching of a normalized image, in which a segmented object area in a digital image corresponding to an object area is normalized. The reasons are that the normalized image is generally a multi-band image with noise, and that the pattern matching means compared the normalized image with many template images without distinguishing between the colors and the patterns of the object represented by the normalized image. In short, at least as many template images as there are combinations of the colors and the patterns of the object are needed. Consider here carrying out pattern matching separately for the colors and for the patterns of the object represented by the normalized image. First, suppose that the color of the object is the color represented by most of the pixels in the normalized image. The pattern matching means can then detect the color of the object represented by the normalized image by preparing only as many template images as there are colors to detect, where each template image is filled with a different one of the colors to detect. In addition, even if the position and size of the object in the digital image change, the pattern matching means can detect the color of the object by comparing the color of each pixel within the segmented object area with the colors of the template images. Therefore, the segmented object area does not need to be normalized. On the other hand, as concerns the pattern of the object, suppose that an edge-information formation means first generates a formed edge-information image from the segmented object area, instead of the normalized image corresponding to the segmented object area, and that the pattern matching means then uses the normalized image corresponding to an image generated by a geometrical analysis means from the formed edge-information image.
Since at least one pixel in the normalized image denotes the form and size representing a part of the pattern of the object at its circumference, the pattern matching means can easily select the template image most similar to the normalized image, regardless of the position and size of the object area.
Considering these facts, a plurality of pattern matching means for colors and patterns can each greatly reduce the number of template images, because a visual device processes the color and the pattern of an object represented by an object area individually, using the plurality of pattern matching means.
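The color case above can be illustrated with a minimal sketch. It assumes, for illustration only, that each pixel of the segmented object area votes for the nearest solid-color template and that the majority color is taken as the color of the object. The names `TEMPLATES`, `nearest_template`, `detect_object_color` and `template_counts`, the RGB values and the squared-distance comparison are hypothetical and do not come from the cited applications. The last function assumes that separate matching needs one template per color plus one per pattern.

```python
from collections import Counter

# Hypothetical solid-color templates: one RGB value per color to detect.
TEMPLATES = {"red": (255, 0, 0), "green": (0, 255, 0), "blue": (0, 0, 255)}

def nearest_template(pixel, templates):
    """Name of the template whose color is closest to the pixel (squared RGB distance)."""
    return min(
        templates,
        key=lambda name: sum((p - t) ** 2 for p, t in zip(pixel, templates[name])),
    )

def detect_object_color(area_pixels, templates):
    """Let every pixel of the segmented object area vote; the majority color wins."""
    votes = Counter(nearest_template(pixel, templates) for pixel in area_pixels)
    return votes.most_common(1)[0][0]

def template_counts(num_colors, num_patterns):
    """Templates needed when matching jointly versus separately (assumed model)."""
    return num_colors * num_patterns, num_colors + num_patterns

# Four noisy pixels of a segmented object area; no normalization is applied.
area = [(250, 10, 5), (240, 20, 30), (10, 200, 20), (255, 0, 0)]
print(detect_object_color(area, TEMPLATES))  # red
print(template_counts(3, 5))                 # (15, 8)
```

Because the vote is taken per pixel, the result does not depend on where the object area lies in the image or on how many pixels it contains, which is why normalization is unnecessary for the color.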
Next, when the number of template images increased, a past geometrical analysis means used a suitable combination of the following: extracting only the contour from edge information of an object in an animation image by using the Fourier transform, normalizing the size of the contour of the object in the animation image by using the affine transform, and specifying the form of the object in the animation image by using the Hough transform. However, since these transform methods not only process an image exactly but also carry out global processing, they are not suitable for implementing a visual device in hardware. Note here that the geometrical analysis means only has to detect the approximate form of the object. If the geometrical analysis means can derive position and inclination from some parts of the fragmented contour of the object, and if it can collect the inclination at the center of gravity of the contour of the object, the geometrical analysis means can detect the approximate form of the object.
Considering these facts, a geometrical analysis means can detect the position and form of an object in a manner suitable for image processing, because a means for detecting inclination calculates the length and inclination angle of some line segments representing the contour of the object from a formed edge-information image, and then calculates the transfer distance of the line segments while moving the length and the inclination angle of the line segments, for every inclination angle, toward the center of gravity of the contour of the object. In addition, the contour line of the object has already been divided into some independent line segments. Therefore, if the pixels in the line segments are moved independently while satisfying an appointed limiting condition between them and their neighbors, the geometrical analysis means can detect the position and the form of the object with little hardware complexity and computational complexity.
Finally, an area normalization means first moves each pixel within a segmented object area in a digital image corresponding to an object area over the whole of the digital image so that the distances between the pixels are approximately equal to each other, and then generates a normalized image whose size is equal to the size of the digital image by complementing the pixels between these pixels with the average of the pixels at their neighbors. In order to complement between the pixels, therefore, the area normalization means must carry out division by a natural number or multiplication by the reciprocal of the natural number. The area normalization means complements in this way for two reasons. The first reason is that, in a case where some segmented object areas whose sizes and positions differ from each other denote the same object, a pattern matching means that is the destination of the normalized image must have many template images of the same object unless the pattern of the segmented object area is restructured from the pixels of the segmented object area that were once resolved. The second reason is that, because the digital image is generally a multi-band image with noise, complementing between the pixels of the segmented object area that were once resolved increases the similarity between the normalized image and a template image representing the same kind of object as the one in the normalized image. As described above, however, in a case where an edge-information formation means first generates a formed edge-information image from the segmented object area, instead of the normalized image corresponding to the segmented object area, and the pattern matching means then uses a normalized image corresponding to an image generated by a geometrical analysis means from the formed edge-information image, at least one pixel in the normalized image denotes the form and size representing a part of the pattern of the object at its circumference.
Therefore, even if the area normalization means does not complement, the pattern matching means can select, from among some template images, the template image most similar to the normalized image.
Considering these facts, a pattern matching means can select the pattern of an object represented by an object area even if an area normalization means does not complement.
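To make concrete why complementing needs either a divider or a reciprocal look-up table, the following minimal sketch fills each empty pixel with the average of its non-empty 8-neighbors and replaces the division by a fixed-point multiplication. The 16-bit fixed-point format, the rounded-up reciprocals, the use of `None` for an empty pixel and all names are assumptions made for illustration; they are not taken from the cited applications.

```python
SHIFT = 16  # assumed fixed-point precision of the reciprocal table

# RECIPROCAL[n] holds ceil(2**SHIFT / n) for n = 1..8 (at most 8 neighbors).
RECIPROCAL = [0] + [((1 << SHIFT) + n - 1) // n for n in range(1, 9)]

def divide_by_table(total, count):
    """Approximate total // count with one multiplication and one shift."""
    return (total * RECIPROCAL[count]) >> SHIFT

def complement(image):
    """Fill each empty pixel (None) with the average of its non-empty 8-neighbors."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] is not None:
                continue
            neighbors = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
                if image[j][i] is not None
            ]
            if neighbors:
                result[y][x] = divide_by_table(sum(neighbors), len(neighbors))
    return result

print(complement([[10, None, 30]]))  # [[10, 20, 30]]
```

For 8-bit pixel values and at most eight neighbors, the table result equals integer division exactly, but the table and the multiplier are still extra hardware in every processor, which is the cost that skipping the complement avoids.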
Now, this visual device searches for an object and carries out image processing of the object by controlling a pan, tilt and zoom mechanism in a moving camera. The image processing carried out by the visual device is mostly local processing, and the local processing can be carried out in parallel by array operation units arranged in the shape of a two-dimensional lattice. In a case where the array operation units are implemented on an LSI, each array operation unit is designed so that it can communicate with its adjoining array operation units asynchronously by using signals named SEND and RECEIVE. Since the wiring pattern therefore becomes extremely simple and the wire length becomes short, the LSI can reduce its power consumption while increasing the implementation area for transistors. In addition, the array operation units do not always all have to synchronize with each other.
By the way, a past array operation unit has four problems. First, in a controller that sent a SEND, the time from sending the SEND to receiving a RECEIVE became long. The cause is that a controller that received the SEND does not reply with the RECEIVE until it has inputted a calculation datum, its type, its transmission times in a horizontal direction and its transmission times in a vertical direction. In the past array operation unit, therefore, a processor must wait until the upper, lower, left and right RECEIVE STATUS signals in the controller that sent the SEND have certainly been updated. In this way, however, even though the controller communicates asynchronously, the processor wastes time. Second, it is difficult to distinguish a calculation datum that has not yet been transmitted from one that has already been transmitted, because the order of transmitting the calculation data is irregular. The cause is that all array operation units work independently. In the past array operation unit, therefore, a memory stored each received calculation datum to be transmitted together with a SEND FLAG, and the processor, always checking all of the SEND FLAGs stored in the memory, updated the SEND FLAG related to a calculation datum after transmitting that calculation datum. In this way, however, the processor must repeatedly check the SEND FLAGs of calculation data that have already been transmitted. Third, in a case where a calculation datum is transmitted in three directions simultaneously, a processor does not always succeed in writing the calculation datum to its controller. The cause is that the controller can send only one calculation datum at a time to the array operation units at its four neighbors. In the past array operation unit, therefore, the more array operation units are designated by SEND FLAGs, the longer the processor must wait until it can write the next calculation datum to the controller.
Fourth, in a case where a calculation datum is transmitted in three directions simultaneously, it is difficult for an array operation unit that received the calculation datum to distinguish between two array operation units designated by the transmission times in a horizontal direction and the transmission times in a vertical direction of the calculation datum, where the transmission times in each direction designating the two array operation units are equal to each other. The cause is that a controller communicates the transmission times in the horizontal direction and the transmission times in the vertical direction only as non-negative integers. In the past array operation unit, therefore, a priority was assigned to the two array operation units that are senders of calculation data, and the array operation unit always transmitted the calculation data in order, starting from the calculation datum of the array operation unit whose priority is high. In this way, however, transmission efficiency is poor, because the calculation datum of the array operation unit whose priority is low is not transmitted until the calculation datum of the array operation unit whose priority is high has been inputted. The most effective method of solving these problems is to design a high-performance controller. For example, in order to solve the first problem, the frequency of the clock signal of the controller should be higher than the frequency of the clock signal of the processor. In order to solve the second problem, the controller should comprise an electronic circuit such as a FIFO (First In First Out). In order to solve the third problem, the controller should be able to send some calculation data to the array operation units at its four neighbors simultaneously. In order to solve the fourth problem, the controller should have two additional circuits, each representing one bit, for the transmission times in the horizontal direction and the transmission times in the vertical direction, respectively.
If a designer actually tries to design such an array operation unit, however, the hardware complexity of the array operation unit becomes huge.
Thus, in order to solve the first problem, the controller should memorize the SEND and reply with the RECEIVE immediately upon receiving the SEND, and input the calculation datum, the type, the transmission times in the horizontal direction and the transmission times in the vertical direction afterward. In order to solve the second problem, a substitute for the FIFO should be implemented in the memory and the processor. In order to solve the third and fourth problems, the calculation datum should be transmitted in at most two directions simultaneously.
Considering these facts, an array operation unit whose transmission efficiency is high can be designed by implementing stacks and cyclic buffers in the memory and the processor, and by transmitting the calculation datum counter-clockwise and clockwise.
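As an illustration of how a cyclic buffer kept in ordinary memory can stand in for a hardware FIFO, the following minimal sketch stores calculation data in a fixed array and lets the processor track only a head index and a count. The class name `CyclicBuffer`, its methods and the choice to reject a write when the buffer is full are assumptions for illustration, not the design of the cited applications.

```python
class CyclicBuffer:
    """Fixed-size first-in first-out queue built from a plain array and two integers."""

    def __init__(self, capacity):
        self.slots = [None] * capacity  # storage that would live in the memory
        self.head = 0                   # index of the oldest stored datum
        self.count = 0                  # number of data currently stored

    def push(self, datum):
        """Store a received calculation datum; return False if the buffer is full."""
        if self.count == len(self.slots):
            return False
        tail = (self.head + self.count) % len(self.slots)
        self.slots[tail] = datum
        self.count += 1
        return True

    def pop(self):
        """Return the oldest datum still to be transmitted, or None if empty."""
        if self.count == 0:
            return None
        datum = self.slots[self.head]
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1
        return datum

buffer = CyclicBuffer(3)
for datum in (1, 2, 3):
    buffer.push(datum)
print(buffer.pop(), buffer.pop())  # 1 2
```

Since a transmitted datum is simply skipped by advancing the head index, the processor never has to re-check flags of data that were already transmitted, which is the inefficiency described for the past array operation unit.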
Now, fast LSIs with many transistors have recently been developed thanks to rapid progress in LSI technology. As concerns the degree of integration of LSIs, not only technology for refining the design rule but also three-dimensional LSI technology (e.g., refer to Published Unexamined Japanese Patent Application No. S63-174356, Published Unexamined Japanese Patent Application No. H2-35425, Published Unexamined Japanese Patent Application No. H7-135293), especially technology for bonding wafers together (e.g., refer to Koyanagi, M., Kurino, H., Lee, K-W., Sakuma, K., Miyakawa, N., Itani, H., “Future System-on-Silicon LSI Chips”, IEEE MICRO, Vol. 18, No. 4, pp. 17-22, 1998), has been developed. Moreover, many technologies for stacking chips (e.g., refer to Nikkei Microdevices, June 2000, pp. 62-79, Nikkei Microdevices, June 2000, pp. 157-164, Nikkei Microdevices, June 2000, p. 176) have recently been developed. In short, since LSIs have more and more transistors, digital circuits that were implemented in separate LSIs in the past can now easily be implemented in a single LSI. On the other hand, as concerns the processing speed of the LSI, the higher the frequency of a clock signal becomes, the more serious the problems of clock skew and signal propagation delay time become.
In order to solve these problems, many PLLs (Phase Locked Loops) have been used in the LSI. Note that these PLLs input a reference signal whose phase is fixed. In addition, each PLL compares the phase of the reference signal with the phase of the comparison signal it generates, and changes the phase of its comparison signal so that the difference becomes zero radians. In a case where there are many PLLs in the LSI, however, it is impossible to make the phases of all PLLs coincide because of the propagation delay time of the reference signal. In addition, two PLLs cannot exchange their comparison signals with each other. The reason is that neither PLL can generate a comparison signal whose phase is fixed, because of the propagation delay time of these comparison signals. That is, if the phase difference of the comparison signal in one PLL becomes zero radians, the phase difference of the comparison signal in the other PLL becomes twice the propagation delay time. Therefore, both PLLs generate a large jitter in their comparison signals. Of course, a clock signal generated by such a PLL then has a fatal jitter.
Suppose here that the aim is for each array operation unit to communicate with its adjoining array operation units asynchronously. In this case, all array operation units should input not a clock signal whose phase is fixed but a clock signal whose period is fixed. Therefore, it is enough for a visual device to comprise counters such that all of their count numbers coincide within an appointed time, where the counters each comprise an independent oscillator circuit and communicate their count numbers with each other. In addition, suppose that each counter adjusts the phase of its oscillator circuit according to the count numbers of all adjoining counters. As a result, the time for which all of the count numbers coincide becomes long.
Considering these facts, a counter can always make its count number coincide with the others and supply the whole of an LSI with a high-frequency clock signal if the counter has a mechanism for individually memorizing all signals inputted from external parts, and if an oscillator circuit has a mechanism for synchronizing with a signal generated by the counter.
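The idea of counters that memorize each incoming signal individually can be sketched with a small discrete-time simulation. This is only a simplified model under stated assumptions: every counter is connected to every other counter, each counter emits a one-tick signal when it reaches an end value, each counter holds at the end value until it has latched a signal from every counter, and the phase adjustment of the oscillator circuits is not modeled. The function `simulate` and its parameters are hypothetical names.

```python
def simulate(periods, end_value, ticks):
    """Run counters driven by independent oscillators; return restarts per counter."""
    n = len(periods)
    count = [0] * n
    # latched[i][j] is True once counter i has memorized the signal of counter j.
    latched = [[False] * n for _ in range(n)]
    restarts = [0] * n
    for t in range(1, ticks + 1):
        pulses = []
        for i in range(n):
            # Each oscillator ticks at its own period; a counter holds at end_value.
            if count[i] < end_value and t % periods[i] == 0:
                count[i] += 1
                if count[i] == end_value:
                    pulses.append(i)  # signal lasts for this tick only
        for j in pulses:
            for i in range(n):
                latched[i][j] = True  # every counter memorizes the signal
        for i in range(n):
            # Restart only after the signals of all counters have been memorized.
            if count[i] == end_value and all(latched[i]):
                count[i] = 0
                restarts[i] += 1
                latched[i] = [False] * n
    return restarts

print(simulate([2, 3, 5], end_value=4, ticks=100))  # [5, 5, 5]
```

In this model the fast counters wait at the end value for the slowest one, so all counters restart in the same tick even though their oscillators run at different rates. Without the latches, a one-tick signal from a fast counter would be lost before the slow counter arrived.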
Now, many image sensors have been developed using CCD (Charge Coupled Device) and CMOS (Complementary Metal Oxide Semiconductor) technology. Since many of these image sensors are used to generate a video signal, they are row-parallel image sensors. In addition, some image sensors stacking photo-receptor elements, charge amplifiers, A/D converters and digital circuits have been developed using three-dimensional LSI (Large Scale Integrated Circuit) technology (e.g., refer to Published Unexamined Japanese Patent Application No. S63-174356, Published Unexamined Japanese Patent Application No. H2-35425, Published Unexamined Japanese Patent Application No. H7-135293). Many of these image sensors are pixel-parallel image sensors that use vertical signal lines effectively, where a photo-receptor element, a charge amplifier, an A/D converter and a digital circuit are arranged vertically. In particular, technology for bonding wafers together (e.g., refer to Published Unexamined Japanese Patent Application No. H5-160340, Published Unexamined Japanese Patent Application No. H6-268154, Koyanagi, M., Kurino, H., Lee, K-W., Sakuma, K., Miyakawa, N., Itani, H., “Future System-on-Silicon LSI Chips”, IEEE MICRO, Vol. 18, No. 4, pp. 17-22, 1998) has recently been developed. Therefore, after a manufacturer of the image sensors individually makes an LSI implementing photo-receptor elements, an LSI implementing charge amplifiers, an LSI implementing A/D converters and an LSI implementing digital circuits, the manufacturer can stack these LSIs so that one of the photo-receptor elements, one of the charge amplifiers, one of the A/D converters and one of the digital circuits are arranged vertically. Thus, since even LSIs manufactured in different processes can easily be stacked after they have been checked, the yield of the LSIs increases. Moreover, since technology for stacking many chips (e.g., refer to Nikkei Microdevices, June 2000, pp. 62-79, Nikkei Microdevices, June 2000, pp. 157-164, Nikkei Microdevices, June 2000, p. 176) has recently been developed, the manufacturer of the image sensors has become able to make a high-performance image sensor easily.
By the way, there is a problem that, while three-dimensional LSI technology can increase the number of transistors, it is difficult for it to increase the number of vertical signal lines. The reason is that the line width of the vertical signal lines is much wider than the line width of signal lines on the implementation surface of an LSI. Moreover, transistors cannot be arranged where the vertical signal lines are arranged. Therefore, even if a designer of an image sensor uses three-dimensional LSI technology, the transistors in a specific circuit must ultimately be implemented on a specific LSI. In short, the designer of the image sensor cannot easily increase the number of pixels of the image sensor.
On the other hand, the present inventor has developed a visual device as a device that controls a moving camera and carries out image processing (e.g., refer to PCT Publication Number WO 00/16259). The visual device searches for an object and carries out image processing of the object by controlling a pan, tilt and zoom mechanism. The image processing carried out by the visual device is mostly local processing, and the local processing can be carried out in parallel by array operation units arranged in the shape of a two-dimensional lattice. In a case where the visual device is embedded in an image sensor, each of the array operation units carries out local processing using pixel data generated from some photo-receptor elements. Therefore, for some applications of the image sensor, it is better for the image sensor to adopt a type in which a digital circuit inputs some adjoining pixel data, rather than the pixel-parallel type. In this case, furthermore, only one A/D converter is needed for a plurality of photo-receptor elements. Therefore, even if the number of pixels of the image sensor increases, a designer of the image sensor does not always have to increase the number of A/D converters and digital circuits. Of course, since all of the A/D converters and all of the digital circuits can work in parallel, the performance of the image sensor hardly drops.
Considering these facts, an image sensor whose resolution and performance are high can be manufactured, because some sensor modules are arranged in the shape of a two-dimensional lattice in the image sensor, some photo-receptor elements are arranged in the shape of a two-dimensional lattice in each of the sensor modules, and each of the sensor modules generates pixel signals from its photo-receptor elements in order.
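The readout order implied by this arrangement can be sketched as follows. The sketch assumes, for illustration, that the image divides evenly into rectangular sensor modules, that every module reads its own photo-receptor elements in raster order, and that all modules advance in lockstep. The function name `scan_modules` and the dictionary output format are hypothetical.

```python
def scan_modules(image, module_rows, module_cols):
    """Return one dict per readout step: module position -> pixel value output."""
    rows, cols = len(image), len(image[0])
    if rows % module_rows or cols % module_cols:
        raise ValueError("image must divide evenly into sensor modules")
    steps = []
    # Step k selects the k-th photo-receptor element inside every module.
    for k in range(module_rows * module_cols):
        dy, dx = divmod(k, module_cols)
        steps.append({
            (my, mx): image[my * module_rows + dy][mx * module_cols + dx]
            for my in range(rows // module_rows)
            for mx in range(cols // module_cols)
        })
    return steps

# A 4x4 image whose pixel value is 4*y + x, split into four 2x2 sensor modules.
image = [[4 * y + x for x in range(4)] for y in range(4)]
steps = scan_modules(image, 2, 2)
print(len(steps))  # 4
print(steps[0])    # {(0, 0): 0, (0, 1): 2, (1, 0): 8, (1, 1): 10}
```

The number of readout steps depends only on the number of photo-receptor elements per module, not on the number of modules, so under this model adding modules raises the pixel count without lengthening the readout.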
Now, the specification of a past image sensor had to be decided at design time. Of course, some electronic circuits can be changed after the image sensor is manufactured by using an FPGA (Field Programmable Gate Array) or a CPLD (Complex Programmable Logic Device). However, the image sensor then needs electronic circuits for the FPGA or the CPLD, a set of large memories and many signal lines from external parts. On the other hand, when each of the sensor modules comprises many photo-receptor elements in the above image sensor, the implementation area of each of the digital circuits also increases in proportion to the number of photo-receptor elements. Therefore, each of the digital circuits can comprise a processor and a set of large memories. Since the memories can store all pixel signals generated by the sensor module, the processor can refer to an enormous number of pixel patterns consisting of all the pixel signals. Suppose, then, that combinations of a memory datum, a memory address and a write clock signal are assigned to these patterns. The processor can write a suitable memory datum at any memory address according to the write clock signal. In addition, if at least one part of the set of memories is non-volatile, that part of the memories can keep storing the memory datum. Therefore, the processor can change even a program stored in that part of the memories. Thus, after a manufacturer of image sensors has made an image sensor, the manufacturer can change a program if desired. Moreover, the manufacturer can omit signal lines for supplying all sets of memories with the program.
Considering these facts, an image sensor can change a program in all sets of memories simultaneously by applying light with a specific pattern to all photo-receptor elements in the image sensor.
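One way to picture the assignment of a memory datum, a memory address and a write clock signal to pixel patterns is the following minimal sketch. The bit layout is purely an assumption for illustration: the first pixel of a module carries the write clock, the next pixels carry the address, and the remaining pixels carry the datum, each pixel being thresholded to one bit. The names `decode_pattern` and `apply_patterns`, the threshold and the rising-edge rule are hypothetical.

```python
def decode_pattern(bits, addr_bits, data_bits):
    """Split one thresholded pixel pattern into (write clock, address, datum)."""
    write = bits[0]
    address = int("".join(str(b) for b in bits[1:1 + addr_bits]), 2)
    datum = int("".join(str(b) for b in bits[1 + addr_bits:1 + addr_bits + data_bits]), 2)
    return write, address, datum

def apply_patterns(frames, memory, addr_bits, data_bits, threshold=128):
    """Write a datum whenever the write-clock pixel rises from dark to bright."""
    previous_write = 0
    for frame in frames:
        bits = [1 if value >= threshold else 0 for value in frame]
        write, address, datum = decode_pattern(bits, addr_bits, data_bits)
        if write == 1 and previous_write == 0:
            memory[address] = datum
        previous_write = write
    return memory

# Two light patterns seen by one sensor module: set up the values, then raise the clock.
frames = [
    [0,   0, 255, 255, 0, 255, 0],
    [255, 0, 255, 255, 0, 255, 0],
]
print(apply_patterns(frames, [0, 0, 0, 0], addr_bits=2, data_bits=4))  # [0, 10, 0, 0]
```

If the same light pattern falls on every sensor module, every processor decodes the same write, which matches the statement that the program in all sets of memories can be changed simultaneously without dedicated programming lines.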
In the present invention described in the claims, a visual device analyzes the geometry of an object in a digital image by repeating local processing for each pixel of the digital image, while it separates an object area and a background area quickly by using a formed edge-information image. In addition, in the present invention described in the claims, an array operation unit and a virtual array operation unit transmit a calculation datum efficiently, by designing a controller that can reply with a RECEIVE immediately after receiving a SEND, and by transmitting the calculation datum counter-clockwise and clockwise. In addition, the present invention described in the claims realizes an interlocked counter that can always adjust its count number according to the interlocking signals outputted by other interlocked counters, even if some of the interlocked counters do not communicate their interlocking signals with others. Finally, in the present invention described in the claims, a fast, high-resolution image sensor is manufactured by outputting pixel signals from each of the sensor modules comprising photo-receptor elements arranged in the shape of a two-dimensional lattice.