1. Field of the Invention
The present invention relates to an information processing apparatus and an information processing method configured to execute a plurality of stages of information processing.
2. Description of the Related Art
A conventional method, in a digital camera and a printer, detects a specific object such as a person and a person's face included in an input image and executes processing appropriate to the detected object. As an example of processing for detecting a specific object, a conventional method executes face detection processing to execute skin color correction processing on image data of a person's face.
Various methods for executing face detection processing have been discussed. P. Viola and M. Jones, “Robust Real-time Object Detection”, Second International Workshop on Statistical and Computational Theories of Vision, Jul. 13, 2001 discusses one such method. Hereinbelow, the method discussed in the above literature will be simply referred to as the “Viola & Jones method”. In addition, other conventional methods for detecting a face of a person by utilizing symmetric characteristics of a person's face, template matching, or a neural network have been discussed.
Now, an outline of the “Viola & Jones method” will be described below. FIGS. 13A through 13D schematically illustrate the Viola & Jones method. In the Viola & Jones method, a plurality of stages of identification processing are executed according to a result of learning by the AdaBoost algorithm. As illustrated in a processing flow of FIG. 13A, the plurality of stages of identification processing is executed as cascade processing. More specifically, in the cascade processing illustrated in FIG. 13A, if it is determined, as a result of identification by specific identification processing, that subsequent identification processing is to be executed, the Viola & Jones method outputs “True”. On the other hand, if it is determined, as a result of identification by specific identification processing, that subsequent identification processing is not to be executed, the Viola & Jones method outputs “False”. If the result “False” is output, the identification processing ends.
FIG. 13D illustrates an example of a result of learning. Referring to FIG. 13D, a characteristic amount 0 (210) is a characteristic amount in which as a result of a comparison between a small rectangle drawn over image data of eyes of a person (a part of an image) and a small rectangle drawn over a portion below the eyes of the person (i.e., the rectangle drawn over the cheek and nose of the person), the small rectangle drawn over the eyes portion is displayed in a darker state than the state of display of the small rectangle drawn over the portion below the eyes of the person. Furthermore, a characteristic amount 1 (211) is a characteristic amount in which in the rectangle drawn over the eyes portion, the portion of the rectangle over each eye is displayed in a dark state and a portion of the rectangle drawn over the middle of the forehead (the portion of the person's face between the eyes) is displayed in a lighter state than the portion of the rectangle drawn over the eyes.
If the above-described result of learning (learned characteristic amount) is compared with input data 400 and if a result of the identification processing executed for all characteristic amounts is “True”, then the image is determined to be an image of a person's face.
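By way of illustration only, the cascade processing of FIG. 13A may be sketched in Python as follows. The function name run_cascade, the four-value brightness representation of a rectangular area, and the two toy identifiers are assumptions introduced for this sketch. They merely imitate the characteristic amounts 0 (210) and 1 (211) and are not the characteristic amounts actually learned in the Viola & Jones method.

```python
from typing import Callable, Sequence

# Hypothetical representation of a rectangular area: mean brightness of
# (left eye, portion between the eyes, right eye, portion below the eyes).
Area = Sequence[float]
Identifier = Callable[[Area], bool]


def eyes_darker_than_below(area: Area) -> bool:
    """Toy stand-in for characteristic amount 0 (210)."""
    left_eye, between, right_eye, below = area
    eyes_band = (left_eye + between + right_eye) / 3.0
    return eyes_band < below


def eyes_darker_than_between(area: Area) -> bool:
    """Toy stand-in for characteristic amount 1 (211)."""
    left_eye, between, right_eye, _ = area
    return left_eye < between and right_eye < between


def run_cascade(identifiers: Sequence[Identifier], area: Area) -> bool:
    """Return True only if every identification processing outputs True."""
    for identify in identifiers:
        if not identify(area):
            return False  # "False": subsequent identification processing is skipped
    return True  # every result was "True"


cascade = [eyes_darker_than_below, eyes_darker_than_between]
face_like = (40.0, 120.0, 45.0, 150.0)   # illustrative values only
non_face = (200.0, 120.0, 210.0, 100.0)  # illustrative values only
print(run_cascade(cascade, face_like))  # True
print(run_cascade(cascade, non_face))   # False
```

In this sketch, the non-face area is rejected by the first identifier, so the second identifier is never evaluated for that area.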
In addition, in the Viola & Jones method, the identification processing is divided into specific segments of processing (hereinafter each simply referred to as a “stage”) as illustrated in a flow of FIG. 13B. More specifically, the Viola & Jones method executes True/False identification in each stage to identify whether an image is an image of a person's face. Furthermore, in an early stage of identification processing, the Viola & Jones method utilizes only a simple characteristic, so that the rate of “false negative” (i.e., determining an image of a person's face to be a non-face image (overlooking)) is minimized while the rate of “false positive” (i.e., determining a non-face image to be an image of a person's face (detection error)) is allowed to be relatively high.
If only a simple characteristic is used, identification processing can be executed by performing a small number of operations. Accordingly, if identification processing is executed by using a processor, the identification processing can be executed at a high processing speed. Furthermore, in this case, as many rectangular areas as possible can be efficiently determined to be “False” (non-face image) at an early stage of the identification processing. Accordingly, face detection processing on the entire image can be completed within a short period of time.
Hereinbelow, the rate of appearance of an identification result “True” in a stage including cascade processing will be simply referred to as a “successful detection rate”. Now, the “successful detection rate” will be described in detail below with reference to FIG. 13A.
Referring to FIG. 13A, “S” denotes the total number of rectangular areas input to identification processing 106_0, which is the first processing in the identification processing illustrated in FIG. 13A. Only rectangular areas identified as “True” in the identification processing 106_0 are input to identification processing 106_1. Accordingly, the number of rectangular areas to be processed in the identification processing 106_1 is calculated by multiplying the number of rectangular areas processed in the identification processing 106_0 by a successful detection rate p[0] in the identification processing 106_0 (i.e., “S×p[0]”).
In addition, the number of rectangular areas to be processed in identification processing 106_2 is calculated by multiplying the number of rectangular areas to be processed in the identification processing 106_1 by a successful detection rate p[1] in the identification processing 106_1 (i.e., “S×p[0]×p[1]”). Therefore, by similar calculation, the number of rectangular areas to be processed in identification processing 106_N can be calculated by the following expression: (S×p[0]×p[1]× . . . ×p[N−2])×p[N−1].
In the following description, the terms in the above-described expression “p[0]×p[1]× . . . ×p[N−1]” will be simply referred to as a “cumulative successful detection rate P[N] in identification processing at a stage N”. In the identification processing 106_0, all data to be input is input. Therefore, P[0]=1 (i.e., data input in the identification processing 106_0 at the successful detection rate of 100%).
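As a minimal sketch of the above calculation, the cumulative successful detection rate P[N] and the number of rectangular areas reaching each stage may be computed in Python as follows. The function names and the example values (S=1000 and a successful detection rate of 0.5 at every stage) are assumptions chosen for illustration and are not measured values.

```python
from typing import List, Sequence


def cumulative_rates(p: Sequence[float]) -> List[float]:
    """Return [P[0], P[1], ..., P[N]] with P[0] = 1 and P[n] = p[0] x ... x p[n-1]."""
    rates = [1.0]  # P[0] = 1: all input data reaches identification processing 106_0
    for rate in p:
        rates.append(rates[-1] * rate)
    return rates


def areas_per_stage(total_areas: int, p: Sequence[float]) -> List[float]:
    """Number of rectangular areas to be processed at each stage: S x P[n]."""
    return [total_areas * rate for rate in cumulative_rates(p)]


# Illustrative values only (assumed).
p = [0.5, 0.5, 0.5]
print(cumulative_rates(p))       # [1.0, 0.5, 0.25, 0.125]
print(areas_per_stage(1000, p))  # [1000.0, 500.0, 250.0, 125.0]
```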
Now, a method for increasing processing speed in data processing will be described below. General methods for increasing processing speed include a method for increasing an operational frequency and a method for providing a first-in first-out (FIFO) memory and/or a random access memory (RAM) inside an information processing apparatus to prevent the transfer of data to be input and output from becoming rate-limiting (a bottleneck). In addition, a method for chronologically and spatially parallelizing processing has been widely used. Now, a method for chronologically and spatially parallelizing processing will be described below.
To begin with, chronologically paralleled processing (pipeline processing) will be described. In pipeline processing, processing stages are arranged in a cascaded chronological order. Furthermore, a dedicated identification device is provided for each such stage. Accordingly, in pipeline processing, the identification devices, one of which is provided for each processing stage, can operate in parallel at the same time. Therefore, by executing pipeline processing, the processing can be executed at a high speed. However, the longest of the processing times of all stages may become the bottleneck for the entire processing time. Accordingly, if the successful detection rate at all stages is 100% and the processing time for each stage is even, then the processing speed can be increased by a factor equal to the number of stages included in the processing. More specifically, if four stages are included in the processing, the processing speed can be increased fourfold.
Now, spatially paralleled processing will be described below. In order to further increase the processing speed of the above-described pipeline processing, a conventional method provides a plurality of pipelines. Accordingly, a plurality of pieces of input data can be simultaneously processed. In the spatially paralleled processing described above, if data to be input can be constantly supplied to each pipeline, then the processing speed can be increased by a factor equal to the number of pipelines. More specifically, if four pipelines are provided, then the processing can be executed at a fourfold processing speed.
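The relationship between the stage processing times and the attainable speed of pipeline processing may be sketched in Python as follows. This is a simplified steady-state model that assumes a successful detection rate of 100% at all stages and ignores the time needed to fill the pipeline. The function name pipeline_speedup and the example processing times are assumptions introduced for illustration.

```python
from typing import Sequence


def pipeline_speedup(stage_times: Sequence[float]) -> float:
    """Steady-state speed increase of pipeline processing over serial processing.

    Serial processing spends sum(stage_times) per piece of input data, whereas
    a filled pipeline completes one piece of data every max(stage_times).
    """
    serial_time = sum(stage_times)
    bottleneck_time = max(stage_times)  # the longest stage limits the pipeline
    return serial_time / bottleneck_time


# Four stages with even processing time: fourfold processing speed.
print(pipeline_speedup([1.0, 1.0, 1.0, 1.0]))  # 4.0
# One stage twice as long as the others: the longest stage becomes the bottleneck.
print(pipeline_speedup([1.0, 1.0, 2.0, 1.0]))  # 2.5
```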
Now, an example of a conventional method for executing processing illustrated in a processing flow of FIG. 13B by hardware will be described in detail below with reference to FIG. 13C. In the example illustrated in FIG. 13C, each stage is implemented as an identification device (hardware). The identification devices are connected with one another via data lines and control lines (“valid” lines) to implement the above-described pipeline processing. An identification device 1060 is hardware for the stage 0 illustrated in FIG. 13B.
In the following description, “data_in0[0]”, of input data 0 (“data_in0”) (i.e., a part of specific data to be input), is input to the identification device 1060. Furthermore, “data_in0[1]”, of input data 0 (“data_in0”), is input to an identification device 1061. In addition, a “valid” signal refers to a control signal for indicating whether data_in (data to be input) and data_out (data to be output) are valid. If a result of processing by the identification device 1060 is determined “True”, then the identification device 1060 outputs a signal “valid_out0[0]=1”. On the other hand, if a result of processing by the identification device 1060 is determined “False”, then the identification device 1060 outputs a signal “valid_out0[0]=0”.
If an asserted control signal (“valid_in0=1”) is detected, then a control device 1050 detects that valid data (“data_in0”) has been input. Then, the control device 1050 outputs the input data_in0 as data_in0[0] to the identification device 1060. In addition, the control device 1050 outputs a signal “valid_in0[0]=1”, which indicates that valid data has been input, to the identification device 1060.
Then, the identification device 1060 detects the signal “valid_in0[0]=1”. Furthermore, the identification device 1060 executes identification processing based on an input image (“data_in0[0]”). Then, the identification device 1060 outputs a result of the identification processing as a signal “valid_out0[0]”. If a result of the identification processing is “True”, then the identification device 1060 outputs the input data 0 to the subsequent identification device 1061 as data_out0[0]. In addition, the identification device 1060 outputs a signal “valid_out0[0]=1”, which indicates that valid data has been output. Accordingly, the identification device 1061 can detect valid input data and execute processing based on the valid input data.
As described above, the above-described conventional method executes transmission of the input data via the data line and executes control for determining whether valid data has been input (whether to process the input data) based on the signal on the valid line. If all results of identification processing by the identification devices 1060 through 1062 are “True”, then a control device 1053 outputs a control signal “valid_out0=1”. In this case, it is determined that an image of a person's face is included in the input image data (“data_in0”).
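The control by the data lines and the “valid” lines described above may be modeled functionally in Python as follows. This sketch is not cycle-accurate and does not describe the actual circuit of FIG. 13C. The function names, the use of a single numeric value as input data, and the three threshold identifiers standing in for the identification devices 1060 through 1062 are assumptions introduced for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

Identify = Callable[[float], bool]


def identification_device(
    identify: Identify, data_in: Optional[float], valid_in: int
) -> Tuple[Optional[float], int]:
    """Model one identification device: returns (data_out, valid_out)."""
    if valid_in == 1 and data_in is not None and identify(data_in):
        return data_in, 1  # "True": forward the data with valid_out = 1
    return None, 0         # "False" or no valid input: valid_out = 0


def run_pipeline(devices: Sequence[Identify], data_in0: float, valid_in0: int) -> int:
    """Chain the devices through data/valid lines and return the final valid_out0."""
    data: Optional[float] = data_in0
    valid = valid_in0
    for identify in devices:
        data, valid = identification_device(identify, data, valid)
    return valid


# Hypothetical identifiers standing in for devices 1060, 1061, and 1062.
devices = [lambda d: d > 10, lambda d: d > 20, lambda d: d > 30]
print(run_pipeline(devices, 35.0, 1))  # 1: every device output "True"
print(run_pipeline(devices, 25.0, 1))  # 0: the third device output "False"
print(run_pipeline(devices, 35.0, 0))  # 0: the input data was not valid
```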
In the identification processing, if the same identification devices are used, a plurality of pieces of identification target data can be processed by changing the characteristic amount learned for each identification target data according to the identification target data (e.g., face, human figure, car). By changing the characteristic amount, a plurality of pieces of identification target data can be processed without changing the circuit configuration.
In addition, an example of an information processing apparatus configured to change identification target data will be described in detail below with reference to a block diagram of FIG. 14.
Referring to FIG. 14, the information processing apparatus includes a central processing unit (CPU) 100, a read-only memory (ROM) 101, a dynamic random access memory (DRAM) control device 102, a DRAM 103, a control unit 105, and a processing unit 106. The ROM 101 includes a processing setting data storage unit 104. The control unit 105, which includes control units 1050 through 1053, controls input data and a control signal. In addition, the processing unit 106, which includes identification devices 1060 through 1062, executes identification of a “valid” signal.
Now, a method for performing setting of processing, which is executed in starting processing, will be described below. At the start of identification processing, the CPU 100 acquires setting data of a characteristic amount, from the processing setting data storage unit 104 of the ROM 101. In addition, the CPU 100 sets the acquired setting data on an identification device provided within the processing unit 106. Furthermore, the CPU 100 acquires setting data of positional information of image data (i.e., an address of the image data) from the processing setting data storage unit 104. Moreover, the CPU 100 sets the acquired setting data on the control unit 105.
After completing the setting of the control unit 105 and the processing unit 106, the CPU 100 notifies the control unit 105 and the processing unit 106 that the processing has been started. Then, the control unit 105 accesses the DRAM control device 102 based on the image data positional information (the address) set thereon. Accordingly, the control unit 105 serially reads data of rectangular areas from the image data stored in the DRAM 103. In addition, the control unit 105 transfers the read rectangular area image data to the processing unit 106. After receiving the rectangular area image data from the control unit 105, the processing unit 106 serially executes identification processing on the received rectangular area image data.
If identification target data is to be changed, the control unit 105 notifies the CPU 100 that identification target data is to be changed. After receiving the notification from the control unit 105, the CPU 100, similarly to the operation described above, acquires setting data of a characteristic amount corresponding to new identification target data from the processing setting data storage unit 104. In addition, the CPU 100 sets the acquired setting data on an identification device provided within the processing unit 106. Furthermore, the CPU 100 acquires setting data of positional information of image data corresponding to the new identification target data from the processing setting data storage unit 104. In addition, the CPU 100 sets the acquired setting data on the control unit 105. In the above-described manner, the identification target data can be changed.
However, in the cascade processing by the “Viola & Jones method”, as the processing advances to later stages, the amount of data to be processed may decrease compared with the amount of data to be processed in early stages. Accordingly, even if the processing is chronologically paralleled (i.e., if the pipeline processing is executed), the processing cannot be efficiently executed.
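The decrease in the amount of data reaching later stages may be illustrated by the following Python sketch, which estimates a relative operation rate for the identification device dedicated to each stage. The model assumes that the load of stage n is proportional to the cumulative successful detection rate P[n] multiplied by the processing time of that stage per rectangular area, and it ignores stalls and buffering. The function name operation_rates and the example values are assumptions introduced for illustration.

```python
import math
from typing import List, Sequence


def operation_rates(p: Sequence[float], stage_times: Sequence[float]) -> List[float]:
    """Relative operation rate of each stage, normalized to the busiest stage.

    p[n] is the successful detection rate of stage n. Stage n receives a
    fraction P[n] = p[0] x ... x p[n-1] of all input (P[0] = 1), so p of the
    last stage does not affect the result.
    """
    loads = [math.prod(p[:n]) * stage_times[n] for n in range(len(stage_times))]
    busiest = max(loads)
    return [load / busiest for load in loads]


# Three stages, even processing time, successful detection rate 0.5 (assumed).
print(operation_rates([0.5, 0.5, 0.5], [1.0, 1.0, 1.0]))  # [1.0, 0.5, 0.25]
```

Under these assumptions, the identification device for the last stage is idle for three quarters of the time even though the pipeline as a whole is running.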
Japanese Patent Application Laid-Open No. 2003-256221 discusses a conventional method for improving the operation rate of a processor in parallel processing. In the method discussed in Japanese Patent Application Laid-Open No. 2003-256221, a process generated by using a parallelizing program is assigned to each time period of processing by each of a plurality of processors according to a predetermined length of time, which is determined for each parallelizing program according to a processor assignment rate.
In addition, in the method discussed in Japanese Patent Application Laid-Open No. 2003-256221, it is determined whether a plurality of parallelized processes, which are generated by using a specific parallelizing program, can be assigned so that the plurality of parallelized processes can be executed in parallel in an idle time, of the time periods of processing by the processors, to which no process has been assigned. Furthermore, if it is determined that the plurality of processes can be executed in parallel, then another parallelized process is additionally assigned to the idle time. Moreover, each processor executes the parallelized process assigned to the time period of processing executed by each processor.
In the method discussed in Japanese Patent Application Laid-Open No. 2003-256221, a process for which a turn-around time needs to be secured is assigned to a predetermined time slot. Furthermore, a plurality of parallelized processes, which can be executed in parallel, are additionally assigned to an available time slot. In the above-described manner, the above-described conventional method improves the operation rate of the processor while securing a turn-around time.
However, the above-described conventional method considers only execution of a process having a predetermined processing load. More specifically, the above-described conventional method cannot sufficiently improve the processor operation rate if the processing (process) load (processing execution time) varies according to input data as in the face detection in the Viola & Jones method.
In addition, in executing identification processing, if the identification target (e.g., a person's face, a person's figure, or a car) is changed, the processing time and the successful detection rate for each stage may vary. More specifically, in identifying a specific object, such as a face of a person, a person, or a car, the shape of the identification target area may vary. In other words, the identification target area may be oriented in a portrait orientation or in a landscape orientation. In addition, in identifying a specific object such as a face of a person, a person, or a car, the size of the characteristic amount thereof may vary. Furthermore, due to the effect of the variation in the shape of the identification target area and the size of the characteristic amount, the processing time may vary. In addition, depending on the result of learning for each identification target, the successful detection rate may vary. Accordingly, the operation rate of a stage, which has been sufficiently efficient before the identification target is changed, may become inefficient after changing the identification target.