1. Field of the Invention
The present invention relates to a focus detection device used in cameras and in video equipment.
2. Background of Related Art
A focus detection device of a camera using a phase differential detection method is well known. FIG. 14 illustrates such a device. Light rays entering from a region 101 in the shooting lens 100 pass through a field mask 200, a field lens 300, an aperture stop unit 401, and a re-imaging lens 501. The image is composed on an image sensor array having a plurality of photo-electric conversion elements arranged in a one-dimensional line forming a row (shown as row A). The image sensor array produces an output corresponding to the incident intensity of the light rays. Similarly, light rays entering from a region 102 in the shooting lens 100 pass through the field mask 200, the field lens 300, the aperture stop unit 402, and the re-imaging lens 502 and compose an image on a row B of photo-electric conversion elements.
The pair of object images composed on the image sensor arrays rows A and B move away from each other during the so-called front focus condition in which the shooting lens 100 composes a clear image of the object in front of the predetermined focus surface. Conversely, the pair of object images move toward each other during the so-called rear focus condition in which the shooting lens 100 composes a clear image of the object in the back of the predetermined focus surface. The object images on the image sensor arrays rows A and B relatively coincide with each other during the so-called in-focus condition in which a clear image of the object is composed on the predetermined focus surface. Therefore, the focus adjustment of the shooting lens 100, in particular, and the amount and the direction of the deviation from the in-focus condition (hereafter "defocus amount") in the present invention can be determined by obtaining the relative position shift of the pair of object images. This is accomplished by converting the pair of object images into electric signals through photo-electric conversion using each row of photo-electric conversion elements of the image sensor rows A and B, and by performing algorithm processing on these signals.
Thus, projected images from the re-imaging lenses 501 and 502 on the image sensor rows A and B coincide with each other in the vicinity of the predetermined focus surface. This generally is the central region in the shooting field, as described in FIG. 13. This region is designated as a focus detection area.
The algorithm processing method for obtaining the defocus amount is described hereafter.
The image sensor rows A and B are each composed of a plurality of photo-electric conversion elements that output a plurality of output signal strings a[1], . . . ,a[n] and b[1], . . . ,b[n], respectively, as shown in FIGS. 15(a) and 15(b). A correlation algorithm relatively shifts the data within a specified range of the pair of output signal strings by a predetermined amount L. Letting the maximum shift amount be lmax, the range of L becomes -lmax to lmax. Specifically, the correlation amount C[L] is computed using formula 1.
C[L] = Σ|a[i+L] - b[i]| (1)
where L = -lmax, . . . ,-2,-1,0,1,2, . . . ,lmax and Σ denotes the total sum over i = k to r.
In formula 1, L is an integer corresponding to the shift amount of the output data strings as described above. The first term k and the last term r vary depending on the shift amount L, as described in formula 2.
If L ≥ 0,
k = k0 + INT{-L/2}
r = r0 + INT{-L/2}
If L < 0,
k = k0 + INT{(-L+1)/2}
r = r0 + INT{(-L+1)/2}, (2)
where k0 and r0 denote the first term and the last term, respectively, when the shift amount L is equal to 0.
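As a rough illustration, the correlation algorithm of formulas 1 and 2 can be sketched as follows in Python (the function and variable names are assumptions, the lists are 0-indexed rather than 1-indexed as in the formulas, and floor division is assumed for the INT{} operation):

```python
def correlation(a, b, k0, r0, lmax):
    """Correlation amounts C[L] per formulas 1 and 2.

    a, b -- output signal strings of image sensor rows A and B (0-indexed)
    k0, r0 -- first and last terms of the sum at shift L = 0
    lmax -- maximum shift amount
    """
    C = {}
    for L in range(-lmax, lmax + 1):
        # Formula 2: the summation range shifts with L; floor division
        # stands in for INT{} (the exact rounding convention is assumed).
        if L >= 0:
            k = k0 + (-L) // 2
            r = r0 + (-L) // 2
        else:
            k = k0 + (-L + 1) // 2
            r = r0 + (-L + 1) // 2
        # Formula 1: sum of absolute differences over i = k to r.
        C[L] = sum(abs(a[i + L] - b[i]) for i in range(k, r + 1))
    return C
```

The shift amount that minimizes C[L] indicates where the pair of signal strings best coincide.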
FIGS. 16(a)-16(e) describe a combination of signals used to compute the absolute value of the difference between row A signals and row B signals in formula 1 and the algorithm range of adding the absolute values of these differences when the initial term k and the last term r are varied by formula 2. As shown in these figures, the ranges used for rows A and B in the correlation algorithm shift in opposite directions with changes in the shift amount L.
There is also a known method in which the first term k and the last term r are fixed regardless of the shift amount L. In this method, the range used in the correlation algorithm of one of the rows is held constant and only the range of the other row shifts. The shift amount of the relative position becomes the shift amount L when a pair of data coincide. Hence, the shift amount corresponding to the minimum correlation amount among the determined correlation amounts C[L] is detected.
The shift amount is multiplied by a constant, which is determined by the pitch width of the photo-electric conversion elements in the image sensor array and the particular optical system used, to determine a defocus amount. Thus, a larger defocus amount can be detected by making the maximum shift value lmax larger.
The correlation amount C[L] is discrete, as shown in FIG. 15(c), and the minimum unit of detectable defocus amount is limited by the pitch width of the photo-electric conversion elements of the image sensor rows A and B. A method in which precise focus detection is obtained by performing an interpolation algorithm based on the discrete correlation amounts C[L] is disclosed by the assignee of the present invention in Japanese Unexamined Patent Publication Sho 60-37513, corresponding to U.S. Pat. No. 4,561,749. In this method, a true local minimum correlation amount Cex and a shift amount Ls that corresponds to Cex are computed by formulae 3 and formula 4, using the local minimum correlation amount C[l] and the correlation amounts C[l+1] and C[l-1], whose shift amounts lie on either side of C[l], as shown in FIG. 17.
DL = (C[l-1] - C[l+1])/2
Cex = C[l] - |DL|
E = MAX{C[l+1] - C[l], C[l-1] - C[l]} (3)
Ls = l + DL/E (4)
In formulae 3, MAX{Ca, Cb} is evaluated as the larger of Ca and Cb. Finally, the defocus amount DF is computed from the shift amount Ls using formula 5.
DF = Kf × Ls (5)
where Kf is a constant determined by the pitch width of the photo-electric conversion elements and the optical system used.
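The interpolation of formulae 3 through 5 can be sketched as below (a hypothetical helper; C is a mapping from shift amount to correlation amount, l is the shift of the discrete local minimum, and Kf is the conversion constant of formula 5):

```python
def interpolate(C, l, Kf):
    """True local minimum Cex, information amount E, and defocus amount DF
    from the three correlation amounts around the discrete minimum C[l]."""
    DL = (C[l - 1] - C[l + 1]) / 2                 # formulae 3
    Cex = C[l] - abs(DL)                           # true local minimum
    E = max(C[l + 1] - C[l], C[l - 1] - C[l])      # larger of the two slopes
    Ls = l + DL / E                                # formula 4: interpolated shift
    DF = Kf * Ls                                   # formula 5: defocus amount
    return DF, Cex, E
```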
The defocus amount thus obtained needs to be further evaluated as to its reliability, i.e., whether it is a true defocus amount or whether it is the result of fluctuations in the correlation amounts caused by noise and the like. If the obtained defocus amount satisfies the conditions of formula 6, it is considered reliable, i.e., there is a high level of confidence in the obtained defocus amount.
E > E1 and Cex/E < G1 (6)
where E1 and G1 are predetermined threshold values. Value E describes changes in the correlation amount which depend on the contrast of the object. As the value of E becomes larger, the contrast becomes larger and the confidence level becomes higher. Cex is the difference of the pair of data when the data are closest to each other. Ideally, Cex should be 0. However, due to noise and the visual difference between region 101 and region 102, there is a minute difference between the pair of object images; hence Cex is never 0 in reality. As the contrast of the object becomes higher, the effect of noise and the difference in object images becomes smaller. Therefore, Cex/E is used to denote the level of coincidence of the pair of data. Naturally, as the value of Cex/E approaches 0, the level of coincidence of the pair of data, and the level of confidence become higher.
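The reliability test of formula 6 then reduces to a pair of comparisons. In this sketch the default thresholds are arbitrary placeholders, since the actual values of E1 and G1 are device-dependent:

```python
def is_reliable(Cex, E, E1=50.0, G1=0.3):
    """Formula 6: require sufficient contrast (E > E1) and sufficient
    coincidence of the pair of data (Cex/E < G1).  The default threshold
    values are illustrative placeholders only."""
    return E > E1 and Cex / E < G1
```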
Another known method for determining reliability computes the contrast for one of the pair of data instead of using the value E to determine confidence.
If the defocus amount is determined to be reliable, driving or display of the shooting lens 100 is executed based on the defocus amount DF. The correlation algorithm, the interpolation algorithm, and the determination of conditions associated with formula 1 through formula 6 above will be collectively referred to hereafter as the focus detection algorithm.
The focus detection optical system and the image sensor, in general, are structured in such a manner that a pair of data coincide with each other if the shooting lens 100 is in the in-focus condition and the shift amount L is approximately 0. Therefore, unless the object image is formed within the range of k0 to r0 with the shift amount L equalling 0 when the shooting lens is in the in-focus condition, the shooting lens 100 cannot be focused on the object. In other words, the region in which focus detection is performed is determined by k0 and r0. Hereafter, the data region between k0 and r0 at shift amount L=0 will be referred to as the algorithm range. If the object is found in a region corresponding to the algorithm range in the shooting field, focus detection is to be performed on the object and the region becomes the focus detection area. This focus detection area is displayed in FIG. 13 as a focus detection frame enclosed by a solid line in the center of the finder screen of the camera. The photographer can focus the shooting lens on the desired object by capturing it within this detection frame.
In the focus detection device described above, if a plurality of objects at different distances are imaged on the image sensor arrays, the shift amount of the pair of output signals differs depending on the position on the sensor, and no single shift amount brings the entire pair of output signals into coincidence. Thus, sometimes focus detection becomes impossible because the value of Cex is too large and Cex/E does not satisfy the conditions of formula 6. Hence, a method is disclosed in Japanese Unexamined Patent Publication Sho 60-262004 in which the focus detection area is further subdivided by dividing a pair of image sensor arrays into a plurality of blocks, with the focus detection algorithm performed on each of these blocks to compute the defocus amount DF. A block is selected from the plurality of blocks having, for example, the largest amount of information or the defocus amount representing the shortest distance. The defocus amount of the selected block is designated as the focus adjustment condition of the shooting lens, and driving or display of the shooting lens is executed based on the defocus amount.
In dividing a pair of image sensor arrays into a plurality of blocks, a plurality of sets are formed having k0 and r0 with shift amount L=0 in the correlation algorithm of formula 1 described above. For example, as described in FIG. 12(a), in dividing a pair of image sensor arrays each consisting of 46 data, into five blocks, wherein each block consists of eight data, the defocus amount DF is computed using the focus detection algorithm of formulae 1-6 with k0=4 and r0=11 in block 1. Similarly, the defocus amount of each block is computed using the focus detection algorithm of formulae 1-6 with (k0=12, r0=19), (k0=20, r0=27), (k0=28, r0=35), (k0=36, r0=43), respectively, in blocks 2, 3, 4, and 5.
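The sets of (k0, r0) above follow a simple pattern, which might be generated as in this sketch (the function name is an assumption):

```python
def block_ranges(first, width, count):
    """(k0, r0) pairs for consecutive blocks of equal width, starting
    with k0 = first, as in the FIG. 12(a) and 12(b) examples."""
    return [(first + j * width, first + j * width + width - 1)
            for j in range(count)]
```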
As described in FIG. 12 (b), wider blocks are formed in the same pair of image sensor arrays than the example shown in FIG. 12(a) by setting (k0=3, r0=16) for block 1, (k0=17, r0=30) for block 2, and (k0=31, r0=44) for block 3, thus forming three blocks, each with 14 data.
When the boundary positions of the blocks are fixed in this manner, sometimes focus detection becomes impossible or the algorithm results are unstable due to the existence of object contrast at the boundaries of the blocks. Hence, a method is disclosed in Japanese Unexamined Patent Publication Hei 2-135311, corresponding to U.S. Pat. No. 5,068,682, in which the absolute value of the difference between adjacent data for the vicinity of a block boundary is computed, and the boundary position is moved so that the section where the absolute value of the difference assumes the minimum value becomes the boundary of the blocks.
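That boundary adjustment might be sketched as follows; the ±2-element search width is an assumption, since the publication cited above is described only in outline here:

```python
def adjust_boundary(data, nominal, search=2):
    """Move a block boundary near its nominal position to the section
    where the absolute difference between adjacent data is smallest,
    so that object contrast does not straddle the boundary."""
    lo = max(0, nominal - search)
    hi = min(len(data) - 2, nominal + search)   # i + 1 must stay in range
    return min(range(lo, hi + 1),
               key=lambda i: abs(data[i + 1] - data[i]))
```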
In the explanation described above, the output signal strings of the image sensor rows A and B, i.e., a[1], . . . ,a[n] and b[1], . . . ,b[n], are used directly. However, sometimes correct focus detection is impossible due to the effects of frequency components of the object that are higher than the Nyquist frequency or due to the effects of unbalance in the output of rows A and B. A method in which a filter algorithm process is performed on the output signal strings and the focus detection algorithm is performed using the resulting filtered data is disclosed by the assignee of the present invention in Japanese Unexamined Patent Publication Sho 61-245123. For example, to eliminate frequency components higher than the Nyquist frequency, the filter processing algorithm of formulae 7 is executed, resulting in the filter processed data Pa[1], . . . ,Pa[n-2] and Pb[1], . . . ,Pb[n-2] from the output signal strings a[1], . . . ,a[n] and b[1], . . . ,b[n], respectively.
Pa[i] = (a[i] + 2×a[i+1] + a[i+2])/4
Pb[i] = (b[i] + 2×b[i+1] + b[i+2])/4, (7)
where i=1 to n-2.
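Formulae 7 amount to convolution with a (1, 2, 1)/4 kernel, e.g. (0-indexed sketch, function name assumed):

```python
def smooth(x):
    """Formulae 7: (1, 2, 1)/4 low-pass filter that suppresses frequency
    components above the Nyquist frequency; output is 2 data shorter."""
    return [(x[i] + 2 * x[i + 1] + x[i + 2]) / 4 for i in range(len(x) - 2)]
```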
Next, by performing a DC-eliminating filter processing algorithm (i.e., a filtering process for eliminating the DC component) using formulae 8 to eliminate the effects of the unbalanced output from rows A and B, for example, on the filter processed data Pa[1], . . . ,Pa[n-2] and Pb[1], . . . ,Pb[n-2], the filter processed data Fa[1], . . . ,Fa[n-2-2s] and Fb[1], . . . ,Fb[n-2-2s], respectively, are obtained.
Fa[i] = -Pa[i] + 2×Pa[i+s] - Pa[i+2s]
Fb[i] = -Pb[i] + 2×Pb[i+s] - Pb[i+2s], (8)
where i = 1 to n-2-2s.
In formulae 8, s represents integers from 1 to approximately 10, and as s becomes larger, the frequency component of the object pattern extracted becomes lower. Conversely, as s becomes smaller, the frequency of the object pattern extracted becomes higher. Moreover, as s becomes larger, the number of data being filter processed becomes smaller.
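Formulae 8 can likewise be sketched as a (-1, 2, -1) kernel applied at tap spacing s; a constant input maps to all zeros, confirming that the DC component is completely eliminated:

```python
def dc_eliminate(P, s):
    """Formulae 8: band-pass filter with taps at spacing s.  Larger s
    extracts lower frequency components; output is 2s data shorter."""
    return [-P[i] + 2 * P[i + s] - P[i + 2 * s] for i in range(len(P) - 2 * s)]
```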
In the near in-focus condition, the object image contains more of the high frequency components; therefore, a relatively small value of s is preferred. In the out-of-focus condition, the object image is blurred and only low frequency components remain; therefore, a larger value of s is preferred. If s is small, almost all the low frequency components are eliminated; therefore, when high frequency components are absent, for example, when the defocus amount is large, detection becomes impossible. Consequently, making the maximum shift number lmax large is meaningless, and a relatively small value is set for lmax. On the other hand, if s is large, because the low frequency components are extracted, detection is possible even if the defocus amount is large. Accordingly, a relatively large value is set for lmax.
Furthermore, when the value of s is relatively large, sometimes the number of data is reduced by half by using every other datum from the DC-eliminating filter-processed data Fa[i] and Fb[i]. In this case, since one datum spans the width of two pixels, half of the algorithm range is required for the same detection region, compared to the case in which all the data are used. This reduces algorithm processing time. Moreover, a shift amount of 1 when every other datum is used is equivalent to a shift amount of 2 when every datum is used, enabling detection of the same defocus amount even with the maximum shift number reduced by half.
FIGS. 18(a)-18(c) describe the output signal string, the filter processed data for the output signal with s=2, and the filter processed data with s=8, respectively, of the output of an image sensor array for an object having only low frequency components. The shooting lens is assumed to be in the in-focus condition, and thus, the output signal string of row A is assumed to coincide with that of row B.
The filter processed data for s=2 are flat with hardly any contrast. However, by making s=8, the contrast becomes sufficient and results in a defocus amount with a high level of confidence.
Comparing the narrow algorithm range ce1 and the wide algorithm range ce2, as shown in FIG. 18(c), the wide algorithm range ce2 contains more contrast than the narrow algorithm range. Therefore, the wider algorithm range is more advantageous for the focus detection algorithm for which low frequency components are extracted.
FIGS. 19(a)-19(c) describe the output signal string, the filter processed data for the output signal with s=2, and the filter processed data with s=8, respectively, of the output of an image sensor array for an object having only high frequency components. Again, the shooting lens is assumed to be in the in-focus condition, and the output signal string of row A is assumed to coincide with that of row B.
For an object pattern consisting only of high frequency components, by making s=2, the contrast becomes sufficient and results in a defocus amount with a high level of confidence.
Comparing the narrow algorithm range ce1 and the wide algorithm range ce2, as described in FIG. 19(b), both algorithm ranges contain the same amount of contrast. However, because a narrow algorithm range is less affected by noise than a wide algorithm range, and because a wide algorithm range may result in the failure of focus detection due to the existence of multiple objects at different distances, the narrow algorithm range is preferred for filter processed data for which high frequency components are extracted.
FIGS. 20(a)-20(c) describe the output signal string, the filter processed data for the output signal with s=2, and the filter processed data with s=8, respectively, of an image sensor output for an object having both high and low frequency components. Again, the shooting lens is assumed to be in the in-focus condition, and the output signal string of row A is assumed to coincide with that of row B.
For an object pattern having both high frequency components and low frequency components, sufficient contrast is obtained regardless of the value of s. Moreover, the range of contrast distribution of the pattern becomes larger as the value of s increases.
FIGS. 21(a)-21(c) describe the output signal string, the filter processed data for the output signal with s=2, and the filter processed data with s=8, respectively, of an image sensor output for a subject, such as a chimney, for which the shooting lens deviates substantially from the in-focus condition. Here, the output signal string of row A is displayed by a solid line, while that of row B is displayed by a dotted line.
When the shooting lens deviates substantially from the in-focus condition, very few high frequency components are present in the output signal, and contrast is not obtained for the filter processed data with s=2. However, sufficient contrast is obtained with the filter processed data of s=8, and a defocus amount is obtained by making the maximum shift amount lmax sufficiently large.
Because the frequency components vary depending on the object, a method exists wherein s is first set to s=2. Thus, the filter processed data based on the extraction of high frequency components are output, and the process is concluded if a defocus amount with a high level of confidence can be obtained by conducting the focus detection algorithms of formulae 1 to 6 using this filter processed data. If a defocus amount with a high level of confidence cannot be obtained, s is set to s=4, filter processed data based on lower frequency components are output, and reliability is again evaluated. The process is repeated with increasing values of s until a reliable defocus amount is obtained.
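That retry strategy might look as follows; here smooth and dc_eliminate restate formulae 7 and 8, while focus_detect stands in for the focus detection algorithm of formulae 1 through 6 and is passed in by the caller (all names are assumptions):

```python
def smooth(x):
    # Formulae 7: (1, 2, 1)/4 low-pass filter.
    return [(x[i] + 2 * x[i + 1] + x[i + 2]) / 4 for i in range(len(x) - 2)]

def dc_eliminate(P, s):
    # Formulae 8: DC-eliminating filter at tap spacing s.
    return [-P[i] + 2 * P[i + s] - P[i + 2 * s] for i in range(len(P) - 2 * s)]

def detect_defocus(a, b, focus_detect, try_s=(2, 4, 8)):
    """Try progressively lower frequency bands until a reliable defocus
    amount is found.  focus_detect(Fa, Fb) -> (DF, reliable) is a
    stand-in for formulae 1-6 applied to the filtered data."""
    Pa, Pb = smooth(a), smooth(b)
    for s in try_s:
        Fa, Fb = dc_eliminate(Pa, s), dc_eliminate(Pb, s)
        DF, reliable = focus_detect(Fa, Fb)
        if reliable:
            return s, DF
    return None, None   # focus detection impossible at every s
```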
With this method, when the high frequency components are extracted initially near the in-focus state of a normal subject, e.g., the pattern of FIG. 20(a), it is possible to obtain a defocus amount with a high level of reliability with the focus detection algorithms using filter processed data with s=2. Consequently, it is possible to conduct focus detection within a short time. In addition, when the subject is the face of a person or the like and has only low frequency components, e.g., the pattern shown in FIG. 18(a), it is possible to obtain a defocus amount with a high level of reliability using the filter processed data based on the extraction of low frequency components.
When the defocus amount is large, such as in FIG. 21(a), the defocus amount is computed by increasing the maximum shift number lmax using the filter processed data for which low frequency components are extracted and then conducting the focus detection algorithms. When this is done, it is possible to shorten the processing time of the algorithm near the in-focus state, easily follow an object when the object is moving, and focus even when the object includes only low frequency components. Consequently, it becomes possible to detect even large defocus amounts.
In general, the precision of the defocus amounts obtained when the object includes high frequency components is better than when the object includes only low frequency components. Therefore, by initially conducting focus detection using the filter processed data based on the extraction of high frequency components, it becomes possible to obtain defocus amounts with good precision.
With focus detection conducted in blocks, a method is disclosed in Japanese Unexamined Patent Publication Hei 6-82686, corresponding to U.S. Pat. No. 5,389,995, wherein s is initially set to s=2, filter processing is performed to extract high frequency components, and focus detection algorithms are conducted on each block using this filter processed data. The process is concluded if a block exists in which a defocus amount with a high level of confidence is obtained. If a high level of confidence is not obtained, s is set to s=4, filter processing is performed to extract low frequency components, and focus detection algorithms are conducted on each block using this filter processed data. The filter processing is repeated with increasing values of s until a block exists in which a defocus amount with a high level of confidence is obtained.
In the focus detection device disclosed in the above-described Japanese Unexamined Patent Publication Hei 2-135311, the absolute value of the difference between adjacent data near the boundary of the block is computed, and the boundary position is moved so that the block boundary becomes the position where the absolute value of the difference becomes the minimum value. A method is disclosed in the above-described Japanese Unexamined Patent Publication Hei 6-82686 wherein when the filter processed data for which DC components are completely eliminated are divided into a plurality of blocks, the absolute value of the difference between the data near the block boundary and a predetermined value is computed, and the block boundary position is set on the basis of the absolute value of this difference.
The filter processes of formulae 8 are processes that completely eliminate the DC component. When focus detection algorithms are conducted using filter processed data that completely eliminate the DC component, the problem arises that the possibility of a false focus is greater than when data is used in which the DC component remains. This problem will be described with reference to FIGS. 22(a)-22(d).
FIGS. 22(a) and 22(b) illustrate output signals from the image sensor rows A and B corresponding to a viewed object wherein the luminosity changes in steps from left to right across the focus detection area. The patterns in 22(a) and 22(b) match at the portions indicated by the arrows. As shown, the output of image sensor array row A is shifted to the left with respect to the output of row B. FIGS. 22(c) and 22(d) are the filtered data patterns corresponding to FIGS. 22(a) and 22(b), respectively, for which the DC component is completely eliminated. Because only the DC component differs between the data in FIGS. 22(a) and 22(b), the filtered data is the same when the DC component is completely eliminated. Hence, when focus detection is conducted using these filtered data, the determination is made that the object is in focus because the pair of data relatively agree. Consequently, different object patterns may appear similar. This is particularly noticeable when the algorithm range is made narrower by conducting block division as described above.
To overcome this problem, a method is disclosed in the above-described Japanese Unexamined Patent Publication Hei 6-82686 wherein DC reduction filter processing, which does not completely eliminate the DC component, is conducted using formulae 9 to obtain the DC reduction filter processed data Qa[i] and Qb[i].
Qa[i] = -Pa[i] + 4×Pa[i+y] - Pa[i+2y]
Qb[i] = -Pb[i] + 4×Pb[i+y] - Pb[i+2y] (9)
where i=1 to n-2-2y.
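Compared with formulae 8, the center tap weight of 4 in formulae 9 leaves a residual DC response: a constant input c maps to 2c rather than to zero, so patterns differing only in DC level remain distinguishable. A sketch (function name assumed):

```python
def dc_reduce(P, y):
    """Formulae 9: (-1, 4, -1) kernel at tap spacing y.  The DC component
    is retained (with gain 2) rather than completely eliminated."""
    return [-P[i] + 4 * P[i + y] - P[i + 2 * y] for i in range(len(P) - 2 * y)]
```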
In focus detection devices which conduct block division and then conduct the focus detection algorithms for each block, methods of determining a single final defocus amount from the plurality of defocus amounts, other than the above-described methods of selecting the defocus amount indicating the closest distance or selecting the defocus amount of the block where the information value E is largest, are disclosed in Japanese Unexamined Patent Publication Hei 2-178641, corresponding to U.S. Pat. No. 5,258,801, and Hei 4-235512. These methods select a block satisfying predetermined conditions as a standard block, set the defocus amount of the standard block as the standard defocus amount, determine a weighting coefficient on the basis of the amount of difference between each defocus amount and the standard defocus amount, and find a weighted average of the plurality of defocus amounts using this weighting coefficient in order to compute the final defocus amount. The predetermined conditions for the standard block include, for example, that the block indicates the defocus amount having the closest distance. When the amount of difference is small, the weighting coefficient is increased, and when the amount of difference is large, the weighting coefficient is decreased.
With this method, when a plurality of objects of differing distances are intermixed, it is possible to obtain a defocus amount relating to each of the objects. When the object is flat, such as a wall or the like, it is possible to obtain a stable defocus amount because the whole is averaged. The combined defocus amount Dfm and the combined information amount Em can be obtained using formulae 10 below.
Dfm = Σ(Df[j] × E[j] × W[j]) / Σ(E[j] × W[j])
Em = Σ(E[j] × W[j]) (10)
where j=1 to h, h is the number of blocks, Df[j] is the defocus amount, and E[j] is the information amount E of block j.
The weighting coefficient W[j] is determined as shown in FIG. 11 from the difference between Dfk (the standard defocus amount) and Df[j] and has a value between 0 and 1. ML and UL are predetermined values such that W[j] is 1 when the absolute value of the difference in the defocus amounts is not greater than ML, is 0 when UL is exceeded, and changes in a linear manner between ML and UL. A value of W[j] equal to 0 indicates that Df[j] is not used in the combining algorithm.
The combined defocus amount Dfm obtained in this way is the final defocus amount. It is preferable for the value of ML to be between 30 μm and 50 μm, and for the value of UL to be between 80 μm and 140 μm.
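Under stated assumptions (defocus amounts expressed here in mm, with mid-range defaults of 40 μm and 110 μm for ML and UL), the weighting of FIG. 11 and the combination of formulae 10 might be sketched as:

```python
def weight(df_j, df_std, ML=0.040, UL=0.110):
    """W[j] per FIG. 11: 1 up to ML, linear between ML and UL, 0 beyond.
    Units are mm; the default ML/UL values are mid-range assumptions."""
    d = abs(df_j - df_std)
    if d <= ML:
        return 1.0
    if d > UL:
        return 0.0
    return (UL - d) / (UL - ML)

def combine(Df, E, df_std):
    """Formulae 10: combined defocus amount Dfm and information amount Em
    from per-block defocus amounts Df[j] and information amounts E[j]."""
    W = [weight(d, df_std) for d in Df]
    Em = sum(e * w for e, w in zip(E, W))
    Dfm = sum(d * e * w for d, e, w in zip(Df, E, W)) / Em
    return Dfm, Em
```

Blocks whose defocus amounts lie far from the standard value thus drop out of the average entirely, while nearby blocks contribute in proportion to their information amounts.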
However, the following problems occur in the conventional focus detection device described above.
First, the K-factor Kf in formula 5 differs in actuality depending on the position within the focus detection area. This is caused by the difference in magnification depending on the location within the focus detection area due to the distortion aberration of the focus detection optical system. In general, the K-factor Kf becomes larger as the distance from the center of the focus detection area increases. Therefore, it becomes necessary to change the K-factor Kf corresponding to the position where the object image is formed on the image sensor array.
In order to solve such a problem, the focus detection device disclosed in the above-described Japanese Unexamined Patent Publication Hei 2-135311 establishes a K-factor for each block and computes a defocus amount for each block using the K-factor for that block. However, this method is unstable when an object contrast exists in the boundary section of the block because the K-factor used depends on which block contains the contrast.
Second, if the information amount E exceeds the threshold value E1 only slightly and Cex/E falls below the threshold value G1 only slightly in the determination by formula 6, the defocus amount is determined to have a high level of confidence even though the defocus amount may be unstable.
To overcome this, a stricter threshold value E1' may be used in place of the threshold value E1 of the information amount. Consequently, even when the information amount exceeds the threshold value E1 slightly and Cex/E falls substantially below the threshold value G1, the defocus amount is determined to lack a sufficient level of confidence because the information amount E falls below the threshold value E1'. The same types of problems occur when a stricter threshold value G1' is used in place of the threshold value G1 of Cex/E instead of changing the threshold value E1, or when both E1 and G1 are replaced by stricter values E1' and G1'.