The present invention relates to position determination and is particularly, but not exclusively, concerned with derivation of camera position for so-called “virtual studio” television production. In such applications, a virtual scene, for example a computer-generated background, is superimposed onto a real scene, for example actors. It is important to know the exact position and orientation of the television camera in the studio, so that the virtual scene can be computed to correspond to the viewpoint of the real camera, to ensure correct registration between real and virtual elements in the scene.
There are a number of commercially-available systems that can provide camera position information. Most are based on mechanical camera mountings, such as robotic pedestals, pedestals mounted on tracks, or robot arms, which have sensors to measure position and orientation. These systems cannot be used with hand-held cameras, and can be bulky and difficult to use.
There are also methods that work without mechanical sensors, but instead use a patterned blue background, which is visible in the camera image. By analysing the video signal, these methods can deduce the orientation and position of the camera. One example of such a method is described in our earlier GB-A-2271241, which derives pan, tilt and zoom using an arbitrary patterned background. A further example using the same technique is described in WO-A-95/30312, which uses a particular type of pattern to enable pan, tilt, zoom and position to be determined. However, these methods rely on the presence of a two-tone blue background. This is inappropriate in some situations, for example when it is desired to extract shadows of objects from the foreground image. Also, in some situations there may be little or no blue background visible, for example during a close-up shot of an actor. Sometimes it may also be required to place a virtual object against a real background, in which case there will be no blue background at all.
WO-A-9711386 discloses apparatus for determining position and orientation in which a camera is pointed at an optically modulated target.
EP-A-706105 discloses a navigation system for an autonomous mobile robot in which coded signs are placed at various locations. The markers are distinguished based on the ratio of radii of rings.
A method for locating the position of a head-mounted display that has been used in the field of Augmented Reality is described in the paper by Azuma et al entitled “A Demonstrated Optical Tracker with Scalable Work Area for Head-Mounted Display Systems” published in ACM Computer Graphics: Proceedings of the 1992 Symposium on Interactive 3D Graphics (Cambridge, Mass., April 1992), pp. 43-52. This method uses a number of infra-red LEDs mounted in ceiling panels, viewed by four upward-looking sensors mounted on the user's headset. The sensors each provide information of the co-ordinates of bright points in the image (the LEDs) and from these co-ordinates, the known position of the LEDs and the geometry of the sensors, the position of the headset is computed. The inventors contemplated applying Azuma's method to the problem of determination of camera position. However, Azuma's method is not intended to be used for determining the position of a camera in a large studio, but is designed instead to work in a relatively small volume, where the total number of LEDs is small. The inventors identified several potential problems in applying such a technique to the field of camera position determination.
In a television studio, the camera could move many tens of meters, so the method of Azuma et al would require a very large number of ceiling markers of which only a small proportion would be visible at any one time. Azuma's method relies on active control of the LEDs to identify the LEDs. The control electronics and wiring required to implement this in a large studio would be complex and impracticable. Furthermore, the ceiling of a television studio generally contains a number of lights and other objects that could make identifying the markers much more difficult than in the carefully controlled conditions in which Azuma's method is designed to function; the bright spot sensors used by Azuma would generate spurious signals. A still further problem the inventors have identified is that the set of markers that the camera can see will change, not only due to markers coming into and out of the field of view as the camera moves, but also due to markers being obscured by objects such as microphone booms and studio lights. Using Azuma's method, each time a marker appears or disappears, there is likely to be a small but sudden change in the computed position and orientation of the camera, since any errors in the camera calibration or the measurement of the marker positions will lead to results that depend on which markers are being used.
Thus, Azuma's method cannot be directly applied to the present problem. The inventors have developed novel techniques which are particularly suited to the exacting requirements of determining camera position in a television studio, which overcome or alleviate the drawbacks of conventional techniques. References in this specification to a video camera are intended to include any camera capable of obtaining video images; although the video camera is advantageously a conventional television studio camera, no limitation to a particular type of camera or intended use is implied.
In a first aspect, the present invention provides a method of determining the position of an object comprising:
providing a plurality of markers at respective reference positions, at least some of the markers being patterned to encode identification information, the pattern providing a reference point for each marker;
storing information including a measure of the positions of the markers and information identifying the patterned markers;
obtaining an image of at least a sub-set of said plurality of markers from a camera associated with the object;
processing the image to identify the positions of said markers in the image and, for each patterned marker in the image, decoding said identification information;
determining a measure of the position of the object based on said processing and decoding and based on said stored information.
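By way of illustration only, the storing and decoding steps of the first aspect may be pictured as a table lookup that pairs each decoded marker in the image with its stored reference position. The following Python sketch is not taken from the specification; the names `StoredMarker`, `Detection` and `build_correspondences`, and the use of metres and pixels as units, are assumptions made for the example. The resulting pairs of image point and studio position would then be passed to whatever position-solving step is used.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class StoredMarker:
    """Stored information for one patterned marker (hypothetical layout)."""
    marker_id: int
    position: Tuple[float, float, float]  # reference position, assumed in metres


@dataclass(frozen=True)
class Detection:
    """One marker found in the image: decoded ID plus its reference point."""
    marker_id: int
    image_xy: Tuple[float, float]  # assumed pixel co-ordinates of the ring centre


def build_correspondences(
    store: Dict[int, StoredMarker], detections: List[Detection]
) -> List[Tuple[Tuple[float, float], Tuple[float, float, float]]]:
    """Pair each decoded detection with the stored position of that marker.

    Detections whose ID is not in the store (for example a misread marker)
    are skipped rather than guessed at.
    """
    pairs = []
    for det in detections:
        marker = store.get(det.marker_id)
        if marker is None:
            continue
        pairs.append((det.image_xy, marker.position))
    return pairs


store = {
    3: StoredMarker(3, (1.0, 2.0, 4.5)),
    7: StoredMarker(7, (2.5, 2.0, 4.2)),
}
detections = [
    Detection(3, (120.0, 88.0)),
    Detection(7, (310.5, 92.0)),
    Detection(12, (400.0, 10.0)),  # not in the store, so ignored
]
pairs = build_correspondences(store, detections)
```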
Thus, with the invention, at least some, and preferably all, markers are patterned to encode information identifying the marker. In this way, it becomes possible to determine not only relative movements of the object but also the absolute position by “looking up” the positions of each patterned marker. This may enable much greater freedom of movement of the object over a wider area whilst allowing absolute position to be measured accurately. In addition, determination of relative movement may be simplified or improved.
Preferably, the pattern comprises concentric shapes, preferably substantially closed rings; the use of concentric rings to encode the information facilitates location of the markers, as the centre of the rings can conveniently be located and provides a convenient reference position. Preferably, identifying the positions of the markers includes identifying the centres of the concentric rings as a reference point for each patterned marker. Although the rings are most preferably concentric to within the resolution of the camera means in use as this greatly facilitates identification and decoding, they need not be exactly concentric; in such a case, eccentricity may be used to provide a measure of angular orientation, and the reference point may be determined based on, for example, the centre of the innermost ring.
Preferably, the patterned markers encode information as a series of light and dark regions. This enables a monochrome camera to be used to identify the markers, allowing higher resolution to be obtained. The camera may operate in the visible region of the spectrum, the patterned markers having visible markings; this facilitates identification of the markers during setting up by a user. However, infra-red (or ultra-violet) light may be used.
Preferably the identification information is encoded in binary form as the brightness or reflectivity of each ring. This can provide a robust and simple coding scheme. The rings in such a scheme are preferably of known inner and outer diameter and preferably of substantially equal thickness. If the dimensions or relative dimensions of each ring are known, it is not necessary for the rings to be visibly delimited; adjacent rings can be contiguous even if of the same shade. In addition, a measure of the distance to each marker can be derived based on the size of the marker in the image.
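As a rough illustration of how a distance measure might be derived from the size of a marker in the image, the following Python sketch uses a simple pinhole camera relation. This model, the function name `estimate_distance` and the parameter `focal_length_px` (focal length expressed in pixels) are assumptions made for the example and are not specified in the text; the sketch also assumes the marker is viewed approximately face-on.

```python
def estimate_distance(
    real_diameter_m: float, image_diameter_px: float, focal_length_px: float
) -> float:
    """Estimate camera-to-marker distance from the marker's apparent size.

    Assumes an ideal pinhole camera and a marker roughly perpendicular to
    the optical axis:  distance = f * D / d.
    """
    if image_diameter_px <= 0:
        raise ValueError("marker diameter in the image must be positive")
    return focal_length_px * real_diameter_m / image_diameter_px


# A 0.2 m marker that spans 50 pixels, seen through a lens with an
# assumed focal length of 1000 pixels.
distance_m = estimate_distance(0.2, 50.0, 1000.0)
```

Under these assumptions the example marker is about 4 m away; in practice lens distortion and oblique viewing would need to be accounted for.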
Alternatively, the information can be encoded in binary form as the thickness of each ring, successive rings preferably alternating between light and dark.
The rings are preferably substantially circular (this facilitates detection and decoding) and preferably all of the same or similar shapes. However, other shapes (for example squares, rectangles or polygons) may be used; the term “ring” as used in this specification is intended to encompass such non-circular rings.
Surprisingly it has been found that good results can be obtained with monochrome patterns and binary encoding, even though this may require several rings. This has been found to be due in part to the better resolution and linearity generally attainable with monochrome cameras than with colour cameras. In addition, monochrome encoding enables a much higher dynamic range between light and dark to be attained and provides lower sensitivity to changes in ambient lighting.
Indeed, this binary encoding enabling monochrome detection is an important feature which may be provided independently in a second aspect, in which the invention provides a method of determining the position of an object comprising:
providing a plurality of markers at respective reference positions, at least some of the markers being patterned to encode identification information in binary form as a series of light and dark regions, the marker also providing a detectable reference point;
storing information including a measure of the positions of the markers and information identifying the patterned markers;
obtaining an image of at least a sub-set of said plurality of markers from a camera, preferably monochrome camera means, associated with the object;
processing the image to identify the positions of said markers in the image and, for each patterned marker in the image, decoding said identification information;
determining a measure of the position of the object based on said processing and decoding and based on said stored information.
Surprisingly, if binary encoding is used, the markers can be relatively complex and still be reliably decoded. Thus, in preferred arrangements, the markers encode at least 3, and more preferably 4 or more bits of identification information. This facilitates identification of the markers, as a larger number of markers can be uniquely identified. Preferably, at least one guard ring or error-correcting ring is provided, so the markers contain at least 4 or more rings.
In a related third aspect, the invention provides a set of patterned markers each comprising a series of concentric rings, each marker encoding a plurality of bits of information in binary form as the brightness or reflectivity and/or thickness of each ring, the information encoded varying between the markers. Some markers may be repeated (i.e. encode the same information) within the set.
The invention also provides use of a series of concentric rings to encode a plurality of bits of information in binary form as the brightness or reflectivity and/or thickness of each ring.
The invention further provides, in a fourth aspect, a method of producing a marker encoding a plurality of bits of information, the method comprising obtaining the information to be encoded in binary form and providing on a backing a series of concentric rings, each ring having either a high or a low brightness or reflectivity, the thickness and brightness or reflectivity of each ring being selected according to a pre-determined coding scheme to encode the information.
This may be implemented as a method of producing a marker encoding a plurality of bits of information, the method comprising obtaining the information to be encoded in binary form and providing on a backing a series of concentric rings, each ring corresponding to a bit of information to be encoded and having a first brightness or reflectivity to encode a binary “0” and a second brightness or reflectivity to encode a binary “1”.
A variety of coding schemes may be used, including self-clocking and self error-correcting schemes. For example, known linear barcoding schemes may be used with successive bars corresponding to successive rings.
Although a number of coding schemes may be used, it is found that, surprisingly, a simple coding scheme in which the rings are of substantially constant thickness and each have a brightness or reflectivity corresponding directly to a bit of information can provide a compact marker which can be reliably and readily decoded.
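The simple scheme just described, with rings of constant thickness each corresponding directly to one bit, can be sketched as follows. This Python example is illustrative only: the bit order (most significant bit on the first ring sampled), the 0.5 brightness threshold and the function names `encode_rings` and `decode_rings` are assumptions, not details given in the specification.

```python
from typing import List, Sequence


def encode_rings(marker_id: int, n_bits: int) -> List[int]:
    """Return one value per ring: 1 for a light ring, 0 for a dark ring.

    Assumed convention: the first ring carries the most significant bit.
    """
    if not 0 <= marker_id < 2 ** n_bits:
        raise ValueError("marker_id does not fit in n_bits")
    return [(marker_id >> (n_bits - 1 - i)) & 1 for i in range(n_bits)]


def decode_rings(samples: Sequence[float], threshold: float = 0.5) -> int:
    """Rebuild the marker ID from one brightness sample per ring.

    Samples are assumed to be normalised to the range 0..1, with values
    above the threshold treated as light (binary 1).
    """
    value = 0
    for sample in samples:
        value = (value << 1) | (1 if sample > threshold else 0)
    return value


rings = encode_rings(5, 4)                     # four rings for ID 5
decoded = decode_rings([0.1, 0.9, 0.2, 0.8])   # noisy brightness samples
```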
One or more parity bits or error-correcting bits may be included (preferably one parity bit for the “even” rings and one for the “odd” rings); this enables offset errors to be detected.
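One possible arrangement of the two parity bits is sketched below in Python. The choice of even parity, the placement of the parity rings after the data rings, and the names `add_parity` and `check_parity` are assumptions made for illustration; the specification does not fix these details.

```python
from typing import List, Sequence


def add_parity(bits: Sequence[int]) -> List[int]:
    """Append two parity bits: one over even-indexed rings, one over odd."""
    even_parity = sum(bits[0::2]) % 2
    odd_parity = sum(bits[1::2]) % 2
    return list(bits) + [even_parity, odd_parity]


def check_parity(coded: Sequence[int]) -> bool:
    """Return True if both parity bits agree with the data rings."""
    data, even_parity, odd_parity = coded[:-2], coded[-2], coded[-1]
    return (
        sum(data[0::2]) % 2 == even_parity
        and sum(data[1::2]) % 2 == odd_parity
    )


coded = add_parity([1, 0, 1, 1])
ok = check_parity(coded)

# A single misread ring changes one of the two parity groups.
corrupted = [0] + coded[1:]
bad = check_parity(corrupted)
```

Keeping separate parity over the even and odd rings means that a read displaced by one ring moves each data bit into the other parity group, which is what allows many such offset errors to show up as a parity failure.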
Decoding the markers may include comparing the identification information to predicted identification information based on marker positions and correcting or verifying the identification information based on the results of the comparison. For example, if the decoded marker identification information differs from the predicted information by an amount corresponding to a single bit or a shift of all bits, this may be interpreted as a corresponding read error and corrected “intelligently”.
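The comparison against predicted identification information might, for instance, be sketched as below. This Python example is a minimal illustration under stated assumptions: IDs are held as integers of `n_bits` bits, a shift is taken to mean a displacement of exactly one ring with the shifted-in bit ignored, and the function name `reconcile` is hypothetical.

```python
from typing import Optional


def reconcile(decoded: int, predicted: int, n_bits: int) -> Optional[int]:
    """Accept the predicted ID if the decoded ID is a plausible misread of it.

    Returns the predicted ID when the two agree, differ in exactly one bit,
    or differ by a one-ring shift of all bits; otherwise returns None.
    """
    if decoded == predicted:
        return predicted

    diff = decoded ^ predicted
    if diff & (diff - 1) == 0:  # diff is non-zero here, so exactly one bit set
        return predicted

    low_mask = (1 << (n_bits - 1)) - 1
    # Pattern read one ring late: predicted's low bits appear one place higher.
    if (decoded >> 1) == (predicted & low_mask):
        return predicted
    # Pattern read one ring early: predicted's high bits appear one place lower.
    if (decoded & low_mask) == (predicted >> 1):
        return predicted

    return None


exact = reconcile(0b1011, 0b1011, 4)
single_bit = reconcile(0b1010, 0b1011, 4)
shifted = reconcile(0b0010, 0b1001, 4)
unrelated = reconcile(0b0110, 0b1001, 4)
```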
The backing preferably has a surface coating of said first or second reflectivity or brightness (for example white or retro-reflective or black); in this way, only rings of the other reflectivity or brightness need be actively provided, for example by applying retro-reflective material or pigment.
The dark regions are preferably substantially black. The light regions may be white, but may be advantageously formed by retro-reflective material, to be illuminated by a light substantially optically coincident with the camera axis. This enables a high level of contrast to be attained, and facilitates detection of markers. The use of retro-reflective material and a light source associated with the camera renders detection of the markers less sensitive to changes in ambient lighting conditions, as may occur in a television studio. In place of retro-reflective material, fluorescent pigments may be employed to enhance the contrast.
Preferably each patterned marker has an outer region, preferably a relatively wide annular region, which is substantially the same shade for each marker, preferably dark; this may facilitate distinguishing the marker from other items, for example studio lights. Preferably the ring immediately inside the outer region is of the opposite shade to said outer region, preferably light; this may facilitate determination of the size of the pattern and enable more reliable decoding of the information. The outer region may contain one or more markings enabling the angular orientation of the marker to be determined.
The innermost ring or rings may be of predetermined brightness to facilitate detection of the centre of the marker, or may be used to encode identification information.
As an alternative to binary encoding, the rings may vary in colour, each ring encoding more than one bit of information. For example, a range of colours may be employed, with a particular bit pattern assigned to a selected colour, preferably based on an RGB encoding scheme. This may allow more information to be encoded in a smaller number of rings (for example, if 8 secondary colours are used, each ring can encode 3 bits).
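The arithmetic behind the colour example can be illustrated with a short Python sketch. The mapping of each colour to three on/off RGB components, and the names `bits_per_ring`, `rings_needed` and `colour_to_bits`, are assumptions made for the example rather than a scheme defined in the specification.

```python
from typing import Tuple


def bits_per_ring(n_colours: int) -> int:
    """Whole number of bits one ring can carry with n distinguishable colours."""
    if n_colours < 2:
        raise ValueError("need at least two colours")
    return n_colours.bit_length() - 1  # floor(log2(n_colours))


def rings_needed(n_bits: int, n_colours: int) -> int:
    """Number of rings needed to carry n_bits of identification information."""
    per_ring = bits_per_ring(n_colours)
    return -(-n_bits // per_ring)  # ceiling division


def colour_to_bits(rgb: Tuple[int, int, int]) -> int:
    """Map an on/off RGB triple such as (1, 0, 1) to a 3-bit value."""
    r, g, b = rgb
    return (r << 2) | (g << 1) | b


colour_rings = rings_needed(12, 8)   # 8 colours, 3 bits per ring
binary_rings = rings_needed(12, 2)   # two tones, 1 bit per ring
magenta = colour_to_bits((1, 0, 1))
```

Under these assumptions, 12 bits of identification information need 4 coloured rings rather than 12 two-tone rings.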
Preferably, at least some markers encode identification information uniquely identifying that marker among the plurality of markers. This enables the absolute position to be determined without needing to know an initial starting position. The markers are preferably so disposed that the position of the object can be uniquely determined at all points within a given volume.
The method may include adjusting the determined position to smooth out sudden changes in determined position as markers are revealed or obscured. This may alleviate the problem of small, but nonetheless sudden and very visible changes in determined object position due to small errors in determination of position of the markers as different markers move in and out of the field of view of the camera. Adjusting is preferably achieved by applying correction factors to the stored or measured positions of the markers so that determination of object position is based on corrected marker positions, the correction factors being calculated so that the corrected marker positions tend to mutually self-consistent values, the rate of variation of the correction factors being kept below a pre-determined level.
Adjustment may be effected by tracking movement of the markers as the object position changes and generating 3-dimensional correction vectors. In this way, stable, refined marker positions may be determined. However, tracking markers over several frames may be computationally intensive, and will fail to produce stable values for marker positions where errors are attributable to any non-linearity in the camera or lens. Thus, a preferred, simpler implementation of adjustment comprises generating correction factors with 2 degrees of freedom (for example displacement vectors parallel to the image plane or parallel to the reference surface) and correcting the marker positions to tend to render the marker positions self-consistent for an image frame or for a series of image frames.
The rate of variation may be limited so that adjustment of the correction factors occurs over a period of a few seconds (preferably at least about 2 seconds, preferably less than about 15 seconds, typically about 5-10 seconds), so that a gradual drift in the determined position occurs which is much less perceptible than a sudden shift.
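A rate-limited update of a correction factor with 2 degrees of freedom could be sketched as follows. This Python example is illustrative only: the target correction is assumed to have been computed elsewhere (for example from the residual between predicted and observed marker positions), and the function name `step_correction` and the per-frame step limit `max_step` are assumptions.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def step_correction(current: Vec2, target: Vec2, max_step: float) -> Vec2:
    """Move a marker's correction vector towards its target value.

    The change per call is capped at max_step, so a large discrepancy is
    absorbed gradually over many frames instead of in a single jump.
    """
    dx = target[0] - current[0]
    dy = target[1] - current[1]
    dist = math.hypot(dx, dy)
    if dist <= max_step:
        return target
    scale = max_step / dist
    return (current[0] + dx * scale, current[1] + dy * scale)


# Target is 5 units away; with a cap of 1 unit per frame the correction
# takes five frames to reach it.
correction = (0.0, 0.0)
frames = 0
while correction != (3.0, 4.0):
    correction = step_correction(correction, (3.0, 4.0), 1.0)
    frames += 1
```

In a real system `max_step` would be chosen from the frame rate so that a typical correction settles over the period of a few seconds mentioned above.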
A measure of the error or accuracy of position determination may be provided based on the self-consistency of determined position calculated for each marker. This may be based on the magnitude of said correction factors.
In a fifth aspect, the invention provides a method of determining the position of an object based on identification of a plurality of markers at respective reference positions, the method comprising applying correction factors to stored measures of the marker positions to produce respective corrected marker positions and determining the object position based on the corrected marker positions, wherein the correction factors are calculated so that the corrected marker positions tend to mutually self-consistent values, the rate of variation of the correction factors being limited to a pre-determined level.
Although the markers may be adhered to a flat reference surface, the markers are preferably positioned at varying distances from a reference surface, preferably the ceiling of a room in which the object is moved; this is found to provide a surprising improvement in the accuracy of detection of movement of the object. Particularly in such a case, the stored information preferably contains a measure of the distance of each marker from the reference surface. This may be achieved by storing the three-dimensional co-ordinates of each marker with respect to a defined origin (in cartesian or any polar or curvilinear form which is convenient for the position determination algorithm).
In the preferred application, the object is a video camera, and the camera means comprises a separate, generally upward pointing, camera mounted on the video camera.
In a sixth, apparatus, aspect, the invention provides apparatus for position determination comprising a camera for mounting on an object whose position is to be determined; memory means arranged to store measures of the positions of a plurality of markers and to store information identifying patterned encoded markers; image processing means arranged to process an image output by the camera to identify the positions of markers in the image; decoding means for decoding information encoded in the patterned markers; and position determining means arranged to determine the position of the object based on the output of the image processing means, the decoding means and the information stored in the memory means.
The memory means, image processing means and position determining means may all be integrated into a single computer. However, preferably, at least part of the function of the image processing means is provided by a hardware accelerator arranged to identify one or more markers or to decode one or more patterned markers or both.
The apparatus preferably further includes means for applying correction factors to stored measures of the marker positions to produce respective corrected marker positions and determining the object position based on the corrected marker positions, wherein the correction factors are calculated so that the corrected marker positions tend to mutually self-consistent values, the rate of variation of the correction factors being limited to a pre-determined level.
In cases where the position of a video camera (separate from said camera means) is determined, position determination may be supplemented by information obtained from the video camera image. In addition, information concerning the video camera zoom and focus settings may be input. One or more encoded markers may be placed in the field of view of the video camera. Provision of encoded markers at specific positions in the field of view is much easier to implement in practice than provision of a patterned background for an entire studio, as used in prior art methods. Moreover, the encoded markers may enable more accurate measurement than a conventional background in which it is difficult to provide accurate patterning over a large area.
Thus, the method of the first or second aspect may further comprise providing at least one supplementary marker at a given position so as to be viewable by the video camera, the or each marker having identification information encoded thereon in binary form as a series of concentric rings of two tones, the tones being selected to enable the marker to be keyed out of the video camera image by chroma-keying, said determining the position of the video camera being based on the position of the supplementary marker, if present, in the video camera image.
This supplementary positioning may be provided independently, as a seventh aspect, in a method of determining or correcting a measure of the position of a video camera, the method comprising providing at least one marker at a given position viewable by the camera, the or each marker having identification information encoded thereon in binary form as a series of concentric rings of two tones, the tones being selected to enable the marker to be keyed out of the video camera image by chroma-keying, a measure of the position of the video camera being determined based on the position of the marker, if present, in the video camera image. Complete position determination may be effected by use of several such markers so that preferably at least three individual markers are always visible, or by additionally using mechanical sensing, or other optical methods, with the encoded markers providing a check or reference point. Preferably a set of preferably at least three individual markers are mounted on a substantially rigid support to form a cluster of markers, the markers preferably being non co-planar.
In an eighth aspect, the invention provides marker apparatus for positioning in the field of view of a camera for use in determining the relative position of the camera and the marker apparatus, the apparatus comprising at least three non-co-planar patterned markers mounted at predetermined relative positions on a substantially rigid base, the patterned markers each having identification information encoded thereon in binary form. The information is preferably encoded as a series of concentric rings, preferably of two tones, the tones being selected to enable the marker to be keyed out of the video camera image by chroma-keying.
Reference has been made above to a camera means mounted on the object and markers disposed around a volume in which the object is located. This is indeed preferable for determining the position of a camera in a large studio. However, it will be appreciated that the technique may be employed to determine the position of one or more objects on which markers are located using one or more cameras mounted in the volume in which the object is located. For example, the positions of several cameras within a relatively small studio may be determined by a single camera mounted in the ceiling of the studio and having a sufficient field of view to view all cameras; this may lead to a saving in cost.
Thus, in a ninth and final aspect, the invention provides a method of determining the relative positions of camera means and an object having a plurality of markers, preferably at least three, mounted thereon, the relative positions of the markers and the object preferably being substantially fixed, each marker being patterned to encode identification information in binary form, preferably as a series of concentric rings, the method comprising storing the relative positions of the markers and information enabling the identification information to be decoded, identifying the positions of the markers and decoding the identification information thereon, and, based on the stored information, determining a measure of the relative positions and orientations of the camera means and the object. The method is preferably employed to determine the positions of several objects relative to the camera means, the objects being distinguished based on the identification information. The camera means may comprise several cameras having different viewpoints.