1. Field of the Invention
This invention relates generally to the field of motion capture. More particularly, the invention relates to an apparatus and method for improving marker identification within a motion capture system.
2. Description of the Related Art
“Motion capture” refers generally to the tracking and recording of human and animal motion. Motion capture systems are used for a variety of applications including, for example, video games and computer-generated movies. In a typical motion capture session, the motion of a “performer” is captured and translated to a computer-generated character.
As illustrated in FIG. 1a, in a motion capture system, a plurality of motion tracking “markers” (e.g., markers 101, 102) are attached at various points on the body of a performer 100. The points are selected based on the known limitations of the human skeleton. Different types of motion capture markers are used for different motion capture systems. For example, in a “magnetic” motion capture system, the motion markers attached to the performer are active coils which generate measurable disruptions (x, y, z and yaw, pitch, roll) in a magnetic field. By contrast, in an optical motion capture system, such as that illustrated in FIG. 1a, the markers 101, 102 are passive spheres composed of retro-reflective material, i.e., a material which reflects light back in the direction from which it came, ideally over a wide range of angles of incidence. A plurality of cameras 120, 121, 122, each with a ring of LEDs 130, 131, 132 around its lens, are positioned to capture the LED light reflected back from the retro-reflective markers 101, 102 and other markers on the performer. Ideally, the retro-reflected LED light is much brighter than any other light source in the room. Typically, a thresholding function is applied by the cameras 120, 121, 122 to reject all light below a specified level of brightness. Ideally, this isolates the light reflected off of the retro-reflective markers from any other light in the room, so that the cameras 120, 121, 122 capture only the light from the markers 101, 102 and other markers on the performer.
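The thresholding function described above can be illustrated with a minimal sketch. The frame representation (a list of rows of 8-bit brightness values), the function name `threshold_frame`, and the cutoff of 200 are assumptions chosen for illustration only; the actual thresholding performed inside the cameras is not specified here.

```python
def threshold_frame(frame, level):
    """Reject all light below `level`: darker pixels become 0, brighter ones are kept."""
    return [[pixel if pixel >= level else 0 for pixel in row] for row in frame]


# Hypothetical 4x4 frame: dim room light (values of about 10 to 60) and one
# bright retro-reflected marker occupying a 2x2 patch (values of about 250).
frame = [
    [10,  12,  11,  9],
    [10, 250, 245, 12],
    [11, 248, 251, 40],
    [ 9,  10,  60, 12],
]

# The cutoff of 200 is an assumed value, not one taken from any real system.
marker_only = threshold_frame(frame, level=200)
```

After thresholding, only the four bright marker pixels remain non-zero, which is the ideal case in which the markers are the brightest objects in the room.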
A motion tracking unit 150 coupled to the cameras is programmed with the relative position of each of the markers 101, 102 and/or the known limitations of the performer's body. Using this information and the visual data provided from the cameras 120-122, the motion tracking unit 150 generates artificial motion data representing the movement of the performer during the motion capture session.
A graphics processing unit 152 renders an animated representation of the performer on a computer display 160 (or similar display device) using the motion data. For example, the graphics processing unit 152 may apply the captured motion of the performer to different animated characters and/or include the animated characters in different computer-generated scenes. In one implementation, the motion tracking unit 150 and the graphics processing unit 152 are programmable cards coupled to the bus of a computer (e.g., the PCI and AGP buses found in many personal computers). One well-known company which produces motion capture systems is Motion Analysis Corporation (see, e.g., www.motionanalysis.com).
FIG. 1b illustrates an exemplary motion capture camera 110. The camera 110 includes an illuminating ring 111 for directing light at the retro-reflective markers and a lens 112 for capturing light reflected off of the retro-reflective markers. As shown in the front view of the camera, the illuminating ring 111 generates light using a plurality of light emitting diodes (“LEDs”) 113 distributed along the front surface of the ring (i.e., the surface facing the markers). LEDs are particularly useful for this application because they are capable of generating light that is projected in a particular direction. The lens 112 passes through the center of the illuminating ring 111, as illustrated. Cameras such as this are available from a variety of companies including Vicon (www.vicon.com) and Motion Analysis (www.motionanalysis.com).
FIG. 2 illustrates a bird's eye view of a motion capture session with a performer 100. The performer's head is identified as 205; the performer's arms are identified as 201 and 202; the performer's hands are identified as 203 and 204; and two retro-reflective markers are identified as 206 and 207.
As illustrated generally in FIG. 2, a significant number of cameras 210, 220, 230, 240, 250, 260, 270 and 280 may be used for a given motion capture session. For example, in the movie “Polar Express,” recently released by Warner Bros. Pictures, as many as 64 cameras were used to capture certain scenes. Given the significant number of cameras used for these scenes, it becomes very likely that each camera will have several other cameras within its field of view. By way of example, camera 210 in FIG. 2 has three different cameras 240, 250, and 260 within its field of view.
One problem which results from this configuration is illustrated in FIGS. 3-4, which show light rays 311 and 314 emanating from the illuminating ring of camera 210 and light rays 351 and 354 emanating from the illuminating ring of camera 250. Light ray 311 hits retro-reflective marker 206 and reflects directly back (or almost directly back) to camera 210 as light ray 312. The position of the retro-reflective marker 206 may then be identified and processed as described above. Similarly, light ray 351 hits retro-reflective marker 207 and reflects directly back to camera 250 as light ray 352. However, instead of hitting a retro-reflective marker, light ray 314 is directed into the lens of camera 250 and light ray 354 is directed into the lens of camera 210. Since the light rays 314 and 354 are projected directly from an illuminating ring into a camera lens, they appear as very bright objects, as bright as or even brighter than the light retro-reflected from markers 206 and 207. As a result, the thresholding function of the motion capture system does not reject light rays 314 and 354, and cameras 210 and 250 capture more than just marker images.
FIG. 5 illustrates the view from camera 210, and FIG. 6 illustrates the elements which are captured by camera 210 when a thresholding function is applied that eliminates all but the brightest objects in an effort to isolate the retro-reflective markers (the objects which are eliminated from the scene due to the thresholding function are shown as dotted lines). As illustrated, in addition to capturing light from the retro-reflective markers on the performer's body (such as marker 501), camera 210 also captures light emitted from the illuminating rings 541, 551, and 561 (i.e., because light from the illuminating rings will tend to be as bright as or brighter than the light reflected off of the retro-reflective markers).
This is a problem because the motion capture logic associated with camera 210 may misinterpret light ray 354 as a retro-reflective element, and the motion capture logic associated with camera 250 may misinterpret light ray 314 as a retro-reflective element. As a result, following a performance, a significant amount of “clean up” is typically required, during which computer programmers or animators manually identify and remove each of the misinterpreted elements, resulting in significant additional production costs.
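The misinterpretation described above can be sketched as follows. The example groups above-threshold pixels into connected blobs, a simplified stand-in for whatever marker detection a real motion capture system performs. The function name `find_bright_blobs`, the frame contents, and the cutoff of 200 are hypothetical and chosen only for illustration.

```python
def find_bright_blobs(frame, level):
    """Group pixels at or above `level` into 4-connected blobs of (row, col) pairs."""
    rows, cols = len(frame), len(frame[0])
    seen = set()
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if frame[r][c] < level or (r, c) in seen:
                continue
            # Flood-fill outward from this bright pixel to collect one blob.
            stack = [(r, c)]
            seen.add((r, c))
            blob = []
            while stack:
                y, x = stack.pop()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and frame[ny][nx] >= level):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            blobs.append(blob)
    return blobs


# Hypothetical frame: the pixel with value 240 is light retro-reflected from a
# marker; the two pixels with value 255 are light shining directly from another
# camera's illuminating ring; the remaining values are dim room light.
frame = [
    [10,  10, 10, 10,  10, 10],
    [10, 240, 10, 10, 255, 10],
    [10,  10, 10, 10, 255, 10],
    [10,  10, 80, 10,  10, 10],
]

blobs = find_bright_blobs(frame, level=200)
```

Both bright regions survive the threshold, so the sketch reports two candidate markers even though only one of them is a retro-reflective marker. Brightness alone gives no basis for telling them apart.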
Current motion capture studios attempt to address this problem by positioning the cameras carefully so that no camera lies within the field of view of any other camera. For example, cameras 240, 250, and 260 may be removed from the field of view of camera 210 if they are positioned at a significantly different elevation than camera 210, or if camera 210 is aimed at an angle which does not have any other cameras in its field of view. However, even with careful positioning, in a production which utilizes a significant number of cameras (e.g., 64), some cameras will almost certainly have other cameras within their field of view. As such, improved techniques for limiting the number of misinterpreted markers within a motion capture system are needed.
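The positioning check described above can be sketched geometrically. This sketch approximates a camera's field of view as a circular cone around its aiming direction, which is a simplification (a real lens and sensor see a rectangular region). The function name `in_field_of_view`, the coordinates, and the 60-degree field of view are assumptions for illustration only.

```python
import math


def in_field_of_view(cam_pos, cam_aim, fov_deg, point):
    """Return True if `point` lies inside a cone of full angle `fov_deg`
    centered on the aiming direction `cam_aim` of a camera at `cam_pos`."""
    to_point = [p - c for p, c in zip(point, cam_pos)]
    dist = math.sqrt(sum(v * v for v in to_point))
    aim_len = math.sqrt(sum(v * v for v in cam_aim))
    if dist == 0 or aim_len == 0:
        return False  # degenerate input: no direction to compare
    cos_angle = sum(a * b for a, b in zip(cam_aim, to_point)) / (dist * aim_len)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return math.degrees(math.acos(cos_angle)) <= fov_deg / 2


# Hypothetical layout, in meters: the viewing camera is 2 m above the floor and
# aimed horizontally along the x axis with an assumed 60-degree field of view.
viewer_pos = (0.0, 0.0, 2.0)
viewer_aim = (1.0, 0.0, 0.0)

same_height = in_field_of_view(viewer_pos, viewer_aim, 60.0, (10.0, 0.0, 2.0))
raised = in_field_of_view(viewer_pos, viewer_aim, 60.0, (10.0, 0.0, 9.0))
```

In this layout the camera directly opposite at the same elevation falls inside the cone, while the camera raised by 7 m sits about 35 degrees off-axis and falls outside it. Running such a check for every pair of cameras also makes clear why a 64-camera production is unlikely to avoid every overlap.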