Traditionally, television (TV) production has featured the use of at least three cameras, with the video produced by those cameras seen in a control room on a monitor for each camera and selectable through a video switcher. As used here, a signal received from an audio or video device is sometimes referred to as a “feed.” A human director of a live TV program typically chooses camera views among these three or more feeds, while speaking into a microphone with wireless communication to the camera operators, to tell them how to position their cameras. Similarly, in live audio recording situations, many microphone feeds (i.e., audio signals received from the microphones), as well as direct electronic feeds from electronic instruments (e.g., electric guitars, keyboards, and such), go into a live audio mixer, and a front-of-house human engineer adjusts relative volumes and equalizations of the feeds to create a pleasing audio mix.
These processes have, in recent years, become far more difficult, most especially in the shooting and recording of live sporting events, as the number of camera positions has increased to a typical 18 or more per event, and microphone positions may number greater than 100. The director must control, among other things, proper positioning of cameras and/or microphones. The director does so with virtually no information regarding their current positions aside from the feeds themselves and the verbal position reports given by human operators, and with virtually no direct control of position aside from verbal instructions to the operators. This job is taxing, to say the least. It is generous to say that the control available to the director is “loose.”
The complexity of such systems is exacerbated by stereoscopic imaging, which increasingly places yet greater demands on the director's and the audience's understanding of where action takes place within observable three-dimensional (3D) space. 3D broadcasts typically involve more cameras and more microphones than conventional, two-dimensional (2D) broadcasts. Accordingly, the complexity of the director's task of operator control grows dramatically.
Beyond the complexity of more devices to control, 3D broadcasting poses additional challenges.
One such challenge is that of convergence. In 3D video viewing, each object can appear to be at a depth that is not at the surface of the video display. One of the challenges of 3D video production is the stability of an object's perceived depth from shot to shot. For example, without careful control, a stationary object in a 3D video can appear to move or even jump toward or away from the viewer. The effect can be very distracting and annoying to the viewer, much like the perpetual zooming in and out by some amateur videographers. Convergence refers to the relative horizontal positioning of left- and right-eye images meant to be viewed simultaneously (or in rapid sequence, utilizing the phenomenon of persistence of vision, to seem simultaneous); proper alignment of convergence from shot to shot lends stability to the scene in terms of its depth, along what is generally referred to as the z-axis of the three-dimensional display. As used here, a shot is an uncut, uninterrupted video scene captured by a camera.
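The relationship between horizontal positioning and perceived depth can be illustrated with the standard similar-triangles geometry of stereoscopic viewing. The following is a minimal sketch for illustration only, not a method described in this document. The function names, the 3.0 m viewing distance, and the 0.065 m eye separation are assumptions chosen for the example. Parallax is taken here as the on-screen horizontal position of a point in the right-eye image minus its position in the left-eye image.

```python
def perceived_distance(parallax_m, viewing_distance_m=3.0, eye_separation_m=0.065):
    """Distance from the viewer at which a point appears to lie.

    parallax_m is (right-eye x) - (left-eye x) on the screen, in meters.
    Zero parallax places the point on the screen surface, positive
    parallax places it behind the screen, negative parallax in front.
    The default viewing distance and eye separation are assumed values.
    """
    if parallax_m >= eye_separation_m:
        # The lines of sight are parallel or diverging: no finite depth.
        return float("inf")
    return viewing_distance_m * eye_separation_m / (eye_separation_m - parallax_m)


def shift_convergence(parallax_m, shift_m):
    """Sliding one eye's image horizontally adds a constant to every parallax."""
    return parallax_m + shift_m


if __name__ == "__main__":
    # The same stationary object, shown in two shots whose convergence
    # differs by a 20 mm horizontal offset (hypothetical numbers).
    shot_a = 0.010
    shot_b = shift_convergence(shot_a, 0.020)
    print(perceived_distance(shot_a), perceived_distance(shot_b))
```

Under these assumptions, a uniform horizontal offset between two shots changes the apparent distance of every object in the scene, which is why a stationary object can seem to jump in depth across a cut.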
Conventionally, convergence is controlled by a human operator who watches the video and manually adjusts the separation of the left and right views to maintain a relatively consistent perceived depth. Doing so in a live video production with numerous 3D video feeds is simply impractical, and yet that is what has been done in all conventional multi-camera 3D TV video production. Because of this impracticality, the number of camera positions used in 3D video production has, thus far, been far smaller than the number commonly used in 2D video production.
Another challenge posed by 3D video production is that of vertical alignment of left and right view cameras. The two cameras capturing left and right views for the left and right eyes of the viewer ought to be precisely aligned vertically, i.e., pointing at precisely the same elevation. However, even if the cameras and/or lenses are carefully aligned at the beginning of a shot, the left- and right-eye images can become misaligned during the shot.
In particular, the focal center of the camera's lens can move slightly off center as the elements of the lens move during zooming in and out. Accordingly, the line of sight of the camera can vary in the vertical direction, usually slightly but sometimes significantly, particularly at high zoom magnification. This can result in an object appearing to the human viewer to be slightly higher or lower in the left eye than in the right eye. This effect can be very distracting and annoying, and even painful, to the viewer.
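To make the notion of vertical misalignment concrete, the following minimal sketch estimates how many rows one view is displaced relative to the other by comparing the average brightness of each row. It is an illustration only, not a technique described in this document. The function names and the representation of an image as a list of rows of grayscale values are assumptions made for the example.

```python
def row_profile(image):
    """Average brightness of each row of a grayscale image (list of rows)."""
    return [sum(row) / len(row) for row in image]


def estimate_vertical_offset(left, right, max_shift=5):
    """Return dy such that row (y + dy) of `right` best matches row y of `left`.

    A brute-force search over integer shifts, scored by the mean absolute
    difference between the two row-brightness profiles where they overlap.
    Both images are assumed to have the same number of rows.
    """
    left_rows, right_rows = row_profile(left), row_profile(right)
    n = len(left_rows)
    best_dy, best_err = 0, float("inf")
    for dy in range(-max_shift, max_shift + 1):
        pairs = [(left_rows[y], right_rows[y + dy])
                 for y in range(n) if 0 <= y + dy < n]
        if not pairs:
            continue
        err = sum(abs(a - b) for a, b in pairs) / len(pairs)
        if err < best_err:
            best_dy, best_err = dy, err
    return best_dy


def make_test_image(band_start, rows=12, cols=4):
    """Synthetic image: dark background with a two-row bright band."""
    return [[200 if band_start <= y < band_start + 2 else 10] * cols
            for y in range(rows)]


if __name__ == "__main__":
    left = make_test_image(band_start=4)
    right = make_test_image(band_start=6)  # same scene, two rows lower
    print(estimate_vertical_offset(left, right))
```

A nonzero result indicates that the same scene content sits at different heights in the two views. A practical system would need sub-pixel accuracy and robustness to the horizontal differences that are inherent in stereoscopic views, which this sketch does not attempt.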
A third challenge posed by 3D video production is the unavailability of a 3D view in some instances. In complex productions such as live sporting events, it may be impractical to equip the venue with 3D video setups throughout. Accordingly, some aspects of the production may be available only in two-dimensional, flat video. That 2D video will likely be shot in a different style than would be acceptable for 3D viewing, since audiences tend to prefer much less cutting from shot to shot, and longer shots, in 3D than in 2D.
Conversely, because video producers need to create both 2D and 3D versions of the same events, and since different cutting styles are typically wanted for the two versions, a fourth challenge in 3D shooting is that the common procedure of taking one of the 3D eye views as the 2D version results in less exciting 2D content than could have been achieved if the 2D version were created separately.