Audio data representing recordings of sound associated with physical environments are increasingly being stored in digital form, for example in computer memories. This is partly due to the increased use of desktop computers, digital sound recording equipment and digital camera equipment. One of the main advantages of providing audio and/or image data in digital form is that it can be edited on a computer and output to an appropriate data output device so as to be played. Increasingly common is the use of personal sound capture devices that comprise an array of microphones to record a sound scene that a given person is interested in recording. The well-known camcorder type device is configured to record visual images associated with a given environmental scene, and such devices may be used in conjunction with an integral personal sound capture device so as to create a visual and audio recording of a given environmental scene. Frequently such camcorder type devices are used so that the resultant image and sound recordings can be played back at a later date to colleagues of, or friends and family of, an operator of the device. Camcorder type devices may frequently be operated to record one or more of: sound only, static images or video (moving) images.
With advances in technology, sound capture systems that capture spatial sound are also becoming increasingly common. By a spatial sound system is meant, in broad terms, a sound capture system that conveys some information concerning the location of perceived sound in addition to the mere presence of the sound itself. The environment in respect of which such a system records sound may be termed a “soundscape” (or a “sound scene” or “sound field”) and a given soundscape may comprise one or a plurality of sounds. The complexity of the sound scene may vary considerably depending upon the particular environment in which the sound capture device is located.
A further source of sound and/or image data is data produced in a virtual world by a suitably configured computer program. Computer-generated sound and/or image sequences may comprise spatial sound.
Because such audio and/or image data is increasingly being obtained by a variety of people, there is a need to provide improved methods and systems for manipulating the data obtained. An example of a system that provides motion picture generation from a static digital image is that disclosed in European patent publication no. EP 1235182, in the name of Hewlett-Packard Company and incorporated herein by reference. Such a system concerns improving digital images so that they inherently hold the viewer's attention for a longer period of time, and the method described therein provides for desktop type software implementations of “rostrum camera” techniques. A conventional rostrum camera is a film or television camera mounted vertically on a fixed or adjustable column, typically used for shooting graphics or animation; the techniques in question are those for producing moving images of the type that can typically be obtained from such a camera. The system described in EP 1235182 provides zooming and panning across static digital images.
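By way of illustration only, the general idea of zooming and panning across a static image can be sketched as a sequence of crop windows that move and shrink (or grow) over the image, each window yielding one frame of the resulting moving image. The sketch below is not the method of EP 1235182; the names `View`, `rostrum_path` and `to_box`, and the choice of simple linear interpolation, are assumptions made here for the purpose of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    """Hypothetical crop window on a still image: centre and size in pixels."""
    cx: float
    cy: float
    w: float
    h: float


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def rostrum_path(start: View, end: View, n_frames: int) -> list[View]:
    """Return n_frames crop windows moving linearly from start to end.

    Moving the centre produces a pan; changing the size produces a zoom.
    """
    if n_frames < 2:
        raise ValueError("need at least two frames")
    path = []
    for i in range(n_frames):
        t = i / (n_frames - 1)
        path.append(View(
            lerp(start.cx, end.cx, t),
            lerp(start.cy, end.cy, t),
            lerp(start.w, end.w, t),
            lerp(start.h, end.h, t),
        ))
    return path


def to_box(view: View) -> tuple[int, int, int, int]:
    """Convert a View to a (left, top, right, bottom) integer pixel box."""
    left = round(view.cx - view.w / 2)
    top = round(view.cy - view.h / 2)
    return (left, top, left + round(view.w), top + round(view.h))


# Example: start on the whole of an 800x600 image, end on a 400x300 region.
whole = View(cx=400, cy=300, w=800, h=600)
detail = View(cx=600, cy=200, w=400, h=300)
path = rostrum_path(whole, detail, n_frames=5)
boxes = [to_box(v) for v in path]
```

Each box in `boxes` would then be cropped from the still image and resized to a common output size by an image library of choice to form one video frame; that step is omitted here. Linear interpolation is used only because it is the simplest choice; an eased path would give smoother starts and stops.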