Operating a camera can often leave the user detached from the very event that they are trying to capture. Rather than participating in a family event such as a wedding, attending a sporting event, or enjoying a holiday, some people become so engrossed in the process of capturing these events on camera that they do not truly participate and instead observe most of the event through the camera's viewfinder.
Wearable cameras have been proposed in the scientific literature; see, for example, Starner, Schiele and Pentland, “Visual Contextual Awareness in Wearable Computing”, 2nd International Symposium on Wearable Computers, October 1998. Such a wearable camera is able to continually monitor the environment around a person and to capture scenes from it. Such a camera could, of course, be operated by the user, but it is preferable that the camera is continually active and analyses the scenes that it has acquired in order to determine whether or not each image is “interesting”. In this context, “interesting” means that the image would be of interest to the camera's owner.
Wearable cameras have no innate understanding of the environment around them. They therefore need to be trained to understand the visual (and other) cues presented to them in order to determine which images a user would want captured. The “rules” which a camera can apply in order to determine whether it should store an image can be considered as “behaviours”. The behaviours that a camera should apply can vary depending on the position of the camera and the activity that the camera is viewing. Thus, if the wearable camera were attached to a skier, interesting images would probably include those in which other objects were reasonably close to the skier. However, if the camera were attached to a hill-walker, panoramic views of scenery would probably be preferred. Furthermore, if a camera with a behaviour suitable for skiing were used inside a shopping mall or supermarket, almost all images would be likely to satisfy the condition of having objects sufficiently close to be considered interesting. The camera would therefore show insufficient discrimination and would probably capture images for nearly all of the time that the wearer was in the supermarket.
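By way of illustration only, the notion of activity-specific behaviours described above can be sketched as a set of rules keyed by activity. The feature names (nearest_object_m, scenery_fraction), the thresholds, and the function names below are hypothetical and are not taken from any cited work; the sketch merely shows why a skiing behaviour applied in a supermarket would store almost every frame.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass(frozen=True)
class FrameFeatures:
    """Hypothetical per-frame measurements (assumed, not from the source)."""
    nearest_object_m: float   # distance to the nearest object, in metres
    scenery_fraction: float   # fraction of the frame showing open scenery, 0..1


# A "behaviour" is a rule deciding whether a frame should be stored.
Behaviour = Callable[[FrameFeatures], bool]


def skiing_behaviour(frame: FrameFeatures) -> bool:
    # Assumed rule: frames with another object reasonably close are interesting.
    return frame.nearest_object_m < 5.0


def hill_walking_behaviour(frame: FrameFeatures) -> bool:
    # Assumed rule: panoramic frames dominated by scenery are interesting.
    return frame.scenery_fraction > 0.6


BEHAVIOURS: Dict[str, Behaviour] = {
    "skiing": skiing_behaviour,
    "hill_walking": hill_walking_behaviour,
}


def should_store(activity: str, frame: FrameFeatures) -> bool:
    """Apply the behaviour registered for the given activity to one frame."""
    return BEHAVIOURS[activity](frame)


def capture_rate(activity: str, frames: Sequence[FrameFeatures]) -> float:
    """Fraction of frames that the chosen behaviour would store."""
    if not frames:
        return 0.0
    stored = sum(1 for frame in frames if should_store(activity, frame))
    return stored / len(frames)


# Supermarket frames: shelves and shoppers are always close, little scenery.
supermarket = [FrameFeatures(nearest_object_m=d, scenery_fraction=0.05)
               for d in (0.8, 1.2, 1.5, 2.0, 3.0)]

# The skiing behaviour stores every supermarket frame, so it does not discriminate.
skiing_rate_in_supermarket = capture_rate("skiing", supermarket)
```

In this sketch the mismatch between behaviour and environment shows up as a capture rate close to one, which suggests that a behaviour should be selected to suit the camera's current environment.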
A teachable camera is disclosed in U.S. Pat. No. 5,227,835 assigned to Eastman Kodak Company. The teachable camera includes a template matching neural network which is responsive to inputs such as a focus sensor, an exposure sensor, a motion sensor and a flash control sensor, and also to a camera microprocessor, and which alters the performance of camera functions such as camera flash, shutter speed, lens focus, and aperture so that the camera characteristics are suited to the picture characteristics desired by the photographer. The neural network template can be altered by a rule-based expert system executing on a personal computer.
Workers such as Clarkson and Pentland, in “Unsupervised Clustering of Ambulatory Audio and Video”, Proceedings of the International Conference on Acoustics, Speech and Signal Processing, Phoenix, Ariz., 1998, have disclosed a wearable camera which uses hidden Markov models to determine the nature of the environment surrounding the camera. Thus, a camera having knowledge of the sort of images that it would see in a video store has been shown to recognise when it has entered another video store and to separate this environment from other events.
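By way of illustration only, environment recognition with hidden Markov models can be sketched as scoring a sequence of observations under one model per environment and selecting the model with the highest likelihood. This is a generic scaled forward algorithm over discrete symbols, not the method of the cited authors; the class name DiscreteHMM, the two-symbol observation alphabet, and all probabilities are assumptions chosen for the example.

```python
import math
from typing import Dict, List, Sequence


class DiscreteHMM:
    """Minimal discrete-observation HMM (illustrative, hypothetical parameters)."""

    def __init__(self, start: List[float], trans: List[List[float]],
                 emit: List[List[float]]) -> None:
        self.start = start   # start[i]: probability of beginning in state i
        self.trans = trans   # trans[i][j]: probability of moving from i to j
        self.emit = emit     # emit[i][k]: probability of symbol k in state i

    def log_likelihood(self, obs: Sequence[int]) -> float:
        """Log probability of obs, computed with the scaled forward algorithm."""
        if not obs:
            raise ValueError("observation sequence must not be empty")
        n = len(self.start)
        alpha = [self.start[i] * self.emit[i][obs[0]] for i in range(n)]
        log_p = 0.0
        for t in range(len(obs)):
            if t > 0:
                # Propagate one step, then weight by the emission probability.
                alpha = [
                    sum(alpha[i] * self.trans[i][j] for i in range(n))
                    * self.emit[j][obs[t]]
                    for j in range(n)
                ]
            scale = sum(alpha)
            if scale == 0.0:
                return float("-inf")
            log_p += math.log(scale)
            alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        return log_p


def classify(models: Dict[str, DiscreteHMM], obs: Sequence[int]) -> str:
    """Return the name of the environment model that best explains obs."""
    return max(models, key=lambda name: models[name].log_likelihood(obs))


# Assumed symbols: 0 = shelf-like texture, 1 = open sky.
MODELS = {
    "video_store": DiscreteHMM(
        start=[0.5, 0.5],
        trans=[[0.9, 0.1], [0.1, 0.9]],
        emit=[[0.9, 0.1], [0.7, 0.3]],
    ),
    "outdoors": DiscreteHMM(
        start=[0.5, 0.5],
        trans=[[0.9, 0.1], [0.1, 0.9]],
        emit=[[0.2, 0.8], [0.4, 0.6]],
    ),
}

indoor_guess = classify(MODELS, [0, 0, 0, 1, 0])
outdoor_guess = classify(MODELS, [1, 1, 1, 1])
```

In a practical system the model parameters would be learned from recorded data rather than written by hand, and the observations would be derived from audio and video features rather than from two hand-picked symbols.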