1. Field of the Invention
The present invention relates to a video signal generating apparatus equipped with a camera and to a video signal receiving apparatus for receiving the video signal generated by that video signal generating apparatus. More particularly, the invention relates to a system which inserts human-readable text data into the video signal in response to operations performed on the video signal generating apparatus or on the camera attached thereto.
2. Description of the Related Art
[Original Purpose of the EssenceMark (Registered Trademark)]
SMPTE (Society of Motion Picture and Television Engineers) specifies KLV (Key-Length-Value) as the basic structure in which to describe metadata. As shown in FIG. 1, metadata is described as a byte string having a Key-Length-Value structure. In this structure, “Key” is a 16-byte label indicative of a metadata type. All keys are registered in an SMPTE metadata dictionary. “Value” denotes the actual value of the metadata indicated by the key. “Length,” which precedes “Value,” gives the data length of “Value” in bytes.
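By way of illustration only, the Key-Length-Value layout described above may be sketched in Python as follows. The sketch assumes that “Length” is BER-encoded (a single byte for lengths below 128, otherwise a byte count followed by that many big-endian bytes), which the text above does not state. The key `EXAMPLE_KEY` is a made-up placeholder and is not a key registered in the SMPTE metadata dictionary; the function names are likewise hypothetical.

```python
from typing import Iterator, Tuple

KEY_SIZE = 16  # per the text above: the key is a 16-byte label

# Hypothetical placeholder key; real keys come from the SMPTE metadata dictionary.
EXAMPLE_KEY = bytes(range(16))


def encode_length(n: int) -> bytes:
    """Encode a length in BER form (an assumption of this sketch)."""
    if n < 0x80:
        return bytes([n])  # short form: the length fits in one byte
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body  # long form: byte count, then bytes


def encode_klv(key: bytes, value: bytes) -> bytes:
    """Concatenate Key, Length and Value into one packet."""
    if len(key) != KEY_SIZE:
        raise ValueError("key must be 16 bytes")
    return key + encode_length(len(value)) + value


def decode_klv(buf: bytes) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (key, value) pairs from a buffer of back-to-back KLV packets."""
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < KEY_SIZE + 1:
            raise ValueError("truncated KLV packet")
        key = buf[pos:pos + KEY_SIZE]
        pos += KEY_SIZE
        first = buf[pos]
        pos += 1
        if first < 0x80:
            length = first
        else:
            count = first & 0x7F
            if count == 0 or pos + count > len(buf):
                raise ValueError("unsupported or truncated length field")
            length = int.from_bytes(buf[pos:pos + count], "big")
            pos += count
        if pos + length > len(buf):
            raise ValueError("truncated KLV packet")
        yield key, buf[pos:pos + length]
        pos += length


# Usage: one packet whose value is the ASCII text "_RecStart".
packet = encode_klv(EXAMPLE_KEY, b"_RecStart")
pairs = list(decode_klv(packet))
```

Because the length travels with each packet, a reader that does not recognize a given key can still skip over its value and continue with the next packet.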
The applicant of the present invention has proposed metadata called “EssenceMark” (registered trademark), referred to as EssenceMark™ hereunder. The metadata was devised as a variation of “Term Value,” one of the KLV metadata types according to SMPTE, and is used as a bookmark for AV (audio video) materials. The proposed metadata is described by Yoshiaki Shibata, Takumi Yoshida and Mitsutoshi Shinkai in “EssenceMark—SMPTE Standard-based Textual Video Marker—,” SMPTE Motion Imaging Journal Vol. 114, No. 12, pp. 463 to 473, December 2005 (Non-Patent Document 1), and in “EssenceMark,” an article in the September 2006 issue of the Journal of the Motion Picture and Television Engineering Society of Japan Inc. (Non-Patent Document 2). The use of EssenceMark™ involves attaching human-readable text data directly or indirectly to desired frames of an AV material. The primary objective of EssenceMark™ is to provide high-speed, selective access to desired locations in the material according to text data values. Illustratively, a particular location furnished with EssenceMark™ is instantly accessed, or all frames carrying a specific EssenceMark™ are extracted and displayed as thumbnails (as shown in FIG. 5 of the above-cited Non-Patent Document 2). The manner in which multi-byte characters such as those of the Japanese language are stored using EssenceMark™ is precisely established.
SMPTE stipulates that for the baseband transmission of an AV material in SDI (Serial Digital Interface) format, KLV metadata packets are to be stored as ancillary data in VBI (vertical blanking interval) areas of the material (SMPTE 259M, 292M, 291M). As a variation of the KLV metadata, EssenceMark™ in SDI format is thus transmitted as ancillary data placed in the VBI areas of the frames to which the marker is attached.
Reserved words are part of the specifications of EssenceMark™. The words were first introduced as a common vocabulary to prevent frequently used EssenceMark™ values (i.e., values shown in FIG. 1) from being described in varying ways. FIG. 2 gives a list of the reserved words. Each reserved word begins with an underscore (“_”). For example, the word “_RecStart” is a reserved word that designates a recording start point of an AV material. Where videotapes are patched together for consecutive recording, the recording start location of each videotape may be marked by this reserved word. The word “_RecEnd” is a reserved word that designates a recording end point of the AV material. “_ShotMark1” and “_ShotMark2” are reserved words that specify points of interest of the AV material.
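The reserved-word convention, together with the value-based access described earlier, may be sketched in Python as follows. This is an illustration under stated assumptions and not a definitive implementation: the set below contains only the four reserved words named in the text (FIG. 2 lists the full set), and the frame numbers, the free-text value “Interview,” and the function names are made up for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Only the reserved words named in the text above; the full list is in FIG. 2.
KNOWN_RESERVED = {"_RecStart", "_RecEnd", "_ShotMark1", "_ShotMark2"}


def is_reserved_style(value: str) -> bool:
    """True if the value follows the reserved-word convention (leading underscore)."""
    return value.startswith("_")


def index_by_value(marks: List[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Map each text value to the sorted frame numbers that carry it."""
    index: Dict[str, List[int]] = defaultdict(list)
    for frame, value in marks:
        index[value].append(frame)
    return {value: sorted(frames) for value, frames in index.items()}


# Hypothetical (frame number, text value) pairs for one clip.
marks = [
    (0, "_RecStart"),
    (450, "_ShotMark1"),
    (120, "_ShotMark1"),
    (300, "Interview"),  # free text, not a reserved word
    (900, "_RecEnd"),
]

index = index_by_value(marks)
shot_frames = index["_ShotMark1"]  # frames to show as thumbnails, for example
```

Indexing by text value is what makes it cheap to jump to one marked location or to collect every frame that carries the same value.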
Any text data may be designated as an EssenceMark™ value. In practice, however, the most useful values are text values that are meaningful by themselves. FIG. 3 is the same as FIG. 2 in the above-cited Non-Patent Document 1 (and equivalent to FIG. 1 in the above-cited Non-Patent Document 2), schematically showing an AV material (video clips) furnished with EssenceMark™. FIG. 4 is the same as FIG. 6 in the above-cited Non-Patent Document 1, showing a list of EssenceMark™ values in XML (Extensible Markup Language) attached to clip 1 in FIG. 3. This list of EssenceMark™ values attached to the AV material and expressed as a human-readable XML document gives an at-a-glance picture of typical frames furnished with particular EssenceMark™ values having their own meanings.
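How such an XML list might be read by a program can be sketched in Python as follows. The element and attribute names (`marks`, `mark`, `clip`, `frame`, `value`), the frame numbers, and the free-text value are all invented for illustration; they are not taken from FIG. 4 or from any EssenceMark™ schema.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical layout: names and values below are assumptions of this sketch.
SAMPLE_XML = """\
<marks clip="clip1">
  <mark frame="0" value="_RecStart"/>
  <mark frame="120" value="_ShotMark1"/>
  <mark frame="300" value="Interview"/>
  <mark frame="900" value="_RecEnd"/>
</marks>
"""


def read_marks(xml_text: str) -> List[Tuple[int, str]]:
    """Return (frame number, text value) pairs in document order."""
    root = ET.fromstring(xml_text)
    return [(int(m.get("frame")), m.get("value")) for m in root.iter("mark")]


def frames_with(marks: List[Tuple[int, str]], value: str) -> List[int]:
    """Return the frame numbers whose mark equals the given text value."""
    return [frame for frame, v in marks if v == value]


clip_marks = read_marks(SAMPLE_XML)
interview_frames = frames_with(clip_marks, "Interview")
```

Since the values are ordinary text, the same list can be read by a person in a text editor and by a program with a standard XML parser.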
Generally, when video recording equipment (e.g., optical disk recorders, VCRs) is said to comply with EssenceMark™, that means the equipment is capable of simultaneously recording images and EssenceMark™ associated with the images. Earlier, this applicant proposed techniques for attaching text data such as EssenceMark™ to video materials as they are picked up and recorded in real time so that at a later stage, relevant image scenes furnished with the text data may be instantly accessed, displayed, or otherwise edited efficiently on an editing apparatus (see Japanese Patent Laid-Open No. 2003-299010). However, such video recording equipment is incapable of starting or stopping the recording of images in accordance with EssenceMark™ attached to video materials.
[Typical Endoscope (Camera) and Video Recording Apparatus]
At present, a medical endoscope (i.e., camera) is typically connected with a video recording apparatus (e.g., optical disk recorder, VCR) by use of two cables: a coaxial cable for transmitting image signals in HD-SDI format, and an RS232C cable for controlling the downstream video recording apparatus. This setup poses no problem when the camera and the video recording apparatus are connected on a one-to-one basis. Problems can arise if the two apparatuses constitute part of a larger system. For example, there may be a system that causes the moving images recorded by the video recording apparatus to be fed to a diagnostic program for analysis and allows the result of the analysis to be stored into a separate PACS (picture archiving and communication system for storage and distribution of clinical images). In this system, the video recording apparatus is necessarily connected to a communication network. Whenever it is desired to view original moving images on a main monitor of the system, the image reproduction should be controlled by the personal computer (PC) carrying the diagnostic program in use; it is not practical to bring the camera to the person who wants to view the images (it is assumed that the video recording apparatus is not at hand but controlled remotely). In other words, whereas the operations directly related to image pickup and recording are to be controlled by the camera, unrelated operations such as reproduction, fast-forward, rewind, and random access to desired locations should preferably be performed through a separate interface and not from the camera.
Typical video recording apparatuses are designed to handle all control commands via the RS232C cable as mentioned above. This makes it difficult for any such apparatus to function as a subsystem in a larger system configuration. At least two RS232C cables could be provided, one for the usual connection with the camera and the other for connection with the PC carrying the diagnostic program. Such a setup, however, would demand establishing new provisions regulating cases in which, illustratively, competing control commands are simultaneously input through these RS232C cables (the present protocol has no provisions against such contingencies). Arrangements involving the additional installation of redundant interface circuits would lead to a considerable cost increase. Having to install both the coaxial cable and the RS232C cable between the camera and the video recording apparatus is bothersome in the first place.
[Connection Between the Camera and the Video Recording Apparatus]
This applicant has filed another patent application (Japanese Patent Application No. 2005-77965), undisclosed at the time of this filing. The above application proposes a technique to be used as follows: there are cases where a plurality of broadcasting stations are represented by a single master station that picks up and records videos for coverage and distributes them to the other stations subordinate to the master station. In such cases, the proposed technique enables the video camera of the master station creating a material videotape to have the subordinate stations simultaneously create the same material videotape on their VCRs. Specifically, video signals transmitted by the video camera to at least one video recording apparatus connected to that video camera are multiplexed with control signals. The video recording apparatus or apparatuses under the master station detect the control signals in the incoming video signals and control their internal video recording facilities accordingly.
However, the above-cited Patent Application makes use of values that are not meaningful by themselves (i.e., human-unreadable) as control information. Such human-unreadable information is stored in the user area (unstandardized and designed to accommodate user-specific specifications) of ancillary data packets according to SMPTE 291M. When placed in the user area, the human-unreadable information can only be interpreted correctly by a video recording apparatus complying with the proposed invention of the filed application. No other video recording apparatus is capable of determining what the information in the user area signifies. There is no interoperability between apparatuses built to different specifications. The proposed invention is therefore difficult to implement as anything more than an apparatus-specific functional extension.
According to the above-cited Patent Application, control information is limited to basic control commands (e.g., recording start, recording end, fast-forward, rewind) of the video recording apparatus. That is because the sole purpose of the technique is to carry out a limited task: enabling the master video camera creating a material videotape to have the subordinate VCRs create the same material videotape simultaneously.