The present invention relates to the production and editing of image data, and in particular to an image data editing method, an editing apparatus and a computer program product for carrying out the method, which are used to produce a television broadcast program or a video program.
In recent years, image data editing systems have been developed that use a computer to produce image data from CG animation, moving image data, still image data, text data, audio data and sound synthesis data by combining the techniques of animation, sound synthesis and moving image reproduction.
Further, an image data editing method has been conceived in which a TV program is described as a script in time series in the same way as writing a scenario, and the script is interpreted by the computer to create a TV program. According to this method, image data can be interactively edited easily even by a person who has thus far been engaged in the preparation of a program schedule table. In this image data editing method, the editing state is displayed on the display screen and the GUI (graphical user interface) operation can be performed for automatically producing the script.
Examples of the image data editing system for creating a TV program interactively between the user and the computer are disclosed in “Desk-top TV Program Creation—TVML (TV Program Making Language) Editor—”, Ueda et al., Association for Computing Machinery, September 1998; “Program Creation/interactive Editing System Based on TV Program Making Language TVML”, by Toshiaki Yokoyama, et al., 3rd Intelligence Information Media Symposium, December 1997; “Development of Man-Machine Interface of TV Program Making Language TVML”, by Toshiaki Yokoyama, et al., September 1997 Society Convention of The Institute of Electronics, Information and Communication Engineers; and “Program Making Language TVML Making Possible Desk Top Creation of Personal TV Program”, by Masaki Hayashi, Broadcast Technologies, January 1999, pp. 139-144.
In the TVML editor disclosed in these references, a program for a virtual studio, i.e. a CG (computer graphics) studio can be created with a computer and a screen connected thereto using the animation characters (CG characters) stored in a large-capacity random access memory such as a hard disk drive, a CG studio set image data library, a voice synthesis tool and an animation production tool without using a real studio or actors/actresses.
An image of a program being edited or created can be displayed on the editing screen of the TVML editor. The image displayed is the one viewed from a preset direction of projection, i.e. the eye of the camera in a virtual studio. During the creation or editing of a program image, processing is required for moving CG characters and other CG objects on stage and for setting the speech and the motion of each CG character accurately in conformance with each other. For changing the set data of the CG characters in the virtual studio, it is necessary to open the setting input screen and input data from the keyboard or the like. This work for changing the setting, which requires opening different windows and inputting data from the keyboard a number of times, is complicated and inefficient.
The editing screen used in the conventional image data editing method will be explained with reference to FIG. 6. FIG. 6 shows an example of the screen of the conventional TV program editing device displayed on the monitor. Numeral 201 designates an editing window, numerals 202, 202′ studio blocks for setting the speech and motion of CG characters and the camera to image the interior of the CG studio, numeral 203 a movie block, numeral 204 a title block, numeral 205 a superimposition block, numeral 206 a sound block, numeral 207 a narration block, numeral 208 a block for miscellaneous setting, numeral 209 event marks, numeral 210 a monitor window, numerals 211, 212 representative screens, numerals 213, 214 slider sections, numeral 215 a start block and numeral 220 a menu bar.
On the left side of the editing window 201 shown in FIG. 6, the image output on the display screen is indicated by a vertical column having the studio block 202, the movie block 203, the title block 204, the studio block 202′, etc. In the editing window 201, the ordinate represents the time and the work of TV program creation is conducted downward on the display screen.
FIG. 7 is an enlarged view of the studio block 202 shown in FIG. 6. Numeral 202 designates the studio block, numeral 301 a speech setting section for setting the speech, voice type, etc. of the CG characters speaking in the CG studio, numeral 302 a motion setting section for arranging and setting the motion of the CG characters walking or otherwise behaving, numeral 303 a camera work setting section for designating the camera work, and numeral 304 a studio set-up button for setting the initial values of the positions of the CG characters and the camera in the studio, the background of the CG studio, properties and scenery and the combination thereof. The set information of the CG studio, the speech and motion of the CG characters and the camera work information are displayed in the studio block 202.
Returning to FIG. 6, the movie block 203 is a section for setting the operation that controls the reproduction of a moving image already edited and prepared in advance, and displays the file name of the moving image and other information. By clicking the representative screen 211 on the left side of the movie block 203 with the mouse, for example, a movie setting window (not shown) pops up on the display screen. By editing and setting in the movie setting window, the reproduction, rapid feed and rewinding of the moving image are controlled and, at the same time, the in-points and out-points and the timing of superimposition, narration, speech, etc. are designated. The title block 204 is a section in which the display of the text information and a still image on the TV receiver screen or the movie screen, for example, is controlled. When the representative screen 212 on the left side of the title block 204 is clicked, for example, a title window (not shown) pops up on the display screen and the editing of the title screen is made possible.
A superimposition block 205 is a section where the superimposition of text combined with the image output on the TV receiver or the movie screen is controlled, and a sound block 206 is a section in which the background music (BGM) or similar music combined with the image is controlled. A narration block 207, on the other hand, is a section in which a narration is combined with the moving image or the like being reproduced, and a miscellaneous setting block 208 is a section in which the waiting time or the like is set. These blocks can be edited in a similar way to the studio block 202, the movie block 203 and the title block 204 described above.
A TV program creator (hereinafter referred to as the user) creates a TV program by the GUI operation on the edit window 201 shown in FIG. 6. First, in accordance with the scene of the program to be created, the user generates the studio block 202, the movie block 203, the title block 204, etc. in the edit window 201 and arranges them vertically. After miscellaneous detailed setting in each block, the work for creating the program is conducted. The setting in the studio block 202 will be explained as an example.
Basically, the speech and motion of a CG character each can be set only at one corresponding point (cell) of an event.
Specifically, once the studio block 202 is generated and arranged on the edit window 201, one event is created in the studio block 202. The “event” here is defined as one horizontal line on the screen displayed as the event marks 209. The events thus created are recorded in the order of the progress of the program. Upon designation of event addition by the operator, an event is newly added to the studio block 202, which extends vertically, so that the blocks lower than the studio block 202 in the screen (the movie blocks 203 and subsequent blocks, for example) are displaced by one line downward. In this way, after adding an event to each block, the miscellaneous setting of the particular event is carried out. For example, the speech of characters is input to the speech column (cell) in the studio block 202. A TV program is created by this operation.
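The displacement of lower blocks when an event is added can be illustrated by the following minimal Python sketch. It is not taken from the TVML editor: the names `Block`, `start_rows` and `add_event` are hypothetical, and each event is assumed to occupy exactly one screen row.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical model of one block (studio, movie, title, ...)."""
    kind: str
    events: list = field(default_factory=list)  # one entry per horizontal event line


def start_rows(blocks):
    """Return the first screen row of each block, stacking blocks top to bottom."""
    rows, row = [], 0
    for block in blocks:
        rows.append(row)
        row += len(block.events)
    return rows


def add_event(block, event, index=None):
    """Insert an event at the given position, or append it when index is None."""
    if index is None:
        block.events.append(event)
    else:
        block.events.insert(index, event)


# A studio block followed by a movie block and a title block, one event each.
blocks = [
    Block("studio", [{"speech": "Hello"}]),
    Block("movie", [{}]),
    Block("title", [{}]),
]
before = start_rows(blocks)
add_event(blocks[0], {"speech": "Welcome"})  # the studio block grows by one line
after = start_rows(blocks)
```

In this sketch the studio block keeps its starting row, while the movie block and the title block each start one row lower after the event is added.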
FIG. 8 is a diagram in which a speech window is displayed above the edit window displayed on the display screen. The same component elements as those described above are designated by the same reference numerals, respectively. Numeral 201-1 designates an edit window, numeral 210-1 a monitor window, numeral 401 a speech window, numeral 402 a character setting menu for designating a CG character who speaks in the speech window 401, numeral 403 a speech type setting menu for selecting the use of a text or an audio file, numeral 404 a text box for inputting the words in the case where the text is selected in the speech type setting menu 403, numeral 405 a wait check button for awaiting a start of execution of the next command until the end of the ongoing speech by a CG character, numeral 406 a rate scale for regulating the speed of the speech, numeral 407 a volume scale for regulating the sound volume of the speech, numeral 408 an intonation scale for regulating the intonation of the speech, numeral 409 a pitch scale for regulating the pitch of the speech, numeral 410 a closed caption setting menu for changing the caption, numeral 411 a motion change menu for changing the motion of a CG character, numeral 412 a lip sensitivity scale for regulating the lip sensitivity of a CG character, numeral 413 a pause text box for inputting the number of seconds of the preceding pause (the period from the start of an event to the time when the character begins to speak) and of the tail pause (the period from the time when the character stops speaking to the end of the event), numeral 414 a wait menu for selecting whether or not to wait for a speech command, numeral 415 a preview button for previewing the speech of the CG character, numeral 416 a default button for changing the value set in the window to a default value, numeral 417 a cancel button for restoring the window setting to the state prevailing when the window was opened, and numeral 418 a close button for closing the window by use of the setting in the window.
The lip sensitivity is a command factor determined by the TVML language specification, which is a coefficient for determining the size of the mouth opening according to the sound level. In the case where it is desired to set the speech of a CG character in the CG studio, the mouse is double clicked at the cell of the speech setting section 301 in FIG. 7. The speech window 401 is displayed on the screen as shown in FIG. 8. The creator desiring that the CG character B speak a text, for example, first double clicks the mouse at the cell of the desired position in the speech setting section 301 to open the speech window 401. The character setting menu 402 is set to the CG character B, and the character string which the creator desires the CG character B to speak is input in the text box 404. After setting the other parameters, the close button 418 is clicked. Thus the creator can cause the CG character to speak the words of the speech.
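The behavior of the default button 416, the cancel button 417 and the close button 418 described above can be sketched as follows. This is only an illustrative Python model, not the editor's implementation: the class names, the subset of fields shown and all default values are assumptions introduced for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SpeechSettings:
    """Hypothetical subset of the speech window values; defaults are assumed."""
    character: str = "A"   # character setting menu 402
    text: str = ""         # text box 404
    wait: bool = True      # wait check button 405
    rate: float = 1.0      # rate scale 406
    volume: float = 1.0    # volume scale 407


class SpeechWindow:
    """Hypothetical model of the speech window opened from a speech cell."""

    def __init__(self, cell_settings):
        self._opened_with = cell_settings  # snapshot used by the cancel button
        self.current = cell_settings

    def set(self, **changes):
        """Change one or more values in the window."""
        self.current = replace(self.current, **changes)

    def default(self):
        """Default button: change every value to its default."""
        self.current = SpeechSettings()

    def cancel(self):
        """Cancel button: restore the state prevailing when the window was opened."""
        self.current = self._opened_with

    def close(self):
        """Close button: return the values in the window to be stored in the cell."""
        return self.current


window = SpeechWindow(SpeechSettings(character="A", text="Hi"))
window.set(character="B", text="Good evening")
window.cancel()                      # back to character "A", text "Hi"
window.set(character="B", text="Good evening")
stored = window.close()              # the cell now holds the speech for character B
```

Keeping the settings immutable makes the cancel operation a simple matter of restoring the snapshot taken when the window was opened.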
Not only the speech of a CG character but any desired event can be edited by double clicking the block of the event or the cell of the setting section, thereby opening the related window.
The GUI used in this invention will be explained with reference to FIG. 11. FIG. 11 shows a pop-up menu of the OSF/Motif widget, which is one of the GUI parts. Numeral 800 designates a menu window for displaying a pop-up menu, numeral 801 a parent widget such as “form”, “row/column” or “bulletin board” for displaying the pop-up menu, numeral 802 a pop-up menu frame, numeral 803 a label widget for the menu title, numeral 804 a separator widget for separating menu items, and numeral 805 a push-button widget constituting a menu item. The pop-up menu is of such a type that the menu is displayed when the mouse is clicked. OSF (Open Software Foundation), which developed Motif, is an organization engaged in standardization of operating systems, whose members include DEC (Digital Equipment Corporation), HP (Hewlett-Packard) and IBM (International Business Machines Corporation). A widget is defined as a high-level GUI component of the X Window System provided by OSF/Motif, and the widget set includes library calls for supplying various “parts” considered necessary for the user interface. Among these parts, the availability and quantity of the labels, separators and buttons of the menu can be determined freely.
Generally, the pop-up menu is of such a type that the menu is displayed in response to clicking of the mouse. Normally, a pop-up menu appears on the display screen when the right button of the mouse is clicked within an area where the pop-up menu is registered. The desired menu item is selected by moving the mouse vertically on the pop-up menu display screen while keeping the mouse button depressed.
In the case where it is desired for a CG character B to speak a speech in a program already edited, for example, a line of an event is inserted (added) at the event position where it is desired for the CG character to speak the speech. The speech window is opened by double clicking the cell of the speech setting section 301 of the event. It is also necessary to set the character setting menu 402 and input the desired character string to the text box 404. Editing other events likewise requires a plurality of operations similar to those described above. These repetitive operations are complicated and have an adverse effect on the creation efficiency. Especially in the case where the program to be created is long, the events must be checked by manipulating the scroll bar 213 of the edit window 201 with the mouse and thus scrolling the contents of the display in the window. This makes it very difficult to grasp the contents.
Each event includes eight command types including “speech”, “motion”, “camera”, “superimposition”, “sound”, “mixer”, “narration” and “miscellaneous setting”. One event can be edited or set for each command type. If all the command types are set, therefore, a maximum of eight commands can be set for each event. The commands including “speech”, “motion”, “camera”, “superimposition”, “sound”, “mixer”, “narration” and “miscellaneous setting” set in each event are executed in that order.
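The fixed execution order of the command types within one event can be sketched in Python as follows. The tuple `COMMAND_ORDER` follows the order stated above; the function name `execution_sequence` and the representation of an event as a dictionary are assumptions made for illustration.

```python
# Command types in the order in which they are executed within one event.
COMMAND_ORDER = (
    "speech",
    "motion",
    "camera",
    "superimposition",
    "sound",
    "mixer",
    "narration",
    "miscellaneous setting",
)


def execution_sequence(event):
    """Return the commands set in an event as (type, value) pairs in execution order.

    `event` is a hypothetical mapping from command type to its setting; at most
    one command per type can be present, so an event holds at most eight commands.
    """
    unknown = set(event) - set(COMMAND_ORDER)
    if unknown:
        raise ValueError(f"unknown command types: {sorted(unknown)}")
    return [(name, event[name]) for name in COMMAND_ORDER if name in event]


# The order in which the user filled in the cells does not affect execution order.
event = {"camera": "zoom in", "speech": "Hello", "sound": "bgm"}
sequence = execution_sequence(event)
```

Here the speech command is executed first, then the camera command, then the sound command, regardless of the order in which they were entered.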
In the conventional method of editing image data, the text information of the script of the video program displayed in the edit window is scrolled while searching for a point of a command to be edited, and the relation between the particular command and the preceding and following commands is checked based on the information on the screen. Then, the edit work such as insertion, modification and deletion of a command is carried out by inputting the text information. This poses the problem that the editing requires a plurality of operating and input sessions.
Especially in the case where the created program is long, the program contents are very difficult to grasp when the executed commands are checked by scrolling the contents of the display in the edit window.
Also in editing each command of an event separately, assuming that a new command is to be added or one of a plurality of commands is to be changed, the user is required to conduct the edit work carefully considering the order of execution of the commands.
This repetitive operation is so complicated that it increases dependence on the memory and skill of the user, and is one cause of reduced production efficiency.
Further, the work of checking the monitor window after an edit operation is not sufficiently coordinated with additional edit work, which therefore requires a similarly complicated operation.
Also in the conventional edit screen described in the references cited above, each change in the set data of a CG object in the virtual studio requires that the setting input screen be opened and the data be input from the keyboard or the like. This work of changing the setting requires repeated operations of opening different windows and inputting data through the keyboard, thus complicating the work and reducing the editing efficiency.