When watching a program or a commercial broadcasted by conventional television broadcasting, a viewing person sometimes desires to acquire an object such as an article appearing in the program or the commercial, or music performed in the program or the commercial.
In this case, the viewing person first connects a PC (personal computer) to the Internet, then searches the Internet for information on such an object, and thereby acquires the information of the target object.
Then, on the basis of the acquired information, the viewing person contacts or visits a vendor who sells the object, and thereby purchases the object. Conventionally, a viewing person has purchased an object appearing in a program or a commercial in such a manner.
Nevertheless, in order to acquire an object appearing in a broadcasted program or a broadcasted commercial, the viewing person needs to access the Internet through a PC completely independently of the reception of the broadcast, and then needs to download the information of the object via the Internet. Further, on the basis of the downloaded information, an order for the object needs to be placed by telephone or the like. This is inconvenient.
That is, conventional broadcasting has a problem in that an object appearing in a broadcasted program or a broadcasted commercial cannot be obtained easily; certain time and effort are necessary, and hence it is inconvenient.
Thus, in order to resolve the above-mentioned problem, in a previous application of the present inventor (Japanese patent application No. 2001-258564), the present inventor has proposed a shopping assistance system employing two-way broadcasting that allows an object appearing in a broadcasted program or a broadcasted commercial to be acquired easily without much time and effort. The entire disclosure of Japanese patent application No. 2001-258564 is incorporated herein by reference.
The shopping assistance system employing two-way broadcasting proposed by the present inventor is described below.
FIG. 37 is a block diagram showing the conceptual configuration of the shopping assistance system employing two-way broadcasting in the previous application of the present inventor. FIG. 38 is a flow chart showing the operation of the shopping assistance system employing two-way broadcasting (simply referred to as a shopping assistance system, hereafter). FIG. 39 is a functional block diagram showing the details of a part of FIG. 37.
In FIG. 37, the shopping assistance system comprises a broadcasting station 10, a vendor 20, and a home 30. A TV/STB 310 and a remote controller 320 are installed in the home 30.
The broadcasting station 10 is a broadcasting station which broadcasts a program together with program additional information. The vendor 20 is a vendor who sells an article appearing in a program. The home 30 is a home where the broadcast is received.
The TV/STB 310 is a two-way broadcasting receiver composed of a television receiver or an STB (Set Top Box).
The remote controller 320 is a part for operating the TV/STB 310, and is provided with a microphone 321.
The TV/STB 310 is provided with a recognition vocabulary storing section 311, a speech recognition section 312, and the like. That is, as shown in FIG. 39, the TV/STB 310 comprises a broadcast receiving section 313, a recognition vocabulary generating section 314, the recognition vocabulary storing section 311, the speech recognition section 312, a time expression dictionary 316, a stored time controlling section 315, an additional information storing section 317, a displaying section 318, and a transmitting section 319.
The broadcast receiving section 313 is a part for receiving broadcasting radio waves. The recognition vocabulary generating section 314 is a part for generating a recognition vocabulary serving as an object of speech recognition, from the program additional information received by the broadcast receiving section 313. The recognition vocabulary storing section 311 is a part for storing the generated recognition vocabulary. The time expression dictionary 316 is a dictionary retaining, as a recognition vocabulary, expressions concerning time such as “now” and “a while ago”. The speech recognition section 312 is a part for performing speech recognition by using the recognition vocabulary storing section 311 and the time expression dictionary 316 as recognition vocabulary dictionaries. The stored time controlling section 315 is a part for learning the relation between each time expression vocabulary and an actual time width or number of scenes on the basis of the relation between a recognized time expression vocabulary and an information selection input performed by a viewing person, and for thereby controlling the speech recognition section 312 and the recognition vocabulary storing section 311. The additional information storing section 317 is a part for storing additional information corresponding to a within-the-program article or the like specified by speech recognition. The displaying section 318 is a part for displaying the additional information. The transmitting section 319 is a part for transmitting to the broadcasting station an input result such as the selection of additional information performed by a viewing person.
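The data handled by these sections can be summarized in a minimal sketch. The patent does not define a concrete data format, so the record layout, field names, and Python representation below are illustrative assumptions only, corresponding to the recognition vocabulary generating section 314, the recognition vocabulary storing section 311, and the additional information storing section 317.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class AdditionalInfo:
    # Illustrative assumption: one piece of program additional information,
    # carrying keyword information set at the broadcasting station.
    keywords: List[str]                # e.g. ["red", "jacket"]
    receive_time: float                # seconds at which it was received
    details: Dict = field(default_factory=dict)  # dimensions, price, vendor, etc.

@dataclass
class ReceiverState:
    recognition_vocabulary: List[str] = field(default_factory=list)  # section 311
    stored_info: List[AdditionalInfo] = field(default_factory=list)  # section 317

def on_broadcast(state: ReceiverState, info: AdditionalInfo) -> None:
    # Recognition vocabulary generating section 314: successively accumulate
    # each keyword obtained from received program additional information.
    for kw in info.keywords:
        if kw not in state.recognition_vocabulary:
            state.recognition_vocabulary.append(kw)
```

This sketch only models the vocabulary accumulation step; specification and narrowing down of additional information are described below.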
Next, the operation of such a shopping assistance system is described below.
FIG. 38 shows the operation of the shopping assistance system and its service. The following description is given with reference to FIG. 38.
First, during the watching of a program, a viewing person pays attention to an article or the like appearing in the program, and utters words indicating that attention is being paid to the specific article. The microphone 321 receives the utterance, and outputs a signal to the speech recognition section 312.
The speech recognition section 312 performs speech recognition on the utterance signal inputted through the microphone 321. On the basis of the speech recognition result, the speech recognition section 312 identifies the article or the like to which the viewing person has paid attention, specifies the corresponding program additional information, and accumulates the information into the additional information storing section 317 (step 331).
Detailed description is given below for the case that a drama is watched. For example, during the watching of the drama, suppose that the viewing person has paid attention to a suit worn by a character, but the character wearing the suit has already exited the screen. In this case, the viewing person utters “the red jacket a while ago is good” or the like.
The voice uttered by the viewing person is inputted through the microphone 321. With reference to the time expression dictionary 316 and the recognition vocabulary storing section 311, the speech recognition section 312 recognizes the inputted voice, and then extracts corresponding additional information from the broadcasted program additional information.
The recognition vocabulary stored in the recognition vocabulary storing section 311 is generated by the recognition vocabulary generating section 314, which successively accumulates each vocabulary item that indicates an article, music, or the like provided with additional information and that is obtained from the received program additional information. That is, the program additional information also contains keyword information specifying the article or music to which program additional information has been made to correspond at the broadcasting station. The recognition vocabulary generating section 314 generates the recognition vocabulary from this keyword information. Then, the speech recognition section 312 performs speech recognition on a viewing person's uttered voice such as “the red jacket a while ago is good”, and thereby extracts a recognition vocabulary from the uttered voice. For example, in the case of the uttered voice “the red jacket a while ago is good”, a recognition vocabulary of “red” and “jacket” is extracted. Then, the program additional information having the largest number of keyword information pieces corresponding to the extracted recognition vocabulary is selected and stored into the additional information storing section 317. That is, when certain program additional information contains both the keyword information corresponding to the recognition vocabulary “red” and the keyword information corresponding to the recognition vocabulary “jacket”, this program additional information is stored into the additional information storing section 317. As such, the speech recognition section 312 can specify program additional information by means of selection.
The description has been given for the case that the speech recognition section 312 selects the program additional information having the largest number of keyword information pieces corresponding to the recognition vocabulary extracted from the viewing person's uttered voice. However, the invention is not limited to this. The speech recognition section 312 may select, for example, five pieces of program additional information in descending order of the number of keyword information pieces corresponding to the extracted recognition vocabulary, and then store the selected program additional information into the additional information storing section 317. As such, the speech recognition section 312 may narrow down the program additional information instead of specifying a single piece.
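The specifying and narrowing down described above amount to ranking candidate pieces of additional information by keyword overlap. The following sketch illustrates this under the assumption that each candidate is a dictionary with a `keywords` list; the function names and data shapes are illustrative, not part of the patent.

```python
from typing import Dict, List

def match_count(recognized: List[str], keywords: List[str]) -> int:
    # Number of extracted recognition vocabulary items (e.g. "red", "jacket")
    # that match the keyword information of one piece of additional information.
    return sum(1 for w in recognized if w in keywords)

def specify(recognized: List[str], candidates: List[Dict]) -> Dict:
    # Specifying: pick the single piece with the largest number of matches.
    return max(candidates, key=lambda c: match_count(recognized, c["keywords"]))

def narrow_down(recognized: List[str], candidates: List[Dict], n: int = 5) -> List[Dict]:
    # Narrowing down: keep up to n pieces in descending order of matches.
    ranked = sorted(candidates,
                    key=lambda c: match_count(recognized, c["keywords"]),
                    reverse=True)
    return ranked[:n]
```

For the uttered voice “the red jacket a while ago is good”, `specify(["red", "jacket"], candidates)` would return the candidate whose keyword information covers both “red” and “jacket”.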
The stored time controlling section 315 performs control such that the generated recognition vocabulary is retained for a time corresponding to a time range or number of scenes set in advance, or alternatively to the largest time range or number of scenes among the time expressions learned on the basis of the viewing person's previous utterances and subsequent inputs. The learning in the stored time controlling section 315 is described later. For example, in the case of the uttered voice “the red jacket a while ago is good”, under the control of the stored time controlling section 315, the speech recognition section 312 extracts the time expression vocabulary “a while ago” indicating the past. Then, with reference to the time expression dictionary 316, the speech recognition section 312 performs the above-mentioned specifying or narrowing down on the program additional information broadcasted within the time range or number of scenes corresponding to “a while ago”.
After the drama ends (step 332), the displaying section 318 displays the additional information corresponding to the article which has appeared in the drama and has been specified by speech recognition (step 333).
The additional information contains information on the dimensions, the weight, the quality of the material, the color variations, the prices of the size variations, the manufacturer, the vendor, the vendor's contact address, and the like. The viewing person checks and examines this information. Then, when purchasing, the viewing person selects the additional information and thereby inputs purchase information by using an inputting part such as the remote controller 320, a pointing device, or speech recognition.
The transmitting section 319 transmits to the broadcasting station the purchase information together with an identification number or the like of the corresponding additional information (step 334).
As described above, on the basis of the relation between a recognized time expression vocabulary and an information selection input performed by a viewing person, the stored time controlling section 315 learns the relation between each time expression vocabulary and an actual time width or number of scenes. This process of learning is described below in detail. The stored time controlling section 315 retains information establishing the correspondence of each recognition vocabulary item, which is a time expression stored in the time expression dictionary 316, to an actual time width or number of scenes. For example, the stored time controlling section 315 establishes the correspondence of the recognition vocabulary “a while ago” to a time width ranging from 20 seconds before to 5 minutes before the present, and the correspondence of the recognition vocabulary “now” to a time width ranging from the present to 30 seconds before the present.
Thus, as described above, when the recognition vocabulary indicating the time expression “a while ago” is received from the speech recognition section 312, the stored time controlling section 315 performs control such that the specifying and the narrowing down are performed on the program additional information received within the time width ranging from 20 seconds before to 5 minutes before the present. In response to this control, the speech recognition section 312 performs the specifying or the narrowing down on that program additional information, and the specified or narrowed down program additional information is stored into the additional information storing section 317. That is, the stored time controlling section 315 performs control such that the recognition vocabulary generated within this time width is retained.
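The correspondence between a time expression and a time width can be sketched as a lookup table used to restrict the candidates before keyword matching. The concrete widths below are the example values given in the text (“a while ago”: 20 seconds to 5 minutes before the present; “now”: the present to 30 seconds before); the representation as seconds-before-present pairs is an assumption made for illustration.

```python
from typing import Dict, List, Tuple

# Time expression dictionary 316 sketch: expression -> (nearest, farthest)
# bound in seconds before the present, per the example in the text.
TIME_EXPRESSIONS: Dict[str, Tuple[float, float]] = {
    "a while ago": (20.0, 300.0),
    "now": (0.0, 30.0),
}

def in_window(expr: str, receive_time: float, now: float) -> bool:
    near, far = TIME_EXPRESSIONS[expr]
    age = now - receive_time
    return near <= age <= far

def filter_by_time(expr: str, records: List[dict], now: float) -> List[dict]:
    # Restrict specifying/narrowing to the program additional information
    # received within the time width associated with the time expression.
    return [r for r in records if in_window(expr, r["receive_time"], now)]
```

Keyword-based specifying or narrowing down would then operate only on the records that survive this filter.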
Meanwhile, when the stored time controlling section 315 receives a recognition vocabulary indicating the time expression “a while ago”, and the time width ranging from 20 seconds before to 5 minutes before the present is made to correspond to the time expression as described above, the program additional information displayed on the displaying section 318 could have a time width different from the viewing person's intention. In this case, the viewing person utters “redo”, “display preceding information”, “display subsequent information”, or the like into the microphone 321.
Then, the speech recognition section 312 performs speech recognition on the utterance of the viewing person, and notifies the stored time controlling section 315 of the speech recognition result. In the speech recognition of the utterance “display preceding information”, the speech recognition section 312 extracts “display”, “preceding”, and “information” as a recognition vocabulary, and notifies the stored time controlling section 315 of the result.
On receiving the recognition vocabulary of “display”, “preceding”, and “information” from the speech recognition section 312, the stored time controlling section 315 revises the information on the time width made to correspond to the recognition vocabulary indicating the time expression “a while ago”. That is, the revision is performed such that the recognition vocabulary “a while ago” corresponds to a time width ranging from 40 seconds before to 5 minutes and 40 seconds before the present. Then, the stored time controlling section 315 controls the speech recognition section 312 so as to specify or narrow down the program additional information again with respect to the program additional information received between 40 seconds before and 5 minutes and 40 seconds before the present. In response to this control, the speech recognition section 312 specifies or narrows down the program additional information again, and stores the specified or narrowed down program additional information into the additional information storing section 317. Then, the displaying section 318 displays the program additional information stored in the additional information storing section 317. If the desired article is included in the displayed program additional information, the viewing person selects the program additional information and thereby inputs purchase information.
When this procedure is repeated many times, the stored time controlling section 315 can incorporate the intention of the viewing person into the recognition vocabulary for time expressions, that is, establish appropriate time width correspondences. As such, learning is performed in the stored time controlling section 315.
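The revision described above can be sketched as shifting the stored time width whenever the viewing person asks for “preceding” or “subsequent” information. The step sizes below are taken directly from the example in the text (the near bound moves from 20 to 40 seconds and the far bound from 5 minutes to 5 minutes 40 seconds on one “preceding” request); treating them as fixed per-request increments is an illustrative assumption, not the patent's specified learning rule.

```python
from typing import Dict, Tuple

def revise_window(windows: Dict[str, Tuple[float, float]], expr: str,
                  direction: str,
                  near_step: float = 20.0, far_step: float = 40.0) -> None:
    # windows maps a time expression to (near, far) bounds in seconds
    # before the present. "preceding" shifts the width further into the
    # past; "subsequent" shifts it back toward the present.
    near, far = windows[expr]
    if direction == "preceding":
        windows[expr] = (near + near_step, far + far_step)
    elif direction == "subsequent":
        windows[expr] = (max(0.0, near - near_step), max(0.0, far - far_step))
```

Repeated corrections accumulate in `windows`, which is one simple way the stored time controlling section 315 could converge toward the time width the viewing person actually means by “a while ago”.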
As described above, according to the shopping assistance system and its service, when a viewing person becomes interested in an article, music, or the like appearing in a program, the information can be obtained and the article or the like can be purchased merely by means of natural utterance performed in parallel with the watching of the program, so that the watching of the program itself is not interrupted by the work of making a memorandum or the like.
The use of the shopping assistance system proposed by the present inventor realizes such an outstanding effect.
Nevertheless, in the shopping assistance system of the previous application of the present inventor, additional information is specified by judging the degree of agreement between the words obtained by speech recognition and the keywords corresponding to the keyword information contained in the additional information. Thus, it is desired that the specifying of the additional information be performed more flexibly and appropriately than in this method. That is, there is an issue that an object appearing in a broadcasted program or a broadcasted commercial should be acquired more easily with less time and effort.
Further, it is desired that the additional information be specified in a manner more suitable for the expression uttered by a viewing person. That is, there is an issue that an object appearing in a broadcasted program or a broadcasted commercial should be acquired easily, without much time and effort, in a manner suitable for the expression uttered by a viewing person.
Further, it is desired that the additional information be specified in a manner more suitable for the interest of a viewing person. That is, there is an issue that an object appearing in a broadcasted program or a broadcasted commercial should be acquired easily, without much time and effort, in a manner suitable for the interest of a viewing person.