1. Technical Field
This invention relates to voice recognition performed near a wireline node of a network supporting cable television and/or video delivery.
2. Background Art
Currently, voice operated functions using the latest voice recognition technologies are limited to a handful of applications, such as toys, appliances, some computers, voice dictation, cellular phones, and voice control of one's home. Most of these applications use voice recognition technology running on a computer or voice recognition chip technology. These voice recognition systems typically offer only a limited number of commands, their recognition efficiency is only fair, and they often require voice training.
There have been numerous patents issued regarding voice recognition. Many apply in a telephone context or other dial-up context such as an Automated Teller Machine (ATM), including the following: Rabin, Voice command control and verification system, U.S. Pat. No. 6,081,782, issued Jun. 27, 2000; Basore, et al., Voice activated device and method for providing access to remotely retrieved data, U.S. Pat. No. 5,752,232, issued May 12, 1998; and Kowalkowski, et al., Voice-control integrated field support data communications system for maintenance, repair and emergency services, U.S. Pat. No. 5,924,069, issued Jul. 13, 1999.
There is, however, another class of voice recognition technology referred to as natural language, which requires state of the art processing software and hundreds of megabytes of RAM to support. Natural language voice recognition is currently being used in high end systems, such as billing applications for utility companies and the New York Stock Exchange, because of its ability to recognize spoken words from any voice. Some natural language systems claim to be totally user independent and are also capable of recognizing speech in several different languages.
However, the problems of voice recognition at a centralized wireline node in a network supporting video delivery or cable television delivery have not been addressed by such prior art. For the purposes of the discussion herein, a centralized wireline node refers to a network node providing video or cable television delivery to multiple users using a wireline physical transport between those users and the node.
FIG. 1 depicts a typical network as found in a cable television and/or video delivery network employing a Hybrid Fiber-Coaxial (HFC) wiring scheme as disclosed in the prior art.
Each user site contains a Set Top Box, such as STB 180, coupling to the network through a coaxial cable 172, which interfaces 170 to a collective coaxial cable 160 which couples to a Node 126. The interface 170 may include bi-directional signal amplification and possibly further include the filtering and/or frequency shifting of these signals.
The Node 126 is hierarchically coupled 128 to a Headend 104, which in most cable television networks serves as the source of television programming and other signaling. The signals are sent through the Node 126 and couplings 160-170-172 to provide the STB 180, and others, with the television signaling. In certain large towns and cities, there may be a further hierarchical layer including a Metropolitan Headend 10 coupled 106 to Headend 104. These higher layers of the network use fiber optics for the physical transport of couplings 102, 106 and 108, as well as for 122, 126 and 128.
The couplings between STB 180 and Node 126 support bi-directional communication. The couplings between STB 180, Node 126 and Headend 104 may also support bi-directional communication. Such bi-directional communication allows the STB 180 to receive multiple television channels and to signal at least limited information to the Node 126 and/or the Headend 104. Such information in either case may support management of Pay-per-View and other services.
User site accounting information usually resides at the highest level of the network, which tends to be either the Headend 104 or Metropolitan Headend 10.
In cable systems, several downstream data channels that send channel and synchronization information are often transmitted in a previously reserved band of frequencies. In the United States, these frequencies are typically assigned for re-broadcasting FM channels over cable. Currently, most cable systems reserve some of the 88 to 108 MHz FM spectrum for set-top data transmission. The unused portion of that spectrum is left for barker channels or for additional video channels. The Open Cable Standard requires that the 70 to 130 MHz band be available for what is called Out-of-Band (OOB) or Downstream transmission.
Most current cable systems use the popular HFC architecture, in which the downstream video signals, digital or analog, are sent from the Headend to hubs or nodes via fiber-optic cable. At the receiving side of the node, the optical signal from the fiber is converted to an electrical signal containing all of the analog and digital video RF carriers and the program/service information. This signal, in turn, is amplified and distributed via coaxial cable to the appropriate subscribers connected to the node.
A major design objective for existing cable television set-top boxes was efficient downstream information delivery, i.e. from cable plant to subscriber. Provision for upstream data transmission, i.e. from subscriber to cable plant, is much more restrictive, supporting only limited bandwidth. As new classes of interactive services become available, efficient use of upstream transmission bandwidth grows in importance. For example, if it is necessary to pass voice information from the subscriber to the cable headend, sufficient upstream bandwidth must be made available.
One of the most popular digital set-top boxes, the General Instrument (now Motorola) DCT-2000, is a useful example. When this box was first deployed, upstream transmissions were restricted to user pay-per-view requests and other simple, infrequent transmissions. As a consequence, the transmission format used for upstream transmissions was not required to be very efficient, and in fact, is not.
In this set-top box, the transmission hardware is capable of selecting among twenty different 256K bps channels, each of which uses QPSK transmission coding. While the hardware is capable of frequency-hopping to avoid channels which are subject to interference, the scheme used is fairly static, with typical deployments using only two active upstream communications channels. This leads to an aggregate bandwidth of only 512K bps per cluster of set-top boxes converging in the network to a single node, in cable television terms. The cable node typically supports between 500 and 2000 subscribers.
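The arithmetic behind these figures can be sketched as follows. This is only an illustrative calculation using the numbers quoted above (two active 256K bps channels, 500 to 2000 subscribers per node); the function names are hypothetical, and "K" is assumed here to mean 1000.

```python
# Illustrative arithmetic only; names are hypothetical, not from any product.
CHANNEL_KBPS = 256      # per-channel upstream rate quoted in the text
ACTIVE_CHANNELS = 2     # typical number of active upstream channels per the text


def aggregate_kbps(active_channels=ACTIVE_CHANNELS, channel_kbps=CHANNEL_KBPS):
    """Raw upstream bandwidth shared by all set-top boxes on one node."""
    return active_channels * channel_kbps


def per_subscriber_bps(subscribers, total_kbps=None):
    """Raw share per subscriber if every subscriber transmitted at once.

    Assumes K = 1000 and ignores protocol overhead and collisions.
    """
    if total_kbps is None:
        total_kbps = aggregate_kbps()
    return total_kbps * 1000 / subscribers


if __name__ == "__main__":
    print("aggregate:", aggregate_kbps(), "K bps")
    for n in (500, 2000):
        print(n, "subscribers:", per_subscriber_bps(n), "bps each")
```

Under these assumptions, the raw per-subscriber share falls between 256 bps and 1024 bps when all subscribers on a node are active simultaneously, before any protocol losses are considered.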
Upstream signals in the 5 to 40 MHz band from each subscriber connected to the node are collected, combined, and then sent to the Headend via either the same fiber used for the downstream video carriers, or a separate fiber.
Furthermore, the transmission control protocol used, referred to as Aloha, is one where an individual set-top box immediately transmits any pending request to the headend, without regard to whether or not the transmission channel is already in use. This transmission is repeated at regular intervals until the box receives an acknowledgement command from the headend, indicating successful receipt of the transmission.
This transmission control protocol is quite inefficient due to the number of collisions which ensue, i.e. simultaneous transmissions from different set-top boxes which interfere with one another, forcing all of the transmitters to repeat their transmissions. This leads to typical channel utilization on the order of just 30%. As a consequence, the total bandwidth available for upstream transmission per node is only about 30% of 512K bps, or roughly 154K bps, on average.
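The effect of collisions on utilization can be illustrated with a small simulation. The sketch below is a simplified slotted contention model, not the actual protocol of any set-top box: time is divided into slots, every box is assumed to transmit in each slot with a fixed probability, and a slot succeeds only when exactly one box transmits. All names and parameter values are hypothetical.

```python
import random


def simulate_slotted_contention(n_boxes, p_transmit, n_slots, seed=0):
    """Return the fraction of slots carrying exactly one transmission.

    Simplified model (an assumption, not the deployed protocol): each box
    independently transmits in a slot with probability p_transmit; zero
    transmitters means an idle slot, two or more means a collision.
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_slots):
        transmitters = sum(1 for _ in range(n_boxes) if rng.random() < p_transmit)
        if transmitters == 1:
            successes += 1
    return successes / n_slots


def effective_kbps(raw_kbps, utilization):
    """Usable bandwidth once collision losses are accounted for."""
    return raw_kbps * utilization


if __name__ == "__main__":
    light = simulate_slotted_contention(100, 0.01, 5000)
    heavy = simulate_slotted_contention(100, 0.05, 5000)
    print("utilization at light load:", light)
    print("utilization at heavy load:", heavy)
    print("30% of 512K bps:", effective_kbps(512, 0.30), "K bps")
```

In this simplified model, utilization peaks at roughly 37% when the offered load averages one transmission per slot and falls sharply as load rises, which is in line with the utilization on the order of 30% cited above. Applying a 30% utilization to the 512K bps aggregate gives 153.6K bps.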
Downstream control data transmission typically occurs in a separate frequency band from the upstream channels.
Typically, HFC networks employ an optical fiber from a central office, or Headend, to a neighborhood node. The fiber has forward and reverse transmission capability, which can alternatively be accommodated on separate fibers. Wavelength Division Multiplexing (WDM) can be used to implement both on a single fiber. At the node, coaxial cable connects the users through a shared frequency division multiplexing (FDM) scheme with contention resolution protocols used to manage upstream data flows.
Such communication schemes having both forward and backward paths, and which may or may not involve a user, are referred to as loops herein. An example of a loop is the communication between Headend 104 and Node 126. Communication schemes having both forward and backward paths to multiple users are referred to as local loops. An example of a local loop is the communication between Node 126 and user site STBs 180, 182 and 184. Note that a loop may be constituted out of optical fiber or out of coaxial cable.
Hybrid-Fiber-Copper (HFCop) networks work in much the same manner, but substitute copper wire(s), often in twisted pairs, for coaxial cable. In such networks a local loop may further be constituted out of optical fiber, coaxial cable or twisted pairs.
Another alternative local loop configuration is commonly known as Switched Digital Video. It is a form of HFC that couples the fiber through a node to each user site with a distinct point-to-point coaxial cable. The node interfaces the user site coaxial cables with the optical fiber through a switch. The switch typically contains a network management unit which manages the switch, connecting the bandwidth service provider with multiple homes, today often in the range of 5 to 40 homes per switch.
The Synchronous Optical NETwork (SONET) scheme is also applied in the creation of high-speed networks for homes and businesses. This and similar communication schemes may be employed to deliver video streams to user sites.
FIG. 2 depicts a typical residential broadband network using local loop wiring of the network, as disclosed in the prior art.
As in FIG. 1, each user site contains a Set Top Box, such as STB 180, coupled to the network through a coaxial cable 172 which interfaces 170 to a collective coaxial cable 160 which is coupled to Node 126. Interface 170 may include bi-directional signal amplification, and possibly further include the filtering and/or frequency shifting of these signals.
As in FIG. 1, the couplings between STB 180 and Node 126 support bi-directional communication allowing the STB 180 to receive multiple television channels and allowing STB 180 to signal at least limited information to the Node 126, which may well include management of Pay-per-View and other services. The couplings between STB 180, Node 126 and Headend 104 may also support bi-directional communication allowing the STB 180 to receive multiple television channels and allowing STB 180 to signal at least limited information to the Headend 104, which may well include management of Pay-per-View and other services.
FIG. 2 shows a loop coupling Headend 104 through coupling 130 to Node 120, through coupling 132 to Node 124, and through coupling 134 to Node 126, which in turn couples 136 to Headend 104, forming the loop.
The hierarchical coupling of Node 126 with Headend 104 is carried out along distinct paths through this loop. Communication from Headend 104 to Node 126 follows a path 130-132-134. Communication from Node 126 to Headend 104 follows the path 136. The specific wiring schemes are dominated by the choice of physical transport, communication protocols and network level management. The description just given for FIG. 2 is provided as a simplified discussion of the basics of how high speed residential broadband networks incorporate loops and local loops supporting network level hierarchies.
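The distinct forward and return paths of the FIG. 2 loop can be modeled as a directed ring. The following is a minimal sketch under that assumption; the table of couplings is taken from the description above, while the data layout and the `path` function are hypothetical and purely illustrative.

```python
# Directed couplings of the FIG. 2 loop: (source, coupling numeral, destination).
LOOP = [
    ("Headend 104", 130, "Node 120"),
    ("Node 120", 132, "Node 124"),
    ("Node 124", 134, "Node 126"),
    ("Node 126", 136, "Headend 104"),
]


def path(src, dst, loop=LOOP):
    """Return the couplings traversed from src to dst, following the ring."""
    next_hop = {source: (coupling, dest) for source, coupling, dest in loop}
    couplings = []
    current = src
    while current != dst:
        if current not in next_hop:
            raise ValueError(f"{current!r} is not on the loop")
        coupling, current = next_hop[current]
        couplings.append(coupling)
        if len(couplings) > len(loop):
            raise ValueError(f"{dst!r} is not reachable on the loop")
    return couplings


if __name__ == "__main__":
    print(path("Headend 104", "Node 126"))  # forward path
    print(path("Node 126", "Headend 104"))  # return path
```

With this table, the forward path from Headend 104 to Node 126 is 130-132-134 and the return path is 136, matching the description of FIG. 2.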
There has been extensive research into the mechanics of speech recognition. The progress has been sufficient to allow voice trading by stock brokers using their desktop computers.
While these innovations have been substantial, they do not resolve several central questions of great importance to cable television, video delivery systems, and commerce. There is no present system providing voice recognition to a collection of users over a cable television network. There is no present system providing user identification based upon that voice recognition over a network that supports cable television and/or video delivery. There is no present system sufficient for real-time auctions and contracting to be conducted over a cable television and/or video delivery network, based on user identification through voice recognition.