Remote gaming applications, in which a server-side game is controlled by a client-side player, have attempted to encode the video output from a three-dimensional (3D) graphics engine in real-time using existing or customized encoders. However, the interactive nature of video games, particularly the player feedback loop between video output and player input, makes game video streaming much more sensitive to latency than traditional video streaming. Existing video coding methods can trade computational power, and little else, for reductions in encoding time. New methods for integrating the encoding process into the video rendering process can provide significant reductions in encoding time while also reducing computational power, improving the quality of the encoded video, and retaining the original bitstream data format to preserve interoperability of existing hardware devices.
Unlike regular video playback, video games have a unique player input to video feedback loop. Players are very sensitive to latency between input and video output. High latency in this player input-feedback loop has been a significant stumbling block for video game streaming applications in which a server-hosted video game instance is controlled by a remote player. Any process that can reduce the time between input and feedback will directly improve user experience.
The client hardware in a game streaming environment may have varying levels of computational performance, but dedicated H.264 hardware decoders are becoming increasingly common, even in mobile and other low-power devices. A hardware decoder is good at performing a small selection of computations, such as motion compensation, which are regularly performed according to the H.264 coding standard. The strengths of the dedicated decoding hardware can be exploited to provide a better player experience in a game streaming environment regardless of the overall computational power of the client.
In local, non-streamed rendering applications, the game engine can add several frames of latency between player input and video feedback. In game streaming applications, additional latency is introduced to the player input-feedback cycle because player input must travel through the network to the remote server and video output must be encoded, transmitted, and decoded before the player receives feedback. For some player inputs, the client can estimate the results on video feedback by performing motion compensation immediately, cutting out the network latency.
Player input motion compensation is, at its most basic, a technique of shifting groups of pixels in order to sacrifice some image accuracy for a decrease in input-feedback latency in situations where a video game is running on a server while being controlled remotely by a networked client. This technique is good at reducing player-feedback latency for inputs that result in consistent motion vectors, such as player view rotations in first-person games.
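As an illustration only, the pixel-shifting idea can be sketched in a few lines of Python. The function name `shift_frame`, the list-of-rows frame layout, and the `fill` value for newly exposed pixels are assumptions made for this sketch and are not taken from the described system; a real client would shift macroblocks through the decoder's motion compensation hardware rather than loop over pixels.

```python
def shift_frame(frame, dx, dy, fill=0):
    """Shift a 2D frame (a list of rows) by (dx, dy) pixels.

    Pixels that move in from outside the frame are unknown to the client,
    so they are set to `fill`. This is the image accuracy being sacrificed.
    """
    height = len(frame)
    width = len(frame[0]) if height else 0
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_x, src_y = x - dx, y - dy
            if 0 <= src_x < width and 0 <= src_y < height:
                shifted[y][x] = frame[src_y][src_x]
    return shifted


frame = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# Hypothetical case: a view rotation to the left moves scene content
# one pixel to the right on screen.
estimate = shift_frame(frame, dx=1, dy=0)
```

The left column of `estimate` holds the fill value because the client has no data for the newly revealed area until the server's actual video arrives.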
In a video game, player-context is defined as the current game state which is the result of previous player actions, inputs, and decisions. The client in a game streaming system is naïve to the player-context, that is, the client receives only the video output and none of the game-state information that will determine the results of certain player inputs. There is a wide range of inputs that result in unique yet predictable motion outcomes based on game-state information. These inputs would benefit from a reduction in player-feedback latency but cannot be pre-cached on the client for traditional player input motion compensation because the client will not have player-context information. Additionally, the player-context permutation space may be too large to pre-generate motion vectors for methods such as cached repetitive motion vectors. These systems and methods are described in U.S. Provisional Application Nos. 62/488,526; 62/634,464; and 62/640,945; all three of which are incorporated here in their entireties. The game-server can compensate by generating anticipated motion vectors and updating the client's input motion compensation cache as the player-context changes. This allows the client to use player input motion compensation techniques for a limited set of context-dependent inputs, resulting in input-feedback latency reduction.
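The cache update described above might look roughly like the following Python sketch. Every name in it (`ClientMotionCache`, `anticipated_vectors`, the `near_ledge` context flag, the `jump` input, and the vector values) is hypothetical and chosen only to make the idea concrete; the sketch also omits the network transport between server and client.

```python
class ClientMotionCache:
    """Hypothetical client-side store of anticipated motion vectors, keyed by input."""

    def __init__(self):
        self._vectors = {}

    def update(self, entries):
        # The server pushes new or replacement entries as player-context changes.
        self._vectors.update(entries)

    def lookup(self, player_input):
        # None means no estimate is cached; the client waits for server video.
        return self._vectors.get(player_input)


def anticipated_vectors(player_context):
    """Hypothetical server-side rule mapping game state to motion vectors.

    The client never sees `player_context`; it only receives the result.
    """
    if player_context.get("near_ledge"):
        return {"jump": (0, -12)}  # assumed: a ledge grab moves the view further
    return {"jump": (0, -4)}       # assumed: an ordinary jump


cache = ClientMotionCache()
cache.update(anticipated_vectors({"near_ledge": False}))
first = cache.lookup("jump")
cache.update(anticipated_vectors({"near_ledge": True}))
second = cache.lookup("jump")
```

The same input yields a different cached vector after the context changes, which is the behavior a client that is naïve to player-context could not produce on its own.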
U.S. Pat. No. 9,661,351 (“the '351 Patent”) discloses systems and methods for skipping a frame during transmission from a server to a client device, where, in response to detecting the skipped frame, the client device generates a predicted frame that replaces the skipped frame in the compressed video stream, the predicted frame being generated by extending delta information from one or more previous frames decoded by the client device. For this technology, the client-side frame prediction of one or more reconstructed or predicted frames is used following a skipped frame based on the data (e.g., motion vectors, residuals, etc.) of one or more preceding frames. The technology also prioritizes bit allocation and/or subfeature encoding. Encoded Network Abstraction Layer Units (NALUs) could be split into (1) motion vectors and (2) residuals. Instead of actually skipping a frame, the apparatus may just send minimal encoding data as prioritized. For example, it could send just motion vectors if motion is prioritized. The present invention is superior to the technology of the '351 Patent at least because the '351 Patent does not disclose a client device that uses transmitted lookup tables from a server to match user input to motion vectors and tags and sums those motion vectors. The '351 Patent also does not disclose the application of those summed motion vectors to the decoded frames to estimate motion in those frames. The present invention is also superior because it significantly reduces input-feedback latency by using player input motion compensation instead of waiting for the server to return output video.
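The distinguishing client behavior named above (matching input against a server-supplied lookup table, then tagging and summing the resulting motion vectors) can be sketched as follows. This is a minimal sketch under stated assumptions: the class name `InputCompensator`, the integer tags, and the rule that a vector is retired once server video reflecting that tagged input arrives are all assumptions of the sketch, not details given in this section.

```python
from itertools import count


class InputCompensator:
    """Hypothetical client-side matcher that tags and sums motion vectors."""

    def __init__(self, lookup_table):
        self.lookup_table = lookup_table  # input -> (dx, dy), sent by the server
        self.pending = {}                 # tag -> vector not yet confirmed by server video
        self._tags = count(1)

    def on_input(self, player_input):
        vector = self.lookup_table.get(player_input)
        if vector is None:
            return None                   # no match: nothing to compensate
        tag = next(self._tags)
        self.pending[tag] = vector
        return tag

    def on_server_frame(self, tag):
        # Assumption: returned video identifies the tagged input it already reflects.
        self.pending.pop(tag, None)

    def summed_vector(self):
        # The sum is what would be applied to the next decoded frame.
        dx = sum(v[0] for v in self.pending.values())
        dy = sum(v[1] for v in self.pending.values())
        return (dx, dy)


comp = InputCompensator({"look_left": (8, 0), "look_up": (0, -3)})
t1 = comp.on_input("look_left")
t2 = comp.on_input("look_up")
both = comp.summed_vector()
comp.on_server_frame(t1)
remaining = comp.summed_vector()
```

Tagging lets the client stop compensating for an input once the real result is visible, so the estimate and the server's output are not applied twice.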
U.S. Pat. No. 8,678,929 (“the '929 Patent”) is directed to the operation of a networked, interactive game system. The disclosed methods are geared towards reducing network lag by determining course-corrected positional values for a shared global object through two-way blending. The “blending” discussed in the patent includes the steps of the network simulation sending the calculation of a pose for the local player on the local console. The console blends the local and network simulations. The methods also blend shared global objects by using the blended pose of the local player to determine a positional value for the shared global object on the local console. The present invention is again superior to the technology of the '929 Patent at least because the '929 Patent does not disclose a client device that uses transmitted lookup tables from a server to match user input to motion vectors and tags and sums those motion vectors. The '929 Patent also does not disclose the application of those summed motion vectors to the decoded frames to estimate motion in those frames. The present invention is also superior because it does not require a client with extremely high processing power and because it significantly reduces input-feedback latency by using player input motion compensation instead of waiting for the server to return output video.
U.S. Pat. No. 8,069,258 (“the '258 Patent”) is directed to using local frame processing to reduce apparent network lag of multiplayer simulations. The methods described include intercepting inputs from a local user, determining the state data of a remote object in a network simulation from a previous game frame, and determining the interactions of nondeterministic objects from multiple game systems that are part of the networked simulation. That interaction data, along with the state data and local input, is used to generate a local simulation of the video frame. In this manner, the local and network simulations can run asynchronously for a single frame, with each frame corresponding to a single time phase within the game. This allows the local simulation to update in real time during networked gameplay, while remaining (essentially) synchronized with the network. The present invention is once more superior to the technology of the '258 Patent at least because the '258 Patent does not disclose a client device that uses transmitted lookup tables from a server to match user input to motion vectors and tags and sums those motion vectors. The '258 Patent also does not disclose the application of those summed motion vectors to the decoded frames to estimate motion in those frames. The present invention is also superior because it does not require a client with extremely high processing power and because it significantly reduces input-feedback latency by using player input motion compensation instead of waiting for the server to return output video.
U.S. Pat. No. 9,665,334 B2 (“the '334 Patent”) discloses systems and methods for rendering protocols applying multiple processes and a compositor to render the combined graphics on a display. The technology operates as follows: when the server simultaneously provides a game screen to a number of client devices, the calculation load of the rendering processing on the server becomes heavy, for example, for game content that requires quick responsiveness. That is, the number of client devices to which the server can provide a screen is limited depending on its rendering performance and required responsiveness. By contrast, when each client device is controlled to execute processing which can be executed by general rendering performance to share the rendering processes between the server and client device, a screen can be provided to more client devices. Also, in general, a game screen which is rendered without applying texture mapping has high compression efficiency, and can be sent with a smaller bandwidth via a network such as the Internet. The present invention is superior to the technology discussed in the '334 Patent at least because the '334 Patent does not disclose generating motion vectors at a server based on predetermined criteria and transmitting the generated motion vectors and one or more invalidators to a client, which caches those motion vectors and invalidators. The '334 Patent further does not disclose having the server instruct the client to receive input from a user, and use that input to match to cached motion vectors or invalidators, where those vectors or invalidators are used in motion compensation. The present invention is also superior because it significantly reduces input-feedback latency by using player input motion compensation instead of waiting for the server to return output video.
U.S. Pat. No. 9,736,454 (“the '454 Patent”) discloses systems and methods for encoding comprising examining availability of a depth block co-located with a texture block, determining a prediction method for a texture block on the basis of availability of a co-located depth block, and deriving a first prediction block for the texture block on the basis of the availability of the co-located depth block. Again, the present invention is superior to the technology discussed in the '454 Patent at least because the '454 Patent does not disclose generating motion vectors at a server based on predetermined criteria and transmitting the generated motion vectors and one or more invalidators to a client, which caches those motion vectors and invalidators. The '454 Patent further does not disclose having the server instruct the client to receive input from a user, and use that input to match to cached motion vectors or invalidators, where those vectors or invalidators are used in motion compensation. The present invention is also superior because it significantly reduces input-feedback latency by using player input motion compensation instead of waiting for the server to return output video.
U.S. Pat. No. 9,705,526 (“the '526 Patent”) discloses systems and methods for entropy encoding in media and image applications. The technology discloses a system where compression begins with receiving a source of image and/or video data. A lossless compression scheme is then applied. A predictor/delta computation unit then takes the input and tries to reduce the redundancy in the input data using a delta computation between neighboring input elements. Then these values are encoded using a predefined statistical modeling in an entropy encoder to produce the compressed image and/or video data. Similar to the above, the present invention is superior to the technology discussed in the '526 Patent at least because the '526 Patent does not disclose generating motion vectors at a server based on predetermined criteria and transmitting the generated motion vectors and one or more invalidators to a client, which caches those motion vectors and invalidators. The '526 Patent further does not disclose having the server instruct the client to receive input from a user, and use that input to match to cached motion vectors or invalidators, where those vectors or invalidators are used in motion compensation. The present invention is also superior because it significantly reduces input-feedback latency by using player input motion compensation instead of waiting for the server to return output video.
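For readers unfamiliar with the delta computation attributed to the '526 Patent above, the general technique can be illustrated with a short Python sketch. This is a generic delta coder written for illustration, assuming a predictor that simply uses the previous element (with 0 before the first); it is not the '526 Patent's actual predictor, and the entropy-coding stage is omitted.

```python
def delta_encode(values):
    """Replace each element with its difference from the previous element."""
    previous = 0  # assumed predictor for the first element
    deltas = []
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences (lossless)."""
    values = []
    total = 0
    for delta in deltas:
        total += delta
        values.append(total)
    return values


row = [100, 101, 103, 103, 102]  # hypothetical neighboring pixel values
deltas = delta_encode(row)
restored = delta_decode(deltas)
```

Neighboring samples in an image tend to be similar, so most deltas are small numbers clustered near zero, which is the redundancy a later entropy encoder can exploit.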
U.S. Pat. No. 8,873,636 B2 (“the '636 Patent”) is directed to a moving-image distribution server (such as one running an online game) that provides coded image data to the user's PC, running the local instance of the game. In order to perform this process, in relevant detail, the CPU of the user's PC specifies the region to be referred to in order to decode the motion vector associated with the selected block in the preceding frame screen. It does so by referring to the motion vector associated with the selected block (a vector that is included in the preprocessing information it is provided) and extracts the image of the region as a reference image. As is the case with the other references, the present invention is superior to the technology discussed in the '636 Patent at least because the '636 Patent does not disclose generating motion vectors at a server based on predetermined criteria and transmitting the generated motion vectors and one or more invalidators to a client, which caches those motion vectors and invalidators. The '636 Patent further does not disclose having the server instruct the client to receive input from a user, and use that input to match to cached motion vectors or invalidators, where those vectors or invalidators are used in motion compensation. The present invention is also superior because it significantly reduces input-feedback latency by using player input motion compensation instead of waiting for the server to return output video.
International Patent Publication No. WO2009138878 A2 (“the '878 Publication”) is directed to processing and streaming multiple interactive applications in a centralized streaming application server, while controlling the levels of detail and post-filtering of various rendered objects. In the system, a centralized interactive application server, at its video pre-processor, performs spatial and temporal filtering on the frame sequence prior to encoding a compressed stream of audio-visual content to client devices, which decode the compressed stream and display the content. The GPU command processor of the centralized interactive application server includes a module that also computes a motion compensation estimate for each macroblock in the target encoded frame in the video encoder. Nevertheless, the present invention remains superior to the technology discussed in the '878 Publication at least because the '878 Publication does not disclose generating motion vectors at a server based on predetermined criteria and transmitting the generated motion vectors and one or more invalidators to a client, which caches those motion vectors and invalidators. The '878 Publication further does not disclose having the server instruct the client to receive input from a user, and use that input to match to cached motion vectors or invalidators, where those vectors or invalidators are used in motion compensation. The present invention is also superior because it significantly reduces input-feedback latency by using player input motion compensation instead of waiting for the server to return output video.
U.S. Pat. No. 9,358,466 B2 (“the '466 Patent”) is directed to improving videogame performance through the reuse of cached data. The systems disclosed score cache performance of different generated video game missions at least in part by identifying the digital assets used, and determining whether the identified digital assets are in a cache. Cache scores can be calculated based on a cache re-use rate corresponding to a proportion of digital assets for a mission that are already in a cache. Other techniques for generating cache scores may account for factors such as the overall size of combined digital assets for a mission that are already in a cache and/or overall size of combined digital assets for a mission that are not already in a cache. By caching data in this manner, requests for that data, and for other non-cached data, become more efficient. The present invention remains superior to the technology discussed in the '466 Patent at least because the '466 Patent does not disclose caching repetitive motion vectors, calculating a motion estimate from the input data, or updating the stored motion vector library based on the input data, so that a client can use the stored motion vector library to initiate motion prior to receiving actual motion vector data from a server.
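The cache re-use rate mentioned in the discussion of the '466 Patent is a simple proportion, and a short Python sketch may make it concrete. The function name `cache_reuse_rate`, the asset identifiers, and the choice to return 0.0 for a mission with no assets are assumptions of this sketch; the '466 Patent's actual scoring may weigh other factors such as asset size.

```python
def cache_reuse_rate(mission_assets, cached_assets):
    """Proportion of a mission's distinct digital assets already in the cache."""
    assets = set(mission_assets)
    if not assets:
        return 0.0  # assumed convention for a mission that uses no assets
    return len(assets & set(cached_assets)) / len(assets)


# Hypothetical asset identifiers for one generated mission and the current cache.
mission = ["castle_mesh", "guard_model", "torch_texture", "gate_sound"]
cache = {"castle_mesh", "torch_texture", "forest_mesh"}
score = cache_reuse_rate(mission, cache)
```

Here two of the four mission assets are already cached, so the score is one half; note that this scoring concerns game assets and does not involve motion vectors.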
U.S. Pat. No. 6,903,662 B2 (“the '662 Patent”) is directed to a configurable computer input device, specifically one that caches input data to maintain quicker response times. The system operates by checking key presses against an internal memory (or cache) to determine if the particular key has been previously identified. If the system has not communicated with the key before this point, the system may then retrieve data corresponding to the input from its key memory. The system will then update its internal memory (cache) with the key identity and its corresponding data. The system may then send the input data from the key to the host computer. However, the next time the system encounters the same key in the pressed state, it can query its own memory (cache) instead of retrieving the same scan code from the key memory again. The present invention is superior to the technology discussed in the '662 Patent at least because the '662 Patent does not disclose caching repetitive motion vectors, calculating a motion estimate from the input data, or updating the stored motion vector library based on the input data, so that a client can use the stored motion vector library to initiate motion prior to receiving actual motion vector data from a server.
Japanese Patent No. JP6129865B2 (“the '865 Patent”) discloses systems and methods for transmit[ting] the rendered game content data for the subsets of path segments to a game client for caching on the local cache so that the game content data may be available when needed during real-time game play. Again, the present invention is superior to the technology discussed in the '865 Patent at least because the '865 Patent does not disclose caching repetitive motion vectors, calculating a motion estimate from the input data, or updating the stored motion vector library based on the input data, so that a client can use the stored motion vector library to initiate motion prior to receiving actual motion vector data from a server.
U.S. Pat. No. 9,762,919 (“the '919 Patent”) discloses systems and methods for caching reference data in a block processing pipeline. The '919 Patent technology discloses a cache memory (e.g., a fully associative cache) that may be implemented, for example, in a local (to the pipeline) memory such as SRAM (static random access memory), to which portions of the chroma reference data (e.g., 64-byte memory blocks) corresponding to motion vectors determined for macroblocks at earlier stages of the pipeline may be prefetched from the memory. Chroma cache logic may maintain the cache, and may extend over multiple stages of the pipeline. Fetches for the motion vectors of a given macroblock passing through the pipeline may be initiated by the chroma cache logic one or more stages prior to the chroma motion compensation stage to provide time (i.e., multiple pipeline cycles) to read the respective memory blocks from the memory into the cache before chroma motion compensation needs them. However, the '919 Patent remains deficient compared to the present invention. The present invention is superior to the technology discussed in the '919 Patent at least because the '919 Patent does not disclose caching repetitive motion vectors, calculating a motion estimate from the input data, or updating the stored motion vector library based on the input data, so that a client can use the stored motion vector library to initiate motion prior to receiving actual motion vector data from a server.
As is apparent from the above discussion of the state of the art in this technology, there is a need in the art for an improvement to the present computer technology related to the encoding of real-time game environments.