Many video games involve interaction between one or more human-controlled characters and one or more computer-controlled agents. Such computer-controlled agents can play the roles of opponents and/or teammates of the human-controlled characters. For example, a video soccer game may involve one or more human-controlled players (opponents or teammates) playing with or against players (opponents and teammates) that are controlled by a computing device, such as a game console.
In existing approaches, a computer-controlled agent is typically implemented according to a fixed set of state-action policies, which control the agent's reaction to a given state in the video game. For example, when a human-controlled player shoots a soccer ball to the computer-controlled goalkeeper's right, the policy may cause the goalkeeper to dive to the right in an attempt to stop the shot. Depending on the state of the game, potentially including the timing, speed, and direction of the shot; the positions and movements of the shooting player, the goalkeeper, and other players; as well as other parameters, a variety of new game states may result (e.g., the goalkeeper blocks the shot, the shot scores, the shot misses, or another player deflects or blocks the shot). The state-action policies controlling this behavior may be a static set of policies fixed at development time, such that a predefined action executes in response to a given set of state conditions.
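As a rough illustration of such a fixed state-action policy, the following minimal Python sketch maps a simplified game state to one predefined goalkeeper action. The state encoding (shot direction and height), the action names, and the `goalkeeper_action` helper are all hypothetical and chosen only for illustration; an actual game would use a far richer state.

```python
from typing import Dict, Tuple

# Hypothetical, simplified game state: (shot direction, shot height).
State = Tuple[str, str]

# Static table fixed at development time: each state maps to exactly one action.
FIXED_POLICY: Dict[State, str] = {
    ("left", "low"): "dive_left_low",
    ("left", "high"): "dive_left_high",
    ("right", "low"): "dive_right_low",
    ("right", "high"): "dive_right_high",
    ("center", "low"): "stay_and_crouch",
    ("center", "high"): "stay_and_jump",
}


def goalkeeper_action(state: State) -> str:
    """Return the predefined action for a state, or a default if unmapped."""
    return FIXED_POLICY.get(state, "hold_position")
```

Under these assumptions, a shot to the goalkeeper's right always yields the same dive, no matter how often the human player repeats that shot.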
Alternatively, a random or probabilistic selection may be made from a defined group of actions associated with a given set of state conditions, to provide a more appealing variety in game play. In this approach, the computer-controlled agent may react to a given game state with one action in one instance and react to the same game state with a different action in another instance. Nevertheless, because the predefined state-action mappings, whether purely static or randomly/probabilistically selected from a static set of options, are defined at development time, the computer-controlled agents are unable to adapt to changing circumstances after development time, particularly to changes in the human player's behavior during game play.
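The probabilistic variant can be sketched in the same way. In the hypothetical Python example below, each state maps to a weighted list of candidate actions and one is sampled per decision; the state names, action names, weights, and the `sample_action` helper are illustrative assumptions, not values taken from any particular game.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical option table fixed at development time:
# each state maps to candidate actions with selection weights.
ACTION_OPTIONS: Dict[str, List[Tuple[str, float]]] = {
    "shot_right": [("dive_right", 0.7), ("step_right", 0.2), ("hold_position", 0.1)],
    "shot_left": [("dive_left", 0.7), ("step_left", 0.2), ("hold_position", 0.1)],
}


def sample_action(state: str, rng: random.Random) -> str:
    """Sample one action for the state according to its fixed weights."""
    options = ACTION_OPTIONS[state]
    actions = [action for action, _ in options]
    weights = [weight for _, weight in options]
    return rng.choices(actions, weights=weights, k=1)[0]
```

Repeated calls for the same state can return different actions, but the candidate set and the weights never change at run time, so the agent's behavior still cannot shift in response to how the human player actually plays.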