CPC G06N 20/00 (2019.01) [G06F 11/3409 (2013.01); G06F 16/2471 (2019.01); G06N 5/043 (2013.01); H04L 51/02 (2013.01)] | 18 Claims |
1. A method comprising:
receiving, by a query serving system, a request to serve a query for a skillbot, wherein the query serving system comprises: (i) a first plurality of deployments in a serving pool, and (ii) a second plurality of deployments in a free pool, wherein the first plurality of deployments in the serving pool and the second plurality of deployments in the free pool are maintained in a pool of deployments that is different than a training pool of the query serving system;
determining, by the query serving system, whether a first deployment from the first plurality of deployments in the serving pool can serve the query based on an identifier of the skillbot; and
in response to determining that the first deployment cannot serve the query,
selecting, by the query serving system, a second deployment from the second plurality of deployments in the free pool to be assigned to the skillbot;
loading, by the query serving system, a machine-learning model associated with the skillbot into the second deployment, wherein the machine-learning model is trained to serve the query for the skillbot;
deleting a third deployment from the first plurality of deployments in the serving pool;
transferring the second deployment from the second plurality of deployments in the free pool to the first plurality of deployments in the serving pool;
serving, by the query serving system, the query using the machine-learning model loaded into the second deployment; and
constructing a new deployment to be added to the free pool maintained in the pool of deployments.
|
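The sequence of steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Every name here (`Deployment`, `QueryServingSystem`, `model_store`, `serve`) is hypothetical, models are stood in for by plain Python callables, and evicting the oldest serving deployment is an assumption, since the claim does not say how the third deployment is chosen. The training pool is not modeled, because the claim only requires that it be separate from the serving and free pools.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Model = Callable[[str], str]  # stand-in for a trained machine-learning model


@dataclass
class Deployment:
    name: str
    skillbot_id: Optional[str] = None  # None while the deployment sits in the free pool
    model: Optional[Model] = None


@dataclass
class QueryServingSystem:
    serving_pool: List[Deployment]   # deployments already assigned to skillbots
    free_pool: List[Deployment]      # pre-built, unassigned deployments
    model_store: Dict[str, Model]    # hypothetical lookup: skillbot id -> model
    _built: int = 0                  # counter used only to name new deployments

    def _find_serving(self, skillbot_id: str) -> Optional[Deployment]:
        # Decide whether a serving deployment can serve the query,
        # based on the skillbot identifier.
        for deployment in self.serving_pool:
            if deployment.skillbot_id == skillbot_id:
                return deployment
        return None

    def serve(self, skillbot_id: str, query: str) -> str:
        deployment = self._find_serving(skillbot_id)
        if deployment is not None:
            return deployment.model(query)

        # No serving deployment matches: select one from the free pool.
        # (Raises IndexError if the free pool is empty; not handled in this sketch.)
        deployment = self.free_pool.pop(0)
        deployment.skillbot_id = skillbot_id
        # Load the skillbot's model into the selected deployment.
        deployment.model = self.model_store[skillbot_id]
        # Delete another deployment from the serving pool.
        # Assumption: evict the oldest one; the claim does not fix a policy.
        if self.serving_pool:
            self.serving_pool.pop(0)
        # Transfer the selected deployment into the serving pool.
        self.serving_pool.append(deployment)
        # Serve the query with the freshly loaded model.
        result = deployment.model(query)
        # Construct a new deployment to replenish the free pool.
        self._built += 1
        self.free_pool.append(Deployment(name=f"new-{self._built}"))
        return result
```

In this sketch, a query for a skillbot with no serving deployment leaves both pool sizes unchanged: one serving deployment is evicted and replaced by the promoted one, and the free pool is topped up with a newly constructed deployment.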