US 12,169,763 B2
Fast and scalable multi-tenant serve pool for chatbots
Vishal Vishnoi, Redwood Shores, CA (US); Suman Mallapura Somasundar, Sunnyvale, CA (US); Xin Xu, San Jose, CA (US); and Stevan Malesevic, Glen Ellyn, IL (US)
Assigned to Oracle International Corporation, Redwood Shores, CA (US)
Filed by Oracle International Corporation, Redwood Shores, CA (US)
Filed on Apr. 13, 2021, as Appl. No. 17/229,228.
Claims priority of provisional application 63/139,723, filed on Jan. 20, 2021.
Claims priority of provisional application 63/009,118, filed on Apr. 13, 2020.
Prior Publication US 2021/0319347 A1, Oct. 14, 2021
Int. Cl. G06N 20/00 (2019.01); G06F 11/34 (2006.01); G06F 16/2458 (2019.01); G06N 5/043 (2023.01); H04L 51/02 (2022.01)
CPC G06N 20/00 (2019.01) [G06F 11/3409 (2013.01); G06F 16/2471 (2019.01); G06N 5/043 (2013.01); H04L 51/02 (2013.01)]
18 Claims
1. A method comprising:
receiving, by a query serving system, a request to serve a query for a skillbot, wherein the query serving system comprises: (i) a first plurality of deployments in a serving pool, and (ii) a second plurality of deployments in a free pool, wherein the first plurality of deployments in the serving pool and the second plurality of deployments in the free pool are maintained in a pool of deployments that is different than a training pool of the query serving system;
determining, by the query serving system, whether a first deployment from the first plurality of deployments in the serving pool can serve the query based on an identifier of the skillbot; and
in response to determining that the first deployment cannot serve the query,
selecting, by the query serving system, a second deployment from the second plurality of deployments in the free pool to be assigned to the skillbot;
loading, by the query serving system, a machine-learning model associated with the skillbot into the second deployment, wherein the machine-learning model is trained to serve the query for the skillbot;
deleting a third deployment from the first plurality of deployments in the serving pool;
transferring the second deployment from the second plurality of deployments in the free pool to the first plurality of deployments in the serving pool;
serving, by the query serving system, the query using the machine-learning model loaded into the second deployment; and
constructing a new deployment to be added to the free pool maintained in the pool of deployments.
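
The steps recited in claim 1 can be sketched in Python as follows. This is an illustrative sketch only, not the patented implementation. The names `Deployment`, `QueryServingSystem`, `model_store`, `serving_capacity`, and `free_size` are hypothetical. The claim recites only that a third deployment is deleted from the serving pool; choosing the least recently used deployment, and deleting only when the serving pool is at capacity, are assumptions made here for illustration. Models are stood in for by plain callables, and the training pool is not modeled.

```python
from collections import OrderedDict, deque
from dataclasses import dataclass
from itertools import count
from typing import Callable, Dict, Optional


@dataclass
class Deployment:
    """Hypothetical stand-in for one deployment that can host one model."""
    deployment_id: int
    skillbot_id: Optional[str] = None
    model: Optional[Callable[[str], str]] = None


class QueryServingSystem:
    """Sketch of a serving pool backed by a pre-built free pool."""

    def __init__(self, model_store: Dict[str, Callable[[str], str]],
                 serving_capacity: int, free_size: int) -> None:
        # Assumes free_size >= 1 so a cache miss always finds a free deployment.
        self._ids = count(1)
        self.model_store = model_store            # skillbot id -> trained model
        self.serving_capacity = serving_capacity  # assumed cap on the serving pool
        # Serving pool keyed by skillbot identifier, ordered oldest-used first.
        self.serving_pool: "OrderedDict[str, Deployment]" = OrderedDict()
        # Free pool of empty deployments built ahead of time.
        self.free_pool = deque(self._new_deployment() for _ in range(free_size))

    def _new_deployment(self) -> Deployment:
        return Deployment(next(self._ids))

    def serve(self, skillbot_id: str, query: str) -> str:
        # Determine, from the skillbot identifier, whether a serving
        # deployment can already handle the query.
        deployment = self.serving_pool.get(skillbot_id)
        if deployment is not None:
            self.serving_pool.move_to_end(skillbot_id)  # mark as recently used
            return deployment.model(query)

        # Select a deployment from the free pool and assign it to the skillbot.
        deployment = self.free_pool.popleft()
        deployment.skillbot_id = skillbot_id
        # Load the skillbot's model into the selected deployment.
        deployment.model = self.model_store[skillbot_id]
        # Delete a deployment from the serving pool
        # (assumption: least recently used, and only when at capacity).
        if len(self.serving_pool) >= self.serving_capacity:
            self.serving_pool.popitem(last=False)
        # Transfer the selected deployment into the serving pool.
        self.serving_pool[skillbot_id] = deployment
        # Serve the query with the newly loaded model.
        result = deployment.model(query)
        # Construct a new deployment to replenish the free pool.
        self.free_pool.append(self._new_deployment())
        return result
```

In this sketch a request for a skillbot that is already in the serving pool is answered directly, while a miss consumes one free deployment and immediately constructs a replacement, so the free pool keeps its size between requests.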