US 12,169,779 B2
Parameter-efficient multi-task and transfer learning
Mark Sandler, Mountain View, CA (US); Andrew Gerald Howard, Culver City, CA (US); Andrey Zhmoginov, Mountain View, CA (US); and Pramod Kaushik Mudrakarta, Chicago, IL (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on May 2, 2023, as Appl. No. 18/310,638.
Application 18/310,638 is a continuation of application No. 16/577,698, filed on Sep. 20, 2019, granted, now 11,676,008.
Claims priority of provisional application 62/737,763, filed on Sep. 27, 2018.
Prior Publication US 2023/0267330 A1, Aug. 24, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 15/16 (2006.01); G06N 3/045 (2023.01); G06N 3/08 (2023.01)
CPC G06N 3/08 (2013.01) [G06N 3/045 (2023.01); G10L 15/16 (2013.01)]
17 Claims
1. A computer-implemented method, the method comprising:
obtaining, by one or more computing devices, a convolutional machine-learned model that has been previously trained on a first training dataset to perform a first task, the convolutional machine-learned model including one or more convolutional filters and a first set of learnable parameters;
modifying, by the one or more computing devices, the convolutional machine-learned model to include a model patch, the model patch including a second set of learnable parameters, wherein modifying, by the one or more computing devices, the convolutional machine-learned model to include the model patch comprises replacing, by the one or more computing devices, at least one of the one or more convolutional filters with a reduced-parameter version of the convolutional filter; and
after modifying the convolutional machine-learned model to include the model patch, training, by the one or more computing devices, the convolutional machine-learned model on a second training dataset to perform a second task that is different from the first task, wherein training, by the one or more computing devices, the convolutional machine-learned model on the second training dataset to perform the second task comprises learning new values for the second set of learnable parameters included in the model patch.
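The three steps recited in claim 1 (obtain a previously trained convolutional model, replace a convolutional filter with a reduced-parameter version as a model patch, then train on a second task by learning values for the patch parameters) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the claim: the depthwise-separable factorization is only one possible reduced-parameter version; the names `add_model_patch`, `train_patch`, `patch_depthwise` and `patch_pointwise` are hypothetical; random arrays stand in for the previously trained weights and for the second training dataset; all non-patch parameters are held fixed; and numerical gradients are used only to keep the example dependency-free.

```python
import numpy as np

def conv2d(x, w):
    """Full 'valid' convolution. x: (H, W, Cin); w: (k, k, Cin, Cout)."""
    k = w.shape[0]
    out_h, out_w = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.zeros((out_h, out_w, w.shape[3]))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.tensordot(x[i:i + k, j:j + k, :], w, axes=3)
    return out

def separable_conv2d(x, depthwise, pointwise):
    """Depthwise (k, k, Cin) filter followed by a 1x1 pointwise (Cin, Cout) mix."""
    k = depthwise.shape[0]
    out_h, out_w = x.shape[0] - k + 1, x.shape[1] - k + 1
    out = np.zeros((out_h, out_w, pointwise.shape[1]))
    for i in range(out_h):
        for j in range(out_w):
            per_channel = (x[i:i + k, j:j + k, :] * depthwise).sum(axis=(0, 1))
            out[i, j] = per_channel @ pointwise
    return out

def predict(params, x):
    """Conv layer -> global average pool -> linear head (scalar output)."""
    if "conv" in params:
        feat = conv2d(x, params["conv"])
    else:
        feat = separable_conv2d(x, params["patch_depthwise"], params["patch_pointwise"])
    return feat.mean(axis=(0, 1)) @ params["head"]

def loss(params, xs, ys):
    """Mean squared error over a small dataset."""
    return float(np.mean([(predict(params, x) - y) ** 2 for x, y in zip(xs, ys)]))

def add_model_patch(params, rng):
    """Replace the full conv filter with a depthwise-separable (reduced-parameter) one."""
    k, _, c_in, c_out = params["conv"].shape
    patched = {name: v.copy() for name, v in params.items() if name != "conv"}
    patched["patch_depthwise"] = 0.1 * rng.standard_normal((k, k, c_in))
    patched["patch_pointwise"] = 0.1 * rng.standard_normal((c_in, c_out))
    return patched

def train_patch(params, xs, ys, patch_names, lr=0.05, steps=5, eps=1e-5):
    """Gradient descent that updates only the parameters named in patch_names."""
    for _ in range(steps):
        grads = {}
        for name in patch_names:
            g = np.zeros_like(params[name])
            for idx in np.ndindex(g.shape):
                old = params[name][idx]
                params[name][idx] = old + eps
                up = loss(params, xs, ys)
                params[name][idx] = old - eps
                down = loss(params, xs, ys)
                params[name][idx] = old  # restore before the next entry
                g[idx] = (up - down) / (2 * eps)  # central difference
            grads[name] = g
        for name in patch_names:
            params[name] -= lr * grads[name]
    return params

rng = np.random.default_rng(0)
# Stand-in for weights previously trained on the first task (random here).
pretrained = {"conv": rng.standard_normal((3, 3, 4, 8)),
              "head": rng.standard_normal(8)}

patched = add_model_patch(pretrained, rng)
head_before = patched["head"].copy()
pointwise_before = patched["patch_pointwise"].copy()

# Toy stand-in for the second training dataset.
xs = [rng.standard_normal((6, 6, 4)) for _ in range(4)]
ys = [1.0, -1.0, 0.5, 0.0]
patch_names = ["patch_depthwise", "patch_pointwise"]
train_patch(patched, xs, ys, patch_names)

full_count = pretrained["conv"].size                      # 3*3*4*8
patch_count = sum(patched[n].size for n in patch_names)   # 3*3*4 + 4*8
print(full_count, patch_count)
```

With the shapes chosen here, the full filter holds 288 weights while the depthwise-separable replacement holds 68, so the second task is learned through a much smaller set of parameters. The linear head is left at its original values because `train_patch` only touches the names passed in `patch_names`.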