US 12,169,769 B2
Generic quantization of artificial neural networks
Benoit Chappet de Vangel, Paris (FR); and Gabriel Gouvine, Paris (FR)
Assigned to MIPSOLOGY SAS, Palaiseau (FR)
Filed by Mipsology SAS, Palaiseau (FR)
Filed on Jan. 20, 2020, as Appl. No. 16/747,103.
Application 16/747,103 is a continuation in part of application No. PCT/IB2019/050648, filed on Jan. 26, 2019.
Application 16/747,103 is a continuation in part of application No. 16/258,552, filed on Jan. 26, 2019, granted, now 11,068,784.
Prior Publication US 2020/0242445 A1, Jul. 30, 2020
Int. Cl. G06N 3/04 (2023.01); G06N 3/08 (2023.01)
CPC G06N 3/04 (2013.01) [G06N 3/08 (2013.01)]
19 Claims
 
1. A system for performing a quantization of artificial neural networks (ANNs), the system comprising one or more processors configured to:
receive a description of an ANN and sets of inputs to a plurality of neurons of the ANN, the description including sets of weights of the inputs to the plurality of neurons of the ANN, the description being of a first data type;
determine a first interval of the first data type to be mapped to a second interval of a second data type, the second interval of the second data type being in a format readable by a hardware-based accelerator, the hardware-based accelerator comprising an application-specific integrated circuit to perform computations of the plurality of neurons of the ANN using the second data type;
(a) perform, based on the sets of inputs and the description of the ANN and using the hardware-based accelerator, computations of sums of products of the weights and the inputs to obtain a set of sum results, wherein the computations of sums are performed using at least one number of the second data type within the second interval, the at least one number being a result of mapping of at least one number of the first interval to a number of the second interval;
(b) determine, based on the set of sum results, a measure of saturations; and
(c) adjust, based on the measure of saturations, at least one of the first interval and the second interval, wherein:
the products are computed using numbers of the second interval, the numbers being a result of mapping of the inputs to the neurons and the weights for the inputs to the second interval;
the sum results are represented by the second data type; and
the determining the measure of saturations includes comparing at least one of the sum results to a function of boundaries of the second interval.
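
Steps (a) through (c) of claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation, and it rests on several assumptions that do not appear in the claim: the first interval is symmetric, [-r, r]; the second interval is the signed 8-bit range [-127, 127]; the measure of saturations is the fraction of sum results lying on a boundary of the second interval; and the adjustment doubles r. The function names, the threshold, and the rescaling of the accumulator are hypothetical, and the hardware-based accelerator is simulated in software.

```python
from typing import List, Sequence, Tuple

Q_MAX = 127  # assumption: second interval is the signed 8-bit range [-127, 127]


def to_q(x: float, r: float) -> int:
    """Map x from the first interval [-r, r] to an integer in [-Q_MAX, Q_MAX]."""
    q = round(x * Q_MAX / r)
    return max(-Q_MAX, min(Q_MAX, q))  # values outside [-r, r] are clipped


def neuron_sums(inputs: Sequence[float],
                weights: Sequence[Sequence[float]],
                r: float) -> List[int]:
    """Step (a): sums of products on quantized values, one result per neuron."""
    sums = []
    for w_row in weights:
        # Products use only numbers already mapped to the second interval.
        acc = sum(to_q(w, r) * to_q(x, r) for w, x in zip(w_row, inputs))
        # Assumed rescaling so the sum result shares the scale of the inputs.
        out = round(acc * r / Q_MAX)
        # The sum result is stored in the second data type, so it saturates.
        sums.append(max(-Q_MAX, min(Q_MAX, out)))
    return sums


def saturation_measure(sums: Sequence[int]) -> float:
    """Step (b): fraction of sum results on a boundary of the second interval."""
    if not sums:
        return 0.0
    return sum(1 for s in sums if abs(s) >= Q_MAX) / len(sums)


def calibrate(input_sets: Sequence[Sequence[float]],
              weights: Sequence[Sequence[float]],
              r: float = 1.0,
              threshold: float = 0.1,
              max_rounds: int = 16) -> Tuple[float, float]:
    """Step (c): widen the first interval until saturations are rare enough."""
    measure = 0.0
    for _ in range(max_rounds):
        all_sums = [s for x in input_sets for s in neuron_sums(x, weights, r)]
        measure = saturation_measure(all_sums)
        if measure <= threshold:
            break
        r *= 2.0  # assumed adjustment rule: double the first interval
    return r, measure
```

For example, with two neurons whose weights are [0.5, 0.5] and [1.0, -1.0] and the input sets [1.0, 1.0] and [0.9, -0.9], starting from r = 1.0 half of the sum results saturate, so the sketch widens the first interval once and stops at r = 2.0. Doubling is only one possible adjustment; the claim equally covers adjusting the second interval instead.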