US 12,169,844 B2
Methods and systems for identifying breakpoints in variable impact on model results
Or Herman-Saffar, Ofakim (IL); Amihai Savir, Sansana (IL); Anat Parush-Tzur, Beit Kama (IL); John Lawrence Dalton, Austin, TX (US); and Alana Brook Marcum Barker, Austin, TX (US)
Assigned to EMC IP HOLDING COMPANY LLC, Hopkinton, MA (US)
Filed by EMC IP Holding Company LLC, Hopkinton, MA (US)
Filed on Jun. 30, 2021, as Appl. No. 17/363,692.
Prior Publication US 2023/0004991 A1, Jan. 5, 2023
Int. Cl. G06Q 30/0201 (2023.01); G06F 18/214 (2023.01); G06N 20/00 (2019.01); G06Q 30/0202 (2023.01)
CPC G06Q 30/0201 (2013.01) [G06F 18/214 (2023.01); G06N 20/00 (2019.01); G06Q 30/0202 (2013.01)]
15 Claims
 
1. A method for identifying breakpoints, the method comprising:
obtaining, by a breakpoint identification device, an entity group data set corresponding to an entity group;
generating, using the entity group data set, an enhanced entity group data set comprising the entity group data set and a derived data item, wherein each enhanced entity of the enhanced entity group data set comprises a set of input variables;
performing a clustering analysis using at least a portion of the enhanced entity group data set to obtain a plurality of entity clusters;
training a machine learning (ML) model using the enhanced entity group data set to obtain a trained ML model;
using the trained ML model to determine a relative importance of the set of input variables for the trained ML model;
reducing a quantity of input variables by:
removing at least one input variable of the set of input variables based on the relative importance to obtain a first reduced set of input variables;
training the ML model using the first reduced set of input variables to obtain a first reduced trained ML model;
making a first determination that the first reduced trained ML model has a first prediction accuracy above a threshold;
removing at least one input variable of the first reduced set of input variables based on the relative importance to obtain a second reduced set of input variables;
training the first reduced ML model using the second reduced set of input variables to obtain a second reduced trained ML model; and
making a second determination that the second reduced trained ML model has a second prediction accuracy above the threshold;
generating, using the enhanced entity group data set, a set of simulated entities,
wherein each simulated entity of the set of simulated entities comprises a modified variable of the set of important input variables;
inputting the set of simulated entities into the reduced trained ML model to perform a breakpoint analysis;
generating a breakpoint graph based on the breakpoint analysis; and
providing recommendations to an interested entity based on a breakpoint identified using the breakpoint graph.
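
The variable-reduction, simulation, and breakpoint steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (`reduce_variables`, `sweep`, `find_breakpoint`, `toy_train`), the variable names, the 0.85 threshold, and the accuracy figures are all hypothetical. The importance ranking is taken as an input rather than computed, clustering is omitted, and reading the breakpoint as the value where consecutive predictions differ most is an assumption, since the claim does not say how a breakpoint is identified from the graph.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, float]
Model = Callable[[Row], float]
Trainer = Callable[[List[str]], Tuple[Model, float]]


def reduce_variables(train: Trainer, ranked: List[str],
                     threshold: float) -> Tuple[List[str], Model]:
    """Drop the least important variable while accuracy stays above threshold.

    `ranked` lists variables from most to least important (assumed given).
    `train(variables)` returns (model, accuracy) for a model fitted on
    those variables only.
    """
    kept = list(ranked)
    model, _ = train(kept)
    while len(kept) > 1:
        candidate = kept[:-1]              # remove the least important variable
        cand_model, accuracy = train(candidate)
        if accuracy <= threshold:          # reduced model is no longer good enough
            break
        kept, model = candidate, cand_model
    return kept, model


def sweep(model: Model, base: Row, variable: str,
          values: Sequence[float]) -> List[Tuple[float, float]]:
    """Build simulated entities (copies of `base` with one variable modified)
    and return (value, prediction) pairs, i.e. the points of a breakpoint graph."""
    return [(v, model({**base, variable: v})) for v in values]


def find_breakpoint(curve: List[Tuple[float, float]]) -> float:
    """Return the variable value reached at the largest jump in prediction
    between consecutive points (an assumed definition of 'breakpoint')."""
    jumps = [(abs(curve[i + 1][1] - curve[i][1]), curve[i + 1][0])
             for i in range(len(curve) - 1)]
    return max(jumps)[1]


def toy_train(variables: List[str]) -> Tuple[Model, float]:
    """Hypothetical stand-in for real training: accuracy depends only on how
    many variables remain, and the model is a fixed step function."""
    accuracy = {4: 0.93, 3: 0.92, 2: 0.90, 1: 0.70}[len(variables)]

    def model(row: Row) -> float:
        # Predicted outcome jumps once the discount reaches 0.15.
        return 0.8 if row.get("discount", 0.0) >= 0.15 else 0.3

    return model, accuracy


ranked = ["discount", "tenure", "order_count", "region_code"]
kept, reduced_model = reduce_variables(toy_train, ranked, threshold=0.85)

base_entity = {"discount": 0.0, "tenure": 24.0}
curve = sweep(reduced_model, base_entity, "discount",
              [0.0, 0.05, 0.10, 0.15, 0.20])
bp = find_breakpoint(curve)
print(kept, bp)
```

With these toy values, two removals pass the accuracy check (`region_code`, then `order_count`), mirroring the first and second determinations in the claim, and the third removal is rejected. The sweep over `discount` then places the breakpoint at 0.15, where the toy model's prediction steps from 0.3 to 0.8.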