US 12,169,685 B2
Computer-implemented method of preparing a training dataset for a natural language processing or natural language understanding machine learning algorithm
Simon Hegelich, Munich (DE); and Kolja Hegelich, Dorsten (DE)
Filed by Simon Hegelich, Munich (DE); and Kolja Hegelich, Dorsten (DE)
Filed on Oct. 13, 2021, as Appl. No. 17/450,795.
Claims priority of application No. 21172292 (EP), filed on May 5, 2021.
Prior Publication US 2022/0358282 A1, Nov. 10, 2022
Int. Cl. G06F 40/166 (2020.01); G06F 40/247 (2020.01); G06F 40/253 (2020.01); G06N 5/022 (2023.01)
CPC G06F 40/166 (2020.01) [G06F 40/247 (2020.01); G06F 40/253 (2020.01); G06N 5/022 (2013.01)]
20 Claims
 
1. A computer-implemented method comprising:
selecting, by one or more processors, one or more sentences from an original text dataset as one or more selected sentences;
determining, by the one or more processors, for each selected sentence one or more grammatical elements of the selected sentence that can be negated as one or more negatable elements;
determining, by the one or more processors, for one or more negatable words in each negatable element one or more antonyms;
based on each determined antonym, creating, by the one or more processors, a negated sentence by replacing the respective negatable word in the selected sentence for which the antonym was determined with the determined antonym; and
adding, by the one or more processors, the negated sentence to a training dataset, the training dataset configured to be used in a natural language processing (NLP) machine learning algorithm.
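
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the antonym lexicon `ANTONYMS` and the function names `negatable_words`, `negated_sentences`, and `build_training_dataset` are hypothetical. The sketch also makes a simplifying assumption: instead of a grammatical analysis to find negatable elements, it treats any word found in the toy lexicon as a negatable word.

```python
import re

# Hypothetical toy antonym lexicon (an assumption for illustration only).
# A real system would likely consult a lexical database instead.
ANTONYMS = {
    "good": ["bad"],
    "hot": ["cold"],
    "rises": ["falls"],
}


def negatable_words(sentence):
    """Return (start, end, word) spans for words that have an antonym.

    Simplification: the grammatical analysis of negatable elements is
    replaced here by a plain lexicon lookup.
    """
    return [
        (m.start(), m.end(), m.group())
        for m in re.finditer(r"[A-Za-z]+", sentence)
        if m.group().lower() in ANTONYMS
    ]


def negated_sentences(sentence):
    """Create one negated sentence per (negatable word, antonym) pair."""
    results = []
    for start, end, word in negatable_words(sentence):
        for antonym in ANTONYMS[word.lower()]:
            # Keep the capitalization of the replaced word.
            replacement = antonym.capitalize() if word[0].isupper() else antonym
            results.append(sentence[:start] + replacement + sentence[end:])
    return results


def build_training_dataset(original_dataset):
    """Select every sentence and collect its negated variants."""
    training_dataset = []
    for sentence in original_dataset:
        training_dataset.extend(negated_sentences(sentence))
    return training_dataset


if __name__ == "__main__":
    for line in build_training_dataset(["The price rises.", "Hot tea is good."]):
        print(line)
```

Each antonym yields its own negated sentence, so a sentence with two negatable words produces two variants, each differing from the original in exactly one word.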