The present invention relates generally to a method, system, and computer program for mining procedure dialogs from source content containing linguistic data. More particularly, the present invention relates to a method, system, and computer program for embedding linguistic data of source content to identify and generate procedure flows.
Data embedding is a process that maps segments of linguistic data, e.g., words and/or sentences, to vectors of real numbers. Data embedding enables the prediction of a given data segment from the data segments that surround it, based on the relationships among the vectors to which those segments are mapped. Data embedding also enables the prediction of surrounding data segments from a single data segment, based on the same vector relationships.
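By way of non-limiting illustration, the two prediction directions described above may be sketched as follows. The sketch is a toy under stated assumptions: the vocabulary, the two-dimensional vectors, and the function names `embed`, `predict_center`, and `predict_context` are hypothetical and hand-assigned for readability, whereas a practical embedding would typically be learned from a corpus. Cosine similarity is used here as one possible measure of the relationship between vectors.

```python
import math

# Hypothetical, hand-assigned 2-D embeddings (assumption for illustration only).
# A practical system would typically learn such vectors from a corpus.
EMBEDDINGS = {
    "press":   (1.0, 0.1),
    "button":  (0.9, 0.2),
    "power":   (0.8, 0.3),
    "restart": (0.1, 1.0),
    "router":  (0.2, 0.9),
}


def embed(segment):
    """Map a data segment (here, a word) to its vector of real numbers."""
    return EMBEDDINGS[segment]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def predict_center(context):
    """Predict a segment from the segments that surround it.

    Averages the context vectors and returns the closest segment
    that is not itself part of the context.
    """
    vectors = [embed(segment) for segment in context]
    mean = tuple(sum(dim) / len(vectors) for dim in zip(*vectors))
    candidates = [s for s in EMBEDDINGS if s not in context]
    return max(candidates, key=lambda s: cosine(mean, embed(s)))


def predict_context(segment, k=2):
    """Predict the k most likely surrounding segments from a single segment."""
    target = embed(segment)
    others = [s for s in EMBEDDINGS if s != segment]
    others.sort(key=lambda s: cosine(target, embed(s)), reverse=True)
    return others[:k]


if __name__ == "__main__":
    print(predict_center(["press", "button"]))
    print(predict_context("restart", k=1))
```

In this sketch, both directions rely on the same vector relationships: segments whose vectors point in similar directions are treated as likely to occur near one another.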