An embodiment relates generally to data mining of warranty service repair data.
Typical text mining tools perform searches using simple search criteria, such as single-term searches. Many current text mining tools cannot handle poorly written sentences or unstructured service repair data containing different types of noise, such as abbreviated service repair information, incomplete service repair text, and misspellings. Furthermore, existing tools cannot identify anomalous cases in the field data, for example by comparing a respective labor code description (which consists of the ‘name of a part to be fixed’ and a ‘repair action to be taken’ for fixing the fault associated with the part) with a respective reported labor code to identify mismatches. Therefore, for a search that requires more than a single term, there is no guarantee that the searched terms found in the service repair verbatim have a precise relationship to one another. Moreover, unless the exact terms searched appear in each of the different sets of documents, clustering of service repair technician verbatims (i.e., documents) to identify frequently failing parts, together with the symptoms associated with these parts and the repair actions taken by the technicians to fix the fault, may be incomplete. This would result in a data representation in which such patterns are not observable by the subject matter expert who is mining the data and attempting to make appropriate decisions.