Users of stand-alone or hosted software programs often encounter issues in installing, configuring, or executing the software programs. With the advent of software as a service (SaaS) and subscription-based software delivery models, users may have additional issues with, for example, subscriptions, accounts, renewals, etc. Conventional approaches address these issues with statically coded responses, such as online or offline documentation or help files, that often fall short of answering the exact issues or inquiries that hamper the users' experiences with or uses of the software applications.
These conventional approaches fall short for many reasons, such as the varied and often unpredictable ways in which different users express or describe even the same inquiry. For example, different users may use different vocabularies or different expressions (e.g., complete sentences, incomplete sentences, phrases, colloquialisms, slang, one or more words, etc.) in free text or natural language to describe an inquiry. Such a variety of expressions and vocabularies renders these conventional, statically coded software help systems inadequate at best, and makes it difficult, if not entirely impossible, for them to replace live support personnel in responding to users' inquiries.
Moreover, statically coded support engineering systems have difficulties not only in understanding, and hence addressing, users' inquiries but also in providing adequate or accurate recommendations in response to those inquiries, due to their limited coverage of the great variety of possible ways of expressing these inquiries. In addition, even dynamically coded support engineering systems have difficulties in understanding terms that are not covered by or contained in the existing data sets. As a result, conventional approaches, even when deployed to replace live support personnel, often fail to provide satisfactory user experiences and leave much to be desired in terms of accuracy and hence usefulness.
Data classification and data clustering have been applied to data sets, using variables and their known values, to predict (e.g., data classification) and describe (e.g., data clustering) data for various purposes. Conventional data classification and data clustering techniques seeking better accuracy in the description and prediction of data often employ iterative processes driven by complex classifier or clustering algorithms.
These conventional data classification and clustering techniques often attain better accuracy at the expense of speed and computational resource utilization. They are often performed in a batch process that is run overnight due to the complexity of the computation involved. Other data classification and clustering techniques trade accuracy for speed and resource utilization and often fall short on the accuracy of their descriptions and predictions and hence on the usefulness of these conventional approaches.
In both approaches, encountering terms that are not covered by or contained in the data sets upon which the classification or clustering engines are built often yields no classification or no cluster. Such terms may only be captured after the classification or clustering engines are adjusted to accommodate such new terms. Nonetheless, such adjustments may require modification to the source code, re-compilation of the source code, etc. before the modified classification or clustering engines may be placed in service. Any attempt to deploy such modified engines to the Web often demands much manual effort to convert the code into an interpreted runtime language. Further exacerbating these problems, the aforementioned deficiencies of these conventional classification and clustering approaches not only negatively affect the classification or clustering of data but also impede any subsequent actions that rely on the results of classification or clustering.
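By way of illustration only, the out-of-vocabulary problem described above may be sketched with a minimal keyword-vote classifier. The vocabulary, the category names, and the function name `classify` are hypothetical and are assumed solely for this example; they do not represent any particular conventional engine.

```python
import re
from collections import Counter
from typing import Dict, Optional

# Hypothetical, statically coded vocabulary: term -> category.
# Assumed for illustration; a real engine would be built from a larger data set.
VOCABULARY: Dict[str, str] = {
    "install": "installation",
    "setup": "installation",
    "renew": "subscription",
    "billing": "subscription",
    "login": "account",
    "password": "account",
}


def classify(inquiry: str, vocabulary: Dict[str, str] = VOCABULARY) -> Optional[str]:
    """Return the category with the most matching terms, or None if no term matches."""
    tokens = re.findall(r"[a-z]+", inquiry.lower())
    votes = Counter(vocabulary[t] for t in tokens if t in vocabulary)
    if not votes:
        # No term of the inquiry is covered by the vocabulary: no classification.
        return None
    return votes.most_common(1)[0][0]


# An inquiry using a covered term is classified.
covered = classify("I cannot renew my plan")

# An inquiry using only uncovered terms yields no classification.
uncovered = classify("my licence dongle is borked")

# The uncovered term is captured only after the vocabulary itself is adjusted.
adjusted = classify(
    "my licence dongle is borked",
    {**VOCABULARY, "dongle": "installation"},
)
```

In this sketch, the second inquiry is left unclassified until the vocabulary is edited, which mirrors the need to modify and redeploy an engine before new terms can be handled.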
Therefore, there exists a need for a method, system, and computer program product for classifying digital data using real-time computing techniques to address at least the aforementioned shortcomings of conventional approaches.