The background description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
All publications herein are incorporated by reference to the same extent as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference. Where a definition or use of a term in an incorporated reference is inconsistent or contrary to the definition of that term provided herein, the definition of that term provided herein applies and the definition of that term in the reference does not apply.
Many computer systems collect, aggregate, and process data in order to perform tasks and run analytics. There has been, and will likely continue to be, a significant increase in the volume and variety of data available to organizations from various disparate sources. The term “Big Data” is often used to describe this trend. Organizations often seek ways to use such data in order to gain insight, improve performance, and develop predictive models. Efficiently using data from disparate sources often requires combining and transforming the data into a single dataset before processing the data. However, it may be difficult to determine the most relevant data sources and attributes and how these need to be transformed to be most useful. Therefore, it would be beneficial for a system to recommend relevant data attributes and transformations to the user.
U.S. Pat. No. 8,775,473 to Anzalone teaches a data processing system that aggregates data from two different data repositories to create a multidimensional data structure. Anzalone's system allows a client user to select attributes to be analyzed and modeled. An analytic recommendation processor then suggests additional available attributes based upon past response rates of other users who also selected such attributes. Anzalone's system, however, is unable to predict new attributes to suggest when past users have not selected the new attributes, nor can it rank the suggested attributes and associated transformations based on the similarity of the current selection to prior selections.
US 2015/0026153 to Gupta teaches a search engine that generates relational database queries among a plurality of databases. When a user enters a search term, such as “revenue,” a state machine looks for related attributes and measures to suggest, such as “state,” “city,” or “tax.” Gupta, however, requires an administrator of the system to pre-program the state machine with relational data that identifies the attributes and measures related to the user's search term. Gupta is unable to predict new related attributes to suggest when past users have not selected the new attributes, nor can Gupta's system rank the suggested attributes and associated transformations based on the similarity of the current selection to prior selections.
Thus, there remains a need for an improved system and method that suggests and ranks unselected relevant attributes and associated transformations.