The invention relates generally to systems and methods for analyzing data. More specifically, the invention relates to systems and methods (collectively “the system”) for automatically identifying an “event” in a data file.
Advances in information storage technology provide individuals, businesses, universities, think tanks, government agencies, research institutions, hospitals, and other types of organizations and entities with entirely new sets of challenges and opportunities. As the cost of data storage decreases, the amount of data being stored increases. However, the ability to effectively and efficiently analyze data has not kept up with the technology of capturing and storing data. It would be desirable for analysis tools to possess user-friendly interfaces that are easy to use, and yet are also computationally robust and comprehensive. It would be desirable for an analysis tool to be both highly automated and highly configurable.
The abundance of data provides an as yet untapped opportunity to look for patterns and to perform various statistical analyses relating to those patterns. The undiscovered patterns in existing data could be the source of future insights in engineering, economics, medicine, computer science, business, and other fields. The ability to identify and explore statistical correlations and other data relationships can be the key to effective problem solving and optimization. One significant barrier to harvesting such analysis is the inability to find the desired data when it is needed. It would be desirable for a data analysis system to include some type of search tool to better facilitate access to meaningful data. It would be desirable for such a search tool to include the ability to perform statistical correlations in identifying particular events or patterns.
Pursuing data analysis from the ground up is a difficult task. Persons with subject matter expertise often do not have an educational background in statistics, and persons with a background in statistics often do not have sufficient knowledge of the relevant subject matter. It would be desirable if a data analysis system could encapsulate patterns of data in the form of events. Persons with subject matter expertise could then define the events that are of interest from a subject matter perspective, and apply automated statistical tools to those events (e.g., patterns of data). The ability to place “markers” in data files to mark the occurrence of various events, and to perform automated processing based on those markers, would also be desirable.
The inability of an analysis system to search data files for user-defined events in an automated way impedes the ability to conduct data analysis in a timely and efficient manner. There are many obstacles to such an automated system. The prior art does not appear to provide an analysis tool that offers comprehensive correlation and other statistical tools in a way that promotes both meaningful flexibility and significant automation. When dealing with data in the time domain, data is typically stored at different scales, so it would be desirable to adjust the scaling of data so that an “apples to apples” comparison can be made. Similarly, statistical fitting heuristics will possess different sensitivities given different sample sizes. Thus, it would be desirable for an analysis system to make corresponding adjustments in an automated fashion. The ultimate use of statistical analysis requires some sense of how strong the end results are. It would be desirable for an automated analysis system to provide some measure of a confidence value in the output of the system.
The prior art does not appear to disclose or even suggest that an analysis system can be both highly configurable and substantially automated. The two goals appear to be in direct conflict with each other, with an apparent zero-sum game being the end result. Those well rooted in data analysis typically sacrifice automation and ease of use to facilitate comprehensive functionality. Those focused on the subject matter at hand typically sacrifice flexibility and comprehensiveness to obtain the goals of ease of use and automation. Persons focused on the subject matter often fail to recognize the analytical tools that exist, at least theoretically. The possibility of a tool that is easy to use, highly configurable, and computationally comprehensive is not suggested in the art.