1. Field of the Invention
The present invention relates to systems and methods for displaying and providing user interaction with heterogeneous sets of data. In particular, the present invention relates to systems and methods that provide a novel graphical user interface that allows the user to focus on data of interest. More specifically, the present invention relates to systems and methods for displaying the user interface that includes: a center of attention, a parameter space, and a plurality of correlations between the center of attention and the parameter space.
2. Description of the Background Art
With the use and proliferation of computers, the Internet, and devices that collect digital data, there has been an explosion in the amount of data available as well as in the number of types of data that are available. This data is often available to the user only in particular applications. These massive quantities of data often have proprietary or specific formats and custom user interfaces to access them. While there are some mechanisms to import or reformat data so that it is usable in another system, there are no systems and methods that allow users to view heterogeneous sets of data in an effective and efficient manner. Furthermore, since there is so much data, systems for interacting with and displaying data are often not able to appropriately represent the data or focus the user's attention on the portions of data that are most significant. Thus there is a need for a new and improved user interface that provides such capabilities.
One example of an area that produces a great amount of data and that needs systems and methods for representation and visualization of that data is cyber security. Cyber security has become increasingly important due to the dependence of our modern-day society on computerized information systems. Billions of bytes of data are transported across computer networks every day, carrying information about credit transactions, banking information, sensitive government information, power plant operations, and personal notes. The pervasiveness of sensitive information makes it increasingly vulnerable to malicious uses and exploits. It is important that electronic communication transfers are secure and reliable in a society that depends so heavily on information networks. One way to increase the overall security of computer networks is to develop tools that increase the situational awareness and understanding of all those responsible for their safe operation.
Making quick and accurate decisions in complex and rapidly changing environments is a major concern in many fields, including patient monitoring, computer network management, financial trading, process control, government intelligence, vehicle operation, traffic control, enterprise systems management, corporate management, and quality assurance.
Given a natural or man-made system, events occur that need to be detected, diagnosed, and treated in order to maintain or improve the “health” of such a system (health being defined as normal or desired behavior). Using all the raw data that may be measured or computed, insight is achieved by identifying the functional relationships among data variables. In addition, a decision maker has a specific context, mission and expertise, and may want to know: the overall health of the system versus the component details, exact quantities of variables versus the qualitative behavior of variables (or their relationships), and the history and trend versus the details of the moment.
The prior art presents streams of abstract data (e.g., heart rate, stock price, packet loss) with plots, pies, bars, maps, trees, etc. Displays based on these centuries-old metaphors do not reflect the relative importance of the variables and the evolution of the relationships. In addition, chart-type displays do not capitalize on the power of modern computer graphics and on natural human perception. Such displays also have a limited ability to convey insight from the increasing amount of data produced today.
Sifting through and integrating many screens of such output displays to determine functional relationships reliably may produce information overload for an analyst. Cognitive psychologists have demonstrated that humans are capable of processing no more than four interacting variables at a time unless the individual has developed high levels of expertise in understanding data in that particular domain. When faced with multivariate information, decision makers develop their own heuristic rules and mental models for selecting and integrating information, which may take years of training or experience. In other situations, decision makers need intermediation by experts. This additional analysis introduces layers of reliability loss and time delay, which interfere with mission criticality. There is a need for tools that augment human ability to draw insight from abundant or complex data, in order to make decisions faster, more accurately, with less cognitive effort, and with less training.
Research in information visualization and software development has focused more on the internal processing logic and data organization, and less on methods to present data in a usable way so that others make better decisions. Little literature is available on real-time decision making. Research in information visualization often consists of improving traditional visual metaphors. However, many existing visual metaphors and techniques may not be intuitive to inexperienced users. For example, most prior art representations do not satisfy the principles of congruence (internal data representation needs to be consistent with the external representation) and apprehension (the representation needs to be intuitively apprehended).
Information visualizations are usually designed by computer scientists, who may not be trained in visual communication or in user knowledge elicitation. As a result, the user's interaction and apprehension have been left as a secondary issue. Many believe that usability must be considered throughout the development process. User-centered design methodologies have emerged and are being utilized for software development, such as the Hartson and Hix star life cycle and the ISO 13407 standard.
However, few information visualization solutions have involved user-centered approaches, despite usability being critical for effective transfer and understanding of information. The focus on data presentation requires user interaction. This differs from expert systems, which typically represent experts' heuristics as data or rules, and generally do not involve the user in exploratory data analysis using human pre-attentive perceptual skills.
Typical examples of current techniques include spreadsheets, basic histograms and bar charts (Flowscan), node and link metaphors (NIVA), scatter plots (NVision), line-position, and star coordinates. Fundamentally, the prior art techniques: 1) are based on simple representations, 2) do not map effectively to the visual processes and, more importantly, to the decision making process, 3) focus on very narrow or trivial problems or data sets, or 4) are designed by analysts for personal use on specific tasks.
Flexview is an AFRL visualization tool based on spreadsheets that represents Snort alerts in tabular form. An expert analyst can initiate queries and filters to identify anomalous activity represented within the Snort alerts. While it is an effective way to filter the alerts, it does not present the information graphically and does not allow other types of alerts and data to be included.
Other techniques use simple histograms and bar charts to indicate a relative value of network health or activity. Sudden changes in behavior of the overall network are an indicator of anomalous network activity. However, many of these representations offer only limited representation and analysis capabilities.
Scatter plots have become extremely popular, especially in the representation of port activity (PortVis). This visualization technique has merit in its ability to show port scan activity, which may be a precursor to an attack, but it represents only a very narrow view of the problem and, in its currently implemented form, does not allow for the integration of multiple data sets. This limits the ability to see complex relationships among disparate data sets.
Many node and line or line-position based techniques have been developed. However, many of these are poorly designed, resulting in cluttered and confusing displays with limited information. Oftentimes these displays show an enormous number of intersecting lines, with no way to see the relationships among, or the relative importance of, the items of information. Many of these techniques have gained interest due to the publication of promising results. However, these results are based on trivial data sets, such as the representation of BGP data. For example, one visualization technique aids the detection of a worm virus; however, a simple histogram may have been a more effective visualization for such data.
Therefore, what are needed are systems and methods for displaying and providing user interaction with heterogeneous sets of data.