1. The Field of the Invention
This invention relates to data analysis, and more particularly, to novel systems and methods for mapping correlations of data while maintaining data in an original data-domain rather than transforming the data into other domains for manipulation.
2. The Background Art
In the disclosure of U.S. Pat. No. 5,796,922 issued Aug. 18, 1998 to Smith and directed to a Trainable, State-Sampled, Network Controller, several very useful analysis techniques are presented. In addition to the matrix algebra methodologies, very useful properties in a state-sampled domain are relied upon. For example, by reliance upon the uncoupled, independent nature of variables in the state domain, simplified systems of equations may be formulated and readily solved. However, if data is highly coupled, the presumption of independence or uncoupling between variables is highly inaccurate.
Also, the '922 patent relies on transformations into, and subsequent analysis in, the state-space domain. Such transformations into a state-space typically provide analytical simplicity. However, in many actual situations encountered, information regarding coupling between dimensions is lost by the required transformations.
Another issue raised when one reviews the '922 patent is that of "previous knowledge" of the form of equations. In control systems, classical control theory provides a plethora of terms having forms well understood for modeling various configurations of hardware or other control environments. In other classes of problems encountered in the real world, the forms of equations are not necessarily known. Moreover, in many situations, even when the form of equations is known, or the equations themselves are exactly known, absolutely intractable calculation complexity prohibits actual solutions of the governing equations.
Thus, what is needed is a method that does not require independence of variables, but which can rather accommodate, even capture and interpret, the coupled relationships between different variables (dimensions) in a data-domain. Also needed is a method that does not require transforms, particularly transforms that may lose information from the original data-domain. A further need is for a simplified, virtually single-step, method for mapping an output or solution surface in a multidimensional data space from the data directly without having to undergo complex calculations, encounter impossible calculations, or know a priori the form of a governing equation.
Linear networks use a linear set of simultaneous equations having variables (parameters of influence) which may include outputs and inputs. Each variable in an equation has a leading coefficient associated with it to scale the contribution of the variable to the equation. A linear system solver or other matrix system solver may be used to solve a system of resulting equations, defining the coefficients.
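By way of illustration only, the coefficient-finding step described above may be sketched as follows. The sketch assumes a hypothetical relation of the form y = c0 + c1*u + c2*v, hypothetical sample values, and a plain Gaussian elimination routine named solve_linear_system; none of these names or values is taken from the '922 patent, and any other linear system solver could be substituted.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so each row operation applies to both sides.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest pivot to limit round-off error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("system is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitute from the last unknown to the first.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


# Hypothetical samples (u, v) of the two input variables.
samples = [(1.0, 2.0), (2.0, 0.0), (3.0, 1.0)]
# Outputs generated from assumed coefficients c0=0.5, c1=2.0, c2=-1.0.
outputs = [0.5 + 2.0 * u - 1.0 * v for u, v in samples]

# One equation per sample: 1*c0 + u*c1 + v*c2 = y.
rows = [[1.0, u, v] for u, v in samples]
coefficients = solve_linear_system(rows, outputs)
```

With as many independent samples as unknown coefficients, the solver recovers the coefficients that scale each variable's contribution. With more samples than coefficients, a least-squares formulation would typically be used instead.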
In general, linear algebra is a well understood art. Moreover, nonlinear systems of equations are also tractable by both closed form solutions and by various numerical methods.
The '922 patent discusses at length the mathematical support for network-type controllers. Moreover, the patent discusses and compares network and classical controllers. The patent describes a controller that does not require complete knowledge of the system controlled, but relies instead upon an understanding of the form of various control parameters in a control equation. The method then can identify the coefficients that best suit the various terms to formulate a control equation.
Data is originally obtained in a native domain. For example, time is measured in time. Space is measured in distance. Temperature is measured in a Fahrenheit, Celsius, Kelvin, or other temperature domain. Electromagnetic radiation is measured in a particular spectral domain.
All data is composed of numbers and "units" or "domain identifiers." That is, numbers do not stand alone. A number represents a quantity of something in a domain. Thus, a time domain may be incremented in seconds, minutes, hours, days, weeks, years, centuries, millennia, and so forth. Angles may be measured in degrees, seconds, minutes, radians, or the like. Similarly, distances may be measured in rectangular or polar coordinates. Velocities may be measured in angular or polar coordinates involving both spatial dimensions and temporal dimensions.
When data is converted or transformed from its original domain into another domain for purposes of mathematical manipulation, the motivation is usually some desire for simplicity. For example, certain problems arising in a polar coordinate domain may be very easily tractable in a rectangular coordinate domain. Similarly, certain equations or data that appears complex in a rectangular spatial domain may be readily tractable in a polar spatial coordinate domain.
Similarly, certain domains may represent mathematical functions. Since mathematical functions may be comparatively simple or complex, analysts may prefer to transform data from one domain, in which the data appears governed by comparatively complex equations or variables, into an alternative domain, where the data appears to be controlled by comparatively simple expressions or equations.
For example, one set of transformations that is frequently used is the set of transformations available to convert polar coordinates to rectangular coordinates, and vice versa. Trigonometry provides numerous relationships between spatial coordinates in rectangular and polar systems. Here however, a great difficulty often interferes.
Data obtained from observation of actual physical systems is usually continuous in its original data-domain, well behaved, of finite scope, and is described in comparatively simple mathematical terms. However, in order to avoid certain non-linear relationships, data from an original data-domain may be mapped to some other domain for analysis, manipulation, presentation, or the like. When trigonometric functions are used to transform data from a data-domain to some other analytical domain, problems arise in the inherent discontinuities that exist in trigonometric functions.
A classic example is the inverse tangent function. This function takes on values approaching infinity at certain asymptotes. Computationally, computers cannot tolerate infinite numbers nor divisions by zero. Thus, obtaining inverses of transformations is impossible at certain locations.
In one example, angular data from a two-axis magnetometer may be used to measure the rotational angle and rotational velocity of a spinning platform. A magnetometer measures magnetic field in two orthogonal, Cartesian coordinates x and y. The two components of the magnetic field may be more easily relied upon if converted to angular directions and angular velocities. An arctangent relying on the orthogonal components of the magnetic field may thus yield an angle in degrees or radians, and an angular velocity in degrees per second or radians per second. Unfortunately, a discontinuity occurs at positive or negative 180 degrees. Thus, a system relying on the foregoing transformation is useless at angles approaching 180 degrees. Derivative data, such as a time derivative of position, yielding velocity, is even more problematic. A rotational velocity is a continuous function at all angles. Although the magnetometer data in two Cartesian directions contains all of the information, the transformation again has discontinuities at the asymptotes of the arctangent.
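The discontinuity in the magnetometer example may be illustrated with a brief sketch. It assumes an idealized field of unit magnitude and uses the two-argument arctangent of the Python standard library (math.atan2) as one possible form of the arctangent; the function names heading_degrees and field_components are hypothetical and are introduced only for illustration.

```python
import math


def field_components(angle_deg):
    """Return idealized (x, y) field components for a platform angle."""
    a = math.radians(angle_deg)
    return math.cos(a), math.sin(a)


def heading_degrees(bx, by):
    """Convert two orthogonal field components to an angle in degrees."""
    return math.degrees(math.atan2(by, bx))


# Two platform positions only 2 degrees apart, straddling 180 degrees.
before = field_components(179.0)
after = field_components(181.0)

# In the original data-domain the components change only slightly.
component_step = math.hypot(after[0] - before[0], after[1] - before[1])

# After the transformation the computed angle jumps by nearly 360 degrees.
angle_before = heading_degrees(*before)
angle_after = heading_degrees(*after)
angle_step = angle_after - angle_before

# A naive angular velocity over an assumed 0.01 second sample interval
# inherits the jump, although the true rotation rate is modest.
naive_velocity = angle_step / 0.01
```

Under these assumptions the two Cartesian channels move by a small, continuous amount, while the transformed angle and its finite-difference derivative show a large artificial jump at the crossing.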
Thus, various processes of converting data, especially coupled data, into a domain different from the native or data domain may corrupt or lose some of the information. Often the information is lost by creation of discontinuities in what should be, or originally was, continuous data. For example, in the above-referenced example of the magnetometer, the important information is a non-linear relationship of phase between the two data channels. A data-domain sampled network is needed that can be used to optimally map multidimensional input data, each input remaining in the domain best suited to the specific data. Likewise, outputs really ought to be expressed in terms of variables in a domain best suited for the outputs.
Thus, what is needed is a system for simply and rapidly correlating outputs and inputs related to data, in their original domains, without requiring an intermediate transformation. In classical methods, this is often impossible. Complexity may render problems intractable. Numerical methods, in which computerized algorithms for approximation are sufficiently accurate, or can be made sufficiently accurate for all practical purposes, are desirable. Thus, what is needed is a method by which data can be maintained in its original domains, and in which data parameters of interest (e.g. inputs and outputs, or independent variables and dependent variables) can be correlated quickly, accurately, continuously, and simply.
In view of the foregoing, it is a primary object of the present invention to provide a method and apparatus for moving between dimensions of a data-domain, e.g. to correlate inputs and outputs in a solution space without losing information from the data-domain through transformations.
It is an object of the invention to provide a method and apparatus effective to amalgamate multidimensional data, combining data sets or streams without requiring or falsely assuming independence or uncoupling between variables (dimensions in the data-domain).
It is an object of the invention to provide a method and apparatus for preserving information in data having interdependent variables from different dimensions in the data-domain.
It is an object of the invention to provide simplified data processing, analysis, and the like wherein data may be correlated to provide useful relationships (e.g. solutions, input/output relations) in a single algorithmic operation, particularly without loss of continuity of any dimension of the data-domain.
It is an object of the invention to do the foregoing without requiring a priori knowledge of the equations or forms of equations relating variables to one another.
Consistent with the foregoing objects, and in accordance with the invention as embodied and broadly described herein, an apparatus and method are disclosed, in suitable detail to enable one of ordinary skill in the art to make and use the invention. In certain embodiments of methods and apparatus in accordance with the invention, data may be manipulated or used in a data network of data points (as opposed to a hardware computer network, over which operations may proceed) in an original data-domain, without transformation into a domain that would lose important properties of the data. For example, continuity of functions or derivatives of functions may be maintained.
In certain embodiments of an apparatus and method in accordance with the invention, correlations may be made between a data input domain and a data output or solution domain, or between an independent variable domain and a dependent variable domain. Interpolation functions may be selected for fitting or optimally fitting the curvature of a surface in a functional domain dependent on a data-domain. Values of a function in a functional domain (e.g. dependent variables in a dependent variable domain), corresponding to selected points in a multidimensional data-domain, may be saved in memory. All points intermediate the saved points may be interpolated comparatively rapidly and accurately by interpolating with the interpolation functions. Interpolation functions may be comprised of linear combinations of terms. The terms may be linear or non-linear combinations of variables in the data-domain and weighting coefficients for correlating variables to the functional values.
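As a minimal sketch of interpolating between saved points, the following assumes a two-dimensional data-domain sampled at irregular intervals and uses ordinary bilinear interpolation as a stand-in for the interpolation functions. Bilinear interpolation is chosen here only because it is a simple linear combination of the terms 1, x, y, and x*y; it is not presented as the interpolation function of the invention. The names cell_index, interpolate, and surface, and all sample values, are hypothetical.

```python
from bisect import bisect_right


def cell_index(axis, value):
    """Return i such that axis[i] and axis[i + 1] bracket value (clamped)."""
    i = bisect_right(axis, value) - 1
    return min(max(i, 0), len(axis) - 2)


def interpolate(xs, ys, saved, x, y):
    """Bilinearly interpolate saved[i][j], the value stored at (xs[i], ys[j])."""
    i = cell_index(xs, x)
    j = cell_index(ys, y)
    # Fractional position of the query point inside the enclosing cell.
    tx = (x - xs[i]) / (xs[i + 1] - xs[i])
    ty = (y - ys[j]) / (ys[j + 1] - ys[j])
    # Weighted combination of the four saved corner values.
    return (saved[i][j] * (1 - tx) * (1 - ty)
            + saved[i + 1][j] * tx * (1 - ty)
            + saved[i][j + 1] * (1 - tx) * ty
            + saved[i + 1][j + 1] * tx * ty)


def surface(x, y):
    """Assumed functional relation, used only to fill the saved table."""
    return 2.0 + 3.0 * x - y + 0.5 * x * y


# Sample points taken at irregular intervals in each dimension.
xs = [0.0, 1.0, 3.0]
ys = [0.0, 2.0, 4.0]
saved = [[surface(x, y) for y in ys] for x in xs]

# Interpolate at a point that was never stored.
estimate = interpolate(xs, ys, saved, 2.0, 1.0)
at_node = interpolate(xs, ys, saved, 1.0, 2.0)
```

Each interpolation performs one cell lookup per dimension and one fixed-size weighted sum, so the cost is nearly the same for every query. In this sketch the estimate matches the assumed surface exactly only because that surface happens to contain the same terms as the bilinear form.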
One may say that the network of data points in a data-domain represents a sample. The sample may be taken at regular or irregular intervals over each dimension of a data-domain, as may best serve the purpose of a user. A value of a function in a functional sub-space (e.g. solution space, functional range, etc.) dependent upon other variables in a data-domain sub-space may be obtained with a minimum of computational complexity. In many embodiments, interpolation functions may be optimized using the data points in the data-domain in order to provide optimized interpolations and nearly identical calculation times for every interpolation, based on an optimized sample size and interpolation function, correlating the function range to the data-domain.
In certain embodiments an apparatus and method in accordance with the present invention may include a general purpose digital computer, which may be networked in a local area network with other computers. Likewise a computer may be linked over an internetwork of smaller networks to any extent manageable.
To avoid confusion, one should differentiate between a hardware network of various items of apparatus (e.g. interconnected computers, devices) and a data network (a correlation of a grid of points in a data space of some dimension). A computer, in one embodiment of a method and apparatus in accordance with the invention, may process information provided directly to it by peripheral devices, or may receive data from other computers over one or more networks. Similarly, a computer may process data and send results to one or more computers or computationally capable devices over one or more networks.
In accordance with certain aspects of the invention, a memory of a computer may contain executable data (executables, programs, applications, instructions) and non-executable data (operational data). The processor of a computer may be loaded or programmed with executables for processing operational data, thus becoming a special purpose digital computer programmed to perform the functions enabled by the executables.
Methods in accordance with the invention may be executed on a computer or several computers together. Some methods may involve interaction between one or more computers and a user. Other methods may involve interaction between peripheral devices (e.g. data sensors, other data sources, machines, or other data consuming apparatus) and computers, between computers, users, and devices, between different computers, or between components of a single computer.