The present invention relates to measurements and data analyses, and, more specifically, to the generation of analytic measurements and to their application to real phenomena.
Data analyses that apply mathematics and statistics to measurements of real phenomena have proven to be useful in many areas, including, for example, medicine, public health, economics, industry, and education.
This invention focuses on measurements of real phenomena. Real phenomena refer here to factual and non-abstract phenomena, including, for example, tangible, material, living, non-living, or artificial things (like an animal, plant, chair, car, or engine) as well as events, places, conditions, structures, or processes (like a day, a country, income, or infant mortality). A measurement is the act or process of measuring; it is also a number, figure, extent, or amount obtained by measuring. Measurements of real phenomena can be listed in one list or in multiple lists of measurements; a single list of measurements is univariate, whereas multiple lists of measurements are multivariate.
Multivariate analysis is a data analysis of more than one variable. Therefore, multivariate analyses analyze multiple lists; they do not analyze a single list. Factor analysis, principal components analysis, cluster analysis, and regression analysis are examples of multivariate analyses of multiple lists of numbers (Everitt 1993; Hartigan 1975; Hotelling 1933; Pearson 1901; Rencher 2002; Spearman 1904).
The present invention focuses on the multivariate analyses of a univariate list of measurements of distinct real phenomena.
Transformations of univariate lists to matrices are common and are found in many areas. For example, the distance d_jk between numbers x_j and x_k can be computed as d_jk = |x_j − x_k|, and a matrix D of n^2 cells containing the distances d_jk can be constructed, where j=1:n and k=1:n. Similar matrices are commonly constructed for distances between points in or on a geometric structure, distances between points in a specific space (e.g., n-dimensional Euclidean space), or distances between vertices of a graph. A geodesic between two points is a locally length-minimizing curve (i.e., it is the shortest path between the two points, and the path that a particle which is not accelerating would follow between the two points); the geodesic distance between the two points is the length of this curve. For example, in the plane, the geodesic between two points is the straight line segment that connects them, and the geodesic between two points on a sphere is the shorter arc that connects them along the great circle through both points (e.g., the equator of a spherical planet is a great circle). Geodesic distance between points is typically computed using geometric theorems, rules, or methods (e.g., the Pythagorean Theorem for computing the geodesic distance between two points in the plane); computations of geodesic distances between points may also be implemented in various global positioning systems (GPS) and diverse computer programs. In graph theory, the geodesic distance between two vertices is the number of edges in a shortest path connecting them.
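The distance computations described above can be illustrated with a minimal Python sketch. It is offered only as an illustration under stated assumptions, not as an implementation of any particular system: the function names distance_matrix, plane_distance, and graph_geodesic_distance are hypothetical, and the graph is assumed to be given as a dictionary that maps each vertex to a list of its neighbors.

```python
import math
from collections import deque


def distance_matrix(x):
    """Return the n-by-n matrix D with D[j][k] = |x[j] - x[k]|."""
    return [[abs(xj - xk) for xk in x] for xj in x]


def plane_distance(p, q):
    """Straight-line (Pythagorean) distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def graph_geodesic_distance(adjacency, start, goal):
    """Number of edges on a shortest path, found by breadth-first search.

    Returns None when goal cannot be reached from start.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        vertex, steps = queue.popleft()
        if vertex == goal:
            return steps
        for neighbor in adjacency.get(vertex, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, steps + 1))
    return None


# The univariate list (1, 2, 3) becomes a 3-by-3 matrix with 9 cells.
D = distance_matrix([1, 2, 3])
# D == [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
```

For a list of n numbers, the resulting matrix is symmetric with zeros on its diagonal, so each of its n columns can be read as one list of distances from a given number to all the numbers in the original list.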
A list of numbers can also be transformed to multiple lists of numbers using spectral, wavelet, or polynomial transformations (Fourier 1822; Gheondea and Sabac 2003; López-Gómez 2001; Mathworld 2006; Meyer 1992; Percival and Walden 2000). Spectral, wavelet, or polynomial transformations of lists of numbers typically change the numbers in each list and increase the number of lists. For example, a second-degree orthonormal polynomial transformation of a list x=(x_1, . . . , x_n) of n numbers x_i, where i=1:n and n=3, such that x=(1, 2, 3), produces two lists of numbers: x′_1=(−0.7071068, 0.0000000, 0.7071068) and x′_2=(0.4082483, −0.8164966, 0.4082483), which are also the two columns of a 3×2 matrix. The orthonormal polynomial transformation in this example illustrates a transformation of one list to two lists and also a transformation of one list to a matrix. Orthonormality in this example indicates that the orthonormal polynomial transformation produced lists wherein each list has specific internal characteristics (e.g., normalization to unit length) and wherein inter-relationships among lists have specific characteristics (e.g., orthogonality).
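The numerical example above can be reproduced with a short Python sketch. The sketch assumes one common construction, namely Gram-Schmidt orthogonalization of the powers of x against a constant column; the function name orthonormal_polynomials is hypothetical, and other constructions (and other sign conventions) are possible.

```python
import math


def orthonormal_polynomials(x, degree):
    """Return `degree` lists: orthonormal polynomial columns of degrees 1..degree.

    Gram-Schmidt is applied to the powers x^0, x^1, ..., x^degree. The
    constant (degree-0) column is used for orthogonalization and then dropped.
    Assumes that x holds more than `degree` distinct values.
    """
    n = len(x)
    basis = [[1.0 / math.sqrt(n)] * n]  # normalized constant column
    for d in range(1, degree + 1):
        column = [float(xi) ** d for xi in x]
        for q in basis:
            # Remove the component of `column` that lies along q.
            dot = sum(c * qi for c, qi in zip(column, q))
            column = [c - dot * qi for c, qi in zip(column, q)]
        norm = math.sqrt(sum(c * c for c in column))
        basis.append([c / norm for c in column])
    return basis[1:]


# One list in, two lists out (the two columns of a 3-by-2 matrix).
first, second = orthonormal_polynomials([1, 2, 3], 2)
# first  is approximately (-0.7071068, 0.0000000, 0.7071068)
# second is approximately ( 0.4082483, -0.8164966, 0.4082483)
```

Each returned list has unit length, and the inner product of any two returned lists is zero up to rounding error, which is what orthonormality means in this example.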
A transformation of a list of numbers to another list of numbers replaces these numbers by other numbers. For example, if an original list x=(x_1, . . . , x_n) of n numbers x_i, where i=1:n and n=3, such that x=(1, 2, 3), is transformed to a new list x′=(x′_1, . . . , x′_n) of n numbers x′_i using x′_i = x_i^2, then x′=(1, 4, 9), which is an example of a power transformation of list x using a power of 2. This is also an example of a transformation of one list to another list. Power transformations are one of the most popular methods of transformation of a list of numbers to another list of numbers (Emerson 1983; Emerson and Stoto 1983).
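As a minimal illustration, the power transformation in this example can be written in Python as follows; the function name power_transform is hypothetical.

```python
def power_transform(x, power):
    """Replace each number in the list by that number raised to `power`."""
    return [xi ** power for xi in x]


x = [1, 2, 3]
squared = power_transform(x, 2)    # [1, 4, 9]
unchanged = power_transform(x, 1)  # [1, 2, 3], the identity transformation

# Applying the transformation to several lists leaves the number of lists intact.
lists = [[1, 2, 3], [4, 5, 6]]
transformed = [power_transform(lst, 2) for lst in lists]
```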
Multivariate analyses and data mining focus on more than one list, seeking to classify, reduce, make sense of, or handle or manage a plurality of lists. They typically do so by reducing the number of lists. For example, factor analysis or principal components analysis transform lists, reduce the number of lists, and elicit, refine, or augment the meaning of lists (Hotelling 1933; Pearson 1901; Rencher 2002; Spearman 1904), and cluster analysis enables analysts to identify groups within lists (Everitt 1993; Hartigan 1975).
The foregoing considerations, and studies of transformations of lists, reveal that a transformation of a list of numbers could change part, or no part, or all of the components of this list. A transformation of a list could also leave intact, increase, or reduce the number of lists. For example, a power transformation of one list of numbers typically changes these numbers, and power transformations of more than one list of numbers tend to leave intact the number of lists. An identity transformation of a list of numbers (e.g., a power transformation with a power of 1) does not change the numbers of this list, and identity transformations of lists of numbers do not change the number of lists.
Transformations of a list of numbers could offer diverse benefits in data analysis. For example, transformations of lists of numbers could linearize data or generate nonlinear data, increase symmetry or normality, increase homoscedasticity or reduce heteroscedasticity, orthogonalize or otherwise reposition variables, deepen insight into the data, simplify analysis, reveal new dimensions, and reveal meanings (Emerson 1983; Emerson and Stoto 1983; Rencher 2002).
Because they have diverse beneficial uses, transformations can be evaluated, compared, and optimized in order to assess which transformation would produce optimal results. For example, it is possible to compare various power transformations of a list of numbers for the purpose of discovering which of these transformations would produce an optimal result in a specific data analysis (Box and Cox 1964; Emerson 1983; Emerson and Stoto 1983).
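A comparison of this kind can be sketched in Python. The sketch rests on assumptions that are not taken from the cited works: the criterion of optimality is assumed to be symmetry, measured by the absolute value of the moment-based sample skewness; the list is assumed to contain positive, non-constant numbers; and a power of 0 is treated as the logarithm, following the usual ladder-of-powers convention. The function names are hypothetical. Other criteria (for example, a likelihood-based criterion) could be substituted.

```python
import math


def ladder_transform(x, power):
    """Power transformation of positive numbers; power 0 is taken as the log."""
    if power == 0:
        return [math.log(xi) for xi in x]
    return [xi ** power for xi in x]


def skewness(values):
    """Moment-based sample skewness of a non-constant list."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def most_symmetric_power(x, candidates):
    """Return the candidate power whose transformed list has the smallest |skewness|."""
    return min(candidates, key=lambda p: abs(skewness(ladder_transform(x, p))))


# Hypothetical data: the squares of 1..9, which a square root makes symmetric.
data = [k * k for k in range(1, 10)]
best = most_symmetric_power(data, [-1, -0.5, 0, 0.5, 1, 2])
```

In this hypothetical example the selected power is 0.5, because the square roots of the data are the evenly spaced numbers 1 through 9.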
While some transformation methods (e.g., power transformations) are simple or straightforward, or mechanized to diverse degrees, other transformation methods are complicated. For example, spectral or wavelet analyses typically require special transforms and complex calculations that limit the scope and applicability of these methods. Another drawback of the available methods of transformation is that they may not allow the user sufficient control, maneuverability, and tractability in the performance of the transformation or analysis. Additionally, some transformations may lack a theoretical foundation or a methodological justification that would explain why specific transformations ought to be used.
The multiple lists of numbers that result from spectral, wavelet, or polynomial transformations of one or more lists of numbers could, in turn, be used in multivariate analyses or data mining (Gheondea and Sabac 2003; López-Gómez 2001; Meyer 1992; Percival and Walden 2000).
This invention focuses on the following: lists of measurements of distinct real phenomena, the power transformations of these measurements to respective power transformed distances or proximities among these measurements, the multivariate analyses of vectors of matrices of these power transformed distances or proximities, and the application of analytic measurements from these analyses to real phenomena.
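The sequence of steps named above can be sketched, very loosely, in Python. The sketch is not the method of the invention: it substitutes a first principal component, approximated by power iteration, for the multivariate analysis, it assumes a positive power, and its measurements and function names are hypothetical.

```python
import math


def power_distance_matrix(x, power):
    """Matrix of power transformed distances |x[j] - x[k]| ** power."""
    return [[abs(xj - xk) ** power for xk in x] for xj in x]


def center_columns(matrix):
    """Subtract each column's mean so that every column sums to zero."""
    n = len(matrix)
    means = [sum(row[k] for row in matrix) / n for k in range(len(matrix[0]))]
    return [[value - means[k] for k, value in enumerate(row)] for row in matrix]


def first_component_scores(matrix, iterations=200):
    """Scores of the rows on the first principal component of the columns.

    The leading eigenvector of the column covariance matrix is approximated
    by power iteration; its sign is arbitrary.
    """
    centered = center_columns(matrix)
    n, p = len(centered), len(centered[0])
    cov = [[sum(row[i] * row[j] for row in centered) / (n - 1)
            for j in range(p)] for i in range(p)]
    # Asymmetric start vector, so that it is unlikely to be orthogonal
    # to the leading eigenvector.
    v = [float(k + 1) for k in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm == 0.0:
            break  # all columns are constant; no component to extract
        v = [wi / norm for wi in w]
    return [sum(row[k] * v[k] for k in range(p)) for row in centered]


days = [1.0, 2.0, 3.0]  # hypothetical measurements, e.g., day indices
scores = first_component_scores(power_distance_matrix(days, 1.0))
```

Each measurement in the original list receives one score, so the scores can serve as analytic measurements of the respective phenomena in a subsequent analysis. Different powers yield different distance matrices and therefore different scores, which is what makes a comparison across powers possible.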
Epelbaum (2005) presents features of this invention and its embodiments in two studies. In one study of Epelbaum (2005), the real phenomena are the first 100 days of 1981 and the solar flux during these days. The days are measured in terms of the consecutive integer indices of these days, and the solar flux is measured by the adjusted solar activity indices, at a wavelength of 10.7 centimeters, in the first 100 days of 1981, collected by the Solar Radio Monitoring Programme of the Canadian National Research Council. Factor analyses of the power transformed distances among these 100 days yielded best fitting and optimal factor score measurements of these days. Generalized estimating equations (GEE) analyses of the effects of these factor scores on the respective solar flux contribute to the understanding of solar flux and its changes through time and enable prediction of solar flux in the first 100 days of 1981. GEE is a method of regression analysis for correlated data. In another study of Epelbaum (2005), the real phenomena are the spatial distributions of U.S. states and the income per capita in these states. The spatial configurations are measured in terms of the centroids of these U.S. states, and the per capita incomes in these states are measured in terms of income data from the U.S. Census. Reiterated factor analyses of the power transformed distances among the centroids of U.S. states yield best fitting and optimal factor scores for the respective spatial positions of the respective states. GEE analyses of the effects of factor scores on per capita income contribute to the understanding of spatial differences in income per capita in U.S. states in 1990 and to the prediction of this income per capita.
The present invention formalizes, generalizes, refines, extends, broadens, and adds new uses to what the inventor presented in Epelbaum (2005).