The present teachings relate to clustering of data profiles such as, but not limited to, oceanic profiles.
Automatic oceanographic profile provincing (grouping by similar chosen parameters) has been done since the early 1970s, but mainly for deep water. The profiles were compared point by point, so all profiles had to contain data down to a set depth, usually in excess of 200 meters. For an oceanographic area of interest, for example a two-degree square box in a littoral region, temperature profiles can vary widely over time and place across the large number of historical measurements available. It may be difficult to gain an understanding of the underlying environmental forcing mechanisms through inspection of raw data. A previous tool, known as Nydis, allowed the user to set the regional boundary and seasonal time periods by hand. The resulting clusters of profiles were then viewed by the oceanographer together with average profiles and three standard deviation lines. The oceanographer would then move the boundaries and seasons and iterate until satisfied with the results. Depending on the area, this process could take several days. Dividing the data into areas and seasons of similar profiles can shed light on the environmental variability and forcing functions of the area of interest. In existing systems, parameters of the data are sampled and then grouped. For example, the Naval Underwater Systems Center groups deep water data by temperature profile similarity at particular depths. Cluster analysis has also been used to examine sound speed profiles in the Gulf of Alaska. For this data analysis, data points were forced into clusters by minimizing the sum of the distance given by equation (1) over each element in a set of clusters of predetermined size:

$$d_{rs} = \sqrt{\sum_{i=1}^{n} \left(c_{ri} - c_{si}\right)^{2}} \qquad (1)$$
In equation (1), $d_{rs}$ is the distance between sound-speed profiles $r$ and $s$, and $c_{ri}$ and $c_{si}$ are the respective sound speeds at the $i$th depth. These clusters were then grouped repeatedly by the same method until all points were in one set. The level at which the data naturally cluster can then be determined. This method is commonly used in deep water, where all profiles are generally similar below a certain depth and can therefore be universally trimmed to that depth, making the profiles comparable at all points. Traditional automatic clustering algorithms were developed for deep water physical oceanographic profiles. The clustering algorithms clustered properties, for example, by matching each historical profile in turn to the closest unmatched profile in the data set, then matching these groupings until the desired number of clusters was created. In shallow water, however, point-by-point comparison is generally not possible, given the common occurrence of significant differences below the depth of the shallowest profile in the data.
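As an illustration only, the distance of equation (1) and the repeated grouping described above might be sketched as follows. The function names (`profile_distance`, `mean_profile`, `agglomerate`) and the sample values are hypothetical. Comparing clusters by their mean profiles is an assumption made for this sketch, since the text does not state how groupings were compared.

```python
import math
from itertools import combinations


def profile_distance(c_r, c_s):
    """Equation (1): Euclidean distance between two profiles sampled at the same n depths."""
    if len(c_r) != len(c_s):
        raise ValueError("profiles must be sampled at the same depths")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c_r, c_s)))


def mean_profile(profiles):
    """Depth-by-depth average of a group of profiles."""
    return [sum(column) / len(profiles) for column in zip(*profiles)]


def agglomerate(profiles, n_clusters):
    """Merge the two closest groups repeatedly until n_clusters remain.

    Assumption: groups are compared by the distance between their mean profiles.
    """
    clusters = [[p] for p in profiles]  # start with one group per profile
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: profile_distance(
                mean_profile(clusters[ij[0]]), mean_profile(clusters[ij[1]])
            ),
        )
        clusters[i].extend(clusters[j])  # i < j, so deleting j keeps index i valid
        del clusters[j]
    return clusters


# Hypothetical sound speeds (m/s) at three common depths.
d_rs = profile_distance([1500.0, 1495.0, 1490.0], [1503.0, 1499.0, 1490.0])

# Hypothetical profiles, all trimmed to the same three depths.
profiles = [
    [10.0, 8.0, 6.0],
    [10.5, 8.2, 6.1],
    [20.0, 15.0, 9.0],
    [19.5, 15.5, 9.2],
]
groups = agglomerate(profiles, 2)
```

Setting `n_clusters` to 1 corresponds to grouping until all points are in one set. The length check in `profile_distance` reflects the deep water restriction noted above: every profile must supply a value at every compared depth.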
When random points on a page are clustered by eye, they are grouped in the two-dimensional coordinates of up-down versus left-right. Profile clustering extends this idea to more dimensions: each profile parameter is considered a dimension in the solution space, so each profile is a vector in the solution space whose coordinates are the parameter values of the profile. In previous work these profile parameters might be, for example, surface temperature, temperature at fifty meters, temperature at one hundred meters, temperature at two hundred meters, and so on. It is desirable to use different types of profile parameters to enable shallow water studies. Thus, it is desirable to perform the automatic clustering of oceanographic profiles spatially, temporally, and by specific profile parameters. It is also desirable to identify regions and seasons where oceanographic parameters are consistent. What is needed is fuzzy clustering for specific parameters to enable enhanced oceanographic studies. What is further needed is to province oceanographic profiles into regions and local seasons according to their oceanographic parameters, using sets of profiles taken in water significantly shallower than 200 meters.
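The idea of a profile as a vector of parameter values, and the reason fixed-depth parameters fail in shallow water, can be illustrated with a minimal sketch. The function names (`temperature_at`, `to_vector`), the use of linear interpolation, and the sample profiles are assumptions made for illustration and are not taken from the prior work described above.

```python
def temperature_at(depths, temps, target):
    """Linearly interpolate temperature at the target depth.

    Assumes depths are increasing and the profile has at least two samples.
    Returns None when the profile does not reach the target depth.
    """
    if target < depths[0] or target > depths[-1]:
        return None
    samples = list(zip(depths, temps))
    for (d0, t0), (d1, t1) in zip(samples, samples[1:]):
        if d0 <= target <= d1:
            if d1 == d0:
                return t0
            return t0 + (t1 - t0) * (target - d0) / (d1 - d0)
    return None


def to_vector(depths, temps, levels=(0.0, 50.0, 100.0, 200.0)):
    """Build the parameter vector: one temperature per fixed depth level (meters)."""
    return [temperature_at(depths, temps, level) for level in levels]


# Hypothetical deep profile: all four coordinates are available.
deep = to_vector([0.0, 100.0, 300.0], [20.0, 12.0, 8.0])

# Hypothetical shallow profile ending at 80 m: the 100 m and 200 m
# coordinates are missing, so it cannot be compared point by point.
shallow = to_vector([0.0, 40.0, 80.0], [20.0, 16.0, 14.0])
```

The `None` entries in `shallow` show why parameters tied to fixed depths do not suit shallow water profiles, and why other types of profile parameters are desirable.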