A proteome is a collection of proteins expressed by a genome at a particular moment. Proteomics is the study of proteomes in order to ascertain the differences between two different states of an organism, for example that of a healthy person and that of a person with cancer.
A preferred technique for separating proteins within a sample is two-dimensional gel electrophoresis. In this technique, proteins held within a porous gel are separated in a first direction using isoelectric focusing, before being separated in a second direction using a technique known as SDS-PAGE.
Isoelectric focusing is a technique in which proteins are electrophoresed in a polyacrylamide gel in which a pH gradient is established. Each protein migrates to a point in the gradient that corresponds to its isoelectric point (pI), the pH at which the protein has no net charge.
SDS-PAGE is a technique in which proteins are solubilised using the detergent sodium dodecyl sulphate (SDS), before being electrophoresed through a slab of polyacrylamide gel. Proteins migrate through the gel at a rate which depends on their molecular weight (MW), with smaller proteins migrating faster than larger ones.
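The dependence of migration on molecular weight is usually exploited empirically: over the working range of a gel, the logarithm of MW is approximately linear in relative mobility, so marker proteins of known MW can calibrate the second dimension. The following is a minimal sketch of such a calibration, not a method taken from this document. The marker values are hypothetical and purely illustrative, and the names `MARKERS`, `fit_calibration` and `estimate_mw` are assumptions, as is the use of relative mobility (Rf, distance migrated divided by the distance of the dye front).

```python
import math

# Hypothetical marker ladder: (relative mobility Rf, molecular weight in kDa).
# These values are illustrative only, not measurements.
MARKERS = [(0.15, 150.0), (0.30, 100.0), (0.45, 66.0),
           (0.60, 45.0), (0.75, 30.0), (0.90, 20.0)]


def fit_calibration(markers):
    """Least-squares line through (Rf, log10(MW)); returns (slope, intercept)."""
    xs = [rf for rf, _ in markers]
    ys = [math.log10(mw) for _, mw in markers]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw(rf, slope, intercept):
    """Convert a relative mobility into an estimated MW (kDa) using the fitted line."""
    return 10 ** (slope * rf + intercept)


slope, intercept = fit_calibration(MARKERS)
unknown_mw = estimate_mw(0.50, slope, intercept)
print(f"slope = {slope:.3f}, estimated MW at Rf 0.50 = {unknown_mw:.1f} kDa")
```

The fitted slope is negative, reflecting that heavier proteins travel a shorter distance. The linear approximation is only reasonable within the range spanned by the markers.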
The result of performing two-dimensional gel electrophoresis on a sample is a two-dimensional separation pattern.
A conventional method for analysing separation patterns as represented in FIG. 1 includes the following steps:

Feature Detection 10
Detection of spots in a separation pattern. This step may involve correction for noise and background offsets. Often, different numbers of spots are detected in different patterns, resulting in missing spots.

Feature Mapping 12
Correspondences between spots in different patterns are identified, and indicated using a match vector joining the centres of detected spots in overlaid patterns. A warp is performed on the overlaid patterns to counteract geometric distortions of the gel.

Normalisation 14
The sum of all of the pixel intensities within a spot outline is the spot volume. This value is processed to produce a normalised volume which can be used to compare spot volumes between patterns.

Analysis 16
Certain criteria, for example the difference in size between corresponding spots in different patterns, are used to produce a list of interesting spots.
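The normalisation step can be illustrated with a minimal sketch. The spot volume follows the definition given above (the sum of pixel intensities inside the spot outline). The normalisation shown, dividing each volume by the total volume of all detected spots in the same pattern, is an assumption made for illustration, since the text does not specify how the normalised volume is produced. The image, the masks and the function names are hypothetical.

```python
def spot_volume(image, mask):
    """Sum the pixel intensities that lie inside the spot outline (mask value 1)."""
    return sum(
        pixel
        for image_row, mask_row in zip(image, mask)
        for pixel, inside in zip(image_row, mask_row)
        if inside
    )


def normalised_volumes(volumes):
    """Express each spot volume as a fraction of the pattern's total spot volume.

    Assumption: total-volume normalisation; other schemes are possible.
    """
    total = sum(volumes.values())
    if total == 0:
        raise ValueError("total spot volume is zero; cannot normalise")
    return {spot: volume / total for spot, volume in volumes.items()}


# Toy 4x4 pattern with two hypothetical spot outlines.
image = [
    [0, 1, 0, 0],
    [1, 5, 2, 0],
    [0, 3, 8, 1],
    [0, 0, 2, 0],
]
mask_a = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
mask_b = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]

volumes = {"a": spot_volume(image, mask_a), "b": spot_volume(image, mask_b)}
normalised = normalised_volumes(volumes)
print(volumes, normalised)
```

Note that pixels outside both outlines do not contribute to either volume, so a small change in where an outline is drawn changes the measured value.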
This conventional method has numerous disadvantages:
    1. Missing spots cause problems for statistical analysis of data.
    2. Feature-based analysis introduces noise. Small differences in spot outline, background estimates, mapping, etc. can influence measured values.
    3. The analysis step focuses on variance to differentiate groups. Subtle differences which may be used to differentiate classes may be overlooked because of high variance in other areas, which areas may not be useful in differentiating classes, and may be the result of noise.
    4. Proteomics research usually involves supervised studies while many analysis techniques are unsupervised.
    5. Basing the analysis on an increased number of samples exacerbates the problems of missing spots.
    6. A large amount of information within the gels is discarded. The surrounding areas where no spots have been detected contain valuable information on background, noise and scaling that could be affecting the value of a measured spot.
A large proportion of supervised learning techniques suffer from having large numbers of variables in comparison to the number of class examples. With such a high ratio, it is often possible to build a classification model that has perfect discrimination performance, but the properties of the model may be undesirable in that it lacks generality, and that it is far too complex (given the task) and very difficult to examine for important factors.
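The effect of a high ratio of variables to class examples can be demonstrated with a small sketch. The data below are pure random noise with randomly assigned class labels, so there is nothing real to learn; even so, a simple linear classifier reaches perfect discrimination on the examples it was trained on. The perceptron is used here only as a convenient illustrative classifier and is not a technique named in this document; the sample counts, the random seed and all function names are assumptions.

```python
import random


def score(w, x):
    """Linear decision value for one sample."""
    return sum(wj * xj for wj, xj in zip(w, x))


def train_perceptron(X, y, max_epochs=1000):
    """Classic perceptron for labels +1/-1; stops after an epoch with no mistakes."""
    w = [0.0] * len(X[0])
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * score(w, xi) <= 0:
                # Misclassified: move the weights towards the correct side.
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                mistakes += 1
        if mistakes == 0:
            break
    return w


def accuracy(w, X, y):
    """Fraction of samples whose predicted sign matches the label."""
    correct = sum(1 for xi, yi in zip(X, y) if yi * score(w, xi) > 0)
    return correct / len(X)


rng = random.Random(0)
n_variables = 200  # many variables ...


def random_dataset(n_samples):
    """Gaussian noise features with labels drawn independently at random."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_variables)] for _ in range(n_samples)]
    y = [rng.choice([-1, 1]) for _ in range(n_samples)]
    return X, y


X_train, y_train = random_dataset(10)  # ... but only ten class examples
X_new, y_new = random_dataset(1000)    # fresh noise, never seen in training

w = train_perceptron(X_train, y_train)
train_acc = accuracy(w, X_train, y_train)
new_acc = accuracy(w, X_new, y_new)
print(f"training accuracy = {train_acc:.2f}, accuracy on new data = {new_acc:.2f}")
```

The training accuracy is perfect, while accuracy on the fresh data can only be expected to hover around chance, which is the lack of generality described above. The model also carries 200 weights, none of which identifies a meaningful factor.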
Accordingly, it is desirable to provide a statistically sound framework for the analysis of bioparticle and biomolecular separation data.
It is further desirable to overcome some or all of the above-described problems.