The characterization and identification of polypeptides in complex mixtures, such as protein samples found in biological systems, is a well-known problem in biochemistry. Traditional methods involve a variety of liquid-phase fractionation and chromatography steps followed by characterization, for example by two-dimensional gel electrophoresis. Such methods are prone to artefacts and are inherently slow. Moreover, they are extremely difficult to automate.
Patent Application PCT/GB97/02403, filed on Sep. 5, 1997, describes a method for profiling a cDNA population in order to generate a 'signature' for every cDNA in the population. That method assumes that a short sequence of about 8 bp, determined with respect to a fixed reference point, is sufficient to identify almost all genes. The system relies on immobilizing the cDNA population at the 3' terminus and cleaving it with a restriction endonuclease, which leaves a population of 3' restriction fragments. The application describes a technique for determining a signature of roughly 8 to 10 base pairs at a specified number of bases from the restriction site.
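The signature idea described above can be illustrated with a minimal Python sketch. It is not the method of the cited application: the function names, the example sequence and the use of the 3'-most restriction site as the fixed reference point are assumptions made for illustration only. The count of possible signatures simply reflects four bases per position and says nothing about how unevenly real sequences are distributed.

```python
from typing import Optional


def signature_space(length: int) -> int:
    """Number of distinct DNA signatures of a given length (4 bases per position)."""
    return 4 ** length


def extract_signature(cdna: str, site: str, offset: int, length: int) -> Optional[str]:
    """Return `length` bases starting `offset` bases past the 3'-most restriction site.

    Assumption: the 3'-most site is the reference point, because only the
    3' restriction fragment stays immobilized after cleavage.
    Returns None if the site is absent or the fragment is too short.
    """
    pos = cdna.rfind(site)  # 3'-most occurrence of the recognition sequence
    if pos == -1:
        return None
    start = pos + len(site) + offset
    signature = cdna[start:start + length]
    return signature if len(signature) == length else None


# Hypothetical example: an invented cDNA with two GATC sites.
example_cdna = "AAGATCTTGATCACGTACGTAAAA"
example_signature = extract_signature(example_cdna, "GATC", offset=2, length=8)
print(signature_space(8), signature_space(10), example_signature)
```

An 8 bp signature can take 65,536 distinct values and a 10 bp signature 1,048,576, which gives a sense of why a signature of this length might distinguish most genes.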
Techniques for profiling proteins, that is to say cataloguing the identities and quantities of proteins in a tissue, are less well developed in terms of automation and high throughput. The classical method of profiling a population of proteins is two-dimensional electrophoresis. In this method a protein sample extracted from a biological sample is first separated on a narrow gel strip, usually on the basis of isoelectric point. The entire gel strip is then laid against one edge of a rectangular gel, and the proteins in the strip are electrophoretically separated in this second gel on the basis of their size. This technology is slow and very difficult to automate. It is also relatively insensitive in its simplest incarnations. A number of improvements have been made to increase the resolution of proteins by 2-D gel electrophoresis and to improve the sensitivity of the system. One way to improve both the sensitivity and the resolution of 2-D gel electrophoresis is to analyse the protein in specific spots on the gel by mass spectrometry. One such method is in-gel tryptic digestion followed by analysis of the tryptic fragments by mass spectrometry to generate a peptide mass fingerprint. If sequence information is required, tandem mass spectrometry analysis can be performed.
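The peptide mass fingerprint mentioned above can be illustrated computationally. The following Python sketch performs an in silico tryptic digest and lists the peptide masses. It rests on stated assumptions rather than on anything in the source: the commonly used simplified rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P), complete digestion with no missed cleavages, unmodified residues, and standard monoisotopic residue masses. The function names and the example sequence are hypothetical.

```python
# Standard monoisotopic residue masses in daltons (unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # one water per peptide accounts for the free termini


def tryptic_digest(protein: str) -> list:
    """Split a sequence after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):  # C-terminal peptide without a trailing K/R
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER_MASS


def mass_fingerprint(protein: str) -> list:
    """Sorted list of tryptic peptide masses for a protein sequence."""
    return sorted(peptide_mass(p) for p in tryptic_digest(protein))


# Hypothetical sequence, chosen only to exercise the cleavage rule.
print(tryptic_digest("MKRPGAKLR"))
print([round(m, 4) for m in mass_fingerprint("MKRPGAKLR")])
```

In practice the measured masses of the fragments from a gel spot would be compared with lists computed in this way for candidate protein sequences. The matching step is not shown here.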
More recently, attempts have been made to exploit mass spectrometry to analyse whole proteins that have been fractionated by liquid chromatography or capillary electrophoresis. In-line systems coupling capillary electrophoresis to mass spectrometry have been tested. The analysis of whole proteins by mass spectrometry, however, suffers from a number of difficulties. The first is that the mass spectra are complex, because each protein can adopt multiple ionization states. The second is that the mass resolution of mass spectrometers is at present quite poor for high molecular weight species, i.e. for ions greater than about 4 kilodaltons in mass, so resolving proteins that are close in mass is difficult. The third is that further analysis of whole proteins by tandem mass spectrometry is difficult, as the fragmentation patterns of whole proteins are extremely complex.
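The effect of multiple ionization states on the spectrum of a whole protein can be illustrated with a short Python sketch. It assumes positive-ion mode in which each charge comes from one added proton, so that m/z = (M + z × 1.007276) / z. The protein masses and the charge range are invented for illustration, and the function names are hypothetical.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of an ion formed by adding `charge` protons to a neutral molecule."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def charge_envelope(neutral_mass: float, charges: range) -> dict:
    """Map each charge state to the m/z at which the protein would appear."""
    return {z: mz(neutral_mass, z) for z in charges}


# Two hypothetical proteins differing by 10 Da, each observed at charges 5 to 10.
for mass in (10000.0, 10010.0):
    envelope = charge_envelope(mass, range(5, 11))
    print(mass, [round(value, 2) for value in envelope.values()])
```

Each protein contributes one peak per charge state, so a mixture produces many overlapping series. In this model the spacing between two proteins at the same charge is their mass difference divided by the charge, which means peaks from proteins of similar mass move closer together as the charge increases.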