Proteins mediate the activity and function of virtually every biological process in cells, and their misexpression is associated with various human diseases. The identification and quantification of the proteins present in biological samples is therefore a fundamental problem applicable to most biomedical research studies, and a cornerstone of the emerging field of proteomics.
Protein sequencing has traditionally relied on Edman degradation chemistry, in which N-terminal amino acids are cleaved one at a time from a population of identical polypeptide molecules and the resulting amino acid Edman derivatives are detected and identified using techniques such as differential HPLC retention and UV absorption. More recently, mass spectrometry has been used to sequence and/or identify proteins or polypeptides with increased speed, accuracy and sensitivity. These methods, however, are generally low-throughput and computationally demanding, and require the use of expensive equipment. Moreover, even the most sensitive mass spectrometers require relatively large amounts of sample, with current limits of detection on the order of 10^8 molecules (equivalent to nanogram or femtomole levels), and are not able to exhaustively sequence complex mixtures of proteins due to ion-ion interference, preferential (biased) detection of certain molecules, limited dynamic range and general under-sampling.
While dramatic improvements have been made in the past couple of years with respect to the speed, comprehensiveness and availability of high-throughput, massively parallel DNA sequencing platforms capable of sequencing large numbers of different nucleic acid molecules simultaneously, advances in mass spectrometer performance have been incremental. Relatively little progress has been made towards the development of “next generation” platforms for global protein sequencing at the single-molecule level. Furthermore, the relative complexity of protein mixtures such as blood, tissue or cell extracts, as well as the lack of a protein equivalent to PCR-based amplification or to properties such as duplex formation and base-pairing, have hampered the development of single-molecule protein sequencing methods such as those described for polynucleotides (Harris et al., Science, 4 Apr. 2008, Vol. 320, No. 5872).
Accordingly, there remains a need for novel methods and assays for sequencing single polypeptide molecules, and for methods and assays able to perform the simultaneous parallel sequencing of large numbers of polypeptides present in one or more samples.