The present invention relates to an integrated system based on functional affinity chromatography and large-scale protein identification. More specifically, it is a method of high-throughput functional proteomics using a functional affinity column and mass spectrometry. The functional affinity column isolates proteins from a large pool based on a known function, which is identified by the type of affinity.
Most high throughput proteomic methods result in the isolation of a number of proteins for which no function is known. The function is usually deduced using sequence similarities to proteins with known functions or the identification of motifs with a known function. The process can be time-consuming and may not result in the identification of the correct function. Thus, a method is needed which allows for the identification of classes of proteins in a proteome for which a function may be assigned.
One aspect of the present invention provides a method of identifying proteins with a shared function from a protein pool. The method comprises preparing a protein pool. The protein pool is applied to a functional affinity column, wherein the functional affinity column isolates proteins with a common function based on the affinity chromatographic behavior of the proteins. The isolated proteins are analyzed using one or more dimensional chromatography in combination with mass spectrometry, thereby producing spectral information. The isolated proteins are identified by matching the spectral information with a theoretical mass spectrum of a protein having a known sequence.
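The matching step can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the isolated proteins are digested with trypsin and that only intact peptide masses are compared (peptide mass fingerprinting), which is simpler than matching tandem mass spectra. The function names, the mass tolerance, and the example sequences are hypothetical and are not part of the described method.

```python
# Monoisotopic amino acid residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum of a free peptide


def tryptic_peptides(sequence):
    """Split a sequence after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Theoretical monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def identify(observed_masses, known_sequences, tolerance=0.01):
    """Return (name, hits) for the known sequence whose theoretical
    peptide masses account for the most observed masses."""
    best_name, best_hits = None, 0
    for name, sequence in known_sequences.items():
        theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
        hits = sum(
            any(abs(obs - theo) <= tolerance for theo in theoretical)
            for obs in observed_masses
        )
        if hits > best_hits:
            best_name, best_hits = name, hits
    return best_name, best_hits


# Hypothetical sequences and observed masses, for illustration only.
known = {"protein_A": "GAKVTR", "protein_B": "SSKLLR"}
observed = [274.164, 374.228]
print(identify(observed, known))
```

A practical search would score fragment-ion spectra and account for charge states and modifications; the sketch only shows the principle of comparing measured values against values computed from a known sequence.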
According to another aspect of the present invention, one or more dimensional chromatography is performed using a high performance liquid chromatography column comprising a strong anion exchange resin followed by a reverse phase resin. In some embodiments, the protein pool can be fractionated prior to application to said functional affinity column. According to some embodiments of the present invention, mass spectrometry is tandem mass spectrometry.
The functional affinity column can comprise a ligand selected from the group consisting of carbohydrate, ATP, phosphate, ECM, metal ion, cell surface peptide, and enzymatic domain. Alternatively, the functional affinity column can comprise a small molecule such as a pharmacophore. In other embodiments the functional affinity column comprises a peptide or protein domain.
Another aspect of the present invention provides a method of ascribing a function to a protein. The method comprises providing a composition containing one or more proteins. The composition is applied to a functional affinity column. Bound proteins are then eluted from the functional affinity column and prepared for mass spectrometry. At least a portion of the eluted protein is analyzed by mass spectrometry, thereby producing spectral information. The eluted protein is then identified by matching the spectral information with a theoretical mass spectrum of a protein having a known sequence. The function of the identified protein is ascribed based on the affinity chromatographic behavior of the identified protein.
According to another aspect of the present invention, an eluted protein is subjected to proteolysis and one or more dimensional chromatography. In some embodiments, the one or more dimensional chromatography is performed using a high performance liquid chromatography column comprising a strong anion exchange resin followed by a reverse phase resin.
The protein composition that is applied to the functional affinity column can be a protein extract wherein the protein extract is from a tissue or cell. In some embodiments, the cell is a microbe, a parasite or a cancer cell.
The functional affinity column can comprise a ligand selected from the group consisting of carbohydrate, ATP, phosphate, ECM, metal ion, cell surface peptide, and enzymatic domain. Alternatively, the functional affinity column can comprise a small molecule such as a pharmacophore. In other embodiments the functional affinity column comprises a peptide or protein domain.
In some embodiments of the present invention, the bound protein is eluted from said functional affinity column in a single step. In other embodiments, the bound protein is eluted from said functional affinity column using a stepwise or continuous gradient.
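The difference between stepwise and continuous gradient elution can be sketched as two eluent concentration profiles, one value per collected fraction. The function names and the concentration values below are hypothetical and chosen only for illustration; the continuous gradient is approximated here as a linear ramp.

```python
def step_gradient(levels, fractions_per_step):
    """Stepwise gradient: hold each eluent level for a fixed number of fractions."""
    return [level for level in levels for _ in range(fractions_per_step)]


def linear_gradient(start, end, n_fractions):
    """Continuous gradient, approximated as evenly spaced levels from start to end."""
    if n_fractions < 2:
        raise ValueError("a gradient needs at least two fractions")
    step = (end - start) / (n_fractions - 1)
    return [start + step * i for i in range(n_fractions)]


# Hypothetical eluent concentrations (arbitrary units).
print(step_gradient([0.1, 0.5, 1.0], 2))
print(linear_gradient(0.0, 1.0, 5))
```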
According to one aspect of the present invention, the sequence of the protein having a known sequence is present in a database. According to other aspects, the sequence of the protein having a known sequence is derived from a nucleic acid. In still other aspects, the protein having a known sequence has an unidentified function.
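Where the known sequence is derived from a nucleic acid, the derivation amounts to translating a coding sequence into its protein sequence. The following is a minimal sketch, assuming the standard genetic code and a coding sequence that is already in frame; the function name and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence):
    """Translate an in-frame DNA or RNA coding sequence up to the first stop codon."""
    seq = coding_sequence.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical coding sequence, for illustration only.
print(translate("ATGGGCGCTAAATAA"))
```

The resulting protein sequence can then serve as the known sequence from which a theoretical mass spectrum is computed.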
According to yet another aspect of the present invention, an annotated sequence database is provided. The database comprises at least one polypeptide sequence, wherein a function of a protein having the at least one polypeptide sequence is ascribed as follows. A composition containing one or more proteins is provided. The composition is applied to a functional affinity column. Bound proteins are then eluted from the functional affinity column and prepared for mass spectrometry. At least a portion of the eluted protein is analyzed by mass spectrometry, thereby producing spectral information. The eluted protein is then identified by matching the spectral information with a theoretical mass spectrum of a protein having a known sequence. The function of the identified protein is ascribed based on the affinity chromatographic behavior of the identified protein.
According to yet another aspect of the present invention, an annotated sequence database is provided. The database comprises at least one nucleic acid sequence, wherein a function of a protein encoded by said at least one nucleic acid sequence is ascribed as follows. A composition containing one or more proteins is provided. The composition is applied to a functional affinity column. Bound proteins are then eluted from the functional affinity column and prepared for mass spectrometry. At least a portion of the eluted protein is analyzed by mass spectrometry, thereby producing spectral information. The eluted protein is then identified by matching the spectral information with a theoretical mass spectrum of a protein having a known sequence. The function of the identified protein is ascribed based on the affinity chromatographic behavior of the identified protein.
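One way such an annotated sequence database might record an ascribed function is sketched below. The record layout, the field names, the wording of the function label, and the example entries are all hypothetical; the sketch assumes only that each identified protein is annotated according to the ligand of the affinity column from which it was eluted.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceEntry:
    """One database record: a polypeptide or nucleic acid sequence plus annotations."""
    identifier: str
    sequence: str
    functions: list = field(default_factory=list)


def annotate(database, identified_ids, ligand):
    """Ascribe a function to each identified entry based on the column ligand."""
    label = f"binds {ligand}"
    for identifier in identified_ids:
        entry = database[identifier]
        if label not in entry.functions:  # avoid duplicate annotations
            entry.functions.append(label)
    return database


# Hypothetical entries, for illustration only.
database = {
    "protein_A": SequenceEntry("protein_A", "GAKVTR"),
    "protein_B": SequenceEntry("protein_B", "SSKLLR"),
}
annotate(database, ["protein_A"], ligand="ATP")
print(database["protein_A"].functions)
```

Entries that were not identified in the eluate keep an empty annotation list, so a protein with a known sequence but an unidentified function remains distinguishable from an annotated one.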