In general, the present invention relates to methods of generating fixed arrays of proteins or coded sets of protein-conjugated microparticles.
Certain macromolecules, such as proteins, are known to interact specifically with other molecules based on their three-dimensional shapes and electronic distributions. For example, proteins interact selectively with other proteins, nucleic acids, and small molecules. The identification of molecules that interact with proteins lays the groundwork for the development of compounds to treat diseases and their associated symptoms.
The discovery of a single drug candidate can require the screening of thousands of compounds. It is therefore important to be able to screen large numbers of compounds rapidly and efficiently. One method for screening a large number of compounds is to fix candidate binding partners, such as proteins, to a solid support.
The present invention features methods for tagging or “encoding” individual in vitro translated proteins, or groups of in vitro translated proteins, with unique and minimal encoding molecules, and related methods for subsequently sorting those encoded molecules onto solid supports or microparticles. The present invention also features methods for the identification of a desired binding partner (for example, a protein or other compound) using the encoded and sorted proteins of the invention. The invention facilitates the isolation of proteins with desired properties from large pools of partially or completely random amino acid sequences. The invention also facilitates the use of automated approaches to protein or compound screening methods.
Accordingly, in a first aspect, the invention features a method for encoding and sorting an in vitro translated protein, involving the steps of providing an in vitro translated protein attached to a nucleic acid linker and attaching the protein, through the nucleic acid linker, to an encoding molecule, thereby encoding the protein.
In one embodiment, this method further involves immobilizing the encoded protein onto a solid support. In another embodiment, the candidate protein is derived from an RNA-protein fusion molecule. In yet another embodiment, the encoding molecule is made of nucleic acids, or nucleic acid analogs. Preferably, the encoding molecule comprises a unique addressing element, a linker-specific alignment element, and a linkage element between the addressing element and the linker-specific alignment element. Furthermore, the linkage element of the encoding molecule may include polyethylene glycol units (preferably, hexaethylene oxide). In yet another embodiment, the candidate protein is attached to the encoding molecule through hybridization of the linker-specific alignment element of the encoding molecule to the nucleic acid linker of the candidate protein, or to the protein itself.
In a second aspect, the invention features a method for encoding an in vitro translated protein, involving the steps of providing an in vitro translated protein and binding a nucleic acid linker to the protein, wherein the nucleic acid linker contains an addressing element, thereby encoding the protein.
In a preferred embodiment, this method further involves immobilizing the encoded protein onto a solid support. In another preferred embodiment, the candidate protein is derived from an RNA-protein fusion molecule.
In a third aspect, the invention features a method for encoding an in vitro translated protein, involving the steps of providing an in vitro translated protein and binding a nucleic acid linker to the protein, wherein an addressing element branches off from the nucleic acid linker, thereby encoding the protein.
In one embodiment, this method further involves immobilizing the encoded protein formed in the last step of the invention onto a solid support. In another embodiment, the candidate protein is derived from an RNA-protein fusion molecule. In yet another embodiment, the addressing element is bound to the nucleic acid linker by a linkage element. The linkage element of the encoding molecule may include polyethylene glycol units. Preferably, the polyethylene glycol units are hexaethylene oxide.
In yet other embodiments of each of the above aspects of the invention, the solid support is a glass or silica-based chip, or a bead. A capture probe may be attached to the solid support, and may consist of nucleic acids or nucleic acid analogs. The encoded candidate protein may be immobilized onto the solid support by hybridizing the encoded candidate protein to the nucleic acid capture probe, thus sorting the protein according to the information contained in the encoding molecule.
In further embodiments of all of the aspects of the invention, the candidate protein is labeled with a reporter tag, which is preferably a fluorophore. An affinity tag may also be attached to the encoding molecule. One exemplary affinity tag is biotin.
In yet further embodiments, the encoding molecule and solid support are functionalized with a cross-linking moiety. Preferably, the cross-linking moiety is a psoralen, azido compound, or sulfur-containing molecule. In one embodiment, the 5′ terminus of the encoding molecule is functionalized with an electrophile that cross-links regioselectively with a nucleophilic amino acid side chain of the protein.
In a fourth aspect, the invention features a method for detecting an interaction between a protein and a compound, involving the steps of providing an encoded in vitro translated protein immobilized onto a solid support; contacting the protein with a candidate compound under conditions which allow an interaction between the protein and the compound; and analyzing the solid support for the presence of the compound as an indication of an interaction between the protein and the compound. The compound may be a nucleic acid, a protein, a therapeutic, or an enzyme.
In a fifth aspect, the invention features an in vitro translated protein attached to a nucleic acid linker and bound to an encoding molecule.
In a sixth aspect, the invention features an in vitro translated protein attached to an encoded nucleic acid linker molecule.
In a seventh aspect, the invention features an in vitro translated protein attached to a branched encoded nucleic acid linker molecule.
In various preferred embodiments, the protein is attached to a solid support bearing a capture probe. In other embodiments, the encoded protein is attached to the capture probe through hybridization or a covalent bond.
As used herein, by a “protein” is meant any two or more naturally occurring or modified amino acids joined by one or more peptide bonds. “Protein,” “peptide,” and “polypeptide” are used interchangeably.
By an “encoding molecule” is meant a unique tag which may be attached to a protein or peptide and which facilitates recognition of the protein among a population of proteins. The encoding molecule may be composed of nucleic acids, nucleic acid analogs, or non-nucleosides, but it is not comprised of the RNA that, when translated, yields the protein itself. By “encode” is meant to attach an encoding molecule.
By an “addressing element” is meant that portion of an encoding molecule which gives the encoding molecule its unique identity by differing sufficiently in sequence from other such elements in a given population. Preferably the addressing element is between 4 and 40 nucleotide units in length. In addition, the addressing element may comprise nucleic acids or nucleic acid analogs.
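As an illustration only, the requirement that addressing elements differ sufficiently in sequence from one another can be sketched as a simple check on a candidate set. The function names, the restriction to the DNA alphabet and to equal-length elements, and the `min_distance` threshold are all assumptions made for this sketch; the text specifies only the preferred length of 4 to 40 nucleotide units and does not define a particular distance measure.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_valid_address_set(addresses, min_distance=3):
    """Return True if every element is 4-40 nt of A/C/G/T and every pair
    differs at >= min_distance positions.

    min_distance is a hypothetical threshold chosen for illustration.
    """
    for seq in addresses:
        if not 4 <= len(seq) <= 40:
            return False
        if set(seq) - set("ACGT"):
            return False
    return all(
        hamming(a, b) >= min_distance for a, b in combinations(addresses, 2)
    )
```

A real design would likely also consider melting temperature and cross-hybridization between partially complementary elements, which this sketch does not model.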
By a “linker-specific alignment element” is meant that portion of an encoding molecule which hybridizes to the nucleic acid linker of an in vitro translated protein, or to the protein itself. The addressing element may consist of nucleic acids or nucleic acid analogs.
By a “linkage element” is meant that portion of an encoding molecule that joins the addressing element and the linker-specific alignment element together. The linkage element may be composed of nucleic acids, nucleic acid analogs, and non-nucleosides. Preferably the linkage element includes polyethylene glycol units, and more preferably the polyethylene glycol units are hexaethylene oxide.
By “sort” is meant to position in an organized manner or otherwise identify or separate. Encoded proteins may be sorted onto a solid support.
By a “solid support” is meant any solid surface including, without limitation, any chip (for example, silica-based, glass, or gold chip), glass slide, membrane, bead, solid particle (for example, agarose, sepharose, polystyrene or magnetic bead), column (or column material), test tube, or microtiter dish.
By a “microarray” is meant a fixed pattern of immobilized objects on a solid surface or membrane. Typically, the array is made up of encoded proteins bound to capture probes which themselves are immobilized on the solid surface or membrane. “Microarray” and “chip” are used interchangeably. Preferably the microarray has a density of between 10 and 1000 objects/cm².
By “capture probe” is meant a sequence of deoxyribonucleotides, ribonucleotides, or analogs thereof, which hybridize in a sequence dependent manner to the addressing element of a unique encoding molecule in a population. The capture probe may consist of nucleic acids or nucleic acid analogs.
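The sequence-dependent pairing of addressing elements with capture probes can be pictured with a minimal sketch, assuming that hybridization is modeled as exact Watson-Crick reverse complementarity between DNA sequences. The names `reverse_complement` and `sort_onto_support`, the dictionary layout, and the example sequences are hypothetical and serve only to illustrate how the address determines the position on the support.

```python
# Watson-Crick complement table for DNA (illustrative model only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def sort_onto_support(encoded_proteins, capture_probes):
    """Assign each encoded protein to the position whose capture probe is
    the exact reverse complement of its addressing element.

    encoded_proteins: dict of protein name -> addressing element sequence
    capture_probes:   dict of support position -> capture probe sequence
    Proteins with no matching probe are left unplaced.
    """
    probe_to_position = {probe: pos for pos, probe in capture_probes.items()}
    placement = {}
    for name, address in encoded_proteins.items():
        position = probe_to_position.get(reverse_complement(address))
        if position is not None:
            placement[name] = position
    return placement


if __name__ == "__main__":
    probes = {(0, 0): "CGTT", (0, 1): "GGAT"}
    proteins = {"protein_A": "AACG", "protein_B": "ATCC", "protein_C": "TTTT"}
    print(sort_onto_support(proteins, probes))
```

Actual hybridization tolerates mismatches and depends on temperature and buffer conditions, so exact matching is a simplification.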
By “nucleic acid linker” is meant a sequence of deoxyribonucleotides, ribonucleotides, or analogs thereof. The nucleic acid linker is not comprised of the RNA that, when translated, yields the protein to which it is attached.
By an “encoded DNA linker” is meant a sequence of deoxyribonucleotides which contains an addressing element. In a “branched encoded DNA linker,” the addressing element branches from an internal linker deoxyribonucleotide moiety. An encoded DNA linker may also comprise nucleic acid analogs.
By a “reporter tag” is meant a molecule whose presence can be monitored or detected. For example, the reporter tag can be a fluorophore.
By “therapeutic” is meant any molecule used to treat, ameliorate, improve, prevent, or stabilize a disease or symptom of a disease.
By an “RNA” is meant a sequence of two or more covalently bonded, naturally occurring or modified ribonucleotides. One example of a modified RNA included within this term is phosphorothioate RNA.
By “RNA-protein fusion” is meant an RNA molecule covalently bound to a protein.
By “functionalize” is meant to chemically modify in a manner that results in the attachment of a functional group or moiety. For example, an encoding molecule may be functionalized with an electrophile that cross-links regioselectively with a nucleophilic amino acid side chain of a protein or peptide. An encoding molecule or the capture probes of the solid support, in another example, can be functionalized with a cross-linking moiety such as psoralen, azido compounds, or sulfur-containing nucleosides.
The present invention provides a number of advantages. For example, the invention allows the employment of pre-made sets of universal encoding molecules, such as nucleic acids or nucleic acid analogues. These encoding molecules can be used in conjunction with corresponding universal microarrays or sets of microparticles to create novel protein-display systems. A system of pre-made encoding molecules is flexible, modular, scalable, and cost-effective. Another advantage of the present invention is the option of utilizing nucleic acid analogs which are not amenable to enzymatic incorporation or polymerization, but which are superior to conventional DNA or RNA in a number of respects. An additional advantage of the present invention is the ability to label proteins with fluorescent moieties, which can be used to monitor the protein in real time.
Yet another advantage of the present invention is the absence of RNA which encodes the protein in the final encoded and sorted product. This is important for several reasons. In particular, DNA is simpler to work with due to its chemical stability and its resistance to nucleases. In addition, the length of a protein’s RNA is directly related to the protein’s size, with large proteins possessing long RNA messages. Regions of these long RNAs sometimes have a propensity to adopt stable secondary structures which are difficult to predict, and these secondary structures can interfere with hybridization steps and protein folding and function. Accordingly, the development of a method to encode and sort proteins in the absence of the RNA which encodes the protein represents an advance in this field.
Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.
The drawings will first briefly be described.