The present invention relates to mammalian nucleic acid and protein molecules, and methods for their use in diagnostic and therapeutic applications including detecting metabolic and toxicological responses, and in monitoring drug mechanism of action.
Toxicity testing is a mandatory and time-consuming part of drug development programs in the pharmaceutical industry. A more rapid screen to determine the effects upon metabolism and to detect toxicity of lead drug candidates may be the use of gene expression microarrays. For example, microarrays of various kinds may be produced using full length genes or gene fragments. These arrays can then be used to test samples treated with the drug candidates to elucidate the gene expression pattern associated with drug treatment. This gene pattern can be compared with gene expression patterns associated with compounds which produce known metabolic and toxicological responses.
Benzo(a)pyrene is a known rodent and likely human carcinogen and is the prototype of a class of compounds, the polycyclic aromatic hydrocarbons (PAH). It is metabolized by several forms of cytochrome P450 (P450 isozymes) and associated enzymes to form both activated and detoxified metabolites. The ultimate metabolites are the bay-region diol epoxide, benzo(a)pyrene-7,8-diol-9,10-epoxide (BPDE) and the K-region diol epoxide, 9-hydroxy benzo(a)pyrene-4,5-oxide, both of which induce formation of DNA adducts. DNA adducts have been shown to persist in rat liver up to 56 days following treatment with benzo(a)pyrene at a dose of 10 mg/kg body weight three times per week for two weeks (Qu and Stacey (1996) Carcinogenesis 17:53-59).
Acetaminophen is a widely used analgesic. It is metabolized by specific cytochrome P450 isozymes with the majority of the drug undergoing detoxification by glucuronic acid, sulfate and glutathione conjugation pathways. However, at supratherapeutic doses, acetaminophen is metabolized to an active intermediate, N-acetyl-p-benzoquinone imine (NAPQI), which can cause hepatic and renal failure. NAPQI binds to sulfhydryl groups of proteins, causing their inactivation and leading to subsequent cell death (Kroger et al. (1997) Gen. Pharmacol. 28:257-263).
Clofibrate is a hypolipidemic drug which lowers elevated levels of serum triglycerides. In rodents, chronic treatment produces hepatomegaly and an increase in hepatic peroxisomes (peroxisome proliferation). Peroxisome proliferators (PPs) are a class of drugs which activate the PP-activated receptor in rodent liver, leading to enzyme induction, stimulation of S-phase, and a suppression of apoptosis (Hasmall and Roberts (1999) Pharmacol. Ther. 82:63-70). PPs include the fibrate class of hypolipidemic drugs, phenobarbitone, thiazolidinediones, certain non-steroidal anti-inflammatory drugs, and naturally-occurring fatty acid-derived molecules (Gelman et al. (1999) Cell. Mol. Life Sci. 55:932-943). Clofibrate has been shown to increase levels of cytochrome P450 4A. It is also involved in transcription of β-oxidation genes as well as induction of PP-activated receptors (Kawashima et al. (1997) Arch. Biochem. Biophys. 347:148-154). Peroxisome proliferation that is induced by both clofibrate and the chemically-related compound fenofibrate is mediated by a common inhibitory effect on mitochondrial membrane depolarization (Zhou and Wallace (1999) Toxicol. Sci. 48:82-89).
Toxicological effects in the liver are also induced by other compounds. These can include carbon tetrachloride (a necrotic agent), hydrazine (a steatotic agent), α-naphthylisothiocyanate (a cholestatic agent), 4-acetylaminofluorene (a liver mitogen), and their corresponding metabolites, which are used in experimental protocols to measure toxicological responses (Waterfield et al. (1993) Arch. Toxicol. 67:244-254).
The present invention provides mammalian nucleic acid and protein molecules, their use in diagnostic and therapeutic applications including detecting metabolic and toxicological responses, and in monitoring drug mechanism of action.
The invention provides a method for detecting or diagnosing the effect of a test compound or molecule associated with increased or decreased levels of nucleic acid molecules in a mammalian subject. The method comprises treating a mammalian subject with a known toxic compound or molecule which elicits a toxicological response, measuring levels of a plurality of nucleic acid molecules, selecting from the plurality of nucleic acid molecules those nucleic acid molecules that have levels modulated in samples treated with known toxic compounds or molecules when compared with untreated samples. Some of the levels may be upregulated by a toxic compound or molecule, others may be downregulated by a toxic compound or molecule, and still others may be upregulated with one known toxic compound or molecule and be downregulated with another known toxic compound or molecule. The selected nucleic acid molecules which are upregulated and downregulated by a known toxic compound or molecule are arrayed upon a substrate. The method further comprises measuring levels of nucleic acid molecules in the sample after the sample is treated with the toxic compound or molecule. Levels of nucleic acid molecules in a sample so treated are then compared with the plurality of the arrayed nucleic acid molecules to identify which sample nucleic acid molecules are upregulated and downregulated by the test compound or molecule. In one embodiment, the nucleic acid molecules are hybridizable array elements of a microarray.
Preferably, the comparing comprises contacting the arrayed nucleic acid molecules with the sample nucleic acid molecules under conditions effective to form hybridization complexes between the arrayed nucleic acid molecules and the sample nucleic acid molecules; and detecting the presence or absence of the hybridization complexes. In this context, similarity may mean that at least 1, preferably at least 5, more preferably at least 10, of the upregulated arrayed nucleic acid molecules form hybridization complexes with the sample nucleic acid molecules at least once during a time course to a greater extent than would the probes derived from a sample not treated with the test compound or molecule or a known toxic compound or molecule. Similarity may also mean that at least 1, preferably at least 5, more preferably at least 10, of the downregulated arrayed nucleic acid molecules form hybridization complexes with the sample nucleic acid molecules at least once during a time course to a lesser extent than would the sample nucleic acid molecules of a sample not treated with the test compound or a known toxic compound. In one aspect, the arrayed nucleic acid molecules comprise SEQ ID NOs: 1-47 or fragments thereof.
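The counting criterion described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that hybridization signals are available as numeric intensities for each array element across a time course and that the untreated baseline is a single value per element. The function names, argument names, and data layout are hypothetical and are not taken from the disclosure.

```python
from typing import Dict, List


def count_responding_elements(
    treated: Dict[str, List[float]],
    untreated: Dict[str, float],
    direction: str = "up",
) -> int:
    """Count array elements whose treated signal departs from the untreated
    baseline at least once during the time course.

    direction="up" counts elements with a greater signal at any time point;
    direction="down" counts elements with a lesser signal at any time point.
    """
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    count = 0
    for element, time_course in treated.items():
        baseline = untreated[element]
        if direction == "up":
            hit = any(signal > baseline for signal in time_course)
        else:
            hit = any(signal < baseline for signal in time_course)
        if hit:
            count += 1
    return count


def is_similar(n_responding: int, minimum: int = 1) -> bool:
    """Apply the 'at least 1, preferably 5, more preferably 10' style threshold."""
    return n_responding >= minimum
```

In this sketch the `minimum` argument would be set to 1, 5, or 10 to reflect the progressively more stringent thresholds recited above.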
Preferred toxic compounds are selected from the group consisting of hypolipidemic drugs, n-alkylcarboxylic acids, n-alkylcarboxylic acid precursors, azole antifungal compounds, leukotriene D4 antagonists, herbicides, pesticides, phthalate esters, phenyl acetate, dehydroepiandrosterone (DHEA), oleic acid, methanol and their corresponding metabolites, acetaminophen and its corresponding metabolites, benzo(a)pyrene, 3-methylcholanthrene, benz(a)anthracene, 7,12-dimethylbenz(a)anthracene, their corresponding metabolites, and the like, carbon tetrachloride, hydrazine, α-naphthylisothiocyanate, 4-acetylaminofluorene, and their corresponding metabolites. Preferred tissues are selected from the group consisting of liver, kidney, brain, spleen, pancreas and lung.
The arrayed nucleic acid molecules comprise fragments of messenger RNA transcripts of genes that are upregulated or downregulated at least 2-fold, preferably at least 2.5-fold, more preferably at least 3-fold, in tissues treated with known toxic compounds when compared with untreated tissues. Preferred arrayed nucleic acid molecules are selected from the group consisting of SEQ ID NOs: 1-47 or fragments thereof, some of whose expression is upregulated following treatment with a toxic compound or molecule and others of whose expression is downregulated following treatment with a toxic compound or molecule.
More preferable are SEQ ID NOs:2, 4, 6, 8, 9, and 11 which are upregulated following treatment with a toxic compound or molecule, and SEQ ID NOs: 1, 4, and 7 which are downregulated following treatment with a toxic compound or molecule.
The invention also provides a method comprising measuring levels of nucleic acid molecules in a sample after the sample is treated with a test compound or molecule. Levels of nucleic acid molecules in a sample so treated are then compared with the plurality of the arrayed nucleic acid molecules to identify which sample nucleic acid molecules are upregulated and downregulated by the test compound or molecule. In one embodiment, the nucleic acid molecules are hybridizable array elements of a microarray.
Alternatively, the invention provides methods for screening a sample for a metabolic response to a test compound or molecule.
Alternatively, the invention provides methods for screening a test compound or molecule for a previously unknown metabolic response.
In another aspect, the invention provides methods for preventing a toxicological response by administering complementary nucleotide molecules against one or more selected upregulated nucleic acid molecules or a ribozyme that specifically cleaves such molecules. Alternatively, a toxicological response may be prevented by administering sense nucleotide molecules for one or more selected downregulated nucleic acid molecules.
In yet another aspect, the invention provides methods for preventing a toxicological response by administering an agonist which initiates transcription of a gene comprising a downregulated nucleic acid molecule of the invention. Alternatively, a toxicological response may be prevented by administering an antagonist which prevents transcription of a gene comprising an upregulated nucleic acid molecule of the invention.
In another aspect, the invention provides nucleic acid molecules whose transcript levels are modulated in a sample during a metabolic response to a toxic compound or molecule. The invention also provides nucleic acid molecules whose transcript levels are upregulated in a sample during a metabolic response to a toxic compound or molecule. The invention also provides nucleic acid molecules whose transcript levels are downregulated in a sample during a metabolic response to a toxic compound or molecule. Upregulation or downregulation is at least 2-fold, more preferably at least 2.5-fold, even more preferably at least 3-fold. The metabolic response to a toxic compound or molecule may be a toxicological response. The invention also provides mammalian nucleic acid molecules which are homologous to the upregulated and downregulated nucleic acid molecules. In one aspect, preferred arrayed nucleic acid molecules are selected from the group consisting of SEQ ID NOs: 1-47, or fragments thereof.
The invention also provides a method for using a molecule selected from SEQ ID NOs: 1-59 or a portion thereof to screen a library of molecules to identify at least one ligand which specifically binds the selected molecule, the method comprising combining the selected molecule with the library of molecules under conditions allowing specific binding, and detecting specific binding, thereby identifying a ligand which specifically binds the selected molecule.
Such libraries include DNA and RNA molecules, peptides, peptide nucleic acids, agonists, antagonists, antibodies, immunoglobulins, drug compounds, pharmaceutical agents, and other ligands. In one aspect, the ligand identified using the method modulates the activity of the selected molecule. In an analogous method, the selected molecule or a portion thereof is used to purify a ligand. The method involves combining the selected molecule or a portion thereof with a sample under conditions to allow specific binding, detecting specific binding between the selected molecule and ligand, recovering the bound selected molecule, and separating the selected molecule from the ligand to obtain purified ligand. The invention further provides a method for using at least a portion of the proteins encoded by SEQ ID NOs:1-47 and the proteins of SEQ ID NOs: 48-59 to produce antibodies.
The invention further provides a method for inserting a marker gene into the genomic DNA of an animal to disrupt the expression of the natural nucleic acid molecule. The invention also provides a method for using the nucleic acid molecule to produce an animal model system, the method comprising constructing a vector containing the nucleic acid molecule; introducing the vector into a totipotent embryonic stem cell; selecting an embryonic stem cell with the vector integrated into genomic DNA; microinjecting the selected cell into a blastocyst, thereby forming a chimeric blastocyst; transferring the chimeric blastocyst into a pseudopregnant dam, wherein the dam gives birth to a chimeric animal containing at least one additional copy of nucleic acid molecule in its germ line; and breeding the chimeric animal to generate a homozygous animal model system.
The invention also provides a substantially purified mammalian protein or a portion thereof. The invention further provides isolated and purified proteins encoded by the nucleic acid molecules of SEQ ID NOs:1-11, 17-33, 36, 39, and 41. The invention further provides isolated and purified protein molecule of SEQ ID NOs:50 and 53. Additionally, the invention provides a pharmaceutical composition comprising a substantially purified mammalian protein or a portion thereof in conjunction with a pharmaceutical carrier.
The invention further provides an isolated and purified mammalian nucleic acid molecule variant having at least 70% nucleic acid sequence identity to the mammalian nucleic acid molecule selected from SEQ ID NO:1-47 and fragments thereof. The invention also provides an isolated and purified nucleic acid molecule having a sequence which is complementary to the mammalian nucleic acid molecule comprising a nucleic acid molecule selected from SEQ ID NO:1-47 and fragments thereof.
The invention further provides an expression vector containing at least a fragment of the mammalian nucleic acid molecule selected from the group consisting of SEQ ID NOs:1-47. In another aspect, the expression vector is contained within a host cell.
The invention also provides a method for producing a mammalian protein, the method comprising the steps of: (a) culturing the host cell containing an expression vector containing a mammalian nucleic acid molecule of the invention under conditions suitable for the expression of the polypeptide; and (b) recovering the polypeptide from the host cell culture.
The invention also provides a pharmaceutical composition comprising a substantially purified mammalian protein encoded by SEQ ID NOs:1-11, 17-33, 36, 39, and 41 and the amino acid sequence of SEQ ID NOs:50 and 53 and fragments thereof, in conjunction with a suitable pharmaceutical carrier.
The invention further includes an isolated and purified antibody which binds to a mammalian protein encoded by SEQ ID NOs:1-11, 17-33, 36, 39, and 41 and mammalian protein of SEQ ID NOs:50 and 53 or fragments thereof. The invention also provides a purified agonist and a purified antagonist.
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The Sequence Listing contains the nucleic acid sequence of exemplary mammalian nucleic acid molecules of the invention, SEQ ID NOs:1-47, 60-135, 137, and 138; the protein sequence of exemplary mammalian protein molecules of the invention, SEQ ID NOs:48-59, and 136.
“Sample” is used in its broadest sense. A sample containing nucleic acid molecules may comprise a bodily fluid; a cell; an extract from a cell, chromosome, organelle, or membrane isolated from a cell; genomic DNA, RNA, or cDNA in solution or bound to a substrate; a biological tissue or biopsy thereof; a fingerprint or tissue print; natural or synthetic fibres; in a solution; in a liquid suspension; in a gaseous suspension; in an aerosol; and the like.
“Plurality” refers preferably to a group of one or more members, preferably to a group of at least about 10, and more preferably to a group of at least about 100 members, and even more preferably a group of 10,000 members.
“Substrate” refers to a rigid or semi-rigid support to which nucleic acid molecules or proteins are bound and includes membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, capillaries or other tubing, plates, polymers, and microparticles with a variety of surface forms including wells, trenches, pins, channels and pores.
“Modulates” refers to a change in activity (biological, chemical, or immunological) or lifespan resulting from specific binding between a molecule and either a nucleic acid molecule or a protein.
“Microarray” refers to an ordered arrangement of hybridizable array elements on a substrate. The array elements are arranged so that there are preferably at least ten or more different array elements, more preferably at least 100 array elements, even more preferably at least 1000 array elements, and most preferably 10,000. Furthermore, the hybridization signal from each of the array elements is individually distinguishable. In a preferred embodiment, the array elements comprise nucleic acid molecules.
“Nucleic acid molecule” refers to a nucleic acid, oligonucleotide, nucleotide, polynucleotide or any fragment thereof. It may be DNA or RNA of genomic or synthetic origin, double-stranded or single-stranded, and combined with carbohydrate, lipids, protein, or other materials to perform a particular activity such as transformation or form a useful composition such as a peptide nucleic acid (PNA). “Oligonucleotide” is substantially equivalent to the terms amplimer, primer, oligomer, element, target, and probe and is preferably single stranded.
“Protein” refers to an amino acid sequence, oligopeptide, peptide, polypeptide, or portions thereof whether naturally occurring or synthetic. Exemplary portions are the first twenty consecutive amino acids of a mammalian protein encoded by SEQ ID NOs:1-11, 17-33, 36, 39, and 41 and mammalian protein of SEQ ID NOs:50 and 53.
“Up-regulated” refers to a nucleic acid molecule whose levels increased in a treated sample compared with the nucleic acid molecule in an untreated sample.
“Down-regulated” refers to a nucleic acid molecule whose levels decreased in a treated sample compared with the nucleic acid molecule in an untreated sample.
“Toxic compound” or “toxic agent” is any compound, molecule, or agent that elicits a biochemical, metabolic, and physiological response in an individual or animal, such as i) DNA damage, ii) cell damage, iii) organ damage or cell death, or iv) clinical morbidity or mortality.
“Toxicological response” refers to a biochemical, metabolic, and physiological response in an individual or animal which has been exposed to a toxic compound or agent.
“Fragment” refers to an Incyte clone or any part of a molecule which retains a usable, functional characteristic. Useful fragments include oligonucleotides and polynucleotides which may be used in hybridization or amplification technologies or in regulation of replication, transcription or translation. Exemplary fragments are the first sixty consecutive nucleotides of SEQ ID NOs:1-47. Useful fragments also include polypeptides and protein molecules which have antigenic potential and which may be used with a suitable pharmaceutical carrier in a pharmaceutical composition. Exemplary fragments are the first twenty consecutive amino acids of a mammalian protein encoded by SEQ ID NOs:1-11, 17-33, 36, 39, and 41 and mammalian protein of SEQ ID NOs:50 and 53.
“Hybridization complex” refers to a complex between two nucleic acid molecules by virtue of the formation of hydrogen bonds between purines and pyrimidines.
“Ligand” refers to any compound, molecule, or agent which will bind specifically to a complementary site on a nucleic acid molecule or protein. Such ligands stabilize or modulate the activity of nucleic acid molecules or proteins of the invention and may be composed of at least one of the following: inorganic and organic substances including nucleic acids, proteins, carbohydrates, fats, and lipids.
“Percent identity” or “% identity” refers to the percentage of sequence similarity found in a comparison of two or more amino acid or nucleic acid sequences. Percent identity can be determined electronically, e.g., by using the MEGALIGN program (DNASTAR, Madison Wis.) which creates alignments between two or more sequences according to methods selected by the user, e.g., the clustal method. (See, e.g., Higgins, D. G. and P. M. Sharp (1988) Gene 73:237-244.) The clustal algorithm groups sequences into clusters by examining the distances between all pairs. The clusters are aligned pairwise and then in groups. The percentage similarity between two amino acid sequences, e.g., sequence A and sequence B, is calculated by dividing the length of sequence A, minus the number of gap residues in sequence A, minus the number of gap residues in sequence B, into the sum of the residue matches between sequence A and sequence B, times one hundred. Gaps of low or of no similarity between the two amino acid sequences are not included in determining percentage similarity. Percent identity between nucleic acid sequences can also be counted or calculated by other methods known in the art, e.g., the Jotun Hein method. (See, e.g., Hein, J. (1990) Methods Enzymol. 183:626-645.) Identity between sequences can also be determined by other methods known in the art, e.g., by varying hybridization conditions.
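The percentage similarity calculation recited above can be sketched as follows. This is a minimal illustration and not the MEGALIGN implementation. It assumes that the two sequences have already been aligned to equal length, that "the length of sequence A" means its aligned length, and that gap residues are written as "-". These assumptions, and the function name, are not stated in the disclosure.

```python
def percent_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent similarity of two pre-aligned sequences.

    Computed as 100 * matches / (len(A) - gaps in A - gaps in B), where
    len(A) is taken to be the aligned length (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # A match is an identical residue at the same aligned position; two
    # aligned gap characters are not counted as a match.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    denominator = len(aligned_a) - aligned_a.count(gap) - aligned_b.count(gap)
    if denominator <= 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / denominator
```

For example, aligning `ACGT-A` against `ACCTTA` gives four matches over five ungapped positions.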
“Substantially purified” refers to nucleic acid molecules or proteins that are removed from their natural environment and are isolated or separated, and are at least about 60% free, preferably about 75% free, and most preferably about 90% free, from other components with which they are naturally associated.
The present invention provides mammalian nucleic acid and protein molecules and methods of using the nucleic acid molecules for screening test compounds and molecules for toxicological responses. Additionally the invention provides methods for characterizing the toxicological responses of a sample to a test compound or molecule. In particular, the present invention provides a composition comprising a plurality of nucleic acid molecules derived from human cDNA libraries, monkey cDNA libraries, mouse cDNA libraries, normal rat liver cDNA libraries, normalized rat liver cDNA libraries, prehybridized rat liver cDNA libraries, subtracted rat liver cDNA libraries, and rat kidney cDNA libraries. The nucleic acid molecules have been further selected for exhibiting upregulated or downregulated gene expression in rat livers when the rats have been exposed to a known hepatotoxin, including a peroxisomal proliferator (PP), acetaminophen or one of its corresponding metabolites, a polycyclic aromatic hydrocarbon (PAH), carbon tetrachloride, hydrazine, α-naphthylisothiocyanate, 4-acetylaminofluorene, and their corresponding metabolites.
PPs include hypolipidemic drugs, such as clofibrate, fenofibrate, clofenic acid, nafenopin, gemfibrozil, ciprofibrate, bezafibrate, halofenate, simfibrate, benzofibrate, etofibrate, WY-14,643, and the like; n-alkylcarboxylic acids, such as trichloroacetic acid, valproic acid, hexanoic acid, and the like; n-alkylcarboxylic acid precursors, such as trichloroethylene, tetrachloroethylene, and the like; azole antifungal compounds, such as bifonazole, and the like; leukotriene D4 antagonists; herbicides; pesticides; phthalate esters, such as di-[2-ethylhexyl]phthalate, mono-[2-ethylhexyl]phthalate, and the like; and natural chemicals, such as phenyl acetate, dehydroepiandrosterone (DHEA), oleic acid, methanol, and the like. In a preferred embodiment the toxin is clofibrate, or one of its corresponding metabolites. In another preferred embodiment the toxin is fenofibrate, or one of its corresponding metabolites.
PAHs include compounds such as benzo(a)pyrene, 3-methylcholanthrene, benz(a)anthracene, 7,12-dimethylbenz(a)anthracene, their corresponding metabolites, and the like. In a preferred embodiment the toxin is benzo(a)pyrene, or one of its corresponding metabolites.
SEQ ID NOs:1-16 were identified by their pattern of at least two-fold upregulation or downregulation following hybridization with sample nucleic acid molecules from rat liver tissue treated with a known toxic compound. SEQ ID NOs:17-47 were identified by their homology to the sample nucleic acid molecules from rat liver tissue treated with a known toxic compound. These and other nucleic acid molecules can be immobilized on a substrate as hybridizable array elements in a microarray format. The microarray may be used to characterize gene expression patterns associated with novel compounds to elucidate any toxicological responses or to monitor the effects of treatments during clinical trials or therapy where metabolic responses to toxic compounds may be expected.
When the nucleic acid molecules are employed as hybridizable array elements in a microarray, the array elements are organized in an ordered fashion so that each element is present at a specified location on the substrate. Because the array elements are at specified locations on the substrate, the hybridization patterns and intensities (which together create a unique expression profile) can be interpreted in terms of expression levels of particular genes and can be correlated with a toxicological response associated with a test compound or molecule.
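The relationship between array location and expression level can be illustrated with a minimal sketch. It assumes, for illustration only, that the array layout is recorded as a mapping from (row, column) coordinates to an array element identifier and that scanned intensities are available as a grid of numbers. The identifiers and data structures are hypothetical.

```python
from typing import Dict, List, Tuple


def expression_profile(
    layout: Dict[Tuple[int, int], str],
    intensities: List[List[float]],
) -> Dict[str, float]:
    """Translate a grid of hybridization intensities into a per-element
    expression profile using the known location of each array element."""
    profile: Dict[str, float] = {}
    for (row, col), element in layout.items():
        # Each element sits at one specified location, so its signal can be
        # read directly from that position on the scanned grid.
        profile[element] = intensities[row][col]
    return profile
```

Positions of the grid that carry no array element are simply ignored in this sketch.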
The invention also provides a substantially purified and isolated mammalian protein comprising the protein molecule of SEQ ID NOs:50 and 53 or portion thereof. The invention further provides isolated and purified proteins encoded by the nucleic acid molecules of SEQ ID NOs:1-11, 17-33, 36, 39, and 41, or portion thereof.
Furthermore, the present invention provides methods for screening test compounds or therapeutics for potential toxicological responses and for screening a sample's toxicological response to a particular test compound or molecule. Briefly, these methods entail treating a sample with the test compound or molecule to elicit a change in gene expression patterns comprising the expression of a plurality of sample nucleic acid molecules. Nucleic acid molecules are selected by identifying those genes in rat liver or kidney that are upregulated or downregulated at least 2-fold, more preferably at least 2.5-fold, most preferably at least 3-fold, when treated with a known toxic compound or molecule. The nucleic acid molecules are arrayed on a substrate. Then, the arrayed nucleic acid molecules and sample nucleic acid molecules are combined under conditions effective to form hybridization complexes which may be detected by methods well known in the art. Detection of higher or lower levels of such hybridization complexes compared with hybridization complexes derived from untreated samples and samples treated with a compound that is known not to induce a toxicological response correlates with a toxicological response of a test compound or a toxicological response to a molecule.
Molecules are identified that reflect all or most of the genes that are expressed in rat liver or kidney. Molecules may be identified by isolating clones derived from several types of rat cDNA libraries, including normal rat cDNA libraries, normalized rat cDNA libraries, prehybridized rat cDNA libraries, and subtracted cDNA libraries. Clone inserts derived from these clones may be partially sequenced to generate expressed sequence tags (ESTs). Molecules are also identified by comparing the clones from rat cDNA libraries with clones from human, monkey, and mouse cDNA libraries using computer software nucleic acid comparison programs such as BLAST (see, e.g., Altschul, S. F. (1993) J. Mol. Evol. 36:290-300; Altschul, et al. (1990) J. Mol. Biol. 215:403-410).
In one embodiment, two collections of ESTs are identified and sequenced. A first collection of ESTs (the originator molecules) is derived from the rat liver and kidney cDNA libraries presented in the Examples. A second collection includes ESTs derived from other rat cDNA libraries available in the ZOOSEQ database (Incyte Pharmaceuticals, Inc. Palo Alto Calif.).
The two collections of ESTs are clustered electronically to form master clusters of ESTs. Master clusters are formed by identifying overlapping EST molecules and assembling these ESTs. A nucleic acid fragment assembly tool, such as the Phrap tool (Phil Green, University of Washington) or the GELVIEW fragment assembly system (GCG, Madison Wis.), can be used for this purpose. The minimum number of clones which constitute a cluster is two. In another embodiment, a collection of human genes known to be expressed in response to toxic agents is used to select representative ESTs from the 113 rat cDNA libraries. The master cluster process is repeated for these molecules.
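The grouping of overlapping ESTs into master clusters can be illustrated with a toy sketch. It is not Phrap or GELVIEW and does not perform assembly. It assumes, for illustration only, that two ESTs overlap when a suffix of one exactly matches a prefix of the other over a minimum length, and it groups ESTs by single linkage using a union-find structure. The function names and the `min_len` parameter are hypothetical.

```python
from itertools import permutations
from typing import Dict, List


def overlaps(a: str, b: str, min_len: int) -> bool:
    """True if a suffix of `a` of at least `min_len` bases equals a prefix of `b`."""
    longest = min(len(a), len(b))
    return any(a[-k:] == b[:k] for k in range(min_len, longest + 1))


def master_clusters(ests: Dict[str, str], min_len: int = 4) -> List[List[str]]:
    """Group ESTs linked by suffix-prefix overlaps; keep clusters of two or more."""
    parent = {name: name for name in ests}

    def find(x: str) -> str:
        # Follow parent links to the cluster representative, shortening the
        # path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in permutations(ests, 2):
        if overlaps(ests[a], ests[b], min_len):
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for name in ests:
        groups.setdefault(find(name), []).append(name)
    # A cluster must contain at least two clones, as stated in the text.
    return sorted(sorted(g) for g in groups.values() if len(g) >= 2)
```

A real assembly tool would also tolerate mismatches, handle reverse complements, and detect one EST contained within another, none of which this sketch attempts.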
After assembling the clustered consensus nucleic acid sequences, a representative 5′ clone is nominated from each master cluster. The most 5′ clone is preferred because it is most likely to contain the complete gene. The nomination process is described in greater detail in “Relational Database and System for Storing Information Relating to Biomolecular Sequences and Reagents”, U.S. Ser. No. 09/034,807, filed Mar. 4, 1998, herein incorporated in its entirety by reference. The EST molecules are used as array elements on a microarray.
Selection of Arrayed Nucleic Acid Molecules
Samples are treated, preferably at subchronic doses, with one or more known toxic compounds over a defined time course. Preferably, the agents are peroxisomal proliferators (PPs), acetaminophen or one of its corresponding metabolites, polycyclic aromatic hydrocarbons (PAHs), carbon tetrachloride, hydrazine, α-naphthylisothiocyanate, 4-acetylaminofluorene, or their corresponding metabolites.
The gene expression patterns derived from such treated biological samples can be compared with the gene expression patterns derived from untreated biological samples to identify and select nucleic acid molecules whose expression is either upregulated or downregulated due to the response to the toxic compounds. These selected molecules may then be employed as array elements alone or in combination with other array element molecules. Such a microarray is particularly useful to detect and characterize gene expression patterns associated with known toxic compounds. Such gene expression patterns can then be used for comparison to identify other compounds which also elicit a toxicological response.
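The selection step can be illustrated with a minimal sketch that applies the at-least-2-fold criterion recited elsewhere in this disclosure. It assumes, for illustration only, that expression levels for treated and untreated samples are available as positive numeric values keyed by gene identifier. The function name and data layout are hypothetical.

```python
from typing import Dict, List, Tuple


def select_modulated(
    treated: Dict[str, float],
    untreated: Dict[str, float],
    fold: float = 2.0,
) -> Tuple[List[str], List[str]]:
    """Return (upregulated, downregulated) gene identifiers whose treated
    level differs from the untreated level by at least `fold`."""
    up: List[str] = []
    down: List[str] = []
    for gene, t in treated.items():
        u = untreated.get(gene)
        # Skip genes without a usable, positive measurement in both samples,
        # since a ratio cannot be formed for them.
        if u is None or u <= 0 or t <= 0:
            continue
        ratio = t / u
        if ratio >= fold:
            up.append(gene)
        elif ratio <= 1.0 / fold:
            down.append(gene)
    return sorted(up), sorted(down)
```

Setting `fold` to 2.5 or 3.0 would correspond to the more preferred thresholds.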
The arrayed nucleic acid molecules can be manipulated to optimize their performance in hybridization. To optimize hybridization, the arrayed nucleic acid molecules are examined using a computer algorithm to identify portions of genes without potential secondary structure. Such computer algorithms are well known in the art and are part of OLIGO 4.06 primer analysis software (National Biosciences, Plymouth Minn.) or LASERGENE software (DNASTAR, Madison Wis.). These programs can search within nucleic acid sequences to identify stem loop structures and tandem repeats and to analyze G+C content of the sequence (those molecules with a G+C content greater than 60% are excluded). Alternatively, the arrayed nucleic acid molecules can be optimized by trial and error. Experiments can be performed to determine whether sample nucleic acid molecules and complementary arrayed nucleic acid molecules hybridize optimally under experimental conditions.
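The G+C content screen mentioned above can be sketched as follows. This is a minimal illustration and not the OLIGO or LASERGENE implementation; it covers only the G+C criterion and does not search for stem loop structures or tandem repeats. The function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in `seq` that are G or C (case-insensitive)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_gc_filter(seq: str, max_gc: float = 0.60) -> bool:
    """Keep a candidate array element unless its G+C content exceeds
    the 60% limit described in the text."""
    return gc_fraction(seq) <= max_gc
```

A sequence with exactly 60% G+C passes in this sketch, because only molecules with a content greater than 60% are excluded.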
The arrayed nucleic acid molecules can be any RNA-like or DNA-like material, such as mRNAs, cDNAs, genomic DNA, peptide nucleic acids, branched DNAs and the like. The arrayed nucleic acid molecules can be in sense or antisense orientations.
In one embodiment, the arrayed nucleic acid molecules are cDNAs. The size of the DNA sequence of interest may vary, and is preferably from 50 to 10,000 nucleotides, more preferably from 150 to 3,500 nucleotides. In a second embodiment, the nucleic acid molecules are vector DNAs. In this case the size of the DNA sequence of interest, i.e., the insert sequence, may vary from about 50 to 10,000 nucleotides, more preferably from about 150 to 3,500 nucleotides.
The nucleic acid molecule sequences of the Sequence Listing have been prepared by current, state-of-the-art, automated methods and, as such, may contain occasional sequencing errors and unidentified nucleotides. Nucleotide analogues can be incorporated into the nucleic acid molecules by methods well known in the art. The only requirement is that the incorporated nucleotide analogues must serve to base pair with sample nucleic acid molecules. For example, certain guanine nucleotides can be substituted with hypoxanthine which base pairs with cytosine residues. However, these base pairs are less stable than those between guanine and cytosine. Alternatively, adenine nucleotides can be substituted with 2,6-diaminopurine which can form stronger base pairs than those between adenine and thymidine. Additionally, the nucleic acid molecules can include nucleotides that have been derivatized chemically or enzymatically. Typical modifications include derivatization with acyl, alkyl, aryl or amino groups.
The nucleic acid molecules can be immobilized on a substrate via chemical bonding. Furthermore, the molecules do not have to be directly bound to the substrate, but rather can be bound to the substrate through a linker group. The linker groups are typically about 6 to 50 atoms long to provide exposure to the bound nucleic acid molecule. Preferred linker groups include ethylene glycol oligomers, diamines, diacids and the like. Reactive groups on the substrate surface react with one of the terminal portions of the linker to bind the linker to the substrate. The other terminal portion of the linker is then functionalized for binding the nucleic acid molecule. Preferred substrates are any suitable rigid or semirigid support, including membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, microparticles and capillaries. The substrate can have a variety of surface forms, such as wells, trenches, pins, channels and pores, to which the arrayed nucleic acid molecules are bound.
The samples can be any sample comprising sample nucleic acid molecules and obtained from any bodily fluid (blood, urine, saliva, phlegm, gastric juices, etc.), cultured cells, biopsies, or other tissue preparations. The samples can be derived from any species, but preferably from eukaryotic species, and more preferably from mammalian species such as rat and human.
DNA or RNA can be isolated from the sample according to any of a number of methods well known to those of skill in the art. For example, methods of purification of nucleic acids are described in Tijssen, P. (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization With Nucleic Acid Probes, Part I. Theory and Nucleic Acid Preparation, Elsevier, New York, N.Y. In one preferred embodiment, total RNA is isolated using the TRIZOL total RNA isolation reagent (Life Technologies, Inc., Gaithersburg Md.) and mRNA is isolated using oligo d(T) column chromatography or glass beads. When sample nucleic acid molecules are amplified it is desirable to amplify the sample nucleic acid molecules and maintain the relative abundances of the original sample, including low abundance transcripts. RNA can be amplified in vitro, in situ, or in vivo (See Eberwine U.S. Pat. No. 5,514,545).
It is also advantageous to include controls within the sample to assure that amplification and labeling procedures do not change the true distribution of nucleic acid molecules in a sample. For this purpose, a sample is spiked with an amount of a control nucleic acid molecule predetermined to be detectable upon hybridization to its complementary arrayed nucleic acid molecule and the composition of nucleic acid molecules includes reference nucleic acid molecules which specifically hybridize with the control arrayed nucleic acid molecules. After hybridization and processing, the hybridization signals obtained should reflect accurately the amounts of control nucleic acid molecules added to the sample.
Prior to hybridization, it may be desirable to fragment the sample nucleic acid molecules. Fragmentation improves hybridization by minimizing secondary structure and cross-hybridization to other sample nucleic acid molecules in the sample or noncomplementary nucleic acid molecules. Fragmentation can be performed by mechanical or chemical means.
Labeling
The sample nucleic acid molecules may be labeled with one or more labeling moieties to allow for detection of hybridized arrayed/sample nucleic acid molecule complexes. The labeling moieties can include compositions that can be detected by spectroscopic, photochemical, biochemical, bioelectronic, immunochemical, electrical, optical or chemical means. The labeling moieties include radioisotopes, such as 32P, 33P or 35S, chemiluminescent compounds, labeled binding proteins, heavy metal atoms, spectroscopic markers, such as fluorescent markers and dyes, magnetic labels, linked enzymes, mass spectrometry tags, spin labels, electron transfer donors and acceptors, and the like. Preferred fluorescent markers include Cy3 and Cy5 fluorophores (Amersham Pharmacia Biotech, Piscataway N.J.).
Hybridization
The nucleic acid molecule sequences of SEQ ID NOs:1-47 and fragments thereof can be used in various hybridization technologies for various purposes. Hybridization probes may be designed or derived from SEQ ID NOs:1-47. Such probes may be made from a highly specific region such as the 5′ regulatory region or from a conserved motif, and used in protocols to identify naturally occurring sequences encoding the mammalian protein, allelic variants, or related sequences, and should preferably have at least 50% sequence identity to any of the protein sequences. The hybridization probes of the subject invention may be DNA or RNA and may be derived from the sequence of SEQ ID NOs:1-47 or from genomic sequences including promoters, enhancers, and introns of the mammalian gene. Hybridization or PCR probes may be produced using oligolabeling, nick translation, end-labeling, or PCR amplification in the presence of the labeled nucleotide. A vector containing the nucleic acid sequence may be used to produce an mRNA probe in vitro by addition of an RNA polymerase and labeled nucleic acid molecules. These procedures may be conducted using commercially available kits such as those provided by Amersham Pharmacia Biotech.
The stringency of hybridization is determined by G+C content of the probe, salt concentration, and temperature. In particular, stringency can be increased by reducing the concentration of salt or raising the hybridization temperature. In solutions used for some membrane-based hybridizations, addition of an organic solvent such as formamide allows the reaction to occur at a lower temperature. Hybridization can be performed at low stringency with buffers, such as 5×SSC with 1% sodium dodecyl sulfate (SDS) at 60° C., which permits the formation of a hybridization complex between nucleotide sequences that contain some mismatches. Subsequent washes are performed at higher stringency with buffers such as 0.2×SSC with 0.1% SDS at either 45° C. (medium stringency) or 68° C. (high stringency). At high stringency, hybridization complexes will remain stable only where the nucleic acid sequences are completely complementary. In some membrane-based hybridizations, preferably 35%, or most preferably 50%, formamide can be added to the hybridization solution to reduce the temperature at which hybridization is performed, and background signals can be reduced by the use of other detergents such as Sarkosyl or Triton X-100 and a blocking agent such as salmon sperm DNA. Selection of components and conditions for hybridization is well known to those skilled in the art and is reviewed in Ausubel (supra) and Sambrook et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview N.Y.
Hybridization specificity can be evaluated by comparing the hybridization of specificity-control nucleic acid molecules to specificity-control sample nucleic acid molecules that are added to a sample in a known amount. The specificity-control arrayed nucleic acid molecules may have one or more sequence mismatches compared with the corresponding arrayed nucleic acid molecules. In this manner, whether only complementary arrayed nucleic acid molecules are hybridizing to the sample nucleic acid molecules or whether mismatched hybrid duplexes are forming is determined.
Hybridization reactions can be performed in absolute or differential hybridization formats. In the absolute hybridization format, nucleic acid molecules from one sample are hybridized to the molecules in a microarray format and signals detected after hybridization complex formation correlate to nucleic acid molecule levels in a sample. In the differential hybridization format, the differential expression of a set of genes in two biological samples is analyzed. For differential hybridization, nucleic acid molecules from both biological samples are prepared and labeled with different labeling moieties. A mixture of the two labeled nucleic acid molecules is added to a microarray. The microarray is then examined under conditions in which the emissions from the two different labels are individually detectable. Molecules in the microarray that are hybridized to substantially equal numbers of nucleic acid molecules derived from both biological samples give a distinct combined fluorescence (Shalon et al. PCT publication WO95/35505). In a preferred embodiment, the labels are fluorescent markers with distinguishable emission spectra, such as Cy3 and Cy5 fluorophores.
After hybridization, the microarray is washed to remove nonhybridized nucleic acid molecules and complex formation between the hybridizable array elements and the nucleic acid molecules is detected. Methods for detecting complex formation are well known to those skilled in the art. In a preferred embodiment, the nucleic acid molecules are labeled with a fluorescent label and measurement of levels and patterns of fluorescence indicative of complex formation is accomplished by fluorescence microscopy, preferably confocal fluorescence microscopy.
In a differential hybridization experiment, nucleic acid molecules from two or more different biological samples are labeled with two or more different fluorescent labels with different emission wavelengths. Fluorescent signals are detected separately with different photomultipliers set to detect specific wavelengths. The relative abundances/expression levels of the nucleic acid molecules in two or more samples are obtained.
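By way of illustration only, relative expression levels from a two-label experiment may be computed as in the following Python sketch. It assumes that one channel carries the treated sample and the other the untreated sample, and that per-element intensities have already been extracted; the function name, the channel assignment, and the intensity values are hypothetical.

```python
import math


def log2_ratios(treated, control):
    """Return log2(treated/control) for each array element usable in both channels."""
    ratios = {}
    for element, t in treated.items():
        c = control.get(element)
        if c is None or c <= 0 or t <= 0:
            continue  # skip elements without a positive signal in both channels
        ratios[element] = math.log2(t / c)
    return ratios


# Hypothetical intensities: Cy5 channel assumed treated, Cy3 channel assumed untreated.
cy5 = {"elem_A": 800.0, "elem_B": 100.0, "elem_C": 250.0}
cy3 = {"elem_A": 200.0, "elem_B": 400.0, "elem_C": 250.0}
ratios = log2_ratios(cy5, cy3)
```

A positive value indicates greater signal in the treated channel, a negative value indicates lesser signal, and a value near zero indicates substantially equal hybridization from both samples.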
Typically, microarray fluorescence intensities can be normalized to take into account variations in hybridization intensities when more than one microarray is used under similar test conditions. In a preferred embodiment, individual arrayed-sample nucleic acid molecule complex hybridization intensities are normalized using the intensities derived from internal normalization controls contained on each microarray.
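By way of illustration only, normalization against internal controls may be sketched in Python as follows. The specification does not prescribe a particular normalization formula; scaling each microarray by the mean intensity of its control elements is an assumption made here for the example, and all names and values are hypothetical.

```python
def normalize_array(intensities, control_ids):
    """Scale intensities so that the mean of the internal controls equals 1.0."""
    controls = [intensities[c] for c in control_ids]
    mean_control = sum(controls) / len(controls)
    if mean_control <= 0:
        raise ValueError("control signal must be positive")
    return {k: v / mean_control for k, v in intensities.items()}


# Hypothetical intensities from two microarrays run under similar conditions.
array_1 = {"ctrl_1": 100.0, "ctrl_2": 300.0, "elem_A": 400.0}
array_2 = {"ctrl_1": 50.0, "ctrl_2": 150.0, "elem_A": 200.0}
norm_1 = normalize_array(array_1, ["ctrl_1", "ctrl_2"])
norm_2 = normalize_array(array_2, ["ctrl_1", "ctrl_2"])
```

Although the raw signal for elem_A differs twofold between the two microarrays, the normalized values agree, which is the purpose of the internal controls.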
The nucleic acid molecules whose relative abundance/expression levels are modulated by treatment of a sample with a known toxic compound can be used as hybridizable elements in a microarray. Such a microarray can be employed to identify expression profiles associated with particular toxicological responses. Then, a particular subset of the arrayed nucleic acid molecules can be identified whose expression is modulated in response to a particular toxicological agent. These nucleic acid molecules can be employed to identify other compounds with a similar toxicological response.
Alternatively, for some treatments with known side effects, the microarray, and expression patterns derived therefrom, is employed to prospectively define the treatment regimen. A dosage is established that minimizes expression patterns associated with undesirable side effects. This approach may be more sensitive and rapid than waiting for the patient to show toxicological side effects before altering the course of treatment.
Generally, the method for screening a library of test compounds or molecules to identify those with a toxicological response entails selecting a plurality of arrayed genes whose expression levels are modulated in tissues treated with known toxic compounds when compared with untreated tissues. Then a sample is treated with the test compound or molecule to induce a pattern of gene expression comprising the expression of a plurality of sample nucleic acid molecules. Tissues from a mammalian subject treated at various dosages of the test compound may be screened to determine which doses may be toxic.
Then, the expression levels of the arrayed genes and the sample nucleic acid molecules are compared to identify those compounds that induce expression levels of the sample nucleic acid molecules that are similar to those of the arrayed genes. In one preferred embodiment, gene expression levels are compared by contacting the arrayed genes with the sample nucleic acid molecules under conditions effective to form hybridization complexes between arrayed genes and sample nucleic acid molecules; and detecting the presence or absence of the hybridization complexes.
Similarity may mean that at least 1, preferably at least 5, more preferably at least 10, of the upregulated arrayed genes form hybridization complexes with the sample nucleic acid molecules at least once during a time course to a greater extent than would the nucleic acid molecules of a sample not treated with the test compound. Similarity may also mean that at least 1, preferably at least 5, more preferably at least 10, of the downregulated nucleic acid molecules form hybridization complexes with the arrayed genes at least once during a time course to a lesser extent than would the nucleic acid molecules of a sample not treated with the test compound.
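By way of illustration only, the similarity criterion described above may be sketched in Python as follows. The sketch assumes that hybridization signals are available for each arrayed gene at each time point, and it treats the upregulated and downregulated criteria as alternatives; the function names, the data layout, and the values are hypothetical.

```python
def count_matches(test, untreated, upregulated, downregulated):
    """Count marker genes that respond in the expected direction.

    test maps gene -> list of signals over the time course;
    untreated maps gene -> signal of the sample not treated with the compound.
    A gene counts if the criterion is met at least once during the time course.
    """
    up_hits = sum(1 for g in upregulated
                  if any(s > untreated[g] for s in test[g]))
    down_hits = sum(1 for g in downregulated
                    if any(s < untreated[g] for s in test[g]))
    return up_hits, down_hits


def is_similar(up_hits, down_hits, threshold=1):
    """Apply the 'at least 1 / 5 / 10' threshold to either gene set."""
    return up_hits >= threshold or down_hits >= threshold


# Hypothetical signals for three arrayed genes over three time points.
test = {"g1": [1.0, 3.0, 2.0], "g2": [1.0, 1.0, 1.0], "g3": [5.0, 2.0, 5.0]}
untreated = {"g1": 2.0, "g2": 2.0, "g3": 4.0}
up_hits, down_hits = count_matches(test, untreated, ["g1", "g2"], ["g3"])
```

Here one upregulated gene and one downregulated gene satisfy the criterion, so the pattern is similar at the threshold of 1 but not at the preferred thresholds of 5 or 10.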
Such a similarity of expression patterns means that a toxicological response is associated with the compound or therapeutic tested. Preferably, the toxic compounds belong to the class of peroxisomal proliferators (PPs), including hypolipidemic drugs, such as clofibrate, fenofibrate, clofenic acid, nafenopin, gemfibrozil, ciprofibrate, bezafibrate, halofenate, simfibrate, benzofibrate, etofibrate, WY-14,643, and the like; n-alkylcarboxylic acids, such as trichloroacetic acid, valproic acid, hexanoic acid, and the like; n-alkylcarboxylic acid precursors, such as trichloroethylene, tetrachloroethylene, and the like; azole antifungal compounds, such as bifonazole, and the like; leukotriene D4 antagonists; herbicides; pesticides; phthalate esters, such as di-[2-ethylhexyl]phthalate, mono-[2-ethylhexyl]phthalate, and the like; and natural chemicals, such as phenyl acetate, dehydroepiandrosterone (DHEA), oleic acid, methanol, and the like. In another embodiment, the toxic compound is acetaminophen or one of its corresponding metabolites. In yet another embodiment, the toxic compounds are polycyclic aromatic hydrocarbons (PAHs), including compounds such as benzo(a)pyrene, 3-methylcholanthrene, benz(a)anthracene, 7,12-dimethylbenz(a)anthracene, their corresponding metabolites, and the like. Of particular interest is the study of the toxicological responses of these compounds on the liver, kidney, brain, spleen, pancreas, and lung.
Modification of Gene Expression Using Nucleic Acids
Gene expression may be modified by designing complementary or antisense molecules (DNA, RNA, or PNA) to the control, 5′, 3′, or other regulatory regions of the mammalian gene. Oligonucleotides designed with reference to the transcription initiation site are preferred. Similarly, inhibition can be achieved using triple helix base-pairing which inhibits the binding of polymerases, transcription factors, or regulatory molecules (Gee et al. In: Huber and Carr (1994) Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco N.Y., pp. 163-177). A complementary molecule may also be designed to block translation by preventing binding between ribosomes and mRNA. In one alternative, a library of nucleic acid molecules or fragments thereof may be screened to identify those which specifically bind a regulatory, nontranslated sequence.
Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme molecule to complementary target RNA followed by endonucleolytic cleavage at sites such as GUA, GUU, and GUC. Once such sites are identified, an oligonucleotide with the same sequence may be evaluated for secondary structural features which would render the oligonucleotide inoperable. The suitability of candidate targets may also be evaluated by testing their hybridization with complementary oligonucleotides using ribonuclease protection assays.
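By way of illustration only, candidate cleavage sites of the kind named above may be located with the following Python sketch. The function name and the example sequence are hypothetical, and the sketch only scans for the GUA, GUU, and GUC triplets; it does not evaluate secondary structure or accessibility of the target.

```python
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_cleavage_sites(rna):
    """Return (position, triplet) pairs for each candidate cleavage site.

    Positions are 0-based offsets of the G in the target sequence.
    DNA-style input is accepted by converting T to U first.
    """
    rna = rna.upper().replace("T", "U")
    return [(i, rna[i:i + 3]) for i in range(len(rna) - 2)
            if rna[i:i + 3] in CLEAVAGE_TRIPLETS]


# Hypothetical target sequence, for illustration only.
sites = find_cleavage_sites("AUGGUCCAGUAUGUUGA")
```

Each site returned may then be taken forward to the structural evaluation and ribonuclease protection assays described above.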
Complementary nucleic acids and ribozymes of the invention may be prepared via recombinant expression, in vitro or in vivo, or using solid phase phosphoramidite chemical synthesis. In addition, RNA molecules may be modified to increase intracellular stability and half-life by addition of flanking sequences at the 5′ and/or 3′ ends of the molecule or by the use of phosphorothioate or 2′ O-methyl rather than phosphodiester linkages within the backbone of the molecule. Modification is inherent in the production of PNAs and can be extended to other nucleic acid molecules. The inclusion of nontraditional bases such as inosine, queuosine, and wybutosine, and/or the modification of adenine, cytidine, guanine, thymine, and uridine with acetyl-, methyl-, or thio- groups, renders the molecule less available to endogenous endonucleases.
Screening Assays
The nucleic acid molecule encoding the mammalian protein may be used to screen a library of molecules for specific binding affinity. The libraries may be DNA molecules, RNA molecules, PNAs, peptides, proteins such as transcription factors, enhancers, repressors, and other ligands which regulate the activity, replication, transcription, or translation of the nucleic acid molecule in the biological system. The assay involves combining the mammalian nucleic acid molecule or a fragment thereof with the library of molecules under conditions allowing specific binding, and detecting specific binding to identify at least one molecule which specifically binds the nucleic acid molecule.
Similarly the mammalian protein or a portion thereof may be used to screen libraries of molecules in any of a variety of screening assays. The portion of the protein employed in such screening may be free in solution, affixed to an abiotic or biotic substrate (e.g. borne on a cell surface), or located intracellularly. Specific binding between the protein and molecule may be measured. Depending on the kind of library being screened, the assay may be used to identify DNA, RNA, or PNA molecules, agonists, antagonists, antibodies, immunoglobulins, inhibitors, peptides, proteins, drugs, or any other ligand, which specifically binds the protein. One method for high throughput screening using very small assay volumes and very small amounts of test compound is described in U.S. Pat. No. 5,876,946, incorporated herein by reference, which screens large numbers of molecules for enzyme inhibition or receptor binding.
Purification of Ligand
The nucleic acid molecule or a fragment thereof may be used to purify a ligand from a sample. A method for using a mammalian nucleic acid molecule or a fragment thereof to purify a ligand would involve combining the nucleic acid molecule or a fragment thereof with a sample under conditions to allow specific binding, detecting specific binding, recovering the bound protein, and using an appropriate agent to separate the nucleic acid molecule from the purified ligand.
Similarly, the protein or a portion thereof may be used to purify a ligand from a sample. A method for using a mammalian protein or a portion thereof to purify a ligand would involve combining the protein or a portion thereof with a sample under conditions to allow specific binding, detecting specific binding between the protein and ligand, recovering the bound ligand, and using an appropriate chaotropic agent to separate the protein from the purified ligand.
Pharmacology
Pharmaceutical compositions are those substances wherein the active ingredients are contained in an effective amount to achieve a desired and intended purpose. The determination of an effective dose is well within the capability of those skilled in the art. For any compound, the therapeutically effective dose may be estimated initially either in cell culture assays or in animal models. The animal model is also used to achieve a desirable concentration range and route of administration. Such information may then be used to determine useful doses and routes for administration in humans.
A therapeutically effective dose refers to that amount of protein or inhibitor which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity of such agents may be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index, and it may be expressed as the ratio, LD50/ED50. Pharmaceutical compositions which exhibit large therapeutic indexes are preferred. The data obtained from cell culture assays and animal studies are used in formulating a range of dosage for human use.
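By way of illustration only, the therapeutic index defined above may be computed as in the following Python sketch. The function name and the dose values are hypothetical, and both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the ratio LD50/ED50; both doses must use the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("doses must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
ti = therapeutic_index(ld50=500.0, ed50=25.0)
```

A larger ratio indicates a wider margin between the therapeutically effective dose and the lethal dose.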
Animal models may be used as bioassays where they exhibit a toxic response similar to that of humans and where exposure conditions are relevant to human exposures. Mammals are the most common models, and most toxicity studies are performed on rodents such as rats or mice because of low cost, availability, and abundant reference toxicology. Inbred or outbred rodent strains provide a convenient model for investigation of the physiological consequences of under- or over-expression of genes of interest and for the development of methods for diagnosis and treatment of diseases. A mammal inbred to over-express a particular gene, so that the protein is secreted in milk, may also serve as a convenient source of the protein expressed by that gene.
Toxicology
Toxicology is the study of the effects of test compounds, molecules, or toxic agents on living systems to identify adverse effects. The majority of toxicity studies are performed on rats or mice to help predict whether adverse effects of agents will occur in humans. Observations of qualitative and quantitative changes in physiology, behavior, homeostatic, developmental, and reproductive processes, and lethality are used to generate profiles of safe or toxic responses and to assess the consequences on human health following exposure to the agent.
Toxicological tests measure the effects of a single, repeated, or long-term exposure of a subject to a substance. Substances may be tested for specific endpoints such as cytotoxicity, mutagenicity, carcinogenicity and teratogenicity. Degree of response varies according to the route of exposure (contact, ingestion, injection, or inhalation), age, sex, genetic makeup, and health status of the subject. Other tests establish the toxicokinetic and toxicodynamic properties of substances. Toxicokinetic studies trace the absorption, distribution in subject tissues, metabolism, storage, and excretion of substances. Toxicodynamic studies chart biological responses that are consequences of the presence of the substance in the subject tissues.
Genetic toxicology identifies and analyzes the ability of an agent to produce damage at a cellular or subcellular level. Such genotoxic agents usually have common chemical or physical properties that facilitate interaction with nucleic acids and are most harmful when mutated chromosomes are passed along to progeny. Toxicological studies may identify agents that increase the frequency of structural or functional abnormalities in progeny if administered to either parent before conception, to the mother during pregnancy, or to the developing organism. Mice and rats are most frequently used in these tests because of their short reproductive cycle which allows investigators to breed sufficient quantities of individual animals to satisfy statistical requirements.
All types of toxicology studies on experimental animals involve preparation of a suitable form of the compound for administration, selection of the route of administration, and selection of a species which resembles the species of pharmacological interest. Dose concentrations of the compound are varied to identify, measure, and investigate a range of dose-related effects related to exposure.
Acute toxicity tests are based on a single administration of the agent to the subject to determine the symptomology or lethality of the agent. Three experiments are conducted: an experiment to define the initial dose range; an experiment to narrow the range of effective doses; and a final experiment to establish the dose-response curve.
Prolonged and subchronic toxicity tests are based on the repeated administration of the agent. Rat and dog are commonly used in these studies to provide data from species in different taxonomic orders. With the exception of carcinogenesis, there is considerable evidence that daily administration of an agent at high-dose concentrations for periods of three to four months will reveal most forms of toxicity in adult animals.
Chronic toxicity tests, with a duration of a year or more, are used to demonstrate either the absence of toxicity or the carcinogenic potential of an agent. When studies are conducted on rats, at least one test group plus one control group are used. Animals are quarantined, examined for health, and monitored at the outset and at intervals throughout the experiment.
Transgenic Animal Models
Transgenic rodents which over-express or under-express a gene of interest may be inbred and used to model human diseases or to test therapeutic or toxic agents. (See U.S. Pat. No. 4,736,866; U.S. Pat. No. 5,175,383; and U.S. Pat. No. 5,767,337; incorporated herein by reference). In some cases, the introduced gene may be activated at a specific time in a specific tissue type during fetal development or postnatally. Expression of the transgene is monitored by analysis of phenotype or tissue-specific mRNA expression, in transgenic animals before, during, and after being challenged with experimental drug therapies.
Embryonic Stem Cells
Embryonic stem cells (ES) isolated from rodent embryos retain the potential to form an embryo. When ES cells are placed inside a carrier embryo, they resume normal development and contribute to all tissues of the live-born animal. ES cells are the preferred cells used in the creation of experimental knockout and knockin rodent strains. Mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and are grown under culture conditions well known in the art. Vectors for knockout strains contain a disease gene candidate modified to include a marker gene which disrupts transcription and/or translation of the endogenous disease candidate gene in vivo. The vector is introduced into ES cells by transformation methods such as electroporation, liposome delivery, microinjection, and the like which are well known in the art. The endogenous rodent gene is replaced by the disrupted disease gene through homologous recombination and integration during cell division. Expression of the marker gene confers a selective advantage to the transformed cells when incubated with an otherwise toxic/lethal selecting agent. Transformed ES cells are selected, identified, and preferably microinjected into mouse cell blastocysts such as those from the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams and the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous strains.
ES cells are also used to study the differentiation of various cell types and tissues in vitro, such as neural cells, hematopoietic lineages, and cardiomyocytes (Bain et al. (1995) Dev. Biol. 168:342-357; Wiles and Keller (1991) Development 111:259-267; and Klug et al. (1996) J. Clin. Invest. 98:216-224). Recent developments demonstrate that ES cells derived from human blastocysts may also be manipulated in vitro to differentiate into eight separate cell lineages, including endoderm, mesoderm, and ectodermal cell types (Thomson et al. (1998) Science 282:1145-1147).
Knockout Analysis
In gene knockout analysis, a region of a human disease gene candidate is enzymatically modified to include a non-mammalian gene such as the neomycin phosphotransferase gene (neo; Capecchi (1989) Science 244:1288-1292). The inserted coding sequence disrupts transcription and translation of the targeted gene and prevents biochemical synthesis of the disease candidate protein. The modified gene is transformed into cultured embryonic stem cells (described above), the transformed cells are injected into rodent blastulae, and the blastulae are implanted into pseudopregnant dams. Transgenic progeny are crossbred to obtain homozygous inbred lines.
Knockin Analysis
Totipotent ES cells, present in the early stages of embryonic development, can be used to create knockin humanized animals (pigs) or transgenic animal models (mice or rats) of human diseases. With knockin technology, a region of a human gene is injected into animal ES cells, and the human sequence integrates into the animal cell genome by recombination. Totipotent ES cells which contain the integrated human gene are handled as described above. Inbred animals are studied and treated to obtain information on the analogous human condition. These methods have been used to model several human diseases. (See, e.g., Lee et al. (1998) Proc. Natl. Acad. Sci. 95:11371-11376; Baudoin et al. (1998) Genes Dev. 12:1202-1216; and Zhuang et al. (1998) Mol. Cell Biol. 18:3340-3349).
Non-Human Primate Model
The field of animal testing deals with data and methodology from basic sciences such as physiology, genetics, chemistry, pharmacology and statistics. These data are paramount in evaluating the effects of therapeutic agents on non-human primates as they can be related to human health. Monkeys are used as human surrogates in vaccine and drug evaluations, and their responses are relevant to human exposures under similar conditions. Cynomolgus and Rhesus monkeys (Macaca fascicularis and Macaca mulatta, respectively) and Common Marmosets (Callithrix jacchus) are the most common non-human primates (NHPs) used in these investigations. Since great cost is associated with developing and maintaining a colony of NHPs, early research and toxicological studies are usually carried out in rodent models. In studies using behavioral measures such as drug addiction, NHPs are the first choice test animal. In addition, NHPs and individual humans exhibit differential sensitivities to many drugs and toxins and can be classified as a range of phenotypes from “extensive metabolizers” to “poor metabolizers” of these agents.
In additional embodiments, the nucleic acid molecules which encode the mammalian protein may be used in any molecular biology techniques that have yet to be developed, provided the new techniques rely on properties of nucleic acid molecules that are currently known, including, but not limited to, such properties as the triplet genetic code and specific base pair interactions.