The present invention relates to polynucleotide sequences associated with the equine Y chromosome and to methods of identifying such polynucleotide sequences. The present invention also relates to methods of determining the primary (i.e. genetic) sex of individuals and of samples of cells removed from individuals, and is particularly concerned with equine sex determination.
Many sectors of the various horse industries prefer a preponderance of animals of one sex. This may be for reasons of reproductive potential, heritability of particular traits, tractability, performance, stature and physique, appearance or other reasons.
The ability to determine the sex of a fetus is advantageous since it allows optimal management and valuation of pregnancies.
Where methods of assisted reproduction are available, by embryo transfer (with or without induced multiple ovulation) or by recovery and return into the donor or by in vitro fertilisation, the ability to determine the sex of an embryo is advantageous since it allows the sex of potential progeny to be predetermined. If combined with artificial twinning by means of embryo bisection (1,2) it further allows enhanced propagation of the desired sex without reduction in the total number of potential progeny.
It would be particularly advantageous to predetermine the sex of progeny by means of insemination of a receptive mare with sperm populations comprising a preponderance of sperm having one or the other sex chromosome constitution, i.e. either the X chromosome (which sperm yield female progeny) or the Y chromosome (which sperm yield male progeny). Such enriched populations of sperm could also be used to great advantage in in vitro fertilisation. In a further very advantageous application, an individual sperm cell of a known sex chromosome constitution can be injected into the cytoplasm of a mature oocyte in vitro (ICSI: intra-cytoplasmic sperm injection), effecting fertilisation to yield a zygote of known sex. The ability to determine the sex chromosome constitution of populations of sperm cells and of individual sperm cells is an essential prerequisite in such applications.
The primary sex of equine species, as in the overwhelming majority of mammalian species, is determined by the presence or absence of the entire Y chromosome or a functional portion thereof (3-8). The essential portion is a gene known as SRY that is responsible for initiating testis differentiation (9-11). Secretions of the resultant testis have a dominant influence on the development of secondary sex characters (12).
The sex or presumptive sex of an individual horse can thus be determined by analysis for DNA sequences that are associated uniquely with the equine Y chromosome.
Previous reports of DNA sequences associated with the equine Y chromosome (11,13,14) concern presumptive sequences that are amplified by polymerase chain reaction (PCR; 15,16) from primer oligonucleotides whose sequences are derived from genes known to be Y-linked in other mammalian species, viz. ZFY (13,14) and SRY (11,13). There are no published DNA sequence data for DNA sequences associated with the equine Y chromosome. Both ZFY and SRY occur in single copy in all mammalian species examined (with the known exception of Mus species, in which two similar Zfy genes have been described; 17) and so, presumably, in the horse. In the context of determining the genetic sex of viable embryos where only a small number of cells is available from a microscopic biopsy, assay sensitivity is a significant consideration. The advantages for embryo sexing of testing for a DNA sequence that is repeated on the Y chromosome have been detailed previously (18,19).
A report of a repeated DNA sequence that is found on the Y chromosome of horses (20) concerns a short DNA sequence element known as Bkm (5′-G.A.C/T.A-3′; 21-23) that has been reported in many vertebrate species. It is also abundant elsewhere in the genome, to the extent that representatives on the Y chromosome comprise a small minority of the total. Such a sequence, of itself, has no utility in the diagnosis of genetic sex in microscopic biopsies.
The present inventors have now identified specific DNA sequences that are repeated in the Y chromosome of the horse. The nucleic acid isolates correspond to all or part of a DNA sequence found on the Y chromosome of Equus caballus. The present invention therefore provides a number of polynucleotide isolates capable of specifically hybridizing to samples of nucleic acid derived from horses which contain Y chromosomal DNA sequences.
A procedure similar in essence to that used in the first part of the present invention has been applied previously to animals where it was used to observe, but not isolate or otherwise define, DNA fragments associated with the heterogametic sex of chicken (24), cattle (25) and sheep (26).
Accordingly, in a first aspect the present invention provides an isolated polynucleotide, the polynucleotide having a sequence as set out in any one of SEQ ID NOS: 1 to 4 or 8 to 11, or a sequence which hybridizes thereto.
The polynucleotide sequences of the present invention hybridize specifically to the equine Y chromosome. By “hybridize specifically to the equine Y chromosome” we mean the polynucleotides hybridize to a repeat sequence which is present on the equine Y chromosome in a substantially greater copy number than is present elsewhere in the equine genome. Preferably, the sequence is present in less than six copies and more preferably in only one copy in the haploid female genome.
In a preferred embodiment the polynucleotide sequence has a sequence as set out in SEQ ID NO: 3 or a sequence which hybridizes thereto.
The polynucleotide sequences of the present invention preferably hybridize to sequences set out in SEQ ID NOS: 1 to 4 or 8 to 11 under high stringency. When used herein, “high stringency” refers to conditions that (i) employ low ionic strength and high temperature for washing after hybridization, for example, 0.1×SSC and 0.1% (w/v) SDS at 50° C.; (ii) employ during hybridization conditions such that the hybridization temperature is 25° C. lower than the duplex melting temperature of the hybridizing polynucleotides, for example 1.5×SSPE, 10% (w/v) polyethylene glycol 6000 (27), 7% (w/v) SDS (28), 0.25 mg/ml fragmented herring sperm DNA at 65° C.; or (iii) employ, for example, 0.5 M sodium phosphate, pH 7.2, 5 mM EDTA, 7% (w/v) SDS (28) and 0.5% (w/v) BLOTTO (29,30) at 70° C.; or (iv) employ during hybridization a denaturing agent such as formamide (31), for example, 50% (v/v) formamide with 5×SSC, 50 mM sodium phosphate (pH 6.5) and 5×Denhardt's solution (32) at 42° C.; or (v) employ, for example, 50% (v/v) formamide, 5×SSC, 50 mM sodium phosphate (pH 6.8), 0.1% (w/v) sodium pyrophosphate, 5×Denhardt's solution (32), sonicated salmon sperm DNA (50 μg/ml) and 10% dextran sulphate (33) at 42° C. See generally references 34-36.
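Condition (ii) ties the hybridization temperature to the duplex melting temperature. Purely as an illustration, the following sketch estimates a melting temperature with a widely quoted empirical approximation for long DNA duplexes and then subtracts 25° C. The formula, its constants, the function names and the input values are assumptions introduced here for illustration; they are not taken from this specification and do not replace empirical optimisation.

```python
import math


def estimate_tm_celsius(na_molar, gc_percent, length_bp, formamide_percent=0.0):
    """Rough melting temperature (deg C) of a long DNA duplex.

    ASSUMPTION: uses the commonly quoted empirical approximation
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC)
             - 0.63*(%formamide) - 600/length
    which is not part of the specification.
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / length_bp
    )


def hybridization_temperature(tm_celsius, offset_celsius=25.0):
    """Hybridization temperature set a fixed offset below the duplex Tm."""
    return tm_celsius - offset_celsius


if __name__ == "__main__":
    # Hypothetical inputs: 1 M Na+, 50% GC, 600 bp probe, no formamide.
    tm = estimate_tm_celsius(na_molar=1.0, gc_percent=50.0, length_bp=600)
    print(f"estimated Tm: {tm:.1f} C")
    print(f"hybridize at: {hybridization_temperature(tm):.1f} C")
```

Keeping the offset as a parameter makes it easy to see how a different stringency margin would shift the hybridization temperature.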
In a further preferred embodiment, the polynucleotide which hybridises under stringent conditions is less than 500 nucleotides, more preferably less than 200 nucleotides, and more preferably less than 100 nucleotides in length.
In a further preferred embodiment, the polynucleotide sequences of the present invention share at least 40% homology, more preferably at least 60% homology, more preferably at least 80% homology, more preferably at least 90% homology and more preferably at least 95% homology with a sequence shown in any one of SEQ ID NOS: 1 to 4 or 8 to 11, wherein the homology is calculated by the BLAST program blastn as described in Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W. and Lipman, D. J. (1997) “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs”, Nucleic Acids Research 25(17):3389-3402.
In a further preferred embodiment, the polynucleotide sequence of the present invention hybridises under stringent conditions to a sequence characterised by nucleotides 990-2497 of SEQ ID NO: 8, 421-1920 of SEQ ID NO: 9, 421-1930 of SEQ ID NO: 10, or 1502-2996 of SEQ ID NO: 11.
The polynucleotide of the present invention may comprise DNA or RNA sequences.
The present invention also provides a vector including a polynucleotide sequence according to the first aspect of the present invention and a host cell transformed with such a vector.
In a second aspect, the present invention provides an oligonucleotide probe or primer of at least 8 nucleotides, the oligonucleotide having a sequence that hybridizes to a polynucleotide of the first aspect of the present invention.
In a preferred embodiment the oligonucleotide is at least 10, more preferably at least 15 and more preferably at least 18 nucleotides in length.
In one preferred embodiment the oligonucleotide is derived from the sequence shown in SEQ ID NO:3. In one preferred embodiment the oligonucleotide comprises the sequence:
5′-AGCGGAGAAAGGAATCTCTGG-3′ (SEQ ID NO: 12) or
5′-TACCTAGCGCTTCGTCCTCTAT-3′ (SEQ ID NO: 13) derived from nts 6-26 and the reverse complement of nts 184-205, respectively, of the equine male genomic DNA sequence shown in SEQ ID NO: 7.
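The relationship between a template sequence and a primer pair defined by 1-based nucleotide positions can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the template built below is a placeholder padded with N so that the two primers fall at positions 6-26 and 184-205. It is not the sequence of SEQ ID NO: 7.

```python
# Complement table for unambiguous DNA bases; other symbols (e.g. N) pass through.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_from_template(template, start, end, reverse=False):
    """Extract a primer from 1-based, inclusive positions start..end.

    With reverse=True the reverse complement of the region is returned,
    as needed for the primer that anneals to the opposite strand.
    """
    if not 1 <= start <= end <= len(template):
        raise ValueError("positions fall outside the template")
    region = template[start - 1:end]
    return reverse_complement(region) if reverse else region


FORWARD_PRIMER = "AGCGGAGAAAGGAATCTCTGG"   # SEQ ID NO: 12
REVERSE_PRIMER = "TACCTAGCGCTTCGTCCTCTAT"  # SEQ ID NO: 13

# PLACEHOLDER template (not SEQ ID NO: 7): N padding positions the primer
# sites at nts 6-26 and 184-205.
placeholder_template = (
    "N" * 5
    + FORWARD_PRIMER
    + "N" * 157
    + reverse_complement(REVERSE_PRIMER)
)

if __name__ == "__main__":
    print(primer_from_template(placeholder_template, 6, 26))
    print(primer_from_template(placeholder_template, 184, 205, reverse=True))
```

The coordinates are treated as inclusive, so nts 6-26 give a 21-mer and nts 184-205 give a 22-mer, matching the lengths of the two oligonucleotides.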
It will be appreciated that the probes or primers of the present invention may be produced by in vitro or in vivo synthesis. Methods of in vitro probe synthesis include organic chemical synthesis processes or enzymatically mediated synthesis, e.g. by means of SP6 RNA polymerase and a plasmid containing a polynucleotide sequence according to the first aspect of the present invention under transcriptional control of an SP6 specific promoter.
In a further preferred embodiment the oligonucleotide probe is conjugated with a label such as a radioisotope, an enzyme, biotin, a fluorescer or a chemiluminescer.
In a third aspect, the present invention provides a method of determining the sex of a horse, an equine fetus, an equine embryo or an equine cell(s) which method includes analysing a biological sample derived from the horse or the fetus or embryo or the population of cells, for the presence of a polynucleotide according to the first aspect of the present invention.
The equine cell(s) may be, for example, the sperm cells of a horse. In a preferred embodiment they may be populations of sperm cells or individual sperm cells that have been resolved by flow cytometry after staining with the fluorescent DNA-binding dye Hoechst 33342 (37,38).
The equine cell(s) may further be, for example, nucleated fetal cells. Such cells may be collected by amniocentesis or chorionic villus sampling. In a preferred embodiment they may be sampled in the peripheral blood of a pregnant mare (see generally reference 39 the disclosure of which is incorporated herein by reference).
In order to minimise the possibility of false negatives, the method is preferably conducted with one or more suitable positive controls. For example, the biological sample may be simultaneously analysed for the presence of a sequence which is present in approximately equal copy numbers in male and female horses. The biological sample may be analysed, for example, for the presence of a dispersed autosomal repeated sequence.
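The role of the positive control in guarding against false negatives can be expressed as a simple decision rule. The sketch below is an illustration only; the names are hypothetical, and treating any result without a control signal as inconclusive is a conservative assumption rather than a requirement stated in this specification.

```python
from enum import Enum


class SexCall(Enum):
    MALE = "male"
    FEMALE = "female"
    INCONCLUSIVE = "inconclusive"


def call_sex(y_signal, control_signal):
    """Interpret a two-target assay.

    y_signal:       True if the Y-associated sequence was detected.
    control_signal: True if the control sequence (present in both sexes,
                    e.g. a dispersed autosomal repeat) was detected.

    ASSUMPTION: a sample lacking the control signal is never called,
    because absence of the Y signal could then reflect assay failure
    or insufficient template rather than a female sample.
    """
    if not control_signal:
        return SexCall.INCONCLUSIVE
    return SexCall.MALE if y_signal else SexCall.FEMALE


if __name__ == "__main__":
    for y, c in [(True, True), (False, True), (False, False), (True, False)]:
        print(f"Y={y!s:5} control={c!s:5} -> {call_sex(y, c).value}")
```

Only a sample that gives the control signal without the Y-associated signal is called female, which is the case the positive control is intended to distinguish from a failed reaction.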
It will be understood by a person skilled in this field that an analysis to determine whether a sample contains the polynucleotide sequence of the present invention may be performed in a number of ways. For example, the analysis may involve Southern blot hybridization, dot blot hybridization or in situ hybridization tests using probes according to the present invention. Alternatively, the analysis may involve the technique of polymerase chain reaction (PCR; 16) or ligation amplification reaction (LAR; 40,41) using oligonucleotide primers and probes of the present invention.
The term “polymerase chain reaction” or “PCR” when used herein generally refers to a procedure where minute amounts of a specific piece of nucleic acid, RNA and/or DNA, are amplified as described in references 42 and 43. Generally, sequence information from the ends of the region of interest or beyond needs to be available, such that oligonucleotide primers can be designed; these primers will be identical in sequence or similar in sequence to opposite strands of the template to be amplified. The 5′ terminal nucleotides of the two primers may coincide with the ends of the amplified material. PCR can be used to amplify specific RNA sequences, specific DNA sequences from total genomic DNA, and cDNA transcribed from total cellular RNA, bacteriophage or plasmid sequences, etc. See generally references 16 and 44.
As used herein, PCR is considered to be one, but not the only, example of a nucleic acid polymerase reaction method for amplifying a nucleic acid test sample, comprising the use of an established nucleic acid (DNA or RNA) as a primer, and utilises a nucleic acid polymerase to amplify or generate a specific piece of nucleic acid or to amplify or generate a specific piece of nucleic acid which is complementary to a particular nucleic acid (see, for example, references 45 and 46).
The terms “ligation chain reaction” or “LCR” or “ligation amplification reaction” or “LAR” when used herein generally refer to a procedure where minute amounts of a specific piece of nucleic acid, RNA and/or DNA, are amplified as described in references 40 and 41. Generally, sequence information from the region of interest needs to be available, such that oligonucleotide pairs can be designed that are complementary to adjacent sites on an appropriate nucleic acid template. The oligonucleotide pair is ligated together by the action of a ligase enzyme. The amount of ligated product may be increased by either linear or exponential amplification using sequential rounds of such template-dependent ligation. In the case of linear amplification, a single pair of oligonucleotides is ligated, the reaction is heated to dissociate the ligation product from its template, and similar additional rounds of ligation are performed. Exponential amplification utilises two pairs of oligonucleotides, one pair being complementary to one strand of a target sequence and the other pair being complementary to the second strand of the target sequence. In this case the products of ligation serve as mutually complementary templates for subsequent rounds of ligation, interspersed with heating to separate the ligated products from the template strands. A single base-pair mismatch between the annealed oligonucleotides and the template prevents ligation, thus allowing the distinction of single base-pair differences between DNA templates. LAR can be used to amplify specific RNA sequences, specific DNA sequences from total genomic DNA, and cDNA transcribed from total cellular RNA, bacteriophage or plasmid sequences, etc. See generally references 40 and 41.
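The difference between linear and exponential ligation amplification can be illustrated with idealised arithmetic. The sketch below assumes that every template strand yields exactly one ligated product in every round (100% efficiency), which real reactions do not achieve; the function names and the example numbers are hypothetical.

```python
def linear_ligation_products(target_strands, rounds):
    """Ligated products after linear amplification.

    One oligonucleotide pair is ligated on each target strand in each
    round; the products are not themselves templates.
    ASSUMPTION: 100% ligation efficiency per round.
    """
    if target_strands < 0 or rounds < 0:
        raise ValueError("inputs must be non-negative")
    return target_strands * rounds


def exponential_ligation_products(target_duplexes, rounds):
    """Ligated products after exponential amplification.

    Two oligonucleotide pairs are used, one per strand, and each ligated
    product serves as a template in later rounds, so the number of
    template strands doubles every round.
    ASSUMPTION: 100% ligation efficiency per round.
    """
    if target_duplexes < 0 or rounds < 0:
        raise ValueError("inputs must be non-negative")
    initial_strands = 2 * target_duplexes
    total_strands = initial_strands * 2 ** rounds
    return total_strands - initial_strands  # exclude the original targets


if __name__ == "__main__":
    # Hypothetical example: 10 target copies, 20 rounds.
    print(linear_ligation_products(10, 20))
    print(exponential_ligation_products(10, 20))
```

Under these idealised assumptions the linear yield grows in proportion to the number of rounds, whereas the exponential yield roughly doubles with each round.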
As used herein, LAR is considered to be one, but not the only, example of a nucleic acid ligase reaction method for amplifying a nucleic acid test sample, comprising the use of an established nucleic acid (DNA or RNA) as a primer/probe, and utilises a nucleic acid ligase to amplify or generate a specific piece of nucleic acid or to amplify or generate a specific piece of nucleic acid which is complementary to a particular nucleic acid (see, for example, references 47 and 48).
In a fourth aspect, the present invention provides a kit for sex determination of a horse, an equine fetus, an equine embryo, an equine cell or a population of equine cells, which kit includes a polynucleotide according to the first aspect of the present invention or an oligonucleotide probe or primer according to the second aspect of the present invention.
The terms “EY.AC6”, “EY.AD11”, “EY.AI5” and “EY.AM7” as used herein refer to, where provided, the specific DNA sequences set forth in SEQ ID NOS: 1-4 respectively. These terms also include variants where nucleotides have been substituted, added to or deleted from the relevant sequences shown in SEQ ID NOS: 1-4 so long as the variants hybridize specifically to the equine Y chromosome.
Such variants may be naturally occurring variants which may arise within an individual or a population by virtue of point mutation(s), deletion(s) or insertion(s) of DNA sequences, by recombination, gene conversion, flawed replication or rearrangement. Alternatively, such variants may be produced artificially, for example by site-directed mutagenesis, by “gene shuffling”, by deletion using exonuclease(s) and/or endonuclease(s), or by the addition of DNA sequences by ligating portions of DNA together, or by the addition of DNA sequences by template-dependent and/or template-independent DNA polymerase(s).
The EY.AC6 DNA sequence is shown in SEQ ID NO: 1. The sequence, comprising 432 base pairs of nucleotides, was determined from a fragment of DNA that was cloned into plasmid pGEM-T (trademark Promega). The cloned fragment had been recovered from a polyacrylamide gel following electrophoresis and staining of the products of RAPD PCR of male equine genomic DNA with Operon (trademark) primer OPAC.06. The fragment was selected because it was visible as a product of RAPD PCR of male but not female genomic DNA. Homologues of the cloned fragment EY.AC6 have been shown, by its hybridization to Southern blots of genomic DNA from male and female Equus caballus, to be present in both sexes but are repeated at much higher amounts in males, with the haploid female genome containing just one or a small number of copies. The defined sequence EY.AC6 appears to be contiguous with sequence EY.AM7 in the equine Y chromosome since the two sequenced isolates share a region of overlap of 128 bp with 91% similarity.
The EY.AD11 DNA sequence is shown in SEQ ID NO: 2. The sequence, comprising 600 base pairs of nucleotides, was determined from a fragment of DNA that was cloned into plasmid pGEM-T (trademark Promega). The cloned fragment had been recovered from a polyacrylamide gel following electrophoresis and staining of the products of RAPD PCR of male equine genomic DNA with Operon (trademark) primer OPAD.11. The fragment was selected because it was visible as a product of RAPD PCR of male but not female genomic DNA. Homologues of the cloned fragment EY.AD11 have been shown, by its hybridization to Southern blots of genomic DNA from male and female Equus caballus, to be present in both sexes but are repeated at much higher amounts in males, with the haploid female genome containing just one or a small number of copies.
The EY.AI5 DNA sequence is shown in SEQ ID NO: 3. The sequence, comprising 230 base pairs of nucleotides, was determined from a fragment of DNA that was cloned into plasmid pGEM-3Z (trademark Promega). The cloned fragment had been recovered from a polyacrylamide gel following electrophoresis and staining of the products of RAPD PCR of male equine genomic DNA with Operon (trademark) primer OPAI.05. The fragment was selected because it was visible as a product of RAPD PCR of male but not female genomic DNA. Homologues of the cloned fragment EY.AI5 have been shown, by its hybridization to Southern blots of genomic DNA from male and female Equus caballus, to be present in both sexes but are repeated at much higher amounts in males, with the haploid female genome containing just one or a small number of copies.
The EY.AM7 DNA sequence is shown in SEQ ID NO: 4. The sequence, comprising 285 base pairs of nucleotides, was determined from a fragment of DNA that was cloned into plasmid pGEM-T (trademark Promega). The cloned fragment had been recovered from a polyacrylamide gel following electrophoresis and staining of the products of RAPD PCR of male equine genomic DNA with Operon (trademark) primer OPAM.07. The fragment was selected because it was visible as a product of RAPD PCR of male but not female genomic DNA. Homologues of the cloned fragment EY.AM7 have been shown, by its hybridization to Southern blots of genomic DNA from male and female Equus caballus, to be present in both sexes but are repeated at much higher amounts in males, with the haploid female genome containing just one or a small number of copies. The defined sequence EY.AM7 appears to be contiguous with sequence EY.AC6 in the equine Y chromosome since the sequences isolated share a region of overlap of 128 bp with 91% similarity.
The DNA sequences described herein (SEQ ID NOS: 1-4) were determined by chain-termination DNA sequencing techniques (49) using fluorescence-labelled dideoxynucleotides (50-53).
It will be appreciated by those skilled in the art that the polynucleotide sequences of the present invention are advantageous in that they are present in multiple copies on the Y chromosome, thereby providing greater sensitivity in assays for the presence of a Y chromosome than is possible when the assay involves detection of a unique (single copy) DNA sequence. This allows detection to be applied with greater facility to very small samples, as in a few cells removed from a viable embryo (2) or cells of fetal origin in peripheral blood of a pregnant mare (39) or sperm cells separated by fluorescence activated cell sorting (38).
The polynucleotide sequences and oligonucleotide primers and probes of the present invention have application, for example, in sexing of embryo biopsy; fetal sex detection, i.e. by amniocentesis, chorionic villus sampling, fetal cells circulating in peripheral blood of a pregnant mare; analysis of the sex chromosome constitution of an individual sperm cell or of populations of sperm cells; resolution of ambiguities in sexual phenotype; sex analysis of tissues derived from horses (meat, hide, hair, bone, etc. from living or dead horses); and similar applications in related equine species, including extinct or endangered species.
The polynucleotide sequences and oligonucleotide primers and probes of the present invention also have a variety of uses in addition to their use in sexual identification. For example, the sequences may be used to screen recombinant DNA libraries prepared from a variety of mammalian species. The DNA sequences may be used to deduce similar sequences or genetically linked sequences having similar functionality. The sequences may also be used in chromosome walking or jumping techniques to isolate coding and non-coding sequences proximal to the nucleotide sequence of the present invention.
According to a further aspect of the present invention, there is provided a method for the isolation of Y-chromosomal DNA sequences comprising:
(i) pooling equivalent amounts of genomic DNA from a number of male animals of a single species and pooling equivalent amounts of genomic DNA from a similar number of female animals of the same species, with the female animals preferably being related closely to the male animals, e.g. siblings;
(ii) subjecting equivalent samples of the male and female pooled DNA mixtures to PCR with an arbitrary oligonucleotide primer and resolving the resultant fragments by gel electrophoresis;
(iii) examining the stained resolved products for fragments that are amplified from male DNA but not from female DNA;
(iv) recovering said fragment(s) from an electrophoresis gel and isolating individual fragments by molecular cloning; and
(v) PCR analysis of samples of male and female genomic DNA using oligonucleotide primers derived from the DNA sequence of said isolated fragment(s).
In a preferred embodiment the method includes the additional step after step (iii) of confirming the male association of fragment(s) by PCR and electrophoretic analysis of equivalent genomic DNA samples from a number of individual male and female animals. Preferably the method also includes an additional step after step (iv) of confirming the isolation of individual male-associated fragment(s) by hybridization of the labelled said fragment(s) with samples of male and female genomic DNA.
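The comparison of resolved products from the male and female pools can be sketched as a search for bands present in the male lane with no counterpart of similar size in the female lane. This is an illustration only: in practice the comparison is made on the stained gel, and the function name, the size tolerance and the band sizes below are hypothetical.

```python
def male_specific_bands(male_bands_bp, female_bands_bp, tolerance_bp=5):
    """Return male-lane band sizes with no female-lane band of similar size.

    male_bands_bp / female_bands_bp: estimated fragment sizes (bp) from the
    pooled male and pooled female reactions with the same arbitrary primer.
    tolerance_bp: ASSUMED sizing uncertainty; bands closer than this are
    treated as the same fragment.
    """
    return [
        size
        for size in sorted(male_bands_bp)
        if all(abs(size - other) > tolerance_bp for other in female_bands_bp)
    ]


if __name__ == "__main__":
    # Hypothetical band sizes for one arbitrary primer.
    male_lane = [230, 432, 600, 900]
    female_lane = [228, 900]
    print(male_specific_bands(male_lane, female_lane))
```

Candidate bands found in this way would still need the confirmation steps described above, since a band absent from the pooled female lane may reflect individual polymorphism rather than Y linkage.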
The terms “comprise”, “comprises” and “comprising” as used throughout the specification are intended to refer to the inclusion of a stated component or feature or group of components or features with or without the inclusion of a further component or feature or group of components or features.
The present invention will now be described, by way of example only, with reference to the following non-limiting drawings and examples.