It is known that the human genome contains simple nucleotide sequences that are repeated a different number of times in different individuals. This gives rise to a range of alleles, or variants, of different lengths that can be used as genetic markers to type the DNA of an individual.
Tandem repeat minisatellite and microsatellite regions in vertebrate DNA frequently show high levels of allelic variability in the number of repeat units. These highly informative genetic markers have found widespread application in population genetics, forensic science, medicine and other fields of natural science. For example, they can be used for linkage analysis, for determination of kinship in paternity and immigration disputes, and for individual identification in forensic medicine. In a minisatellite system, the core repeat unit is usually 15 or more base pairs long. To date, most studies and applications of such systems have relied on Southern blot estimation of allele length, which requires at least 50 ng of relatively undegraded DNA. It is often very difficult to extract such large amounts of DNA from forensic samples such as blood and semen stains.
Microsatellites, on the other hand, are short tandem repeat (STR) polymorphic DNA sequences, most commonly in the form of dinucleotide repeats such as (dC-dA)n, although trinucleotide and tetranucleotide repeats also occur. For a further discussion, see Pena, S. D. J. and Chakraborty, R. (1994), Paternity testing in the DNA era. Trends in Genetics 10, 204-209. Microsatellites can be amplified using the polymerase chain reaction (PCR). The resulting amplicons normally range from 80 to 800 base pairs (bp) in length and so are well suited to processing in automated sequencing machines, which are now widely used for gene scanning and typing. (See Reed, P. W. et al. (1994), Chromosome-specific microsatellite sets for fluorescence-based, semi-automated genome mapping. Nature Genet. 7, 390-395.) To date, most microsatellite polymorphisms have been based upon dinucleotide repeats. Because adjacent alleles of a dinucleotide repeat differ in size by only two base pairs, some of the results can be difficult to interpret. Tri- and tetranucleotide repeats are easier to use but occur less frequently in the human genome. Expansion of trinucleotide repeat sequences has also been implicated in a number of genetic diseases, including Huntington's disease, fragile X syndrome and myotonic dystrophy.
The present invention is based on the discovery, in the human inducible nitric oxide synthase (iNOS) gene, of a pentanucleotide repeat (CCTTT/GGAAA)n. The repeat is located approximately 2.8 kb upstream, in the 5' promoter region of the iNOS gene on chromosome 17q11.1-q11.2. Investigations have shown that this pentanucleotide repeat (referred to for convenience as Xu-1) occurs in widely varying numbers in different individuals; so far, 12 different variants or alleles have been detected, having between 7 and 18 contiguous Xu-1 repeats. The different alleles are referred to as A7, A8 . . . A18. Because the Xu-1 repeat is highly polymorphic in the human population, it lends itself to use as a microsatellite marker with applications in, for example, forensic medicine, population studies, family linkage studies and disease diagnosis.
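The allele naming scheme described above can be illustrated with a short sketch that counts the longest contiguous run of the CCTTT unit in a sequence and derives the corresponding allele label. This is an illustration only, not a method taken from the specification: the function names, the constant names and the example sequence (including its flanking bases) are hypothetical, and the sketch assumes a clean, already-determined DNA sequence rather than the fragment-length data a typing laboratory would normally work from.

```python
import re

# Repeat unit of Xu-1 as given in the text.
REPEAT_UNIT = "CCTTT"
# Repeat counts reported so far in the text (alleles A7 to A18).
REPORTED_COUNTS = range(7, 19)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_repeat_run(seq: str, unit: str = REPEAT_UNIT) -> int:
    """Count the longest contiguous run of `unit` on either strand."""
    seq = seq.upper()
    pattern = re.compile(f"(?:{unit})+")
    best = 0
    # Check the given strand and the opposite strand, since the repeat
    # may have been sequenced in either orientation.
    for strand in (seq, reverse_complement(seq)):
        for match in pattern.finditer(strand):
            best = max(best, len(match.group()) // len(unit))
    return best


def allele_name(repeat_count: int) -> str:
    """Label an allele by its repeat count, e.g. 9 repeats gives 'A9'."""
    return f"A{repeat_count}"


def is_reported(repeat_count: int) -> bool:
    """True if the count lies within the range reported in the text."""
    return repeat_count in REPORTED_COUNTS


# Hypothetical example: nine repeats between made-up flanking bases.
example = "GATTACA" + REPEAT_UNIT * 9 + "GATC"
count = longest_repeat_run(example)
label = allele_name(count)
```

Because each additional repeat adds five base pairs, adjacent alleles in this scheme differ in length by five base pairs, compared with two for a dinucleotide repeat.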