Altered DNA copy number is one of the many ways in which gene expression and function may be modified. Some variations are found among normal individuals, others occur in the course of normal processes in some species, and still others participate in causing various disease states. For example, many defects in human and non-human animal development are due to gains and losses of chromosomes and chromosomal segments that occur prior to or shortly after fertilization, whereas DNA dosage alterations that occur in somatic cells are frequent contributors to cancer. Therefore, detecting such aberrations and interpreting them within the context of broader knowledge facilitates identification of critical genes and pathways involved in biological processes and diseases, and provides clinically relevant information, such as the identification of efficacious drug regimens.
One obstacle in medical genetics has proven to be “ascertainment bias”, which refers to the inherent skewing of data that results from the manner in which the data are collected. Several examples of ascertainment bias are known. Many of the ‘classical’ patients described in the relevant art actually represent the more severe end of the spectrum, because such patients were much more likely to seek medical attention and therefore to be observed. For example, the classical descriptions of human patients with Klinefelter syndrome (47,XXY) depict a mentally retarded male with gynecomastia (breast development) and infertility. In fact, however, an unbiased population survey reveals that 1 in 1,000 men have this syndrome and that 80% of them have neither significant mental retardation nor gynecomastia (although all are infertile). Similarly, it was originally determined that the majority of females with Turner syndrome (45,X0) had mental retardation. This determination was also proven false, as those in the art had identified only the most severely affected patients. To overcome ascertainment bias, cytogeneticists conducted large studies on unselected newborns, so that the true rate of chromosomal abnormalities could be more rigorously investigated. Such studies required extremely labor- and time-intensive cytogenetic analysis, but the researchers realized that data must be obtained from a relatively large number of individuals to provide a reference population.
Conceptual and technical developments in molecular cytogenetics are now enhancing the resolving power of conventional chromosome analysis techniques to unprecedented levels. Over the past several years, array comparative genomic hybridization (array CGH) has demonstrated its value for analyzing DNA copy number variations. Array CGH is a new technology capable of examining chromosomes at a much higher resolution than standard cytogenetic techniques. Array CGH technology is expected to emerge as the dominant tool for diagnostics in the 21st century: a fundamental requirement for every cytogenetics and diagnostic reference laboratory, as well as for researchers focused on genetic research within academia and the biotechnology and pharmaceutical industries.
Copy-number variation presents an important opportunity in medical genetics. Until now, the importance of normal copy-number variation involving large segments of DNA has been unappreciated. Although array CGH has established the existence of copy number polymorphisms in human and non-human animal genomes, the picture of this normal variation is incomplete. In results reported to date, measurement noise has restricted detection to polymorphisms that involve genomic segments of many kilobases or larger, genome coverage has been far from comprehensive, and the population has not been adequately sampled.
A comprehensive understanding of these normal variations is of intrinsic biological interest and is essential for the proper interpretation of array CGH data and its relation to phenotype. Furthermore, understanding the copy number polymorphisms that are detectable by a particular array CGH technique is important so that normal variations are not falsely associated with disease, and, conversely, to determine if some so-called normal variation may underlie phenotypic characteristics such as disease susceptibility.
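The screening idea in the preceding paragraph, checking detected variants against known normal polymorphisms before drawing any link to disease, can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the method of the invention: the `Cnv` record, the `reciprocal_overlap` and `split_calls` helpers, the 50% overlap threshold, and the coordinates are all hypothetical names and values introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cnv:
    """Hypothetical record for one copy number variant call."""
    chrom: str
    start: int  # inclusive start coordinate
    end: int    # exclusive end coordinate
    kind: str   # "gain" or "loss"


def reciprocal_overlap(a: Cnv, b: Cnv) -> float:
    """Return the smaller of the two overlap fractions (0.0 if none)."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def split_calls(calls, known_polymorphisms, min_overlap=0.5):
    """Separate calls that match a known normal polymorphism from the rest.

    The 0.5 threshold is an assumption chosen for illustration only.
    """
    likely_normal, unexplained = [], []
    for call in calls:
        matched = any(
            reciprocal_overlap(call, ref) >= min_overlap
            for ref in known_polymorphisms
        )
        (likely_normal if matched else unexplained).append(call)
    return likely_normal, unexplained


# Made-up coordinates: one reference polymorphism and three patient calls.
reference = [Cnv("chr8", 1000, 2000, "loss")]
patient = [
    Cnv("chr8", 1100, 2000, "loss"),  # overlaps the reference loss
    Cnv("chr8", 5000, 6000, "loss"),  # no overlap with any reference entry
    Cnv("chr8", 1100, 2000, "gain"),  # same region, but a gain, not a loss
]
likely_normal, unexplained = split_calls(patient, reference)
```

In this sketch a call is set aside as likely normal only when it matches a reference entry in chromosome, direction (gain or loss), and extent. Calls left in `unexplained` are not thereby shown to be pathogenic; they are merely not accounted for by the reference set, which is why the completeness of that set matters.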
As such, the intense utilization of array CGH technology is driving the need to understand normal variation throughout human and non-human animal populations. The present invention provides compositions and methods that fill this unmet need, thus facilitating personalized genetics-based evaluation and treatment. Copy number abnormalities or variations currently represent an enormous untapped opportunity in the field of predictive personalized medicine. These copy number variations, also called copy number polymorphisms, occur both in normal individuals, as part of the changes that have arisen within populations, and in disease states. Being able to distinguish between normal copy number variations and those associated with a disease would permit a more accurate diagnosis based on a genetic analysis.
It is believed that copy number abnormalities are key genetic components that will be used to diagnose disease, as well as to differentiate pharmaceuticals by efficacy and adverse reactions in an individual. Since many disorders can be associated, in at least some cases, with very rare variants, the database used for such genetic analysis must be large. A smaller database can yield incorrect results, leading to erroneous diagnosis and treatment.
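A short calculation illustrates why database size matters for rare variants. The sketch below assumes, purely for illustration, that individuals in the database are sampled independently and that a fraction `carrier_freq` of the population carries the variant; under those assumptions the chance that a database of `n` individuals contains at least one carrier is 1 - (1 - carrier_freq)^n. The function names and the example frequency are hypothetical.

```python
import math


def prob_variant_seen(carrier_freq: float, n: int) -> float:
    """Probability that at least one of n independent individuals is a carrier."""
    if not 0.0 < carrier_freq < 1.0:
        raise ValueError("carrier_freq must lie strictly between 0 and 1")
    return 1.0 - (1.0 - carrier_freq) ** n


def min_database_size(carrier_freq: float, confidence: float) -> int:
    """Smallest n for which the variant is seen with at least the given confidence."""
    if not 0.0 < carrier_freq < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("both arguments must lie strictly between 0 and 1")
    # Solve 1 - (1 - f)^n >= c for n, then round up to a whole individual.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - carrier_freq))


# Hypothetical variant carried by 1 in 1,000 individuals.
small_db_chance = prob_variant_seen(0.001, 100)
needed_for_95 = min_database_size(0.001, 0.95)
```

With these assumed numbers, a database of 100 individuals has roughly a 10% chance of containing the variant at all, while on the order of 3,000 individuals are needed to see it with 95% confidence. A variant that is absent from a small database may therefore be wrongly taken to be disease-specific.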
For example, a chromosome 8q24.3 microdeletion was first detected in a patient with a rare pediatric syndrome, Kabuki make-up syndrome (KMS). KMS is a multiple malformation/mental retardation syndrome that was described initially in Japan but is now known to occur in many other ethnic groups. The investigators at the time did not yet appreciate how frequent such variants were, and the immediate temptation was to conclude that this variant was associated with the disorder being investigated. Further investigations, however, revealed the microdeletion to be present in a small percentage of Caucasians, none of whom suffered from KMS. Thirteen chromosomal abnormalities have been associated with KMS. However, no common abnormalities or breakpoints that might contribute to positional cloning of the putative KMS gene(s) are known (Matsumoto et al. 2003). Although the clinical manifestations of KMS are well established, its natural history, which is useful for genetic evaluation and advice, remains to be studied.
Because of the sheer number of variations that exist in the genetic material, and because copy number variations also occur in normal individuals, sophisticated analysis tools are required to interpret the results of any genetic evaluation. There is thus a need for methods and tools, such as the variation knowledge management tools of the present invention, that permit an accurate diagnosis of a sub-microscopic chromosomal variant.