Tandem repeats occur in DNA when a pattern of one or more nucleotides is repeated and the repetitions are directly adjacent to each other. Detection of tandem repeats helps determine an individual's inherited traits and can also determine the individual's parentage. However, detection of tandem repeats can go beyond these uses. In particular, telomeres and sub-telomeres are tandem repeats. Knowing the length of telomeres can have clinical diagnostic applications.
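As a minimal illustration of what detecting a tandem repeat means in practice, the sketch below scans a sequence for runs of directly adjacent copies of a repeat unit. The function name `find_tandem_repeats`, the `min_copies` threshold, and the example sequence are hypothetical and chosen only for illustration; exact string matching is assumed, so sequencing errors and degenerate repeats are not handled.

```python
import re

def find_tandem_repeats(seq, unit="TTAGGG", min_copies=2):
    """Return (start, end, copies) for each run of adjacent copies of `unit`."""
    # Build a pattern such as (?:TTAGGG){2,} that matches back-to-back copies.
    pattern = re.compile(r"(?:%s){%d,}" % (re.escape(unit), min_copies))
    runs = []
    for match in pattern.finditer(seq):
        copies = (match.end() - match.start()) // len(unit)
        runs.append((match.start(), match.end(), copies))
    return runs

# Hypothetical sequence: four adjacent copies, then one isolated copy.
example = "ACGT" + "TTAGGG" * 4 + "CCATG" + "TTAGGG"
print(find_tandem_repeats(example))  # [(4, 28, 4)]
```

The isolated final copy is not reported because a single copy is not a tandem repeat under the assumed threshold of two.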
The nuclear DNA in the human genome is partitioned into 23 separate pairs of chromosomes. Each pair of sister chromatids is attached by a protein complex at a central region of the chromosome known as the centromere. The distal regions of each chromatid are known as telomeres. Telomeres, which contain long stretches of repetitive nucleotide sequences at the termini of these linear DNA strands, are found in most eukaryotic organisms. For vertebrates, the repeated nucleotide sequence in telomeres is TTAGGG, and the total length of the repeat array can be many kilobases (kb) in humans (Moyzis et al., 1988).
The DNA polymerase protein complex responsible for DNA replication can only add nucleotides to an existing DNA or RNA strand that is paired with the template strand, and can only extend the new DNA sequence in the 5′ to 3′ direction. Thus, replication must begin from a short nucleic acid primer that is bound to the template DNA strand. As a consequence, the polymerase is not able to replicate the sequences at the ends of the chromatid fibers, so chromatids become shorter with each successive cell division and the information in the telomeric region is lost. In normal human somatic cells such as fibroblasts, endothelial cells, and epithelial cells, telomeres have been shown to become shorter by 8-33 repeat sequences (50-200 bp) with each cell division event (Blackburn, 2000, 2001). The cumulative loss of telomeric DNA with successive cell divisions is believed to limit the number of times that a cell can divide. In human fibroblasts, this limit occurs after the cell population has doubled 50-100 times. The cells then remain in a quiescent but viable state for several months (Vaziri et al., 1994). Consequently, cell division stops before vital genetic information is lost from the chromosome.
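The figures quoted above can be checked with simple arithmetic. The sketch below converts repeats lost per division into base pairs and estimates a cumulative loss. It assumes, purely for illustration, a constant loss per division and one division per population doubling in a given lineage; neither assumption is stated in the cited studies, and the function names are hypothetical.

```python
REPEAT_UNIT_BP = 6  # length of the vertebrate telomeric repeat TTAGGG

def loss_per_division_bp(repeats_lost):
    """Convert a number of lost repeat units into base pairs."""
    return repeats_lost * REPEAT_UNIT_BP

def cumulative_loss_kb(bp_per_division, divisions):
    """Total telomeric DNA lost, in kb, assuming a constant loss per division."""
    return bp_per_division * divisions / 1000.0

# 8-33 repeats correspond to roughly 50-200 bp per division.
print(loss_per_division_bp(8), loss_per_division_bp(33))  # 48 198

# Illustrative bounds: 50 bp x 50 divisions and 200 bp x 100 divisions.
print(cumulative_loss_kb(50, 50), cumulative_loss_kb(200, 100))  # 2.5 20.0
```

Under these assumptions the cumulative loss falls between about 2.5 kb and 20 kb, which is the same order of magnitude as the kilobase-scale telomere lengths discussed above.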
Some types of white blood cells, certain stem cells such as embryonic stem cells, and germ cells can express an active form of telomerase that is capable of adding the repetitive nucleotide sequences to the ends of the DNA (Hiyama and Hiyama, 2007). This enzyme can “reset” the cell to an embryonic state, which restores its ability to undergo cell division. Developing the ability to reactivate telomerase in quiescent somatic cells, and thereby restore their ability to undergo cell division, has important implications for the restoration of damaged tissues. However, the activation of telomerase is known to contribute significantly to the ability of malignant cells to proliferate and become immortalized.
Conversely, many aging-related diseases are linked to shortened telomeres (Zhu et al., 2011). Eukaryotic telomere ends contain a 3′ single-stranded DNA overhang that forms a T-loop (telomeric loop). This loop is stabilized by a triple-stranded DNA structure known as a D-loop (displacement loop) that is also bound to several proteins that form an end cap. When telomeres become too short, there is an increased potential for damage to the end cap, which can cause the cell to stop growing or go into senescence (cellular old age). Chromosomal fusions can also result when telomeres are uncapped; these fusions cannot be repaired in somatic cells and can induce apoptosis (cell death). Such increases in the number of cells undergoing senescence and apoptosis ultimately result in age-related organ deterioration (Aubert and Lansdorp, 2008).
It is clear that the ability to calculate the length of telomeres accurately and in a timely manner will be an important tool for the early diagnosis of cancer and for age-related illness, and also has valuable application in the development of stem cell technologies. To identify precancerous cells, the approach must be able to compute telomere lengths from the DNA of a single cell, and preferably be able to make the computation for each individual chromosome. To be practical, the computational approach requires high-throughput capability for the analysis of large numbers of samples.
Currently, none of the methods available are capable of directly measuring the length of telomeres in single cells without PCR amplification, let alone individual chromosomes (Wang et al., 2013). In addition, the errors in the existing calculations are so large as to limit their usefulness for practical diagnostic applications. Methods currently employed include terminal restriction fragment (TRF) Southern blots, fluorescent in situ hybridization methods known as Q-FISH and F-FISH, as well as PCR and quantitative real-time PCR assays.
1. TRF Computations
Telomere length is most commonly calculated by TRF analysis, which provides the average length of fragments generated by complete digestion of genomic DNA with a restriction enzyme that does not cleave nucleic acids composed entirely of tandem arrays of the specific telomeric repeat sequence of interest (Kimura et al., 2010). This approach is only capable of calculating the mean telomere length of all chromosomes and requires large numbers (>10⁵) of cells. In addition, TRF analysis can be confounded by the presence of interstitial telomeric sequences. The answer is calculated by separating the digested DNA fragments by electrophoresis, followed by a Southern blot in which the DNA is hybridized to a radio-labelled telomeric probe. The telomeric DNA is then visualized by autoradiography, and the answer is calculated from densitometric scans that estimate the amount of DNA in each band.
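The text above does not specify how the densitometric scan is reduced to a single number. One weighting that is commonly used in the TRF literature is mean TRF = Σ(OD_i) / Σ(OD_i / L_i), where OD_i is the signal at gel position i and L_i is the fragment length at that position. The sketch below assumes that formula; the function name and the example values are hypothetical and are not measured data.

```python
def mean_trf_length(od, length_kb):
    """Mean TRF length in kb, computed as sum(OD_i) / sum(OD_i / L_i).

    Dividing each signal by its fragment length corrects for longer fragments
    binding more probe, so the result is weighted by the number of molecules.
    """
    if len(od) != len(length_kb):
        raise ValueError("od and length_kb must have the same length")
    total_signal = sum(od)
    molecule_weighted = sum(o / l for o, l in zip(od, length_kb))
    return total_signal / molecule_weighted

# Hypothetical scan with three positions (signal, fragment length in kb).
od = [1.0, 2.0, 1.0]
length_kb = [4.0, 8.0, 16.0]
print(round(mean_trf_length(od, length_kb), 2))  # 7.11
```

Because the result is a single population average, it illustrates why this approach cannot resolve the telomere length of an individual chromosome or cell.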
It is noteworthy that the use of densitometric scans of Southern blots in TRF has similarities to the DNA computing answer determination approach that we initially used to solve the asymmetric fully-connected 15-city traveling salesman problem (Xiong et al., 2009). Consequently, we are very familiar with the limitations in accuracy inherent in this time-consuming approach. Difficulties inherent in the electrophoretic migration of short DNA fragments also limit the ability of TRF to compute the length of short telomeres that are crucial for aging studies.
2. FISH Methods
The FISH techniques to calculate telomere length can be performed with <30 cells and enable the length of individual chromosome arms to be determined. In these approaches, fluorescent peptide nucleic acid (PNA) probes are hybridized to the DNA in a group of cells (Lansdorp et al., 1996; Martens et al., 2000; Perner et al., 2003). Fluorescence intensity, which is proportional to telomere length, is then measured using flow cytometry that examines one cell at a time. This time-intensive measurement severely limits the number of samples that can be examined. The Q-FISH approach requires the use of metaphase cells, which forces the use of cultured cells and severely limits the number of cells available for the calculation (Ferlicot et al., 2003). This requirement also eliminates the ability of the method to determine telomere lengths of many of the most valuable cell types for diagnostic purposes, such as post-mitotic, differentiated, and senescent cells.
The answer read-out with FISH is in arbitrary integrated fluorescence intensity units that are difficult to quantitate. Thus, to compute absolute values of telomere length, external calibration is required, using either plasmids with cloned telomere repeats of defined length or cell lines that maintain a defined and known telomere length distribution. The fact that the calculation is based on hybridization imposes a minimum telomere length threshold below which the length cannot be calculated. In some cell lines, the standard deviation of the fluorescence intensity is higher than the entire range of telomere lengths (~8 kb) (O'Sullivan et al., 2002).
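The external calibration described above can be sketched as a least-squares line fitted through standards of known length, which is then used to convert arbitrary intensity units into kilobases. A linear response is assumed because fluorescence intensity is described as proportional to telomere length; the function names and the calibration values are hypothetical and are not taken from any of the cited studies.

```python
def fit_linear_calibration(intensities, lengths_kb):
    """Ordinary least-squares fit of length_kb = slope * intensity + intercept."""
    n = len(intensities)
    mean_x = sum(intensities) / n
    mean_y = sum(lengths_kb) / n
    sxx = sum((x - mean_x) ** 2 for x in intensities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(intensities, lengths_kb))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def intensity_to_kb(intensity, slope, intercept):
    """Convert a measured intensity to an estimated length in kb."""
    return slope * intensity + intercept

# Hypothetical standards: intensity in arbitrary units, known length in kb.
slope, intercept = fit_linear_calibration([100.0, 200.0, 400.0], [2.0, 4.0, 8.0])
print(round(intensity_to_kb(300.0, slope, intercept), 2))  # 6.0
```

Any scatter in the measured intensity propagates directly through this line, which is why a standard deviation comparable to the full length range makes the converted values unreliable.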
The IQ-FISH method is an adaptation of Q-FISH that measures fluorescence intensity of probes hybridized to telomeres in individual interphase cells using fluorescence-activated cell sorting (FACS) technology (Narath et al., 2005). Following hybridization with fluorescent PNA probes specific to telomeres, the DNA is counterstained to normalize DNA content. The IQ-FISH approach requires accurate measurements of relatively weak fluorescence signals. Marked day-to-day variations in instrument calibration and in hybridization efficiencies, the latter due to the fixatives that are required for cell preparation, limit the accuracy and reproducibility of telomere length computations to a range that is greater than the length differences of 2-10 kb typically found in human cells.
3. PCR Approaches
The polymerase chain reaction amplifies the number of copies of DNA strands along a chosen section of the parent strands, defined by the two unique DNA primers bound at each end. Unfortunately, the repeating nature of the short telomeric DNA sequence (TTAGGG)n enables PCR primers to hybridize in myriad combinations staggered along the length of the telomere. As a result, heterogeneous amplification reactions occur simultaneously, which makes the computation of telomere length extremely difficult.
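The staggered-hybridization problem can be made concrete by counting the positions at which a repeat-based primer can align along a repeat array. The sketch below is a deliberate simplification: it uses exact same-strand string matching in place of real complementary base pairing, and the 10-repeat array, the 18-nt primer, and the function name are hypothetical.

```python
def primer_binding_sites(template, primer):
    """Return every start index where `primer` matches `template` exactly.

    Overlapping matches are included, because a primer can anneal at any
    register of the repeat array.
    """
    sites = []
    start = template.find(primer)
    while start != -1:
        sites.append(start)
        start = template.find(primer, start + 1)
    return sites

telomere = "TTAGGG" * 10   # hypothetical 60-nt repeat array
primer = "TTAGGG" * 3      # hypothetical 18-nt repeat-based primer
sites = primer_binding_sites(telomere, primer)
print(len(sites), sites)   # 8 [0, 6, 12, 18, 24, 30, 36, 42]
```

Even on this short array the primer has eight equally valid positions, and the count grows linearly with telomere length, so products of many different sizes are generated from a single telomere.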
A PCR-based approach known as STELA (single telomere length analysis) has been developed (Baird et al., 2003) and has higher resolution than other currently available approaches. However, since the length of DNA amplified by PCR is limited to ~25 kb, longer telomeres cannot be amplified and the method is biased in favor of shorter telomeres. STELA also requires a known sub-telomeric primer binding site, which appears to be species-specific and difficult to obtain. This approach involves the ligation of an oligonucleotide to the 5′ end of the telomere, which may end in any of the six nucleotides within the telomeric repeat sequence. To facilitate ligation, six telomerettes must be made and used, each carrying one of the six possible frames of a telomeric repeat at the 3′ end.
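The need for six telomerettes follows from the six registers in which a 6-nt repeat can terminate. The sketch below enumerates those registers as cyclic rotations of the repeat unit. It illustrates only the counting argument: strand orientation and the additional non-telomeric sequence carried by real telomerette oligonucleotides are ignored, and the function name is hypothetical.

```python
def repeat_frames(unit="TTAGGG"):
    """Return every cyclic rotation (frame) of a repeat unit."""
    return [unit[i:] + unit[:i] for i in range(len(unit))]

frames = repeat_frames()
print(frames)
# ['TTAGGG', 'TAGGGT', 'AGGGTT', 'GGGTTA', 'GGTTAG', 'GTTAGG']
print(len(set(frames)))  # 6
```

All six rotations are distinct, so no single oligonucleotide can match every possible telomere terminus.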