The discovery of cis-elements, or control elements, in non-coding DNA poses a difficult problem in genome analysis. Functional analysis by means of reporter constructs expressed in transgenic organisms is the most reliable method, but it is time-consuming and expensive. Searching non-coding DNA for known control elements by sequence analysis is problematic because protein-binding motifs are short, in the range of 8-10 base pairs (bp), and therefore occur frequently by chance. Heretofore, the most reliable sequence analysis method has been the comparison of homologous sequence domains in related but moderately evolutionarily divergent species, for example mouse and human. In such pair-wise comparisons, control elements are conserved because they serve a vital function, and they can be identified by their similar sequences. Single pair-wise comparisons, however, allow the discovery of conserved sequence strings only at low resolution and without specific identity.
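The observation that short binding motifs occur frequently by chance can be illustrated with a rough calculation. The sketch below is not part of the method described here. It assumes a simplified model in which the four bases are independent and equally likely, which real genomes do not satisfy, and the function name and parameters are hypothetical. Under that model, a fixed motif of length k matches at any given position with probability (1/4)^k.

```python
def expected_chance_hits(motif_length: int, sequence_length: int, strands: int = 1) -> float:
    """Expected number of exact chance matches of one fixed motif.

    Assumption (illustrative only): bases are independent and equiprobable.
    Setting strands=2 simply doubles the count and so over-counts
    motifs that are their own reverse complement.
    """
    # Number of start positions at which a motif of this length fits.
    positions = max(sequence_length - motif_length + 1, 0)
    # Probability that a random window matches the motif exactly.
    match_probability = 0.25 ** motif_length
    return positions * match_probability * strands


if __name__ == "__main__":
    # Hypothetical 1,000,000 bp non-coding region, single strand.
    for k in (8, 10):
        print(k, expected_chance_hits(k, 1_000_000))
```

Under these assumptions, a single 8 bp motif is expected about 15 times per million base pairs on one strand and a 10 bp motif about once, which is why a bare sequence match is weak evidence of function.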
Because functional analysis is a time-consuming and expensive approach to genomic analysis, a more efficient and rapid method of identifying genomic areas of interest would be beneficial.