1. Field of the Invention
The present invention relates generally to data processing systems. More specifically, the present invention is directed to a computer-implemented method, system, and computer-usable program code for calculating the probability of occurrence of a structured cluster.
2. Description of the Related Art
Consider a scenario in which common gene clusters are extracted from two closely related species, such as humans and rats. Each gene cluster may itself contain further clusters, which may in turn contain clusters of their own, and so on. One traditional way of computing the probability of occurrence of a cluster is to ignore the sub-clusters S and simply use the probability of occurrence of each gene in the cluster. The effectiveness of this model is unclear when the number of genes is very large and the number of occurrences of each gene is very small.
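By way of illustration only, the traditional approach described above might be sketched as follows. The sketch assumes that each gene's probability is estimated as its relative frequency in a genome sequence and that genes occur independently, so that the cluster probability is the product of the per-gene probabilities; neither assumption is stated in the description above. The function names and the toy genome are hypothetical.

```python
from collections import Counter
from math import prod


def gene_probabilities(genome):
    """Estimate each gene's probability as its relative frequency (an assumption)."""
    counts = Counter(genome)
    total = len(genome)
    return {gene: n / total for gene, n in counts.items()}


def flat_cluster_probability(cluster, probs):
    """Multiply per-gene probabilities, ignoring any sub-cluster structure.

    Treats the genes as independent, which is an assumption of this sketch.
    """
    return prod(probs[gene] for gene in cluster)


# Hypothetical toy genome: each letter stands for one gene occurrence.
genome = ["a", "b", "c", "a", "d", "b", "a", "e"]
probs = gene_probabilities(genome)
p = flat_cluster_probability(["a", "b"], probs)
print(p)
```

With many distinct genes and few occurrences of each, every relative frequency in such a sketch rests on only a handful of observations, which is one way to see why the effectiveness of the flat model is unclear in that regime.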