Large-scale DNA sequencing in diagnostic and prognostic applications has expanded rapidly as its speed has increased and its per-base cost has decreased, e.g. Ding et al, Nature, 483(7382): 506-510 (2012); Chiu et al, Brit. Med. J., 342: c7401 (2011); Ku et al, Annals of Neurology, 71(1): 5-14 (2012); and the like. In particular, profiles of nucleic acids encoding immune molecules, such as T cell or B cell receptors, or their components, contain a wealth of information on the state of health or disease of an organism, so that the use of such profiles as diagnostic or prognostic indicators has been proposed for a wide variety of conditions, e.g. Faham and Willis, U.S. patent publication 2010/0151471; Freeman et al, Genome Research, 19: 1817-1824 (2009); Boyd et al, Sci. Transl. Med., 1(12): 12ra23 (2009); He et al, Oncotarget. (Mar. 8, 2011).
Patients treated for many cancers often retain a minimal residual disease (MRD) related to the cancer. That is, even though a patient may have by clinical measures a complete remission of the disease in response to treatment, a small fraction of the cancer cells may remain that have, for one reason or another, escaped destruction. The type and size of this residual population is an important prognostic factor for the patient's continued treatment, e.g. Campana, Hematol. Oncol. Clin. North Am., 23(5): 1083-1098 (2009); Buccisano et al, Blood, 119(2): 332-341 (2012). Consequently, several techniques for assessing this population have been developed, including techniques based on flow cytometry, in situ hybridization, cytogenetics, amplification of nucleic acid markers, and the like, e.g. Buccisano et al, Current Opinion in Oncology, 21: 582-588 (2009); van Dongen et al, Leukemia, 17(12): 2257-2317 (2003); and the like. The amplification of nucleic acids encoding segments of recombined immune receptors (i.e. clonotypes) has been particularly useful in assessing MRD in leukemias and lymphomas, since such clonotypes typically have unique sequences which may serve as molecular tags for their associated cancer cells. However, not all clonotypes encode highly diverse receptor segments, such as V(D)J segments. Clonotypes not infrequently encode receptor segments of lower diversity, such as DJ segments, or segments that form recurrent motifs, which are preferentially represented for possible developmental or functional reasons and are thereby relatively common among different individuals, e.g. Gauss et al, Mol. Cell. Biol., 16(1): 258-269 (1996); Murugan et al, Proc. Natl. Acad. Sci., 109(40): 16161-16166 (2012); Venturi et al, J. Immunol., 186: 4285-4294 (2011); Robins, J. Immunol., 189(6): 3221-3230 (2012). In either circumstance, such clonotypes may be suboptimal, or even fail, as markers for assessing MRD.
In view of the foregoing, it would be highly advantageous if a method were available for assessing clonotypes for rarity or uniqueness in order to determine whether they are likely to provide an accurate measure of minimal residual disease, with a low likelihood of false positive outcomes.