Analysis of biological or medical samples often requires the determination of nucleic acid sequences of large and complex populations of DNA and/or RNA, e.g. Gloor et al, PLoS ONE 5(10): e15406 (2010); Petrosino et al, Clinical Chemistry, 55(5): 856-866 (2009); Arstila et al, Science, 286: 958-961 (1999). In particular, profiles of nucleic acids encoding immune molecules, such as T cell or B cell receptors, or their components, contain a wealth of information on the state of health or disease of an organism, so that the use of such profiles as diagnostic or prognostic indicators has been proposed for a wide variety of conditions, e.g. Faham and Willis, U.S. patent publication 2010/0151471; Freeman et al, Genome Research, 19: 1817-1824 (2009); Boyd et al, Sci. Transl. Med., 1(12): 12ra23 (2009); He et al, Oncotarget (Mar. 8, 2011). Such sequence-based profiles are capable of much greater sensitivity than approaches based on size distributions of amplified target nucleic acids, sequence sampling by microarrays, hybridization kinetics curves from PCR amplicons, or other approaches, e.g. Morley et al, U.S. Pat. No. 5,418,134; van Dongen et al, Leukemia, 17: 2257-2317 (2003); Ogle et al, Nucleic Acids Research, 31: e139 (2003); Wang et al, BMC Genomics, 8: 329 (2007); Baum et al, Nature Methods, 3(11): 895-901 (2006). However, the efficient determination of clonotypes and clonotype profiles from sequence data poses challenges because of the size of the populations to be analyzed, the limited predictability of natural variability in the sequences extracted from samples, and noise introduced into the data by a host of sample preparation and measurement steps, e.g. Warren et al, Genome Research, 21(5): 790-797 (2011).
In view of the potential importance of clonotype profiles for diagnostic and prognostic applications, it would be advantageous to many fields in medicine and biology if methods were available for overcoming the drawbacks of current methodologies for determining clonotypes and clonotype profiles from sequence data.