It is difficult to determine, at the level of single cells in a heterogeneous cell population, the degree of variation in single-nucleotide polymorphisms (SNPs), variable sequence regions and splice variants. If bulk nucleic acids are isolated from cell populations, the information about which nucleic acid variants were present in which combination in each individual cell is lost.
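The loss of pairing information on bulk isolation can be illustrated with a minimal sketch. The two toy populations and the variant labels (A1, A2, B1, B2) are hypothetical and serve only as an illustration; they are not data from any of the cited work.

```python
from collections import Counter

# Hypothetical toy data: each cell carries one variant at locus A and one at locus B.
population_1 = [("A1", "B1"), ("A1", "B1"), ("A2", "B2"), ("A2", "B2")]
population_2 = [("A1", "B2"), ("A1", "B2"), ("A2", "B1"), ("A2", "B1")]


def bulk_profile(cells):
    """Variant counts as seen after pooling the nucleic acids of all cells."""
    return Counter(variant for cell in cells for variant in cell)


def paired_profile(cells):
    """Counts of variant combinations as seen when each cell is analyzed separately."""
    return Counter(cells)


# The pooled profiles are identical, although the per-cell combinations differ.
same_bulk = bulk_profile(population_1) == bulk_profile(population_2)
same_pairs = paired_profile(population_1) == paired_profile(population_2)
print(same_bulk, same_pairs)
```

In this sketch both populations yield the same pooled variant counts, so only an analysis that preserves the per-cell combination can distinguish them.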
This information can be important in cases where different nucleic acid variants act together in cells to determine the specific biological behavior of the cells. For example, two mutations in two different signaling pathway molecules can result in malignancy of individual cancer cells, whereas other cells in the same tumor population that carry only one of these mutations are non-malignant. Other examples include the variable T cell receptor alpha and beta chain genes and transcripts, which act together in each T cell to produce T cell receptors that vary from one T cell to another, and the immunoglobulin variable heavy chain (VH) and variable light chain (VL) genes and transcripts present in B cells, which act together in each B cell to produce immunoglobulins that vary from one B cell to another.
Conventionally, combinations of nucleic acid variations can be analyzed after isolating single cells through simple titration, through cell picking or through fluorescence-activated cell sorting (FACS). Subsequently, the combinations of sequence variants can be analyzed after amplification of the nucleic acids of interest from each individual cell by polymerase chain reaction (PCR) or reverse transcription-PCR (RT-PCR). Analysis methods include nucleic acid sequencing, hybridization on microarrays or quantitative real-time PCR (qPCR). In order to facilitate the analysis of the pairing of nucleic acid variations, different gene sequences or reverse-transcribed RNA sequences can be coupled by overlap PCR, which has been reported to be compatible with water-in-oil emulsions. A method for coupling variable regions of immunoglobulin genes by multiplex overlap-extension RT-PCR from isolated single cells has been described before (U.S. Pat. No. 7,749,697B2). Similar methods have been used by others to clone functional antibody variable domains in the form of single-chain variable fragments (scFv) from natural repertoires such as hybridoma cells and spleen cells.
Another method for coupling of two or more copies of non-contiguous DNA sequences from single cells of a heterogeneous population has been described before (Patent EP597960B1). Gene elements can be coupled inside single cells (in situ “in-cell PCR”) within intact or substantially intact cells after cell fixing e.g. with formaldehyde and subsequent cell permeabilization to ensure access of PCR reagents to the gene elements (Embleton M J, Gorochov G, Jones P T, Winter G. In-cell PCR from mRNA: amplifying and linking the rearranged immunoglobulin heavy and light chain V-genes within single cells. Nucleic Acids Res. 1992 Aug. 11; 20(15):3831-7). Alternatively, cells and PCR reagents can be introduced into aqueous droplets and dispersed in an organic phase as an emulsion, wherein each droplet contains preferentially only one cell, such that gene elements from single cells are coupled together.
The coupling of nucleic acids is a known technique and has been used for the analysis of antibodies and their VH and VL genes. EP1516929 and WO2008/104184 describe methods for the analysis of specific antibodies, wherein nucleic acids from single cells were coupled. The coupled nucleic acids were cloned into expression vectors, and specific antibodies were identified using ELISA assays and then optionally sequenced.
The methods allow the identification of specific antibodies against a particular antigen, possibly in a high-throughput manner. However, they are not suitable for determining the complete and complex immune status of an organism. Furthermore, for the analysis of specific antibodies, the methods involve FACS sorting, which requires expensive equipment and is time-consuming.
The methods, however, are tedious.
The technical problem underlying the present invention is the provision of an enhanced method that facilitates the analysis of nucleic acid molecules in cases where these molecules act together in a cell.
The technical problem is solved by the embodiments provided herein and as described by the claims, specifically by a method for linking at least two target nucleic acid molecules from a single biological compartment, comprising the steps of: isolating a fraction from a sample, wherein the fraction comprises the compartment comprising at least two nucleic acid molecules; diluting said fraction and aliquoting the dilution in multiple separate reaction vessels such that each reaction vessel comprises preferably one compartment, or encapsulating said compartment in emulsion droplets such that each droplet comprises preferably one compartment; and linking said at least two target nucleic acid molecules.
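How strongly a fraction must be diluted so that each reaction vessel or droplet comprises preferably one compartment can be estimated with a minimal sketch. It assumes that compartments distribute randomly and independently over vessels, so that the number per vessel follows a Poisson distribution; this statistical model and the function names are assumptions made for illustration and are not taken from the method described above.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k compartments in a vessel under Poisson loading."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def loading_summary(mean_per_vessel):
    """Occupancy fractions for a given mean number of compartments per vessel.

    Assumes random, independent (Poisson) loading; this is an illustrative model.
    """
    if mean_per_vessel <= 0:
        raise ValueError("mean_per_vessel must be positive")
    p_empty = poisson_pmf(0, mean_per_vessel)
    p_single = poisson_pmf(1, mean_per_vessel)
    p_multiple = 1.0 - p_empty - p_single
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": p_multiple,
        # Fraction of occupied vessels that hold exactly one compartment.
        "single_among_occupied": p_single / (1.0 - p_empty),
    }


for mean in (1.0, 0.3, 0.1):
    summary = loading_summary(mean)
    print(mean, {name: round(value, 4) for name, value in summary.items()})
```

Under this assumption, lowering the mean number of compartments per vessel raises the share of occupied vessels that hold exactly one compartment, at the cost of more empty vessels.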