The present invention concerns a method for categorising nucleic acid. In particular, the invention concerns a method for sorting nucleic acid, which method permits reduction in the complexity of a nucleic acid population by approximately one order of magnitude, or more. The invention also relates to a kit for carrying out the above method.
Analysis of nucleic acids is fundamental to much of modern molecular biology. A particular feature of nucleic acids derived from living organisms is that they are almost invariably complex populations of sequences present in widely varying quantities. In order to characterise these populations of nucleic acids it is usual to attempt to reduce the complexity of the population of nucleic acids in some way. Traditionally the approach has been to clone complex nucleic acid molecules into vectors to allow them to be isolated and either sub-cloned further or analysed directly. Cloning requires the use of biological hosts and these are often difficult to use and require a great deal of specialist knowledge for the cloning procedures to be successful. The traditional processes of cloning to generate libraries of sequences are also only partially automatable.
A problem which cloning does not address is how to isolate sequences which are present only at low copies in backgrounds of sequences present at high copy numbers. Various techniques have been developed to ‘normalise’ complex nucleic acid populations prior to cloning in order to increase the quantities of sequences at low copy numbers relative to those at high copy numbers. Subtractive hybridisation methods have been used to try and normalise cDNA populations.
PCT/GB93/01452 describes methods of molecular sorting which use restriction endonucleases that generate ambiguous sticky-ends in the nucleic acid sample to be sorted. Adapters are designed with sticky ends complementary to a single sticky-end sequence or a subset of these ambiguous sticky ends such that the individual sticky end or subset thereof is coupled to a distinct sequence in the double stranded region of the adapter. This allows subsets of the adaptored nucleic acid to be amplified using specific primers corresponding to sequences within the adapter which in turn relate to the sequence of the sticky end of the adapter. U.S. Pat. No. 5,508,169 (issued Nov. 7, 1995) describes methods very similar to those disclosed in PCT/GB93/01452.
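The sticky-end sorting principle described above can be illustrated with a minimal sketch. It is not taken from PCT/GB93/01452 or U.S. Pat. No. 5,508,169; it assumes that each fragment is represented as a hypothetical (overhang, duplex) pair, that ligation requires the adapter overhang to be the exact reverse complement of the fragment overhang, and that each adapter carries a primer tag (the names `P1` and `P2` are invented for illustration).

```python
from collections import defaultdict

# Translation table for Watson-Crick complements (DNA only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def sort_by_sticky_end(fragments, adapters):
    """Group fragments by the primer tag of the adapter they would ligate to.

    fragments: list of (overhang, duplex) tuples (hypothetical representation).
    adapters:  dict mapping adapter overhang -> primer tag (hypothetical names).
    Assumption: an adapter ligates only when its overhang is the exact
    reverse complement of the fragment overhang.
    """
    bins = defaultdict(list)
    for overhang, duplex in fragments:
        tag = adapters.get(reverse_complement(overhang))
        if tag is not None:  # fragments with no matching adapter are not sorted
            bins[tag].append(duplex)
    return dict(bins)


if __name__ == "__main__":
    fragments = [("ACGT", "f1"), ("AAAA", "f2"), ("ACGT", "f3")]
    adapters = {"ACGT": "P1", "TTTT": "P2"}
    print(sort_by_sticky_end(fragments, adapters))
```

In this model each primer tag selects one subset for amplification, and the number of distinguishable subsets is bounded by the number of possible overhang sequences.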
A problem with the above methods is that the nucleic acids can be sorted only according to the sequence present on the sticky-ends of the nucleic acid. The sticky-end sequence is of limited length, as determined by the choice of restriction enzyme, and so the basis for sorting is limited.
It is an object of the present invention to provide a method which overcomes the above problems, and provides a wider basis on which sorting of nucleic acid populations can be carried out, not limited by the sticky-end sequence. It is also an object of this invention to provide methods to reduce the complexity of nucleic acid populations by allowing them to be sorted into sub-populations without cloning and to permit normalisation of these populations. This invention describes methods of sorting nucleic acid molecules that have a variety of applications including gene expression profiling, preparation of templates for sequencing, linkage analysis, etc. This invention provides methods of generating sorted libraries. In many applications it is preferable that these sorted nucleic acids be captured on a solid phase support.
Accordingly, the present invention provides a method for categorising nucleic acid, which method comprises producing a nucleic acid population by action of an endonuclease on double-stranded nucleic acid, such that each nucleic acid in the nucleic acid population has a double-stranded portion, contacting the nucleic acid population with one or more oligonucleotide sequences, and isolating nucleic acid which correctly hybridises to an oligonucleotide sequence, wherein each oligonucleotide sequence has a pre-determined recognition sequence, the nucleic acid being categorised by its ability to correctly hybridise to oligonucleotide sequences having the recognition sequence, the recognition sequence being situated such that it recognises a sequence in the double-stranded portion of the nucleic acid, one or more different recognition sequences being represented in the oligonucleotide sequences.
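The categorising step can be sketched in code under simplifying assumptions that are not stated in the text above: each fragment is represented by the strand read in the same sense as the oligonucleotide, the recognition sequence is taken to probe the first bases of the double-stranded portion, and "correct hybridisation" is modelled as an exact string match. The function names are hypothetical.

```python
from itertools import product


def recognition_sequences(n):
    """Enumerate every possible recognition sequence of length n."""
    return ["".join(p) for p in product("ACGT", repeat=n)]


def categorise(fragments, recognition_seqs):
    """Assign each fragment to the recognition sequence it matches.

    Assumption: a fragment 'correctly hybridises' when the first bases of
    its double-stranded portion equal the recognition sequence exactly.
    Returns (categories, uncategorised).
    """
    categories = {r: [] for r in recognition_seqs}
    uncategorised = []
    for frag in fragments:
        for r in recognition_seqs:
            if frag.startswith(r):
                categories[r].append(frag)
                break
        else:
            # No recognition sequence matched this fragment.
            uncategorised.append(frag)
    return categories, uncategorised


if __name__ == "__main__":
    probes = recognition_sequences(2)
    cats, rest = categorise(["ACGGT", "ACTTT", "GGAAA"], probes)
    print(len(probes), cats["AC"], cats["GG"], rest)
```

As illustrative arithmetic only, a two-base recognition sequence gives 4^2 = 16 possible categories, so an evenly distributed population would be divided roughly sixteen-fold in this model.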
The present invention also provides a kit for categorising a nucleic acid, comprising one or more adaptors and one or more sets of oligonucleotide sequences, wherein the adaptors comprise nucleic acid having a double-stranded primer portion of a known sequence and a single-stranded portion of a pre-determined length, either each single-stranded portion of each nucleic acid in the adaptors having the same pre-determined sequence or all possible sequences of the single-stranded portion being represented in the adaptors, and wherein each oligonucleotide sequence comprises a first sequence, a second sequence attached to the first sequence and a third sequence attached to the second sequence, in which the first sequence is complementary to the sequence of the primer portion of the adaptor, the second sequence is the same sequence as the single-stranded portion of the adaptors or all possible second sequences of the same length as the single-stranded portion of the adaptors are represented within the set of oligonucleotides, and the third sequence comprises a pre-determined recognition sequence.
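The three-part structure of the oligonucleotide sequences in the kit can be sketched as follows. This is an illustration only: the function name, the example sequences and the choice to write the first, second and third sequences as a single concatenated string in that order are assumptions, and the orientation of the strands is not modelled.

```python
from itertools import product


def build_oligo_set(first, recognition, second=None, second_length=None):
    """Build a set of oligonucleotides as first + second + third sequence.

    first:         sequence complementary to the adaptor primer portion.
    recognition:   the pre-determined recognition (third) sequence.
    second:        a single known sequence matching the adaptor's
                   single-stranded portion, or None to represent all
                   possible sequences of length second_length.
    """
    if second is not None:
        seconds = [second]
    elif second_length is not None:
        # Every possible second sequence of the given length is represented.
        seconds = ["".join(p) for p in product("ACGT", repeat=second_length)]
    else:
        raise ValueError("give either second or second_length")
    return [first + s + recognition for s in seconds]


if __name__ == "__main__":
    # Hypothetical sequences chosen purely for illustration.
    print(build_oligo_set("GATTACA", "CG", second="AATT"))
    print(len(build_oligo_set("GATTACA", "CG", second_length=2)))
```

The two branches correspond to the two alternatives recited for the second sequence: a single pre-determined sequence, or every sequence of the same length as the single-stranded portion of the adaptors.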