The present invention is particularly applicable to a method for rapidly sequencing linear and/or previously linearized and ordered biological sequences and it will be discussed with particular reference thereto. However, the invention has broader applications which will become apparent upon the reading of the specification in conjunction with the drawings.
The current methods of determining the sequence of both proteins and nucleic acids are arduous to carry out.
For the determination of proteins, the Edman method may be mentioned in particular.
For determination of the sequence of a nucleic acid, there are broadly two major types of method at the present time: chemical methods and enzymatic methods.
The chemical method, represented essentially by the Maxam and Gilbert method, is based on obtaining fragments whose size enables the position of one of the four bases to be defined. Before any determination, the DNA is purified in single-stranded form and then labelled at one of its ends with ³²P. It is then separated into four subsets, which each undergo a different chemical treatment which alters one type of base in an absolutely specific manner (modification of the base, removal of the modified base and cutting of the strand at the sugar residue) and brings about cleavage of the said sequence into two fragments. The chemical parameters are chosen in such a way that each fragment has on average a single base modified and then removed. After cleavage, each subset hence contains a population of labelled fragments of variable size, all having the same end and terminating in the same type of base in the sequence.
After electrophoretic separation and visualization by autoradiography, the length of the segment observed is equal to the distance between the point of cutting and the point of labelling. Since the resolution of the polyacrylamide gels used is one base, the order of the fragments on the autoradiogram corresponds to the order of the bases in the DNA fragment.
Reading is direct, and a sequence of 100 to 400 bases may be read on a single gel.
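By way of illustration only, the reading principle of the chemical method can be sketched as follows. The sketch is idealized: it assumes four perfectly base-specific cleavage reactions, a label at the first position of the strand, and a fragment length equal to the position of the cleaved base. The function names `cleavage_lanes` and `read_gel` are hypothetical and serve only this example.

```python
def cleavage_lanes(seq):
    """Return, for each base, the lengths of the end-labelled fragments
    produced when the strand is cut at every occurrence of that base.

    Simplifying assumption: the fragment length is taken as the 1-based
    position of the cleaved base, counted from the labelled end.
    """
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(seq, start=1):
        lanes[base].append(position)
    return lanes


def read_gel(lanes):
    """Read the four lanes from the shortest fragment to the longest.

    With one-base resolution, each length occurs in exactly one lane,
    so ordering the bands by length gives back the order of the bases.
    """
    band_to_base = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(band_to_base[length] for length in sorted(band_to_base))


lanes = cleavage_lanes("GATTACA")
print(lanes["A"])       # fragment lengths seen in the lane specific for A
print(read_gel(lanes))  # sequence recovered from the four lanes
```

In this idealized model each band length appears in one lane only, which is why the sequence can be read directly from the gel.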
This method is best suited to fragments of the order of 500 bp, present in fairly large amounts and for which the secondary structures are significant.
The enzymatic method, essentially represented by the Sanger method, consists in synthesizing with a suitable enzyme (polymerase or reverse transcriptase) the strand complementary to the strand which it is desired to sequence and which must be integrated in a single- or double-stranded vector capable of replicating.
The synthesis of the second strand is accomplished in the presence of the 4 deoxyribonucleotides (dATP, dCTP, dTTP, dGTP), at least one of which must be radioactively labelled, and of a primer which will hybridize with a region immediately upstream of the fragment to be sequenced.
The principle of this technique consists in adding a different specific dideoxynucleotide (ddATP, ddCTP, ddTTP or ddGTP) to each of four reaction tubes. These ddNTPs may be incorporated in the chain undergoing elongation but, since they do not possess a hydroxyl group at the 3' position, their incorporation stops the elongation of the strand. Each reaction tube hence contains, at the end of the reaction, a population of fragments of variable size, all having the same 5' end and terminating in the same dideoxynucleotide (ddNTP). The size of the fragments synthesized, and hence the size of the fragments detectable by autoradiography after electrophoresis, is equal to the distance between the beginning of the primer and the base at which replication has stopped.
It is possible by this technique to determine sequences of 300 to 800 bases, depending on the lengths and the quality of the polyacrylamide gels used for the separation.
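The chain-termination principle described above may be illustrated by the following minimal sketch. It assumes an idealized reaction in which every possible termination product is present in each tube, and it counts fragment length from the beginning of the primer. The names `synthesized_strand`, `sanger_tubes` and `read_ladder`, as well as the primer length used, are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesized_strand(template):
    """Strand complementary to the template, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(template))


def sanger_tubes(new_strand, primer_len):
    """For each ddNTP tube, list the lengths of the terminated fragments.

    A fragment ends wherever the tube's dideoxynucleotide was incorporated;
    its length is the primer length plus the number of bases added.
    """
    tubes = {base: [] for base in "ACGT"}
    for added, base in enumerate(new_strand, start=1):
        tubes[base].append(primer_len + added)
    return tubes


def read_ladder(tubes):
    """Order the bands of the four tubes by length to read the new strand."""
    band_to_base = {
        length: base for base, lengths in tubes.items() for length in lengths
    }
    return "".join(band_to_base[length] for length in sorted(band_to_base))


template = "ATGC"  # strand to be sequenced, 5' to 3'
tubes = sanger_tubes(synthesized_strand(template), primer_len=20)
print(read_ladder(tubes))  # synthesized strand, complementary to the template
```

The ladder yields the synthesized strand; the sequence of the template is obtained by taking its reverse complement.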
Other methods have been proposed to enable the steps of subcloning to be reduced (sequencing of the double-stranded fragment, use of PCR, shot-gun sequencing, for example).
The shot-gun method, in particular, consists in cloning fragments of the gene to be sequenced at random in M13 without having any information about their organization in the genome.
The fragments are then sequenced and their organization is worked out by cross-checking using a computer; however, this method necessitates the direct sequencing of at least 20% of the sequences which it has not been possible to position by cross-checking.
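The cross-checking step may be pictured with the following sketch, which is not the procedure of any particular assembly program. It assumes error-free reads taken from the same strand and uses a simple greedy merge on the longest suffix-to-prefix overlap; the names `overlap` and `greedy_assemble` and the minimum overlap of three bases are hypothetical choices made for the example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap.

    Fragments that overlap nothing are returned unmerged: they are the
    sequences that cannot be positioned by cross-checking alone.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [
            c for k, c in enumerate(contigs) if k not in (best_i, best_j)
        ] + [merged]


reads = ["ATGCGT", "CGTTAC", "TACGGA", "CCCCCC"]
print(greedy_assemble(reads))
```

In this example the first three reads merge into a single contig while the fourth remains isolated, which mirrors the situation in which a fragment has to be positioned by other means.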
In view of the difficulty of obtaining the missing sequences, more efficacious adaptations have been developed; brief mention may be made of the method of sequencing by cloning and enzymatic deletions, and the method of sequencing by transposon-induced deletions.
However, all these methods are arduous to carry out.
International Application PCT WO 89/03432 describes, for its part, a method of rapid sequencing of DNA and RNA, in which a single-stranded DNA or RNA fragment, bound where appropriate to a solid support, is placed in a flowing liquid and is cleaved with an exonuclease, starting from one of its ends, so as to form a succession of bases in the flowing sample. The bases are then detected during their sequential passage through a detector, to reconstruct the base sequence of the said DNA or RNA fragment to be determined. In a particular embodiment, the strand complementary to the sequence to be determined is synthesized in the presence of modified nucleotides each possessing a specific characteristic (different fluorescence). The synthesized fragment is then placed on a solid support in the flowing liquid sample and the different nucleotides which are identifiable are cleaved sequentially and detected.
However, in such a method, not all of the bases can, in fact, be labelled, in particular for reasons of steric hindrance.
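The flow-based detection scheme and the limitation just mentioned can be pictured with the following sketch. It is a toy model only: it assumes that the exonuclease releases one base at a time starting from the 3' end, and it represents any base that could not be labelled as an unreadable position "N". The function names and the labelling rule are hypothetical.

```python
def exonuclease_release(strand):
    """Yield the bases of the strand one at a time.

    Assumption: cleavage starts from the 3' end, so the last base of the
    strand (written 5' to 3') is the first to reach the detector.
    """
    for base in reversed(strand):
        yield base


def detect(released, is_labelled):
    """Yield what the detector records for each released base.

    is_labelled(k) says whether the k-th released base carries a label;
    an unlabelled base is recorded as "N" because it cannot be identified.
    """
    for k, base in enumerate(released):
        yield base if is_labelled(k) else "N"


def reconstruct(readings):
    """Reverse the detection order to recover the strand 5' to 3'."""
    return "".join(readings)[::-1]


strand = "GATTACA"

# Ideal case: every base is labelled and therefore identified.
print(reconstruct(detect(exonuclease_release(strand), lambda k: True)))

# Hypothetical case: every third released base could not be labelled.
print(reconstruct(detect(exonuclease_release(strand), lambda k: k % 3 != 0)))
```

The second call shows how incomplete labelling leaves undetermined positions in the reconstructed sequence.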