This application pertains to the construction of pools of biological material such as DNA, RNA, proteins and the like that can be screened by a wide variety of methods such as PCR (Polymerase Chain Reaction), DNA/DNA hybridization, DNA/RNA hybridization, RNA/RNA hybridization, single strand DNA probing, protein/protein hybridization and a wide variety of additional methods. References describing many of these methods include “Ausubel et al., Short Protocols in Molecular Biology, Wiley and Sons, New York” and “Sambrook et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, New York” as well as numerous others, and are hereby incorporated by reference. Also incorporated by reference are U.S. Pat. No. 5,780,222 (Method of PCR Testing of Pooled Blood Samples) and its references cited. Also incorporated are U.S. Pat. Nos. 6,126,074 and 6,477,669 and their references, including the references pertaining to Viterbi, Reed-Solomon and other Error Correction and Data Compression Coding schemes. This pooling method will allow the incorporation of ‘loss-less information compression and error correction’ or other ‘current art’ error correction strategies to improve the robustness of identification while significantly reducing the number of samples to be processed by the end user. By having the samples pooled again after collection, it is possible to drastically reduce the manipulations required by the end user while still keeping very fine detail in the identification of the individual samples or populations that were originally pooled. These error-correction methods are well known in the computer data transmission field, but have not been used in the pooling of biological or chemical samples. The use of these methods will allow a large reduction in the number of experiments required to identify the specific biological sample or population containing a region of interest.
This pooled material can be from individuals or a population. In order to reduce the analysis time, materials and expense, the pooling of small high resolution pools in a matrix allows for a lower number of samples to be analyzed. The resulting high resolution data obtained from screening these matrix pools are equivalent to the data obtained if the researcher had analyzed the complete set of small pools (much more expensive, time consuming and difficult). This method also gives the added advantage of having two positive signals needed for identification. This reduces the problems associated with a false positive when only one signal is obtained for identification (as in the Current Art).
This matrix pooling can be just in one superpool. Alternatively, it can be a matrix of a variety of different superpools and/or across a variety of different types of pools to allow the screening of the complete library with just one round of experiments. To do this, each small pool would be added to between 6 and 20 of the collection of re-pooled intermediate or final pools. Then with the total number of pools of between 40 and 100, the complete library (or any set of biological samples) could be screened with high confidence and the ability to resolve multiple hits. If the library had a large redundancy of signal, the total number of pools could be increased to maintain accurate resolving power of the matrix method. The incorporation of positive controls in a matrix pattern can be used for quality assurance and for assisting in deconvolution if desired.
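The replication scheme described above can be illustrated with a minimal sketch. It assumes 1,000 small pools (a hypothetical figure), 40 matrix pools, and each small pool added to exactly 6 matrix pools; the last two values are the low ends of the ranges given in the text. Each small pool receives a distinct set of 6 matrix pools (its "signature"), and a small pool is reported as a candidate when every pool in its signature is positive. The way signatures are chosen here (evenly spaced picks from the enumeration of all 6-of-40 combinations) is an assumption made for brevity and is not the design of the present disclosure; a practical design would select signatures that overlap as little as possible.

```python
from itertools import combinations, islice
from math import comb

N_MATRIX_POOLS = 40    # total final pools (low end of the 40-100 range in the text)
REPLICATES = 6         # matrix pools per small pool (low end of the 6-20 range)
N_SMALL_POOLS = 1000   # hypothetical number of small pools

# Assumption: take evenly spaced combinations so that signatures are distinct.
step = comb(N_MATRIX_POOLS, REPLICATES) // N_SMALL_POOLS
signatures = list(islice(combinations(range(N_MATRIX_POOLS), REPLICATES),
                         0, step * N_SMALL_POOLS, step))

def screen(hit_small_pools):
    """Simulate one round of screening: a matrix pool is positive
    if it received material from any positive small pool."""
    positive = set()
    for i in hit_small_pools:
        positive.update(signatures[i])
    return positive

def decode(positive_matrix_pools):
    """Return every small pool whose whole signature is positive."""
    return [i for i, sig in enumerate(signatures)
            if set(sig) <= positive_matrix_pools]

single_hit = decode(screen([123]))
double_hit = decode(screen([123, 777]))
print(single_hit)
print(double_hit)
```

With distinct signatures of equal size, a single positive small pool always decodes to exactly itself. With several positives, the true hits are always among the candidates, but extra candidates can appear when signatures overlap, which is why the total number of pools may need to be increased for libraries with high signal redundancy.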
The current state of the art in pooling of biological materials such as Bacterial Artificial Chromosome (BAC) genomic DNA libraries (and other biological or chemical libraries like cDNA libraries, protein libraries, RNA libraries, DNA libraries, cellular metabolic libraries and chemical libraries) for screening consists of collecting all of the indexed microtiter plates containing the BAC library and then forming these plates into a large cube. These indexed plates are generally 96, 384, 864 well or sometimes even 1536 well microtiter plates. This large cube is then transected by a number of different planes (usually 4 to 8), each of which produces a large number of pools. The collection of all of the pools from all of the planes is then screened to identify the clones of interest. This scheme is the current state of the art and can, with some degree of reliability, resolve multiple targets (i.e. BAC clones) to specific coordinates. According to Klein et al., their scheme with 6 planes in a collection of 24,576 BACs could detect between 2 and 6 BACs, and over 90% could be reliably assigned to a specific coordinate with 184 screening pools (that is, 184 user experiments are required).
Prior art as disclosed in S. Asakawa et al., Human BAC Library: Construction and Rapid Screening, Gene Vol. 191, pp. 69-79 (1997), may disclose, in the Methods I section on page 72, some initial steps that are similar to the present invention, but that art requires pooling clones before growth and requires construction of each screening pool directly from the pooled clones after growth.
The accuracy, efficiency and reduced cost of the present invention result from at least one additional step: repooling the intermediate subpooled genomic clone DNA into final screening pools, such that each individual genomic DNA clone is present in three to ten unique final screening pools, or in at least 4 and no more than 8 unique Matrix Pools.
If the BAC Library is from an organism with a genome larger than 1,000 Mb, the researcher may find that there are very few ambiguous hits in the plate, row, column and diagonal (PRCD) plate. In such cases the Plate, Row, and Column pools correctly identify the clone of interest without the need for the Diagonal Pools. If the Diagonal Pools are screened only to resolve the infrequent ambiguity, the number of PCR experiments is reduced.
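The role of the Diagonal Pools can be sketched for a single 384-well plate. When two clones on the same plate are positive, the Row and Column pools alone yield 2 x 2 = 4 candidate wells, and a diagonal coordinate can separate the true wells from the false ones. The diagonal definition used here, (column - row) modulo 24, is an assumption for illustration; the disclosure does not specify how diagonals are indexed.

```python
ROWS, COLS = 16, 24  # 384-well plate

def diagonal(row, col):
    # Assumed diagonal index: wells sharing (col - row) mod COLS share a pool.
    return (col - row) % COLS

def row_column_candidates(positive_rows, positive_cols):
    """Every well consistent with the positive Row and Column pools."""
    return [(r, c) for r in sorted(positive_rows) for c in sorted(positive_cols)]

def resolve_with_diagonals(candidates, positive_diagonals):
    """Keep only candidates whose Diagonal pool is also positive."""
    return [w for w in candidates if diagonal(*w) in positive_diagonals]

# Simulated plate with two positive clones (zero-based row, column).
true_wells = [(3, 17), (9, 2)]
positive_rows = {r for r, _ in true_wells}
positive_cols = {c for _, c in true_wells}

candidates = row_column_candidates(positive_rows, positive_cols)

# Diagonal Pools are screened only when Row/Column results are ambiguous.
if len(candidates) > 1:
    positive_diagonals = {diagonal(*w) for w in true_wells}  # simulated result
    resolved = resolve_with_diagonals(candidates, positive_diagonals)
else:
    resolved = candidates

print(candidates)
print(resolved)
```

In this example the four Row/Column candidates are reduced to the two true wells. Some combinations of positives can still leave residual ambiguity under a single diagonal rule, so this sketch shows the principle rather than a guarantee.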
A Bac-Bank is a way of storing fragments of DNA, together constituting the whole genome of an organism. The DNA of an organism is (semi) randomly cut in pieces, and these fragments are inserted into bacteria, which are then plated out so that a single colony grows from a single modified bacterium. Only modified bacteria are allowed to grow by using a bacterium that is potentially resistant to a certain antibiotic, and whose resistance is “switched on” by the presence of a foreign DNA fragment (insert), and by using a growth medium containing the antibiotic. The resulting (potentially) unique colonies of bacteria are then picked up individually and transferred to the wells of 384-well plates, and the resulting stack of plates holding a large number of unique bacteria, ideally containing the whole genome of the original organism, is known as a “Bac-Bank”. It serves as a research database of the genome of the original organism. This database can be searched for fragments of DNA using PCR techniques.
Pooling is a method that allows one to quickly and economically search a Bac-Bank for the presence of certain DNA fragments. A Bac-Bank normally contains a large number of clones (~100,000), and testing all these clones individually for the presence of a fragment of DNA occurring only a few times (typically less than 100 times) in the original organism's genome is prohibitively expensive and laborious. When pooling is used, the DNA of several clones is gathered into a much lower number of wells (pools), every well containing DNA from several clones and every clone's DNA being present in multiple wells. The distribution pattern (“pooling method”, “pooling strategy”, “rule-set”) is designed in such a way that, when PCR reactions are used to screen the pools, a pattern of PCR reaction results emerges that is (hopefully) unique to the clone(s) having the required properties. A simple example: take a 384-well plate having 16 rows of 24 columns; imagine pooling all wells horizontally and vertically, resulting in 16 row-pools and 24 column-pools. If a single clone in this plate has a certain property, only the column-pool and the row-pool that particular clone is in will display a positive reaction when screened; the other 38 pools will be negative. Using only 40 PCR reactions it is therefore possible to pinpoint the positive clone in this 384-well plate, an almost tenfold reduction in labour and cost. As long as there are relatively few individuals with a certain property there is no problem; for properties that are shared among many individuals all pooling methods break down (they yield incorrect results, either false positives or, worse, false negatives), and when this happens one has to resort to screening the clones individually.
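The 384-well example above can be written out as a short sketch. The pool names ("R3", "C17") and the zero-based well coordinates are illustrative assumptions; the simulation simply marks a pool positive when it contains a positive well.

```python
ROWS, COLS = 16, 24  # 384-well plate: 16 rows of 24 columns

def pools_for_well(row, col):
    """Names of the row-pool and column-pool that contain a given well."""
    return {f"R{row}", f"C{col}"}

def screen(positive_wells):
    """Simulate screening: a pool is positive if it holds any positive well."""
    positive_pools = set()
    for row, col in positive_wells:
        positive_pools |= pools_for_well(row, col)
    return positive_pools

def decode(positive_pools):
    """List every well whose row-pool and column-pool are both positive."""
    rows = sorted(int(p[1:]) for p in positive_pools if p.startswith("R"))
    cols = sorted(int(p[1:]) for p in positive_pools if p.startswith("C"))
    return [(r, c) for r in rows for c in cols]

total_pools = ROWS + COLS                    # 16 + 24 = 40 PCR reactions
single = decode(screen([(3, 17)]))           # one positive clone
double = decode(screen([(3, 17), (9, 2)]))   # two positive clones

print(total_pools, ROWS * COLS / total_pools)
print(single)
print(double)
```

A single positive clone decodes to exactly one well from 40 reactions instead of 384 (a ratio of 9.6). With two positive clones in different rows and columns, four pools are positive and four wells are consistent with them, which shows in miniature how row/column pooling begins to break down as the number of positives grows.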
Most often the individual clones in a Bac-Bank are identified/labeled according to some hierarchical structure dictated by the physical properties of the Bac-Bank. The number of dimensions of a Bac-Bank is then related to the hierarchical structure of the storage format.
An example: The clones of a Bac-Bank are individually stored in wells on a plate. The wells are arranged in a rectangular pattern of rows and columns. If this plate constitutes the whole Bac-Bank this Bac-Bank can be viewed as one-dimensional if all the wells on the plate have consecutive numbers from left to right and top to bottom. One single parameter (well number) suffices to address every individual clone/well on the plate, and therefore the Bac-Bank is one-dimensional. A more natural approach in this example would be to address each well by its column and row numbers; then we would need two parameters to address an individual well, and therefore the same Bac-Bank can be two-dimensional as well.
For a larger Bac-Bank one plate would not suffice, and we could give each plate a separate ID code. This would add one coordinate to the number of coordinates required to address each individual well, and therefore there is one more dimension in this case than there would be for a single plate. Another approach is to store the well-plates in boxes of (for example) 10; each plate itself would then have two parameters (coordinates) for an address: the box number and (within that box) the plate number.
All these items are a matter of choice, and therefore the number of dimensions of a Bac-Bank is a choice as well; it is even possible to use several different addressing schemes. One may address a well by a single number/code without imposing any structure upon this number/code, but one may also choose to address an individual well as “[C, 23, 4, A, 6]”, when the clone is located in fridge C, box 23, plate 4, column A, and row 6.
Having said this, it is important to note that it is most convenient to have some sort of logical structure related to the physical location of a clone; this helps you find individual clones faster, and most often there is also a relation between this logical/physical organization and the way the Bac-Bank is pooled. In all other examples we will assume that the Bac-Bank consists of 300 plates of 24×16 wells, and that the Bac-Bank is three-dimensional.
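The equivalence between a one-dimensional and a three-dimensional addressing scheme for the assumed Bac-Bank of 300 plates of 24×16 wells can be sketched as follows. The zero-based numbering and the function names are illustrative assumptions; wells are numbered left to right and top to bottom within a plate, as in the single-plate example earlier.

```python
PLATES, ROWS, COLS = 300, 16, 24       # Bac-Bank dimensions assumed in the text
WELLS_PER_PLATE = ROWS * COLS          # 384
TOTAL_CLONES = PLATES * WELLS_PER_PLATE

def to_address(index):
    """Convert a zero-based clone number into (plate, row, column)."""
    if not 0 <= index < TOTAL_CLONES:
        raise IndexError("clone number outside the Bac-Bank")
    plate, well = divmod(index, WELLS_PER_PLATE)
    row, col = divmod(well, COLS)      # left to right, then top to bottom
    return plate, row, col

def to_index(plate, row, col):
    """Convert (plate, row, column) back into a zero-based clone number."""
    return (plate * ROWS + row) * COLS + col

address = to_address(1000)
print(TOTAL_CLONES)
print(address, to_index(*address))
```

Both schemes address the same 115,200 wells; the three-dimensional form is the one that maps naturally onto plate, row and column pools.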
It will be readily apparent that there are a number of important claims that will arise from this disclosure, including but not limited to:
1. Higher resolution deconvolution of complex data without as many analysis reactions.
2. Analyzing a two, three or more-dimensional matrix of pools allows significant reduction in analysis reactions while retaining a high degree of specificity.
3. The incorporation of loss-less compression and error-correction into the pooling strategy allows improved robustness of analysis and identification of individuals from the pools with increased effectiveness while reducing the number of analyses.
4. Significantly reducing the number of analysis reactions required, compared with other, less sophisticated pooling systems, if a matrix re-pooling design is utilized.
5. As the analytical methods improve, the ability to re-pool pools (that currently are at the limits of detection) is another significant improvement and advantage.