Complex crop traits such as yield, stress tolerance and metabolite composition, and related phenomena such as heterosis and combining ability, are difficult to study because of their quantitative genetic nature and their strong interaction with the environment. In addition, the genetics underlying such traits and phenomena is very often complex, quantitative and polygenic, which means that the resulting phenotype is caused by the interaction of different alleles at different genetic loci.
Attempts to characterise the individual loci that contribute to a quantitative trait have been successful when each locus makes a measurable contribution to the total effect, irrespective of the presence or absence of alleles at the other loci that contribute to the trait. In this case the individual loci, called QTLs, are of an additive nature and are inherited in a simple Mendelian fashion.
Several methods for QTL mapping have been extensively described. However, most of these methods fail when phenotypes are caused by the interaction of numerous heterozygous loci, especially when such loci are interdependent. This means that specific alleles at two or more loci need to be present simultaneously for the expression of a specific trait. In the absence of the required alleles at any one of these loci, the phenotypic trait will not be expressed. The individually required alleles can occur in either homozygous or heterozygous form. Depending on the specific trait, different genetic constitutions of the loci may be required. For instance, a measurable effect may be observed only when two or more loci are present in the heterozygous state, and no effect is observed when any one of these loci is homozygous. In such a case, the loci can be said to be interdependent.
As mentioned before, complex traits such as yield and stress tolerance are of high industrial importance. It is therefore highly desirable to have tools, such as molecular markers linked to these complex traits, that increase the efficiency of breeding for such traits in different crops.
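The interdependent case described above can be illustrated with a minimal sketch. It assumes two unlinked loci segregating in an F2 population (genotype frequencies 1/4, 1/2, 1/4 per locus) and a hypothetical effect size of 10 units that appears only when both loci are heterozygous. The function names and numbers are illustrative assumptions and are not taken from any cited method.

```python
# Assumed F2 genotype frequencies at a single locus (no linkage, no selection).
F2_FREQ = {"hom1": 0.25, "het": 0.5, "hom2": 0.25}


def phenotype(geno_1, geno_2, effect=10.0):
    """Hypothetical interdependent model: the effect is expressed only
    when locus 1 AND locus 2 are both heterozygous."""
    return effect if geno_1 == "het" and geno_2 == "het" else 0.0


def marginal_means(effect=10.0):
    """Mean phenotype per genotype class of locus 1, averaged over the
    F2 genotype distribution of locus 2."""
    return {
        g1: sum(freq * phenotype(g1, g2, effect) for g2, freq in F2_FREQ.items())
        for g1 in F2_FREQ
    }


means = marginal_means()
for genotype, mean in means.items():
    print(genotype, mean)
```

Under these assumptions, a scan of locus 1 on its own sees only half of the true effect in the heterozygous class, because half of those plants are homozygous at locus 2. With more interdependent loci the visible single-locus effect shrinks further.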
Contemporary plant breeding routinely uses genetic (molecular) marker technologies such as AFLP, RAPDs, SSRs, SNPs etc. For a review see e.g. Lakshmikumaran, T. et al., Molecular markers in improvement of wheat and Brassica. In: Plant Breeding: Mendelian to Molecular approaches. H. Jain and M. Kharkwal (eds.) Copyright 2004 Narosa Publishing House, New Delhi, India, page 229-255.
Molecular markers are very desirable as diagnostic tools that indicate the presence of a particular trait even in a developmental stage during which the trait is not expressed. In addition, molecular markers are insensitive to environmental conditions.
As an example, molecular markers (for example in the form of SNP=single nucleotide polymorphism, or associated with DNA bands on agarose or polyacrylamide gels) can be found that are genetically linked to genes that are responsible for the colour of pepper fruits when they are ripe. A DNA sample taken from a seedling can be used to determine which colour the fruits of the plant will eventually have. So in this case there is a direct association between the presence of a particular DNA sequence that is being “called” and the presence of a particular trait.
In essence, the same procedure is true for many polygenic traits (see e.g. Tanksley S., Mapping polygenes, Annu. Rev. Genet. 1993, 27: 205-233). In the latter case, the trait, whatever it may be, for instance disease resistance, resistance to stress, production of vitamins etc., may be controlled by more than one locus. It is assumed that the contribution of every individual locus, and its associated DNA marker can be measured and that the sum of the different loci and their respective DNA markers will phenotypically result in the presence of the particular trait (to some extent). This concept traces back to the classical work of R. A. Fisher (The correlations between relatives on the supposition of Mendelian inheritance, Trans. R. Soc. Edinb. (1918) 52, 399-433), who linked Mendelian genetics with earlier statistical approaches of correlation between relatives, to explain quantitatively inherited traits.
Eukaryote chromosome mapping by recombination is a well-known technique for the person skilled in the art (Griffiths A J F et al., (2005) Eukaryote chromosome mapping by recombination, In: Introduction to Genetic Analysis, 8th edition. W. H. Freeman and Company, New York p 115-137).
The mapping of segregating traits, i.e. QTL-mapping (QTL=Quantitative Trait Locus), does not depend solely on technical issues or recombination. Equally important is the accurate observation or scoring of the phenotype, whether qualitative or quantitative. In this respect, when mapping complex traits or effects, the person skilled in the art preferably uses a population of doubled haploid lines (DH) or a population of recombinant inbred lines (RIL), which segregate for the trait(s) of interest and which are derived from a single F1 plant.
DH-lines are derived directly from the haploid F1-plant gametes, by plant regeneration and chromosome doubling. RILs are highly inbred lines, derived by single seed descent (SSD), i.e. via inbreeding over several generations, where each individual plant provides one seed for the next generation, starting in the F2.
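The reason several generations of single seed descent are needed can be sketched with the standard expectation that strict selfing halves heterozygosity in each generation. The function below is an illustrative assumption: it considers only loci at which the two parents differ and ignores selection and genotyping error.

```python
def expected_heterozygosity(generation):
    """Expected fraction of heterozygous loci in generation F<generation>
    under strict selfing, starting from an F1 that is heterozygous at
    every locus at which the parents differ."""
    if generation < 1:
        raise ValueError("generation must be 1 (F1) or higher")
    return 0.5 ** (generation - 1)


for gen in (1, 2, 4, 6, 8):
    print(f"F{gen}: {expected_heterozygosity(gen):.4f}")
```

A doubled haploid line, by contrast, reaches complete homozygosity in a single step, since its chromosomes are duplicated from one haploid gamete.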
Alternatively, so-called Near Isogenic Lines (NIL) are used. NILs are homozygous lines that differ in a small DNA fragment. They are usually derived from backcrosses, but can also be obtained from segregating RILs (Tuinstra et al., (1997) Theor. Appl. Genet. 95: 1005-1011).
DH-lines, RILs and NILs have contributed greatly to contemporary genetics and genetic mapping. The advantage of such pure lines lies precisely in the fact that phenotypic variation between lines (inter-line variation) is more easily recorded than segregation at the level of individual plants in a classical F2 mapping population. The availability of pure lines is all the more important because environmental influence can be accounted for by replicating genetically identical plants of a pure line. This is in contrast to single, unreplicated, individual F2-plant phenotypes, each of which is the product of the interaction of genes and environment and cannot be separated into these two components.
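The benefit of replication mentioned above can be made concrete with the usual expression for heritability on a line-mean basis, in which the environmental variance of a line mean is divided by the number of replicates. The variance values used here are hypothetical, and genotype-by-environment interaction is ignored for simplicity.

```python
def line_mean_heritability(var_genetic, var_env, replicates):
    """Fraction of the variance among line means that is genetic, assuming
    independent environmental deviations across replicated plants."""
    if replicates < 1:
        raise ValueError("replicates must be at least 1")
    return var_genetic / (var_genetic + var_env / replicates)


# Hypothetical variances: environment four times as variable as genotype.
for r in (1, 4, 16):
    print(r, round(line_mean_heritability(1.0, 4.0, r), 3))
```

With one plant per genotype, as in an unreplicated F2, only a fifth of the observed variance is genetic in this example. Sixteen replicates of a pure line raise that share to four fifths.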
The elucidation of complex effects such as heterosis or combining ability between lines is one of the biggest challenges for contemporary genetics and plant breeding. For heterosis, several hypotheses have been formulated (see e.g. Birchler J et al. (2003) The Plant Cell 15, 2236-2239). The so-called historical explanations for heterosis are “overdominance” and “dominance”. Overdominance refers to the idea that allelic interactions occur in the hybrid such that the heterozygous class performs better than either homozygous class. Dominance refers to the situation in which the suboptimal recessive allele of one parent is complemented by the dominant allele of the other parent. Whereas heterotic effects explained by dominance can in principle be fixed in a homozygous state, this is impossible for effects explained by overdominance, which by definition require the heterozygous state. It has recently become clear that the two competing single-locus explanations for heterosis are insufficient and that epistatic effects, i.e. inter-locus interactions, also play a major role in the genetic basis of heterosis (Yu S B et al. (1997) Proc. Natl. Acad. Sci. USA 94: 9226-9231).
As mentioned before, traditionally used mapping population structures with homozygous individuals, such as Recombinant Inbred Line populations (RILs) and Doubled Haploid (DH) populations, cannot easily be applied for mapping the specific effect of the heterozygous state at a certain locus. This disadvantage has been overcome by crossing these populations with testers, and assessing the phenotypes of the offspring hybrids. However, this approach has three disadvantages. First, it requires additional labour, space and time. Second, it compares the heterozygous state of a locus with only one of the two possible homozygous states, unless at least one additional tester is used. Third, it does not fully assess the interaction between the heterozygous locus and the genetic background, i.e. gene interaction with specific effects due to heterozygosity.
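The single-locus distinction between dominance and overdominance can be sketched as a comparison of the heterozygote value with the two homozygote values. The function name, the category labels and the example values are illustrative assumptions. Real data would additionally require a statistical test rather than an exact comparison.

```python
def classify_locus(val_hom1, val_het, val_hom2):
    """Classify single-locus gene action from three genotypic values,
    assuming that a higher value means better performance."""
    best = max(val_hom1, val_hom2)
    mid_parent = (val_hom1 + val_hom2) / 2
    if val_het > best:
        return "overdominance"        # heterozygote beats both homozygotes
    if val_het == mid_parent:
        return "additive"             # heterozygote exactly intermediate
    if val_het == best:
        return "complete dominance"   # heterozygote equals the better parent
    if val_het > mid_parent:
        return "partial dominance"
    return "below mid-parent"


print(classify_locus(10, 14, 8))   # heterozygote superior to both parents
print(classify_locus(10, 10, 6))   # recessive allele fully complemented
print(classify_locus(10, 8, 6))    # no dominance
```

In the second example the best value (10) is also reached by one homozygote, so the effect can be fixed in a pure line. In the first example the value 14 exists only in the heterozygote.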
The use of diallel mating populations, as proposed by Charcosset et al. (1994) and Rebaï et al. (1994) (both in: Biometrics in plant breeding: applications of molecular markers; Eds: Ooijen J. and Jansen J. CPRO-DLO, Wageningen, The Netherlands), overcomes part of the latter two disadvantages, but requires even more labour and space.
F2- and backcross populations can be used for mapping the specific effect of the heterozygous state at a certain locus. However, only limited gene interaction can be accommodated in F2-based QTL-analysis, because the number of parameters that can be estimated in the statistical model is limited by population size. Backcross populations require more time and labour to produce, and the effect of the heterozygous state at a certain locus is only estimated for the genetic background of the recurrent parent, without taking into account possible interactions with other loci.
Another approach that avoids large investments in time, space and labour for developing mapping populations is linkage disequilibrium mapping (LD-mapping; Kraakman A T W et al. (2004) Genetics 168, 435-446; Kraft T et al. (2000) Theor. Appl. Genet. 101, 323-326). This method makes use of existing genetic material, such as varieties and genebank accessions. If this material is sufficiently heterozygous, for example a mapping set of hybrid varieties, it is possible to estimate the specific effect of heterozygous loci. However, LD-mapping methods in general do not consider epistatic effects and require large numbers of accessions to detect additively working QTLs within the statistical noise caused by the epistatic effects, which are due to the different genetic backgrounds across all accessions (Flint-Garcia S A et al. (2003) Annu. Rev. Plant Biol. 54, 357-374).
Traits that depend on the combined allelic constitution of two or more loci are much more difficult to identify or map. In population genetics this interaction between several loci is called ‘epistasis’. In this case the contribution of one locus is only measurable in a certain allelic constitution of one or more other loci.
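The limitation imposed by population size can be sketched by counting genotype classes. The sketch assumes unlinked biallelic loci with regular F2 segregation and a hypothetical population of 200 plants. A model with all interactions among k loci needs one parameter per multi-locus genotype class beyond the overall mean.

```python
def f2_genotype_classes(n_loci):
    """Number of multi-locus genotype classes (3 genotypes per locus)."""
    return 3 ** n_loci


def full_model_parameters(n_loci):
    """Effect parameters of a fully interacting model, excluding the mean."""
    return f2_genotype_classes(n_loci) - 1


def expected_rarest_class_size(pop_size, n_loci):
    """Expected number of F2 plants in one specific class that is
    homozygous at every locus (probability 1/4 per locus)."""
    return pop_size * 0.25 ** n_loci


for k in (1, 2, 3, 4):
    print(k, full_model_parameters(k), expected_rarest_class_size(200, k))
```

Under these assumptions, four interacting loci already require 80 parameters, while the rarest genotype classes are expected to contain less than one plant each.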
In a simple theoretical case one could imagine that a homodimeric enzyme that is encoded by a specific gene (1 locus) may be more effective in catalysis if the two subunits of the dimer are slightly different (1 locus but 2 alleles in the heterozygote) so that, for instance, a more effective catalytic site is formed. In this case AA′ is superior in catalysis as compared to AA or A′A′. In addition, it is quite possible that in a biosynthetic pathway this enzyme encoded by the “A” gene (whatever its genetic composition may be) depends on the catalysis of another enzyme that is upstream or downstream of it in the cascade. It can be easily understood that if the enzyme “A” becomes more efficient, this increase in efficiency can only take effect if there is no limitation in the substrate that “feeds” the “A” encoded enzyme. If the substrate used by the A enzyme is provided by another enzyme (B) for which the same rule holds (homodimeric enzyme improved by 2 alleles), then improvement is only obtained by the combination. In that case AA′/BB′ is better than AA/BB′ or AA′/BB and all the other combinations in which one or both heterozygous states are absent.
If, on the other hand, the output of the pathway is no longer limited by the step that A controls, but an enzyme downstream of A constitutes the limiting step, then the effect of different alleles of A is not measurable, and so the locus that is responsible for the enzyme downstream of A is epistatic to A.
A well-known example in which heterozygous individuals are superior to homozygous individuals is sickle-cell anemia. Investigation into the persistence of an allele that is so obviously deleterious in homozygous individuals led to the finding that the allele confers a small but significant resistance to lethal forms of malaria in heterozygous individuals. Natural selection has resulted in an allele frequency that balances the deleterious effects of the homozygous condition against the resistance to malaria afforded by the heterozygous condition.
It is obvious that heterozygote superiority and epistasis may be present simultaneously, and the effects described for homodimers can also be valid for heteromultimers.
In conclusion, this means that the contribution of one particular locus on its own cannot easily be measured or visualized, because at least part of the contribution of the individual locus is non-additive and depends on the allelic state of one or more other loci. Therefore QTL mapping of epistatic traits cannot easily be done by traditional methods, which generally assume additivity between loci. Incorporating inter-locus interactions in these methods often results in problems with statistical parameter estimation and low power to detect QTLs, due to the high parameterization of the genetic models used for this purpose.
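The enzyme example above can be sketched as a pathway whose output is set by its slowest step. The catalytic rates (1.0 for a homozygote, 2.0 for a heterozygote) and the bottleneck rule are hypothetical simplifications chosen only to reproduce the reasoning in the text.

```python
HOM_RATE = 1.0  # assumed rate of an enzyme from a homozygous locus
HET_RATE = 2.0  # assumed rate of the improved heterozygous dimer


def enzyme_rate(is_heterozygous):
    """Catalytic rate of a homodimeric enzyme under the assumed model."""
    return HET_RATE if is_heterozygous else HOM_RATE


def pathway_output(a_het, b_het, downstream_capacity=float("inf")):
    """Output of a linear pathway B -> A -> downstream step, limited by
    the slowest of the three steps."""
    return min(enzyme_rate(b_het), enzyme_rate(a_het), downstream_capacity)


# Interdependence: only the double heterozygote improves the output.
for a_het in (False, True):
    for b_het in (False, True):
        print(a_het, b_het, pathway_output(a_het, b_het))

# Epistasis: a limiting downstream step masks the alleles of A and B.
print(pathway_output(True, True, downstream_capacity=1.0))
```

In this sketch the single heterozygotes AA′/BB and AA/BB′ are indistinguishable from the double homozygote, and a downstream capacity of 1.0 hides the advantage of AA′/BB′ as well.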
Alternative methods that try to solve this problem are QTL × genetic background mapping, which is applied to diallel mating populations (Charcosset A et al. (1994) pp 75-84 and Rebaï A et al. (1994) pp 170-177, both in: Biometrics in plant breeding: applications of molecular markers; Eds: Ooijen J and Jansen J. CPRO-DLO, Wageningen, The Netherlands), and QTL × population mapping, which is applied to multiple related inbred-line crosses (Jannink J-L & Jansen R (2001) Genetics 157: 445-454). The latter authors state that such methods can also be applied to other population structures.
An interesting population structure for the detection of epistatic interactions is the Heterogeneous Inbred Family (HIF) (Haley S et al., (1994) Theor. Appl. Genet. 88, 337-342; Tuinstra M et al., (1997) Theor. Appl. Genet. 95: 1005-1011), because of its ‘multiple ceteris paribus’ property, i.e. the family contains many possible sub-populations, in each of which only one QTL is segregating in a specific homozygous background for the other QTLs.
The construction of HIF-populations is very tedious. It takes several generations of single seed descent, which means it is slow and labour-intensive. By the time the HIF-population is completed, the chosen population parents may no longer be up to date. Facilities or alternative locations to decrease generation time require high investments. Considering also that QTL-alleles of only two parents are analysed, it is often not worthwhile to invest in such populations for commercial breeding purposes.
A more pragmatic approach for QTL-mapping in the presence of epistasis is presented in U.S. Application No. 2005/0015827. In this approach, the position and effect of QTLs are recurrently monitored in the genetic background as it exists in the ongoing breeding program. As in linkage disequilibrium mapping (see above), no specific population structure is applied, and changes in position and effect of QTLs are accepted as a fact of life. The main disadvantages of this method are the high number of accessions that have to be analysed and the lack of analytical power to establish specific epistatic effects. In other words, it is not analysed which specific locus in the genetic background is interacting with the changing QTLs.
A more radical way of avoiding epistatic effects in QTL-analysis is the use of backcross populations. In this way QTL-effects can be analysed in a more or less constant genetic background, namely that of the recurrent parent. Most backcross population types (for instance backcross inbred lines or BILs) can be seen as analogues of regular mapping population types in which one or more backcrossing generations have been included to create a more uniform genetic background and, in several cases, to rule out one of the three allelic states of a locus.
In view of the above, it is the object of the present invention to provide a method for mapping traits in organisms, in particular plants, that does not have the above-described drawbacks.