A case-control association study may be used to identify genetic markers that are correlated with a phenotypic trait of interest, such as a predisposition to a disease, condition, illness, response to a drug, or other physical characteristic. The most useful and interesting associations are those that are functionally linked to the trait of interest. If case and control groups are not carefully matched to have similar genetic compositions, then studies may find spurious associations that have no causal relationship with the trait of interest.
Association studies have frequently been criticized for their inability to distinguish causal associations from spurious ones due to incomplete matching of cases and controls. Accordingly, further inquiry and analysis are always needed after completing an association study to understand the biological bases of any putative associations, and to determine which are likely to be functionally relevant. Following up on spurious associations can be difficult and expensive.
To reduce the likelihood of spurious associations, some studies have employed homogeneous groups of individuals having the same ancestry (ethnicity), in some cases an ethnicity with relatively little genetic variation, such as groups from Northern Europe and Iceland. The homogeneity of such groups may reduce the incidence of “false positives,” i.e., spurious associations between genetic markers and phenotypes that are in fact due to systematically unequal incidence of genetic markers in case and control groups. However, there are various disadvantages to using homogeneous groups in an association study. For instance, in some homogeneous populations, it may be difficult to find enough individuals with the phenotypic trait of interest, such as a particular disease, or it may be difficult to recruit enough individuals willing to participate in a given study. Additionally, the set of predisposing genetic factors in a restricted population may be less likely to be generalizable to or replicable in other populations. Hence, the usefulness of data derived from these studies may be limited.
Accordingly, nonhomogeneous groups (e.g., groups having a mixture of genetic or ancestral backgrounds) are frequently used in association studies. This mixture can occur as a result of combining individuals of geographically or genetically isolated populations in a study group, and/or having individuals of mixed ancestry in such a group. As indicated, study designs having groups of mixed backgrounds can make it difficult to identify causal associations: the incidence of the phenotype of interest may be correlated with genetic loci that are not causally related to the phenotype, the correlation arising simply from the unmatched nature of the populations being compared. Association studies have traditionally been designed with only limited attention to matching, often limited to the level of “ethnicity matching” by ascertaining self-reported ethnicities from the participants. This self-categorization may be unreliable and may not accurately reflect the detailed population structure of the groups being studied.
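The effect of unmatched groups can be illustrated with a small worked example. The sketch below is illustrative only: the two ancestral subpopulations "A" and "B", the marker frequencies, and the group sizes are hypothetical values chosen for the arithmetic, not data from any study. The marker is assumed to have no relationship to the phenotype within either subpopulation, yet the pooled comparison suggests a strong association because the case and control groups differ in ancestral composition.

```python
# Hypothetical illustration: a marker with no causal effect appears
# associated with the phenotype when cases and controls are unmatched.

# Assumed marker carrier frequency in each subpopulation. Within a
# subpopulation the frequency is the same for cases and controls.
CARRIER_FREQ = {"A": 0.8, "B": 0.2}

# Assumed ancestral composition of the (unmatched) study groups.
CASES = {"A": 800, "B": 200}
CONTROLS = {"A": 200, "B": 800}


def carrier_split(n, freq):
    """Return (carriers, non_carriers) for n individuals at frequency freq."""
    carriers = round(n * freq)
    return carriers, n - carriers


def odds_ratio(case_carriers, case_non, ctrl_carriers, ctrl_non):
    """Odds ratio of carrying the marker in cases relative to controls."""
    return (case_carriers * ctrl_non) / (case_non * ctrl_carriers)


# Odds ratio computed separately within each subpopulation.
stratum_ors = {}
pooled = [0, 0, 0, 0]  # case carriers, case non, control carriers, control non
for pop, freq in CARRIER_FREQ.items():
    case_c, case_n = carrier_split(CASES[pop], freq)
    ctrl_c, ctrl_n = carrier_split(CONTROLS[pop], freq)
    stratum_ors[pop] = odds_ratio(case_c, case_n, ctrl_c, ctrl_n)
    for i, count in enumerate((case_c, case_n, ctrl_c, ctrl_n)):
        pooled[i] += count

# Odds ratio computed on the pooled groups, ignoring ancestry.
pooled_or = odds_ratio(*pooled)

print("within-subpopulation odds ratios:", stratum_ors)  # 1.0 in both
print("pooled odds ratio:", round(pooled_or, 2))         # about 4.52
```

Under these assumptions the odds ratio is 1.0 within each subpopulation, while the pooled odds ratio is about 4.5. If the case and control groups instead had the same ancestral composition, the pooled odds ratio would also be 1.0, which is the sense in which matching removes this kind of spurious association.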