Cystic fibrosis (CF) is the most common life-threatening autosomal recessive disease in the Caucasian population. Approximately 1 in 2,500 live births is affected by this genetic disorder. Obstructive lung disease, pancreatic enzyme insufficiency and elevated sweat electrolytes are the hallmarks of CF, but the severity of these symptoms varies from patient to patient. Patients with CF usually die at an early age due to lung infection. With recent advances in clinical treatments, which are directed against the symptoms, the mean survival age for patients has increased to 26 years.
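The incidence figure above can be related to a carrier frequency with a short calculation. The following is a minimal sketch only: it assumes Hardy-Weinberg equilibrium and treats all mutant CF alleles as a single class, neither of which is stated in the text, and the variable names are illustrative.

```python
import math

# Incidence taken from the text: about 1 in 2,500 live births.
incidence = 1 / 2500

# Assumption (not from the source): Hardy-Weinberg equilibrium, so that
# incidence = q**2, where q is the combined frequency of all mutant CF alleles.
q = math.sqrt(incidence)

# Frequency of the normal allele and of heterozygous carriers (2pq).
p = 1 - q
carrier_frequency = 2 * p * q

print(f"combined mutant allele frequency q = {q:.3f}")
print(f"estimated carrier frequency 2pq   = {carrier_frequency:.4f}")
```

Under these assumptions roughly 4% of the population would carry one mutant allele. The estimate is only as good as the equilibrium assumption and the incidence figure it starts from.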
Despite intensive research efforts for the past fifty years, the basic defect in CF remains speculative. It is generally believed that the heavy mucus found in the respiratory tract and the blockage of exocrine secretion from the pancreas are due to an imbalance in water secretion, which is the consequence of a defect in the regulation of ion transport in the epithelial cells.
The precise localization of the CF locus on the long arm of chromosome 7, region q31, facilitated the recent isolation of the responsible gene. The CF gene spans 250 kilobase pairs (kb) of DNA and encodes an mRNA of about 6,500 nucleotides in length. The CFTR gene is disclosed and claimed in U.S. application Ser. No. 401,609 filed Aug. 31, 1989. That application is co-owned by the applicant of this application.
Expression of this gene could be observed in a variety of tissues that are affected in CF patients, for example, lung, pancreas, liver, sweat gland and nasal epithelia. An open reading frame encoding 1480 amino acids could be deduced from the overlapping cDNA clones isolated. The putative protein, as noted, is called "Cystic Fibrosis Transmembrane Conductance Regulator", or CFTR for short, to reflect its possible role in the cells. The predicted molecular mass of CFTR is about 170,000.
Based on sequence alignment with other proteins of known function, CFTR is thought to be a membrane-spanning protein which can function as a cyclic AMP-regulated chloride channel. The internal sequence identity between the first and second halves of CFTR resembles that of other prokaryotic and eukaryotic transport proteins, most notably the mammalian P-glycoprotein.
The most frequent mutant allele of the CF gene involves a three base pair (bp) deletion which results in the deletion of a single amino acid residue (phenylalanine) at position 508, within the first ATP-binding domain of the predicted polypeptide. Although this mutation (ΔF508) accounts for about 70% of all CF chromosomes, there is a marked difference in its proportion among different populations. The remaining 30% of mutations in the CF gene appear to be heterogeneous and most of them are rare, with some represented by only single examples, as referenced in applicant's Canadian patent application filed Jul. 9, 1990.
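The effect of a three bp deletion on the reading frame can be sketched in a few lines of Python. This is an illustration only: the 15-nucleotide fragment below is a hypothetical stretch chosen to encode Ile-Ile-Phe-Gly-Val and is not offered as the CFTR sequence, the deletion offsets are arbitrary, and the codon table covers only the codons the example needs.

```python
# Subset of the standard genetic code, limited to the codons used below.
CODON_TABLE = {
    "ATC": "I", "ATT": "I",  # isoleucine
    "TTT": "F",              # phenylalanine
    "GGT": "G",              # glycine
    "GTT": "V", "GTG": "V",  # valine
    "TTG": "L",              # leucine
}

def translate(dna):
    """Translate complete codons from position 0; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def delete_bases(dna, start, length):
    """Return dna with `length` bases removed, beginning at 0-based index `start`."""
    return dna[:start] + dna[start + length:]

# Hypothetical fragment (assumption, for illustration): Ile-Ile-Phe-Gly-Val.
wild_type = "ATCATCTTTGGTGTT"

# A 3 bp deletion aligned with a codon removes exactly one residue.
in_frame = delete_bases(wild_type, 6, 3)

# A 3 bp deletion that straddles two codons still preserves the frame.
straddling = delete_bases(wild_type, 5, 3)

# A 1 bp deletion, by contrast, shifts the frame and alters downstream residues.
frameshift = delete_bases(wild_type, 6, 1)

for label, seq in [("wild type", wild_type), ("3 bp, aligned", in_frame),
                   ("3 bp, straddling", straddling), ("1 bp frameshift", frameshift)]:
    print(f"{label:17s} {seq:15s} {translate(seq)}")
```

In this toy fragment both three bp deletions remove the phenylalanine and leave the flanking residues unchanged, while the single-base deletion changes the residues that follow it.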
The mutation screening study confirms that the ATP-binding domains detected by sequence alignment are important for CFTR function, as multiple, different mutations have been found for many of the highly conserved amino acid residues in these regions. The locations of the various mutations also identified other functionally important regions in CFTR. There is, for example, a second three bp deletion resulting in the omission of an isoleucine residue at position 506 or 507 of the putative protein. While amino acid substitutions at these positions are apparently not disease-causing, this observation argues that the length of the peptide is more critical than the actual amino acid residue in the 506-508 region. Further, the existence of a large number of nonsense, frameshift and mRNA splicing mutations in the CF gene implies that absence of CFTR is not incompatible with life.
The varied symptoms among different CF patients suggest that disease severity is at least in part related to the mutations in the CF gene. Such an association, which is expected to be concordant among patients within the same family, as they should have the same genotype at the CF locus, is observed for pancreatic function. Approximately 85% of CF patients are severely deficient in pancreatic enzyme secretion and are thus diagnosed as pancreatic insufficient (PI); the other 15% have sufficient enzyme and are thus classified as pancreatic sufficient (PS). Family studies showed that there was almost complete concordance of pancreatic status among patients within the same family, leading to the suggestion that PI and PS are predisposed by the patients' genotypes. Subsequent studies showed that patients homozygous for the ΔF508 mutation were almost exclusively PI. This information may be useful in disease prognosis.
There are other mutations that would be classified in the same group as ΔF508, the so-called severe mutant alleles with respect to pancreatic function. In contrast, patients with one or two copies of the other (i.e., mild) class of alleles are expected to be PS. Meconium ileus, which is observed in about 30% of CF patients, appears to be a clinical variation of PI and not directly determined by the CF genotype. Other clinical manifestations are more complicated, and no apparent association has yet been detected.
With the identification of the CF gene, a better understanding of the basic defect and pathophysiology of the disease can now be attained. Progress is being made in studies of the regulatory mechanisms governing the expression of this gene, and of the biosynthesis and subcellular localization of the protein (through generation of antibodies against various parts of the protein). In addition, it is important to develop effective assay systems for the function of CFTR. This information may be useful in the development of rational therapies, including gene therapy.
In order to obtain a DNA sequence containing the entire coding region of CFTR, it is necessary to construct a full-length cDNA from the overlapping clones previously isolated. A major difficulty has been encountered in the process, however. As the various portions of the full-length cDNA are linked together by standard procedures, i.e., restriction enzyme cutting and ligation, with a plasmid vector in Escherichia coli, frequent sequence rearrangement has been detected in the resulting construct.
For a better understanding of the regulatory functions of the CFTR protein, and also for purposes of gene and drug therapy, it is useful to be able, in a commercial way, to propagate and express the normal CFTR gene and various mutant CFTR genes in a variety of hosts, which include bacteria, yeast, molds, plant and animal cells and the like.
Although propagation and expression of the cDNA sequence for the CFTR gene can be achieved in some vehicles, there are the aforementioned difficulties in obtaining stable propagation of the cDNA in some types of bacteria, particularly E. coli. It is thought that the cDNA contains sequence portions which, when propagated in the bacteria, result in a toxic effect which is countered by lack of propagation of the cDNA in the microorganism.