This invention relates to novel chiral peptide nucleic acids (cPNAs) which hybridise strongly with complementary nucleic acids. As such they have potential as antigene and antisense agents and as tools in molecular biology.
In our WO 98/16550 we describe PNAs of the formula: 
where
n is 1 or 2-200
B is a protected or unprotected heterocyclic base capable of Watson-Crick or of Hoogsteen pairing.
R is H, C1-C12 alkyl, C6-C12 aralkyl or C6-C12 heteroaryl which may carry one or more substituents preferably selected from hydroxyl, carboxyl, amine, amide, thiol, thioether or phenol,
X is OH or OR′″ where R′″ is a protecting group or an activating group or a lipophilic group or an amino acid or amino amide or nucleoside,
Y is H or a protecting group or a lipophilic group or an amino acyl group or nucleoside.
When n is 1, these compounds are peptide nucleotide analogues. When n is 2 to about 30, these compounds are peptide oligonucleotides and can be hybridised to ordinary oligo- or polynucleotides. Typically the two strands are hybridised to one another in a 1:1 molar ratio by base-specific Watson-Crick base pairing.
These chiral PNAs were shown to interact strongly with complementary DNA or RNA. Such chiral PNAs and their hybrids with oligonucleotides, however, have poor solubility in aqueous media, making biological studies difficult. Attempts to improve the solubility of PNAs have so far met with variable success.
According to the present invention, there is provided a compound of formula (I): 
where
n is 1 to 200,
B is a protected or unprotected base capable of Watson-Crick or of Hoogsteen pairing,
X is OH or OR′″ where R′″ is a protecting group or an activating group or a lipophilic group or an amino acid or amino amide or nucleoside,
Y is H or a protecting group or a lipophilic group or an amino acyl group or nucleoside, and R′ and R″, which are the same or different, are H, C1-6 alkyl, aryl, ar(C1-C6)alkyl, or R′ and R″ together with the carbon atoms to which they are attached form a cycloalkyl ring.
B is a base capable of Watson-Crick or of Hoogsteen pairing. This may be a naturally occurring nucleobase selected from A, C, G, T and U; or a base analogue that may be base specific; or degenerate, e.g. by having the ability to base pair with both pyrimidines (T/C) or both purines (A/G); or universal, by forming base pairs with each of the natural bases without discrimination. Many such base analogues are known, e.g. hypoxanthine, 3-nitropyrrole, 5-nitroindole, and those cited in Nucleic Acids Research, 1989, 17, 10373-83, and all are envisaged for use in the present invention.
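The three categories of base described above can be illustrated with a small sketch. It classifies a base purely by the set of natural bases it is taken to pair with; the function name `classify_base` and the partner sets passed to it are illustrative assumptions and are not pairing data from this specification.

```python
# Natural bases and the two degenerate groups named in the text.
NATURAL = frozenset("ACGTU")
PYRIMIDINES = frozenset("TC")
PURINES = frozenset("AG")

def classify_base(partners):
    """Classify a base by the natural bases it pairs with (hypothetical helper)."""
    partners = frozenset(partners)
    if not partners or not partners <= NATURAL:
        raise ValueError("partners must be a non-empty subset of A, C, G, T, U")
    if partners == NATURAL:
        return "universal"        # pairs with every natural base
    if partners in (PYRIMIDINES, PURINES):
        return "degenerate"       # pairs with both pyrimidines or both purines
    if len(partners) == 1:
        return "base specific"    # pairs with exactly one natural base
    return "unclassified"         # not one of the three categories in the text

print(classify_base("G"))      # a cytosine-like, base-specific partner set
print(classify_base("TC"))     # an analogue pairing with both pyrimidines
print(classify_base("ACGTU"))  # an analogue pairing without discrimination
```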
The compounds of formula (I) contain proline of undefined stereochemistry. Although compounds with the trans-stereochemistry may have interesting properties, compounds with the cis-stereochemistry are preferred either with the D-configuration as shown in (II) or the L-configuration shown in structure (III). 
A particular compound is (IIa) of the formula: 
Any one of B, X and Y may include a signal moiety, which may be for example a radioisotope, an isotope detectable by mass spectrometry or NMR, a hapten, a fluorescent group or a component of a chemiluminescent or fluorescent or chromogenic enzyme system. The signal moiety may be joined to the peptide nucleotide analogue either directly or through a linker chain of up to 30 atoms as well known in the field.
R′ and R″ are preferably both hydrogen. Suitable alkyl groups which these groups may contain include methyl, ethyl, propyl and butyl, while suitable aryl groups include phenyl, so that an aralkyl group may be, for example, benzyl. The cycloalkyl ring which can be formed is suitably cyclopentyl or cyclohexyl. These various groups can be substituted, but the presence of heteroatoms is to be avoided as they tend to make the compound unstable.
Replacing the glycine carbonyl group in the glycyl-proline backbone of cPNA with a methylene group ("aminoethylprolyl cPNA") creates a more conformationally flexible backbone while the conformation of the side chain is still restricted. Increasing the conformational flexibility of the backbone might decrease the binding affinity because of the increased entropy loss upon hybridization, but it should allow the cPNA to adopt a wider range of conformations than the glycine-derived cPNAs. The combination of these two factors may decrease or increase the binding affinity of the resulting cPNA for its complementary oligonucleotide, depending on how close the conformation of the cPNA in the hybrid is to that in its native state. The basic proline nitrogen atom should be at least partially protonated under physiological conditions and should attract the negatively charged phosphate groups of DNA, so providing further stabilization of the hybrid formed with natural DNA.
The present invention also provides a process for preparing the compounds of the present invention which comprises:
(i) de-protecting the heterocyclic amino group of a compound of the formula: 
where
R2 is a protecting group, for example Dpm (diphenylmethyl),
R3 is a protecting group compatible with R2 for example Boc (t-butoxycarbonyl), and
B is a protected or unprotected heterocyclic base capable of Watson-Crick or Hoogsteen pairing, in particular N3-protected (such as by benzoyl) thymine, N6-protected adenine, N4-protected cytosine, N2-O6-protected guanine or N3-protected uracil,
(ii) reacting the de-protected product of (i) with an N-protected aziridine, for example N-p-nitrobenzenesulphonyl aziridine, to provide a compound of the formula: 
where R is a protecting group, and optionally (iii) converting said R group to a different protecting group such as Boc or Fmoc (9-fluorenylmethoxycarbonyl) and optionally removing said protecting groups.
In another aspect the invention provides a method of converting a peptide nucleotide analogue of formula (I) in which n is 1 into a peptide oligonucleotide of formula (I) in which n is 2-200, comprising the steps of:
(i) providing a support carrying primary amine groups,
(ii) coupling an N-protected peptide nucleotide analogue of formula (I) to the support,
(iii) removing the N-terminal protecting group,
(iv) coupling an N-protected nucleotide analogue of formula (I) to the thus-derivatised support,
(v) repeating steps (iii) and (iv) one or more times, and
(vi) optionally removing the resulting peptide oligonucleotide from the support.
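The cycle of steps (i) to (vi) can be sketched as a toy state model. This is only an illustration of the order of operations; the class `Support` and the function `assemble` are hypothetical names, monomers are represented by single letters, and no chemistry (yields, capping, side-chain protection) is modelled.

```python
from dataclasses import dataclass, field

@dataclass
class Support:
    """Toy model of a support carrying primary amine groups (step i)."""
    chain: list = field(default_factory=list)  # monomers in coupling order
    n_protected: bool = False                  # True while the N-terminus is protected

    def couple(self, monomer):
        # Steps (ii) and (iv): an N-protected monomer adds only to a free amine.
        if self.n_protected:
            raise RuntimeError("N-terminus is still protected; deprotect first")
        self.chain.append(monomer)
        self.n_protected = True

    def deprotect(self):
        # Step (iii): remove the N-terminal protecting group.
        if not self.n_protected:
            raise RuntimeError("no N-terminal protecting group to remove")
        self.n_protected = False

    def cleave(self):
        # Step (vi): release the oligomer, listed from the last-coupled monomer back.
        return list(reversed(self.chain))

def assemble(monomers_in_coupling_order):
    support = Support()                 # step (i)
    first, *rest = monomers_in_coupling_order
    support.couple(first)               # step (ii)
    for monomer in rest:                # step (v): repeat (iii) and (iv)
        support.deprotect()             # step (iii)
        support.couple(monomer)         # step (iv)
    return support.cleave()             # step (vi)

print(assemble(["T"] * 10))
```

The model raises an error if a coupling is attempted before deprotection, which mirrors why steps (iii) and (iv) must alternate.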
A synthetic route towards the N-aminoethylproline synthon carrying suitable protecting groups (7a and 7b) for solid phase peptide synthesis was developed (see Scheme 1). Reagents and conditions: (i) 2.5 equiv pTsOH/MeCN; (ii) N-nosylaziridine/DIEA/MeCN, rt; (iii) Boc2O/Et3N/DMAP in CH2Cl2; (iv) PhSH/K2CO3 in DMF, rt; (v) cyclohexene, cat. Pd/C in MeOH, reflux; (vi) FmocCl/DIEA; (vii) 4 M HCl/dioxane. N-p-Nitrobenzenesulfonyl ("nosyl") aziridine, obtained by treatment of N-nosylethanolamine with Ph3P/DEAD in THF at low temperature, was employed as the electrophilic N-aminoethylating agent. The Boc group of protected cis-4-thyminyl-D-proline (4) (synthesized previously) was selectively removed in the presence of the diphenylmethyl (Dpm) ester using p-toluenesulfonic acid in acetonitrile under conditions previously reported. Nucleophilic ring opening of N-nosylaziridine by the resulting free amine proceeded efficiently to give the desired N-aminoethylproline derivative (5). Reaction of (5) with Boc2O in the presence of 4-dimethylaminopyridine (DMAP) gave the N-Boc derivative. Treatment with thiophenol in the presence of potassium carbonate in DMF at room temperature gave the N-Boc derivative (6a) as an amorphous solid [50% yield from (4)]. Deprotection of the carboxyl group by catalytic transfer hydrogenolysis (cyclohexene, Pd-C) gave the free N-Boc amino acid (7a) in quantitative yield.
The Boc protecting group in (6a) was converted to the Fmoc group in order to take advantage of the milder conditions of peptide synthesis employing the Fmoc protection strategy. When (6a) was treated with p-toluenesulfonic acid in acetonitrile followed by 9-fluorenylmethyl chloroformate (Fmoc-Cl) in the presence of DIEA, the Fmoc derivative (6b) was obtained in 80% yield. The carboxyl protecting group was removed with HCl/dioxane to give the free acid (7b) (71%) as its hydrochloride salt, with the N3-benzoyl protecting group of thymine intact. The Fmoc-protected monomer (7b) was oligomerized on Novasyn TGR resin, previously loaded with Fmoc-Lys(Boc), on a 5 μmol scale using a standard HBTU/DIEA activation protocol with capping at the end of each cycle. Quantitative monitoring of the coupling efficiency, by measurement of the absorbance of the dibenzofulvene-piperidine adduct at 264 nm after each deprotection cycle, revealed an average coupling yield of approximately 92%. Using the Fmoc-ON purification strategy, it was possible to isolate Fmoc-(3a), together with its partially debenzoylated product, without difficulty using reverse phase HPLC. Subsequent treatment with 20% aqueous piperidine followed by HPLC gave the fully deprotected (3a).
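As a rough illustration of what the reported yields imply, the sketch below multiplies consecutive step yields. It assumes that the yields are independent and multiplicative, that the T10 oligomer requires ten monomer couplings, and that every coupling proceeds at the reported average of 92%; the function name `cumulative_yield` is illustrative only.

```python
from math import prod

def cumulative_yield(step_yields):
    """Overall fractional yield of consecutive steps, assuming yields multiply."""
    return prod(step_yields)

# Monomer route: (4) -> (6a) 50%, (6a) -> (6b) 80%, (6b) -> (7b) 71%.
monomer_overall = cumulative_yield([0.50, 0.80, 0.71])

# Oligomerisation: ten couplings (an assumption) at the average yield of 92%.
oligomer_overall = cumulative_yield([0.92] * 10)

print(f"(4) -> (7b): {monomer_overall:.1%}")
print(f"full-length chains after 10 couplings: {oligomer_overall:.1%}")
```

Under these assumptions the monomer (7b) is obtained in about 28% overall yield from (4), and about 43% of the chains on the resin reach full length, which is consistent with the need for capping and HPLC purification.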
ESI mass spectrometric analysis revealed a single peak corresponding to the expected product (Mr = 2787.27; calcd Mr = 2788.06). In contrast to the analogous glycylproline PNA with the same T10 sequence, the product is freely soluble in aqueous solvents and a concentration exceeding 5 mg/mL could be achieved (Scheme 1).
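The agreement between the observed and calculated masses can be expressed as an absolute and a relative deviation. The short sketch below does this arithmetic for the two values quoted above; the function name `mass_deviation` is illustrative, and no acceptance threshold is implied by this specification.

```python
def mass_deviation(observed, calculated):
    """Return the deviation in mass units and in parts per million."""
    diff = observed - calculated
    ppm = diff / calculated * 1e6
    return diff, ppm

diff, ppm = mass_deviation(observed=2787.27, calculated=2788.06)
print(f"deviation: {diff:.2f} ({ppm:.0f} ppm)")
```

For the quoted values the observed mass is 0.79 units below the calculated mass, a relative deviation of roughly 0.03%.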