The beer, wine and baking yeast Saccharomyces cerevisiae has already been used for centuries for the production of bread, wine and beer owing to its characteristic of fermenting sugar to ethanol and carbon dioxide. In biotechnology, S. cerevisiae is used particularly in ethanol production for industrial purposes, in addition to the production of heterologous proteins. Ethanol is used in numerous branches of industry as an initial substrate for syntheses. Ethanol is gaining increasing importance as an alternative fuel, due to the increasingly scarce presence of oil, the rising oil prices and continuously increasing need for petrol worldwide.
In order to make possible a favourably-priced and efficient bioethanol production, the use of lignocellulose-containing biomass, such as for example straw, waste from the timber industry and agriculture and the organic component of everyday household waste, presents itself as an initial substrate. Firstly, said biomass is very convenient and secondly is present in large quantities. The three major components of lignocellulose are lignin, cellulose and hemicellulose. Hemicellulose, which is the second most frequently occurring polymer after cellulose, is a highly branched heteropolymer. It consists of pentoses (L-arabinose, D-xylose), uronic acids (4-O-methyl-D-glucuronic acid, D-galacturonic acid) and hexoses (D-mannose, D-galactose, L-rhamnose, D-glucose) (see FIG. 1). Although hemicellulose can be hydrolyzed more easily than cellulose, it contains the pentoses L-arabinose and D-xylose, which normally cannot be converted by the yeast S. cerevisiae.
In order to be able to use pentoses for fermentations, these must firstly enter the cell through the plasma membrane. Although S. cerevisiae is not able to metabolize D-xylose, it can take up D-xylose into the cell. However, S. cerevisiae does not have a specific transporter. The transport takes place by means of the numerous hexose transporters. The affinity of the transporters to D-xylose is, however, distinctly lower than to D-glucose (Kotter and Ciriacy, 1993). In yeasts which are able to metabolize D-xylose, such as for example P. stipitis, C. shehatae or P. tannophilus (Du Preez et al., 1986), there are both unspecific low-affinity transporters, which transport D-glucose, and also specific high-affinity proton symporters only for D-xylose (Hahn-Hagerdal et al., 2001).
In earlier experiments, some yeasts were found, such as for example Candida tropicalis, Pachysolen tannophilus, Pichia stipitis, Candida shehatae, which by nature ferment L-arabinose or can at least assimilate it. However, these yeasts entirely lack the capability of fermenting L-arabinose to ethanol, or they only have a very low ethanol yield (Dien et al., 1996). Moreover, very little is yet known about the uptake of L-arabinose. In the yeast C. shehatae, a proton symport is assumed (Lucas and Uden, 1986). In S. cerevisiae, the galactose permease Gal2 is known to also transport L-arabinose, which is very similar in structure to D-galactose (Kou et al., 1970).
Alcoholic fermentation of pentoses in biotechnologically modified yeast strains of S. cerevisiae, wherein inter alia various genes of the yeast strain Pichia stipitis were used for the genetic modification of S. cerevisiae, was described in recent years particularly in connection with the fermentation of xylose. The engineering concentrated here particularly on the introduction of the genes for the initial xylose assimilation from Pichia stipitis, a xylose-fermenting yeast, into S. cerevisiae, i.e. into a yeast which is traditionally used in the ethanol production from hexose (Jin et al. 2004).
Jeppson et al. (2006) describe xylose fermentation by S. cerevisiae by means of the introduction of a xylose metabolic pathway which is either similar to that in the yeasts Pichia stipitis and Candida shehatae, which naturally use xylose, or is similar to the bacterial metabolic pathway.
Katahira et al. (2006) describe sulphuric acid hydrolysates of lignocellulose biomass, such as wood chips, as an important material for the production of fuel bioethanol. In this study, a recombinant yeast strain was constructed which is able to ferment xylose and cellooligosaccharides. For this, various genes were integrated into this yeast strain, namely genes for the intracellular expression of xylose reductase and xylitol dehydrogenase from Pichia stipitis and xylulokinase from S. cerevisiae, and for the display of beta-glucosidase from Aspergillus aculeatus on the cell surface. In the fermentation of sulphuric acid hydrolysates of wood chips, xylose and cellooligosaccharides were fully fermented by the recombinant strain after 36 hours.
Pitkanen et al. (2005) describe the isolation and characterization of xylose chemostat isolates of an S. cerevisiae strain which over-expresses genes of Pichia stipitis coding for xylose reductase and xylitol dehydrogenase and the gene which codes for endogenous xylulokinase. The isolates were obtained from aerobic chemostat cultures on xylose as the single or major carbon source. Under aerobic conditions on minimal medium with 30 g/l xylose, the growth rate of the chemostat isolates was 3 times higher than that of the original strain (0.15 h−1 compared with 0.05 h−1). The xylose uptake rate was increased almost two-fold. The activities of the key enzymes of the pentose phosphate metabolic pathway (transketolase, transaldolase) were increased two-fold, whilst the concentrations of their substrates (pentose-5-phosphates, sedoheptulose-7-phosphate) were lowered accordingly.
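To put the quoted specific growth rates into more familiar terms, they can be converted into doubling times with t_d = ln 2 / µ, which holds for exponential growth. The following Python sketch is only an illustration; the function name is hypothetical, and the two rates are the ones quoted above.

```python
import math

def doubling_time_h(mu_per_h: float) -> float:
    """Doubling time in hours for a specific growth rate mu (1/h),
    assuming purely exponential growth."""
    if mu_per_h <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / mu_per_h

# Growth rates as quoted in the text for Pitkanen et al. (2005)
original_strain = doubling_time_h(0.05)
chemostat_isolate = doubling_time_h(0.15)

print(f"original strain:   {original_strain:.1f} h per doubling")
print(f"chemostat isolate: {chemostat_isolate:.1f} h per doubling")
```

Under this assumption, a three-fold higher growth rate corresponds to a three-fold shorter doubling time, roughly 4.6 h instead of 13.9 h.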
Becker and Boles (2003) describe the engineering and the selection of a laboratory strain of S. cerevisiae which is able to use L-arabinose for growth and for fermenting it to ethanol. This was possible due to the over-expression of a bacterial L-arabinose metabolic pathway, consisting of Bacillus subtilis AraA and Escherichia coli AraB and AraD and simultaneous over-expression of yeast galactose permease transporting L-arabinose in the yeast strain. Molecular analysis of the selected strain showed that the predetermining precondition for a use of L-arabinose is a lower activity of L-ribulokinase. However, inter alia, a very slow growth is reported from this yeast strain (see FIG. 2).
Therefore, a need exists in the art for specific pentose transporters, in particular L-arabinose transporters, which allow the specific uptake of pentoses, in particular L-arabinose, into cells, such as yeast cells, and thereby promote the utilization and fermentation of pentoses, in particular L-arabinose.
It is therefore an object of the present invention to provide specific pentose transporters, such as arabinose transporters.
The problem is solved according to the invention by providing polypeptides which have an in vitro and/or in vivo pentose transport function, and variants and fragments thereof.
In particular, the polypeptide according to the invention is selected from the group of
a. a polypeptide, which is at least 70%, preferably at least 80% identical to the amino acid sequence according to SEQ ID NO:1 and has an in vitro and/or in vivo pentose transport function,
b. a naturally occurring variant of a polypeptide comprising the amino acid sequence according to SEQ ID NO:1, which has an in vitro and/or in vivo pentose transport function,
c. a polypeptide which is identical to the amino acid sequence according to SEQ ID NO:1 and has an in vitro and/or in vivo pentose transport function, and
d. a fragment of the polypeptide of a., b., or c., comprising a fragment of at least 100 continuous amino acids according to SEQ ID NO:1.
Preferably, the polypeptide according to the invention comprises a fragment of at least 200 or 300 continuous amino acids according to SEQ ID NO:1. Here, such a fragment is characterized in that it has an in vitro and/or in vivo pentose transport function.
In a preferred embodiment, a polypeptide according to the invention comprises a fragment of 502 amino acids which corresponds to the first 502 amino acids of SEQ ID NO:1. Such a fragment is characterized in that it has an in vitro and/or in vivo pentose transport function.
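The length criteria for fragments can be checked mechanically, as the following Python sketch suggests. It is an illustration only: the function names are hypothetical, and the sequence is a random placeholder of arbitrary length, not SEQ ID NO:1. The pentose transport function itself can of course only be established experimentally.

```python
import random

def is_contiguous_fragment(candidate: str, reference: str, min_length: int = 100) -> bool:
    """True if candidate is an uninterrupted stretch of reference
    that is at least min_length residues long."""
    return len(candidate) >= min_length and candidate in reference

def n_terminal_fragment(reference: str, length: int = 502) -> str:
    """Return the first `length` residues of the reference sequence."""
    if length > len(reference):
        raise ValueError("reference is shorter than the requested fragment")
    return reference[:length]

# Placeholder sequence for illustration only (NOT SEQ ID NO:1; length chosen arbitrarily)
rng = random.Random(0)
placeholder = "".join(rng.choice("ACDEFGHIKLMNPQRSTVWY") for _ in range(520))

print(is_contiguous_fragment(placeholder[10:210], placeholder))       # 200 residues
print(is_contiguous_fragment(placeholder[:50], placeholder))          # too short
print(len(n_terminal_fragment(placeholder)))                          # first 502 residues
```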
The polypeptide according to the invention preferably comprises a polypeptide which is at least 90%, preferably 95%, more preferably 99% identical to the amino acid sequence according to SEQ ID NO:1 and has an in vitro and/or in vivo pentose transport function.
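The text does not state how sequence identity is to be calculated. One common convention, assumed here purely for illustration, is to count identical positions in a pairwise alignment and divide by the alignment length. The Python sketch below applies that convention to two already aligned toy sequences (gaps written as "-") and reports the highest identity threshold named in the text that is met. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identical, non-gap columns divided by the total number
    of alignment columns. Other conventions use a different denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)

def identity_tier(pct: float):
    """Highest identity threshold from the text (99, 95, 90, 80, 70) that pct meets."""
    for threshold in (99, 95, 90, 80, 70):
        if pct >= threshold:
            return threshold
    return None

# Toy aligned sequences, not taken from SEQ ID NO:1
pct = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
print(pct, identity_tier(pct))
```

In practice the alignment itself would come from a dedicated alignment tool; only the counting step is shown here.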
Variants of the polypeptides according to the invention can also be those which have conservative amino acid substitutions or smaller deletions and/or insertions as long as these modifications do not substantially affect the in vitro and/or in vivo pentose transport function.
Polypeptides according to the invention can further comprise heterologous amino acid sequences. The skilled artisan can select suitable heterologous amino acid sequences depending on the application or use.
Preferably, the pentose is arabinose, in particular L-arabinose, so that a polypeptide according to the invention preferably has an in vitro and/or in vivo arabinose transport function, in particular an L-arabinose transport function.
The polypeptide according to the invention preferably originates from a yeast, preferably from Pichia, in particular Pichia stipitis. 
The problem is further solved according to the invention by providing isolated nucleic acid molecules which code for a polypeptide according to the invention.
Preferably, a nucleic acid molecule according to the invention is at least 90%, preferably 95% and more preferably 99% identical to the nucleic acid sequence according to SEQ ID NO:2 or 3.
A nucleic acid molecule according to the invention further comprises vector nucleic acid sequences, preferably expression vector sequences. Vector nucleic acid sequences are preferably selected from sequences which are comprised in the vectors of the group consisting of YEp24, p426HXT7-6HIS, p426Met25, pYES260, pYES263, pVTU260, pVTU263, pVTL260, pVTL263. For further embodiments, see FIGS. 6A-E and Example 3.
Nucleic acid molecules according to the invention can furthermore comprise nucleic acid sequences which code for further heterologous polypeptides. The skilled artisan can select suitable heterologous nucleic acid sequences which code for the further heterologous polypeptides himself, depending on the application or use. These include for example antibiotic resistance marker sequences.
Nucleic acid molecules according to the invention preferably comprise dsDNA, ssDNA, PNA, CNA, RNA or mRNA or combinations thereof.
The problem is further solved according to the invention by providing host cells which contain at least one nucleic acid molecule according to the invention. Host cells according to the invention preferably also express said at least one nucleic acid molecule according to the invention.
A host cell according to the invention is, in particular, a fungal cell and preferably a yeast cell, such as Saccharomyces species, e.g. S. cerevisiae, Kluyveromyces sp., e.g. K. lactis, Hansenula sp., e.g. H. polymorpha, Pichia sp., e.g. P. pastoris, Yarrowia sp., e.g. Y. lipolytica. 
Preferably, host cells according to the invention further contain nucleic acid molecules which code for proteins of the arabinose metabolic pathway, in particular for L-ribulokinase, L-ribulose-5-P 4-epimerase, L-arabinose-isomerase.
Preferably, these concern proteins of the bacterial arabinose metabolic pathway, in particular E. coli araB L-ribulokinase, E. coli araD L-ribulose-5-P 4-epimerase and B. subtilis araA L-arabinose-isomerase. See also FIGS. 2 and 3.
Particularly preferred host cells of this invention are cells of the strain MKY06-4P, which was deposited, under the terms of the Budapest Treaty, on 23 Aug. 2006 at the German Collection of Microorganisms and Cell Cultures, located at Inhoffenstraße 7 B, 38124 Braunschweig, Germany, under accession number DSM 18544. See also FIG. 3.
Further, the subject MKY06-4P deposit will be stored and made available to the public in accord with the provisions of the Budapest Treaty for the Deposit of Microorganisms, i.e., it will be stored with all the care necessary to keep it viable and uncontaminated for a period of at least five years after the most recent request for the furnishing of a sample of the deposit, and in any case, for a period of at least thirty (30) years after the date of deposit or for the enforceable life of any patent which may issue disclosing the culture. The depositor acknowledges the duty to replace the deposit should the depository be unable to furnish a sample when requested, due to the condition of the deposit. During pendency of this application, access to the deposit will be afforded to one determined by the Commissioner to be entitled thereto. All restrictions on the availability to the public of the subject culture deposit will be irrevocably removed upon the granting of a patent disclosing it.
A preferred host cell according to this invention is a yeast cell which was modified by the introduction and expression of the genes araA (L-arabinose-isomerase), araB (L-ribulokinase) and araD (L-ribulose-5-P-4-epimerase) and in addition over-expresses a TAL1 (transaldolase) gene, as described for example by the inventors in EP 1 499 708 B1, and in addition to this contains at least one nucleic acid molecule according to the invention.
The problem is further solved according to the invention by providing antibodies or antibody fragments, comprising an immunologically active part, which binds selectively to a polypeptide according to the invention. Methods for the generation of antibodies or antibody fragments are known in the art.
The problem is further solved according to the invention by methods for the production of a polypeptide according to the invention. Such a method comprises the cultivating of a host cell according to the invention under conditions by which a nucleic acid molecule according to the invention is expressed. General methods for the generation of polypeptides by means of cell culture are known in the art.
The problem is further solved according to the invention by a kit comprising a compound which selectively binds to a polypeptide according to the invention, if applicable with further additives and instructions for use.
The compound is preferably a pentose, such as for example arabinose, and in particular L-arabinose, or a derivative of such a pentose.
The problem is further solved according to the invention by methods for identifying a compound which binds to a polypeptide according to the invention and/or modulates its activity. Such a method comprises the following steps:
Contacting a polypeptide or a cell, which expresses a polypeptide according to the invention, with a test compound, and
Determining whether the polypeptide binds to the test compound and, if applicable
Determining whether the test compound modulates the activity of the polypeptide.
The compound is preferably a pentose, such as for example arabinose, and in particular L-arabinose, or a derivative of such a pentose.
The problem is further solved according to the invention by methods for modulating the activity of a polypeptide according to the invention. Such a method comprises contacting a polypeptide or a cell, which expresses a polypeptide according to the invention, with a compound which binds to the polypeptide in a concentration which is sufficient to modulate the activity of the polypeptide.
The compound is preferably a pentose, such as for example arabinose, and in particular L-arabinose, or a derivative of such a pentose.
The problem is further solved according to the invention by methods for the production of bioethanol. Such a method according to the invention comprises the expression of a nucleic acid molecule according to the invention in a host cell according to the invention.
The polypeptides, nucleic acid molecules and host cells according to the invention are particularly preferably used for the production of bioethanol. For preferred embodiments, reference is made to FIG. 8 and Example 4.
The polypeptides, nucleic acid molecules and host cells according to the invention are further particularly preferably used for the recombinant fermentation of pentose-containing biomaterial.
Specific genes of Pichia stipitis, which specifically increase the uptake of the pentose L-arabinose in S. cerevisiae, were isolated using a gene bank and integrated into the yeast strain MKY06-3P, which is then able to ferment the L-arabinose to ethanol. The screening of the relevant genes led to a novel specific L-arabinose transporter, the nucleotide- and protein sequence of which is available (see SEQ ID NOs: 1-4). For this, reference is also made to the examples and figures.
Due to the specificity of this novel transporter, after expression in existing ethanol-producing systems the uptake rate for L-arabinose can be improved, because on the one hand the competitive situation with respect to glucose is improved at high L-arabinose concentrations, and on the other hand the transport of L-arabinose becomes more efficient at low L-arabinose concentrations due to a high affinity.
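The reasoning above can be illustrated with Michaelis-Menten kinetics under competitive inhibition, in which a competing sugar raises the apparent Km by the factor (1 + [I]/Ki). The Python sketch below compares an unspecific hexose transporter with a specific L-arabinose transporter. All kinetic constants are hypothetical values chosen only to illustrate the argument; they are not measurements from this document, and the function names are likewise invented.

```python
import math

def uptake_rate(substrate_mM, km_mM, vmax=1.0, competitor_mM=0.0, ki_mM=math.inf):
    """Michaelis-Menten uptake rate with an optional competitive inhibitor.

    A competitor raises the apparent Km by the factor (1 + [I]/Ki);
    ki_mM = inf means the competitor does not bind at all.
    """
    apparent_km = km_mM * (1.0 + competitor_mM / ki_mM)
    return vmax * substrate_mM / (apparent_km + substrate_mM)

# Hypothetical constants (assumptions for illustration only)
UNSPECIFIC = {"km_mM": 100.0, "ki_mM": 1.5}    # hexose transporter: low affinity for arabinose, glucose competes strongly
SPECIFIC = {"km_mM": 4.0, "ki_mM": math.inf}   # specific transporter: high affinity, glucose assumed not to compete

def compare(arabinose_mM, glucose_mM):
    """Relative L-arabinose uptake rates of both transporter types."""
    return {
        name: uptake_rate(arabinose_mM, p["km_mM"], competitor_mM=glucose_mM, ki_mM=p["ki_mM"])
        for name, p in (("unspecific", UNSPECIFIC), ("specific", SPECIFIC))
    }

for label, ara, glc in (("low arabinose, no glucose", 2.0, 0.0),
                        ("high arabinose plus glucose", 100.0, 50.0)):
    rates = compare(ara, glc)
    print(label, {name: round(rate, 3) for name, rate in rates.items()})
```

With these assumed constants, the specific transporter is faster in both scenarios: at low L-arabinose because of its lower Km, and in the sugar mixture because glucose does not raise its apparent Km.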
Uptake of L-arabinose
In order that the pentose L-arabinose can be metabolized by S. cerevisiae, it must firstly be taken up by the cell. Only little is known with regard to this uptake. Hitherto, no genes are known in eukaryotes which code for specific L-arabinose transporters. All hexose transporters tested for the pentose D-xylose have a much higher affinity to D-glucose than to D-xylose. For L-arabinose, a similar situation is assumed. For all strains constructed hitherto which can utilize pentoses (D-xylose or L-arabinose), a relatively slow growth is reported. Above all, the slow and poor uptake of the pentoses is named as a reason for this (Becker and Boles, 2003; Richard et al., 2002). In fermentations in a sugar mixture, consisting of D-glucose and D-xylose or D-glucose and L-arabinose, the sugars are not converted simultaneously. Due to the high affinity of the transporters for D-glucose, D-glucose is metabolized first. A so-called diauxic shift occurs. Only after the D-glucose is exhausted is the pentose converted in a second, distinctly slower growth phase (Kuyper et al., 2005a; Kuyper et al., 2005b). The absence of specific transporters for pentoses is given as an explanation.
Novel Specific L-arabinose Transporter from P. stipitis 
For industrial applications, it would be ideal if the microorganism which was used could convert all the sugars present in the medium as far as possible simultaneously (Zaldivar et al., 2001). In order to achieve this, specific transporters for each sugar type would be of great benefit. None were known hitherto particularly for L-arabinose.
In this invention, the inventors succeeded in finding a specific L-arabinose transporter gene from the genome of P. stipitis with a test system (see examples). Genome fragments from P. stipitis are localized on the plasmids pAraT1 and pAraT7, which are responsible for a specific growth on L-arabinose but not on D-glucose, D-mannose or D-galactose medium. The observed low growth on D-galactose was not caused by the plasmids pAraT1 or pAraT7. This concerned only the weak growth of EBY.VW4000, the initial strain of MKY06, which was already reported by Wieczorke et al. (1999). The possibility that the obtained growth was caused by a genomic mutation in MKY06 was ruled out. After a selection for the loss of the plasmid of the P. stipitis gene bank by twice streaking on FOA medium, no further growth was observed after again streaking on L-arabinose medium. Therefore, the growth originated from the plasmids of the P. stipitis gene bank (see examples). It was shown that the plasmids found encode a transporter.
In a BLAST search with the recently published genome of Pichia stipitis, a 100% conformity with HGT2 was found. Due to its high homology to the high-affinity glucose transporter HGT1 of Candida albicans, HGT2 was annotated as putative high-affinity glucose transporter. When the sequence is examined with regard to the possible transmembrane domains, 12 transmembrane domains are obtained, which is typical for transporters. It is therefore surprising that it is a pentose transporter (arabinose transporter) and not a hexose transporter.
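The text does not say which method was used to examine the sequence for transmembrane domains. A classical approach, assumed here only for illustration, is a Kyte-Doolittle hydropathy profile with a sliding window of 19 residues, where stretches with a mean hydropathy above about 1.6 are taken as candidate membrane-spanning segments. The Python sketch below applies this to a constructed toy sequence, not to SEQ ID NO:1; the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy scale
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq: str, window: int = 19):
    """Mean Kyte-Doolittle hydropathy for every window of the given length."""
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

def count_hydrophobic_segments(seq: str, window: int = 19, threshold: float = 1.6) -> int:
    """Count runs of consecutive windows whose mean hydropathy exceeds the threshold."""
    count = 0
    inside = False
    for value in hydropathy_profile(seq, window):
        above = value > threshold
        if above and not inside:
            count += 1  # a new hydrophobic stretch begins
        inside = above
    return count

# Toy sequence for illustration: two hydrophobic stretches separated by hydrophilic loops
helix = "LIVALLFAVLGIVLAALLV"
loop = "KDEGSRNQTSDKEGRNSDQK"
toy = loop + helix + loop + helix + loop

print(count_hydrophobic_segments(toy))
```

For real multi-pass membrane proteins, dedicated prediction tools are normally preferred, since closely spaced helices can merge into a single hydrophobic stretch in such a simple profile.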
Furthermore, a multitude of experimental obstacles and difficulties had to be overcome in locating and providing the transporter according to the invention, which can also be seen in greater detail from the examples and figures.
In the initial strain EBY.VW4000, a total of 21 monosaccharide transporter genes had to be deleted.
Furthermore, TAL1 had to be genomically over-expressed in this strain.
The establishing of the optimum growth conditions for carrying out the screen proved to be very difficult and time-consuming.
The transporter according to the invention is the first described specific arabinose transporter of eukaryotes.
It is a heterologously expressed transporter which is at the same time functionally incorporated in the plasma membrane of S. cerevisiae, which is not necessarily to be expected.
Some reports exist with regard to the difficulties concerning heterologously expressed transporters, see on this subject Chapter 2 in the book “Transmembrane Transporters” (Boles, 2002) and the article by Wieczorke et al., 2003.
Further biomass with significant amounts of arabinose:
Type of biomass        L-arabinose [%]
Switchgrass            3.66
Large bothriochloa     3.55
Tall fescue            3.19
Robinia                3
Corn stover            2.69
Wheat straw            2.35
Sugar cane bagasse     2.06
Chinese lespedeza      1.75
Sorghum bicolor        1.65
The arabinose transporter according to the invention is also of great importance for their utilization.
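To give a sense of scale, the tabulated percentages can be converted into the mass of L-arabinose contained in one tonne of biomass. The Python sketch below assumes that the percentages are mass fractions of the dry biomass, which the table does not state explicitly; the function name is hypothetical.

```python
# L-arabinose content in percent, as listed in the table above
ARABINOSE_PERCENT = {
    "Switchgrass": 3.66,
    "Large bothriochloa": 3.55,
    "Tall fescue": 3.19,
    "Robinia": 3.0,
    "Corn stover": 2.69,
    "Wheat straw": 2.35,
    "Sugar cane bagasse": 2.06,
    "Chinese lespedeza": 1.75,
    "Sorghum bicolor": 1.65,
}

def arabinose_kg_per_tonne(percent: float) -> float:
    """kg of L-arabinose per 1000 kg of biomass.

    Assumption: the percentage is a mass fraction of the dry biomass.
    """
    return percent / 100.0 * 1000.0

for biomass, percent in sorted(ARABINOSE_PERCENT.items(), key=lambda item: -item[1]):
    print(f"{biomass:20s} {arabinose_kg_per_tonne(percent):5.1f} kg per tonne")
```

Under this assumption, a tonne of switchgrass would contain about 36.6 kg of L-arabinose.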
Possibilities for use of a functional and at the same time specific arabinose transporter in the yeast S. cerevisiae are on the one hand the production of bioethanol and on the other hand the production of high-grade precursor products for further chemical syntheses.
The following list originates from the study “Top Value Added Chemicals From Biomass”. Here, 30 chemicals were categorized as being particularly valuable, which can be produced from biomass.
Number of C atoms    Top 30 Candidates
1                    hydrogen, carbon monoxide
2                    (none listed)
3                    glycerol, 3-hydroxypropionic acid, lactic acid, malonic acid, propionic acid, serine
4                    acetoin, aspartic acid, fumaric acid, 3-hydroxybutyrolactone, malic acid, succinic acid, threonine
5                    arabitol, furfural, glutamic acid, itaconic acid, levulinic acid, proline, xylitol, xylonic acid
6                    aconitic acid, citrate, 2,5-furandicarboxylic acid, glucaric acid, lysine, levoglucosan, sorbitol
As soon as these chemicals are produced from lignocelluloses by bioconversion (e.g. fermentations with yeasts), it is important to have a specific transporter for the hemicellulose sugar arabinose.
The present invention is further clarified in the following figures, sequences and examples, without however being restricted thereto. The cited references are fully included herewith by reference. In the sequences and figures there are shown:
SEQ ID NO: 1: the protein sequence encoded by the open reading frame (ORF) of AraT,
SEQ ID NO: 2: the sequence of the open reading frame (ORF) of AraT,
SEQ ID NO: 3: the sequence of the open reading frame (ORF) of AraT in a codon-optimized form, and
SEQ ID NO: 4: the sequence of the open reading frame (ORF) of AraT with 500 promoter, ORF and 300 terminator.