The invention relates to recombinant expression of terpenoid synthase enzymes and geranylgeranyl diphosphate synthase (GGPPS) enzymes in cells and the production of diterpenoids.
1. A method comprising:
recombinantly expressing a terpenoid synthase enzyme and a geranylgeranyl diphosphate synthase (GGPPS) enzyme in a cell that overexpresses one or more components of the non-mevalonate (MEP) pathway.
2. The method of
3.-10. (canceled)
11. The method of
12.-14. (canceled)
15. The method of
16.-18. (canceled)
19. The method of
20. The method of
21. (canceled)
22. The method of
23.-24. (canceled)
25. The method of
26. The method of
27. (canceled)
28. The method of
29. (canceled)
30. The method of
31.-34. (canceled)
35. A cell that overexpresses one or more components of the non-mevalonate (MEP) pathway, and that recombinantly expresses a terpenoid synthase enzyme and a geranylgeranyl diphosphate synthase (GGPPS) enzyme.
36. The cell of
37.-44. (canceled)
45. The cell of
46.-48. (canceled)
49. The cell of
50.-52. (canceled)
53. The cell of
54. The cell of
55. (canceled)
56. The cell of
57.-58. (canceled)
59. The cell of
60. (canceled)
61. The cell of
62.-64. (canceled)
65. A cell that recombinantly expresses a levopimaradiene synthase (LPS) enzyme, wherein the LPS enzyme contains a mutation at one or more of the residues selected from the group consisting of: M593, C618, A620, L696, Y700, K723, A729, V731, N838, and I855, corresponding to residues within the full-length, wild-type,
66.-77. (canceled)
78. A cell that recombinantly expresses a geranylgeranyl diphosphate synthase (GGPPS) enzyme, wherein the GGPPS enzyme contains a mutation at residue S239 and/or G295, corresponding to residues within the full-length, wild-type,
79.-104. (canceled)
This application claims the benefit under 35 U.S.C. §120 and 35 U.S.C. §365(c) of U.S. application Ser. No. 12/615,985, entitled “Methods for Microbial Production of Terpenoids,” filed on Nov. 10, 2009, the entire disclosure of which is incorporated by reference herein in its entirety.
The invention relates to the production of one or more terpenoids through recombinant gene expression.
The pharmaceutically important diterpene lactone ginkgolides are products of secondary metabolism in The success of fermentation technology to produce many fine and commodity chemicals has inspired the heterologous production of several plant terpenoids using microbial hosts9-13. In plants, secondary metabolite pathways are genetically programmed and regulated (transcriptionally and post-translationally) so that these chemicals are only synthesized as needed14, 15. A particular branch pathway is not designed to overproduce a certain metabolite, but rather, so that the overall metabolism works in concert. A successful microbial production platform, on the other hand, requires that an imported pathway generate a high production yield. Metabolic engineering to increase flux through an engineered plant-derived pathway has been shown to improve terpenoid production12, 13, 16. The extent of product improvement through metabolic engineering is ultimately determined by the biosynthetic capacity of the heterologous pathway in the intracellular environment of the microbial host17.
Described herein is a novel microbial platform for producing terpenoids and diterpenoids such as levopimaradiene, the key diterpenoid precursor of the ginkgolides. This system was constructed by “tuning” a heterologous pathway to confer overproduction in a microorganism.
Codon-optimized Aspects of the invention relate to methods that include recombinantly expressing a terpenoid synthase enzyme and a geranylgeranyl diphosphate synthase (GGPPS) enzyme in a cell that overexpresses one or more components of the non-mevalonate (MEP) pathway. In some embodiments, the cell is a bacterial cell. In certain embodiments, the cell is an In some embodiments, the terpenoid synthase enzyme is a diterpenoid synthase enzyme such as a levopimaradiene synthase (LPS) enzyme. In some embodiments, the LPS enzyme is a In some embodiments, the GGPPS enzyme is a In some embodiments, the LPS enzyme contains the mutation M593I and/or Y700F, corresponding to residues within the full-length wild-type The gene encoding for the terpenoid synthase enzyme and/or the gene encoding for the geranylgeranyl diphosphate synthase (GGPPS) enzyme can be expressed from one or more plasmids and/or can be incorporated into the genome of the cell. In some embodiments, the terpenoid synthase enzyme and/or the geranylgeranyl diphosphate synthase (GGPPS) enzyme is codon-optimized. Aspects of the invention further include methods for culturing cells associated with the invention to produce a terpenoid. The terpenoids can have one or more cyclic structures. In some embodiments, the terpenoid is a diterpenoid such as levopimaradiene. Methods can further include recovering the terpenoid from the cell culture. In some embodiments, the terpenoid is recovered from the gas phase, while in other embodiments, an organic layer is added to the cell culture, and the terpenoid is recovered from the organic layer. In some embodiments, the cell produces a Taxol, a gibberellin, and/or a steviol glycoside. Aspects of the invention relate to cells that overexpress one or more components of the non-mevalonate (MEP) pathway, and that recombinantly express a terpenoid synthase enzyme and a geranylgeranyl diphosphate synthase (GGPPS) enzyme. In some embodiments, the cell is a bacterial cell. 
In certain embodiments, the cell is an In some embodiments, the terpenoid synthase enzyme is a diterpenoid synthase enzyme such as a levopimaradiene synthase (LPS) enzyme. In some embodiments, the LPS enzyme is a In some embodiments, the GGPPS enzyme is a In some embodiments, the LPS enzyme contains the mutation M593I and/or Y700F, corresponding to residues within the full-length wild-type The gene encoding for the terpenoid synthase enzyme and/or the gene encoding for the geranylgeranyl diphosphate synthase (GGPPS) enzyme can be expressed from one or more plasmids and/or can be incorporated into the genome of the cell. In some embodiments, the terpenoid synthase enzyme and/or the geranylgeranyl diphosphate synthase (GGPPS) enzyme is codon-optimized. In some embodiments, cells associated with the invention produce a terpenoid. The terpenoid can have one or more cyclic structures. In certain embodiments, the terpenoid is a diterpenoid such as levopimaradiene. In some embodiments, the cell produces a Taxol, a gibberellin, and/or a steviol glycoside. Aspects of the invention relate to cells that recombinantly express a levopimaradiene synthase (LPS) enzyme, wherein the LPS enzyme contains a mutation at one or more of the residues selected from the group consisting of: M593, C618, A620, L696, Y700, K723, A729, V731, N838, and I855, corresponding to residues within the full-length, wild-type, In some embodiments, the cell is a bacterial cell. In certain embodiments, the cell is an Aspects of the invention relate to cells that recombinantly express a geranylgeranyl diphosphate synthase (GGPPS) enzyme, wherein the GGPPS enzyme contains a mutation at residue S239 and/or G295, corresponding to residues within the full-length, wild-type, In some embodiments, the cell is a bacterial cell. 
In certain embodiments, the cell is an Aspects of the invention relate to isolated levopimaradiene synthase (LPS) polypeptides that contain a mutation at one or more of the residues selected from the group consisting of: M593, C618, A620, L696, Y700, K723, A729, V731, N838, and I855, corresponding to residues within the full-length, wild-type, Aspects of the invention relate to isolated geranylgeranyl diphosphate synthase (GGPPS) polypeptides, wherein the GGPPS polypeptide contains a mutation at residue S239 and/or G295, corresponding to residues within the full-length, wild-type, These and other aspects of the invention, as well as various embodiments thereof, will become more apparent in reference to the drawings and detailed description of the invention. The accompanying drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. In the drawings: Aspects of the invention relate to methods and compositions for the production of one or more terpenoids through recombinant gene expression in cells. Described herein is a novel microbial platform in which a terpenoid synthase enzyme, such as levopimaradiene synthase (LPS), and a geranylgeranyl diphosphate synthase (GGPPS) enzyme are recombinantly expressed in cells. Significantly, mutations in the LPS and GGPPS enzymes have been identified herein that lead to increased production of diterpenoids. This novel microbial platform represents an unexpectedly efficient new system for producing diterpenoids such as levopimaradiene, which has widespread therapeutic applications. This invention is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the drawings. 
The invention is capable of other embodiments and of being practiced or of being carried out in various ways. Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of “including,” “comprising,” or “having,” “containing,” “involving,” and variations thereof herein, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items. Aspects of the invention relate to the production of terpenoids. As used herein, a terpenoid, also referred to as an isoprenoid, is an organic chemical derived from a five-carbon isoprene unit. Several non-limiting examples of terpenoids, classified based on the number of isoprene units that they contain, include: hemiterpenoids (1 isoprene unit), monoterpenoids (2 isoprene units), sesquiterpenoids (3 isoprene units), diterpenoids (4 isoprene units), sesterterpenoids (5 isoprene units), triterpenoids (6 isoprene units), tetraterpenoids (8 isoprene units), and polyterpenoids with a larger number of isoprene units. Terpenoids are synthesized through at least two different metabolic pathways: the mevalonic acid pathway and the MEP (2-C-methyl-D-erythritol 4-phosphate) pathway, also called the MEP/DOXP (2-C-methyl-D-erythritol 4-phosphate/1-deoxy-D-xylulose 5-phosphate) pathway, the non-mevalonate pathway and the mevalonic acid-independent pathway. Described herein are methods for producing terpenoids, such as diterpenoids, in cells through recombinant gene expression of a terpenoid synthase (also referred to as terpene cyclase) enzyme, and a geranylgeranyl diphosphate synthase (GGPPS) enzyme. In some embodiments, a terpenoid synthase enzyme is a diterpenoid synthase enzyme. 
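The classification by isoprene-unit count given above can be restated as a small lookup. The following is a minimal illustrative sketch, not part of the disclosed methods; the function names are hypothetical, and the carbon-count helper assumes an unmodified backbone built only from five-carbon isoprene units.

```python
# Illustrative lookup: terpenoid class from the number of isoprene (C5) units.
# Class names and unit counts follow the list in the text; function names are
# hypothetical and chosen only for this sketch.
ISOPRENE_CARBONS = 5

TERPENOID_CLASSES = {
    1: "hemiterpenoid",
    2: "monoterpenoid",
    3: "sesquiterpenoid",
    4: "diterpenoid",
    5: "sesterterpenoid",
    6: "triterpenoid",
    8: "tetraterpenoid",
}


def classify_by_units(units: int) -> str:
    """Return the terpenoid class for a given number of isoprene units."""
    if units < 1:
        raise ValueError("a terpenoid contains at least one isoprene unit")
    if units in TERPENOID_CLASSES:
        return TERPENOID_CLASSES[units]
    if units > 8:
        return "polyterpenoid"
    # Seven units is not named in the list above, so it is reported as such.
    return "unlisted"


def classify_by_carbons(carbons: int) -> str:
    """Classify an unmodified backbone by carbon count (assumes 5 C per unit)."""
    if carbons % ISOPRENE_CARBONS != 0:
        raise ValueError("carbon count is not a multiple of 5")
    return classify_by_units(carbons // ISOPRENE_CARBONS)


if __name__ == "__main__":
    # Geranylgeranyl diphosphate carries a C20 chain, i.e. four isoprene units.
    print(classify_by_carbons(20))
```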
Several non-limiting examples of diterpenoid synthase enzymes include casbene synthase54, taxadiene synthase55, levopimaradiene synthase49, abietadiene synthase52, isopimaradiene synthase52, ent-copalyl diphosphate synthase56, syn-stemar-13-ene synthase56, syn-stemod-13(17)-ene synthase56, syn-pimara-7,15-diene synthase56, ent-sandaracopimaradiene synthase56, ent-cassa-12,15-diene synthase56, ent-pimara-8(14),15-diene synthase57, ent-kaur-15-ene synthase57, ent-kaur-16-ene synthase57, aphidicolan-16β-ol synthase57, phyllocladan-16α-ol synthase57, fusicocca-2,10(14)-diene synthase57 and terpentetriene cyclase58. In some embodiments, the diterpenoid synthase enzyme is levopimaradiene synthase49 (LPS), involved in production of levopimaradiene. In engineered systems described herein, levopimaradiene synthesis can be accompanied by production of one or more other diterpenoids such as abietadiene, sandaracopimaradiene, and neoabietadiene (trace) isomers ( According to aspects of the invention, cell(s) that recombinantly express one or more enzymes associated with the invention, and the use of such cells in producing diterpenoids such as levopimaradiene are provided. It should be appreciated that the genes encoding for the enzymes associated with the invention can be obtained from a variety of sources. In some embodiments, the gene encoding for LPS is a plant gene. For example, the gene encoding for LPS can be from a species of As one of ordinary skill in the art would be aware, homologous genes for these enzymes can be obtained from other species and can be identified by homology searches, for example through a protein BLAST search, available at the National Center for Biotechnology Information (NCBI) internet site (www.ncbi.nlm.nih.gov). Genes associated with the invention can be cloned, for example by PCR amplification and/or restriction digestion, from DNA from any source of DNA which contains the given gene. 
In some embodiments, a gene associated with the invention is synthetic. Any means of obtaining a gene encoding for an enzyme associated with the invention is compatible with the instant invention. Aspects of the invention include strategies to optimize production of a diterpenoid from a cell. Optimized production of a diterpenoid refers to producing a higher amount of a diterpenoid following pursuit of an optimization strategy than would be achieved in the absence of such a strategy. Optimization of production of a diterpenoid can involve modifying a gene encoding for an enzyme before it is recombinantly expressed in a cell. In some embodiments, such a modification involves codon optimization for expression in a bacterial cell. Codon usages for a variety of organisms can be accessed in the Codon Usage Database (www.kazusa.or.jp/codon/). Codon optimization, including identification of optimal codons for a variety of organisms, and methods for achieving codon optimization, are familiar to one of ordinary skill in the art, and can be achieved using standard methods. In some embodiments, modifying a gene encoding for an enzyme before it is recombinantly expressed in a cell involves making one or more mutations in the gene encoding for the enzyme before it is recombinantly expressed in a cell. For example, a mutation can involve a substitution or deletion of a single nucleotide or multiple nucleotides. In some embodiments, a mutation of one or more nucleotides in a gene encoding for an enzyme will result in a mutation in the enzyme, such as a substitution or deletion of one or more amino acids. In some embodiments “rational design” is involved in constructing specific mutations in enzymes. As used herein, “rational design” refers to incorporating knowledge of the enzyme, or related enzymes, such as its three dimensional structure, its active site(s), its substrate(s) and/or the interaction between the enzyme and substrate, into the design of the specific mutation. 
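The codon-optimization step mentioned earlier in this passage can be illustrated with the simplest possible strategy: for each amino acid, choose the codon used most frequently by the host. The following is a minimal sketch under stated assumptions and is not asserted to be the method used for the genes described herein. The function names are hypothetical, and the usage frequencies in `TOY_USAGE` are placeholder values invented for illustration, not values taken from the Codon Usage Database; a real table for the host organism would be substituted.

```python
from typing import Dict

# Codon usage table: amino acid (one-letter code) -> {codon: relative frequency}.
CodonUsage = Dict[str, Dict[str, float]]


def most_frequent_codons(usage: CodonUsage) -> Dict[str, str]:
    """For each amino acid, pick the codon with the highest usage frequency."""
    return {aa: max(codons, key=codons.get) for aa, codons in usage.items()}


def back_translate(protein: str, usage: CodonUsage) -> str:
    """Back-translate a protein using the most frequent codon per residue."""
    best = most_frequent_codons(usage)
    missing = sorted(set(protein) - set(best))
    if missing:
        raise KeyError(f"no codon usage data for residues: {missing}")
    return "".join(best[aa] for aa in protein)


# PLACEHOLDER frequencies (hypothetical, for illustration only). The codon to
# amino acid assignments follow the standard genetic code.
TOY_USAGE: CodonUsage = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.7, "AAG": 0.3},
    "L": {"CTG": 0.5, "CTT": 0.2, "TTA": 0.3},
}

if __name__ == "__main__":
    print(back_translate("MKL", TOY_USAGE))
```

Choosing only the most frequent codon ignores factors such as mRNA secondary structure and repeated sequence; it is shown here only because it is the shortest form of the idea.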
Based on a rational design approach, mutations can be created in an enzyme which can then be screened for increased production of a diterpenoid. For example, as described in Example 1, rational design was implemented in creating specific mutations in LPS. Although the crystal structure of LPS is not available, the tertiary folds of other related terpene cyclase enzymes are similar. The structure of one such enzyme, 5-epi-aristolochene synthase30 (EAS), was used to examine the second active site of LPS. This process of constructing an atomic-resolution model of one protein (e.g., LPS) from its amino acid sequence and a three-dimensional structure of a related homologous protein (e.g., EAS) is termed “homology modeling”. Mutations in the second active site within other terpene cyclases impact their plasticity26, 31-33. In the second active site of an LPS-type enzyme, the bicyclic (+)-copalyl diphosphate (CPP) intermediate (derived from the deprotonation of GGPP in the first active site) undergoes a diphosphate-ionization cyclization. The resulting C8-sandaracopimarenyl cation intermediate is further deprotonated at two alternative sites to release isopimaradiene or sandaracopimaradiene end products. This intermediate can also undergo intramolecular proton transfer and 1,2-methyl migration to yield abietenyl cation. Subsequent deprotonation of abietenyl cation at four possible sites then produces abietadiene, levopimaradiene, neoabietadiene, and palustradiene28, 29. Based on the structural data, mutations in LPS were generated in fifteen residues within a 10 Å solvation layer of the LPS model: M593, C618, L619, A620, L696, Y700, K723, A727, A729, V731, N769, E777, N838, G854 and I855 (See In some embodiments, the LPS enzyme contains a mutation in residue M593, alone or in combination with one or more other mutations. For example, the mutation can be M593I or a substitution with another hydrophobic residue such as leucine (M593L). 
In certain embodiments, the mutation in M593 can be M593C, M593S or M593T. Based on structural data, Met593 is located at the posterior of the binding pocket of LPS. Without wishing to be bound by any theory, hydrophobic amino acid substitutions at Met593 may improve the diterpenoid yield by disrupting hydrogen bonding at the end of the binding pocket, thus increasing the flexibility of the cavity to better fit the CPP substrate. Additionally, substitutions with large and/or bulky amino acids at Met593 may obstruct the cyclization pocket, reducing diterpenoid yield. Thus, in some embodiments, hydrophobic and/or small residues are preferred for substitution at Met593. In some embodiments, the LPS enzyme contains a mutation in residue Y700, alone or in combination with one or more other mutations. For example, the mutation can be Y700H, Y700F, Y700M or Y700W. Based on structural data, Y700 is positioned at the entrance of the binding pocket of the enzyme, in close vicinity of a DDXXD magnesium binding motif. Without wishing to be bound by any theory, absence of a hydroxyl group in amino acids that are similar to tyrosine may allow the repositioning of the magnesium closer to the aspartate-rich region, potentially increasing reaction efficiency by improving the chelation of the diphosphate group. In some embodiments, the LPS enzyme contains a mutation in residue A620, alone or in combination with one or more other mutations. In some embodiments, the mutation involves a substitution with a residue that is small and/or hydrophilic. In certain embodiments, the mutation can be A620C, A620G, A620S or A620T. The LPS enzyme can contain one mutation or multiple mutations. In some embodiments, the LPS enzyme contains a mutation in M593 and a mutation in Y700. For example, the LPS enzyme can contain the following combinations of mutations: M593I and Y700F, M593I and Y700A, or M593I and Y700C. 
The LPS enzyme containing these mutations can also contain one or more other mutations. In some embodiments, random mutagenesis is used for constructing specific mutations in enzymes. As described in Example 1, improved diterpenoid production was achieved in part through random mutagenesis of the GGPPS enzyme and screening for mutations within the enzyme that led to increased diterpenoid production. In some embodiments, the GGPPS enzyme has one or more of the following mutations: A162V, G140C, L182M, F218Y, D160G, C184S, K367R, A151T, M185I, D264Y, E368D, C184R, L331I, G262V, R365S, A114D, S239C, G295D, I276V, K343N, P183S, I172T, D267G, I149V, T234I, E153D and T259A. In some embodiments, the GGPPS enzyme has a mutation in residue S239 and/or residue G295. In certain embodiments, the GGPPS enzyme has the mutation S239C and/or G295D. Mutations in GGPPS that had beneficial effects on diterpenoid production were frequently found to be located between two highly conserved aspartate-rich domains: DDXXXXD and DDXXD ( A combination of a mutant LPS enzyme and a mutant GGPPS enzyme can be expressed in a cell to provide increased production of diterpenoid. In some embodiments, the cell expresses an LPS enzyme containing the mutations M593I and/or Y700F, and a GGPPS enzyme containing the mutations S239C and/or G295D. It should be appreciated that the choice of mutations will in some instances depend on the desired end product. For example, some mutations or combinations of mutations may be selected because they lead to an overall increase in diterpenoid production, while other mutations or combinations of mutations may be selected because they lead to increased production of one or more specific diterpenoids, such as levopimaradiene, relative to production of other diterpenoids. For example, a cell expressing an LPS enzyme containing the mutation M593I and either Y700A or Y700C produced a selectivity for levopimaradiene of approximately 97%. 
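The substitution labels used above (for example M593I or S239C) follow the common convention of wild-type residue, one-based position, and replacement residue. The following is a minimal sketch of how such labels could be parsed and applied to a sequence while checking that the stated wild-type residue is actually present at that position. The function names are hypothetical, and the four-residue sequence in the usage example is a toy placeholder, not the LPS or GGPPS sequence.

```python
import re
from typing import Iterable, Tuple

# Pattern for a single amino acid substitution such as "M593I":
# wild-type residue, one-based position, replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Tuple[str, int, str]:
    """Split a label like 'M593I' into ('M', 593, 'I')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_substitutions(sequence: str, labels: Iterable[str]) -> str:
    """Apply substitutions to a sequence, verifying each wild-type residue."""
    residues = list(sequence)
    for label in labels:
        wild_type, position, replacement = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise IndexError(f"{label}: position outside the sequence")
        # Compare against the original sequence, not the partially mutated one.
        if sequence[position - 1] != wild_type:
            raise ValueError(
                f"{label}: expected {wild_type} at position {position}, "
                f"found {sequence[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    toy_sequence = "MSYG"  # hypothetical placeholder sequence
    print(parse_substitution("M593I"))
    print(apply_substitutions(toy_sequence, ["S2C", "G4D"]))
```

The wild-type check is the useful part in practice: residue numbers in the text refer to the full-length, wild-type enzyme, so a mismatch usually signals that a truncated or differently numbered sequence is being used.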
A cell expressing both an LPS enzyme containing the mutations M593I and Y700F and a GGPPS enzyme containing the mutations S239C and G295D was found to improve titer of levopimaradiene by approximately 19-fold over wild-type. In some embodiments, it may be advantageous to use a cell that has been optimized for production of a diterpenoid. For example, in some embodiments, a cell that overexpresses one or more components of the non-mevalonate (MEP) pathway is used, at least in part, to amplify isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), substrates of GGPPS. In some embodiments, overexpression of one or more components of the non-mevalonate (MEP) pathway is achieved by increasing the copy number of one or more components of the non-mevalonate (MEP) pathway. For example, copy numbers of components at rate-limiting steps in the MEP pathway, such as dxs, ispD, ispF, and idi, can be amplified, such as by additional episomal expression. In some embodiments, screening for mutations in components of the MEP pathway, or components of other pathways, that lead to enhanced production of a diterpenoid may be conducted through a random mutagenesis screen, or through screening of known mutations. In some embodiments, shotgun cloning of genomic fragments could be used to identify genomic regions that lead to an increase in production of a diterpenoid, through screening cells or organisms that have these fragments for increased production of a diterpenoid. In some cases one or more mutations may be combined in the same cell or organism. In some embodiments, production of a diterpenoid in a cell can be increased through manipulation of enzymes that act in the same pathway as the enzymes associated with the invention. For example, in some embodiments it may be advantageous to increase expression of an enzyme or other factor that acts upstream of a target enzyme such as an enzyme associated with the invention. 
This could be achieved by over-expressing the upstream factor using any standard method. A further strategy for optimization of protein expression is to increase expression levels of one or more genes associated with the invention through selection of appropriate promoters and ribosome binding sites. In some embodiments, this may include the selection of high-copy number plasmids, or low or medium-copy number plasmids. The step of transcription termination can also be targeted for regulation of gene expression, through the introduction or elimination of structures such as stem-loops. The invention also encompasses isolated LPS and GGPPS polypeptides containing mutations in residues described above, and isolated nucleic acid molecules encoding such polypeptides. As used herein, the terms “protein” and “polypeptide” are used interchangeably and thus the term polypeptide may be used to refer to a full-length polypeptide and may also be used to refer to a fragment of a full-length polypeptide. As used herein with respect to polypeptides, proteins, or fragments thereof, “isolated” means separated from its native environment and present in sufficient quantity to permit its identification or use. Isolated, when referring to a protein or polypeptide, means, for example: (i) selectively produced by expression cloning or (ii) purified as by chromatography or electrophoresis. Isolated proteins or polypeptides may be, but need not be, substantially pure. The term “substantially pure” means that the proteins or polypeptides are essentially free of other substances with which they may be found in production, nature, or in vivo systems to an extent practical and appropriate for their intended use. Substantially pure polypeptides may be obtained naturally or produced using methods described herein and may be purified with techniques well known in the art. 
Because an isolated protein may be admixed with other components in a preparation, the protein may comprise only a small percentage by weight of the preparation. The protein is nonetheless isolated in that it has been separated from the substances with which it may be associated in living systems, i.e. isolated from other proteins. Isolated LPS polypeptides can contain mutations in one or more of the following residues: M593, C618, L619, A620, L696, Y700, K723, A727, A729, V731, N769, E777, N838, G854 and I855 (See Non-limiting examples of isolated In some embodiments, the isolated LPS polypeptide contains a mutation in residue M593, alone or in combination with one or more other mutations. For example, the mutation can be M593I or a substitution with another hydrophobic residue such as leucine (M593L). In certain embodiments, the mutation in M593 can be M593C, M593S or M593T. In some embodiments, the isolated LPS polypeptide contains a mutation in residue Y700, alone or in combination with one or more other mutations. For example, the mutation can be Y700H, Y700F, Y700M or Y700W. In some embodiments, the isolated LPS polypeptide contains a mutation in residue A620, alone or in combination with one or more other mutations. In some embodiments, the mutation involves a substitution with a residue that is small and/or hydrophilic. In certain embodiments, the mutation can be A620C, A620G, A620S or A620T. The isolated LPS polypeptide can contain one mutation or multiple mutations. In some embodiments, the isolated LPS polypeptide contains a mutation in M593 and a mutation in Y700. For example the isolated LPS polypeptide can contain the following combinations of mutations: M593I and Y700F, M593I and Y700A, or M593I and Y700C. The isolated LPS polypeptide containing these mutations can also contain one or more other mutations. 
Isolated GGPPS polypeptides can contain mutations in one or more of the following residues: A162, G140, L182, F218, D160, C184, K367, A151, M185, D264, E368, C184, L331, G262, R365, A114, S239, G295, I276, K343, P183, I172, D267, I149, T234, E153 and T259. Amino acid residue numbers indicated herein for GGPPS are based on amino acid numbers in the full-length, wild-type Non-limiting examples of isolated In some embodiments, the isolated GGPPS polypeptide contains a mutation in residue S239 and/or residue G295. In certain embodiments, the isolated GGPPS polypeptide has the mutation S239C and/or G295D. The isolated GGPPS polypeptide containing these mutations can also contain one or more other mutations. The invention also encompasses nucleic acids that encode for any of the polypeptides described herein, libraries that contain any of the nucleic acids and/or polypeptides described herein, and compositions that contain any of the nucleic acids and/or polypeptides described herein. It should be appreciated that libraries containing nucleic acids or proteins can be generated using methods known in the art. A library containing nucleic acids can contain fragments of genes and/or full-length genes and can contain wild-type sequences and mutated sequences. A library containing proteins can contain fragments of proteins and/or full-length proteins and can contain wild-type sequences and mutated sequences. It should be appreciated that the invention encompasses codon-optimized forms of any of the nucleic acid and protein sequences described herein. The invention encompasses any type of cell that recombinantly expresses genes associated with the invention, including prokaryotic and eukaryotic cells. In some embodiments the cell is a bacterial cell, such as In some embodiments, one or more of the genes associated with the invention is expressed in a recombinant expression vector. 
As used herein, a “vector” may be any of a number of nucleic acids into which a desired sequence or sequences may be inserted by restriction and ligation for transport between different genetic environments or for expression in a host cell. Vectors are typically composed of DNA, although RNA vectors are also available. Vectors include, but are not limited to: plasmids, fosmids, phagemids, virus genomes and artificial chromosomes. A cloning vector is one which is able to replicate autonomously or integrated in the genome in a host cell, and which is further characterized by one or more endonuclease restriction sites at which the vector may be cut in a determinable fashion and into which a desired DNA sequence may be ligated such that the new recombinant vector retains its ability to replicate in the host cell. In the case of plasmids, replication of the desired sequence may occur many times as the plasmid increases in copy number within the host cell such as a host bacterium or just a single time per host before the host reproduces by mitosis. In the case of phage, replication may occur actively during a lytic phase or passively during a lysogenic phase. An expression vector is one into which a desired DNA sequence may be inserted by restriction and ligation such that it is operably joined to regulatory sequences and may be expressed as an RNA transcript. Vectors may further contain one or more marker sequences suitable for use in the identification of cells which have or have not been transformed or transfected with the vector. 
Markers include, for example, genes encoding proteins which increase or decrease either resistance or sensitivity to antibiotics or other compounds, genes which encode enzymes whose activities are detectable by standard assays known in the art (e.g., β-galactosidase, luciferase or alkaline phosphatase), and genes which visibly affect the phenotype of transformed or transfected cells, hosts, colonies or plaques (e.g., green fluorescent protein). Preferred vectors are those capable of autonomous replication and expression of the structural gene products present in the DNA segments to which they are operably joined. As used herein, a coding sequence and regulatory sequences are said to be “operably” joined when they are covalently linked in such a way as to place the expression or transcription of the coding sequence under the influence or control of the regulatory sequences. If it is desired that the coding sequences be translated into a functional protein, two DNA sequences are said to be operably joined if induction of a promoter in the 5′ regulatory sequences results in the transcription of the coding sequence and if the nature of the linkage between the two DNA sequences does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the ability of the promoter region to direct the transcription of the coding sequences, or (3) interfere with the ability of the corresponding RNA transcript to be translated into a protein. Thus, a promoter region would be operably joined to a coding sequence if the promoter region were capable of effecting transcription of that DNA sequence such that the resulting transcript can be translated into the desired protein or polypeptide. When the nucleic acid molecule that encodes any of the enzymes of the claimed invention is expressed in a cell, a variety of transcription control sequences (e.g., promoter/enhancer sequences) can be used to direct its expression. 
The promoter can be a native promoter, i.e., the promoter of the gene in its endogenous context, which provides normal regulation of expression of the gene. In some embodiments the promoter can be constitutive, i.e., the promoter is unregulated allowing for continual transcription of its associated gene. A variety of conditional promoters also can be used, such as promoters controlled by the presence or absence of a molecule. The precise nature of the regulatory sequences needed for gene expression may vary between species or cell types, but shall in general include, as necessary, 5′ non-transcribed and 5′ non-translated sequences involved with the initiation of transcription and translation respectively, such as a TATA box, capping sequence, CAAT sequence, and the like. In particular, such 5′ non-transcribed regulatory sequences will include a promoter region which includes a promoter sequence for transcriptional control of the operably joined gene. Regulatory sequences may also include enhancer sequences or upstream activator sequences as desired. The vectors of the invention may optionally include 5′ leader or signal sequences. The choice and design of an appropriate vector is within the ability and discretion of one of ordinary skill in the art. Expression vectors containing all the necessary elements for expression are commercially available and known to those skilled in the art. See, e.g., Sambrook et al., A nucleic acid molecule that encodes the enzyme of the claimed invention can be introduced into a cell or cells using methods and techniques that are standard in the art. For example, nucleic acid molecules can be introduced by standard protocols such as transformation including chemical transformation and electroporation, transduction, particle bombardment, etc. Expressing the nucleic acid molecule encoding the enzymes of the claimed invention also may be accomplished by integrating the nucleic acid molecule into the genome. 
In some embodiments one or more genes associated with the invention are expressed recombinantly in a bacterial cell. Bacterial cells according to the invention can be cultured in media of any type (rich or minimal) and any composition. As would be understood by one of ordinary skill in the art, routine optimization would allow for use of a variety of types of media. The selected medium can be supplemented with various additional components. Some non-limiting examples of supplemental components include glucose, antibiotics, IPTG for gene induction, ATCC Trace Mineral Supplement, and glycolate. Similarly, other aspects of the medium and the growth conditions of the cells of the invention may be optimized through routine experimentation. For example, pH and temperature are non-limiting examples of factors which can be optimized. In some embodiments, factors such as choice of media, media supplements, and temperature can influence production levels of terpenoids, such as diterpenoids. In some embodiments the concentration and amount of a supplemental component may be optimized. In some embodiments, how often the media is supplemented with one or more supplemental components, and the amount of time that the media is cultured before harvesting a terpenoid, such as a diterpenoid, are optimized. According to aspects of the invention, high titers of a diterpenoid such as levopimaradiene are produced through the recombinant expression of genes associated with the invention in a cell. As used herein “high titer” refers to a titer in the milligrams per liter (mg L−1) scale. The titer produced for a given product will be influenced by multiple factors including choice of media. In some embodiments the total diterpenoid titer is at least 10 mg L−1. For example the titer may be 10, 20, 50, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900 or more than 900 mg L−1, including any intermediate values. 
In some embodiments, a cell that expresses an LPS enzyme containing the mutations M593I and Y700F, and a GGPPS enzyme containing the mutations S239C and G295D can produce a total diterpenoid titer of approximately 800 mg L−1 in approximately 168 hours. The liquid cultures used to grow cells associated with the invention can be housed in any of the culture vessels known and used in the art. In some embodiments large scale production in an aerated reaction vessel such as a stirred tank reactor can be used to produce large quantities of terpenoids, such as diterpenoids, that can be recovered from the cell culture. In some embodiments, the terpenoid is recovered from the gas phase of the cell culture, for example by adding an organic layer such as dodecane to the cell culture and recovering the terpenoid from the organic layer. Diterpenoids, such as levopimaradiene, produced through methods described herein have widespread applications. Levopimaradiene is a key diterpenoid precursor of ginkgolides which can be administered for a variety of therapeutic purposes including improving vascular function, inhibiting thrombosis and embolism, neuroprotective functions, and cancer treatment. Terpenoid pathways also lead to compounds used in flavors, cosmetics, and biofuels. Furthermore, methods described herein to search for mutations in LPS can be applied to other diterpenoid synthases such as taxadiene synthase. GGPPS mutations described herein can also be applied to synthesis of precursors for other plant diterpenoids including cancer therapeutics such as Taxol, plant growth hormones such as gibberellins and food products such as the natural sweetener steviol glycoside. The engineering of secondary metabolite biosynthesis in heterologous microorganisms is a promising approach to produce drug precursors in a scalable manner. However, secondary metabolite pathways are typically low-yielding and produce side products. 
Herein, these limitations were addressed by harnessing the evolvability of a plant-derived terpenoid pathway to efficiently synthesize levopimaradiene, the gateway precursor of the bioactive ginkgolides. Variants of geranylgeranyl diphosphate synthase and levopimaradiene synthase were created to uncover mutations that confer divergent phenotypes in The simultaneous expression of the wild-type GGPPS and LPS in a pre-engineered The second active site was focused on because mutations in this site within other terpene cyclases impacted their ‘plasticity’26, 31-33. In the second active site of an “LPS-type” enzyme, the bicyclic (+)-copalyl diphosphate (CPP) intermediate (derived from the deprotonation of GGPP in the first active site) undergoes a diphosphate-ionization cyclization. The resulting C8-sandaracopimarenyl cation intermediate is further deprotonated at two alternative sites to release isopimaradiene or sandaracopimaradiene end products. However, this intermediate can also undergo intramolecular proton transfer and 1,2-methyl migration to yield abietenyl cation. Subsequent deprotonation of abietenyl cation at four possible sites then produces abietadiene, levopimaradiene, neoabietadiene, and palustradiene28, 29. To allow sufficient sampling of the three-dimensional space, fifteen residues within the 10 Å solvation layer of the LPS model were probed ( The pre-engineered The previous results pointed to mutations in LPS that significantly affected production phenotype, namely M593I and Y700H. Although the preliminary mutation of Ala729 imparted product selectivity changes, it was excluded from further analysis because even a conservative replacement such as glycine was deleterious. From analyzing the structural model, Met593 was observed to be located at the posterior of the binding pocket, whereas Tyr700 is positioned at the entrance (in close vicinity of the DDXXD magnesium binding motif). 
To obtain the complete LPS evolvability profile for these residues, all amino acids were sampled through saturation mutagenesis. Additionally, the effect of expressing the saturation mutagenesis library of Ala620 was explored because a mutation at this position in From the saturation mutagenesis library of Met593, two substitutions were found that conferred significant productivity improvement ( The replacement of Tyr700 with phenylalanine, methionine, and tryptophan improved productivity up to ~5-fold ( Finally, the sampling of all amino acid substitutions of Ala620 revealed that only replacement with residues similar to alanine (small or hydrophilic; cysteine, glycine, serine, and threonine) as well as valine retained LPS activity, whereas other substitutions were destructive or deleterious ( In laboratory experiments, the beneficial effects of single mutations are often additive33, 37, 38. Therefore, the production improvement resulting from expressing the LPS M593I variant encouraged investigation of the effect of this beneficial mutation in combination with saturation mutagenesis of Tyr700. As shown in The generation of a high-producing pathway was extended by the creation of a GGPPS library. As an upstream enzyme of LPS, GGPPS catalyzes the formation of the linear polyprenyl (C20) diphosphate starter unit by the sequential elongation of IPP with the allylic monomer. Concomitant with diterpenoid production increase, methyl jasmonate elicitation in Although the structural information of a plant GGPPS from an angiosperm origin is available41, the crystal structure for a gymnosperm GGPPS has not been solved. Furthermore, the folding similarity of gymnosperm GGPPS enzymes and their angiosperm analogs is not known. Despite catalyzing essentially the same enzymatic reaction, GGPPS enzymes are known to exhibit wide structural diversity among organisms41. 
Therefore, based on secondary structure analysis42, the notable division of gymnosperm from angiosperm GGPPS enzymes may imply significant tertiary fold differences. The lack of a suitable structural guide prompted us to devise a stochastic mutational approach to evolve Sequence analysis of G10 revealed that two positions were mutated, namely S239C and G295D ( The performance of the pre-engineered Herein, a combination of rational and random mutational searches was used to uncover cryptic genetic variations in an engineered plant pathway that imparted levopimaradiene production changes in The approval of more than 100 new natural product-derived drugs for clinical trial in 2007 signifies the long-standing role of these molecules as effective therapeutics. Yet, this figure represents about a 30% drop since 2001 (ref. 47). One of the major challenges in many natural product research efforts is the reliance on bioprospecting, which typically generates low yields. This work demonstrated that by transferring and reengineering a heterologous biosynthetic pathway, the high level production of a plant-derived pharmaceutical can be achieved in a microbial host. This pathway ‘reprogramming’ framework should further enhance the extent of production improvement via metabolic engineering and complement a recently developed tool to mediate metabolite channeling in vivo48. In a broader sense, because terpenoid pathways also lead to compounds used in flavors, cosmetics, and biofuels, this strategy should also be readily extended to overproduce many commercially important compounds using microbial biotechnology. The sequences of ggpps43 and lps49 were obtained from The levopimaradiene pathways (wild type and mutants) were constructed by cloning PCR fragments of ggpps and lps into the HindIII-EcoRI and EcoRI-SalI sites of pTrcMod50 to create pTrcGGPPS-LPS. 
To allow high throughput screening of GGPPS mutants, the biosynthetic gene cluster consisting of crtB and crtI derived from plasmid pAC-LYC16 was cloned into the EcoRI-SalI sites of pTrcMod to yield pTrcCRT. The mutant ggpps library was subsequently cloned into pTrcCRT in between the HindIII and EcoRI sites to create pTrcGGPPS*-CRT. In all cases, Single transformants of pre-engineered For analysis of small-scale cultivations (libraries), 1 mL hexane was added into 1.5 mL culture aliquots and vortexed for 30 min. The mixture was centrifuged to separate the organic layer. For analysis of bioreactor cultivations, 1 μL of the dodecane layer was diluted to 200 μL with hexane. In both cases, 1 μL of hexane (containing the analytes) was analyzed by GC-MS (Varian Saturn 3800 GC attached to a Varian 2000 MS). The sample was injected into an HP-5ms column (30 m × 250 μm × 0.25 μm thickness) (Agilent). Helium (ultra purity) at a flow rate of 1.0 mL/min was used as the carrier gas. The oven temperature was first kept constant at 50° C. for 1 min, then increased to 220° C. at a rate of 10° C./min, and finally held at this temperature for 10 min. The injector and transfer line temperatures were set at 200° C. and 250° C., respectively. Because levopimaradiene, abietadiene, and sandaracopimaradiene are not commercially available, taxadiene, a diterpenoid possessing the same molecular mass as levopimaradiene, abietadiene, and sandaracopimaradiene, was used to construct a calibration curve for the peak areas obtained from the GC-MS. The 3D structural model of LPS was built based on EAS (Protein Data Bank ID code 5EAT). Sequence alignment ( Point mutations and saturation mutagenesis in lps were introduced using QuikChange II XL (Stratagene). Nucleotide changes were set by custom designed oligonucleotides (Table 7). 
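The arithmetic behind the GC-MS method described above (oven program length, sample dilution, and back-calculation from a calibration curve) can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, the calibration is assumed to be linear in peak area, and the slope and peak-area values in the usage lines are placeholders rather than measured data.

```python
def oven_run_time_min(start_c, initial_hold_min, final_c, ramp_c_per_min, final_hold_min):
    """Total oven program time: initial hold + linear ramp + final hold."""
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


def dilution_factor(aliquot_ul, final_volume_ul):
    """Fold dilution when an aliquot is brought to a final volume."""
    return final_volume_ul / aliquot_ul


def analyte_conc(peak_area, slope, intercept, dilution):
    """Concentration in the undiluted organic layer.

    Assumes a linear calibration of the form area = slope * conc + intercept,
    built with a surrogate standard (taxadiene in the text).
    """
    return (peak_area - intercept) / slope * dilution


# Oven program from the text: 50 C for 1 min, 10 C/min to 220 C, hold 10 min.
run_time = oven_run_time_min(50, 1, 220, 10, 10)

# Bioreactor samples: 1 uL of dodecane layer diluted to 200 uL with hexane.
fold = dilution_factor(1, 200)

# Placeholder calibration slope and peak area (hypothetical values).
conc = analyte_conc(peak_area=500.0, slope=50.0, intercept=0.0, dilution=fold)

print(run_time, fold, conc)
```

Converting the organic-layer concentration into a culture titer would additionally require the ratio of dodecane volume to culture volume, which is not stated in this passage.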
Subsequent to sequencing to verify nucleotide changes, the lps variants were used to replace the wild-type lps in pTrcGGPPS-LPS and subjected to expression in the pre-engineered 
Table footnotes: WT, wild type LPS (wild type GGPPS in the table of GGPPS mutants); TA, trace amounts (<0.1%); ND, not detected. Levopimaradiene, 1; abietadiene, 2; sandaracopimaradiene, 3. Neoabietadiene is not included in the tables because it was only produced in trace amounts in all strains. Quantification of production was determined based on sampling an average of three independent 
The letters F and R at the beginning of each mutagenic oligonucleotide name indicate ‘forward’ and ‘reverse’ sequence, respectively. 
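Because each mutation is introduced with a forward (F) and reverse (R) oligonucleotide, a simple consistency check is that the R sequence is the reverse complement of the F sequence. The sketch below shows such a check using the F-Y700F and R-Y700F sequences (SEQ ID NO: 113 and 114); the helper names are hypothetical, and the check assumes fully complementary primer pairs written 5′-3′ with only the bases A, C, G, and T.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'-3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_complementary_pair(forward: str, reverse: str) -> bool:
    """True when the reverse primer is the exact reverse complement of the forward primer."""
    return reverse_complement(forward) == reverse.upper()


# F-Y700F and R-Y700F as listed in the oligonucleotide table.
f_y700f = "GGCCTGCTGGCCTCCTTTACCAAGGAAGCG"
r_y700f = "CGCTTCCTTGGTAAAGGAGGCCAGCAGGCC"

print(is_complementary_pair(f_y700f, r_y700f))
```

Running the same check over every F/R pair is a quick way to catch transcription errors in a primer list before ordering.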
Having thus described several aspects of at least one embodiment of this invention, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be part of this disclosure, and are intended to be within the spirit and scope of the invention. Accordingly, the foregoing description and drawings are by way of example only. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims. All references disclosed herein are incorporated by reference in their entirety for the specific purpose mentioned herein.
RELATED APPLICATION
FIELD OF THE INVENTION
BACKGROUND OF THE INVENTION
SUMMARY OF THE INVENTION
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF THE INVENTION
EXAMPLES
Example 1
Harnessing the Evolvability of a Terpenoid Biosynthetic Pathway for Overproduction and Selectivity Control
Introduction
Results
Probing the LPS Putative Binding Pocket by Phylogenetic-Based Mutations
Mutational Enrichment of Tunable LPS Residues
Combinatorial LPS Mutations
Random Mutagenesis of GGPPS
Levopimaradiene Overproduction in Controlled Culture Conditions
Discussion
Methods
Cloning and Pathway Construction
Culture Growth and Library Analysis
Molecular Modeling
Mutant Library Generation and Screening
Diterpenoid production from pre-engineered GGPPS and LPS mutants (phylogenetically-based mutations)

LPS mutation    Titer (mg/L)
WT              26.5
M593I           98.6
C618N           4.0
L619F           12.7
A620T           33.9
L696Q           42.0
Y700H           11.5
K723S           48.0
A727S           0.0
A729G           0.6
V731L           2.3
N769A           0.0
E777A           0.0
N838E           58.5
G854T           36.8
I855L           17.7

Diterpenoid production from pre-engineered GGPPS and LPS mutants (saturation mutagenesis of Met593)

LPS Met593      Product Selectivity (%)    Titer
Mutation        1       2       3          (mg/L)
WT              87      11      2          26.5
Ala             65      26      9          16.2
Cys             65      28      7          48.4
Asp             51      15      34         10.3
Glu             30      7       63         9.7
Phe             ND      ND      100        3.8
Gly             67      23      10         1.3
His             ND      ND      ND         0.0
Ile             84      12      5          98.6
Lys             ND      ND      ND         0.0
Leu             80      13      7          55.2
Asn             75      11      14         40.4
Pro             49      8       43         12.6
Gln             43      TA      57         8.6
Arg             TA      ND      100        4.6
Ser             80      11      9          39.8
Thr             78      16      6          40.2
Val             71      22      6          18.2
Trp             ND      ND      ND         0.0
Tyr             ND      ND      ND         0.0

Diterpenoid production from pre-engineered GGPPS and LPS mutants (saturation mutagenesis of Ala620)

LPS Ala620      Product Selectivity (%)    Titer
Mutation        1       2       3          (mg/L)
WT              87      11      2          26.5
Cys             87      10      3          40.0
Asp             97      2       1          1.7
Glu             ND      ND      ND         0.0
Phe             ND      ND      ND         0.0
Gly             86      12      2          29.2
His             100     ND      ND         0.9
Ile             ND      ND      ND         0.0
Lys             ND      ND      ND         0.0
Leu             97      2       1          7.3
Met             ND      ND      ND         0.0
Asn             97      2       1          2.6
Pro             ND      ND      ND         0.0
Gln             ND      ND      ND         0.0
Arg             ND      ND      ND         0.0
Ser             92      8       TA         42.2
Thr             87      11      2          33.9
Val             92      8       TA         42.5
Trp             ND      ND      ND         0.0
Tyr             ND      ND      ND         0.0

Diterpenoid production from pre-engineered GGPPS and LPS mutants (saturation mutagenesis of Tyr700)

LPS Tyr700      Product Selectivity (%)    Titer
Mutation        1       2       3          (mg/L)
WT              87      11      2          26.5
Ala             81      9       10         75.6
Cys             76      7       17         59.1
Asp             81      ND      19         13.3
Glu             64      7       29         46.3
Phe             79      16      5          133.5
Gly             79      6       15         36.1
His             74      ND      26         6.2
Ile             60      6       34         31.5
Lys             TA      ND      100        4.3
Leu             72      7       21         48.2
Met             80      13      7          132.8
Asn             70      9       21         60.2
Pro             56      ND      44         8.3
Gln             59      6       35         41.2
Arg             33      ND      67         5.0
Ser             78      7       15         84.9
Thr             72      7       21         65.4
Val             56      6       38         31.3
Trp             84      6       10         100.7

Diterpenoid production from pre-engineered GGPPS and LPS M593I mutants (saturation mutagenesis of Tyr700)

LPS (M593I)     Product Selectivity (%)    Titer
Tyr700 mutation 1       2       3          (mg/L)
WT              87      11      2          26.5
Ala             91      2       7          167.8
Cys             97      1       2          155.7
Asp             85      TA      15         66.1
Glu             60      TA      40         92.4
Phe             84      9       7          273.8
Gly             80      TA      20         59.2
His             70      TA      30         63.8
Ile             65      TA      35         78.6
Lys             ND      ND      ND         0.0
Leu             73      2       25         97.4
Met             89      2       9          132.1
Asn             79      TA      21         93.7
Pro             0       TA      ND         0.0
Gln             60      TA      40         26.9
Arg             ND      ND      ND         0.0
Ser             84      TA      16         48.1
Thr             73      TA      27         22.7
Val             91      TA      9          83.5
Trp             63      TA      37         51.6

Diterpenoid production from pre-engineered isolated GGPPS mutants and the LPS M593I/Y700F variant

GGPPS mutation  Titer (mg/L)
WT              273.8
G1              261.3
G2              396.2
G3              343.2
G4              316.8
G5              257.1
G6              242.4
G7              380.9
G8              351.0
G9              350.7
G10             468.7
G11             211.8
G12             366.8
G13             411.1
G14             287.2
G15             406.7

Custom oligonucleotides used for LPS mutagenesis

Mutation   5′-3′ Sequence
F-C618N    CGTACGCAAAAACCTCTAACCTGGCCGTAATCCTGG (SEQ ID NO: 5)
R-C618N    CCAGGATTACGGCCAGGTTAGAGGTTTTTGCGTACG (SEQ ID NO: 6)
F-L619F    GCAAAAACCTCTTGCTTCGCCGTAATCCTGGACGATC (SEQ ID NO: 7)
R-L619F    GATCGTCCAGGATTACGGCGAAGCAAGAGGTTTTTGC (SEQ ID NO: 8)
F-L696Q    GTAAAGTTTGGGAGGGCCAGCTGGCCTCCTATAC (SEQ ID NO: 9)
R-L696Q    GTATAGGAGGCCAGCTGGCCCTCCCAAACTTTAC (SEQ ID NO: 10)
F-K723S    GTATGTCGAGAACGCTAGTGTTAGCATCGCGCTGG (SEQ ID NO: 11)
R-K723S    CCAGCGCGATGCTAACACTAGCGTTCTCGACATAC (SEQ ID NO: 12)
F-A727S    CTAAAGTTAGCATCTCGCTGGCGACCGTTGTTCTG (SEQ ID NO: 13)
R-A727S    CAGAACAACGGTCGCCAGCGAGATGCTAACTTTAG (SEQ ID NO: 14)
F-A729G    CTAAAGTTAGCATCGCGCTGGGGACCGTTGTTCTG (SEQ ID NO: 15)
R-A729G    CAGAACAACGGTCCCCAGCGCGATGCTAACTTTAG (SEQ ID NO: 16)
F-V731L    CATCGCGCTGGCGACCCTTGTTCTGAACTC (SEQ ID NO: 17)
R-V731L    GAGTTCAGAACAAGGGTCGCCAGCGCGATG (SEQ ID NO: 18)
F-N769A    CCGGCCGTCTGATTGCCGACACCAAAACCTATCAG (SEQ ID NO: 19)
R-N769A    CTGATAGGTTTTGGTGTCGGCAATCAGACGGCCGG (SEQ ID NO: 20)
F-E777A    CCAAAACCTATCAGGCTGCACGTAACCGTGG (SEQ ID NO: 21)
R-E777A    CCACGGTTACGTGCAGCCTGATAGGTTTTGG (SEQ ID NO: 22)
F-N838E    CGTCGTCTGCTGTTCGAGACCGCGCGTGTAATGC (SEQ ID NO: 23)
R-N838E    GCATTACACGCGCGGTCTCGAACAGCAGACGACG (SEQ ID NO: 24)
F-G854T    GTACCGCGATGGCTTCACCATCAGCGATAAAGAAATG (SEQ ID NO: 25)
R-G854T    CATTTCTTTATCGCTGATGGTGAAGCCATCGCGGTAC (SEQ ID NO: 26)
F-I855L    CCGCGATGGCTTCGGCCTCAGCGATAAAG (SEQ ID NO: 27)
R-I855L    CTTTATCGCTGAGGCCGAAGCCATCGCGG (SEQ ID NO: 28)
F-M593A    GTCAGCGCCCGGTTGAAGCGTACTTTTCTGTTGCAG (SEQ ID NO: 29)
R-M593A    CTGCAACAGAAAAGTACGCTTCAACCGGGCGCTGAC (SEQ ID NO: 30)
F-M593C    GTCAGCGCCCGGTTGAATGTTACTTTTCTGTTGCAG (SEQ ID NO: 31)
R-M593C    CTGCAACAGAAAAGTAACATTCAACCGGGCGCTGAC (SEQ ID NO: 32)
F-M593D    GTCAGCGCCCGGTTGAAGACTACTTTTCTGTTGCAG (SEQ ID NO: 33)
R-M593D    CTGCAACAGAAAAGTAGTCTTCAACCGGGCGCTGAC (SEQ ID NO: 34)
F-M593E    GTCAGCGCCCGGTTGAAGAGTACTTTTCTGTTGCAG (SEQ ID NO: 35)
R-M593E    CTGCAACAGAAAAGTACTCTTCAACCGGGCGCTGAC (SEQ ID NO: 36)
F-M593F    GTCAGCGCCCGGTTGAATTTTACTTTTCTGTTGCAG (SEQ ID NO: 37)
R-M593F    CTGCAACAGAAAAGTAAAATTCAACCGGGCGCTGAC (SEQ ID NO: 38)
F-M593G    GTCAGCGCCCGGTTGAAGGGTACTTTTCTGTTGCAG (SEQ ID NO: 39)
R-M593G    CTGCAACAGAAAAGTACCCTTCAACCGGGCGCTGAC (SEQ ID NO: 40)
F-M593H    GTCAGCGCCCGGTTGAACACTACTTTTCTGTTGCAG (SEQ ID NO: 41)
R-M593H    CTGCAACAGAAAAGTAGTGTTCAACCGGGCGCTGAC (SEQ ID NO: 42)
F-M593I    GTCAGCGCCCGGTTGAAATCTACTTTTCTGTTGCAG (SEQ ID NO: 43)
R-M593I    CTGCAACAGAAAAGTAGATTTCAACCGGGCGCTGAC (SEQ ID NO: 44)
F-M593K    GTCAGCGCCCGGTTGAAAAATACTTTTCTGTTGCAG (SEQ ID NO: 45)
R-M593K    CTGCAACAGAAAAGTATTTTTCAACCGGGCGCTGAC (SEQ ID NO: 46)
F-M593L    GTCAGCGCCCGGTTGAATTGTACTTTTCTGTTGCAG (SEQ ID NO: 47)
R-M593L    CTGCAACAGAAAAGTACAATTCAACCGGGCGCTGAC (SEQ ID NO: 48)
F-M593N    GTCAGCGCCCGGTTGAAAACTACTTTTCTGTTGCAG (SEQ ID NO: 49)
R-M593N    CTGCAACAGAAAAGTAGTTTTCAACCGGGCGCTGAC (SEQ ID NO: 50)
F-M593Q    GTCAGCGCCCGGTTGAACAGTACTTTTCTGTTGCAG (SEQ ID NO: 51)
R-M593Q    CTGCAACAGAAAAGTACTGTTCAACCGGGCGCTGAC (SEQ ID NO: 52)
F-M593P    GTCAGCGCCCGGTTGAACCGTACTTTTCTGTTGCAG (SEQ ID NO: 53)
R-M593P    CTGCAACAGAAAAGTACGGTTCAACCGGGCGCTGAC (SEQ ID NO: 54)
F-M593R    GTCAGCGCCCGGTTGAAAGGTACTTTTCTGTTGCAG (SEQ ID NO: 55)
R-M593R    CTGCAACAGAAAAGTACCTTTCAACCGGGCGCTGAC (SEQ ID NO: 56)
F-M593S    GTCAGCGCCCGGTTGAATCGTACTTTTCTGTTGCAG (SEQ ID NO: 57)
R-M593S    CTGCAACAGAAAAGTACGATTCAACCGGGCGCTGAC (SEQ ID NO: 58)
F-M593T    GTCAGCGCCCGGTTGAAACGTACTTTTCTGTTGCAG (SEQ ID NO: 59)
R-M593T    CTGCAACAGAAAAGTACGTTTCAACCGGGCGCTGAC (SEQ ID NO: 60)
F-M593V    GTCAGCGCCCGGTTGAAGTGTACTTTTCTGTTGCAG (SEQ ID NO: 61)
R-M593V    CTGCAACAGAAAAGTACACTTCAACCGGGCGCTGAC (SEQ ID NO: 62)
F-M593W    GTCAGCGCCCGGTTGAATGGTACTTTTCTGTTGCAG (SEQ ID NO: 63)
R-M593W    CTGCAACAGAAAAGTACCATTCAACCGGGCGCTGAC (SEQ ID NO: 64)
F-M593Y    GTCAGCGCCCGGTTGAATATTACTTTTCTGTTGCAG (SEQ ID NO: 65)
R-M593Y    CTGCAACAGAAAAGTAATATTCAACCGGGCGCTGAC (SEQ ID NO: 66)
F-A620C    CCTCTTGCCTGTGCGTAATCCTGGACG (SEQ ID NO: 67)
R-A620C    CGTCCAGGATTACGCACAGGCAAGAGG (SEQ ID NO: 68)
F-A620D    CCTCTTGCCTGGACGTAATCCTGGACG (SEQ ID NO: 69)
R-A620D    CGTCCAGGATTACGTCCAGGCAAGAGG (SEQ ID NO: 70)
F-A620E    CCTCTTGCCTGGAAGTAATCCTGGACG (SEQ ID NO: 71)
R-A620E    CGTCCAGGATTACTTCCAGGCAAGAGG (SEQ ID NO: 72)
F-A620F    CCTCTTGCCTGTTCGTAATCCTGGACG (SEQ ID NO: 73)
R-A620F    CGTCCAGGATTACGAACAGGCAAGAGG (SEQ ID NO: 74)
F-A620G    CCTCTTGCCTGGGCGTAATCCTGGACG (SEQ ID NO: 75)
R-A620G    CGTCCAGGATTACGCCCAGGCAAGAGG (SEQ ID NO: 76)
F-A620H    CCTCTTGCCTGCACGTAATCCTGGACG (SEQ ID NO: 77)
R-A620H    CGTCCAGGATTACGTGCAGGCAAGAGG (SEQ ID NO: 78)
F-A620I    CCTCTTGCCTGATCGTAATCCTGGACG (SEQ ID NO: 79)
R-A620I    CGTCCAGGATTACGATCAGGCAAGAGG (SEQ ID NO: 80)
F-A620K    CCTCTTGCCTGAAAGTAATCCTGGACG (SEQ ID NO: 81)
R-A620K    CGTCCAGGATTACTTTCAGGCAAGAGG (SEQ ID NO: 82)
F-A620L    CCTCTTGCCTGCTCGTAATCCTGGACG (SEQ ID NO: 83)
R-A620L    CGTCCAGGATTACGAGCAGGCAAGAGG (SEQ ID NO: 84)
F-A620M    CCTCTTGCCTGATGGTAATCCTGGACG (SEQ ID NO: 85)
R-A620M    CGTCCAGGATTACCATCAGGCAAGAGG (SEQ ID NO: 86)
F-A620N    CCTCTTGCCTGAACGTAATCCTGGACG (SEQ ID NO: 87)
R-A620N    CGTCCAGGATTACGTTCAGGCAAGAGG (SEQ ID NO: 88)
F-A620P    CCTCTTGCCTGCCCGTAATCCTGGACG (SEQ ID NO: 89)
R-A620P    CGTCCAGGATTACGGGCAGGCAAGAGG (SEQ ID NO: 90)
F-A620Q    CCTCTTGCCTGCAAGTAATCCTGGACG (SEQ ID NO: 91)
R-A620Q    CGTCCAGGATTACTTGCAGGCAAGAGG (SEQ ID NO: 92)
F-A620R    CCTCTTGCCTGCGCGTAATCCTGGACG (SEQ ID NO: 93)
R-A620R    CGTCCAGGATTACGCGCAGGCAAGAGG (SEQ ID NO: 94)
F-A620S    CCTCTTGCCTGTCCGTAATCCTGGACG (SEQ ID NO: 95)
R-A620S    CGTCCAGGATTACGGACAGGCAAGAGG (SEQ ID NO: 96)
F-A620T    CCTCTTGCCTGACCGTAATCCTGGACG (SEQ ID NO: 97)
R-A620T    CGTCCAGGATTACGGTCAGGCAAGAGG (SEQ ID NO: 98)
F-A620V    CCTCTTGCCTGGTCGTAATCCTGGACG (SEQ ID NO: 99)
R-A620V    CGTCCAGGATTACGACCAGGCAAGAGG (SEQ ID NO: 100)
F-A620W    CCTCTTGCCTGTGGGTAATCCTGGACG (SEQ ID NO: 101)
R-A620W    CGTCCAGGATTACCCACAGGCAAGAGG (SEQ ID NO: 102)
F-A620Y    CCTCTTGCCTGTACGTAATCCTGGACG (SEQ ID NO: 103)
R-A620Y    CGTCCAGGATTACGTACAGGCAAGAGG (SEQ ID NO: 104)
F-Y700A    GGCCTGCTGGCCTCCGCTACCAAGGAAGCG (SEQ ID NO: 105)
R-Y700A    CGCTTCCTTGGTAGCGGAGGCCAGCAGGCC (SEQ ID NO: 106)
F-Y700C    GGCCTGCTGGCCTCCTGTACCAAGGAAGCG (SEQ ID NO: 107)
R-Y700C    CGCTTCCTTGGTACAGGAGGCCAGCAGGCC (SEQ ID NO: 108)
F-Y700D    GGCCTGCTGGCCTCCGATACCAAGGAAGCG (SEQ ID NO: 109)
R-Y700D    CGCTTCCTTGGTATCGGAGGCCAGCAGGCC (SEQ ID NO: 110)
F-Y700E    GGCCTGCTGGCCTCCGAAACCAAGGAAGCG (SEQ ID NO: 111)
R-Y700E    CGCTTCCTTGGTTTCGGAGGCCAGCAGGCC (SEQ ID NO: 112)
F-Y700F    GGCCTGCTGGCCTCCTTTACCAAGGAAGCG (SEQ ID NO: 113)
R-Y700F    CGCTTCCTTGGTAAAGGAGGCCAGCAGGCC (SEQ ID NO: 114)
F-Y700G    GGCCTGCTGGCCTCCGGTACCAAGGAAGCG (SEQ ID NO: 115)
R-Y700G    CGCTTCCTTGGTACCGGAGGCCAGCAGGCC (SEQ ID NO: 116)
F-Y700H    GGCCTGCTGGCCTCCCATACCAAGGAAGCG (SEQ ID NO: 117)
R-Y700H    CGCTTCCTTGGTATGGGAGGCCAGCAGGCC (SEQ ID NO: 118)
F-Y700I    GGCCTGCTGGCCTCCATTACCAAGGAAGCG (SEQ ID NO: 119)
R-Y700I    CGCTTCCTTGGTAATGGAGGCCAGCAGGCC (SEQ ID NO: 120)
F-Y700K    GGCCTGCTGGCCTCCAAAACCAAGGAAGCG (SEQ ID NO: 121)
R-Y700K    CGCTTCCTTGGTTTTGGAGGCCAGCAGGCC (SEQ ID NO: 122)
F-Y700L    GGCCTGCTGGCCTCCTTAACCAAGGAAGCG (SEQ ID NO: 123)
R-Y700L    CGCTTCCTTGGTTAAGGAGGCCAGCAGGCC (SEQ ID NO: 124)
F-Y700M    GGCCTGCTGGCCTCCATGACCAAGGAAGCG (SEQ ID NO: 125)
R-Y700M    CGCTTCCTTGGTCATGGAGGCCAGCAGGCC (SEQ ID NO: 126)
F-Y700N    GGCCTGCTGGCCTCCAATACCAAGGAAGCG (SEQ ID NO: 127)
R-Y700N    CGCTTCCTTGGTATTGGAGGCCAGCAGGCC (SEQ ID NO: 128)
F-Y700P    GGCCTGCTGGCCTCCCCTACCAAGGAAGCG (SEQ ID NO: 129)
R-Y700P    CGCTTCCTTGGTAGGGGAGGCCAGCAGGCC (SEQ ID NO: 130)
F-Y700Q    GGCCTGCTGGCCTCCGAAACCAAGGAAGCG (SEQ ID NO: 131)
R-Y700Q    CGCTTCCTTGGTTTCGGAGGCCAGCAGGCC (SEQ ID NO: 132)
F-Y700R    GGCCTGCTGGCCTCCCGTACCAAGGAAGCG (SEQ ID NO: 133)
R-Y700R    CGCTTCCTTGGTACGGGAGGCCAGCAGGCC (SEQ ID NO: 134)
F-Y700S    GGCCTGCTGGCCTCCTCTACCAAGGAAGCG (SEQ ID NO: 135)
R-Y700S    CGCTTCCTTGGTAGAGGAGGCCAGCAGGCC (SEQ ID NO: 136)
F-Y700T    GGCCTGCTGGCCTCCACTACCAAGGAAGCG (SEQ ID NO: 137)
R-Y700T    CGCTTCCTTGGTAGTGGAGGCCAGCAGGCC (SEQ ID NO: 138)
F-Y700V    GGCCTGCTGGCCTCCGTTACCAAGGAAGCG (SEQ ID NO: 139)
R-Y700V    CGCTTCCTTGGTAACGGAGGCCAGCAGGCC (SEQ ID NO: 140)
F-Y700W    GGCCTGCTGGCCTCCTGGACCAAGGAAGCG (SEQ ID NO: 141)
R-Y700W    CGCTTCCTTGGTCCAGGAGGCCAGCAGGCC (SEQ ID NO: 142)

REFERENCES