Engineered adenine base editor (ABE) variants with reduced RNA editing activity, and methods of using the same.
1. An adenine base editor (ABE) variant comprising an adenosine deaminase and a programmable DNA binding domain, the adenosine deaminase comprising one or more
2. The ABE variant of
3. The ABE variant of
4. The ABE variant of
5. The ABE variant of
6. The ABE variant of
7. The ABE variant of R13A; T17A; K20A and R21A; K20A, R21A, and R23A; R23W; E25A; R26A; A48G; I49A; A56G; R74A; D77A; V82G; W11A; V106G; N108A; A109W; K110A; T111A; A138G; D139A and E140A; A142G; A143G; R153A; V155G; V155W; A58G; N72A; V106W; K110A; H128A and R129A; A138W; D139A and E140A; A142W; F148A; or R150A of wild type
8. The ABE variant of
9. The ABE variant of
10. The ABE variant of 9
11. The ABE variant of
12. The ABE variant of
13. A base editing system comprising:
(i) the ABE variant of
(ii) at least one guide RNA compatible with the base editor that directs the base editor to a target sequence.
14. An isolated nucleic acid encoding the ABE variant of 1.
15. A vector comprising the isolated nucleic acid of
16. An isolated host cell, preferably a mammalian host cell, comprising the nucleic acid of
17. The isolated host cell of
18. A method of deaminating a selected adenine in a nucleic acid, the method comprising contacting the nucleic acid with the ABE variant of
19. The method of
20. The method of
21. The method of
22. A composition comprising a purified ABE variant of
23. The composition of
This application claims the benefit of U.S. Provisional Application Ser. No. 62/800,974, filed on Feb. 4, 2019 and U.S. Provisional Application Ser. No. 62/844,717, filed on May 7, 2019. The entire contents of the foregoing are incorporated herein by reference.

This invention was made with Government support under Grant No. HG009490 awarded by the National Institutes of Health and HR0011-17-2-0042 awarded by the Defense Advanced Research Projects Agency (DARPA). The Government has certain rights in the invention.

The instant application contains a Sequence Listing which has been filed electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Mar. 27, 2020, is named 29539-0387001_SL.txt and is 201,290 bytes in size.

Described herein are variants of wild type and engineered

Base editors represent a new genome editing platform that allows efficient installation of single base substitutions in DNA (Rees, H. A. & Liu, D. R. Base editing: precision chemistry on the genome and transcriptome of living cells. Nat. Rev. Genet. (2018); Komor, A. C., et al., Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage, Nature (2016); Gaudelli, N. M. et al., Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage.

Described herein are adenine base editors (ABEs) having reduced RNA editing activity. These ABEs comprise a programmable DNA-binding domain fused to an adenosine deaminase, e.g. TadA or previously described engineered TadA variants (e.g. 
ABEs 0.1, 0.2, 1.1, 1.2, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 2.10, 2.11, 2.12, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 4.1, 4.2, 4.3, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.9, 5.10, 5.11, 5.12, 5.13, 5.14, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 7.9, 7.10, ABEmax as described in Gaudelli et al.,

In one aspect, the present disclosure relates to an adenine base editor (ABE) variant including an adenosine deaminase and a programmable DNA binding domain, the adenosine deaminase including one or more

In one embodiment, the adenosine deaminase includes a wild type or engineered

In one embodiment, the adenosine deaminase includes ABE 0.1, ABE 0.2, ABE 1.1, ABE 1.2, ABE 2.1, ABE 2.2, ABE 2.3, ABE 2.4, ABE 2.5, ABE 2.6, ABE 2.7, ABE 2.8, ABE 2.9, ABE 2.10, ABE 2.11, ABE 2.12, ABE 3.1, ABE 3.2, ABE 3.3, ABE 3.4, ABE 3.5, ABE 3.6, ABE 3.7, ABE 3.8, ABE 4.1, ABE 4.2, ABE 4.3, ABE 5.1, ABE 5.2, ABE 5.3, ABE 5.4, ABE 5.5, ABE 5.6, ABE 5.7, ABE 5.8, ABE 5.9, ABE 5.10, ABE 5.11, ABE 5.12, ABE 5.13, ABE 5.14, ABE 6.1, ABE 6.2, ABE 6.3, ABE 6.4, ABE 6.5, ABE 6.6, ABE 7.1, ABE 7.2, ABE 7.3, ABE 7.4, ABE 7.5, ABE 7.6, ABE 7.7, ABE 7.8, ABE 7.9, ABE 7.10, or ABEmax.

In one embodiment, the one or more mutations include one or more mutations at amino acid positions that correspond to residues of wild type

In one embodiment, the one or more mutations include mutations that correspond to Y10A, W11A, R13A, T17A, K20A, R21A, R23A, R23W, E25A, R26A, A48G, I49A, A56G, A58G, Q71A, N72A, R74A, D77A, V82G, V106G, V106W, R107A, N108A, A109G, A109W, K110A, T111A, H122A, Y123A, H128A, R129A, A138W, A138G, D139A, E140A, A142W, A142G, A143G, F148A, R150A, R153A, V155G, and/or V155W of wild type

In one embodiment, the at least one of the one or more

In one embodiment, the ABE variant described herein further includes one or more nuclear localization sequences (NLS).
In one embodiment, the ABE variant described herein includes a linker between the adenosine deaminase monomers and/or between the adenosine deaminase monomer or single-chain dimer and the programmable DNA binding domain.

In one embodiment, the programmable DNA binding domain is an engineered C2H2 zinc-finger, a transcription activator-like effector (TALE), or a Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) Cas RNA-guided nuclease (CRISPR-Cas nuclease), or a variant thereof.

In one embodiment, the CRISPR-Cas nuclease is a single strand DNA (ssDNA) nickase or is catalytically inactive.

In one embodiment, the CRISPR-Cas nuclease is a Cas9 or Cas12a that has ssDNA nickase activity or is catalytically inactive.

In one aspect, the present disclosure relates to a base editing system including: (i) an ABE variant described herein, where the programmable DNA binding domain is a CRISPR Cas RGN or a variant thereof; and (ii) at least one guide RNA compatible with the base editor that directs the base editor to a target sequence.

In one aspect, the present disclosure relates to an isolated nucleic acid encoding an ABE variant disclosed herein.

In one aspect, the present disclosure relates to a vector including an isolated nucleic acid described herein.

In one aspect, the present disclosure relates to an isolated host cell, preferably a mammalian host cell, including a nucleic acid described herein.

In one embodiment, the isolated host cell described herein expresses any one of the ABE variants described herein.

In one aspect, the present disclosure relates to a method of deaminating a selected adenine in a nucleic acid, the method including contacting the nucleic acid with an ABE variant or a base editing system described herein.

In one embodiment, the nucleic acid is in a cell. In one embodiment, the cell is in a living subject. In one embodiment, the living subject is a mammal.
In one aspect, the present disclosure relates to a composition including a purified ABE variant or a base editing system described herein.

In one embodiment, the composition described herein includes one or more ribonucleoprotein (RNP) complexes.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials are described herein for use in the present invention; other, suitable methods and materials known in the art can also be used. The materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, sequences, database entries, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control.

Other features and advantages of the invention will be apparent from the following detailed description and figures, and from the claims.

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.

(a) Schematic illustration of ABEmax and miniABEmax architectures and overview of experimental testing of miniABEmax for on-target DNA and off-target RNA editing. Left-most and right-most boxes=bipartite NLS at N- and C-termini, TadA*=mutant TadA 7.103, and small grey boxes (between TadA and TadA*, or TadA* and SpCas9(D10A))=32AA flanked XTEN linkers. nCas9 (SpCas9 D10A)=grey shape, TadA WT and mutant monomers=circles. Halo=sites of potential adenine deamination on DNA and RNA. (b) Unstratified sequence logo (left) and stratified sequence logos for RNA adenines edited with high (80-100]%, middle (50-80]%, and low (0-50]% efficiencies by ABEmax.
RNA-seq data shown in the jitter plot was obtained from HEK293T cells in an earlier published study. (c) Bar plots showing the number of RNA A-to-I edits observed in RNA-seq experiments in HEK293T cells with expression of ABEmax, miniABEmax, miniABEmax-K20A/R21A, or miniABEmax-V82G each with three different gRNAs (ABE site 16, HEK site 2, and non-targeting (NT)) and performed in independent biological replicates (n=3). A GFP negative control, also performed in independent biological replicates (n=3), is also shown. (d) Jitter plot showing the efficiencies of RNA A-to-I edits from the RNA-seq experiments shown in c. Each dot represents an edited adenine position in RNA. (e) Structural representations of

Heat maps (a) and bar plots (b) showing the on-target DNA A-to-G editing efficiencies of nCas9 (Control), ABEmax, miniABEmax-K20A/R21A, and miniABEmax-V82G with 22 gRNAs (n=4 independent replicates). For (a), editing window shown includes only the most highly edited adenines and not the entire spacer sequence. A-to-G editing efficiencies are shown in heatmap format. Numbering at the bottom represents spacer position with 1 being the most PAM-distal location. For (b), A-to-G editing efficiencies for only the most highly edited adenine for each gRNA on-target site are reported; error bars represent standard deviation (SD).

(a) Scatterplots showing A-to-I self-editing induced by expression of ABEmax, miniABEmax, miniABEmax-K20A/R21A, and miniABEmax-V82G (sorted for all GFP-positive cells) with gRNAs targeting HEK site 2, ABE site 16, and a non-targeting gRNA (NT) in HEK293T cells. Each dot represents an edited A and the color of the dot indicates the predicted type of mutation caused by an A-to-I edit at that position. The y-axis shows editing efficiencies for each A-to-I modification and the x-axis represents the position of each A within the ABE coding sequence (with the architecture of the editor shown schematically below but not displaying the NLS and linkers).
n=total number of modified As. (b) UpSet plots showing the intersections of RNA A-to-I self-edits induced by ABEmax on its own transcript across three replicates. Each plot shows data from co-expression of ABEmax with one of three different gRNAs. (c) UpSet plots showing the intersection of RNA A-to-I self-edits induced by ABEmax across three different gRNAs. For each gRNA, we used A-to-I edits that represent the union of all such edits across the three replicates.

Heat maps showing the on-target DNA editing efficiencies of nCas9 (Control), ABEmax, miniABEmax, miniABEmax-K20A/R21A, and miniABEmax-V82G each assessed with two gRNAs targeted to ABE site 16 and HEK site 2 and performed in triplicate. Note that these were performed with the same transfected cells used for the RNA-seq experiments shown in

Histograms showing the total number of RNA A-to-I edits observed (y-axis) for different editing efficiencies (x-axis) for ABEmax, miniABEmax, miniABEmax-K20A/R21A, and miniABEmax-V82G each tested with the ABE site 16, HEK site 2, and NT gRNAs. n=number of modified adenines. Experiments were performed in biological triplicate (data is derived from the same experiments as

Sequence logos derived using all RNA-edited adenines (0-100]% or stratified RNA-edited adenines with high (80-100]%, middle (50-80]%, or low (0-50]% edit efficiencies induced by (a) ABEmax co-expressed with an ABE site 16, HEK site 2 or NT (non-targeting) gRNA or (b) miniABEmax co-expressed with an ABE site 16, HEK site 2, or NT gRNA. Logos are shown for biological triplicates from the same RNA-seq experiments displayed

(a) Alignment of

Scatterplots showing A-to-I self-editing induced by expression of ABEmax, miniABEmax, miniABEmax-K20A/R21A, and miniABEmax-V82G (sorted for all GFP-positive cells) with gRNAs targeting HEK site 2, ABE site 16, and a non-targeting gRNA (NT) in HEK293T cells for 2 other replicates.
Each dot represents an edited A and the color of the dot indicates the predicted type of mutation caused by an A-to-I edit at that position. The y-axis shows editing efficiencies for each A-to-I modification and the x-axis represents the position of each A within the ABE coding sequence (with the architecture of the editor shown schematically below but not displaying the NLS and linkers). n=total number of modified As.

Heat maps showing A-to-G DNA on-target (left) and A-to-G DNA off-target (right) editing efficiencies of nCas9 (Control), ABEmax, miniABEmax-K20A/R21A, and miniABEmax-V82G each co-expressed with HEK site 2, HEK site 3, or HEK site 4 gRNAs (n=4 independent replicates). Editing windows shown include the most highly edited adenines. Numbering at the bottom represents spacer position with 1 being the most PAM-distal location.

ABEs efficiently install A-to-G transitions in DNA (Gaudelli, N. M. et al. Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage.

We sought to investigate if the RNA editing capability of the TadA enzyme might have been preserved or even expanded (e.g., to other RNA species) when present in an ABE context (

Thus, described herein are variants of wild type and engineered (ABE0.1-7.10 and ABEmax) TadA domains, each as monomers and/or combined as single-chain homodimers and/or single-chain heterodimers, bearing mutations that may exhibit reduced RNA editing (RRE) activities while preserving DNA deamination activities, optionally fused to an engineered DNA binding domain such as a CRISPR-Cas nuclease modified to either be a nickase or catalytically inactive, to enable DNA adenine base editing with reduced RNA mutation profiles. These SElective Curbing of Unwanted RNA Editing (SECURE)-ABE variants exhibit substantially reduced unwanted RNA editing activities while retaining robust and more precise on-target DNA editing.
Herein are described structure-guided engineering of SECURE-ABE variants that not only possess reduced off-target RNA editing with comparable on-target DNA activities but are also the smallest The work described here extends our understanding of the off-target RNA editing activities of DNA base editors, expands the options available to minimize these unwanted effects, and provides novel SECURE base editor architectures with other desirable properties. The successful engineering of SECURE-ABE variants shows that it is possible to minimize unwanted RNA editing while retaining efficient on-target DNA editing for an ABE. In the process of engineering these variants, we discovered a more extended consensus sequence motif for adenines edited with high efficiencies by ABEmax (CUACGAA) that appears to be recognized by the wild-type TadA part of this fusion. Deletion of this TadA domain abolished recognition of these high efficiency sites and also resulted in the generation of the smallest SpCas9 base editors (1605 amino acids in length) described to date. Our findings further expand the toolbox of base editors that can be used without inducing high-level RNA editing. Our description of self-editing by DNA base editors provides yet another strong motivation to avoid the use of base editors that possess off-target RNA editing activities. Self-editing by ABEs potentially creates a heterogeneous population of base editor-encoding transcripts in human cells including missense mutations that might lead to the generation of novel epitopes or other gain/loss-of-function effects. The potential impacts of creating diverse mutated forms of base editor proteins in cells are particularly important to consider because these fusions will be highly overexpressed for most applications. 
One possibility is that these truncated forms might further exacerbate RNA editing activity levels because these proteins would still be expected to induce off-target RNA editing but not on-target DNA editing. Thus, the existence of self-editing further underscores the importance of using DNA base editors with reduced RNA editing activities for both research and therapeutic applications.

In some embodiments, the adenosine deaminase is TadA from

Reduced RNA Editing (RRE) Base Editor Variants

Thus, described herein are base editors comprising adenosine deaminases with one or more mutations to reduce undesirable RNA editing activity. In general, these base editors have one or more mutations as described herein. In some embodiments, they have mutations shown in Table A that correspond to residues in wild type (SEQ ID NO: 1) or engineered

The mutations can include substitution with any amino acid other than the WT amino acid; in some embodiments the substitution is with alanine or glycine.

The wild type sequence of wild type

The engineered

In the most commonly used ABEs (ABE7.10 and ABEmax), these two proteins are fused using a 32 amino acid linker (bolded in sequence below), forming a heterodimer, the sequence of which is as follows:

Other exemplary sequences are shown in the list below as well as aligned to

In some embodiments, the base editors do not include catalytically dead adenine deaminase variants, e.g. E59A (Gaudelli et al., 2017, PMID: 29160308).
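The substitution labels used throughout this description (for example Y10A, W11A, or V82G) name the wild type residue, its 1-based position, and the replacement residue. The following is a minimal sketch, not the applicants' method, of how such labels could be parsed and applied to a parent sequence. The helper names (`parse_substitution`, `apply_substitutions`) and the 12-residue `TOY_PARENT` string are hypothetical and chosen only for illustration; the toy string is not a TadA sequence.

```python
import re
from typing import Iterable, NamedTuple


class Substitution(NamedTuple):
    wild_type: str  # residue expected in the parent sequence
    position: int   # 1-based residue number
    mutant: str     # replacement residue


# One-letter residue, position digits, one-letter residue (e.g. "V82G").
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'K20A' into (wild type, position, mutant)."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a single-residue substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


def apply_substitutions(sequence: str, labels: Iterable[str]) -> str:
    """Return a copy of `sequence` with every labelled substitution applied.

    Raises ValueError if a position is out of range or if the parent residue
    does not match the wild type residue named in the label.
    """
    residues = list(sequence)
    for label in labels:
        sub = parse_substitution(label)
        index = sub.position - 1  # convert 1-based numbering to 0-based
        if not 0 <= index < len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[index] != sub.wild_type:
            raise ValueError(
                f"{label}: parent has {residues[index]} at position {sub.position}"
            )
        residues[index] = sub.mutant
    return "".join(residues)


# Hypothetical toy parent sequence (NOT a TadA sequence), used only to
# exercise the helpers; it has Y at position 10 and W at position 11.
TOY_PARENT = "MKTAYIAKQYWL"

if __name__ == "__main__":
    print(apply_substitutions(TOY_PARENT, ["Y10A", "W11A"]))  # MKTAYIAKQAAL
```

Combined variants written as "K20A and R21A" would be passed as a list of two labels. The wild type check is a deliberate design choice: it catches numbering mismatches when a label defined against one parent sequence is applied to a different one.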
Programmable DNA Binding Domain

In some embodiments, the base editors include programmable DNA binding domains such as engineered C2H2 zinc-fingers, transcription activator-like effectors (TALEs), and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) Cas RNA-guided nucleases (RGNs) and their variants, including ssDNA nickases (nCas9) or their analogs and catalytically inactive dead Cas9 (dCas9) and its analogs (e.g., as shown in Table C), and any engineered protospacer-adjacent motif (PAM) or high-fidelity variants (e.g., as shown in Table D). A programmable DNA binding domain is one that can be engineered to bind to a selected target sequence.

CRISPR-Cas Nucleases

Although herein we refer to Cas9, in general any Cas9-like nickase could be used (including the related Cpf1/Cas12a enzyme classes), unless specifically indicated.

These orthologs, and mutants and variants thereof as known in the art, can be used in any of the fusion proteins described herein. See, e.g., WO 2017/040348 (which describes variants of SaCas9 and SpCas9 with increased specificity) and WO 2016/141224 (which describes variants of SaCas9 and SpCas9 with altered PAM specificity).

The Cas9 nuclease from

In some embodiments, the present system utilizes a wild type or variant Cas9 protein from

In some embodiments, the Cas9 also includes one of the following mutations, which reduce nuclease activity of the Cas9; e.g., for SpCas9, mutations at D10A or H840A (which creates a single-strand nickase).
In some embodiments, the SpCas9 variants also include mutations at one position from each of the following two sets of amino acid positions, which together destroy the nuclease activity of the Cas9: (i) D10, E762, D839, H983, or D986 and (ii) H840 or N863, e.g., D10A/D10N and H840A/H840N/H840Y, to render the nuclease portion of the protein catalytically inactive; substitutions at these positions could be alanine (as they are in Nishimasu et al., Cell 156, 935-949 (2014)), or other residues, e.g., glutamine, asparagine, tyrosine, serine, or aspartate, e.g., E762Q, H983N, H983Y, D986N, N863D, N863S, or N863H (see WO 2014/152432).

In some embodiments, the Cas9 is fused to one or more SV40 or bipartite (bp) nuclear localization sequences (NLSs); an exemplary (bp)NLS sequence is as follows: (KRTADGSEFES)PKKKRKV (SEQ ID NO: 23). Typically, the NLSs are at the N- and C-termini of an ABEmax fusion protein, but can also be positioned at the N- or C-terminus in other ABEs, or between the DNA binding domain and the deaminase domain. Linkers as known in the art can be used to separate domains.

TAL Effector Repeat Arrays

Transcription activator-like effectors (TALEs) of plant pathogenic bacteria in the genus Xanthomonas play important roles in disease, or trigger defense, by binding host DNA and activating effector-specific host genes. Specificity depends on an effector-variable number of imperfect, typically ˜33-35 amino acid repeats. Polymorphisms are present primarily at repeat positions 12 and 13, which are referred to herein as the repeat variable-diresidue (RVD). The RVDs of TAL effectors correspond to the nucleotides in their target sites in a direct, linear fashion, one RVD to one nucleotide, with some degeneracy and no apparent context dependence. In some embodiments, the polymorphic region that grants nucleotide specificity may be expressed as a triresidue or triplet.
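The one-RVD-to-one-nucleotide correspondence described above lends itself to a simple lookup. The following is a minimal sketch under stated assumptions: it covers only four widely used RVD assignments (HD for C, NG for T, NI for A, and NN for G or A), and the names `RVD_SPECIFICITY` and `array_recognizes` are hypothetical. It ignores context effects, repeat-array architecture, and binding affinity, so it is an illustration of the code rather than a design tool.

```python
from typing import Dict, FrozenSet, Sequence

# Partial RVD-to-nucleotide table (illustrative subset only).
# NN is degenerate: it is treated here as accepting either G or A.
RVD_SPECIFICITY: Dict[str, FrozenSet[str]] = {
    "HD": frozenset("C"),
    "NG": frozenset("T"),
    "NI": frozenset("A"),
    "NN": frozenset("GA"),
}


def array_recognizes(rvds: Sequence[str], target: str) -> bool:
    """Return True if each RVD accepts the base at the same index of `target`.

    One repeat (one RVD) is paired with one nucleotide, in order.
    Raises KeyError for an RVD that is not in the illustrative table.
    """
    if len(rvds) != len(target):
        return False
    return all(
        base in RVD_SPECIFICITY[rvd]
        for rvd, base in zip(rvds, target.upper())
    )


if __name__ == "__main__":
    repeats = ["HD", "NG", "NI", "NN"]
    print(array_recognizes(repeats, "CTAG"))  # True
    print(array_recognizes(repeats, "CTAA"))  # True (NN also accepts A)
    print(array_recognizes(repeats, "CTAC"))  # False
```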
Each DNA binding repeat can include an RVD that determines recognition of a base pair in the target DNA sequence, wherein each DNA binding repeat is responsible for recognizing one base pair in the target DNA sequence. In some embodiments, the RVD can comprise one or more of: HA for recognizing C; ND for recognizing C; HI for recognizing C; HN for recognizing G; NA for recognizing G; SN for recognizing G or A; YG for recognizing T; and NK for recognizing G, and one or more of: HD for recognizing C; NG for recognizing T; NI for recognizing A; NN for recognizing G or A; NS for recognizing A or C or G or T; N* for recognizing C or T, wherein * represents a gap in the second position of the RVD; HG for recognizing T; H* for recognizing T, wherein * represents a gap in the second position of the RVD; and IG for recognizing T.

TALE proteins may be useful in research and biotechnology as targeted chimeric nucleases that can facilitate homologous recombination in genome engineering (e.g., to add or enhance traits useful for biofuels or biorenewables in plants). These proteins also may be useful as, for example, transcription factors, and especially for therapeutic applications requiring a very high level of specificity such as therapeutics against pathogens (e.g., viruses) as non-limiting examples.

Methods for generating engineered TALE arrays are known in the art, see, e.g., the fast ligation-based automatable solid-phase high-throughput (FLASH) system described in U.S. Ser. No. 
61/610,212, and Reyon et al.,

Zinc Fingers

Zinc finger (ZF) proteins are DNA-binding proteins that contain one or more zinc fingers, independently folded zinc-containing mini-domains, the structure of which is well known in the art and defined in, for example, Miller et al., 1985,

Multiple studies have shown that it is possible to artificially engineer the DNA binding characteristics of individual zinc fingers by randomizing the amino acids at the alpha-helical positions involved in DNA binding and using selection methodologies such as phage display to identify desired variants capable of binding to DNA target sites of interest (Rebar et al., 1994,

One existing method for engineering zinc finger arrays, known as “modular assembly,” advocates the simple joining together of pre-selected zinc finger modules into arrays (Segal et al., 2003,

Combinatorial selection-based methods that identify zinc finger arrays from randomized libraries have been shown to have higher success rates than modular assembly (Maeder et al., 2008,

Variants

In some embodiments, the components of the fusion proteins are at least 80%, e.g., at least 85%, 90%, 95%, 97%, or 99% identical to the amino acid sequence of an exemplary sequence (e.g., a TadA or DBD as provided herein), e.g., have differences at up to 1%, 2%, 5%, 10%, 15%, or 20% of the residues of the exemplary sequence replaced, e.g., with conservative mutations, e.g., including or in addition to the mutations described herein. In preferred embodiments, the variant retains a desired activity of the parent, e.g., deaminase activity, and/or the ability to interact with a guide RNA and/or target DNA, optionally with improved specificity or altered substrate specificity.
To determine the percent identity of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). The length of a reference sequence aligned for comparison purposes is at least 80% of the length of the reference sequence, and in some embodiments is at least 90% or 100%. The nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein nucleic acid “identity” is equivalent to nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.

Percent identity between two polypeptides or nucleic acid sequences is determined in various ways that are within the skill in the art, for instance, using publicly available computer software such as Smith Waterman Alignment (Smith, T. F. and M. S. Waterman (1981)

For purposes of the present disclosure, the comparison of sequences and determination of percent identity between two sequences can be accomplished using a BLOSUM62 scoring matrix with a gap penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.

Conservative substitutions typically include substitutions within the following groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine.
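As an illustration of the identity calculation and the conservative substitution groups described above, the following minimal sketch scores two sequences that have already been aligned. It makes several assumptions that are not taken from this description: gaps are written as "-", identity is reported as identical positions divided by the total number of alignment columns (one of several conventions in use), and the function names are hypothetical. It does not perform the alignment itself and does not implement the Smith-Waterman algorithm or the BLOSUM62 scoring and gap penalties mentioned above.

```python
from typing import List, Set

# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS: List[Set[str]] = [
    set("GA"),    # glycine, alanine
    set("VIL"),   # valine, isoleucine, leucine
    set("DENQ"),  # aspartic acid, glutamic acid, asparagine, glutamine
    set("ST"),    # serine, threonine
    set("KR"),    # lysine, arginine
    set("FY"),    # phenylalanine, tyrosine
]


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same non-gap residue.

    Both inputs must be pre-aligned and therefore of equal length.
    Gap columns count toward the denominator (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * identical / len(aligned_a)


def is_conservative(residue_a: str, residue_b: str) -> bool:
    """True if two different residues fall within the same listed group."""
    if residue_a == residue_b:
        return False
    return any(
        residue_a in group and residue_b in group
        for group in CONSERVATIVE_GROUPS
    )


if __name__ == "__main__":
    # Hypothetical toy alignment: 5 identical columns out of 6.
    print(round(percent_identity("MKT-AY", "MKTQAY"), 1))  # 83.3
    print(is_conservative("K", "R"))  # True
    print(is_conservative("G", "W"))  # False
```

A threshold such as "at least 80% identical" would then be a comparison against the returned value, keeping in mind that the result depends on the alignment and on the denominator convention chosen.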
Also provided herein are isolated nucleic acids encoding the base editor fusion proteins, vectors comprising the isolated nucleic acids, optionally operably linked to one or more regulatory domains for expressing the variant proteins, and host cells, e.g., mammalian host cells, comprising the nucleic acids, and optionally expressing the variant proteins. In some embodiments, the host cells are stem cells, e.g., hematopoietic stem cells. In some embodiments, the fusion proteins include a linker between the DNA binding domain (e.g., ZFN, TALE, or nCas9) and the BE domains. Linkers that can be used in these fusion proteins (or between fusion proteins in a concatenated structure) can include any sequence that does not interfere with the function of the fusion proteins. In preferred embodiments, the linkers are short, e.g., 2-20 amino acids, and are typically flexible (i.e., comprising amino acids with a high degree of freedom such as glycine, alanine, and serine). In some embodiments, the linker comprises one or more units consisting of GGGS (SEQ ID NO:24) or GGGGS (SEQ ID NO:25), e.g., two, three, four, or more repeats of the GGGS (SEQ ID NO:24) or GGGGS (SEQ ID NO:25) unit. Other linker sequences can also be used. In some embodiments, the deaminase fusion protein includes a cell-penetrating peptide sequence that facilitates delivery to the intracellular space, e.g., HIV-derived TAT peptide, penetratins, transportans, or hCT derived cell-penetrating peptides, see, e.g., Caron et al., (2001) Cell penetrating peptides (CPPs) are short peptides that facilitate the movement of a wide range of biomolecules across the cell membrane into the cytoplasm or other organelles, e.g. the mitochondria and the nucleus. Examples of molecules that can be delivered by CPPs include therapeutic drugs, plasmid DNA, oligonucleotides, siRNA, peptide-nucleic acid (PNA), proteins, peptides, nanoparticles, and liposomes. 
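The repeat-unit linkers described earlier in this passage (repeats of GGGS or GGGGS, with a preferred length of 2-20 amino acids) can be illustrated with a short sketch. The helper names `build_linker` and `is_short_linker` are hypothetical, and the length bounds are simply the preferred range quoted above, applied here as an assumption about how one might screen candidate linkers.

```python
def build_linker(unit: str, repeats: int) -> str:
    """Concatenate `repeats` copies of a linker unit such as 'GGGS'."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return unit * repeats


def is_short_linker(linker: str, min_len: int = 2, max_len: int = 20) -> bool:
    """True if the linker length falls within the preferred 2-20 residue range."""
    return min_len <= len(linker) <= max_len


if __name__ == "__main__":
    three_units = build_linker("GGGS", 3)
    print(three_units)                                   # GGGSGGGSGGGS
    print(is_short_linker(three_units))                  # True (12 residues)
    print(is_short_linker(build_linker("GGGGS", 4)))     # True (20 residues)
    print(is_short_linker(build_linker("GGGGS", 5)))     # False (25 residues)
```

The 32 amino acid linker mentioned elsewhere in this description for ABE7.10 and ABEmax would fall outside this preferred range, which is consistent with the range being a preference rather than a requirement.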
CPPs are generally 30 amino acids or less, are derived from naturally or non-naturally occurring protein or chimeric sequences, and contain either a high relative abundance of positively charged amino acids, e.g. lysine or arginine, or an alternating pattern of polar and non-polar amino acids. CPPs that are commonly used in the art include Tat (Frankel et al., (1988)

CPPs can be linked with their cargo through covalent or non-covalent strategies. Methods for covalently joining a CPP and its cargo are known in the art, e.g. chemical cross-linking (Stetsenko et al., (2000)

CPPs have been utilized in the art to deliver potentially therapeutic biomolecules into cells. Examples include cyclosporine linked to polyarginine for immunosuppression (Rothbard et al., (2000)

CPPs have been utilized in the art to transport contrast agents into cells for imaging and biosensing applications. For example, green fluorescent protein (GFP) attached to Tat has been used to label cancer cells (Shokolenko et al., (2005) DNA Repair 4(4):511-518). Tat conjugated to quantum dots has been used to successfully cross the blood-brain barrier for visualization of the rat brain (Santra et al., (2005)

Alternatively or in addition, the deaminase fusion proteins can include a nuclear localization sequence, e.g., SV40 large T antigen NLS (PKKKRRV (SEQ ID NO:26)) and nucleoplasmin NLS (KRPAATKKAGQAKKKK (SEQ ID NO:27)). Other NLSs are known in the art; see, e.g., Cokol et al.,

In some embodiments, the deaminase fusion proteins include a moiety that has a high affinity for a ligand, for example GST, FLAG or hexahistidine (SEQ ID NO: 35) sequences. Such affinity tags can facilitate the purification of recombinant deaminase fusion proteins.

The deaminase fusion proteins described herein can be used for altering the genome of a cell.
The methods generally include expressing or contacting the deaminase fusion proteins in the cells; in versions using one or two Cas9s, the methods include using a guide RNA having a region complementary to a selected portion of the genome of the cell. Methods for selectively altering the genome of a cell are known in the art, see, e.g., U.S. Pat. No. 8,993,233; US 20140186958; U.S. Pat. No. 9,023,649; WO/2014/099744; WO 2014/089290; WO2014/144592; WO144288; WO2014/204578; WO2014/152432; WO2115/099850; U.S. Pat. No. 8,697,359; US20160024529; US20160024524; US20160024523; US20160024510; US20160017366; US20160017301; US20150376652; US20150356239; US20150315576; US20150291965; US20150252358; US20150247150; US20150232883; US20150232882; US20150203872; US20150191744; US20150184139; US20150176064; US20150167000; US20150166969; US20150159175; US20150159174; US20150093473; US20150079681; US20150067922; US20150056629; US20150044772; US20150024500; US20150024499; US20150020223;; US20140356867; US20140295557; US20140273235; US20140273226; US20140273037; US20140189896; US20140113376; US20140093941; US20130330778; US20130288251; US20120088676; US20110300538; US20110236530; US20110217739; US20110002889; US20100076057; US20110189776; US20110223638; US20130130248; US20150050699; US20150071899; US20150050699; ; US20150045546; US20150031134; US20150024500; US20140377868; US20140357530; US20140349400; US20140335620; US20140335063; US20140315985; US20140310830; US20140310828; US20140309487; US20140304853; US20140298547; US20140295556; US20140294773; US20140287938; US20140273234; US20140273232; US20140273231; US20140273230; US20140271987; US20140256046; US20140248702; US20140242702; US20140242700; US20140242699; US20140242664; US20140234972; US20140227787; US20140212869; US20140201857; US20140199767; US20140189896; US20140186958; US20140186919; US20140186843; US20140179770; US20140179006; US20140170753; WO/2008/108989; WO/2010/054108; WO/2012/164565; WO/2013/098244; WO/2013/176772; US 
20150071899; Makarova et al., “Evolution and classification of the CRISPR-Cas systems” 9(6) Nature Reviews Microbiology 467-477 (1-23) (June 2011); Wiedenheft et al., “RNA-guided genetic silencing systems in bacteria and archaea” 482 Nature 331-338 (Feb. 16, 2012); Gasiunas et al., “Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria” 109(39) Proceedings of the National Academy of Sciences USA E2579-E2586 (Sep. 4, 2012); Jinek et al., “A Programmable Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity” 337 Science 816-821 (Aug. 17, 2012); Carroll, “A CRISPR Approach to Gene Targeting” 20(9) Molecular Therapy 1658-1660 (September 2012); U.S. Appl. No. 61/652,086, filed May 25, 2012; Al-Attar et al., Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs): The Hallmark of an Ingenious Antiviral Defense Mechanism in Prokaryotes, Biol Chem. (2011) vol. 392, Issue 4, pp. 277-289; Hale et al., Essential Features and Rational Design of CRISPR RNAs That Function With the Cas RAMP Module Complex to Cleave RNAs, Molecular Cell, (2012) vol. 45, Issue 3, 292-302. For methods in which the deaminase fusion proteins are delivered to cells, the proteins can be produced using any method known in the art, e.g., by in vitro translation, or expression in a suitable host cell from nucleic acid encoding the deaminase fusion protein; a number of methods are known in the art for producing proteins. For example, the proteins can be produced in and purified from yeast, Expression Systems To use the deaminase fusion proteins described herein, it may be desirable to express them from a nucleic acid that encodes them. This can be performed in a variety of ways. For example, the nucleic acid encoding the deaminase fusion can be cloned into an intermediate vector for transformation into prokaryotic or eukaryotic cells for replication and/or expression. 
Intermediate vectors are typically prokaryote vectors, e.g., plasmids, or shuttle vectors, or insect vectors, for storage or manipulation of the nucleic acid encoding the deaminase fusion for production of the deaminase fusion protein. The nucleic acid encoding the deaminase fusion protein can also be cloned into an expression vector, for administration to a plant cell, animal cell, preferably a mammalian cell or a human cell, fungal cell, bacterial cell, or protozoan cell. To obtain expression, a sequence encoding a deaminase fusion protein is typically subcloned into an expression vector that contains a promoter to direct transcription. Suitable bacterial and eukaryotic promoters are well known in the art and described, e.g., in Sambrook et al., Molecular Cloning, A Laboratory Manual (3d ed. 2001); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Current Protocols in Molecular Biology (Ausubel et al., eds., 2010). Bacterial expression systems for expressing the engineered protein are available in, e.g., The promoter used to direct expression of a nucleic acid depends on the particular application. For example, a strong constitutive promoter is typically used for expression and purification of fusion proteins. In contrast, when the deaminase fusion protein is to be administered in vivo for gene regulation, either a constitutive or an inducible promoter can be used, depending on the particular use of the deaminase fusion protein. In addition, a preferred promoter for administration of the deaminase fusion protein can be a weak promoter, such as HSV TK or a promoter having similar activity. The promoter can also include elements that are responsive to transactivation, e.g., hypoxia response elements, Gal4 response elements, lac repressor response element, and small molecule control systems such as tetracycline-regulated systems and the RU-486 system (see, e.g., Gossen & Bujard, 1992, Proc. Natl. Acad. Sci. 
USA, 89:5547; Oligino et al., 1998, Gene Ther., 5:491-496; Wang et al., 1997, Gene Ther., 4:432-441; Neering et al., 1996, Blood, 88:1147-55; and Rendahl et al., 1998, Nat. Biotechnol., 16:757-761). In addition to the promoter, the expression vector typically contains a transcription unit or expression cassette that contains all the additional elements required for the expression of the nucleic acid in host cells, either prokaryotic or eukaryotic. A typical expression cassette thus contains a promoter operably linked, e.g., to the nucleic acid sequence encoding the deaminase fusion protein, and any signals required, e.g., for efficient polyadenylation of the transcript, transcriptional termination, ribosome binding sites, or translation termination. Additional elements of the cassette may include, e.g., enhancers, and heterologous spliced intronic signals. The particular expression vector used to transport the genetic information into the cell is selected with regard to the intended use of the deaminase fusion protein, e.g., expression in plants, animals, bacteria, fungus, protozoa, etc. Standard bacterial expression vectors include plasmids such as pBR322 based plasmids, pSKF, pET23D, and commercially available tag-fusion expression systems such as GST and LacZ. Expression vectors containing regulatory elements from eukaryotic viruses are often used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus vectors, and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic vectors include pMSG, pAV009/A+, pMT010/A+, pMAMneo-5, baculovirus pDSVE, and any other vector allowing expression of proteins under the direction of the SV40 early promoter, SV40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown effective for expression in eukaryotic cells. 
The vectors for expressing the deaminase fusion protein can include RNA Pol III promoters to drive expression of the guide RNAs, e.g., the H1, U6 or 7SK promoters. These human promoters allow for expression of deaminase fusion protein in mammalian cells following plasmid transfection. Some expression systems have markers for selection of stably transfected cell lines such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate reductase. High yield expression systems are also suitable, such as using a baculovirus vector in insect cells, with the gRNA encoding sequence under the direction of the polyhedrin promoter or other strong baculovirus promoters. The elements that are typically included in expression vectors also include a replicon that functions in Standard transfection methods are used to produce bacterial, mammalian, yeast or insect cell lines that express large quantities of protein, which are then purified using standard techniques (see, e.g., Colley et al., 1989, J. Biol. Chem., 264:17619-22; Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher, ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, 1977, J. Bacteriol. 132:349-351; Clark-Curtiss & Curtiss, Methods in Enzymology 101:347-362 (Wu et al., eds, 1983)). Any of the known procedures for introducing foreign nucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, polybrene, protoplast fusion, electroporation, nucleofection, liposomes, microinjection, naked DNA, plasmid vectors, viral vectors, both episomal and integrative, and any of the other well-known methods for introducing cloned genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g., Sambrook et al., supra). 
It is only necessary that the particular genetic engineering procedure used be capable of successfully introducing at least one gene into the host cell capable of expressing the deaminase fusion protein. In methods wherein the fusion proteins include a Cas9 domain, the methods also include delivering at least one gRNA that interacts with the Cas9, or a nucleic acid that encodes a gRNA. Alternatively, the methods can include delivering the deaminase fusion protein and guide RNA together, e.g., as a complex. For example, the deaminase fusion protein and gRNA can be overexpressed in a host cell and purified, then complexed with the guide RNA (e.g., in a test tube) to form a ribonucleoprotein (RNP), and delivered to cells. In some embodiments, the deaminase fusion protein can be expressed in and purified from bacteria through the use of bacterial expression plasmids. For example, His-tagged deaminase fusion protein can be expressed in bacterial cells and then purified using nickel affinity chromatography. The use of RNPs circumvents the necessity of delivering plasmid DNAs encoding the nuclease or the guide, or encoding the nuclease as an mRNA. RNP delivery may also improve specificity, presumably because the half-life of the RNP is shorter and there is no persistent expression of the nuclease and guide (as would occur with a plasmid). The RNPs can be delivered to the cells in vivo or in vitro, e.g., using lipid-mediated transfection or electroporation. See, e.g., Liang et al. “Rapid and highly efficient mammalian cell engineering via Cas9 protein transfection.” Journal of biotechnology 208 (2015): 44-53; Zuris, John A., et al. “Cationic lipid-mediated delivery of proteins enables efficient protein-based genome editing in vitro and in vivo.” Nature biotechnology 33.1 (2015): 73-80; Kim et al. “Highly efficient RNA-guided genome editing in human cells via delivery of purified Cas9 ribonucleoproteins.” Genome research 24.6 (2014): 1012-1019. 
The present invention also includes the vectors and cells comprising the vectors, as well as kits comprising the proteins and nucleic acids described herein, e.g., for use in a method described herein. Methods of Use The base editors described herein can be used to deaminate a selected adenine in a nucleic acid sequence, e.g., in a cell, e.g., a cell in an animal (e.g., a mammal such as a human or veterinary subject), or a synthetic nucleic acid substrate. The methods include contacting the nucleic acid with a base editor as described herein. Where the base editor includes a CRISPR Cas9 or Cas12a protein, the methods further include the use of one or more guide RNAs that direct binding of the base editor to a sequence to be deaminated. For example, the base editors described herein can be used for in vitro, in vivo or in situ directed evolution, e.g., to engineer polypeptides or proteins based on a synthetic selection framework, e.g. antibiotic resistance in The invention is further described in the following examples, which do not limit the scope of the invention described in the claims. Methods The following materials and methods were used in the Examples set forth herein. Molecular Cloning Expression plasmids are constructed by selectively amplifying desired DNA sequences by PCR such that they have significant overlapping ends, then using isothermal assembly (or “Gibson Assembly”, NEB) to assemble them in the desired order in a CAG or CMV expression vector. PCR is conducted using Phusion HF polymerase (NEB). Cas9 gRNAs are cloned into the pUC19-based entry vector BPK1520 (via BsmBI) under control of a U6 promoter. Guide RNAs All gRNAs are of the form 5′-NNNNNNNNNNNNNNNNNNNNCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTT-3′ (SEQ ID NO:28). Shown below are the protospacer regions (NNNNNNNNNNNNNNNNNNNN in SEQ ID NO:28) for these gRNAs (all written 5′ to 3′). 
Cas9 guide RNA protospacer, RNF2 site 1: GTCATCTTAGTCATTACCTG (SEQ ID NO:30) Cell Culture and Transfections HEK293T cells (CRL-3216, ATCC) are grown in culture using Dulbecco's Modified Eagle Medium (Gibco) supplemented with 10% FBS (Gibco) and 1% penicillin-streptomycin solution (Gibco). Cells are passaged at ˜80% confluency every 2-3 days to maintain an actively growing population. HepG2 cells (HB-8065, ATCC) are grown in Eagle's Minimum Essential Medium (ATCC) supplemented with 10% FBS and 0.5% penicillin-streptomycin solution (Gibco). Cells are passaged at ˜80% confluency every 4 days. Both cell lines are used for experiments until passage 20 for HEK293T and passage 12 for HepG2. Cells are tested for mycoplasma bi-weekly. For sorting experiments, transfections with 50 μg of transfection quality DNA (Qiagen Maxiprep) encoding desired ABEmax-P2A-EGFP fusion proteins or controls (same construct, lacking TadA-TadA* heterodimer, * marks the engineered variants, e.g. 7.10) and gRNAs (75:25%) are conducted by seeding 6×10^6 HEK293T or 15×10^6 HepG2 cells into TC-treated 150 mm plates 18-24 h prior to transfection to yield ˜80% confluency on the day of transfection. Cells are transfected at 60-80% confluency using TransIT-293 (HEK293T, Mirus) or tranfeX (HepG2, ATCC) reagents according to the manufacturers' protocols. To ensure maximal correlation of negative controls to ABE overexpression, cells of the same passage are transfected with bpNLS-32AAlinker-nCas9-bpNLS (negative control) and adenine base editors in parallel. RNA and gDNA are harvested after cell sorting. For experiments validating DNA on-target activity of ABE or ABEmax-RRE variants, 1.5×10^4 HEK293T cells are seeded into the wells of a 96-well plate and transfected 18-24 h after seeding with 220 ng DNA (ABE/nCas9-NLS control:gRNA ratio of 75:25%). For these experiments, gDNA is harvested 72 h post-transfection. 
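The gRNA layout given above (a 20-nucleotide protospacer followed by the fixed remainder of SEQ ID NO:28) can be illustrated with a minimal Python sketch. The scaffold string is copied from the sequence as written in the text, in DNA letters; the function name build_grna and the validation rules are assumptions introduced only for this illustration and are not part of the described methods.

```python
# Fixed portion of SEQ ID NO:28 as written in the text (everything after the 20 N's).
SCAFFOLD = (
    "CGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTC"
    "CGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTT"
)


def build_grna(protospacer: str) -> str:
    """Return protospacer + scaffold, written 5' to 3' (hypothetical helper)."""
    seq = protospacer.strip().upper()
    # The text specifies a 20-nt protospacer region; reject anything else.
    if len(seq) != 20 or set(seq) - set("ACGT"):
        raise ValueError("protospacer must be exactly 20 nt of A, C, G, T")
    return seq + SCAFFOLD


# RNF2 site 1 protospacer (SEQ ID NO:30) from the text.
rnf2_site1_grna = build_grna("GTCATCTTAGTCATTACCTG")
```

Calling build_grna with the RNF2 site 1 protospacer yields the full-length sequence with that protospacer in place of the N positions.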
FACS & RNA/DNA Harvest Sorting of negative control and BE expressing cells as well as RNA/DNA harvest is carried out on the same day. Cells are sorted on a BD FACSAria II 36-40 h after transfection. We gate on the cell population on forward/side scatter after exclusion of doublets. We then sort all GFP-positive cells and/or top 5% of cells with the highest FITC signal into pre-chilled 100% FBS and 5% of mean fluorescence intensity (MFI)-matched cells for nCas9-NLS negative controls, matching the MFI/GeoMean of top 5% of ABE or ABEmax-transfected cells. We use MFI-matching for these controls, as the bpNLS-32AAlinker-nCas9-bpNLS-P2A-EGFP (control) plasmid is smaller than ABEmax-P2A-EGFP (due to the lack of the TadA-TadA* heterodimer) and thus yields higher transfection efficiency and overall higher FITC signal. After sorting, cells are spun down and lysed using DNA lysis buffer (Laird et al., 1991) with DTT and Proteinase K or RNA lysis buffer (Macherey-Nagel). gDNA is extracted using magnetic beads (made from FisherSci Sera-Mag SpeedBeads Carboxyl Magnetic Beads, hydrophobic according to Rohland & Reich, 2012), after overnight lysis. RNA is then extracted with Macherey-Nagel's NucleoSpin RNA Plus kit. High-throughput Amplicon Sequencing, RT-PCR & Base Editing Data Analysis Genomic DNA is amplified using gene-specific DNA primers flanking the desired target sequence. These primers include Illumina-compatible adapter-flaps. The amplicons are molecularly indexed with NEBNext Dual Index Primers (NEB) or index primers with the same or similar sequence ordered from IDT. Samples are combined into libraries and sequenced on the Illumina MiSeq machine using the MiSeq Reagent Kit v2 or Micro Kit v2 (Illumina). Sequencing results are analyzed using a batch version of the software CRISPResso 2.0 (crispresso.rocks). Reverse transcription is performed using the High Capacity RNA-to-cDNA kit (Thermo Fisher) following the manufacturer's instructions. 
Amplicon PCR and library preparation for Next-Generation Sequencing (NGS) off of cDNA is done as described above for gDNA. If possible, we use exon-exon junction spanning primers to exclude amplification of gDNA. RNA-seq and Single Nucleotide Variant Calling RNA library preparation is performed using Illumina's TruSeq Stranded Total RNA Gold Kit with initial input of ˜500 ng of extracted RNA per sample, using SuperScript III for first-strand synthesis (Thermo Fisher). rRNA depletion is confirmed during library preparation by fluorometric quantitation using the Qubit HS RNA kit before and after depletion (Thermo Fisher). For indexing, we use IDT-Illumina Unique Dual Indexes (Illumina). Libraries are pooled based on qPCR quantification (NEBNext Library Quant Kit for Illumina) and loaded onto a NextSeq (at MGH Cancer Center, PE 2x150, 500/550 MidOutput Cartridge) or HiSeq2500 in High Output mode (Broad Institute, PE 2x76). Illumina fastq sequencing reads are aligned to the human hg38 reference genome with STAR (Dobin et al., 2013, PMID: 23104886) and processed with GATK best practices (McKenna et al., 2010, PMID: 20644199; DePristo et al., 2011, PMID: 21478889). RNA variants are called using HaplotypeCaller, and empirical editing efficiencies are established on PCR-de-duplicated alignment data. Variant loci in ABE/ABEmax overexpression experiments are further required to have comparable read coverage in the corresponding control experiment (read coverage for SNV in control >90th percentile of read coverage across all SNVs in overexpression). Additionally, the above loci are required to have a consensus of at least 99% of reads calling the reference allele in control. 
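The two control-based requirements stated above (control coverage above the 90th percentile of coverage across all SNVs in the overexpression experiment, and at least 99% reference-allele reads in control) can be sketched in Python as follows. This is an illustrative sketch only, not the pipeline actually used: the dictionary-based inputs, the function names, and the nearest-rank percentile definition are assumptions, since the text does not specify how the percentile is computed.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile (an assumed definition; the text does not name one)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def filter_loci(overexpr_cov, control_cov, control_ref_frac, min_ref_frac=0.99):
    """Keep loci that satisfy both control-based criteria.

    overexpr_cov:     locus -> read coverage in the overexpression experiment
    control_cov:      locus -> read coverage in the matched control
    control_ref_frac: locus -> fraction of control reads calling the reference allele
    """
    threshold = percentile_nearest_rank(list(overexpr_cov.values()), 90)
    kept = []
    for locus in overexpr_cov:
        enough_coverage = control_cov.get(locus, 0) > threshold
        clean_control = control_ref_frac.get(locus, 0.0) >= min_ref_frac
        if enough_coverage and clean_control:
            kept.append(locus)
    return kept
```

A locus missing from the control inputs is treated as having zero coverage and is therefore dropped, which is a conservative choice made for this sketch.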
Protein Structure Analysis and DNA/RNA Binding Prediction We access the crystal structures of Alignment of tRNA Adenosine Deaminase Homologues and Orthologues The amino acid sequence of To test whether ABEs might be capable of editing adenines in RNA, we assessed whether this base editor fusion could edit adenines transcriptome-wide using RNA-seq. To do this, we transfected human HEK293T cells with a plasmid that expressed an ABEmax-P2A-EGFP fusion protein (the P2A sequence mediates a post-translational cleavage that releases EGFP from the ABEmax part of the fusion) (Methods). At 36 hours after transfection, we then used flow cytometry to sort out the cells with the highest (top 5%) GFP/FITC signal and isolated total RNA from these cells. As a negative control, we transfected HEK293T cells in parallel with a plasmid that expressed a bpNLS-32AAlinker-nickase Cas9 (nCas9)-bpNLS-P2A-EGFP (called nCas9-NLS below) fusion protein (i.e., a plasmid identical to the ABEmax-P2A-EGFP expression plasmid but lacking the TadA-TadA* heterodimer within the ABEmax part of the fusion protein) and also sorted these for the top 5% GFP signal and isolated total RNA. We used a gRNA targeting a genomic site in the RNF2 gene and on-target DNA base editing was high (˜70% A-to-G, data not shown). Using RNA-seq, we found that ABEmax edited tens of thousands of adenosines in RNA with high efficiency ( Total transcriptome-wide numbers of edited adenosines in different biological replicates Cells were transfected 18-24 h after seeding and sorted 36-40 h after transfection for top 5% FITC signal (Methods). These edited As were distributed throughout the human genome and had considerable editing efficiencies ( Given the transcriptome-wide RNA editing induced by ABEmax, it is desirable to create variants of the adenine base editor that would diminish this unwanted activity while retaining the desired capability to perform targeted DNA base editing (RRE or Reduced RNA Editing variants). 
We reasoned that the introduction of mutations into the TadA-TadA* part (* marking the engineered variant of Methods: The following materials and methods were used in Example 3. PyMOL Analysis of TadA structures. Plasmid cloning. All ABE constructs (reported in Supplementary Table 1) were cloned using the backbone and the P2A-EGFP-NLS fragment of ABEmax-P2A-EGFP-NLS (AgeI/NotI digest; Addgene ID 112101). ABEmax and variants were expressed under the control of a pCMV promoter. For the P2A-EGFP fragments in these constructs, we used BPK4335 (pCMV-BE3-P2A-EGFP) as a template. Guide RNA (gRNA) plasmids were cloned using the SpCas9 gRNA entry vector BPK1520 (pUC19 backbone; BsmBI cassette, Addgene ID 65777). All remaining constructs were generated using isothermal assembly (Gibson assembly, NEB). All gRNA and ABE plasmids were midi or maxi prepped using the Qiagen Midi/Maxi Plus kits. Cell culture. HEK293T cells (CRL-3216) and HepG2 cells (HB-8065) were purchased from and STR-authenticated by ATCC. Cells were cultured in Dulbecco's Modified Eagle Medium (DMEM, Gibco) supplemented with 10% (v/v) fetal bovine serum (FBS, Gibco) and 1% (v/v) penicillin-streptomycin (Gibco) or Eagle's Minimum Essential Medium with 10% (v/v) FBS and 0.5% (v/v) penicillin. Cells were passaged every 2-3 days when reaching around 80-90% confluency. Both cell lines were used only until passage 20 for all experiments, and the media was tested every two weeks for mycoplasma. Transfections. For ABE DNA on-target screening experiments, 2×10^4 HEK293T cells were seeded into 96-well Flat Bottom Cell Culture plates (Corning), transfected 24 h post seeding with 165 ng base editor or negative control (bpNLS-32AA linker-nCas9(D10A)-bpNLS), 55 ng guide RNA expression plasmid, and 0.66 μL TransIT-293 (Mirus), and harvested 72 h after transfection for DNA. 
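As a worked illustration of the on-target screening amounts above, 165 ng of base editor plasmid plus 55 ng of gRNA plasmid is a 75:25 mass split of 220 ng total, and 0.66 μL of TransIT-293 for 220 ng corresponds to 3 μL of reagent per μg of DNA. The Python sketch below reproduces that arithmetic. The function names are hypothetical, and the 3 μL/μg figure is derived here from the stated numbers rather than quoted from the text or from a manufacturer's protocol.

```python
def split_plasmid_mass(total_ng: float, editor_fraction: float = 0.75):
    """Split total plasmid mass into (editor_ng, grna_ng) at the given ratio."""
    if not 0.0 < editor_fraction < 1.0:
        raise ValueError("editor_fraction must be between 0 and 1")
    editor_ng = total_ng * editor_fraction
    return editor_ng, total_ng - editor_ng


def reagent_volume_ul(total_ng: float, ul_per_ug: float = 3.0) -> float:
    """Transfection reagent volume, assuming a fixed reagent-to-DNA ratio."""
    return total_ng / 1000.0 * ul_per_ug


# 96-well on-target screen described in the text: 220 ng total DNA.
editor_ng, grna_ng = split_plasmid_mass(220)
```

For 220 ng total, the split gives 165 ng and 55 ng, matching the amounts stated for the 96-well format.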
For ABE RNA off-target screening experiments, 2×10^5 HEK293T cells were seeded into 12-well Cell Culture plates (Corning), transfected 24 h post seeding with 1.65 μg base editor or negative control, 0.55 μg guide RNA, and 6.6 μL TransIT-293, and harvested 36 h after transfection for RNA. For experiments with FACS-sorted cells, 6.5-7×10^6 HEK293T cells were seeded into 150 mm Cell Culture dishes (Corning), transfected 24 h post seeding with 37.5 μg base editor or an appropriate negative control fused to P2A-EGFP, 12.5 μg guide RNA, and 150 μL TransIT-293. Sorting took place 36-40 h post transfection. Fluorescence-activated cell sorting (FACS). Cells were prepared for sorting by diluting to 1×10^7 cells per ml with 1× Phosphate Buffered Saline (PBS, Corning) supplemented with 10% FBS and filtering through 35 μm cell strainer caps (Corning). Cells were sorted on a FACSAria II (BD Biosciences) using FACSDiva version 6.1.3 (BD Biosciences) after gating for single live cells. Cells treated with base editor were sorted for either all GFP signal (standard expression) or top 5% of cells with the highest GFP (FITC) signal (overexpression) into FBS; cells treated with nCas9 negative controls were sorted for either all GFP positive cells or the 5% of cells with a mean fluorescence intensity (MFI) matching that of the top 5% of cells treated with base editor. DNA extraction. For ABE DNA on-target experiments, cells were lysed for DNA 72 h post-transfection with freshly prepared 43.5 μL DNA lysis buffer (50 mM Tris HCl pH 8.0, 100 mM NaCl, 5 mM EDTA, 0.05% SDS, adapted from ref. 15), 5.25 μL Proteinase K (NEB), and 1.25 μL 1M DTT (Sigma). For experiments with sorted cells, cells were centrifuged (200 g, 8 min) and lysed with 174 μL DNA lysis buffer, 21 μL Proteinase K, and 5 μL 1M DTT. Lysates were incubated at 55° C. on a plate shaker overnight, then gDNA was extracted with 2× paramagnetic beads (as described in ref. 
16), washed 3 times with 70% EtOH, and eluted in 30 μL 0.1× EB buffer (Qiagen). RNA extraction & reverse transcription. Cells were lysed for RNA 36 h-40 h post-transfection with 350 μL RNA lysis buffer LBP (Macherey-Nagel), and RNA was extracted with the NucleoSpin RNA Plus kit (Macherey-Nagel) following the manufacturer's instructions. RNA was then reverse transcribed into cDNA with the High Capacity RNA-to-cDNA kit (Thermo Fisher) following the manufacturer's instructions. Library preparation for DNA or cDNA targeted amplicon sequencing. Next-generation sequencing (NGS) of DNA or cDNA was performed as previously described (ref. 5). In summary, the first PCR was performed to amplify genomic or transcriptomic sites of interest with primers containing Illumina forward and reverse adapter sequences (see RNA library preparation & sequencing. RNA-seq experiments were performed as previously described (ref. 5). Briefly, RNA libraries were prepared with the TruSeq Stranded Total RNA Library Prep Gold kit (Illumina) following the manufacturer's instructions. SuperScript III (Invitrogen) was used for first-strand synthesis, and IDT for Illumina TruSeq RNA unique dual indexes (96 indexes) were used to avoid index hopping. The libraries were pooled based on qPCR measurements with the NEBNext Library Quant Kit for Illumina. The final pool was sequenced PE 2×76 on the Illumina HiSeq2500 machine (for the ABE experiment shown in Amplicon sequencing analysis. Amplicon sequencing data was analyzed with CRISPResso2 v.2.0.27 (ref. 17). The heatmaps for the SECURE-ABE screening in RNA Variant Calling Pipeline All bioinformatic analysis was performed in concordance with GATK Best Practices (refs. 18, 19) for RNA-seq mutation calling as we have previously described (ref. 5). Briefly, raw sequencing reads were two-pass aligned to the hg38 reference genome with STAR (ref. 20) with parameters to discard multi-mapping reads. 
After PCR duplicate removal and base recalibration, mutations in RNA-seq libraries were called using GATK HaplotypeCaller. RNA edits in ABE overexpression experiments were identified using a downstream modification of the GATK pipeline output as we have previously described (ref. 5). Specifically, mutation positions called by HaplotypeCaller were further filtered to include only those satisfying the following criteria with reference to the corresponding control experiments: (1) Read coverage for a given edit in control experiment should be greater than the 90th percentile of read coverage across all edits in the overexpression experiment. (2) 99% of reads covering each edit in the control experiment were required to contain the reference allele. Edits were further filtered to exclude those with fewer than 10 reads or 0% alternate allele frequencies. A-G edits include A-G edits identified on the positive strand as well as T-C edits identified on the negative strand. Six A-to-I edits identified from the above pipeline were chosen to test SECURE ABE variants based on the following criteria. These were sites that had (1) read coverage of at least 50 in all replicates of control and overexpression experiments, (2) 99% reads in all control experiments containing reference allele and (3) at least 60% alternate allele frequencies in all replicates. From this list, primers were tested for the top 15 edited sites that were also within 150 bases of an exon-exon junction and the 6 highest edited sites with robust amplification from cDNA were chosen. To identify self-edits occurring on the base-editing construct, we generated a modified hg38 reference genome with additional contigs for the gRNA and base editor constructs. These additional contigs were appended to the reference genome, and each library was re-processed using GATK best practices, including variant calling with HaplotypeCaller. 
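The three criteria given above for choosing the A-to-I test sites (coverage of at least 50 in all control and overexpression replicates, at least 99% reference-allele reads in all controls, and at least 60% alternate allele frequency in all replicates) can be sketched in Python as follows. The Site record, its field names, and the ranking by mean alternate allele frequency are assumptions made for illustration, and criterion (3) is interpreted here as applying to the overexpression replicates. The exon-exon junction and cDNA amplification checks described in the text are experimental steps and are not modeled.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Site:
    """Per-replicate measurements for one candidate edit (hypothetical record)."""
    name: str
    control_cov: List[int]
    overexpr_cov: List[int]
    control_ref_frac: List[float]
    overexpr_alt_frac: List[float]


def passes_criteria(site: Site) -> bool:
    """Apply criteria (1)-(3) from the text to every replicate."""
    covered = all(c >= 50 for c in site.control_cov + site.overexpr_cov)
    clean_control = all(f >= 0.99 for f in site.control_ref_frac)
    well_edited = all(f >= 0.60 for f in site.overexpr_alt_frac)
    return covered and clean_control and well_edited


def rank_candidates(sites: List[Site]) -> List[Site]:
    """Return passing sites, most highly edited first (ranking rule assumed)."""
    passing = [s for s in sites if passes_criteria(s)]
    return sorted(passing, key=lambda s: mean(s.overexpr_alt_frac), reverse=True)
```

The ranked list would then be the starting point for the primer testing step described in the text.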
Variants were then further filtered using a similar process as described above for the transcriptome (i.e. filtering for no more than 1% editing in the negative control) with the exception that positions poorly covered in the control due to differences in the construct design (i.e. the deaminase domain) were not filtered out. We note that since both control and BE constructs were expressed from plasmids, the overall expression of these transcripts is much higher than most detected genes which supersedes the control of coverage between control and BE expression in this analysis (see part 1 of transcriptome variant calling above). Editing efficiencies per position were computed based on the abundance of Gs (ABE) over total coverage from bam-readcount estimated on the PCR deduplicated .bam files. Edits were further filtered to exclude those with fewer than 50 reads or 0% alternate allele frequencies. Results To engineer SECURE-ABE variants, we first used a protein truncation strategy to reduce the RNA recognition capability of the widely used ABEmax fusion. ABEmax harbors a single-chain heterodimer of the wild type (WT) We used RNA-seq to compare the transcriptome-wide off-target RNA editing activities of miniABEmax to ABEmax in HEK293T cells. Both editors and a nickase Cas9 (nCas9) control were each assayed in biological triplicate with three different gRNAs: two targeted to endogenous human gene sites (HEK site 2 and ABE site 16) (ref. 3) and one to a site that does not occur in the human genome (NT) (ref. 5). We performed these studies by sorting for GFP-positive cells (ABEmax was expressed as a P2A fusion with the base editor or nCas9 (Methods)). 
As an internal control, we first confirmed that ABEmax and miniABEmax both induced comparable on-target DNA editing efficiencies with HEK site 2 and ABE site 16 gRNAs ( We reasoned we might further reduce the off-target RNA editing activity of miniABEmax by altering amino acid residues within the remaining engineered We generated a total of 34 miniABEmax variants bearing various substitutions at the amino acid positions described above and screened each editor for on-target DNA editing and off-target RNA editing activities in HEK293T cells. To assess on-target DNA editing, we examined the efficiencies of A-to-G edits induced by each of the 34 variants with four gRNAs targeted to different endogenous gene sequences. To screen for off-target RNA editing activities, we quantified editing by each of the 34 variants at six RNA adenines using standard plasmid expression conditions (i.e., without sorting for GFP expression; see Methods); these six adenines were previously identified as being highly edited with ABEmax overexpression in HEK293T cells8. These experiments revealed that 23 of the 34 variants induced robust on-target DNA editing at least comparable to that observed with miniABEmax and ABEmax ( We characterized the transcriptome-wide off-target RNA editing profiles of the miniABEmax K20A/R21A and V82G variants using RNA-seq. The two variants were assessed in biological triplicate with the HEK site 2, ABE site 16, and NT gRNAs. In contrast to what we observed with miniABEmax, the K20A/R21A and V82G variants both induced substantially reduced numbers of edited adenines relative to ABEmax but still approximately four-fold and three-fold higher numbers, respectively, than background (determined with the GFP-only negative control) ( Finally, given their abilities to edit RNA transcripts, we wondered whether ABEs might also self-edit their own transcripts, thereby potentially generating a set of heterogeneous base editor proteins. 
To assess this, we applied our analysis pipeline to quantify self-edit events in our previously published RNA-seq data (ref. 5) obtained with BE3 expressed at standard or overexpression levels in HEK293T cells. These data showed ABEmax and miniABEmax both inducing dozens (29 to 67) of A-to-I changes throughout their own transcripts with editing efficiencies ranging from 7.3% to 58.7% among replicates performed with three different gRNAs ( To screen for additional SECURE-ABE variants with minimized unwanted RNA editing activities that maintain efficient DNA on-target editing, we engineered 30 more miniABEmax variants and assessed their DNA and RNA editing efficiencies. In this second screen, we included two SECURE-ABE variants (miniABEmax-K20A/R21A and -V82G) with reduced RNA off-target editing. DNA on-target editing was examined with four gRNAs targeted to different endogenous gene sequences (HEK site 2, ABE site 2, site 3 and site 4), and 25 out of 30 variants induced DNA editing comparable to that observed with miniABEmax and ABEmax. RNA off-target editing was examined on six RNA sites that were previously identified to be highly edited with ABEmax and were used for the first round of screening, and 24 out of 30 variants showed reduced RNA editing compared to miniABEmax on all 6 sites tested. Based on both DNA and RNA editing profiles, miniABEmax-W11A, -K110A, and -D139A/E140A showed the most promising characteristics to become SECURE-ABE variants, while A58G, N72A, V106W, K110A, H128A/R129A, A138W, D139A/E140A, A142W, F148A, and R150A all showed promising reductions of RNA off-targets as well, with reasonably maintained DNA on-target editing capabilities. 1 Rees, H. A. & Liu, D. R. Base editing: precision chemistry on the genome and transcriptome of living cells. 2 Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. & Liu, D. R. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420-424, doi:10.1038/nature17946 (2016). 
3 Gaudelli, N. M. et al. Programmable base editing of A*T to G*C in genomic DNA without DNA cleavage. 4 Nishida, K. et al. Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. 5 Grunewald, J. et al. Transcriptome-wide off-target RNA editing induced by CRISPR-guided DNA base editors. 6 Wolf, J., Gerber, A. P. & Keller, W. tadA, an essential tRNA-specific adenosine deaminase from 7 Koblan, L. W. et al. Improving cytidine and adenine base editors by expression optimization and ancestral reconstruction. 8 Kim, J. et al. Structural and kinetic characterization of 9 Losey, H. C., Ruthenburg, A. J. & Verdine, G. L. Crystal structure of 10 Wang, X. et al. Efficient base editing in methylated regions with a human APOBEC3A-Cas9 fusion. 11 Gehrke, J. M. et al. An APOBEC3A-Cas9 base editor with minimized bystander and off-target activities. 12 Komor, A. C. et al. Improved base excision repair inhibition and bacteriophage Mu Gam protein yields C:G-to-T:A base editors with higher efficiency and product purity. 13 Sharma, S., Patnaik, S. K., Kemer, Z. & Baysal, B. E. Transient overexpression of exogenous APOBEC3A causes C-to-U RNA editing of thousands of genes. 14 Fritz, E. L. et al. A comprehensive analysis of the effects of the deaminase AID on the transcriptome and methylome of activated B cells. 15 Laird, P. W. et al. Simplified mammalian DNA isolation procedure. 16 Rohland, N. & Reich, D. Cost-effective, high-throughput DNA sequencing libraries for multiplexed target capture. 17 Clement, K. et al. CRISPResso2 provides accurate and rapid genome editing sequence analysis. 18 McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. 19 DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. 20 Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. 
It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
CROSS-REFERENCE TO RELATED APPLICATIONS
FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
SEQUENCE LISTING
TECHNICAL FIELD
BACKGROUND
SUMMARY
DESCRIPTION OF DRAWINGS
DETAILED DESCRIPTION
Amino acid substitutions predicted to generate ABE variants with reduced RNA editing. This table lists the residue changes in either or both TadA domains of the TadA-heterodimer (present in e.g. ABE7.10) predicted to cause an RRE phenotype, next to the reasoning behind the proposed changes. Residues to Change Rationale Wild type (WT) Engineered Protein Binding TadA TadA structure prediction S7 S205 x H8 H206 x E9 E207 x Y10 Y208 x W11 W209 x M12 M210 x R13 R211 x x H14 H212 x T17 T215 x K20 K218 x x R21 R219 x x W23 R221 x E25 E223 x x R26 R224 x x E27 E225 x V28 V226 x x P29 P227 x V30 V228 x G31 G229 x H36 L234 x N37 N235 x N38 N236 x N46 N244 x R47 R245 x P48 A246 x I49 I247 x G50 G248 x R51 I249 x H52 H250 x D53 D251 x P54 P252 x T55 T253 x A56 A254 x H57 H255 x x A58 A256 x E59 E257 x R64 R262 x Q65 Q263 x G67 G265 x L68 L266 x Q71 Q269 x N72 N270 x R74 R272 x I76 I274 x D77 D275 x Y81 Y279 x V82 V280 x T83 T281 x L84 F282 x E85 E283 x P86 P284 x x C87 C285 x x V88 V286 x M89 M287 x C90 C288 x x R98 R296 x G100 G298 x R101 R299 x A106 V304 x R107 R305 x D108 N306 x A109 A307 x K110 K308 x T111 T309 x D119 D317 x H122 H320 x H123 Y321 x P124 P322 x G125 G323 x M126 M324 x N127 N325 x H128 H326 x R129 R327 x V130 V328 x E131 E329 x I132 I330 x T133 T331 x E134 E332 x G135 G333 x L137 L335 x A138 A336 x x D139 D337 x E140 E338 x C141 C339 x x A142 A340 x x A143 A341 x x L144 L342 x L145 L343 x x S146 C344 x D147 Y345 x F148 F346 x x F149 F347 x x R150 R348 x x M151 M349 x R152 P350 x x R153 R351 x Q154 Q352 x E155 V353 x x I156 F354 x K157 N355 x K160 K358 x K161 K359 x Exemplary TadA proteins. Some or all residues listed in Table A as well as combinations thereof might also be introduced in any of these TadA orthologues or tRNA adenosine deaminase homologues (same proteins were aligned in FIG. 5). tRNA-specific Uniprot adenosine accession Sequence Seq. 
deaminase number version # ID P68398 2 1 Q99W51 1 2 Q5XE14 2 3 Q8XGY4 2 4 O67050 1 5 O94642 2 6 P53065 1 7 P47058 1 8 Q6IDB6 1 9 Q4V7V8 1 10 Q0P4H0 1 11 Q5RIV4 2 12 Q5E9J7 1 13 Q6P6J0 1 14 Q7Z6V5 1 15 (SEQ ID NO: 1) MSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIG RHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIG RVVFGARDAKTGAAGSLMDVLHHPGMNHRVEITEGILADECAALLSDFFR MRRQEIKAQKKAQSSTD. (SEQ ID NO: 21) SEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGL HDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGR VVFGVRNAKTGAAGSLMDVLHYPGMNHRVEITEGILADECAALLCYFFRM PRQVFNAQKKAQSSTD. (SEQ ID NO: 22) MSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIG RHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIG RVVFGARDAKTGAAGSLMDVLHHPGMNHRVEITEGILADECAALLSDFFR MRRQEIKAQKKAQSSTDSGGSSGGSSGSETPGTSESATPESSGGSSGGSS EVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGLH DPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRV VFGVRNAKTGAAGSLMDVLHYPGMNHRVEITEGILADECAALLCYFFRMP RQVFNAQKKAQSSTD. List of Exemplary Cas9 or Cas12a Orthologs UniProt or GenBank Nickase Mutations/Catalytic Ortholog Accession Number residues Q99ZW2.1 D10A, E762A, H840A, N854A, (SpCas9) N863A, D986A17 J7RUA5.1 D10A and N58018 (SaCas9) G3ECR1.2 D31A and N891A19 (St1Cas9) BAK30384.1 D10, H599* (SpaCas9) Q0P897.1 D8A, H559A20 (CjCas9) A0Q5Y3.1 D11, N99521 (FnCas9) A7HP89.1 D8, H601* (PlCas9) G1UFN3.1 D7, H567* Q9CLT2.1 A0Q7Q2.1 D917, E1006, D125521 (FnCpf1) WP_052585281.1 D986A** (MbCpf1) A. sp. BV3L6 Cpf1 U2UMQ6.1 D908, 993E, Q1226, D126323 (AsCpf1) A0A182DWE3.1 D832A24 (LbCpf1) *predicted based on UniRule annotation on the UniProt database. 
**Unpublished but deposited at addgene by Ervin Welker: pTE4565 (Addgene plasmid # 88903) List of Exemplary High Fidelity and/or PAM-relaxed RGN Orthologs Published HF/PAM-RGN variants PMID Mutations* 26628643 K810A/K1003A/R1060A (1.0); Cas9 (SpCas9) K848A/K1003A/R1060A(1.1) eSpCas9 29431739 M495V/Y515N/K526E/R661Q; Cas9 (SpCas9) (M495V/Y515N/K526E/R661S; evoCas9 M495V/Y515N/K526E/R661L) 26735016 N497A/R661A/Q695A/Q926A Cas9 (SpCas9) HF1 30082871 R691A Cas9 (SpCas9) HiFi Cas9 28931002 N692A, M694A, Q695A, H698A Cas9 (SpCas9) HypaCas9 30082838 F539S, M763I, K890N Cas9 (SpCas9) Sniper-Cas9 29512652 A262T, R324L, S409I, E480K, E543D, M694I, Cas9 (SpCas9) E1219V xCas9 30166441 R1335V, L1111R, D1135V, G1218R, Cas9 (SpCas9) E1219F, A1322R, T1337R SpCas9-NG 26098369 D1135V, R1335Q, T1337R; Cas9 (SpCas9) D1135V/G1218R/R1335E/T1337R VQR/VRER 26524662 E782K/N968K/R1015H (SaCas9)-KKH enAsCas12a USSN One or more of: E174R, S170R, S542R, K548R, 15/960,271 K548V, N551R, N552R, K607R, K607H, e.g., E174R/S542R/K548R, E174R/S542R/K607R, E174R/S542R/K548V/N552R, S170R/S542R/K548R, S170R/E174R, E174R/S542R, S170R/S542R, E174R/S542R/K548R/N551R, E174R/S542R/K607H, S170R/S542R/K607R, or S170R/S542R/K548V/N552R enAsCas12a-HF USSN One or more of: E174R, S542R, K548R, e.g., 15/960,271 E174R/S542R/K548R, E174R/S542R/K607R, E174R/S542R/K548V/N552R, S170R/S542R/K548R, S170R/E174R, E174R/S542R, S170R/S542R, E174R/S542R/K548R/N551R, E174R/S542R/K607H, S170R/S542R/K607R, or S170R/S542R/K548V/N552R, with the addition of one or more of: N282A, T315A, N515A and K949A enLbCas12a(HF) USSN One or more of T152R, T152K, D156R, D156K, 15/960,271 Q529K, G532R, G532K, G532Q, K538R, K538V, D541R, Y542R, M592A, K595R, K595H, K595S or K595Q, e.g., D156R/G532R/K538R, D156R/G532R/K595R, D156R/G532R/K538V/Y542R, T152R/G532R/K538R, T152R/D156R, D156R/G532R, T152R/G532R, D156R/G532R/K538R/D541R, D156R/G532R/K595H, T152R/G532R/K595R, T152R/G532R/K538V/Y542R, optionally with the addition of one or more of: N260A, 
N256A, K514A, D505A, K881A, S286A, K272A, K897A enFnCas12a(HF) USSN One or more of T177A, K180R, K180K, E184R, 15/960,271 E184K, T604K, N607R, N607K, N607Q, K613R, K613V, D616R, N617R, M668A, K671R, K671H, K671S, or K671Q, e.g., E184R/N607R/K613R, E184R/N607R/K671R, E184R/N607R/K613V/N617R, K180R/N607R/K613R, K180R/E184R, E184R/N607R, K180R/N607R, E184R/N607R/K613R/D616R, E184R/N607R/K671H, K180R/N607R/K671R, K180R/N607R/K613V/N617R, optionally with the addition of one or more of: N305A, N301A, K589A, N580A, K962A, S334A, K320A, K978A *predicted based on UniRule annotation on the UniProt database. EXAMPLES
Example 1. Adenine Base Editors (ABEmax) Comprised of
Total numbers of A-to-I RNA edits induced by ABEmax overexpression

Cell Line  Guide RNA    Replicate No.  A-to-I mutations in RNA
                                       % of detected variants  total
293T       RNF2, site1  #1             99.76                   37,061
                        #2             99.79                   31,821
                        #3             99.83                   28,752

Example 2. ABE Variants with Reduced RNA Editing Activities
Example 3. CRISPR Adenine and Cytosine Base Editors with Reduced Self-Editing and RNA Off-Target Activities
Extended Data Table 1. Summary of numbers of RNA edits observed in all RNA-seq experiments

Each group line gives the FIG., Cell, editor, gRNA, and Sort entries as far as they are legible; the replicate rows beneath it give the counts.

  Replicate  A-to-I (for A )    Other  A-to-I or
             or C-to-U (for C )        C-to-U (%)

FIG. 1b, HEK293T, ABEmax, HEK site 2, Top 5%
  Rep. 1     37,061             8      99.763

FIG. 1c & d, HEK293T, ABEmax, A site 16, All GFP
  Rep. 1     29,099             197    99.904
  Rep. 2     26,571             29     99.112
  Rep. 3     25,948             298    99.125

A site 16, All GFP
  Rep. 1     23,187             216    99.077
  Rep. 2     12,997             202    99. 87
  Rep. 3     19,907             232    96.948

miniABEmax-V82G, A site 16, All GFP
  Rep. 1     1,376              292    82.494
  Rep. 2     1,33               291    82.136
  Rep. 3     1,896              295    86.

miniABEmax-V82G, A site 16, All GFP
  Rep. 1     928                243    79.299
  Rep. 2     1,224              336    76.402
  Rep. 3     1,159              209    81.162

HEK293T, ABEmax, HEK site 2, All GFP
  Rep. 1     16,049             201    96.709
  Rep. 2     27,706             246    99.120
  Rep. 3     29,597             199    9 .9 2

miniABEmax, HEK site 2, All GFP
  Rep. 1     10,047             291    97.752
  Rep. 2     2 ,552             251    99.064
  Rep. 3     29,941             177    99.410

miniABEmax-K20A/R21A, HEK site 2, All GFP
  Rep. 1     1,000              298    81.842
  Rep. 2     2,202              83     85.189
  Rep. 3     2,069              315    96.297

miniABEmax-V82G, HEK site 2, All GFP
  Rep. 1     971                218    81. 0
  Rep. 2     1,654              323    80.241
  Rep. 3     1,172              279    0.9 9

HEK293T, ABEmax, NT, All GFP
  Rep. 1     15,909             202    96.74
  Rep. 2     31,521             229    99.279
  Rep. 3     24,326             196    99.201

miniABEmax, NT, All GFP
  Rep. 1     9,748              379    95.647
  Rep. 2     29,540             244    90.1 1
  Rep. 3     25,426             261    96.984

miniABEmax-K20A/R21A, NT, All GFP
  Rep. 1     690                206    77.009
  Rep. 2     2,102              25     9 .00
  Rep. 3     2,191              205    89.210

miniABEmax-V82G, NT, All GFP
  Rep. 1     762                143    84.199
  Rep. 2     1, 34              304    54.314
  Rep. 3     1,592              282    84.871

FIG. 2b, HEK293T, GFP, —, All GFP
  Rep. 1     423                2      67.680
  Rep. 2     270                175    80.674
  Rep. 3     63                 158    96.362

GFP, —, MFI-matched to top 5% expression
  Rep. 1     31                 1 1    19.136

hA3A-, RNF2, Top 5%
  Rep. 1     0,425              6      99.974
  Rep. 2     27,190             8      99.971
  Rep. 3     32,402             11     99.966

A3A-, RNF2, Top 5%
  Rep. 1     99                 101    49.246
  Rep. 2     72                 67     45.2 0
  Rep. 3     11                 79     59.162

hAID-, RNF2, Top 5%
  Rep. 1     45                 201    19.299
  Rep. 2     34                 144    19.101
  Rep. 3     70                 234    23.026

indicates data missing or illegible when filed
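The legible rows of Extended Data Table 1 are consistent with the percentage column being the share of A-to-I (or C-to-U) edits among all detected variants, that is, 100 × edits / (edits + other). The minimal Python sketch below reproduces that arithmetic for two legible rows. The relationship is inferred from the table values rather than stated in the text, and the function name `edit_percentage` is illustrative only.

```python
def edit_percentage(target_edits: int, other_edits: int) -> float:
    """Share of target-type edits (A-to-I or C-to-U) among all detected variants, in percent.

    Assumption: percentage = 100 * target / (target + other), inferred from the table.
    """
    total = target_edits + other_edits
    if total == 0:
        raise ValueError("no variants detected")
    return 100.0 * target_edits / total


# Two legible rows from the table (NT gRNA, All GFP, Rep. 1): (A-to-I count, Other count)
rows = {
    "miniABEmax-K20A/R21A": (690, 206),
    "miniABEmax-V82G": (762, 143),
}

for editor, (a_to_i, other) in rows.items():
    print(f"{editor}: {edit_percentage(a_to_i, other):.3f}%")
```

For these two rows the computed values round to 77.009 and 84.199, matching the tabulated percentages. Rows with illegible digits are deliberately not back-filled from this relationship.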
            ABEmax         ABEmax         ABEmax     miniABEmax     miniABEmax
DNA         vs miniABEmax  vs K20A/R21A   vs V82G    vs K20A/R21A   vs V82G
ABE site14  0.78501        0.59378        0.23183    0.45035        0.29955
ABE site16  0.58244        0.65370        0.6298 9   0.32954        0.90884
ABE site19  0.27139        0.57921        0.45482    0.11499        0.05184
HEK site2   0.16276        0.00451        0.00737    0.01031        0.01829

            ABEmax         nCas9 Control  ABEmax         ABEmax     miniABEmax     miniABEmax
RNA         vs miniABEmax  vs miniABEmax  vs K20A/R21A   vs V82G    vs K20A/R21A   vs V82G
RNA site1   0.00001        0.00003        0.00001        0.00001    0.00002        0.00004
RNA site2   0.00067        0.00215        0.00054        0.00061    0.00523        0.00215
RNA site3   0.00011        0.01602        0.00011        0.00011    0.01714        0.19767
RNA site4   0.00063        0.00691        0.00051        0.00060    0.00865        0.00824
RNA site5   0.00267        0.00115        0.00253        0.00239    0.00746        0.00419
RNA site6   0.00162        0.01755        0.00178        0.00178    0.06036        0.05016

p-values generated with two-tailed t-test (type 3)
indicates data missing or illegible when filed
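The note "two-tailed t-test (type 3)" most plausibly follows the common spreadsheet convention in which type 3 denotes a two-sample test with unequal variances (Welch's t-test). The source does not name the software, so this reading is an assumption. Under that assumption, the following self-contained Python sketch shows how such a p-value can be computed for two small replicate sets. The function names and the example values are hypothetical, and the p-value is obtained by numerically integrating the Student t density rather than by any particular library routine.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=20000):
    """Two-tailed p-value: 1 minus the density integrated over [-|t|, |t|] (Simpson's rule)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3  # the density is symmetric about zero
    return max(0.0, 1.0 - central)


# Hypothetical triplicate editing percentages for two editors (not data from this document)
editor_a = [1.0, 2.0, 3.0]
editor_b = [2.0, 3.0, 4.0]
t_stat, dof = welch_t(editor_a, editor_b)
print(f"t = {t_stat:.4f}, df = {dof:.2f}, p = {two_tailed_p(t_stat, dof):.4f}")
```

The sketch is meant for small replicate counts such as triplicates; it fails when both samples have zero variance, and `math.gamma` overflows for very large degrees of freedom.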
Numbers of RNA self-edit alterations by CBEs and ABEs expected to generate synonymous, missense, and nonsense mutations
Cell gRNA Sort Replicate Synonymous Total HEK293T WT 2
All GFP Rep. 1 11 48 70 129 Rep. 2 8 27 51 6
Rep. 3 10 40 56 106 WT All GFP Rep. 1 3 36 55 97 Rep. 2 9 31 45 95 Rep. 3 6 32 51 89 HEK299T WT RNF2 Top 5% Rep. 1 14 67 80 151 Rep. 2 15 73 95 19 Rep. 3 15 62 79 156 -R21A
RNF2 Top 5% Rep. 1 0 0 0 0 Rep. 2 0 0 0 0 Rep. 3 0 0 0 0 RNF2 Top 5% Rep. 1 0 0 0 0 Rep. 2 0 0 0 0 Rep. 3 0 0 0 0 HepG2 WT 3
RNF2 Top 5% Rep. 1 10 3 6 114 Rep. 2 10 37 55 102 Rep. 3 9 48 61 118 -R21A
RNF2 Top 5% Rep. 1 0 0 0 0 Rep. 2 0 0 0 0 Rep. 3 0 0 0 0 HEKK299T -R21A
RNF2 Top 5% Rep. 1 0 0 0 0 Rep. 2 0 0 0 0 Rep. 3 0 0 0 0 -
RNF2 Top 5% Rep. 1 6 7 16 29 Rep. 2 6 7 15 29 Rep. 3 17 21 A3A-
RNF2 All GFP Rep. 1 0 0 0 0 Rep. 2 0 0 0 0 Rep. 3 0 0 0 0 HEK299T ABEmax ABE site 10 All GFP Rep. 1 — 41 0 41 Rep. 2 — 3 0 Rep. 3 — 9
0 9
minABEmax ABE site 10 All GFP Rep. 1 — 43 1 44 Rep. 2 — 33 1 34 Rep. 3 — 39 1 40 minABEmax-K20A/R21A ABE site 10 All GFP Rep. 1 — 0 1 1 Rep. 2 — 0 1 1 Rep. 3 — 1 1 2 minABEmax-V82G ABE site 10 All GFP Rep. 1 — 0 0 0 Rep. 2 — 0 0 0 Rep. 3 — 0 0 0 HEK299T ABEmax HEK site 2 All GFP Rep. 1 — 30 0 030 Rep. 2 — 45 0 45 Rep. 3 — 47 0 47 minABEmax HEK site 2 All GFP Rep. 1 — 35 1 36 Rep. 2 — 57 1 59 Rep. 3 — 63 1 64 minABEmax-K20A/R21A HEK site 2 All GFP Rep. 1 — 0 1 1 Rep. 2 — 2 1 3 Rep. 3 — 1 1 2 minABEmax V82G HEK site 2 All GFP Rep. 1 — 0 0 0 Rep. 2 — 0 0 0 Rep. 3 — 0 0 0 HEK299T ABEmax NT All GFP Rep. 1 — 29 0 29 Rep. 2 — 42 0 42 Rep. 3 — 48 0 48 minABEmax NT All GFP Rep. 1 — 30 1 31 Rep. 2 — 66 1 67 Rep. 3 — 65 1 66 minABEmax-K20A/R21A NT All GFP Rep. 1 — 1 1 2 Rep. 2 — 2 1 3 Rep. 3 — 2 1 3 minABEmax-V82G NT All GFP Rep. 1 — 0 0 0 Rep. 2 — 0 0 0 Rep. 3 — 0 0 0 indicates data missing or illegible when filed
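How an individual self-edit falls into the synonymous, missense, or nonsense categories counted above can be illustrated with the standard genetic code. The Python sketch below is an illustration under stated assumptions, not the analysis pipeline used for these data: it treats inosine as guanosine (A>G), treats C-to-U as C>T on the coding strand, classifies one codon at a time, and for simplicity counts loss of a stop codon as missense. The names `CODON_TABLE` and `classify_edit` are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_edit(codon: str, position: int, edit: str = "A>G") -> str:
    """Classify a single-base edit at a 0-based position within a coding-strand codon."""
    ref, alt = edit.split(">")
    if codon[position] != ref:
        raise ValueError(f"codon {codon} has {codon[position]} at position {position}, not {ref}")
    edited = codon[:position] + alt + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[edited]
    if before == after:
        return "synonymous"
    if after == "*":
        return "nonsense"
    return "missense"  # includes stop-loss in this simplified sketch


print(classify_edit("GAA", 2))         # GAA (Glu) -> GAG (Glu)
print(classify_edit("AAA", 1))         # AAA (Lys) -> AGA (Arg)
print(classify_edit("CAA", 0, "C>T"))  # CAA (Gln) -> TAA (stop)
```

One consequence of the standard code that this sketch makes easy to check: a single A>G change cannot turn a sense codon into TAA, TAG, or TGA, whereas C>T changes can (for example CAA to TAA).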
Example 4: Additional SECURE-ABE Variants
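Variant names such as miniABEmax-K20A/R21A or -V82G follow the usual substitution notation: wild-type residue, 1-based position, replacement residue. As a minimal sketch, the Python code below parses such names, checks the expected residue against a sequence, and applies the change. It uses only the first 30 residues of the engineered TadA 7.10 sequence (SEQ ID: 34) to stay short. It also converts a position to the numbering of the TadA heterodimer using the constant offset of 198 seen in Table A (e.g., S7/S205, K20/K218, V82/V280). The function names and the constant name are illustrative and not from the source.

```python
import re

# First 30 residues of the engineered TadA 7.10 monomer (SEQ ID: 34), for illustration only.
TADA_7_10_PREFIX = "MSEVEFSHEYWMRHALTLAKRARDEREVPV"

# Offset between engineered-monomer and heterodimer numbering, read from Table A pairs.
HETERODIMER_OFFSET = 198

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(name: str):
    """Split a name like 'K20A' into (wild-type residue, 1-based position, new residue)."""
    match = _SUBSTITUTION.match(name)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {name!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(sequence: str, names) -> str:
    """Apply substitutions to a sequence, verifying each expected wild-type residue."""
    residues = list(sequence)
    for name in names:
        wild_type, position, new = parse_substitution(name)
        if position < 1 or position > len(residues):
            raise ValueError(f"{name}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{name}: found {residues[position - 1]} at position {position}")
        residues[position - 1] = new
    return "".join(residues)


def to_heterodimer_numbering(name: str) -> str:
    """Renumber a substitution from the engineered monomer to the TadA heterodimer."""
    wild_type, position, new = parse_substitution(name)
    return f"{wild_type}{position + HETERODIMER_OFFSET}{new}"


print(apply_substitutions(TADA_7_10_PREFIX, ["K20A", "R21A"]))
print(to_heterodimer_numbering("V82G"))
```

Applied to the prefix, K20A/R21A yields the `TLAAAARD` stretch found in the MiniABEmax K20A/R21A sequence (SEQ ID: 32). The residue check guards against applying a name defined on one numbering scheme to a sequence numbered differently.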
REFERENCES
SEQUENCES LISTINGS SEQ ID: 1 MSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRHDPTAHAEIMA LRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIGRVVFGARDAKTGAAGSLMDVLHHPG MNHRVEITEGILADECAALLSDFFRMRRQEIKAQKKAQSSTD SEQ ID: 2 MTNDIYFMTLAIEEAKKAAQLGEVPIGAIITKDDEVIARAHNLRETLQQPTAHAEHIAIERAAKVLG SWRLEGCTLYVTLEPCVMCAGTIVMSRIPRVVYGADDPKGGCSGSLMNLLQQSNFNHRAIVDK GVLKEACSTLLTTFFKNLRANKKSTN SEQ ID: 3 MPYSLEEQTYFMQEALKEAEKSLQKAEIPIGCVIVKDGEIIGRGHNAREESNQAIMHAEMMAINE ANAHEGNWRLLDTTLFVTIEPCVMCSGAIGLARIPHVIYGASNQKFGGADSLYQILTDERLNHRV QVERGLLAADCANIMQTFFRQGRERKKIAKHLIKEQSDPFD SEQ ID: 4 MSDVELDHEYWMRHALTLAKRAWDEREVPVGAVLVHNHRVIGEGWNRPIGRHDPTAHAEIMA LRQGGLVLQNYRLLDTTLYVTLEPCVMCAGAMVHSRIGRVVFGARDAKTGAAGSLIDVLHHPG MNHRVEIIEGVLRDECATLLSDFFRMRRQEIKALKKADRAEGAGPAV SEQ ID: 5 MGKEYFLKVALREAKRAFEKGEVPVGAIIVKEGEIISKAHNSVEELKDPTAHAEMLAIKEACRRL NTKYLEGCELYVTLEPCIMCSYALVLSRIEKVIFSALDKKHGGVVSVFNILDEPTLNHRVKWEYY PLEEASELLSEFFKKLRNNII SEQ ID: 6 MAGDSVKSAIIGIAGGPFSGKTQLCEQLLERLKSSAPSTFSKLIHLTSFLYPNSVDRYALSSYDIE AFKKVLSLISQGAEKICLPDGSCIKLPVDQNRIILIEGYYLLLPELLPYYTSKIFVYEDADTRLERCV LQRVKAEKGDLTKVLNDFVTLSKPAYDSSIHPTRENADIILPQKENIDTALLFVSQHLQDILAEMN KTSSSNTVKYDTQHETYMKLAHEILNLGPYFVIQPRSPGSCVFVYKGEVIGRGFNETNCSLSGI RHAELIAIEKILEHYPASVFKETTLYVTVEPCLMCAAALKQLHIKAVYFGCGNDRFGGCGSVFSIN KDQSIDPSYPVYPGLFYSEAVMLMREFYVQENVKAPVPQSKKQRVLKREVKSLDLSRFK SEQ ID: 7 MVSCQGTRPCIVNLLTMPSEDKLGEEISTRVINEYSKLKSACRPIIRPSGIREWTILAGVAAINRD GGANKIEILSIATGVKALPDSELQRSEGKILHDCHAEILALRGANTVLLNRIQNYNPSSGDKFIQH NDEIPARFNLKENWELALYISRLPCGDASMSFLNDNCKNDDFKIEDSDEFQYVDRSVKTILRGR LNFNRRNVVRTKPGRYDSNITLSKSCSDKLLMKQRSSVLNCLNYELFEKPVFLKYIVIPNLEDET KHHLEQSFHTRLPNLDNEIKFLNCLKPFYDDKLDEEDVPGLMCSVKLFMDDFSTEEAILNGVRN GFYTKSSKPLRKHCQSQVSRFAQWELFKKIRPEYEGISYLEFKSRQKKRSQLIIAIKNILSPDGWI PTRTDDVK SEQ ID: 8 MQHIKHMRTAVRLARYALDHDETPVACIFVHTPTGQVMAYGMNDTNKSLTGVAHAEFMGIDQI KAMLGSRGVVDVFKDITLYVTVEPCIMCASALKQLDIGKVVFGCGNERFGGNGTVLSVNHDTC TLVPKNNSAAGYESIPGILRKEAIMLLRYFYVRQNERAPKPRSKSDRVLDKNTFPPMEWSKYLN EEAFIETFGDDYRTCFANKVDLSSNSVDWDLIDSHQDNIIQELEEQCKMFKFNVHKKSKV SEQ ID: 9 
MEEDHCEDSHNYMGFALHQAKLALEALEVPVGCVFLEDGKVIASGRNRTNETRNATRHAEME AIDQLVGQWQKDGLSPSQVAEKFSKCVLYVTCEPCIMCASALSFLGIKEVYYGCPNDKFGGCG SILSLHLGSEEAQRGKGYKCRGGIMAEEAVSLFKCFYEQGNPNAPKPHRPVVQRERT SEQ ID: 10 MEPLQITEEIQNWMHKAFQMAQDALNNGEVPVGCLMVYGNQVVGKGRNEVNETKNATQHAE MVAIDQVLDWCEMNSKKSTDVFENIVLYVTVEPCIMCAGALRLLKIPLVVYGCRNERFGGCGSV LNVSGDDIPDTGTKFKCIGGYQAEKAIELLKTFYKQENPNAPKSKVRKKE SEQ ID: 11 MTEEIQNWMHKAFQMAQDALNNGEVPVGCLMVYDNQVVGKGRNEVNETKNATRHAEMVAID QVLDWCEKNSKKSRDVFENIVLYVTVEPCIMCAGALRLLKIPLVVYGCRNERFGGCGSVLNVA GDNIPDTGTEFKYIGGYQAEKAVELLKTFYKQENPNAPRSKVRKKE SEQ ID: 12 MQEVGVDPEKNDFLQPSDSEVQTWMAKAFDMAVEALENGEVPVGCLMVYNNEIIGKGRNEV NETKNATRHAEMVALDQVLDWCRLREKDCKEVCEQTVLYVTVEPCIMCAAALRLLRIPFVVYG CKNERFGGCGSVLDVSSDHLPHTGTSFKCIAGYRAEEAVEMLKTFYKQENPNAPKPKVRKDSI NPQDGAAVIQVMRGPPDEETETIAHLS SEQ ID: 13 MEAKAGPTAATDGAYSVSAEETEKWMEQAMQMAKDALDNTEVPVGCLMVYNNEVVGKGRN EVNQTKNATRHAEMVAIDQALDWCRRRGRSPSEVFEHTVLYVTVEPCIMCAAALRLMRIPLVV YGCQNERFGGCGSVLDIASADLPSTGKPFQCTPGYRAEEAVEMLKTFYKQENPNAPKSKVRK KECHKS SEQ ID: 14 MEEKVESTTTPDGPCVVSVQETEKWMEEAMRMAKEALENIEVPVGCLMVYNNEVVGKGRNE VNQTKNATRHAEMVAIDQVLDWCHQHGQSPSTVFEHTVLYVTVEPCIMCAAALRLMKIPLVVY GCQNERFGGCGSVLNIASADLPNTGRPFQCIPGYRAEEAVELLKTFYKQENPNAPKSKVRKKD CQKS SEQ ID: 15 MEAKAAPKPAASGACSVSAEETEKWMEEAMHMAKEALENTEVPVGCLMVYNNEVVGKGRNE VNQTKNATRHAEMVAIDQVLDWCRQSGKSPSEVFEHTVLYVTVEPCIMCAAALRLMKIPLVVY GCQNERFGGCGSVLNIASADLPNTGRPFQCIPGYRAEEAVEMLKTFYKQENPNAPKSKVRKKE CQKS ABE6.3, SEQ ID: 16 MSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRHDPTAHAEIMA LRQGGLVMQNYRLIDATYVTLEPCVMCAGAMIHSRIGRVVFGARDAKTGAAGSLMDVLHHPG MNHRVEITEGILADECAALLSDFFRMRRQEIKAQKKAQSSTDSGGSSGGSSGSETPGTSESAT PESSGGSSGGSSEVEFSHEYVVMRHALTLAKRAWDEREVPVGAVLVLNNRVIGEGWNRSIGLH DPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNAKTGAAG SLMDVLHYPGMNHRVEITEGILADECAALLCYFFRMRRQVFNAQKKAQSSTDSGGSSGGSSG SETPGTSESATPESSGGSSGGS DKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRT ARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEK 
YPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFE ENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAK LQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHH QDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLN REDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNS RFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNEL TKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNA SLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRR YTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDS LHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRI EEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSFLK DDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSEL DKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVR EINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSN IMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFS KESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMER SSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFL YLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPI REQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDS GGSPKKKRKV ABE7.8, SEQ ID: 17 MSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRHDPTAHAEIMA LRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIGRVVFGARDAKTGAAGSLMDVLHHPG MNHRVEITEGILADECAALLSDFFRMRRQEIKAQKKAQSSTDSGGSSGGSSGSETPGTSESAT PESSGGSSGGSSEVEFSHEYWMRHALTLAKRALDEREVPVGAVLVLNNRVIGEGWNRAIGLH DPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNAKTGAAG SLMDVLHYPGMNHRVEITEGILADECNALLCYFFRMRRQVFNAQKKAQSSTDSGGSSGGSSG SETPGTSESATPESSGGSSGGSDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSI KKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVE EDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDL NPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFG NLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDIL 
RVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQE EFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDN REKIEKILTFRIPYYVGPLARGNSRFAVVMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDK NLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQL KEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREM IEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQ LIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVI EMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYV DQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLL NAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIRE VKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYD VRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATV RKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVA KVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRM LASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSK RVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLD ATLIHQSITGLYETRIDLSQLGGDSGGSPKKKRKV ABE7.9, SEQ ID: 18 MSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRHDPTAHAEIMA LRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIGRVVFGARDAKTGAAGSLMDVLHHPG MNHRVEITEGILADECAALLSDFFRMRRQEIKAQKKAQSSTDSGGSSGGSSGSETPGTSESAT PESSGGSSGGSSEVEFSHEYWMRHALTLAKRALDEREVPVGAVLVLNNRVIGEGWNRAIGLH DPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNAKTGAAG SLMDVLHYPGMNHRVEITEGILADECNALLCYFFRMPRQVFNAQKKAQSSTDSGGSSGGSSG SETPGTSESATPESSGGSSGGSDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSI KKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVE EDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDL NPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFG NLIALSLGLTPNF KSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLS ASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKM DGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPY 
YVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSL LYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSV EISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDD KVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQ KAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKG QKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYD VDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLT KAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFR KDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGK ATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKK TEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK ELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELA LPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSA YNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRI DLSQLGGDSGGSPKKKRKV ABE7.10, SEQ ID: 19 MSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRHDPTAHAEIMA LRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIGRVVFGARDAKTGAAGSLMDVLHHPG MNHRVEITEGILADECAALLSDFFRMRRQEIKAQKKAQSSTDSGGSSGGSSGSETPGTSESAT PESSGGSSGGSSEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGLH DPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNAKTGAAG SLMDVLHYPGMNHRVEITEGILADECAALLCYFFRMPRQVFNAQKKAQSSTDSGGSSGGSSG SETPGTSESATPESSGGSSGGSDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSI KKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVE EDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDL NPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFG NLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDIL RVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQE EFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDN REKIEKILTFRIPYYVGPLARGNSRFAVVMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDK NLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQL 
KEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREM IEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQ LIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVI EMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYV DQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLL NAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIRE VKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYD VRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATV RKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVA KVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRM LASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSK RVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLD ATLIHQSITGLYETRIDLSQLGGDSGGSPKKKRKV ABEmax, SEQ ID: 20 MKRTADGSEFESPKKKRKVSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEG WNRPIGRHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIGRVVFGAR DAKTGAAGSLMDVLHHPGMNHRVEITEGILADECAALLSDFFRMRRQEIKAQKKAQSSTDSGG SSGGSSGSETPGTSESATPESSGGSSGGSSEVEFSHEYWMRHALTLAKRARDEREVPVGAV LVLNNRVIGEGWNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHS RIGRVVFGVRNAKTGAAGSLMDVLHYPGMNHRVEITEGILADECAALLCYFFRMPRQVFNAQK KAQSSTDSGGSSGGSSGSETPGTSESATPESSGGSSGGSDKKYSIGLAIGTNSVGWAVITDEY KVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEM AKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYL ALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRL ENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYA DLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQ SKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGEL HAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAVVMTRKSEETITPWNFEEVVDK GASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAI VDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENED ILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTIL 
DFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVD ELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNE KLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSE EVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDS RMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYP KLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGE TGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKY GGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIK LPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVE QHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYF DTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSGGSKRTADGSEFEPKKKRKV miniABEmax, SEQ ID: 31 NLS-tadA(7.10)-32AA linker*-hSpCas9n(D10A)-NLS-P2A-EGFP-NLS MKRTADGSEFESPKKKRKVSEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEG WNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVR NAKTGAAGSLMDVLHYPGMNHRVEITEGILADECAALLCYFFRMPRQVFNAQKKAQSSTDSG GSSGGSSGSETPGTSESATPESSGGSSGGSDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKV LGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFH RLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFR GHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPG EKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNL SDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGY IDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQED FYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNR KVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLT LFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGF ANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMG RHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYL QNGRDMYVDQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKM KNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKY 
DENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFV YGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWD KGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPT VAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFE LENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRY TSTKEVLDATLIHQSITGLYETRIDLSQLGGDSGGSKRTADGSEFEPKKKRKVGSGATNFSLLK QAGDVEENPGPMVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTG KLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKF EGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLA DHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKSGGS PKKKRKV MiniABEmax K20A/R21A, SEQ ID: 32 NLS-tadA(K20A/R21A)-32AA linker*-hSpCas9n(D10A)-NLS-P2A-EGFP-NLS MKRTADGSEFESPKKKRKVSEVEFSHEYWMRHALTLAAAARDEREVPVGAVLVLNNRVIGEG WNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVR NAKTGAAGSLMDVLHYPGMNHRVEITEGILADECAALLCYFFRMPRQVFNAQKKAQSSTDSG GSSGGSSGSETPGTSESATPESSGGSSGGSDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKV LGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFH RLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFR GHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPG EKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNL SDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGY IDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQED FYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNR KVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLT LFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGF ANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMG RHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYL QNGRDMYVDQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKM 
KNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKY DENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFV YGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWD KGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPT VAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFE LENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRY TSTKEVLDATLIHQSITGLYETRIDLSQLGGDSGGSKRTADGSEFEPKKKRKVGSGATNFSLLK QAGDVEENPGPMVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTG KLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKF EGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLA DHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKSGGS PKKKRKV MiniABEmax V82G, SEQ ID: 33 NLS-tadA(V82G)-32AA linker*-hSpCas9n(D10A)-NLS-P2A-EGFP-NLS MKRTADGSEFESPKKKRKVSEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEG WNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYGTFEPCVMCAGAMIHSRIGRVVFGVR NAKTGAAGSLMDVLHYPGMNHRVEITEGILADECAALLCYFFRMPRQVFNAQKKAQSSTDSG GSSGGSSGSETPGTSESATPESSGGSSGGSDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKV LGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFH RLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFR GHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPG EKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNL SDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGY IDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQED FYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNR KVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLT LFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGF ANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMG RHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYL QNGRDMYVDQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKM 
KNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKY DENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFV YGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWD KGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPT VAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFE LENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRY TSTKEVLDATLIHQSITGLYETRIDLSQLGGDSGGSKRTADGSEFEPKKKRKVGSGATNFSLLK QAGDVEENPGPMVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTG KLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKF EGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLA DHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYKSGGS PKKKRKV SEQ ID: 34 MSEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGLHDPTAHAEIMAL RQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNAKTGAAGSLMDVLHYPG MNHRVEITEGILADECAALLCYFFRMPRQVFNAQKKAQSSTD Other Embodiments