2025-06-03
In recent years, the field of gene therapy has achieved breakthrough progress through the synergistic development of viral and non-viral vectors. Among them, adeno-associated virus (AAV) has emerged as a core delivery tool for clinical translation, owing to its high safety profile, strong tissue tropism, and remarkable molecular plasticity. Recombinant AAV (rAAV), which supplies rep and cap genes in trans, eliminates the risk of wild-type viral genome integration and has already supported more than 200 clinical trials. Several FDA- and EMA-approved therapies, such as Luxturna and Zolgensma, have been realized in areas including inherited retinal diseases, hemophilia, and neuromuscular disorders. However, the limited capsid packaging capacity of AAV (≤4.8 kb) remains a major constraint on the broader application and advancement of gene therapy.
Genevoyager (Wuhan) Co., Ltd. (Genevoyager) is a leading provider of one-stop CRO/CDMO services for viral vectors, proteins, and vaccines. With a globally-leading proprietary technology platform for large-scale AAV production and cGMP-grade viral vector/protein/vaccine production facilities, we are committed to advancing healthcare by supporting our partners to develop safe, effective, affordable, and accessible gene therapy products, protein-based drugs, and therapeutic vaccines to address unmet medical needs.
In 2023, Genevoyager established a cGMP-compliant manufacturing facility, with the first phase covering an area of 6,400 m², and officially put it into operation in October of the same year. The facility includes four P2-level seed bank areas, five protein/viral production lines, and one filling line. The protein/viral production lines operate at scales of 200 L and 500 L, with an annual maximum capacity of up to 50 batches. To date, multiple IIT and IND projects have already been successfully delivered.
In March 2024, Alexander S. Malogolovkin and his research team from the Martsinovsky Institute of Medical Parasitology, Russia, published a review article titled “Optimization strategies and advances in the research and development of AAV-based gene therapy to deliver large transgenes” in Clinical and Translational Medicine. The article systematically integrates the core strategies that address current technological bottlenecks, providing a comprehensive analysis of the diversified developmental pathways of AAV technology and focusing on the clinical translation progress of AAV-mediated large-fragment gene therapies in the pharmaceutical field. In addition, it prospectively proposes the use of neural networks and computational modeling to predict efficient, compact transgenes, offering an intelligent solution to overcome the physical packaging limitations of AAV.
Minigenes – minimal functional gene variants
Only genomes up to ∼4.8 kb can be efficiently packaged and produced with AAV, and the current consensus is that rAAV optimally accommodates transgenes that are up to 3.5 kb. This boundary pushes forward the progress towards the design of novel, shorter gene variants while maintaining their function. In this review, the researchers highlighted some tools and main steps to design a desired protein, depending on the targeted protein structure (Figure 1).
Figure 1: Schematic representation of bioengineering approaches for adeno‐associated virus (AAV)‐based gene therapy targeting big genes.
(Source: Kolesnik VV et al, Clin Transl Med., 2024)
The main pillar of minigene design is structural bioinformatics combined with rational design based on protein and gene databases. Several minigenes compatible with the packaging capacity of AAV vectors have been rationally designed and functionally tested. Mini‐otoferlins (OTOF) partially restored the physiological functions of AAV8‐transduced auditory hair cells in OTOF knock‐out mice. AAV9-PHP.B was used to transfer mini-versions of protocadherin-15 (mini-PCDH15) to rescue hearing loss in Myo15-Cre conditional knock-out mice, showing great potential as a future gene therapy for inherited deafness (Usher syndrome type 1F). Similarly, minigene‐4 is a shortened variant of the USH2A gene that was proposed for AAV‐mediated gene therapy for hereditary vision and hearing loss (Usher syndrome type 2). Another rational protein miniaturization was performed with the cilia‐centrosomal protein encoded by the CEP290 gene, mutations in which are frequently associated with the autosomal recessive childhood blindness disorder LCA. The MiniCEP290 gene was designed and delivered by the AAV2/8 vector, showing a delay in retinal degeneration in Cep290rd16 mice. A compact form of tuberin encoded by the ‘condensed’ cTSC2 gene was proposed for gene therapy of tuberous sclerosis complex. A vivid example of the successful design of a minimal functioning copy of a gene is the dystrophin minigene (microdystrophin) for DMD therapy. A similar minigene strategy was applied for valoctocogene roxaparvovec (ROCTAVIAN) from BioMarin Pharmaceutical Inc., marketed in 2023 for haemophilia A treatment. ROCTAVIAN is an AAV5‐based therapeutic that delivers a human Factor VIII (FVIII) transgene, driven by a tissue‐specific promoter, to liver cells.
Modifications of genetic regulatory elements of AAV expression cassettes
Four key elements of AAV expression cassettes are essential for successful gene transfer and expression: ITRs, promoter, transgene, and polyA signal (Figure 2). All other accessory regulatory components may significantly enhance or modulate the expression profiles of transgenes if the AAV vector capacity allows.
Figure 2: Adeno‐associated virus (AAV) expression cassette elements and modifications
(Source: Kolesnik VV et al, Clin Transl Med., 2024)
Promoters are core elements of typical AAV expression cassettes. They are cis‐acting regulatory elements that drive, regulate, and enable transcription of the transgene(s) they are linked to. Transgene transcription mediated by RNA polymerase II is strictly defined by the promoter, together with various cell‐specific transcription factors. The length of a promoter (∼100 to ∼1500 nucleotides) may significantly reduce the capacity left for the transgene in AAV vectors. Constitutive, ubiquitous, and strong promoters are often used to control transgene expression in AAV cassettes. Nevertheless, the high expression level that constitutive promoters provide may result in overexpression of the AAV‐delivered cargo and may trigger an immune response to the transgene that eventually halts its function. Tissue‐specific promoters, in this case, are viable and safe alternatives. Various muscle‐specific promoters, such as creatine kinase promoters (CK6, CK8, and MHCK7), desmin promoters, the human α‐myosin heavy chain gene (αMHC) promoter, the myosin light‐chain promoter (MLC2v), and the cardiac troponin T promoter (cTnT), are attractive options for AAV‐based therapeutics to treat inherited neuromuscular disorders (e.g., SMA, DMD, and Pompe disease). A liver‐specific promoter (LP1) was successfully used to develop an AAV‐based therapeutic to treat factor IX (FIX) deficiency in haemophilia B patients (Hemgenix). Several small, so‐called micro‐promoters, only 84 bp (MP‐84) and 135 bp (MP‐135) in length, were designed and demonstrated activity comparable to that of the much larger CAG promoter in human islet endocrine cells, hepatocytes, brain, and muscle tissues. Short promoters may help accommodate a large transgene in AAV expression cassettes and leave room for additional regulatory elements.
AAV ITRs (palindromic ∼125(+20) bp sequence) are genuine replication and packaging signals that also promote long‐term genome persistence and, as a result, prolonged transgene expression. The canonical T‐shaped hairpin loop of AAV ITRs is essential for AAV Rep protein binding and initiation of AAV genome concatemerization. Modifications of AAV ITRs have been applied to generate self‐complementary AAV vectors (scAAV). scAAV expression cassettes have mutated ITRs and a DNA genome that has already folded into a transcriptionally active double‐stranded form through intra‐molecular annealing, significantly speeding up AAV genome replication and transgene transcription. Notably, scAAVs have a reduced capacity of ∼2.5 kb compared to conventional single‐stranded AAV, which drastically limits their wide application.
Other components of AAV expression cassettes, such as PRE and poly(A) signals, are also objects of optimization to expand the capacity of AAV vectors. The SV40 virus, human growth hormone, and bovine growth hormone polyadenylation signals are the most commonly used in AAV expression cassettes. However, the shortened poly(A) sequence does not always fully recapitulate the full‐size analogue.
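The size trade-offs described in this section can be sketched as a simple length budget. The following Python snippet is only an illustration: the 4.8 kb limit, the ITR length, and the micro-promoter lengths come from the text above, while the two poly(A) lengths and all function and key names are hypothetical placeholders chosen for this example.

```python
# Illustrative element lengths in base pairs. Values marked "assumed" are
# placeholders for this sketch, not figures taken from the review.
PACKAGING_LIMIT_BP = 4800       # upper packaging limit quoted in the text

ELEMENT_BP = {
    "ITR": 145,                 # one ITR (~125 + 20 bp, as quoted in the text)
    "MP-84": 84,                # micro-promoter length from the text
    "MP-135": 135,              # micro-promoter length from the text
    "large_promoter": 1500,     # upper end of the promoter range in the text
    "polyA_short": 50,          # assumed length of a shortened poly(A) signal
    "polyA_long": 250,          # assumed length of a full-size poly(A) signal
}


def cassette_length(promoter, poly_a, transgene_bp):
    """Total genome length: two ITRs + promoter + transgene + poly(A)."""
    return (2 * ELEMENT_BP["ITR"] + ELEMENT_BP[promoter]
            + transgene_bp + ELEMENT_BP[poly_a])


def max_transgene(promoter, poly_a, limit_bp=PACKAGING_LIMIT_BP):
    """Largest transgene (bp) that still fits under the given limit."""
    overhead = cassette_length(promoter, poly_a, 0)
    return max(0, limit_bp - overhead)


if __name__ == "__main__":
    for promoter in ("large_promoter", "MP-84"):
        room = max_transgene(promoter, "polyA_long")
        print(f"{promoter}: up to {room} bp of transgene")
```

Under these assumed numbers, swapping a 1500 bp promoter for MP-84 frees roughly 1.4 kb for the transgene, which is the kind of gain the micro-promoter work is aiming for.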
Exon skipping
Exon skipping can be initiated by using antisense oligonucleotides (ASOs) or clustered regularly interspaced short palindromic repeats with Cas9 protein (CRISPR/Cas9) targeting exonic or intronic sequences important for RNA splicing. As a result of minimizing the number of exons in full‐length genes, a functional minigene variant is transcribed (Figure 1). Using a dual-AAV system, a metabolic liver disease in mice (ornithine transcarbamylase [OTC] deficiency) was corrected by intravenously infusing two AAVs, one expressing Cas9 and the other carrying a guide RNA and the donor OTC DNA sequence. SaCas9 (1 kb shorter than the commonly used SpCas9) and gRNA can be packaged into one AAV8 vector, which achieved >40% Pcsk9 gene modification in mouse liver and significantly reduced serum Pcsk9 and total cholesterol levels. In addition, AAV vectors carrying the modified U7snRNA gene, from which an ASO can be transcribed, are an alternative therapeutic approach to traditional AAV gene replacement therapy. U7snRNA functions as a splicing modulator and, together with small nuclear ribonucleoprotein particles, shields the ASO from degradation. Accordingly, the AAV9 vector expressing U7 small nuclear RNAs targeting DMD exon 2 (scAAV9.U7snRNA.ACCA) has been successfully tested in the Dup2 mouse model and in vivo neonatal studies. Similarly, the long‐term efficacy of AAV9‐U7snRNA‐mediated exon 51 skipping in mdx52 mice was confirmed by the restoration of dystrophin expression. However, this technology faces several limitations, including incomplete functional restoration, the high cost of individualized ASO customization, significant variability in efficacy among patients, and the potential neurotoxicity of ASOs. Moreover, the ethical framework for in vivo CRISPR editing remains under development, leaving full gene replacement therapy as a necessary approach for genes with broadly distributed mutations.
Multiple AAV vectors
Trans‐splicing and overlapping
Dual or triple AAV vector delivery is a technology that allows the division of cDNA fragments into multiple parts and encoding each into its own AAV vector, thus having advantages in capacity over a single AAV. This technology was developed due to the properties of the AAV genome to be concatemerized in the head‐to‐tail direction. Dual AAV vectors have been extensively studied for different experimental and disease modalities. The approaches to assembling large transgenes in dual AAV vectors include trans‐splicing (TS) and overlapping (OV).
In the trans‐splicing strategy, an SD (splice donor signal) and an SA (splice acceptor signal) are placed at the ends of the split cDNA (Figure 1B), with each part packaged in an individual AAV capsid. Upon co‐transduction of cells with dual AAV vectors, the genomes concatemerize through their ITRs in the head‐to‐tail orientation. The SD and SA located at the ends of the cDNA in each AAV vector are then trans‐spliced, followed by the production of full‐sized mRNA and protein. The key to this technology is that the canonical SD sequence typically contains the dinucleotide ‘GT’ at the 5′ end of the intron, whereas the SA site contains the ‘AG’ dinucleotide at the 3′ end. Bioinformatic tools are often employed to predict potential SD and SA sites within the target genes. Following computational analysis, experimental validation is conducted to confirm the efficacy of the selected splice sequences in promoting trans‐splicing. The dual‐AAV approach has recently been advanced to the UshTher clinical trials, aiming to treat retinitis pigmentosa. However, the results of these clinical trials have not been published yet.
An alternative to TS is the design of overlapping sequences (OV) at the 3′ end of the first half of the cDNA and the 5′ end of the second half (Figure 1); thus, the two halves of the cDNA share one OV region, and these overlaps are joined by homologous recombination, which also results in a full‐size gene product. The maximum length of OV sequences is limited by the size of the cDNA, and the region with the highest potential for homologous recombination needs to be identified experimentally. Although OV enables the delivery of larger genes than traditional AAV vectors, recombination and accurate reassembly of the split gene segments within the target cells are not always as efficient as with trans‐splicing vectors, leading to suboptimal or variable expression of the full‐length gene.
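The overlapping design can be illustrated with a minimal Python sketch that splits a cDNA string into two halves sharing a homology region and then rejoins them. This is an idealized model only: the function names, the default midpoint split, and the 300 bp overlap in the usage note are assumptions made for the example, and real recombination efficiency has to be tested experimentally, as noted above.

```python
def split_with_overlap(cdna, overlap_bp, split_at=None):
    """Split a cDNA string into 5' and 3' halves sharing `overlap_bp` bases.

    The overlap is centered on `split_at` (default: the midpoint), so both
    halves carry the same homologous stretch.
    """
    if overlap_bp < 0 or overlap_bp > len(cdna):
        raise ValueError("overlap must be between 0 and the cDNA length")
    if split_at is None:
        split_at = len(cdna) // 2
    start = max(0, split_at - overlap_bp // 2)
    end = min(len(cdna), start + overlap_bp)
    start = end - overlap_bp  # re-anchor if the window was clipped at the end
    five_prime = cdna[:end]     # 5' half ends with the overlap
    three_prime = cdna[start:]  # 3' half begins with the overlap
    return five_prime, three_prime


def reassemble(five_prime, three_prime, overlap_bp):
    """Idealized recombination: join the halves through the shared overlap."""
    if overlap_bp and five_prime[-overlap_bp:] != three_prime[:overlap_bp]:
        raise ValueError("halves do not share the expected overlap")
    return five_prime + three_prime[overlap_bp:]
```

For example, a 6,000 bp cDNA split at the midpoint with a 300 bp overlap yields two 3,150 bp halves, each of which could then be checked against the packaging limit together with its promoter, poly(A), and ITRs.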
Inteins
Intein‐mediated splicing is another method of delivering large genes that do not fit into an AAV particle. Inteins are genetic elements that are typically around 200–300 amino acids in length and can be transcribed and translated. Overall, inteins are removed through a highly specific and regulated process known as protein splicing, resulting in the production of functional, mature proteins. Splicing with the help of inteins does not require energy consumption, specific proteases, or cofactors. There are several types of inteins, and split‐inteins are trans‐splicing inteins, which means that two subtypes of split‐inteins, N‐intein and C‐intein, are needed for trans‐splicing. Studies have reported that AAV intein-mediated systems exhibit higher efficiency in reconstituting ABCA4 and CEP290 proteins compared to dual AAV vectors. Successful delivery of the FVIII-N6 variant has also been achieved using the Npu DnaE split‐inteins in combination with AAV vectors. Furthermore, a hybrid approach using recombination and TS allows transgenes to be broken up and packaged into two independent AAV vectors, which have been shown to yield favorable transduction outcomes. Additional studies suggest that such a hybrid approach, together with trans-splicing, offers certain advantages for in vivo delivery of large genes such as ABCA4 and MYO7A to the retina.
However, TS requires careful selection of splice sites and sequence optimization for effective reassembly into a full‐size transgene, and the multiple AAV vectors approach is not favourably accepted for future commercial products. Translation of truncated proteins from non‐trans‐spliced polypeptides and their role in cellular metabolism remain to be thoroughly explored. It has been found that cells transduced by an AAV vector loaded with only one part of a transgene with appropriate genetic regulatory elements can initiate gene expression. Moreover, inteins operate at the protein level, but being of non‐mammalian origin, they may hold cryptic immunogenic epitopes and trigger unwanted immune reactions. This safety concern is particularly relevant if split inteins are combined with Cas‐based systems, as the co‐expression could increase the undesirable immune response. Finding a proper and efficient split site for a transgene might not always be successful. Further, the need to produce multiple AAV vectors for one gene target is usually associated with higher costs. For the OV and hybrid strategies, the homology sequence is an essential element in determining split‐gene reconstitution. In the OV strategies, sequences are gene‐specific, and the discovery of better homology sequences could help improve reconstitution levels and efficiency for hybrid approaches.
Circular permutation for minigenes design
Circular permutation (CP) is a relatively novel method for protein engineering that has been adapted from nature by scientists. Schematically, CP is a directed evolutionary process of a protein based on the covalent peptide linkage of the amino and carboxyl termini of a peptide chain to introduce new termini elsewhere in the protein (Figure 3). Circularly permuted proteins often retain conserved three‐dimensional structures and functions. CP has great potential for engineering catalytic activity, thermal stability, biosensors, optogenetic tools, and so on. CP may also help to design novel minigenes for targeted gene therapy. Performing conformational rearrangements of the sensory domain associated with ligand interaction may create a functional short permutant copy (miniprotein) of a large gene. Circularly permuted protein libraries can be created, and their viability can be subsequently tested by potency tests using relevant cellular models.
Figure 3: Rational design (A and B) and computational approaches (C) for minimizing protein structures
(Source: Kolesnik VV et al, Clin Transl Med., 2024)
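At the sequence level, the circular permutation described above can be sketched in a few lines of Python. This is a minimal illustration, not the review's method: the function names and the default "GGS" linker joining the original termini are assumptions for the example, and any real permutant would still need structural and functional validation.

```python
def circular_permutant(protein, new_start, linker="GGS"):
    """Return the permutant whose N-terminus is residue `new_start` (0-based).

    The original C- and N-termini are joined through `linker` (an assumed
    flexible linker), and the chain is reopened just before `new_start`.
    """
    if not 0 < new_start < len(protein):
        raise ValueError("new_start must fall inside the sequence")
    return protein[new_start:] + linker + protein[:new_start]


def permutant_library(protein, linker="GGS"):
    """All single-cut circular permutants of `protein`, keyed by cut position."""
    return {i: circular_permutant(protein, i, linker)
            for i in range(1, len(protein))}
```

A library built this way contains one candidate per possible new terminus, which corresponds to the kind of circularly permuted protein library that would then be screened in cellular potency tests.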
Research has demonstrated that several split versions of GFP can be created using CP to reconstitute protein function. A similar approach might be applied for minigenes/miniprotein design to get insight into the crucial structural elements of a protein. Rational protein design can be substantially shaped and boosted by computational approaches. Software‐designed miniproteins with target‐specific binding capacity have been proposed by many groups. Until now, most of them represent nanobinders (antibody‐like structures) to a specific region on a protein surface. Another example of the translational application of CP is found in optogenetic engineering. Novel cpLOV2 possessed unique caging capabilities and enabled the design of light‐inducible necroptosis via mixed lineage kinase domain‐like protein. AAV vectors are also compatible with optogenetic modules delivery, and some of them have already entered clinical trials (NCT02556736, Allergan; NCT03326336, GenSight Biologics).
It is worth noting that AI‐based computational tools can craft entirely novel proteins that have never existed in nature. Nevertheless, establishing the biological significance, stability, and functionality of such proteins remains a hard task.
Gene therapy and landscape
By November 2023, the review's authors had identified 321 AAV vector‐based products designed to treat inherited genetic diseases in various stages of development. Among them, 154 products (48%) have already entered different phases of clinical trials (Figure 5). However, only eight products have been approved for treatment by the FDA or EMA, which is 2.5% of the total number of drugs in development. Safety concerns also pose a major limitation. Treatment of patients with X‐linked myotubular myopathy (XLMTM) with AT‐132 (resamirigene bilparvovec) led to the tragic deaths of four boys in the ASPIRO clinical trial, revealing an insufficient understanding of immune mechanisms and highlighting the urgent need for dose-optimization strategies. Three main approaches are applied by pharmaceutical companies to circumvent AAV capsid limitations: (i) multiple transduction with two or three AAV vectors (such as for ABCA4 and OTOF; a trial of OTOF gene therapy showed promising results in 2024); (ii) a mini‐version of a functional gene (minigene) that can be packaged in the viral capsid (such as MicroDMD and miniATP7B, 25 projects in total); and (iii) alternatively, correction of mutations in big genes by genome editors or silencing with ASOs or RNA interference (such as using AAV to deliver siRNA to treat Huntington’s disease). Notably, the DMD gene, despite being one of the biggest genes in the human genome, is a primary target for commercialization chosen by many biotech developers, accounting for 10 out of 55 large-fragment gene therapy drugs. This is partly a result of decades of basic research dedicated to DMD gene structure and function. The second most popular target for gene addition is a short version of blood FVIII carried by an AAV vector, with 9 research pipelines. The ABCA4 minigene is also of interest to at least two companies focusing on Stargardt disease (inherited retinal degeneration).
Although most of the products (62%) in the research and clinical pipelines are aimed at novel targets to be ‘first‐in‐class’ drugs, a meta‐analysis of 255 clinical trials done in 2022 counted 30 clinical trials on hold, where 18 were due to toxicity. Coupled with the complexity of multi-vector manufacturing processes and the challenges of dose standardization in vivo, these factors collectively underscore the critical bottlenecks in technological innovation and translational efficiency that remain to be overcome.
Figure 4: Research and development (RnD) landscape of adeno‐associated virus (AAV)‐based gene therapeutics
(Source: Kolesnik VV et al, Clin Transl Med., 2024)
Figure 5: The status and progress of research pipelines from the research and development (RnD) companies using adeno‐associated virus (AAV) as a vector for gene delivery
(Source: Kolesnik VV et al, Clin Transl Med., 2024)
Figure 6: Advantages and limitations of adeno‐associated virus (AAV)‐based gene therapy
(Source: Kolesnik VV et al, Clin Transl Med., 2024)
Perspectives
Multiple unique genetic and antigenic characteristics make AAV the number one vector of choice for gene delivery: it is non‐pathogenic to humans, has an editable genome, supports long‐term transgene expression, and can be produced at large scale. But several obstacles hinder its clinical application: beyond the capsid size limitation, clinical durability is constrained by both immunological and non‐immunological factors, and the strict dosing regimen and pre-existing neutralizing antibodies pose significant safety and efficacy concerns (Figure 6). Future directions for overcoming these hurdles include optimizing vector design via artificial intelligence to reduce immunogenicity, developing low-dose long-term expression strategies, establishing personalized therapeutic approaches (e.g., mutation-specific gene editing), and refining risk management systems such as antibody screening and plasma exchange. These efforts aim to ultimately achieve precise, safe, and durable treatments for genetic diseases.
US: 3675 Market Street, Suite 200, Philadelphia, PA19104 Tel: +1 (215) 205-6963 | +086 027-65023363
E-mail: hui.wang@genevoyager.com
China: No128, Guanggu 7th Rd, East Lake High-tech Development Zone, Wuhan, China Tel: 17720522078
E-mail: marketing@genevoyager.com