A unique aspect of retroviruses is their ability to inextricably integrate a DNA copy of the viral genome into the host cell genome and establish a permanent viral reservoir. While antiviral therapies can block active virus replication, and the immune system can recognize and destroy cells expressing viral antigens, those cells carrying silent proviral DNA can persist and serve as a source of virus upon activation of expression even years later. This is a significant barrier to a cure for infections by retroviruses such as HIV-1.
While the viral integrase (IN) protein alone is capable of catalyzing the insertion of the viral genome into the host cell genome, it gets significant help from host cell factors. My research career began with identifying host cell factors involved in avian leukosis virus (ALV) integration in the lab of Dr. Karen Beemon (Winans et al., 2017). ALV integration sites are largely random, but other retroviral families exhibit strongly biased integration preferences within the host genome. For instance, HIV-1 preferentially integrates into actively transcribed genes, whereas murine leukemia virus (MLV) prefers to integrate near transcription start sites. In both cases, this preferential integration has been attributed to binding of the viral IN protein to host cell factors. Perhaps the best-studied example is the binding of HIV-1 IN to the host factor LEDGF, which plays a role in directing integration to particular genomic locations by acting as a bimodal tether.
Biased integration is a conserved feature amongst retrotransposons as well. The yeast retrotransposon Ty5 exhibits strongly targeted integration into silent chromatin such as telomeres and the mating type loci. It was discovered that this was due to binding of the Ty5 IN protein to the host silencing factor Sir4. Interestingly, the interaction between IN and Sir4 is regulated by phosphorylation. Under certain conditions, when the IN protein becomes de-phosphorylated, the Ty5 element loses its selective targeting and instead integrates throughout the yeast genome.
I found this particularly interesting because the HIV-1 IN protein is known to carry at least four post-translational modifications (PTMs): ubiquitination, acetylation, phosphorylation and SUMOylation. Ubiquitination of IN is believed to play a role in regulating the degradation of the IN protein after integration. Phosphorylation has been proposed to be important for efficient integration in vivo. Acetylation of the IN protein increases IN affinity for viral cDNA, enhances strand transfer activity in vitro and may regulate integration in vivo. Lastly, SUMOylation of IN has previously been shown to cause reduced infectivity and decreased integration. Despite the recent work on IN PTMs, the exact functions of these various modifications in the retroviral life cycle are not well understood.
In my postdoctoral work in the Goff lab I undertook a comprehensive study of all known PTMs of IN to determine their potential effects on the totality of the viral life cycle. We generated conservative mutations at all acetylated residues (K258/264/266/273R) either singly or in combination. To assess the importance of phosphorylation we generated either mutations to ablate phosphorylation (S24/57/255A) or to mimic phosphorylation (S24/57/255D) either singly or in combination. Lastly, we generated conservative mutations to ablate SUMOylation sites (E48/138/246Q) within the IN protein. With all generated mutant viruses, we performed infections and assayed all steps of the viral life cycle.
We found that SUMOylation mutants exhibited an early defect in infection or reverse transcription with significantly lower levels of viral DNA being produced. Mutations blocking phosphorylation reduced integration efficiency, and phosphomimetic mutations had the opposite effect. The acetylation mutations, however, turned out to be the most interesting.
The combinatorial mutation ablating acetylation of all four residues had only modest effects on integration efficiency. Instead, we discovered a novel role for the HIV-1 IN protein in regulating proviral transcription at early times post-integration. Our data suggest that IN may be retained on proviral DNA at early times after integration and promote viral gene expression by altering chromatin modifications at the viral transcriptional promoter (Winans & Goff, 2020).
We wanted to determine whether the effect of the acetylation mutations in IN on transcription was due to altered integration site preferences. If the mutated virus were integrating into silent chromatin as opposed to the typical active gene regions, this would explain the lack of viral transcripts. To this end we used Next-Generation Sequencing (NGS) to determine the integration site profiles for all acetylation mutants, both the single point mutants and the combinatorial quadruple acetylation mutant. Remarkably, we found that one of the point mutations alone, K258R, had a profound effect on targeting of viral integration. Across multiple experiments, more than 8% of K258R mutant viral integrations were detected in centromeres, a 10-fold increase compared to wild-type (WT). In one trial, nearly 80% of all K258R mutant proviruses were found to be integrated into centromeres. This was particularly distinctive given that centromeres are known to be actively disfavored targets for WT integration.
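The arithmetic behind a comparison of this kind can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names, the half-open interval convention and the toy coordinates are all assumptions made for illustration. It counts the fraction of mapped integration sites that fall inside a set of annotated intervals (for example, centromeric regions) and expresses the mutant fraction relative to the WT fraction.

```python
from bisect import bisect_right

def fraction_in_intervals(sites, intervals):
    """Fraction of integration sites that fall inside annotated intervals.

    sites:     list of (chromosome, position) tuples (hypothetical input format)
    intervals: dict mapping chromosome -> sorted, non-overlapping list of
               (start, end) tuples, treated as half-open [start, end)
    """
    if not sites:
        return 0.0
    # Pre-compute interval start coordinates per chromosome for binary search.
    starts = {chrom: [s for s, _ in ivs] for chrom, ivs in intervals.items()}
    hits = 0
    for chrom, pos in sites:
        ivs = intervals.get(chrom)
        if not ivs:
            continue
        # Index of the last interval whose start is <= pos.
        i = bisect_right(starts[chrom], pos) - 1
        if i >= 0 and pos < ivs[i][1]:
            hits += 1
    return hits / len(sites)

def fold_enrichment(mutant_fraction, wt_fraction):
    """Ratio of the mutant fraction to the WT fraction."""
    if wt_fraction == 0:
        raise ValueError("WT fraction is zero; fold enrichment is undefined")
    return mutant_fraction / wt_fraction

# Toy data, invented purely for illustration.
toy_intervals = {"chr1": [(100, 200), (500, 600)]}
toy_sites = [("chr1", 150), ("chr1", 250), ("chr1", 599),
             ("chr1", 600), ("chr2", 150)]
toy_fraction = fraction_in_intervals(toy_sites, toy_intervals)
```

Under this calculation, a mutant fraction of 8% that represents a 10-fold increase corresponds to a WT fraction of roughly 0.8%.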
We confirmed this phenotype via multiple NGS experiments with various algorithms for mapping. We also performed a qPCR-based assay to show preferential integration into or near centromeric regions, in which we observed 30- to 400-fold more K258R mutant proviruses than WT proviruses in the alpha repeat sequences of centromeres. Lastly, this phenotype was verified via immunofluorescence studies, which showed 12-fold more co-localization of mutant virus with centromeres than WT.
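For readers less familiar with how qPCR measurements translate into fold differences, one common approach is the comparative Ct (2^-ΔΔCt) calculation. The sketch below assumes that approach; the quantification method actually used in the assay is not stated here. It also assumes roughly 100% amplification efficiency, and the function name, argument names and Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative quantity of a target amplicon by the comparative Ct method.

    Assumes ~100% amplification efficiency (one doubling per cycle).
    'sample' would be the mutant infection and 'control' the WT infection;
    'ref' is a normalizing amplicon (for example, total proviral DNA).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)

# Invented Ct values: the target amplifies 5 cycles earlier in the sample
# than in the control after normalization.
example_fold = fold_change_ddct(20.0, 18.0, 25.0, 18.0)
```

With these invented values, a normalized difference of 5 cycles corresponds to a 32-fold difference, which shows how differences of 5 to 9 cycles would span the 30- to 400-fold range.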
We hypothesized that this altered preference might be due to altered host factor binding. To this end, we performed co-immunoprecipitation (co-IP) followed by mass spectrometry of WT or K258R mutant virus to identify unique binding partners. While we did identify several proteins that preferentially immunoprecipitated with K258R IN vs. WT, the preferential binding by the mutant IN could not be confirmed via Western blot (WB). Thus, it remains unclear what is causing the dramatic alteration in integration site preference induced by the K258R point mutation.
In the short run, evolutionary pressure would typically be thought to select for integration into active chromatin and vigorous expression of the provirus, as is true of WT HIV-1. But it is possible that in the long run virus survival could be promoted by latency – laying low and reappearing at later times – and in this scenario mutations like K258R might be beneficial. Surveys of viral DNAs in latent reservoirs indeed show some proviruses in heterochromatin, and reveal some examples of the K258R mutation, albeit rarely. Whether integration into centromeres is a key feature of latency remains to be determined, but it seems remarkable that one small mutation can create a major bias for these sequences.
Winans S, Goff SP (2020) Mutations altering acetylated residues in the CTD of HIV-1 integrase cause defects in proviral transcription at early times after integration of viral DNA. PLoS Path. 16(12): e1009147
Winans S, Larue R, Abraham CM, Shkriabai N, Skopp A, Winkler D, Kvaratskhelia M, Beemon KL (2017) The FACT Complex Promotes Avian Leukosis Virus DNA Integration. JVI. 91(7): doi.org/10.1128/JVI.00082-17