Cervical cancer (CC) causes more than 310,000 deaths worldwide each year. Integration of human papillomavirus (HPV) is an important genetic event leading to the development of cervical cancer. Although HPV DNA integration is known to disrupt the structure of both the host and viral genomes in CC, the complexity of this process remains largely unexplored.
Recently, Huang Xiaoyuan's team from Tongji Medical College, Huazhong University of Science and Technology, published an article in the journal BMC Genomics titled "Long-Read Sequencing Reveals the Structural Complexity of Genomic Integration of HPV DNA in Cervical Cancer Cell Lines". The study performed whole genome sequencing (WGS) of SiHa and HeLa cells on the PacBio Sequel II platform and revealed the complexity of HPV integration through comprehensive analysis of the sequence data. The results show that long-read sequencing identifies HPV integration breakpoints with accuracy comparable to that of high-throughput sequencing (NGS) methods. The paper also constructs detailed models of the complex integrated genome structures, covering both the HPV genome and the adjacent regions of the human genome. In addition, the sequencing results revealed a wide range of genome-wide structural variants (SVs) in SiHa and HeLa cells, and the study found that SVs on chromosome 13 in SiHa cells may be associated with changes in gene expression levels. Together, these results demonstrate the value of long-read sequencing for detecting and characterizing the integrated structure of the HPV genome in human cells, and they provide key insights into the complex process of HPV16 and HPV18 integration and its potential contribution to the development of cervical cancer.
Fraser Gene performed the third-generation (long-read) resequencing of the human genomes for this study.
Figure 1 Article publication information.
Research ideas
Results
1. High-throughput sequencing (NGS) detected HPV integration breakpoints
Previous studies have used high-throughput sequencing to analyze HPV integration. The results showed that HPV16 was integrated into chromosome 13 in SiHa cells, in the intergenic region closest to the KLF5 and LINC00392 genes. In HeLa cells, HPV18 was integrated into chromosome 8, in the intergenic region closest to the CCAT1 and CASC21 genes. These results are consistent with those of other studies.
However, although HPV-targeted high-throughput sequencing is effective and economical for detecting HPV integration, it has limitations in determining the structure of the integrated genome. First, its short read length (150 bp) makes it difficult to resolve the complex structure of the integrated genome. Second, this sequencing strategy loses a large number of human genome fragments, leaving too few human reads to accurately identify complex structural variants involving the human genome. Specifically, HPV16 and HPV18 reads accounted for 65.87% and 79.84% of the total sequencing reads, respectively. These findings suggest that high-throughput sequencing is useful mainly for identifying integration breakpoints between the human genome and the HPV genome, and does not provide a comprehensive view of the complex integrated genome structure.
Figure 2 Summary table of HPV integration sites identified from NGS data.
2. Long-read sequencing reveals the complex HPV integrated genome structure
Long-read sequencing generates long reads that contain one or more HPV DNA segments flanked by human genomic sequence at both ends. This improves reliability and allows reads to be mapped directly across HPV breakpoints, which helps to elucidate genomic integration. Therefore, the genome of SiHa cells was sequenced on the PacBio Sequel II. After low-quality reads were filtered out, sequencing of SiHa cells yielded a total of 194.18 Gb of bases and 8,381,208 reads, with a read N50 of 34.81 kb. Consistent with the results of HPV target capture next-generation sequencing, PacBio long-read sequencing also confirmed two integration sites in SiHa cells: HPV16:3134-chr13:74,087,562 and HPV16:3384-chr13:73,788,866. These results suggest that PacBio long-read sequencing identifies HPV integration breakpoints effectively, with accuracy comparable to that of high-throughput sequencing.
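The read N50 quoted above is the read length at which reads of that length or longer contain at least half of all sequenced bases. A minimal sketch of that calculation follows; the function name `read_n50` and the toy read lengths are illustrative assumptions, not data or code from the study.

```python
def read_n50(lengths):
    """Return the N50 of a collection of read lengths (in bases)."""
    if not lengths:
        raise ValueError("no read lengths given")
    ordered = sorted(lengths, reverse=True)  # longest reads first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of
        # all bases is the N50.
        if running >= half_total:
            return length


# Toy read lengths (hypothetical, for illustration only).
toy_lengths = [2_000, 5_000, 8_000, 12_000, 20_000, 35_000, 40_000]
print(read_n50(toy_lengths))
```

Because N50 is weighted by bases rather than by read count, a small number of very long reads can raise it well above the mean read length.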
In addition, the results revealed complete integrated fragments of the HPV16 L1, L2, E1, E4, E5, E6, and E7 genes, as well as a partial sequence of the E2 gene, in the SiHa cell genome (Figure 3). HPV16 integration occurred twice, with an HPV16 fragment (coordinates 3384-7906 and 1-3134) integrated into chromosome 13 at human genome coordinates 73,788,866-74,087,562 (Figure 3). Deletions in the HPV16 genome were also observed at positions 3460-3508 and 7757-7793. Changes in the human genome near the HPV16 integration site were analyzed as well. The HPV16 fragment was integrated at chr13:73,788,866 and chr13:74,087,562 in reverse orientation (Figure 3). In addition, a chromosomal rearrangement adjacent to the integration site in SiHa cells was directly confirmed (chr13:73,255,335-73,464,522) (Figure 3). Taken together, these results suggest that HPV16 integration may lead to instability of the genomic structure near the integration site.
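Because the HPV16 genome is circular, the integrated fragment runs from position 3384 through the end of the reference and continues from position 1 to 3134. The sketch below works out how much of the viral genome that span covers. It assumes a reference length of 7906 bp, taken from the coordinate range quoted above, and it ignores the two small internal deletions; the function name is hypothetical.

```python
# Assumed reference length, inferred from the coordinate range in the text.
HPV16_GENOME_LENGTH = 7906


def circular_segment_length(start, end, genome_length):
    """Length of a segment on a circular genome.

    Coordinates are 1-based and inclusive. The segment runs from
    `start` to `end` and wraps past the origin when end < start.
    """
    if not (1 <= start <= genome_length and 1 <= end <= genome_length):
        raise ValueError("coordinates must lie within the genome")
    if end >= start:
        return end - start + 1
    # Wrapped segment: start..genome_length, then 1..end.
    return (genome_length - start + 1) + end


retained = circular_segment_length(3384, 3134, HPV16_GENOME_LENGTH)
missing = HPV16_GENOME_LENGTH - retained
print(retained, missing)
```

Under these assumptions the span covers 7,657 bp, and the 249 bp between positions 3135 and 3383 are absent from the integrated copy.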
Figure 3 Complex integrated genome structure of HPV16 in SiHa cells.
3. Microhomologies (MHs) were found between the HPV genome and the human genome near the integration breakpoints
Previous studies have suggested that microhomology-mediated recombination (MMR) may play a role in the mechanism of HPV integration. The researchers therefore examined the HPV and human genome sequences near the integration sites in the SiHa and HeLa cell lines to determine whether the integration events were associated with MMR. Two integration sites were identified in SiHa cells: HPV16:3134-chr13:74,087,562 and HPV16:3384-chr13:73,788,866. The microhomologous fragment "atgc" was observed at the HPV16:3134-chr13:74,087,562 integration site, and the microhomologous fragment "tatt" at the HPV16:3384-chr13:73,788,866 integration site (Figure 4A). Four integration sites were identified in HeLa cells: HPV18:2498-chr8:128,241,546, HPV18:3101-chr8:128,233,367, HPV18:5735-chr8:128,230,629, and HPV18:7857-chr8:128,234,255. The microhomologous fragments "taac/taca" were observed at the HPV18:2498-chr8:128,241,546 integration site, "ataa" at the HPV18:5735-chr8:128,230,629 integration site, and "tact/taca" at the HPV18:7857-chr8:128,234,255 integration site, while no microhomologous fragment was found at the HPV18:3101-chr8:128,233,367 integration site (Figure 4B). These results further support the idea that MMR may be a mechanism of HPV integration.
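One common way to look for microhomology at a junction is to check how many bases on either side of the breakpoint are identical in both reference sequences. The sketch below illustrates that idea for exact matches only. The function name and the toy sequences are hypothetical, and the criteria the study actually used to call microhomologies are not described here.

```python
def junction_microhomology(left_ref, left_end, right_ref, right_start):
    """Return the bases shared by both references around a junction.

    The junction joins left_ref[:left_end] to right_ref[right_start:]
    (0-based, end-exclusive slices). Bases that are identical in both
    references immediately before and after the junction form the
    microhomology, because the breakpoint could lie anywhere within them.
    """
    # Extend backwards from the junction while both references agree.
    back = 0
    while (back < left_end and back < right_start
           and left_ref[left_end - back - 1] == right_ref[right_start - back - 1]):
        back += 1
    # Extend forwards from the junction while both references agree.
    fwd = 0
    while (left_end + fwd < len(left_ref)
           and right_start + fwd < len(right_ref)
           and left_ref[left_end + fwd] == right_ref[right_start + fwd]):
        fwd += 1
    return left_ref[left_end - back:left_end + fwd]


# Toy sequences (hypothetical), constructed to share "atgc" at the junction.
toy_viral = "ccgtaatgcggat"
toy_human = "ttcgatgcttgca"
print(junction_microhomology(toy_viral, 9, toy_human, 8))
```

An empty result means the two sequences share no bases at the junction, as reported for the HPV18:3101 site. Near-matches such as the paired fragments listed above would need a tolerance for mismatches, which this sketch does not implement.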
Figure 4 Mechanism of HPV integration in SiHa and HeLa cells.
Summary
PacBio long-read sequencing was used to construct a complex model of the HPV-integrated genome structure in SiHa and HeLa cells. The results reveal the complex effects of HPV integration events on the genome of cervical cancer cells and point to a potential correlation between structural variants on chromosome 13 and gene expression levels in SiHa cells. These findings provide important insights into the integration process of HPV16 and HPV18 and lay a foundation for further understanding the molecular mechanisms of cervical cancer development and identifying potential targets. The study demonstrates the effectiveness and value of long-read sequencing in detecting and characterizing the integrated structure of the HPV genome, and it offers an important tool for in-depth study of the occurrence and development of HPV-related tumors.
This work was supported by the Hubei Provincial Department of Science and Technology (2021BCA108), the National Natural Science Foundation of China (82100344), and the Wuhan Department of Science and Technology (2019030703011518).