Hai-Yu Shen, Fang-Ze Wei, Qian Liu
Hai-Yu Shen, Fang-Ze Wei, Qian Liu, Department of Colorectal Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China
Abstract
BACKGROUND Colorectal cancer (CRC) is one of the most malignant gastrointestinal cancers worldwide. The liver is the most important metastatic target organ, and liver metastasis is the leading cause of death in patients with CRC. Owing to the lack of sensitive biomarkers and the unclear molecular mechanism, the occurrence of liver metastases cannot be predicted, and clinical outcomes for patients with liver metastases are poor. Therefore, it is very important to identify diagnostic or prognostic markers for liver metastases of CRC.
AIM To investigate the highly differentially expressed genes (HDEGs) and prognostic markers for liver metastases of CRC.
METHODS Data from three NCBI Gene Expression Omnibus (GEO) datasets were used to identify HDEGs between liver metastases of CRC and tumour or normal samples. The significant HDEGs shared by the three GEO datasets were obtained by intersection, and these genes were screened with an online tool to explore their prognostic value. Then, TIMER and an R package were used to investigate the immune functions of the HDEGs, and gene set enrichment analysis was used to explore their potential functions.
RESULTS Based on the selection criteria, three CRC datasets (GSE14297, GSE41258, and GSE49355) were chosen for exploration. Venn diagrams were used to show the HDEGs common to the six groups, and 47 HDEGs were obtained. The HDEGs were visualised using STRING and Cytoscape software. Based on the TCGA database, APOC1 showed significantly different expression between N2 and N0, and between N2 and N1. There was also a significant difference in expression between T2 and T4, and between T2 and T3. In 20 paired CRC and normal tissues, quantitative real-time polymerase chain reaction showed that APOC1 mRNA was strongly upregulated in CRC tissues (P = 0.014). PrognoScan and GEPIA2 revealed the prognostic value of APOC1 for overall survival and disease-free survival in CRC (P < 0.05). TIMER showed that APOC1 has a close relationship with immune infiltration (P < 0.05).
CONCLUSION APOC1 is a biomarker that is associated with both the diagnosis and prognosis of liver metastases of CRC.
Key Words: APOC1; Liver metastases; Colorectal cancer; Differentially expressed genes; Marker
Colorectal cancer (CRC) is one of the most malignant gastrointestinal cancers worldwide[1]. The liver is the most important metastatic target organ, and liver metastasis (LM) is the leading cause of death in patients with CRC[2,3]. Approximately 30%-50% of patients are found to have liver metastases after surgery, and 80%-90% of these cannot initially undergo radical resection[4-8]. Owing to the lack of sensitive biomarkers and the unclear molecular mechanism, the occurrence of liver metastases cannot be predicted, and the incidence of and mortality due to CRC continue to increase[9]. Detection and monitoring of liver metastases of CRC depend on imaging, serum biomarker detection, and other examinations; however, these methods have some limitations. Some patients are unwilling to undergo colonoscopy, so CRC and CRC liver metastases cannot be detected in a timely manner[10]. Over the past 20 years, many new technologies have been applied to explore gene expression and function in human malignant tumours[11].
In our study, we explored data from three microarray datasets from the NCBI Gene Expression Omnibus (GEO)[12] to identify highly differentially expressed genes (HDEGs). The significantly highly expressed genes shared by the three GEO datasets were obtained by intersection. GO[13] and KEGG[14] analyses were used to show the functions of the HDEGs. The GEPIA2 and PrognoScan online tools were used to validate the prognostic value of these genes in CRC[15]. APOC1 was found to be a significant gene associated with the prognosis of CRC and CRC LM (Figure 1). The online tool TIMER[16] and an R package were used to show the functions of APOC1 with regard to immunity. We also applied gene set enrichment analysis (GSEA)[17] to inspect the possible functions of APOC1 in CRC. In addition, we used quantitative real-time polymerase chain reaction (PCR) to detect the mRNA expression level of APOC1 in 20 CRC and paired normal adjacent tissue samples.

Figure 1 Analysis workflow of this study. DEGs: Differentially expressed genes; CRC: Colorectal cancer.
We downloaded the gene expression data from GEO (http://www.ncbi.nlm.nih.gov/geo/). The datasets included data for normal tissues, CRC tissues, and LM tissues, and each dataset had a minimum of five tumour and normal tissues. Based on the aforementioned criteria, three GEO datasets were chosen: GSE14297[18], GSE41258[19], and GSE49355[20,21].
Three series matrix files were obtained from GEO, and each GEO matrix was divided into two groups: the LM-tumour tissue group (L-T) and the LM-normal tissue group (L-N). The data were then processed with the R package ‘limma’ for normalisation and HDEG identification. DEGs were selected using |log(fold change)| > 1 and P value < 0.05. After obtaining the six sets of highly expressed genes, we used the R package ‘Venn’ to identify the genes shared among the three GEO datasets. We ultimately obtained 48 HDEGs.
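The selection and intersection steps described here can be illustrated with a short sketch. It is written in Python rather than the R packages used in the study and is not the authors' pipeline: the function names, the toy gene lists, and the assumption that "highly expressed" means a positive log fold change are all hypothetical and serve only to show the thresholding and the set intersection.

```python
from typing import Dict, Iterable, Set, Tuple

# One row per gene: (gene symbol, log fold change, P value).
DegRow = Tuple[str, float, float]


def select_highly_expressed(rows: Iterable[DegRow],
                            lfc_cutoff: float = 1.0,
                            p_cutoff: float = 0.05) -> Set[str]:
    """Keep genes with |logFC| > cutoff and P < cutoff that are upregulated.

    Treating "highly expressed" as logFC > 0 is an assumption of this sketch.
    """
    return {gene for gene, lfc, p in rows
            if abs(lfc) > lfc_cutoff and lfc > 0 and p < p_cutoff}


def shared_genes(groups: Dict[str, Set[str]]) -> Set[str]:
    """Intersect the gene sets of all comparison groups (the Venn core)."""
    sets = list(groups.values())
    return set.intersection(*sets) if sets else set()


# Hypothetical toy input: two of the six comparison groups, invented values.
toy = {
    "GSE14297_L-T": [("APOC1", 2.1, 0.001), ("GENE_A", 0.4, 0.010), ("GENE_B", 1.6, 0.200)],
    "GSE14297_L-N": [("APOC1", 3.0, 0.0004), ("GENE_B", 1.9, 0.003)],
}
selected = {name: select_highly_expressed(rows) for name, rows in toy.items()}
print(shared_genes(selected))
```

In a real analysis the per-gene statistics would come from a differential expression tool such as limma, and all six groups (L-T and L-N for each of the three datasets) would be passed to the intersection.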
The connections among the DEGs were analysed using the STRING database[22] (https://string-db.org/cgi/input.pl/). Cytoscape software[23] was used to visualise the connections by constructing the protein-protein interaction network.
The R packages ‘clusterProfiler’[24], ‘org.Hs.eg.db’, ‘enrichplot’, and ‘ggplot2’ were used for GO enrichment analysis. KEGG pathway analysis was conducted using the same R packages. In both analyses, an adjusted P value of < 0.05 was considered statistically significant.
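To make the idea of an enrichment P value and its adjustment concrete, the following is a minimal sketch of an over-representation test (a one-sided hypergeometric test) followed by Benjamini-Hochberg adjustment. It is an illustration in Python under stated assumptions, not the code used in the study; the function names and the example numbers are hypothetical.

```python
from math import comb
from typing import List


def hypergeom_sf(k: int, M: int, n: int, N: int) -> float:
    """P(X >= k): chance of seeing at least k term genes among N selected
    genes, when the background holds M genes and n of them carry the term."""
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvals: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: 3 of 4 selected genes fall in a term that
# covers 3 of 10 background genes.
p_term = hypergeom_sf(3, 10, 3, 4)
print(p_term, benjamini_hochberg([p_term, 0.04, 0.5]))
```

A term would then be reported as enriched when its adjusted P value falls below the chosen threshold (0.05 in the text above).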
We used the R packages ‘limma’ and ‘beeswarm’ to analyse the expression differences between normal and tumour tissues. We also explored the clinical significance of APOC1 using the R packages ‘limma’ and ‘ggpubr’. We used the online tool PrognoScan (http://www.prognoscan.org/) to validate the prognostic value of APOC1 in the GEO datasets GSE17537[25,26] and GSE14333[27]. We also validated the prognostic value in the TCGA database using the online tool GEPIA2 (http://gepia2.cancer-pku.cn/), which incorporates TCGA and GTEx data.
TISIDB[28] (http://cis.hku.hk/TISIDB/) was utilized to research the correlation between the gene expression and tumour-infiltrating immune cells. Additionally, we utilized R package ‘estimate’ to research the correlation between the gene expression and three kinds of scores, including the immune score, stromal score, and ESTIMATE score.
The functions of the hub genes were explored by GSEA. The COAD and READ datasets from TCGA were downloaded, and the 482 samples were divided into two groups: a high expression group and a low expression group. The gene set file ‘c2.cp.kegg.v6.2.symbols.gmt’ was used, and P < 0.001 was considered statistically significant. The R packages ‘plyr’, ‘ggplot2’, ‘grid’, and ‘gridExtra’ were used to show the significant pathways.
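The grouping step can be sketched as follows. The text does not state the cutoff used to separate the high and low expression groups; this Python sketch assumes the median expression value, a common choice, and the function name and sample values are hypothetical.

```python
from statistics import median
from typing import Dict, List, Tuple


def split_by_median(expression: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Split samples into high and low groups around the median expression.

    The median cutoff is an assumption of this sketch; samples equal to the
    median are placed in the low group.
    """
    cutoff = median(expression.values())
    high = sorted(s for s, v in expression.items() if v > cutoff)
    low = sorted(s for s, v in expression.items() if v <= cutoff)
    return high, low


# Hypothetical expression values for four samples.
high_group, low_group = split_by_median(
    {"sample1": 1.0, "sample2": 5.0, "sample3": 3.0, "sample4": 7.0}
)
print(high_group, low_group)
```

The two resulting sample lists would then serve as the phenotype labels for the enrichment analysis.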
RNA was extracted from CRC tissue using the TRIzol method. cDNA was obtained after reverse transcription and used as the template for real-time PCR detection. Then, quantitative real-time PCR was performed to detect the relative mRNA level of APOC1. The primers for APOC1 were: 5'-GTCCTGGTGGTGGTTCTGTC-3' (forward) and 5'-TCTCTGAAAACCACTCCCGC-3' (reverse).
Based on the selection criteria, three CRC datasets were selected for exploration, and their characteristics are summarized in Table 1. In GSE14297, 308 genes in the L-N group and 178 genes in the L-T group were significantly highly expressed; in GSE41258, 300 genes in the L-N group and 89 genes in the L-T group; and in GSE49355, 810 genes in the L-N group and 120 genes in the L-T group (Table 2). Volcano plots (Figure 2) show the DEGs.
We combined the highly expressed genes from GSE14297, GSE41258, and GSE49355 and applied Venn diagrams to get significantly highly expressed genes among the six groups (Figure 3A). STRING and Cytoscape software were used to visualise these HDEGs, as shown in Figure 3B.
GO enrichment was performed with pFilter < 0.05 and adjPfilter < 1, and the top five terms were chosen: Acute inflammatory response (P = 7.42 × 10⁻²¹), negative regulation of response to wounding (P = 5.20 × 10⁻²⁰), platelet degranulation (P = 6.68 × 10⁻²⁰), negative regulation of blood coagulation (P = 2.26 × 10⁻¹⁹), and negative regulation of haemostasis (P = 7.54 × 10⁻¹⁷) (Figure 4A). The top five pathways in the KEGG analysis (satisfying pFilter < 0.05 and adjPfilter < 1) were: Complement and coagulation cascades (P = 7.89 × 10⁻²⁰), cholesterol metabolism (P = 1.07 × 10⁻⁹), African trypanosomiasis (P = 0.00048), platelet activation (P = 0.0017), and drug metabolism-cytochrome P450 (P = 0.0033) (Figure 4B).
As shown in Figure 5, APOC1 expression was strongly associated with the clinical features. APOC1 showed significantly different expression between N2 and N0 and between N2 and N1 (Figure 5C). There was also a significant difference in expression between T2 and T4 and between T2 and T3 (Figure 5B). No significant differences were found across different ages and sexes (Figure 5D and E). To confirm the different expression levels in cancer, we further examined 20 paired CRC and normal tissues using quantitative PCR. As shown in Figure 5F, quantitative real-time PCR showed that APOC1 mRNA was strongly upregulated in CRC tissues compared with normal colorectal samples.

Table 1 Characteristics of datasets

Table 2 Differentially highly expressed genes in Gene Expression Omnibus datasets
We explored the prognostic value in GSE17537 and GSE14333 using the online tool PrognoScan: P = 0.0094 and P = 0.018 for overall survival in GSE17537 (Figure 6A and B); P = 0.016 and P = 0.013 for disease-free survival in GSE14333 (Figure 6C and D). We also validated the prognostic value in GEPIA2: P = 0.026 for overall survival and P = 0.046 for disease-free survival (Figure 6E and F).
APOC1 has a close relationship with immune infiltration (Figure 7). In the colon: CD4+ T cells (P = 1.76 × 10⁻¹³), CD8+ T cells (P = 2.84 × 10⁻¹⁷), B cells (P = 6.75 × 10⁻⁸), neutrophils (P = 0), macrophages (P = 1.21 × 10⁻³⁷), and dendritic cells (P = 0); in the rectum: CD4+ T cells (P = 0.00121), CD8+ T cells (P = 0.00016), B cells (P = 0.0223), neutrophils (P = 1.1 × 10⁻⁶), macrophages (P = 8.33 × 10⁻¹³), and dendritic cells (P = 1.48 × 10⁻¹¹). We also used the R package ‘estimate’ to calculate three scores: Immune score, stromal score, and ESTIMATE score (Figure 8).
GSEA showed that APOC1 was enriched in ‘Toll-like receptor signalling pathway’, ‘Cytokine receptor interaction’, ‘Cell adhesion molecules CAMs’, ‘Chemokine signalling pathway’, and ‘Intestinal immune network for IGA production’ in the high expression group, and in ‘Lysine degradation’, ‘Peroxisome’, ‘Pyruvate metabolism’, ‘Glycerolipid metabolism’, and ‘Fatty acid metabolism’ in the low expression group (Figure 9).
CRC is a common gastrointestinal tumour, and its metastasis is one of the main causes of death in patients with CRC. The liver is the most important target organ of metastasis, and approximately 30%-50% of patients have LM at diagnosis or after surgery[4-8]. However, the vast majority of patients with liver metastases cannot initially undergo radical resection and have a poorer prognosis than those with other types of metastases. Therefore, it is critical to identify markers to predict and diagnose colorectal liver metastases.
To the best of our knowledge, our study is the first to divide each GEO dataset into two groups, LM-normal and LM-tumour, to explore the HDEGs in CRC. We integrated three datasets to identify HDEGs. GO and KEGG analyses were performed to examine the functions of the HDEGs in the three datasets. GO analysis indicated that the acute inflammatory response, negative regulation of response to wounding, platelet degranulation, negative regulation of blood coagulation, and negative regulation of haemostasis were closely related to the development and growth of cancer. For KEGG pathway analysis, complement and coagulation cascades were closely related to immune functions, platelet activation was associated with tumour metastasis, and cholesterol metabolism was associated with the pathogenesis of CRC.

Figure 2 Volcano plots of Gene Expression Omnibus data. A: Volcano plot depicting the differential expression and distribution of GSE14297 in liver metastasis (LM)-tumor group; B: Volcano plot depicting the differential expression and distribution of GSE14297 in LM-normal group; C: Volcano plot depicting the differential expression and distribution of GSE41258 in LM-tumor group; D: Volcano plot depicting the differential expression and distribution of GSE41258 in LM-normal group; E: Volcano plot depicting the differential expression and distribution of GSE49355 in LM-tumor group; F: Volcano plot depicting the differential expression and distribution of GSE49355 in LM-normal group.

Figure 3 Venn diagram and protein-protein interaction network of differentially expressed genes. A: Venn diagram showing the numbers of differentially expressed genes; B: Protein-protein interaction network of 47 highly expressed genes.
We used online tools to examine the prognostic value of the HDEGs. According to the results, APOC1 was associated with the survival time of CRC patients. To explore further, we analysed APOC1 differential expression between tumour and normal tissues and with regard to age, sex, and T and N stages. Significant differences were observed between tumour and normal tissues, and the same was seen in our clinical samples, which is consistent with the analysis results. This suggests that APOC1 levels increase from the initial development of CRC to LM.
To explore the mechanisms of APOC1 in CRC, we used the TIMER online tool to assess immune infiltration, calculated the immune score, stromal score, and ESTIMATE score, and applied GSEA to explore the biological functions of APOC1. The TIMER results revealed that APOC1 had a strong correlation with lymphocyte infiltration, which provides a new perspective for exploring the mechanism of colorectal LM. With the development of tumour research, many studies have shown the important role of the tumour microenvironment (TME) in tumorigenesis and therapy[29-33]. The TME might have an impact on the therapy and clinical outcome of patients with cancer. In our study, analysis of tumours of the colon and rectum indicated a close relationship between the TME and APOC1 expression in CRC.
Several analyses indicate that stromal cells contribute to tumour angiogenesis and extracellular matrix remodelling[34-36]. Meanwhile, a few studies have focused on the influence of immune cells in the TME on tumorigenesis and development. Several studies revealed that tumour-infiltrating immune cells might be a potential marker of therapeutic effects. In our study, we analysed the stromal score and immune score in the colon and rectum separately. APOC1 showed a strong relationship with stromal components and immune cells in CRC, as reflected in these scores.
The GSEA of APOC1 indicated that in the high expression group, APOC1 was enriched in ‘Cytokine receptor interaction’[37], ‘Cell adhesion molecules CAMs’[38], and ‘Chemokine signalling pathway’[39], suggesting that APOC1 can influence CRC development. Enrichment in ‘Intestinal immune network for IGA production’[40] indicates that APOC1 has a close relationship with the immune response. Moreover, in the low expression group, APOC1 was enriched in ‘Lysine degradation’, ‘Peroxisome’, ‘Pyruvate metabolism’, ‘Glycerolipid metabolism’, and ‘Fatty acid metabolism’, which indicates that APOC1 might have an impact on the tumorigenesis and development of CRC through different metabolic pathways[41,42]. These findings might present a new perspective on the molecular mechanism of CRC. However, this study still has some limitations. Because our work relies mainly on the bioinformatic analysis of datasets, more basic research experiments should be performed to confirm these results.
In conclusion, by combining three GEO datasets, we identified and characterised some significant HDEGs in the liver metastases of CRC. Among 48 DEGs, APOC1 is a biomarker that is associated with both the diagnosis and prognosis of liver metastases of CRC. Furthermore, analysis of the relationship with immune infiltration and the TME, together with gene set enrichment analysis, demonstrated that APOC1 was strongly associated with CRC development.

Figure 4 GO and KEGG enrichment. A: Chord plot depicting the relationships between the genes and GO terms; B: Chord plots depicting the functions of the genes in KEGG pathways.

Figure 5 Visualization of correlations between APOC1 expression levels and clinical features. A: Differences in APOC1 expression between control tissues and colorectal cancer tissues based on TCGA database; B: Differences in APOC1 expression between different T stages based on TCGA database; C: Differences in APOC1 expression between different N stages based on TCGA database; D: Differences in APOC1 expression between different ages based on TCGA database; E: Differences in APOC1 expression between different genders based on TCGA database; F: Quantitative real-time polymerase chain reaction assay showed the mRNA expression of APOC1 in 20 paired colorectal cancer tissues and normal samples. aP < 0.05. T: Tumor; N: Normal.

Figure 6 Validation of prognostic value of APOC1. A and B: Overall survival of APOC1 in GSE17537; C and D: Disease-free survival of APOC1 in GSE14333; E: Overall survival of APOC1 in TCGA; F: Disease-free survival of APOC1 in TCGA. OS: Overall survival; DFS: Disease-free survival.

Figure 7 Relationship with immune infiltration. A: Immune cell expression of APOC1 in colon cancer; B: Immune cell expression of APOC1 in rectal cancer.

Figure 8 Relationship with tumor microenvironment. A: Estimated score of APOC1 in colon cancer; B: Estimated score of APOC1 in rectum cancer; C: Immune score of APOC1 in colon cancer; D: Immune score of APOC1 in rectum cancer; E: Stromal score of APOC1 in colon cancer; F: Stromal score of APOC1 in rectum cancer.

Figure 9 Gene set enrichment analysis of APOC1.
Colorectal cancer (CRC) is one of the most malignant gastrointestinal cancers worldwide. The liver is the most important metastatic target organ, and liver metastasis is the leading cause of death in CRC patients.
There is still a lack of diagnostic or prognostic markers for liver metastasis (LM) of CRC. Therefore, it is very important to identify the diagnostic or prognostic markers for LM of CRC to improve the clinical outcomes.
This study aimed to explore the highly differentially expressed genes (HDEGs) and prognostic marker for LM of CRC.
Three NCBI Gene Expression Omnibus (GEO) datasets were used to identify a set of HDEGs. The significant HDEGs shared by the three GEO datasets were obtained by intersection, and these intersection genes were screened with an online tool to explore their prognostic value. TIMER and an R package were used to investigate the potential immune functions of the HDEGs, and gene set enrichment analysis was performed to explore their possible impact on CRC.
APOC1 is one of 47 HDEGs in three GEO datasets for LM of CRC and showed significantly different expression between different N and T stages in the TCGA database. APOC1 mRNA was strongly upregulated in cancer tissues compared with normal tissues, as confirmed by quantitative real-time polymerase chain reaction. The prognostic value of APOC1 for overall survival and disease-free survival in CRC was revealed with PrognoScan and GEPIA2. APOC1 also has a close relationship with immune infiltration, as shown with TIMER.
APOC1 is a potential biomarker that is associated with both the diagnosis and prognosis of liver metastases of colorectal cancer.
Future work and basic research should be performed to confirm these findings on APOC1 and to verify the related potential regulatory mechanisms in vitro and in vivo.
World Journal of Clinical Cases, 2021, Issue 16