中圖分類號:S821 文獻標志碼:A 文章編號:1001-4330(2025)04-0993-09
0 引言
【研究意義近年來,新一代測序技術的迅速發(fā)展為遺傳性狀分析提供了重要的技術和工具,為研究動物的重要經濟性狀提供了新的機會[1-3]。【前人研究進展】全基因組重測序數據研究已經出現家禽家畜經濟性狀重要候選基因的研究。例如,基于全基因組重測序數據鑒定虹魚熱適應性狀的候選基因和基因組間隔[4],鑒定山羊產仔數的候選基因[5],鑒定綿羊腰椎[6],牦牛種群對極端環(huán)境適應能力的候選基因的研究[,以及大量使用全基因組重測數據探索豬[8]狗[9]、雞[10]和鴨[1]的種群基因組特征和候選基因的研究。對于泌乳量在牛和山羊等物種研究較多,對東弗里西亞綿羊種群的遺傳和基因組變異進行分析,并確定與泌乳性狀相關的候選基因組區(qū)域和候選基因[12]。【本研究切入點】馬乳營養(yǎng)成分與人乳相近。然而哈薩克馬泌乳量相關重測序數據很少被研究。【擬解決的關鍵問題】12匹哈薩克馬進行全基因組重測序,確定調控哈薩克馬高低泌乳量的候選基因,為后續(xù)馬屬動物泌乳相關研究提供分子基礎。
1 材料與方法
1.1 材料
試驗馬匹來自阿勒泰市阿克庫勒馬業(yè)自有馬場,馬匹在 10:00,13:00,16:00,19:00 四個時間段機械擠奶,每次擠奶結束使用手提電子秤稱重,測定試驗馬匹產后 30~105d 泌乳量。白天馬匹擠奶期間正常補飼,晚上擠奶結束天然草場放牧,馬匹自由采食飲水,試驗馬匹 24h 泌乳量由 W= G×24/h 換算得出,其中W為 24h 泌乳量,G為實際泌乳量,h為擠奶時間。根據試驗馬匹產后30~60d 日均泌乳量數據,選取12匹年齡、胎次、體況相近,飼養(yǎng)一致的健康哈薩克馬,分為高泌乳量組(HW)和低泌乳量組(LW),每組各6匹,選擇HW組泌乳量大于 2.26kg 時的血樣6份和LW組泌乳量小于 1.05kg 時的血樣6份, -80°C 超低溫速凍,用于后續(xù)檢測。表1
表1 馬匹樣品采集
1.2 方法
1. 2.1 DNA提取質檢
使用CWE96OOMagbeadBloodDNAKit試劑盒進行DNA的提取,并使用NanoDrop2000核酸蛋白測定儀檢測DNA純度,使用dsDNAHSAs-sayKitforQubit試劑盒,檢測DNA濃度, 1.5% 瓊脂糖凝膠, 150V 電泳 25min ,檢測DNA完整性,質檢合格后開始建庫測序。
1. 2.2 文庫構建及測序
使用通用型文庫構建試劑盒(forMGI)Nad-Prep°ledast DNA經破碎儀CovarisTM隨機打斷成300-350bp 左右的片段,經過末端修復、加A尾并連接測序接頭,用 NadPrep@SPBeads篩選300 -350bp 左右的DNA,進行PCR擴增并再次使用NadPrepB SPBeads純化PCR產物,最終獲得測序文庫,文庫構建完成后,先使用Qubit2.0進行初步定量,隨后使用Bioanalyzer?(Agilent)對文庫的插入片段檢測,符合預期后,上機測序。
1.2.3 測序數據過濾
使用fastp對原始數據(rawreads)進行質量控制(QC),數據質控的標準如下:
去除帶接頭(adapter)的reads;
去除N含量大于 1% 的reads;
去除低質量堿基( Q?5 )含量大于 50% 序列。
1.2.4與參考基因組比
通過BWA0.7.17(men)軟件將CleanReads比對到參考基因組上,samtools1.7排序并構建索引,使用GATK4.1.8.0軟件自帶的模塊對Bam
文件進行去重,然后基于Bam文件進而統(tǒng)計各樣品的測序深度、基因組覆蓋度等信息
1.2.5 SNP/InDEL檢測和注釋
根據有效數據在參考基因組的比對結果,使用軟件GATK4.1.8.0對SNP位點進行評估,后采用VariantFiltration模塊對SNP和InDEL進行嚴格過濾,根據QD,FS,MQ,SORMQRankSum,ReadPosRankSum等指標過濾。
1.2.6 基因功能富集
通過R包clusterProfiler實現基因特異性富集分析。在GO和KEGG數據庫中注釋的哈薩克馬基因作為背景基因進行富集分析,選擇符合 P lt;0.05 的GO術語和KEGG通路進行濃縮聚類,確定與泌乳量相關的生理過程。
2 結果與分析
2.1 測序數據質控
研究表明,對哈薩克馬高、低泌乳量2組共12個樣本進行全基因組測序,共得到20234241個突變位點;去除帶接頭序列、低質量序列后得到9804918個有效位點;泌乳量性狀表型數據無異常;質控后,有效SNP位點在各條染色體均有分布,1、2、3、4、8、7、5、20、14和16號染色體突變位點較多,其中突變位點分布在1號染色體數量最多,質控后有748461個有效突變位點,平均距離為0.25;22、24、27、26、13、28、29、25、30和31號染色體突變位點較少,其中突變位點分布在31號染色體數量最少,質控后有126485個有效突變位點,平均距離為0.21。表2\~3,圖1
表3 SNP密度及分布匯總
2.2 哈薩克馬高低泌乳量組表型描述統(tǒng)計
研究表明,高、低泌乳量2組樣本表型整體符合正態(tài)分布,12個樣本分為兩類,正態(tài)分布結果準確,可用于后續(xù)分析。表4,圖2\~3
2.3 群體結構主成分
研究表明,獲得各個PC的方差解釋率及樣本在各個PC中的得分矩陣,從SNPs信息中提取的關鍵信息按照效應從大到小分為PC1、PC2、PC3,得到影響最大的2個特征向量,進而反映分析群體的群體結構。圖4
2.4 顯著位點篩選及通路富集
選擇Case-controlmodel中的卡方檢驗模型,分析哈薩克馬泌乳量相關SNPs關聯(lián)性。
y=Xβ+ZkYk+e
y 為表型向量, Xβ 為固定效應(本次分析無), ZkYk 為待檢驗標記效應, e~N(0,1σ2) 為殘差效應。
選擇Top前0.00001的位點,閾值是0.0001478,共88個突變位點,位于40個基因。圖5,表5
對40個顯著基因進行GO和KEGG功能富集分析,GO富集分析共富集到30個GO術語( P lt;0.05 ),其中生物過程3條,包括跨膜受體蛋白酪氨酸激酶活性、蛋白質絲氨酸/蘇氨酸/酪氨酸激酶活性、細胞粘附分子結合;細胞組分5條,包括突觸后膜的組成部分、谷氨酸能突觸、突觸后膜、樹突棘、神經元棘;分子功能22條包括細胞粘附、參與分化的細胞形態(tài)發(fā)生、細胞成分形態(tài)發(fā)生、細胞發(fā)育、節(jié)律過程、發(fā)育過程的調控、神經元投射發(fā)育的調控、ERK1和ERK2級聯(lián)的調節(jié)、參與分化的細胞形態(tài)發(fā)生的正調控、血管內皮生長因子信號通路、基質粘附依賴性細胞擴散的調控、MAPK級聯(lián)的調節(jié)、基因表達的晝夜節(jié)律調節(jié)、細胞對血管內皮生長因子刺激的反應、成纖維細胞增殖、成纖維細胞增殖的調控、細胞間粘附、神經元發(fā)育、神經元投射發(fā)育的負調控、內分泌系統(tǒng)發(fā)育、細胞激素代謝過程、對鈣離子的反應、中樞神經系統(tǒng)神經元分化。圖6
KEGG富集分析表明顯著基因主要富集到19條通路,涉及PI3K-Akt信號通路、Axon引導、ECM受體相互作用、造血細胞譜系、嘌呤代謝、調節(jié)干細胞多能性的信號通路、磷脂酶D信號通路、運動蛋白、Rap1信號通路、肌動蛋白細胞骨架的調控、Ras信號通路、鈣信號通路、MAPK信號通路、乳頭瘤病毒感染、核苷酸代謝、甘油磷脂代謝、黑色素生成、TGF-β信號通路、蛋白質的消化和吸收、松弛素信號通路。圖7
量表達[13],并參與多樣化組織的發(fā)育[14,15]。在皮膚上皮細胞中,脂肪細胞前體細胞參與驅動再生[16]。譜系追蹤研究已將PDGFRA確立為脂肪細胞祖細胞的標記[17,18]。乳腺細胞發(fā)育、繁殖、分化、生存、凋亡對家畜泌乳密切相關,研究GO和KEGG富集分析表明,通路主要富集在細胞增殖、分化、存活相關過程,如細胞粘附、參與分化的細胞形態(tài)發(fā)生、細胞成分形態(tài)發(fā)生、細胞發(fā)育、PI3K-Akt信號通路等。類固醇性激素在上皮中誘導PDGF信號,這些信號會動員PDGFRA以趨化方式刺激其遷移,對乳腺早期發(fā)育產生積極影響[19,20]。同時在小鼠乳腺中PDGFRA 的過度表達會導致正常小鼠乳腺上皮細胞的異常增殖[21]不僅如此,PDGFRA的過度表達、點突變、缺失和易位在惡性表型的腫瘤發(fā)生和維持中起著重要作用[22]。在研究中,通路富集弦圖表明PDGFRA調控PI3K-Akt信號通路,因此推測PDGFRA過表達調控PI3K-Akt信號通路,促使乳腺上皮細胞增殖,分化為乳腺腺泡,促進哈薩克馬泌乳。PDGFRA可以減緩山羊上皮細胞周期的G1/S階段停止,并通過調節(jié)PI3K-Akt信號通路來促進乳腺上皮細胞增殖分化乳腺腺泡,促進山羊乳房泌乳[23]。
所有富集通路可大致分為與細胞發(fā)育、增殖、分化、粘附相關的細胞過程,物質運輸,中樞神經系統(tǒng)、內分泌系統(tǒng)發(fā)育,晝夜節(jié)律調控、消化吸收相關生理過程,其中細胞發(fā)育、增殖、分化、粘附相關和中樞神經系統(tǒng)、內分泌系統(tǒng)發(fā)育富集通路數量最多,可能對哈薩克馬泌乳影響作用更加重要。
3討論
3.1研究通過對高低泌乳量哈薩克馬進行全基因組重測序,選擇Top前0.00001的位點,閾值是0.0001478,共88個突變位點,位于40個基因。其中KIT、EPHA4、PDGFRA、MYH3基因富集通路最多,且主要富集到細胞發(fā)育、增殖、分化、粘附相關和中樞神經系統(tǒng)、內分泌系統(tǒng)富集通路數量最多,因此推測細胞相關過程和神經內分泌系統(tǒng)可能對哈薩克馬泌乳影響作用更加重要。
3.2血小板衍生生長因子受體Alpha(PDGFRA),在正常的間質成纖維細胞和乳腺細胞中大
3.3酪氨酸-蛋白激酶A型受體2(EphA2),是受體酪氨酸激酶(RTKs)的最大家族Eph受體成員。在研究中,GO和KEGG富集分析發(fā)現EphA2被富集在細胞粘附、參與分化的細胞形態(tài)發(fā)生、細胞成分形態(tài)發(fā)生、細胞發(fā)育、ERK1和ERK2級聯(lián)的調節(jié)、MAPK級聯(lián)的調節(jié)等通路,這些通路與乳腺發(fā)育和泌乳密切相關[24-27]。乳腺上皮中 E phA2 的發(fā)育調控表達也有報道[28-31],多項研究表明受體酪氨酸激酶是分支形態(tài)發(fā)生的關鍵調節(jié)劑之一,EphA2通過調控內分泌激素和發(fā)育中的上皮管與其鄰近的間充質間質間質之間的局部旁分泌相互作用,影響乳腺分支形態(tài)發(fā)生,泌乳期的乳腺上皮會分化并急劇擴張,以滿足整個哺乳期間的牛奶生產需求;EphA2受體丟失導致乳腺上皮細胞對脂肪墊的穿透減少,上皮增殖減少,上皮分支受到抑制,泌乳能力大幅降低[32]。除此之外,研究通路富集弦圖表明,EphA2調控的中樞神經系統(tǒng)、神經突觸發(fā)育、分化也占有較大比重。因此推測EphA2通過參與機體中樞神經內分泌系統(tǒng),調控哈薩克馬泌乳。前人研究發(fā)現EphA2通過MAPK信號通路的磷酸化,參與調節(jié)神經細胞增殖和調亡[33] 。
3.4整合素亞基Alpha1(ITGA1),參與細胞粘附[34];膠原蛋白IV型Alpha1( COLAA1 ),調控包括膠原蛋白鏈三分化和整合素等通路[35]。本研究KEGG富集分析和富集通路弦圖發(fā)現ITGA1和COL4A1共同調控ECM受體相互作用、乳頭瘤病毒感染兩條通路,ECM受體相互作用、乳頭瘤病毒感染都與乳腺細胞重塑和乳產量顯著相關[36]并在山羊乳腺基因表達圖譜[37]、牛妊娠期乳腺基因[38表達得到驗證。研究合理推測ITGA1和COL4A1通過調控哈薩克馬妊娠期乳腺重塑和乳腺細胞發(fā)育影響哈薩克馬泌乳量。
4結論
通過對6匹高泌乳量哈薩克馬和6匹低泌乳量哈薩克馬進行全基因組重測序,得到9804918個有效位點,選擇Top前0.00001的位點,確定閾值為0.0001478,共88個突變序列,位于40個候選基因。乳腺上皮細胞增殖、分化、存活相關細胞過程,中樞神經系統(tǒng)、內分泌系統(tǒng)多生物學過程協(xié)調作用,影響哈薩克馬泌乳量。確定了KIT、EPHA4、PDGFRA、MYH3、ITGA1、COL4A1可能影響哈薩克馬泌乳量的候選基因。
參考文獻(References)
[1]YanT,Yao Y,WuDZ,et al.BnaGVD:a genomic variation databaseof rapeseed(Brassica napus)[J].Plant amp; Cell Physiology,2021:pcaa169.
[2]LiuXX,ZhangYL,LiYF,etal.EPAS1gain-of-function mutationcontributes tohigh-altitudeadaptationin Tibetan horses[J]. MolecularBiologyand Evolution,2019,36(11) : 2591 -2603.
[3]KijasJW,LenstraJA,HayesB,etal.Genome-wideanalysis oftheworld’ssheepbreedsrevealshighlevelsofhistoricmixture and strongrecent selection[J].PLoSBiology,2012,10(2): e1001258.
[4]Chen ZQ,Narum SR. Whole genome resequencing revealsgenomic regionsassociatedwiththermal adaptationinredband trout [J].MolecularEcology,2021,30(1):162-174.
[5]WangK,Liu XF,QiT,et al.Whole-genome sequencing to identifycandidategenesforlittersizeandtouncoverthevariant functionin goats(Capra hircus)[J].Genomics,2021,113 (1):142-150.
[6]LiCY,Liu KP,Dai JH,et al.Whole-genome resequencing toinvestigatethedeterminantsofthemulti-lumbarvertebrae traitin sheep[J].Gene,2022,809:146020.
[7]LanDL,JiWH,XiongXR,etal.Populationgenome of the newlydiscovered Jinchuan yak to understand its adaptive evolutionin extreme environmentsand generation mechanism of the multirib trait[J]. Integrative Zoology,2021,16(5):685- 695.
[8]Li W T, Zhang M M,Li Q G, et al. Whole - genome resequencing reveals candidate mutations for pig prolificacy[J]. Proceedings Biological Sciences,2017,284(1869) : 20172437.
[9]Amiri Ghanatsaman Z,Wang GD,Asadollahpour Nanaei H,et al.Whole genome resequencing of the Iranian native dogs and wolves to unravel variome during dog domestication[J]. BMC Genomics,2020,21(1):207.
[10] Lawal R A,Al-Atiyat R M, Aljumaah R S, et al. Whole - genome resequencing of red junglefowl and indigenous village chicken reveal new insights on the genome dynamics of the species[J]. Frontiers in Genetics,2018,9:264.
[11]Li X L, Yuan L F, Wang W M,et al. Whole genome re- sequencing reveals artificial and natural selection for milk traits in EastFriesian sheep[J].Frontiersin Veterinary Science,2022, 9: 1034211.
[12]Heldin C H. Targeting the PDGF signaling pathway in tumor treatment[J]. Cell Communication and Signaling, 2013,11: 97.
[13]Andrae J,GalliniR,Betsholtz C.Role of platelet-derived growth factors in physiology and medicine[J].Genesamp; Development,2008,22(10): 1276-1312.
[14]Hoch RV,Soriano P.Roles of PDGFin animal development [J].Development,2003,130(20): 4769-4784.
[15]FestaE,FretzJ,BerryR,etal.Adipocyte lineagecellscontribute to the skin stem cell niche to drive hair cycling[J]. Cell, 2011,146(5):761-771.
[16]LeeYH,PetkovaAP,MotilloEP,etal.In vivo identificationof bipotential adipocyte progenitors recruited by β3-adrenoceptor activation and high -fat feeding[J]. Cell Metabolism, 2012,15(4): 480-491.
[17]Berry R,Rodeheffer M S. Characterization of the adipocyte cellular lineage invivo[J].Nature Cell Biology,2013,15(3): 302-308.
[18]Jechlinger M,SommerA,MorigglR,etal.AutocrinePDGFR signaling promotes mammary cancer metastasis[J].The Journal ofClinical Investigation,2006,116(6):1561-1570.
[19]Joshi P A,Waterhouse P D, Kasaian K, et al. PDGFRα(+) stromal adipocyte progenitors transition into epithelial cels during lobulo -alveologenesis in the murine mammary gland[J]. Nature Communications,2019,10(1):1760.
[20]Nilson L A,DiMaio D.Platelet-derived growth factor receptor can mediate tumorigenic transformation by the bovine papillomavirus E5 protein[J]. Molecular and Cellular Biology,1993,13 (7):4137-4145.
[21]Camorani S, Esposito CL,Rienzo A,et al.Inhibition of receptorsignalingand of glioblastoma-derived tumor growthbyanovel PDGFRβ aptamer[J].Molecular Therapy,2014,22((4): 828-841.
[22]XuanR,Wang JM,Zhao XD,et al. Transcriptome analysis of goat mammary gland tissue reveals the adaptive strategies and molecular mechanisms of lactation and involution[J].International Journal ofMolecularSciences,2022,23(22):14424.
[23]SlepickaPF,Somasundara AVH,dosSantosCO.The molecularbasis of mammary gland development and epithelial differentiation[J]. Seminars in Cellamp; Developmental Biology,2021, 114: 93 -112.
[24]LiuSJ,CaoHR,GuoD,et al.Pou2F3silencingenhanced the proliferation of mammary epithelial cels in dairy goat via PI3K/AKT/mTOR signalingpathway[J].Animal Biotechnology,2022,33(2):321-329.
[25]FreyRS,SingletaryKW.Genisteinactivatesp38mitogenactivatedproteinkinase,inactivatesERK1/ERK2and decreases Cdc25C expression inimmortalized human mammary epithelial cells[J].The Journal ofNutrition,2003,133(1):226-231.
[26]AlexA,BhandaryE,McGuireKP.Anatomy and physiologyof thebreast during pregnancy and lactation[J].Advances in Experimental Medicineand Biology,2020,1252:3-7.
[27]Biswas SK,Banerjee S,Baker G W,et al.The mammary gland:basic structure and molecular signaling during development[J].InternationalJournal ofMolecularSciences,2O22,23 (7):3883.
[28]IeguchiK,MaruY.Eph/ephrin signalingin the tumor microenvironment[J].AdvancesinExperimentalMedicineand Biology ,2021,1270:45-56.
[29]FonsecaPAS,Suárez-Vega A,Esteban-Blanco C,et al. Epigenetic regulation of functional candidate genes for milk productiontraitsindairysheepsubjected toproteinrestrictioninthe prepubertal stage[J].BMC Genomics,2023, 24(1):511 :
[30]Feigman MJ,MossMA,ChenC,et al.Pregnancy reprograms the epigenome of mammary epithelial cells and blocks the development of premalignant lesions[J].Nature Communications, 2020,11(1):2649.
[31]Bussell KN.Regulationand functional analysis of EphA2 in normaland Ras-transformed mammaryepithelium[D].London,UK:UniversityofLondon,UniversityCollegeLondon, 2001.
[32]VaughtD,ChenJ,Brantley-SiedersDM.Regulationof mammary gland branching morphogenesis by EphA2 receptor tyrosine kinase[J].MolecularBiologyof theCell,2009,20 (10):2572-2581.
[33]SalemAF,GambiniL,BilletS,etal.Prostate cancermetastasesare stronglyinhibitedbyagonistic Epha2 ligands inanorthotopic mousemodel[J].Cancers,2020,12(10):2854.
[34]Brieher WM,Yap AS. Cadherin junctionsand their cytoskeleton(s)[J].CurrentOpinioninCell Biology,2013,25(1):39 -46.
[35]SandJMB,Madsen SF,Karsdal M A.TypeIV collagen [M]//Biochemistry of collagens,laminins and elastin. AcademicPress,2024:37-53.
[36]SongXY,ZhaoM,CaoQQ,etal.Transcriptome provides insightsinto bovine mammary regulatory mechanisms during the lactationcycle[J]. Journal ofAppliedAnimal Research,2022, 50(1):275-288.
[37]WuXY,ZhouXL,XiongL,et al.Transcriptome analysisreveals the potential role of long non-coding RNAs in mammary glandof yakduringlactationand dryperiod[J].FrontiersinCell and Developmental Biology,2020,8:579708.
[38]ChenWH,LvXY,WangY,etal.Transcriptional profilesof longnon-codingRNA and mRNA in sheep mammary gland duringlactationperiod[J].Frontiers inGenetics,2020,11:946.
Abstract:【Objective】 To perform whole genome resequencing of jugular vein blood of Kazakh horses to identify candidate genes regulating differences in lactation in Kazakh horses and to provide data support for the selection of Kazakh horses for dairy performance.【Methods】According to the lactation volume data of 30- 105 dof test horses,12 healthy Kazakh horses with similar age,parity and consistent feeding background,that was,6 in the high lactation group(HW)and 6in the low lactation group were selected.Altogether,a total of 288 blood samples were collected during the whole monitoring cycle,and 48 blood samples were selected for whole genome resequencing,6 blood samples of lactation greater than 2. 26kg in the HW group and 6 blood samples of lactation less than 1.O5 kg in the LW group were selected for whole genome resequencing,and a total of 288 blood samples were obtained.A total of 88 blood samples were collected during the whole monitoring period,and 48 blood samples were collected at the actual peak lactation date.6bloodsamples from the HW group with lactation greater than 2.26 kg and 6 blood samples from the LW group with lactation less than 1.05 kg were selected for whole - genome resequencing,which yielded a total of 20,234,241 mutated sites; 980,491,918 valid sites were obtained after removing the sequences with junctions and low-quality sequences.The loci in the first 0.000,01 of Top were selected,and the threshold value was determined to be O. 000,147,8,with a total of 88 mutation loci located in 4O candidate genes.The candidate genes were analysed for GO and KEGG pathway enrichment,and the significantly enriched pathways were condensed and classified to locate the biological processs affcting high and low lactation in Kazakh horses;the pathway enrichment chordal maps were drawn to further identify the genes related to lactation in Kazakh horses.【Results】 GO and KEGG enrichment analyses found the largest number in cell development,proliferation,diferentiation,adhesion-related and CNS and endocrine system developmental enrichment pathways,and the pathways of cell adhesion molecule binding,morphogenesis of diferentiated cells,morphogenesis ofcellular components,cellular development,PI3K-Akt signalling pathway,Axon guidance,and ECM receptor interactions affecting lactation.【Conclusion】 It is determined that KIT,EPHA4,PDGFRA,MYH3,ITGA1,and COL4A1 may be candidate genes affecting lactation in Kazakh horses.
Key words:Kazakh horse; lactation;whole genome resequencing;physiological process