Chen Yitang et al.
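Abstract  The application of factor analysis to the analysis of imaging mass spectrometry data was studied. The imaging data analyzed here were acquired by air flow-assisted ionization imaging mass spectrometry, and the sample consisted of handwriting made with three different inks (red, blue and black). Factor analysis divided the imaging data into a background factor, a black factor, a blue factor and a red factor. The results showed that the contributions of m/z 443.2, 478.4 and 322.2 (344.2) to the red, blue and black factors, respectively, were far greater than those of the other mass-to-charge ratios, so these are the characteristic m/z values of the three inks. This result agrees with the actual composition of the sample and demonstrates that factor analysis is a feasible way to analyze imaging mass spectrometry data and extract features from them. The results of processing the imaging data by factor analysis and by principal component analysis were also compared. The comparison showed that factor analysis allows characteristic m/z values to be selected more simply and quantitatively, and that it has considerable potential in biomarker extraction, disease diagnosis and pharmacological analysis.
Keywords  Factor analysis; Imaging mass spectrometry; Air flow-assisted ionization; Multivariate statistics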
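1 Introduction
In recent years, imaging mass spectrometry (IMS) has developed rapidly as an active area of mass spectrometry research, and it plays an increasingly important role in clinical applications such as characterizing tissue pathology, diagnosing disease, evaluating drug efficacy and discovering biomarkers [1~5].
As imaging mass spectrometry has advanced [6~8], both its mass resolution and its spatial resolution have kept improving. The raw imaging data sets have become very large as a result, and processing them by manual screening is increasingly difficult. Researchers have therefore begun to use multivariate statistical methods [9~12] to reduce the dimensionality of imaging mass spectrometry data and to extract features from them. Multivariate statistics is a collective name for a family of mathematical methods, and finding a specific model within that family that suits imaging mass spectrometry data has become one of the research topics in the field [13,14].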
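At present, the multivariate statistical methods commonly applied to imaging mass spectrometry data include principal component analysis (PCA) [15,16], hierarchical cluster analysis (HCA) [17] and partial least squares discriminant analysis (PLS-DA) [18]. These methods have successfully reduced the dimensionality of large mass spectrometry data sets and extracted features from them, which has promoted the use of imaging mass spectrometry in many fields. As statistical methods, however, they produce results that are largely mathematical in character, and it is often difficult to interpret those results in terms of the actual sample. In addition, compared with biomarkers established by other techniques, the markers (m/z values) extracted by these methods are usually few, so specific m/z values of real significance may be missed.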
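In this study, the application of factor analysis (FA) to imaging mass spectrometry data was investigated on the basis of air flow-assisted ionization imaging mass spectrometry (AFAI-IMS) [19]. A sample of mixed handwriting was imaged to obtain the raw imaging mass spectrometry data, and factor analysis was then applied to these data, dividing them into a background factor, a black factor, a blue factor and a red factor. The results showed that the contributions of m/z 443.2, 478.4 and 322.2 (344.2) to the red, blue and black factors, respectively, were far greater than those of the other mass-to-charge ratios, so these are the characteristic m/z values of the three inks. This result agrees with the actual composition of the sample and demonstrates that factor analysis is a feasible way to analyze imaging mass spectrometry data and extract features from them.
The results of processing the imaging data by factor analysis and by principal component analysis were also compared. The comparison showed that factor analysis allows m/z values to be selected correctly and comprehensively in a simpler and more quantitative way, so that several m/z values can be identified and extracted together as a combined marker of a target sample component. Compared with the multivariate statistical methods in common use, factor analysis can effectively extract and reflect specific factors, and it has considerable potential in biomarker extraction, disease diagnosis and pharmacological analysis.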
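3 Results and Discussion
3.1 Factor analysis of the sample
AFAI-IMS imaging data were acquired from the sample and then subjected to factor analysis. As noted earlier, the number of factors into which the raw data are to be divided must be set in advance, so preliminary calculations were carried out with different numbers of factors. The results showed that dividing the raw data into 4 factors retains 99.6% of the information, whereas setting more factors increases the retained information only slightly. The imaging data were therefore divided into 4 factors.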
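The selection of the number of factors described above can be sketched as follows. This is only a minimal illustration under stated assumptions: "retained information" is interpreted as the cumulative fraction of total variance, the eigenvalues are hypothetical numbers rather than the data of this study, and the stopping threshold `min_gain` is an arbitrary choice, since the text gives no numerical criterion.

```python
def cumulative_information(eigenvalues):
    """Cumulative fraction of the total variance kept by the first k factors."""
    total = sum(eigenvalues)
    fractions = []
    running = 0.0
    for value in eigenvalues:
        running += value
        fractions.append(running / total)
    return fractions


def choose_factor_count(eigenvalues, min_gain=0.01):
    """Smallest k for which adding factor k+1 gains less than min_gain.

    min_gain is an assumed threshold, not a value taken from the paper.
    """
    fractions = cumulative_information(eigenvalues)
    for k in range(1, len(fractions)):
        if fractions[k] - fractions[k - 1] < min_gain:
            return k
    return len(fractions)


# Hypothetical eigenvalues in descending order (illustrative only).
example_eigenvalues = [60.0, 25.0, 10.0, 4.6, 0.2, 0.1, 0.1]
n_factors = choose_factor_count(example_eigenvalues)
retained = cumulative_information(example_eigenvalues)[n_factors - 1]
print(n_factors, round(retained, 3))
```

With these illustrative numbers the fifth factor adds only about 0.2% of the total, so the search stops at four factors.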
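Factor analysis of the raw imaging mass spectrometry data yields 4 factors. To explore what each factor represents, and thus to explain the basic structure of the raw data with these 4 factors, the score of each factor was calculated at every sampling point of the sample. By the mathematical properties of factor analysis, the larger the score, the greater the influence of that factor on that sampling point.
In an ordinary mass spectrometry image, the ion signal intensity of a given m/z value at each sampling point is used as the color value. In the same way, the factor score at each sampling point was used here as the color value to draw a score map for each factor over the sampling points, as shown in Fig. 1(E~H).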
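The construction of a factor score map can be sketched as follows. The sketch assumes that the scores of one factor are stored in row-major scan order (an assumption; the actual scan order depends on the acquisition), and the linear rescaling to 0..255 is one possible color mapping, not necessarily the one used for Fig. 1. All function and variable names are hypothetical.

```python
def scores_to_image(scores, n_rows, n_cols):
    """Arrange per-point factor scores (assumed row-major) into a 2D grid."""
    if len(scores) != n_rows * n_cols:
        raise ValueError("number of scores must equal n_rows * n_cols")
    return [scores[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


def to_colour_values(image, levels=255):
    """Linearly rescale scores to integer colour values in 0..levels."""
    flat = [value for row in image for value in row]
    lowest, highest = min(flat), max(flat)
    span = highest - lowest
    if span == 0:
        # A constant map carries no contrast; render it as all zeros.
        return [[0 for _ in row] for row in image]
    return [[round((value - lowest) / span * levels) for value in row]
            for row in image]


# Hypothetical scores of one factor on a 2 x 3 grid of sampling points.
factor_scores = [0.0, 1.0, 5.0, 5.0, 1.0, 0.0]
score_image = scores_to_image(factor_scores, n_rows=2, n_cols=3)
colour_image = to_colour_values(score_image)
```

Sampling points with the highest score receive the brightest color value, so regions dominated by a factor stand out in its map.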
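Comparing Fig. 1A with Fig. 1E shows that the sampling points with high scores on factor 1 are distributed in exactly the opposite way to the points bearing handwriting, that is, they follow the distribution of the background. In terms of the mathematical meaning of factor scores, factor 1 has a large influence on the background points and a small influence on the handwriting points, which indicates that factor 1 mainly governs the background component. Factor 1 can therefore be named the "background factor".
Mathematically, each factor obtained by factor analysis is a 1×n matrix, where n equals the number of m/z values within the mass scan range. Each value in this factor matrix corresponds one-to-one to an m/z value and represents the magnitude of the influence of that m/z value within the factor.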
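Reading characteristic m/z values off such a 1×n factor can be sketched as follows. The selection rule used here (keep every m/z whose absolute contribution is at least a fixed fraction of the largest one) is an assumption made for illustration, since the text only states that the characteristic values contribute far more than the others. The m/z axis and the factor values are hypothetical, not the data of this study.

```python
def characteristic_mz(mz_values, loadings, relative_threshold=0.5):
    """Return (m/z, loading) pairs whose |loading| is at least
    relative_threshold times the largest |loading| in the factor,
    sorted from the largest contribution to the smallest."""
    if len(mz_values) != len(loadings):
        raise ValueError("mz_values and loadings must have the same length")
    largest = max(abs(value) for value in loadings)
    if largest == 0:
        return []
    selected = [
        (mz, value)
        for mz, value in zip(mz_values, loadings)
        if abs(value) >= relative_threshold * largest
    ]
    return sorted(selected, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical 1 x n factor: one value per m/z in the scan range.
mz_axis = [100.1, 200.2, 300.3, 400.4, 500.5]
factor = [0.02, 0.90, 0.05, 0.60, 0.01]
markers = characteristic_mz(mz_axis, factor)
```

Lowering `relative_threshold` keeps m/z values with smaller but non-negligible contributions, which is the kind of quantitative trade-off a contribution-based selection makes possible.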
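3.2 Comparison of factor analysis and principal component analysis
Principal component analysis is currently the most widely used multivariate statistical method for imaging mass spectrometry data. In this study, principal component analysis was applied to the raw imaging data of the sample and the results were compared with those of factor analysis, as shown in Fig. 2. In principal component analysis, points with large scores on a principal component are selected as characteristic points, and the m/z values corresponding to those points are taken as characteristic m/z values.
4 Conclusions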
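The application of factor analysis to imaging mass spectrometry data was studied, and it was shown that factor analysis can reduce the dimensionality of such data and extract features from them. The raw imaging data were acquired by AFAI-IMS, and after factor analysis they could be classified with 4 factors. Each sample component, that is, each ink, is governed by one factor, so the effect of each factor over the whole sample can be observed clearly. Once the meaning of each factor had been established, the characteristic m/z values of the sample components were successfully extracted by examining the contribution of each m/z value within the factor.
Compared with commonly used multivariate statistical methods such as principal component analysis, factor analysis gives results that are consistent with the actual background and meaning of the sample. Factor analysis allows the weight of each m/z value in the factor array to be analyzed quantitatively, and characteristic m/z values can be selected correctly and comprehensively on that basis, which helps to extract characteristic m/z values whose influence is low but not negligible. With factor analysis, several m/z values can be extracted together as a combined marker of a sample component, which gives the method considerable potential in fields with complex sample composition, such as the extraction of cancer biomarkers.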
References
2 Pevsner P H, Melamed J, Remsen T, Kogos A, Francois F, Kessler P, Stern A, Anand S. Biomarkers Med, 2009, 3(1): 55-69
3 Seeley E H, Caprioli R M. Trends Biotechnol, 2011, 29(3): 136-143
4 YANG Shui-Ping, CHEN Huan-Wen, YANG Yu-Ling, HU Bin, ZHANG Xie, ZHOU Yu-Fang, ZHANG Li-Li, GU Hai-Wei. Chinese J Anal Chem, 2009, 37(3): 315-318
楊水平, 陳煥文, 楊宇玲, 胡 斌, 張 燮, 周瑜芬, 張麗麗, 顧海威. 分析化學, 2009, 37(3): 315-318
5 WEI Kai-Hua, ZHANG Xue-Min, YANG Song-Cheng. Journal of Instrumental Analysis, 2007, 26(S1): 12-14
魏開華, 張學敏, 楊松成. 分析測試學報, 2007, 26(S1): 12-14
6 Ifa D R, Wiseman J M, Song Q, Cooks R G. Int J Mass Spectrom, 2007, 259(1): 8-15
7 Harris G A, Nyadong L, Fernandez F M. Analyst, 2008, 133(10): 1297-1301
8 YANG Shui-Ping, HU Bin, LI Jian-Qiang, HAN Jing, ZHANG Xie, CHEN Huan-Wen. Chinese J Anal Chem, 2009, 37(5): 691-694
楊水平, 胡 斌, 李建強, 韓 京, 張 燮, 陳煥文. 分析化學, 2009, 37(5): 691-694
9 Jones E A, Remoortere A, Zeijl R J M, Hogendoorn P C W, Bovée J V M G, Deelder A M, McDonnell L A. PLoS One, 2011, 6(1): 1-14
10 Bonnel D, Longuespee R, Franck J, Roudbaraki M, Gosset P, Day R, Salzet M, Fournier I. Anal Bioanal Chem, 2011, 401(1): 149-165
11 Reindl W, Bowen B P, Balamotis M A, Green J E, Northen T R. Integr Biol, 2011, 3(4): 460-467
12 Dill A L, Eberlin L S, Zheng C, Costa A B, Ifa D R, Cheng L, Masterson T A, Koch M O, Vitek O, Cooks R G. Anal Bioanal Chem, 2010, 398(7): 2969-2978
13 Fonville J M, Carter C, Cloarec O, Nicholson J K, Lindon J C, Bunch J, Holmes E. Anal Chem, 2012, 84(3): 1310-1319
14 Trede D, Kobarg J H, Oetjen J, Thiele H, Maass P, Alexandrov T. J Integrative Bioinformatics, 2012, 9(1): 189
15 Pan Z Z, Gu H W, Talaty N, Chen H W, Shanaiah N, Hainline B E, Cooks R G, Raftery D. Anal Bioanal Chem, 2007, 387(2): 539-549
16 Gu H W, Pan Z Z, Xi B W, Asiago V, Musselman B, Raftery D. Anal Chim Acta, 2011, 686(1): 57-63
17 Bonnel D, Longuespee R, Franck J, Roudbaraki M, Gosset P, Day R, Salzet M, Fournier I. Anal Bioanal Chem, 2011, 401(1): 149-165
18 Pirro V, Eberlin L S, Oliveri P, Cooks R G. Analyst, 2012, 137(10): 2374-2380
19 Luo Z, He J, Chen Y, He J, Gong T, Tang F, Wang X, Zhang R, Huang L, Zhang L, Lv H, Ma S, Fu Z, Chen X, Yu S, Abliz Z. Anal Chem, 2013, 85(5): 2977-2982
20 He J, Tang F, Luo Z, Chen Y, Xu J, Zhang R, Wang X, Abliz Z. Rapid Commun Mass Spectrom, 2011, 25(7): 843-850
Abstract  The application of factor analysis to imaging mass spectrometry data analysis was studied. The imaging mass spectrometric data were obtained by air flow-assisted ionization imaging mass spectrometry. The sample contained symbols drawn on slides with three different inks (red, blue and black). The imaging data analyzed by factor analysis were divided into background, black, blue and red factors. The results showed that the scores of m/z 443.2, 478.4 and 322.2 (344.2) in the red, blue and black factors, respectively, were much larger than those of the others. Therefore, they were markers of the three inks. The results accorded well with the actual condition and proved that the application of factor analysis to imaging mass spectrometric data analysis was feasible. The data analysis results of factor analysis and principal component analysis were compared. The results showed that the markers of the target sample could be extracted by factor analysis simply and quantitatively. The method has great potential in biomarker extraction, disease diagnosis and pharmacological analysis.
Keywords  Factor analysis; Imaging mass spectrometry; Air flow-assisted ionization; Multivariate statistical analysis