李娜 孫樂 胡一楠 李笑 王亞南


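Abstract: The support vector machine (SVM) is a machine learning method based on structural risk minimization, and it overcomes the weakness of traditional learning methods that rely only on the empirical risk minimization principle. The standard SVM still has shortcomings, however, and researchers have therefore introduced sample membership degrees into the SVM to address them. On this basis, this paper analyzes the fuzzy support vector machine (FSVM), the v-type fuzzy support vector machine (v-FSVM), and the fuzzy twin support vector machine (FTSVM), and proposes the v-type fuzzy twin support vector machine (v-FTSVM). UCI data sets are used in the experiments to verify the performance of the fuzzy twin support vector machines. Finally, the different support vector machines are applied to an intrusion detection data set to further test the effectiveness of the fuzzy support vector machines.
Keywords: support vector machine; fuzzy twin support vector machine; experimental data analysis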
CLC number: TP393 Document code: A Article ID: 2095-2945(2018)11-0154-04
Abstract: The support vector machine (SVM) is a machine learning method based on structural risk minimization, which overcomes the weakness of traditional learning methods that rely only on empirical risk minimization. To address the problems that the SVM still has, researchers have introduced sample membership degrees into it. On this basis, this paper analyzes FSVM, v-FSVM and FTSVM, and proposes v-FTSVM. The performance of the fuzzy twin support vector machines is verified on UCI data sets. Finally, the different support vector machines are applied to an intrusion detection data set to further test the effectiveness of the fuzzy support vector machines.
Keywords: support vector machine (SVM); fuzzy twin support vector machine; experimental data analysis
4 Experimental Analysis
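To evaluate the performance of the different fuzzy support vector machines, experiments were carried out on seven data sets from the UCI repository and on the KDD CUP 1999 data, and the results were compared with the traditional support vector machine (SVM). The experiments used five-fold cross-validation, sample memberships were computed with the class-center method, and the Gaussian kernel was used as the kernel function. The results obtained with the linear and Gaussian kernels show that v-FTSVM outperforms the other support vector machines. In addition, v-FTSVM requires less testing time than v-FSVM and FSVM.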
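The class-center membership and the five-fold split mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the common form s_i = 1 - d_i / (r + delta), where d_i is the distance from sample i to the mean of its class, r is the largest such distance in the class, and delta is a small positive constant. The function names, the toy points, and the value of delta are hypothetical.

```python
import math
import random

def class_center_membership(points, delta=1e-3):
    """Membership of each sample in one class (class-center method).

    Assumed form: s_i = 1 - d_i / (r + delta), with d_i the distance to
    the class mean and r the largest such distance in the class.
    """
    dim = len(points[0])
    center = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    dists = [math.dist(p, center) for p in points]
    radius = max(dists)
    return [1.0 - d / (radius + delta) for d in dists]

def five_fold_indices(n_samples, seed=0):
    """Yield (train_idx, test_idx) pairs for five-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::5] for k in range(5)]  # five disjoint folds
    for k in range(5):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test

# Toy positive class: four clustered points and one far-away (noisy) point.
positive = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0), (5.0, 5.0)]
memberships = class_center_membership(positive)
splits = list(five_fold_indices(10))
```

In this toy class the far-away point receives a membership close to zero, while the clustered points keep noticeably larger values. Memberships would normally be computed per class on the training folds only.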
Figures 1 to 3 show that the fuzzy membership values of the samples play a role in classification and can affect how noisy data are handled.
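One way to see why memberships matter for noisy samples is to look at the error term of a fuzzy SVM. The sketch below assumes the usual FSVM-style penalty, in which each slack xi_i is multiplied by the membership s_i and a constant C; the exact objective of v-FTSVM is not reproduced here, and the hyperplane, samples, and function names are hypothetical.

```python
def hinge_slack(w, b, x, y):
    """Slack xi = max(0, 1 - y * (w . x + b)) for a single sample."""
    margin = y * (sum(wk * xk for wk, xk in zip(w, x)) + b)
    return max(0.0, 1.0 - margin)

def weighted_penalty(w, b, samples, labels, memberships, C=1.0):
    """Sum of C * s_i * xi_i: the membership-weighted error term."""
    return sum(
        C * s * hinge_slack(w, b, x, y)
        for x, y, s in zip(samples, labels, memberships)
    )

# Fixed toy hyperplane x1 = 0; the third sample is a mislabeled (noisy) point.
w, b = (1.0, 0.0), 0.0
samples = [(2.0, 0.0), (-2.0, 0.0), (-3.0, 0.0)]
labels = [1, -1, 1]

# Equal memberships: the noisy point contributes its full slack.
penalty_plain = weighted_penalty(w, b, samples, labels, [1.0, 1.0, 1.0])
# Low membership for the noisy point: its contribution shrinks accordingly.
penalty_fuzzy = weighted_penalty(w, b, samples, labels, [1.0, 1.0, 0.1])
```

With equal memberships the mislabeled point dominates the penalty; giving it a small membership reduces its pull on the separating hyperplane, which is the effect the fuzzy variants rely on.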
5 Conclusion
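Building on the idea of the twin support vector machine (TWSVM) and combining the v-support vector machine (v-SVM) with sample membership degrees, this paper studies and proposes the v-type fuzzy twin support vector machine (v-FTSVM). Experiments on standard UCI data sets and on intrusion detection data indicate that v-FTSVM is almost unaffected by noise.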