袁泉 常偉鵬
CLC number: TN911.1-34    Document code: A    Article ID: 1004-373X(2019)01-0180-03
Abstract: An optimized Apriori algorithm based on the Hadoop platform is proposed to improve the accuracy of book recommendation services. Building on the distributed Hadoop framework, a directed acyclic graph (DAG) is used to analyze the implementation steps of parallel MapReduce jobs on the Hadoop platform. The traditional Apriori association-rule algorithm is then optimized with MapReduce to reduce, as far as possible, the number of database connections and the generation of useless candidate itemsets, thereby shortening task processing time. Experimental results show that, compared with the traditional LDA recommendation algorithm, the proposed algorithm achieves higher accuracy and recommends more suitable books to borrowers.
Keywords: Hadoop; cloud computing; book recommendation; DAG; Apriori algorithm; recommendation algorithm
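With the continuous advance of technology, the traditional library model can no longer meet the public's varied demands for book services. Libraries therefore need to be digitized and informatized, and appropriate personalized recommendation techniques are needed to provide users with interesting and meaningful information, for example personalized book recommendation in library management [1-2]. A user who wants to find a particular book among a massive collection [3-4] must spend considerable time and effort on querying and retrieval; a library management information system with book recommendation can address this need.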
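The Hadoop cloud platform has shown excellent performance on big data mining problems of this kind. However, as data grow more complex and databases grow larger, centralized processing easily causes network congestion [5], so traditional cloud computing systems can no longer handle big data processing tasks effectively. Parallel MapReduce job-flow processing on the distributed Hadoop platform has therefore become the mainstream of current research [5]. To implement book recommendation effectively on a distributed Hadoop platform and further improve recommendation accuracy, this paper proposes an optimized Apriori algorithm based on the Hadoop platform. Experimental results show that, compared with traditional algorithms, the proposed algorithm achieves higher accuracy and can effectively carry out book data mining.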
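A library management information system with book recommendation can automatically recommend books that match a borrower's interests [5]. With book recommendation, the system can suggest potentially interesting books to borrowers in a reasonable and timely manner. The Hadoop cloud platform has shown excellent performance on big data mining problems of this kind. As one of the three major distributed computing systems, Hadoop can readily aggregate data of different structural types and provides a distributed storage and computing environment across computer clusters [5]. Hadoop has distinct advantages in data analysis. Information resources in a big data environment are open in nature; in addition, because big data is uploaded and downloaded frequently, it is particularly well suited to management on the Hadoop platform. Given the throughput that big data demands, smooth resource interaction is especially important when mining user behavior data.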
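The MapReduce-parallel Apriori idea named in the abstract can be illustrated with a small single-process Python sketch. It is not the authors' optimized algorithm and does not use the Hadoop API: the map and reduce steps are simulated with plain functions, borrowing records are assumed to be lists of book identifiers, and all names (`map_count`, `reduce_count`, `next_candidates`, `apriori_mapreduce`) are hypothetical. It shows only the general pattern: each partition counts candidate itemsets in a map step, a reduce step sums the counts, and candidates with an infrequent subset are pruned before the next pass so that fewer useless candidates are generated.

```python
from collections import defaultdict
from itertools import combinations


def map_count(partition, candidates):
    """Map step (simulated): emit (candidate, 1) for every candidate
    itemset contained in a borrowing record of this partition."""
    for transaction in partition:
        items = set(transaction)
        for cand in candidates:
            if cand <= items:
                yield cand, 1


def reduce_count(pairs, min_support):
    """Reduce step (simulated): sum the counts per candidate and keep
    only candidates whose total reaches min_support."""
    totals = defaultdict(int)
    for cand, n in pairs:
        totals[cand] += n
    return {cand: n for cand, n in totals.items() if n >= min_support}


def next_candidates(frequent, k):
    """Build k-itemsets from the items of the frequent (k-1)-itemsets and
    prune any candidate with an infrequent (k-1)-subset (Apriori property)."""
    items = sorted({i for itemset in frequent for i in itemset})
    result = set()
    for combo in combinations(items, k):
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1)):
            result.add(frozenset(combo))
    return result


def apriori_mapreduce(partitions, min_support):
    """Run level-wise Apriori, one simulated map/reduce pass per level."""
    # Pass 1: every single item is a candidate.
    candidates = {frozenset([i]) for p in partitions for t in p for i in t}
    frequent_all = {}
    k = 1
    while candidates:
        pairs = (pair for p in partitions for pair in map_count(p, candidates))
        frequent = reduce_count(pairs, min_support)
        frequent_all.update(frequent)
        k += 1
        candidates = next_candidates(set(frequent), k)
    return frequent_all


if __name__ == "__main__":
    # Toy borrowing records split over two partitions (made-up data).
    partitions = [
        [["A", "B", "C"], ["A", "B"]],
        [["A", "C"], ["B", "C"], ["A", "B", "C"]],
    ]
    for itemset, count in apriori_mapreduce(partitions, min_support=3).items():
        print(sorted(itemset), count)
```

With the toy data above and a minimum support of 3, the three single titles and the three pairs are frequent, while the triple {A, B, C} appears in only two records and is dropped. On a real cluster each partition would be an input split processed by a separate mapper; that wiring is outside the scope of this sketch.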


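As Figure 3 shows, the accuracy of all three algorithms increases as the total number of books recommended by the library management system grows. The accuracy of the LDA algorithm rises most slowly, that of the traditional association-rule algorithm rises faster, and that of the proposed method rises fastest. This verifies the effectiveness and feasibility of the proposed algorithm: it completes book recommendation for users effectively and, under the same conditions, achieves higher accuracy than the other two algorithms.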
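This paper proposes an optimized Apriori algorithm based on the Hadoop platform that implements book recommendation effectively on a distributed Hadoop platform and further improves recommendation accuracy. Experimental results show that, compared with traditional algorithms, the proposed algorithm accomplishes the book data mining task effectively and meets the requirements of book recommendation; compared with the association-rule algorithm and the LDA algorithm, it achieves higher book recommendation accuracy.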
References
[1] CHEN C M. An intelligent mobile location-aware book recommendation system that enhances problem-based learning in libraries [J]. Interactive learning environments, 2013, 21(5): 469-495.
[2] LI K C, LIANG Z Y. Personalized book recommendation algorithm based on multi-feature [J]. Computer engineering, 2012, 38(11): 34-37.
[3] YANG S T, HUNG M C. A model for book inquiry history analysis and book-acquisition recommendation of libraries [J]. Library collections acquisitions & technical services, 2012, 36(3/4): 127-142.
[4] SOHAIL S S, SIDDIQUI J, ALI R. A novel approach for book recommendation using fuzzy based aggregation [J]. Indian journal of science & technology, 2017, 10(19): 1-30.
[5] YANG S T. An active recommendation approach to improve book-acquisition process [J]. International journal of electronic business management, 2012, 10(2): 108-115.
[6] XU Fei. Real-time processing of big data streams [D]. Wuxi: Jiangnan University, 2015.
[7] KHAN M, JIN Y, LI M, et al. Hadoop performance modeling for job estimation and resource provisioning [J]. IEEE transactions on parallel & distributed systems, 2016, 27(2): 441-454.
[8] LIU Lijuan. Research and application of improved Apriori algorithm [J]. Computer engineering and design, 2017(12): 3324-3328.
[9] RAJAGOPAL S, KWAN A. Book recommendation system using data mining for the University of Hong Kong Libraries [J]. ITEC journal, 2012, 58(4): 393-401.