An Exploratory Study on Constructing a Chinese Business Knowledge Graph Based on Deep Learning and Graph Databases

2016-03-29 07:22:46 王仁武 袁毅 袁旭萍

圖書與情報 (Library and Information), 2016, No. 1


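Abstract: Applying knowledge graphs to the business domain is an urgent need for enterprises in the big data era. This paper introduces the deep belief network, a deep learning algorithm, to automatically extract the knowledge units contained in domain information and the relationships among those units, thereby addressing the difficult problem of knowledge unit extraction. The Neo4j graph database is used to store the knowledge units and relationships that make up the knowledge graph, and the database's Cypher query language can be used whenever those knowledge units need to be queried. The approach may serve as a reference for rapidly constructing knowledge graphs in the business domain.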

Keywords: knowledge graph; deep learning; graph database; deep belief network

CLC number: G203  Document code: A  DOI: 10.11968/tsyqb.1003-6938.2016017

Study on the Construction of Chinese Knowledge Graph Based on Deep Learning and Graph Database

Abstract: The application of knowledge graphs to business areas is an urgent need of enterprises in the big data era. To address the difficulty of knowledge element extraction, the authors introduce the deep belief network learning algorithm to automatically extract the knowledge units contained in a given field and the relationships among them. At the same time, the knowledge units and their relationships in the knowledge graph are stored in the Neo4j graph database. When the knowledge units in the knowledge graph need to be queried, the Cypher query language of the graph database can be used. The research method of this paper can provide a reference for the rapid construction of knowledge graphs in the commercial field.

Key words: knowledge graph; deep learning; graph database; deep belief networks

1 Introduction

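In recent years, with the arrival of the big data era, the scientific knowledge graph traditionally used for disciplinary research [1] has begun to find applications in other fields. Google released its knowledge graph product, Google Knowledge Graph [2], as early as 2012. In February 2013, Baidu also launched its own knowledge graph. "Open mobile Baidu, and when a user searches for 'Wang Fei' (王菲), they can not only find her songs but also learn that her ex-husband is Li Yapeng (李亞鵬), that Li Yapeng's ex-girlfriend is Zhou Xun (周迅), and that Zhou Xun and Tang Wei (湯唯) happen to come from the same hometown." This is a knowledge graph built on big data technology: Baidu has woven a three-dimensional knowledge network for its users to meet their growing demand for knowledge acquisition. A number of influential knowledge graphs have also emerged in recent years, including YAGO [3], DBpedia [4], NELL [5], and Freebase [6]; these knowledge graphs contain millions of nodes and billions of edges. In addition, in the social networking field, Facebook and Twitter have launched the social graph and the interest graph. The application of knowledge graphs in the business domain has broadened the meaning of the original scientific knowledge graph and extended its application scenarios.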
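The search example quoted above can be read as a small graph of entities joined by typed relationships. The following is a minimal sketch of that idea, not the authors' implementation: it holds the three facts as triples in memory and follows a chain of relationships. The relationship names (`EX_HUSBAND`, `EX_GIRLFRIEND`, `SAME_HOMETOWN`), the `Person` label, and the helper functions are hypothetical and chosen only for illustration. The Cypher string shows how the same traversal might be phrased against a Neo4j store with that assumed schema; it is not executed here.

```python
from collections import defaultdict

# Facts from the quoted search example, as (head, relation, tail) triples.
# The relation names are hypothetical labels, not taken from the paper.
TRIPLES = [
    ("王菲", "EX_HUSBAND", "李亞鵬"),
    ("李亞鵬", "EX_GIRLFRIEND", "周迅"),
    ("周迅", "SAME_HOMETOWN", "湯唯"),
]

# How the same three-hop question might look in Cypher, assuming nodes
# labelled Person with a `name` property (an assumed schema). Not run here.
CYPHER_SKETCH = """
MATCH (a:Person {name: '王菲'})-[:EX_HUSBAND]->()-[:EX_GIRLFRIEND]->()-[:SAME_HOMETOWN]->(d)
RETURN d.name
"""


def build_graph(triples):
    """Index triples as adjacency lists: head -> [(relation, tail), ...]."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def follow(graph, start, relations):
    """Follow a fixed sequence of relation types starting at `start`.

    Returns the set of entities reached after the final hop.
    """
    frontier = {start}
    for relation in relations:
        frontier = {
            tail
            for node in frontier
            for rel, tail in graph.get(node, [])
            if rel == relation
        }
    return frontier


graph = build_graph(TRIPLES)
reached = follow(graph, "王菲", ["EX_HUSBAND", "EX_GIRLFRIEND", "SAME_HOMETOWN"])
print(reached)
```

Storing each fact as a directed, typed edge is what lets a single starting entity lead to several related ones in a few hops, which is the behaviour the quoted search example describes.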

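Information in the business domain differs from information in academic disciplines. Earlier research on knowledge graphs for academic disciplines was mostly based on the literature, where keywords, abstracts, and similar fields can serve as important sources of information for drawing a knowledge graph. ……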
