A Multi-attention-based Feature Selection Method for Multivariate Time Series

2020-01-05 07:00:06 HU Zi-yin, GUI Ning

Software Guide (軟件導刊), 2020, Issue 11

HU Zi-yin, GUI Ning

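Abstract: Feature selection is a data preprocessing technique for avoiding the curse of dimensionality. To simultaneously identify the variables most relevant to the problem and their corresponding time delays in multivariate time series prediction, a supervised feature selection method based on multiple attention is proposed. The method uses a deep learning model composed of an attention module and a learning module. The original two-dimensional time series data is split orthogonally into two groups of one-dimensional data, which are fed into two attention generation modules, one per dimension, to obtain attention weights for the feature dimension and the time dimension. The attention weights of the two dimensions are combined by a dot product to form a global attention score for feature selection. This score is applied to the original data, which is then fed to the learning module, and it is updated continuously as the learning module trains until convergence. Experimental results show that the proposed method matches the performance of training on the full data with fewer than 10 features and achieves the best accuracy among several existing baseline methods.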

Keywords: feature selection; time series; attention mechanism; multidimensional data; deep learning

DOI: 10.11907/rjdk.201206

CLC number: TP301    Document code: A    Article ID: 1672-7800(2020)011-0021-04

A Multi-attention-based Feature Selection Method for Multivariate Time Series

HU Zi-yin¹, GUI Ning²

(1. School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China;

2. School of Computer Science, Central South University, Changsha 410006, China)

Abstract: Feature selection is a data preprocessing technique that reduces model complexity and avoids the curse of dimensionality. To find, simultaneously, the variables most relevant to the problem and their corresponding delays in multivariate time series prediction, this paper proposes a multi-attention-based supervised feature selection method. The method uses a deep learning model with an attention module and a learning module. The original two-dimensional time series data is divided orthogonally into two sets of one-dimensional data, which are fed into two attention modules, one per dimension, to generate the attention weights of the feature dimension and the time dimension. The attention weights of the two dimensions are then combined by a dot product and used as a global attention score for feature selection; the score is applied to the original data and updated continuously during training until the model converges. Experimental results show that the proposed method matches the performance of training on the full data when the number of features is less than 10, and achieves the best accuracy compared with several existing baseline methods.

Key Words: feature selection; time series; attention mechanism; multidimensional data; deep learning

0 Introduction

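With the development of the Internet of Things, a growing number of fields, including industry [1], biology [2], and social media [3], have accumulated large volumes of chronologically ordered high-dimensional data, known as multivariate time series (MTS). Machine learning and deep learning can extract a great deal of valuable information from these time series to support expert decision-making. However, the many irrelevant and redundant features in time series not only seriously hinder the learner but also increase computational cost [4]. By selecting from the dataset the features relevant to the target variable, feature selection effectively mitigates the curse of dimensionality and is regarded as a crucial data preprocessing step in machine learning [5]. To build more accurate and more interpretable time series models, it is important to identify the variables most relevant to the supervised target together with their most appropriate time steps, which also greatly helps in understanding the physical and chemical models of the underlying system.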

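With the development of deep learning, the attention mechanism was proposed and has been widely applied in image processing and natural language processing. Attention models borrow from the way the human brain processes visual signals: by quickly scanning the whole image, they locate the target region that deserves focus, devote more resources to that region to capture finer details of the target, and suppress irrelevant information.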
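The idea of weighting inputs by attention, and of combining feature-dimension and time-dimension weights into one global score as the abstract describes, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: in the paper the weights come from trained attention modules, whereas here the raw scores are fixed, hypothetical inputs. Reading the "dot product" of the two weight vectors as an outer product, normalizing with softmax, and the function names `global_attention` and `select_top_k` are all assumptions made for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def global_attention(feature_scores, time_scores):
    """Combine per-feature and per-time-step scores into a (T, F) score map.

    Assumption: the two 1-D attention weight vectors are combined by an
    outer product, so entry (t, f) is time_weight[t] * feature_weight[f].
    """
    a_f = softmax(np.asarray(feature_scores, dtype=float))  # shape (F,)
    a_t = softmax(np.asarray(time_scores, dtype=float))     # shape (T,)
    return np.outer(a_t, a_f)                               # shape (T, F)


def select_top_k(score, k):
    """Return the k (time_step, feature) index pairs with the highest score."""
    flat = np.argsort(score, axis=None)[::-1][:k]
    return [tuple(int(i) for i in np.unravel_index(idx, score.shape))
            for idx in flat]


# Hypothetical raw scores for 3 features and 2 time steps (delays).
feature_scores = [0.1, 2.0, 0.3]
time_scores = [0.0, 1.5]

score = global_attention(feature_scores, time_scores)

# Apply the score to a toy (T, F) input window by element-wise weighting.
X = np.ones((2, 3))
weighted = X * score

top = select_top_k(score, 2)
```

Because each weight vector sums to 1, the combined score map also sums to 1, so its entries can be compared directly when ranking (time step, feature) pairs.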