999精品在线视频,手机成人午夜在线视频,久久不卡国产精品无码,中日无码在线观看,成人av手机在线观看,日韩精品亚洲一区中文字幕,亚洲av无码人妻,四虎国产在线观看 ?

Fault monitoring based on mutual information feature engineering modeling in chemical process☆

2019-02-09 06:41:02
Chinese Journal of Chemical Engineering 2019年10期

College of Chemical Engineering,Qingdao University of Science&Technology,Qingdao 266042,China

Keywords:Big data Fault diagnosis Mutual information TE process Process modeling

ABSTRACT A large amount of information is frequently encountered when characterizing the sample model in chemical process.A fault diagnosis method based on dynamic modeling of feature engineering is proposed to effectively remove the nonlinear correlation redundancy of chemical process in this paper.From the whole process point of view,the method makes use of the characteristic of mutual information to select the optimal variable subset.It extracts the correlation among variables in the whitening process without limiting to only linear correlations.Further,PCA(Principal Component Analysis)dimension reduction is used to extract feature subset before fault diagnosis.The application results of the TE(Tennessee Eastman)simulation process show that the dynamic modeling process of MIFE(Mutual Information Feature Engineering)can accurately extract the nonlinear correlation relationship among process variables and can effectively reduce the dimension of feature detection in process monitoring.

1.Introduction

In recent years,the industrial system is developing in the direction of large-scale installations and complex process.How to use process monitoring and fault diagnosis technology to guarantee its safety,long cycle,and high quality operation is of great importance.Therefore,the concept of feature engineering in machine learning field has gradually developed into a hotspot of process monitoring technology[1,2].For feature engineering,its difficulty lies in how to find new variables that can characterize the main feature of the data because it is essentially a complex combinatorial optimization problem[3].

In the feature modeling process,each feature variable has two possible states:reserved or excluded.The element number in the state set can be obtained using probability analysis algorithms such as Bayesian network and neural network[4,5].Generally,if there are N feature variables,the number of elements in the state set is 2N.Therefore,from the algorithm point of view,the time complexity of solving this problem is exponential in an exhaustive way.When N is large enough,feature selection will cost a lot of time and computing resources[6].In the field of information theory,mutual information is a very useful measure to explain the correlation information for a random variable in a vector.The aim to obtain the correlation of mutual information is to measure the shared information between two variables,not limited to a simple linear relationship only.This is a nonparametric and nonlinear measure index to analyze the main factors in multiple variables.At present,it has attracted more and more attention in the field of data analysis and numerical modeling[7-10].There also are some applications in the field of multivariate statistical process monitoring based on mutual information analysis and process data[11].

In recent years,the research of multi-variable feature extraction method based on PCA has received extensive attentions from industrial and academic fields[12-15].Its basic idea is to dig out potential information that can reflect the running status of process based on the data collected in industrial processes.This method usually conducts a projection of original data to the low dimension subspace after analyzing the characteristics of the high dimensional process data,thus providing the information to the operator in the form of statistics.Therefore,it is suitable for monitoring modern large-scale complex industrial processes.Generally,the classic feature extraction method like PCA determines the direction of the reduced dimension projection by calculating the covariance matrix or the correlation matrix of the data set.The latent variables extracted(i.e.,principal components)can comprise the correlation characteristics of the original data space.Although the PCA method is concise and easy to analyze process data,the covariance or correlation coefficient can only reflect the linear correlation among variables,not the nonlinear relationship[16].Therefore,there are great limitations for the traditional PCA method to extract data features.The nonlinear process monitoring method based on the modified PCA methods like kernel PCA(KPCA)can effectively deal with the nonlinear process data.However,for the transformed correlation among the high dimensional spatial data,the KPCA method still calculates the correlation coefficient without involving the nonlinear term[17-20].

The normal operation of chemical process must follow the first law,i.e.,many variables often affect each other and change dynamically with time.In such a case,there are often complex correlation characteristics among the process variables.Fault diagnosis(including fault monitoring)is a concept emerging in the field of computer,which is based on the dynamic process modeling of characteristic engineering.Compared with the traditional dimensionality reduction methods like PCA,ICA(Independent Component Correlation Algorithm),etc.[21-25],the relevant literature on fault diagnosis in the chemical industry domain is not much.Feature engineering finds the best feature subset by reducing the number of samples,thereby simplifying the complex association features between variables.Therefore,this paper proposes a mutual information feature based fault monitoring method for chemical processes.The concept of mutual information features is obtained by analyzing the nonlinear correlation between variables in chemical process to optimize the characteristics of variable set.Mutual information is used to find the correlation of process variables,construct and select the optimal feature subset.The selected features are called mutual information features.In this method,feature selection and feature extraction are performed by mutual information and dimension reduction,which makes up for the deficiency of traditional feature selection and extraction algorithms.

This paper is organized as follows.Firstly,the proposal is described in detail.It preprocesses data based on the characteristics of variable correlation,and selects the best variable subset using mutual information with the nonlinear and linear correlation information of the process.Afterwards,the process features are extracted by the PCA dimensionality reduction method and the corresponding PCA detection model is established hereafter.Secondly,through the application to the Tennessee-Eastman process,the validity and feasibility of the fault monitoring method based on the mutual information feature engineering are verified.

2.Fault Monitoring Based on Feature Engineering in Chemical Process

2.1.Description of fault monitoring based on feature engineering

Feature engineering is an essential activity to extract features from the original data to a maximum extent by algorithm and model.The method of fault monitoring for chemical process based on feature engineering is illustrated in Fig.1.

In feature engineering,the data is preprocessed first to select the optimal subset for different dimensionless features with missing values of data processed[26].

After data preprocessing,feature selection(also-called FSS,feature subset selection),is conducted from two aspects:the feature is divergent or not.In the practical application of feature engineering,interdependence exists between features with the following consequences:the more the number of features is,the longer it takes to analyze features and train models,and the easier it causes dimensional disaster.Additionally,the more complex the model is with increasing features,the lower its generalization ability becomes.Feature selection can eliminate irrelevant or redundant features to improve the accuracy of the model and reduce the running time.Consequently,it can select the real related features to simplify the model,benefiting a fundamental understanding of the process of data generation.

For feature extraction,the common methods used in engineering are linear methods like variance,correlation coefficient,and nonlinear methods like mutual information.The variance and correlation coefficient can only measure the linear correlation[27].Comparatively,the mutual information coefficient can measure all kinds of correlation well,but its computation is relatively complex.Because the correlation of each feature is extracted by mutual information,the feature extraction involves many relevant features.The correlation between each feature and response variables is calculated with a subset of features to facilitate modeling.

After feature extraction,the model can then be trained directly.It is necessary to reduce the dimension of feature matrix because of the high computation cost and long training time caused by the oversize of the feature matrix.The commonly used methods for reducing dimension in chemical process are PCA,LDA(Linear Discriminant Analysis),and the related improved algorithms[28-30].There are many similarities between PCA and LDA to map the original sample to the lower dimension of the sample space,but their mapping target is different:PCA treats the sample after mapping with maximum divergence while LDA does after mapping with best classification performance.Therefore,PCA is used to reduce dimension for verifying the effectiveness of the method in the TE process in this work.

2.2.Feature subset selection based on mutual information

2.2.1.Mutual information

Mutual information is a nonparametric and nonlinear measure index in information theory.It is also a typical method to analyze the model of computational linguistics,with the aim of evaluating the correlation between two measured variables.If the joint probability distribution function of two random variables x and y is p(x,y),the marginal probability distribution function is p(x)and p(y)respectively,and mutual information I(x,y)is defined as[31-33]:

If there is no overlapping information between x and y,namely mutual independence,the mutual information value is equal to 0.Conversely,the mutual information value will be larger than zero.It is worth noting that I(x,y)requires the probability of distribution density of the known variables.In this paper,the kernel density estimation method is used to determine the corresponding probability values,as shown in Eq.(2)[34].

Fig.1.Fault monitoring based on feature engineering in chemical process.

Fig.2.Optimal subset selection based on mutual information.

2.2.2.Modeling methods

Considering the interdependence between variables in complex dynamic processes,different measurement variables will have differentsequence correlations,and the interaction between variables will be reflected at different sampling times.The diagram of modeling methods is shown in Fig.2.In order to better describe the dynamics of each measurement variable,the specific implementation steps are given as follows:

(1)Introduce continuous time measurement values for each variable to form an augmented data matrix Xa;

(2)Calculate a measurement variable Xiwith all variables xj(j=1,2,…ml)in Xa,that is Mi,j=I(xi,xj)

Fig.3.TE process.

Table 2 TE process faults

Fig.4.The number of principal component selection based on PCA.

(3)With the descending order of Mi,j∈R1×ml,select highly relevant variables to form the variable sub-block Xicorresponding to the top k maximum values.

(4)Select the feature variables from the subset of m variables and construct the best feature subset.

The variables obtained in each feature subset at different times describe the dynamics of the corresponding measurement variables.Compared with the analysis of PCA that directly uses all variables,this variable selection method can eliminate the“interference”effect of some unrelated variables,thus make the sub-block PCA model reflecting the dynamic characteristics of each measured variable better.Therefore,when monitoring the operation state of the dynamic process,the feature engineering method based on mutual information should be expected to achieve a better fault detection effect.

2.3.Feature extraction

The PCA method is concise and easy to implement for analyzing process data.It is assumed that there are m sensor acquisition variables under normal working conditions,and each sensor is sampled n times.Then,the data matrix X=[X1,X2,……Xi,Xm]∈Rn×m,where Xiis n×1 data vector,each data vector represents a variable of the process,m is the number of sample variables,n is the number of data.Among them,the covariance of X is[35]:

First,the covariance matrix C of X is obtained:

where,Ci,j=COV(Xi,Xj).

Then the eigenvalues and eigenvectors corresponding to the covariance matrix C are calculated,and C is decomposed by feature decomposition:

The corresponding eigenvalues are{λ1,λ2,λ3…,λm},and the eigenvectors are{P1,P2,P3,…,Pm}.A set of orthonormal is found to guarantee the maximum variance after transformation.The PCA algorithm reduces the dimension by mapping the high-dimensional data to the low-dimensional space,and the mapping direction of the data is the direction in which the data variance changes greatly.Theprojection in the direction of the larger variance is caused by the fluctuation of the data itself,and in the direction of the smaller variance is caused by the noise.When the difference in the projection direction is large,the data is scattered,and also contains most of the information of the original data,which is called the maximum variance theory.So the largest variance dimension is selected to guarantee that the most information is stored.If the variables between the two dimensions are strongly correlated,the data with respect to the two dimensions will reach a straight line,meaning that partial data of the dimensions is redundant.In this situation,the transformed matrix is Ti=Xpi,where the rows form the principal components.The number of principal components is determined according to the principal component cumulative contribution rate(CPV):

Table 3 Correlation set of continuous measurement variable

Fig.5.The number of principal component selection based on MIFE.

The T2=xTPΛ-1PTx statistic of principal component space T2can then be constructed,where Λ=diag{λ1,λ2,…λk}.The corresponding control limit with a confidence interval of 99%is obtained by[34]:

Fig.6.PCA monitoring results of fault 1.

Fig.7.MIFE monitoring results of fault 1.

Above all,the steps of fault monitoring for chemical process based on mutual information feature engineering are summarized as follows:

(1)Normal data of variables are normalized to form the modeling database.

(2)The probability of distribution density of time series variables is calculated to obtain the mutual information between variables.

(3)According to the mutual information value among variables,the process variables are divided into k related subsets and one non-related subsets to construct the best feature subset.

(4)The original PCA decomposition of the optimal subset data is carried out,and the K principal components are extracted with the CPV criterion.

(5)The best characteristic subset T2statistic is calculated,and its control limit is derived from Eq.(7).

3.Case Studies

In order to verify the feasibility of the fault monitoring method based on mutual information feature engineering modeling,this method was applied to the TE benchmark example.Various factors affecting the performance of this method are discussed in this section.

3.1.Process description

Fig.9.MIFE monitoring results of fault 8.

In this paper,a typical Tennessee Eastman process simulation system is used to verify the effectiveness of the proposed method.The process structure of TE is shown in Fig.3.It is a typical chemical process published by Downs and Vogel in the Computers&Chemical Engineering journal in 1993,mainly including 5 typical units:reactor,condenser,gas liquid separator,desorption tower,and circulating compressor.The process consists of 12 operating variables and 41 measurements variables,20 pre-set failures,and 6 control modes.Each training data consists of 480 sets of sample data,and the test data consist of 960 sets of sample data.

In this case study,22 continuous measured variables can effectively represent the state of the whole process,so they are used as the initial detection variables,as shown in Table 1.The 20 faults are shown in Table 2.

3.2.Feature subset selection

In order to prove the effectiveness of the method,this example selects fault 1,fault 8 and fault 13(three different kinds of fault)to carry out fault monitoring task.First,the mutual information of 22 continuous measured variables in the TE process is acquired in the normal state,and quantified in the abnormal state.The correlation variable subset is shown in Table 3.The feature variables with large weights reflect the nonlinearity and linear correlation between variables.Therefore,after eliminating the correlation redundancy,the related feature variables remain as CMV(2),CMV(7),and CMV(11).By classifying the mutual information variables and carrying out the weight ratio analysis,the optimal subset is further obtained as CMV(1),CMV(2),CMV(3),CMV(4),CMV(5),CMV(6),CMV(7),CMV(8),CMV(11),CMV(12),CMV(14),CMV(15),and CMV(17).

Fig.10.PCA monitoring results of fault 13.

Fig.11.MIFE monitoring results of fault 13.

3.3.Feature extraction based on PCA

The 500 groups of samples collected under normal working conditions are used as training sets of PCA and MIFE(Mutual Information Feature Engineering),and 960 groups of samples under normal conditions are used as validation sets of PCA and MIFE.The selection effect of latent variables based on PCA and MIFE is shown in Figs.4 and 5.First,considering the principal element contribution rate and the maximization of variance,the PCA monitoring model reorders the eigenvalues of the principal component from large to small,and selects first 13 variables according to the criterion of the CPV≥85%,as shown in Fig.4.The MIFE model has selected 11 corresponding variables to reconstruct the principal subspace,as shown in Fig.5.Compared with the dimensionality reduction feature extraction of PCA,the number of principal component based on MIFE is reduced by 10%.The contribution of some principal component contributions to MIFE is relatively low,indicating that the redundant information of nonlinear dependent variables in the process is effectively removed.

In order to demonstrate the superiority of the MIFE model,three typical faults 1,8 and 13 are selected for the key analysis.The monitoring effect of fault 1/8/13 based on PCA and MIFE is shown in Figs.6-11.It can be concluded that the two methods can directly detect the time of failure appearance.Compared with the MIFE model for fault 1,the missing rate of the PCA model gradually increases at the 533-960 sample points,as shown in Figs.6 and 7.Also,the missing detection rate of the PCA model increases gradually at 163-231 sample points.In contrast,the MIFE model can trigger the fault alert continuously from the fault signals.Even though the automatic loops pull off the effect of the process triggering device,the MIFE model still maintains a higher fault detection rate.Additionally,it is obvious that the traditional PCA method has difficulty in finding fault 8.For the MIFE method,only 6.375% of the fault samples are not detected and the failure rate is low,as shown in Figs.8 and 9.For the slow drift fault 13,the PCA model has poor detection ability before the 203 fault sample points,while the MIFE method can detect fault 13 effectively after the 174 sample points.The failure alarm rates of PCA and MIFE are shown in Table 4.From Table 4,it can be seen intuitively that the failure alarm rate of the proposed method MIFE is lower than that of PCA.

After above discussions,it is necessary to verify the failure rate of the MIFE method.Generally,a lower failure rate corresponds to a high falsealarm rate because the normal data sample error is also identified as failure.With 960 data collected under another set of normal working conditions,the false alarm rates of PCA and MIFE statistical indexes are tested.The corresponding results are shown in Table 5.We can see that the average false alarm rate of PCA is slightly lower than that of the MIFE algorithm.In previous discussion we know that the failure alarm rate of MIFE is significantly lower than PCA.For process monitoring,a lower failure alarm rate is better when false alarm rate is nearly the same.So through comparative analysis,the superiority and effectiveness of the process monitoring method based on MIFE are fully verified.

Table 4 Fault failure alarm rates of TE process

Table 5 Fault false alarm rates of TE process

4.Conclusions

Traditional monitoring methods often perform poor when monitoring the simultaneous nonlinear and non-Gauss process data.They present large false alarm rate and failure alarm rate,which seriously affect the decision-making process of chemical production.In this paper,a MIFE monitoring method is proposed to overcome the shortcomings of the traditional algorithm.The method first uses MI to select the feature subset to replace the traditional linear correlation detection method.Its main contribution is that the whitening variable subset can better maintain the original data cluster structure information,reducing the number of variable detection and redundant alarms.For PCA,only the linear relationship between variables can be reflected,and the nonlinear relationship cannot be measured effectively.For MIFE,after data is selected by the characteristics of mutual information,the correlation between the original nonlinear variables is removed and the PCA method is extended to the nonlinear domain,which makes up for the linear limitation of the PCA method when extracting the data features.In addition,MIFE based dynamic modeling method provides a feasible solution for chemical process monitoring with certain practical value.

主站蜘蛛池模板: 欧美国产视频| 97国产在线播放| yjizz视频最新网站在线| 欧美另类图片视频无弹跳第一页| 丁香综合在线| 中文字幕久久亚洲一区| 亚洲成人动漫在线观看| 欧美色伊人| 国产网站免费看| 亚洲欧美自拍视频| 国产福利拍拍拍| 亚洲成人精品| 在线精品亚洲一区二区古装| AV在线天堂进入| 永久免费av网站可以直接看的| 伊人福利视频| 日韩欧美国产成人| 国产微拍一区| 很黄的网站在线观看| 91伊人国产| 狼友av永久网站免费观看| 久久午夜夜伦鲁鲁片无码免费| 久草视频精品| 亚洲国产精品无码久久一线| 国产欧美亚洲精品第3页在线| 91国语视频| 亚洲经典在线中文字幕| 精品無碼一區在線觀看 | 亚洲第一香蕉视频| 亚洲精品手机在线| 国产精品免费电影| 国产精品v欧美| 国产农村妇女精品一二区| 99久久国产综合精品2023| 日韩第九页| 亚洲综合一区国产精品| 久久影院一区二区h| 热思思久久免费视频| 国内精品九九久久久精品| 欧美无专区| 亚洲黄网在线| 亚洲毛片在线看| 人禽伦免费交视频网页播放| 国产日本欧美在线观看| 日本成人不卡视频| 久久精品一品道久久精品| 久久国产热| 日韩黄色在线| 无码精油按摩潮喷在线播放 | 国产乱子伦视频三区| 狠狠操夜夜爽| 国产在线98福利播放视频免费| 国产亚洲欧美在线专区| 亚洲a免费| 三上悠亚精品二区在线观看| 成人欧美日韩| 国产亚洲欧美日韩在线一区二区三区| 国产精品无码一区二区桃花视频| 亚洲第一黄色网址| 制服丝袜在线视频香蕉| 999精品在线视频| 国产区免费| 国产SUV精品一区二区6| 国产成人调教在线视频| 国产精品爽爽va在线无码观看| 欧美一级在线播放| a级毛片一区二区免费视频| 亚洲精品制服丝袜二区| 婷婷色一二三区波多野衣| 国产精品真实对白精彩久久| 91精品国产91久久久久久三级| 国产精品无码制服丝袜| 特级aaaaaaaaa毛片免费视频| 亚洲成网777777国产精品| 日韩在线播放欧美字幕| 中文字幕丝袜一区二区| 国产精品微拍| 99在线免费播放| 亚洲人成网18禁| 久久青草免费91线频观看不卡| 一级一毛片a级毛片| 国产无码精品在线|