Comparison of machine learning methods for ground settlement prediction with different tunneling datasets

2021-12-24 02:53:04 Libin Tang, SeonHong Na

Journal of Rock Mechanics and Geotechnical Engineering, 2021, Issue 6

Libin Tang, SeonHong Na

Department of Civil Engineering, McMaster University, Hamilton, ON, L8S 4L7, Canada

Keywords: Surface settlement; Tunnel construction; Machine learning (ML); Hyperparameter optimization; Cross-validation (CV)

ABSTRACT

1. Introduction

In urban areas, metro tunnels could be a pivotal solution to traffic congestion and enable new transit services. Among many potential challenges, the ground surface settlement induced by tunnels can cause significant damage to existing buildings and infrastructure during and after construction (Adachi et al., 2003; Ocak and Seker, 2013; Wang et al., 2014; Jin et al., 2018). Usually, the maximal surface settlement (MSS) is considered a key factor in evaluating failure risks and providing early warnings (Suwansawat and Einstein, 2006; Dindarloo and Siami-Irdemoosa, 2015; Mahmoodzadeh et al., 2020). Therefore, it is essential to estimate the tunneling-induced settlement with a sound understanding of the associated triggering mechanisms to prevent critical problems.

The settlement resulting from the ground loss, i.e. the volume of soil excavated in excess of the primary design during tunneling, is inevitable because the soil and rock around tunnels are disturbed during construction (Neaupane and Adhikari, 2006), as depicted in Fig. 1. Over the past 50 years, many approaches have been used to predict the ground surface settlement, and they can conventionally be divided into three categories: empirical equations, numerical modeling, and physical experiments. For instance, Peck (1969) proposed an empirical model to estimate the ground settlement as a function of the horizontal distance from the tunnel centerline based on the MSS. Mair and Taylor (1997) found that the MSS can be empirically expressed by the tunnel diameter considering the volume loss due to tunneling. On the other hand, Ma et al. (2014) compared two numerical methods and proposed a novel model that reflects the ground loss and further improves the accuracy of the surface settlement prediction. Chehade and Shahrour (2008) also successfully applied numerical analysis to reveal the influence of the relative position of twin tunnels and the construction procedure on the settlement.

Fig. 1. A typical section of the surface settlement induced by tunneling (adapted from Zhou et al., 2017).

Regarding experiments, Adachi et al. (2003) conducted a three-dimensional trapdoor experiment to investigate the effect of the earth pressure distribution on the ground settlement. Ahmed and Iskander (2011) used a transparent soil model to study the surface settlement profile induced by shield tunneling, which also identified deformation within the soil mass near the tunnel. However, due to the uncertainty and complexity of tunneling, it is difficult to develop a model that explicitly captures the relationships between the settlement and engineering characteristics. Moreover, the conventional methods have intrinsic limitations, such as a narrow range of applicability, the need for skilled modeling experience, and limited prediction accuracy (Fattah et al., 2013; Fu et al., 2016; Zhou et al., 2017; Alotaibi et al., 2021).

In contrast, with advances in artificial intelligence (AI), machine learning (ML) methods have gained increasing popularity because of their capability to explore complicated relationships between the settlement and potential triggering factors. Compared with the conventional methods, ML approaches do not require significant modeling experience or an engineering background in geomaterial parameters. Therefore, intelligent models have been extensively applied as predictive tools in tunneling (Khatami et al., 2013; Mohammadi et al., 2015; Vu et al., 2020). In general, these models can be divided into two categories according to their input characteristics: one considers the settlement itself, and the other focuses on the combined effects of potential triggering factors. The former considers time-series information in the tunneling process, and the moving window method is usually used for modeling. For example, Hu et al. (2019) adopted three different AI algorithms to predict the surface settlement during tunnel construction and found that a suitable moving window size usually ranges from 1 to 20. Although this approach may produce acceptable results, it cannot explore the relationships between the settlement and closely related influential factors, such as support methods, material properties, and geometric conditions, which is why many researchers focus more on the latter approach.

To better understand the triggering mechanism and provide reliable guidance to engineering practice, many researchers have proposed intelligent models to predict tunneling-induced settlement over the last 20 years. For instance, Ocak and Seker (2013) used three different ML methods, namely artificial neural network (ANN), support vector machines (SVM), and Gaussian processes (GP), and found that the GP model performs better than the others and is less influenced by overfitting. Similarly, Wang et al. (2013) adopted a wavelet smooth relevance vector machine (wsRVM) and showed that the prediction model performs well and that the distribution of the predicted values can measure the prediction reliability. Zhang et al. (2020a) recently proposed an AI approach to predict the ground settlement during shield tunneling by considering multiple factors. Kohestani et al. (2017) proposed a random forest (RF) model to investigate the MSS caused by earth pressure balance (EPB) shield tunneling and found that the RF model has better applicability and accuracy than ANN for prediction. Hasanipanah et al. (2016) presented a novel ANN model optimized by particle swarm optimization (PSO) for predicting the MSS, which showed results consistent with the monitored values. In addition, their sensitivity analysis indicated that the horizontal to vertical stress ratio has a slightly higher effect on the MSS than the other input features. Goh et al. (2018) used a multivariate adaptive regression splines (MARS) approach to estimate the MSS induced by EPB tunneling. They also investigated the relationships between the MSS and the major influencing factors and found that geological conditions and shield operational factors have the most significant impact on the MSS. Mahmoodzadeh et al. (2020) used seven ML models for predicting the MSS, and the results indicated that the deep neural network (DNN) approach outperforms the other methods for this task. Table 1 gives a brief summary of the works mentioned, containing essential information on modeling methods and features. It can be observed that the ground settlement prediction problem is closely related to a wide range of factors, including the shield operational parameters, the geological conditions, and the geometric characteristics.

Nevertheless, previous studies developed intelligent models from a single dataset and rarely discussed their potential applicability to other datasets. Additionally, the DNN technique and hyperparameter optimization using stochastic search methods have received less attention. Therefore, intelligent models for this problem still need to be explored and improved through more case studies.

In this study, four commonly used ML methods, SVM, RF, DNN, and back-propagation neural network (BPNN), are investigated for tunneling-induced settlement prediction associated with various triggering factors. Based on these methods, two datasets of different sizes collected from previous studies (Zhang et al., 2020b; Neaupane and Adhikari, 2006), referred to as Datasets A and B, respectively, are adopted to analyze the predictive modeling for MSS. Before removing outliers, Dataset A has 294 samples with 12 predictors, while Dataset B has 40 samples with 7 predictors. After determining an optimal model, the sensitivity analysis is conducted on the two datasets independently to identify the relative importance of each input variable.

Table 1 Application of machine learning methods for the prediction of MSS.

The analysis in this paper mainly comprises six steps: (1) using the anomaly detection method to remove outliers and non-numeric values and preliminarily exploring the correlations between predictors and response; (2) dividing the processed dataset into two parts: 80% for training and 20% for testing; (3) building prediction models using different ML methods, combined with optimization approaches and the 5-fold cross-validation (CV) method; (4) applying the developed models to the testing dataset and determining the best model performance; (5) carrying out the sensitivity analysis and revealing the relative importance of each variable for the MSS; and (6) discussing the model performance on datasets of different sizes and comparing the relative importance results.

2. Methodology

AI creates intelligent systems that learn, adapt, mimic, and even exceed human intelligence. These methods are considered useful tools for addressing practical issues, as they can exploit and capture complex, dynamic, and nonlinear relationships hidden in data (Zhang et al., 2019, 2020c). Currently, AI is used in a wide range of applications, including data mining, pattern recognition, natural language processing, and self-driving. As a primary component of AI, ML is intended to solve problems that cannot be explicitly programmed. The fundamental principle of ML methods is to learn from historical experience automatically and rationally through computer algorithms. They are then used to solve new problems based on what they have learnt, which is known as the generalization ability. ML has been widely used in geotechnical engineering as an alternative tool to reveal and handle the uncertainty and randomness that many engineers and researchers frequently face.

In this section, we briefly explain four ML methods (SVM, RF, BPNN, and DNN) for predicting the MSS induced by tunneling. Then we describe the optimization, validation, and performance evaluation methods and their implementation.

2.1. Machine learning methods

2.1.1. Back-propagation neural network model

BPNN is a classical type of ANN, which aims to emulate the function of the human nervous system. The general architecture of the BPNN consists of an input layer (K1 to Kn), one or multiple hidden layers (H1 to Hn), and an output layer (O), as shown in Fig. 2. To reduce the error between the produced output and the expected one, BPNN uses the gradient descent method to adjust the weights of all layers and search for the minimum of the error function in weight space.

The output of the hidden layer in the BPNN is expressed as

where wz,j is the weight connecting hidden neurons and output values; θz is the bias of output layers; n is the number of neurons in hidden layers; and Zj denotes the sum of the dot product and bias, which is used to calculate the final output. In this work, a single hidden layer BPNN model was selected for prediction modeling considering its broad applicability and relatively low computation cost.
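As a rough illustration of the forward pass through a single-hidden-layer network of the kind described above, the following NumPy sketch computes a weighted sum plus bias at each layer. The layer sizes, the tanh activation, and the names `forward`, `W_h`, `b_h`, `w_o`, and `b_o` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def forward(x, W_h, b_h, w_o, b_o):
    """Forward pass of a single-hidden-layer network (illustrative only)."""
    h = np.tanh(W_h @ x + b_h)   # hidden-layer activations (tanh is an assumption)
    z = w_o @ h + b_o            # weighted sum of hidden outputs plus output bias
    return z

rng = np.random.default_rng(0)
n_inputs, n_hidden = 12, 4       # 12 predictors as in Dataset A; 4 neurons is arbitrary
W_h = rng.normal(size=(n_hidden, n_inputs))   # hypothetical input-to-hidden weights
b_h = np.zeros(n_hidden)
w_o = rng.normal(size=n_hidden)               # hypothetical hidden-to-output weights
b_o = 0.0
x = rng.uniform(-1, 1, size=n_inputs)         # one normalized input sample
y_hat = forward(x, W_h, b_h, w_o, b_o)
```

Training would then adjust `W_h`, `b_h`, `w_o`, and `b_o` by gradient descent on the prediction error, which this sketch does not attempt.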

2.1.2. Support vector machine model

SVM was first proposed by Cortes and Vapnik (1995) for classification problems using statistical learning theory, and it was later extended to regression problems, known as support vector regression (SVR). In classical SVR models, the input variables are mapped into a high-dimensional feature space. Then the optimal decision function is developed to find the best fitting effect (Fig. 3). The regression function of SVR can be written as

where W is the weight vector, Φ(x) is a nonlinear mapping from the input space to the feature space, and b is the bias. The estimation function is transformed into a minimization problem using the ε-insensitive loss function:

Fig. 2. The typical architecture of back-propagation neural network (adapted from Momeni et al., 2014).

where K(xi, x) is the kernel function of the SVR model. The kernel is defined as a set of mathematical functions that transform the input data into the required form (Shahri et al., 2020; Zhou et al., 2021a). The commonly used kernel functions of SVR models include linear, polynomial, radial basis function (RBF), and sigmoid.
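To make the kernel idea concrete, the sketch below evaluates the standard RBF kernel, K(xi, xj) = exp(-γ‖xi - xj‖²), for every pair of rows in two small arrays. The function name `rbf_kernel`, the sample points, and the value of γ are illustrative assumptions rather than settings from the paper.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Return the matrix K[i, j] = exp(-gamma * ||X[i] - Y[j]||^2)."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])  # made-up sample points
K = rbf_kernel(X, X, gamma=0.5)
```

Each entry of `K` lies in (0, 1] and equals 1 only for identical points, so a larger γ makes the similarity decay faster with distance.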

2.1.3. RF model

Fig. 3. Illustration of the SVM model.

RF is a supervised learning algorithm that was first proposed by Breiman (2001), and it can be seen as a robust way to avoid overfitting. This ensemble learning method generates many predictors and then aggregates their outputs (Fig. 4). For classification problems, an RF model comprises a collection of tree-structured classifiers {h(x, Θk), k = 1, 2, …, n}, where {Θk} are independent and identically distributed random vectors, and x is an input vector. Each tree-structured classifier in the model is a decision tree (DT). Each DT is independently developed during the training process using a different training set (also known as a 'bootstrap sample'), randomly generated from the original dataset. Each node of the DT is split using the best variable among a subset of predictors. After the ensemble classifier is constructed and finalized, a simple majority vote or an average value is taken for prediction (Zhou et al., 2019).

For the regression tasks, the building process of the RF model can be expressed as follows (Zhang et al., 2020c):

(1) Randomly select n data points from the training pool. This is why it is called a "random" forest, as the data points are randomly taken out from the pool, and therefore the final outcome is random.

(2) Build DT 1 based on n data points.

(3) Repeat steps 1 and 2 until K trees are constructed.

(4) Generate the RF model by combining the K sub-trees in parallel.

(5) Evaluate each tree independently, and then take the average value as the final output.

The procedure can be expressed as follows:

where fi(xi) is the corresponding output generated by the ith DT; and K is the number of DTs. All individual outputs are then averaged to obtain the final prediction.
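The steps above can be sketched as follows: draw a bootstrap sample, fit a tree, repeat K times, and average the K tree outputs. This is a minimal sketch using scikit-learn's `DecisionTreeRegressor` on synthetic data; the helper names `fit_bagged_trees` and `predict_bagged`, the choice `max_features="sqrt"`, and the data are assumptions for illustration, not the configuration used in the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_bagged_trees(X, y, n_trees=10, seed=0):
    """Fit n_trees decision trees, each on its own bootstrap sample."""
    rng = np.random.default_rng(seed)
    n = len(X)
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)   # n draws with replacement
        tree = DecisionTreeRegressor(
            max_features="sqrt",            # random subset of predictors per split
            random_state=int(rng.integers(1_000_000)),
        )
        trees.append(tree.fit(X[idx], y[idx]))
    return trees

def predict_bagged(trees, X):
    """Average the independent predictions of all trees."""
    return np.mean([t.predict(X) for t in trees], axis=0)

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(60, 3))       # synthetic predictors
y = X[:, 0] ** 2 + 0.5 * X[:, 1]           # synthetic response
trees = fit_bagged_trees(X, y, n_trees=10)
y_hat = predict_bagged(trees, X)
```

In practice a library implementation such as scikit-learn's `RandomForestRegressor` bundles these steps; the manual loop is shown only to mirror the listed procedure.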

2.1.4. Deep neural network model

Fig. 4. A typical illustration depicting the random forest method (adapted from Shreyas and Dey (2019)).

ANN-based models have been widely used for solving geotechnical engineering problems, as these models can reveal and reflect complex relationships hidden behind data (Santos Jr and Celestino, 2008). However, ANNs easily get trapped in local minima and may suffer from a slow convergence rate. Because of these potential issues, the DNN has been developed alongside the rapid development of computer hardware. Compared with the traditional ANN, the DNN can be considered an improved version with multiple layers. This method emphasizes the importance of "deep structure," reflected by the number of hidden layers. Besides, the DNN emphasizes feature-based learning, which transforms features of the original space into a new feature space. Consequently, the DNN can represent the data more easily and perform better. This is another reason why the DNN is well suited to "big data" problems, as it can describe the rich internal information of data (Zhang et al., 2021). The DNN has been successfully used for various applications, such as image classification, object identification, pattern recognition, and natural language processing. However, its potential application in geotechnical engineering needs to be further explored.

Like the architecture of ANN, the DNN includes input, hidden, and output layers (Fig. 5). In this work, the multi-layer perceptron method was used for developing the DNN, which is one of the typical feedforward networks in which data flow from the input layer to the output layer without looping back (Mahmoodzadeh et al., 2020).

2.2. Hyperparameter optimization methods

PSO is a metaheuristic algorithm developed by Kennedy and Eberhart (1995), which is commonly used for hyperparameter optimization. Inspired by the feeding behavior of a bird flock, this approach attempts to mimic the process of sharing information between group members (termed "particles"). The algorithm uses a swarm of particles rather than a single one, each of which is a potential solution to the optimization problem. The most important features of a particle are its position, speed, and fitness. The fitness value is calculated through the fitness function, which takes a potential solution to the target problem as input and produces an output revealing how well the solution fits the problem under consideration (Zhou et al., 2016). When a particle moves, it is guided not only by its own position but also by the position of the whole group (Sheil et al., 2020). More specifically, the velocity of a particle contains the information of movement direction and distance, and it is adjusted according to the movement experience of all the particles to optimize the individual in the search space. The updated velocity vector at time t + 1 for the ith particle can be expressed as follows:

where ρ1 and ρ2 are two random numbers distributed within the range of [0, 1]; βg and βl are the global and local learning factors, respectively; xi is the position of the ith particle; g* and xi* are the global and local best historical locations of the ith particle, respectively. In this work, the PSO method is used to perform hyperparameter optimization of the SVM and RF models.
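The velocity and position updates can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions: the inertia weight `w`, the learning-factor values, the clipping of positions to the search bounds, and the toy fitness function are all illustrative choices that do not come from the paper; only the search ranges [1, 20] and [0.1, 0.5] echo the SVM search space reported later.

```python
import numpy as np

def pso_minimize(fitness, bounds, n_particles=20, n_iter=100,
                 w=0.7, beta_l=1.5, beta_g=1.5, seed=0):
    """Minimize `fitness` with a basic particle swarm (illustrative only)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    x = rng.uniform(lo, hi, size=(n_particles, len(lo)))   # positions
    v = np.zeros_like(x)                                   # velocities
    p_best = x.copy()                                      # local best positions
    p_best_f = np.array([fitness(p) for p in x])
    g = np.argmin(p_best_f)
    g_best, g_best_f = p_best[g].copy(), p_best_f[g]       # global best
    history = [g_best_f]
    for _ in range(n_iter):
        rho1 = rng.uniform(size=x.shape)
        rho2 = rng.uniform(size=x.shape)
        # Velocity update: pull toward local and global best positions.
        v = w * v + beta_l * rho1 * (p_best - x) + beta_g * rho2 * (g_best - x)
        x = np.clip(x + v, lo, hi)                         # stay inside the bounds
        f = np.array([fitness(p) for p in x])
        improved = f < p_best_f
        p_best[improved] = x[improved]
        p_best_f[improved] = f[improved]
        g = np.argmin(p_best_f)
        if p_best_f[g] < g_best_f:
            g_best, g_best_f = p_best[g].copy(), p_best_f[g]
        history.append(g_best_f)
    return g_best, g_best_f, history

def toy_fitness(p):
    """Hypothetical stand-in for a cross-validated error surface."""
    return (p[0] - 8.0) ** 2 + 100.0 * (p[1] - 0.2) ** 2

best_x, best_f, history = pso_minimize(toy_fitness, [(1.0, 20.0), (0.1, 0.5)])
```

In a hyperparameter search, `toy_fitness` would be replaced by a function that trains a model with the candidate hyperparameters and returns its average validation error.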

Fig. 5. A typical structure of deep neural network.

The grid search (GS) method is a traditional way to tune the hyperparameters of an ML algorithm. The process can be described as follows: reasonable values for each hyperparameter are manually defined, and the model then iterates over every combination of the designated values. The CV method is usually adopted to guide this approach on the training dataset. When all the parameter combinations have been tried, the combination with the best model performance is returned automatically. GS has been successfully used for NN-based ML models, i.e. shallow ANN and DNN, as this approach can take full account of the important parameters of a prediction model, including both numeric and non-numeric hyperparameters. Therefore, the GS method is used to find the optimal combination of hyperparameters for the neural network-based models in this work.
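The exhaustive nature of GS can be illustrated with the standard library alone. In this sketch, `grid_search` enumerates every combination in a small grid and keeps the one with the lowest score; the grid values and the scoring function `toy_score` are hypothetical placeholders for a real train-and-validate step.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    n_evaluated = 0
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)          # lower is better (e.g. an error metric)
        n_evaluated += 1
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score, n_evaluated

def toy_score(params):
    """Hypothetical score; a real one would train and validate a model."""
    penalty = 0.0 if params["activation"] == "tanh" else 0.1
    return (abs(params["n_neurons"] - 64) / 64
            + abs(params["batch_size"] - 16) / 16
            + penalty)

grid = {
    "n_neurons": [32, 64, 128],
    "batch_size": [8, 16],
    "activation": ["relu", "tanh"],   # non-numeric values are handled the same way
}
best_params, best_score, n_evaluated = grid_search(grid, toy_score)
```

The number of evaluations is the product of the list lengths (here 3 × 2 × 2 = 12), which is why the cost of GS grows quickly as hyperparameters are added.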

2.3. k-fold cross validation

In general, the entire dataset is divided into two parts to build the model and validate its performance, known as the training and testing datasets. The simplest way to split a dataset is based on ratios, such as 70% for training and 30% for testing. The division ratio can be adjusted based on the specific problem. However, with a small dataset, a single split prevents the limited data samples from being used fully. Therefore, how the dataset is divided significantly influences the model construction and validation. To resolve this issue, k-fold CV is considered a powerful tool. In k-fold CV, the original dataset is first randomly divided into k sub-datasets. Then, k-1 sub-datasets are used as training data, and the remaining one is used as validation data. The CV process is repeated k times, and the model performance is evaluated by the average prediction error over the k sub-datasets. The k-fold CV method can take full advantage of the data, as each part of the original dataset is used for both training and validation. In this work, 5-fold CV (k = 5) is used.
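A minimal NumPy sketch of the k-fold procedure is given below. The helper names `k_fold_indices` and `cross_val_rmse`, the synthetic data, and the mean-value baseline model are assumptions for illustration; any model that can be fitted on the training folds and applied to the validation fold could be plugged in.

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for a shuffled k-fold split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

def cross_val_rmse(fit_predict, X, y, k=5):
    """Average validation RMSE over the k folds."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(X), k):
        y_pred = fit_predict(X[train_idx], y[train_idx], X[val_idx])
        scores.append(np.sqrt(np.mean((y[val_idx] - y_pred) ** 2)))
    return float(np.mean(scores))

def mean_baseline(X_train, y_train, X_val):
    """Hypothetical baseline: always predict the training mean."""
    return np.full(len(X_val), y_train.mean())

rng = np.random.default_rng(0)
X = rng.normal(size=(230, 12))                    # synthetic predictors
y = X[:, 0] + 0.1 * rng.normal(size=230)          # synthetic response
cv_score = cross_val_rmse(mean_baseline, X, y, k=5)
```

Each sample lands in the validation fold exactly once, so the averaged score uses all of the training data without ever validating on a sample the model was fitted to.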

2.4. Evaluation of modelling performance

Three statistical criteria, namely mean absolute error (MAE), root mean square error (RMSE), and Pearson correlation coefficient (R), are used to assess model performance. These performance indicators are defined as follows:

The MAE is the average of the absolute values of the errors, which shows how close the actual and predicted values are. By definition, the MAE is non-negative, and the smaller it is, the better the model performance. Similar to the MAE, the RMSE also measures the error. Again, the RMSE is non-negative, and smaller values indicate less error and better agreement between the actual and predicted datasets. The R measures the correlation between two datasets, and it varies from -1 to 1, indicating perfect inverse correlation and perfect positive correlation, respectively. In particular, R = 0 means no linear correlation between the actual and predicted values.
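For reference, the three indicators can be computed as in the sketch below, which uses their standard textbook definitions: MAE as the mean of |y - ŷ|, RMSE as the square root of the mean of (y - ŷ)², and R as the covariance of y and ŷ divided by the product of their standard deviations. The function names and the sample values are made up for illustration.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between actual and predicted values."""
    dt = y_true - y_true.mean()
    dp = y_pred - y_pred.mean()
    return float(np.sum(dt * dp) / np.sqrt(np.sum(dt**2) * np.sum(dp**2)))

y_true = np.array([1.0, 2.0, 3.0, 4.0])   # made-up measured values
y_pred = np.array([1.5, 2.0, 2.5, 4.0])   # made-up predictions
scores = {
    "MAE": mae(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "R": pearson_r(y_true, y_pred),
}
```

Because the RMSE squares the errors before averaging, it penalizes a few large errors more heavily than the MAE does, which is why the two are usually reported together.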

2.5. Implementation of the prediction model

The flowchart for developing the prediction models and determining the best model performance is illustrated in Fig. 6. Six main steps are adopted to implement this task: (1) cleaning the data by removing outliers; (2) analyzing the relationships between predictors and response; (3) randomly splitting the prepared dataset into two parts: training and testing; (4) applying hyperparameter optimization using the PSO and GS methods combined with 5-fold CV, with the average RMSE as the controlling score; (5) feeding the testing data into the models developed in Step 4 and then comparing the results; and (6) carrying out the sensitivity analysis to determine the relative importance of each input variable and comparing the results with those found in Step 2.

Fig. 6. Flowchart of the proposed prediction models.

Fig. 7. Anomaly detection for Dataset A using the three-standard deviation method.

3. Case studies

As mentioned above, many factors may affect the ground surface settlement, which can be categorized into three main groups: geometrical characteristics (e.g. tunnel depth and diameter), geological properties (e.g. Young's modulus, Poisson's ratio, groundwater table, unit weight and water permeability), and construction parameters (e.g. excavation method, penetration rate and support method). Depending on the specific conditions of tunnels and construction procedures, the potential influential factors may contribute differently to the ground surface settlement. In this work, two datasets collected from previous research are used to develop and assess the proposed model.

3.1. Dataset A

Zhang et al. (2020b) used five types of ML algorithms to predict the surface settlement induced by the construction of Metro Line 4 of Changsha, Hunan Province, China. They shared the original dataset used to develop the forecast models. In this work, this dataset is referred to as Dataset A.

3.1.1. Background information

Dataset A contains information on five tunnel sections between six metro stations, with a total length of 5.44 km. The tunnel was constructed using an EPB shield machine, starting in 2016 and finishing in 2019. The monitored settlement collected in the original dataset is taken as the MSS. This project was primarily excavated in a zone of weathered rock, indicating that consolidation settlement occurred rapidly after the tunnel construction.

Dataset A contains 294 samples with 13 components. The tunnel cover depth (Cd) has commonly been shown to be a key factor affecting the surface settlement, and it is the only geometric parameter in Dataset A. The thrust (Th), torque (To), grout filling (Gf), penetration rate (Pr), and chamber pressure (Cp), recorded in real time by the shield system during excavation, are selected as the operational parameters. The geological parameters, namely the blow counts of the modified standard penetration test (MSPT) and the modified dynamic penetration test (MDPT) of soil layers, the modified uniaxial compressive strength of weathered rock (MUCS), the groundwater table (W), and the geological condition at the cutter head face (Gc), measured before tunnelling, are adopted as well. It should be noted that the geological conditions at the cutter head face (Gc) are categorized into four groups: soil, gravel, rock, and mixed-face ground, which are labelled as 1, 2, 3, and 4, respectively. In addition, the shield stoppage (St) is considered an abnormal condition that is closely related to the settlement. St is classified as 0 and 1 for continuous advancement and stoppage, respectively. In this work, these 12 factors are selected as input predictors. The target output is the ground surface settlement along the tunnel alignment.

3.1.2. Data preparation and analysis

Outliers of settlement in the original dataset were detected and removed using the three-standard deviation (3σ) method, resulting in 288 samples (Fig. 7). Again, the tunnel cover depth (Cd) and the groundwater table (W) show a strong correlation with each other.

Fig. 8. The correlation coefficient between every two parameters of Dataset A: (a)Linear correlation indicated by R; and (b) Nonlinear correlation indicated by MIC.

Table 2 Descriptive statistical values of variable parameters for Dataset A.

Consequently, all these twelve input factors are found to be weakly associated with surface settlement. Nevertheless, Gc, Gf, and St are the most influential factors affecting the settlement. Besides, the model may be affected by the close relationship between tunnel cover depth (Cd) and groundwater table (W), which have relatively high contributions to the settlement.

3.2. Dataset B

Neaupane and Adhikari (2006) compiled a database from published literature, including various case studies of tunnel projects in different countries (e.g. Bangkok, Thailand; Frankfurt, Germany; Toronto, Canada). In this work, this database is referred to as Dataset B.

3.2.1. Background information

The original database consists of 40 samples, with 9 components available. However, the trough width (i) in the original database, denoting the inflection point, is not considered an output in this work; thus, it is removed from Dataset B. The tunnel diameter (D), tunnel cover depth (Cd), and volume loss (Vs) induced by tunneling per meter are adopted for modeling. The groundwater characteristics and construction methods are also considered important input variables for building the intelligent models. It is worth noting that the groundwater table (W) is classified into two categories: a water level above the tunnel crest is taken as 1, and a water level below the tunnel axis is taken as 2. Similarly, the construction methods (Cm) are categorized into three groups: 1, 2, and 3, denoting the hand-mined shield, mechanical shield, and semi-mechanical type (compressed air support) shield, respectively. In addition, the normalized volume loss, Vs/Vt, is the ratio of the volume of the settlement per unit length of the tunnel (Vs) to the ground loss during tunneling (Vt). For the soil strength properties, the undrained shear strength (Cu) and the internal friction angle (φ) are used, or estimated using empirical equations if necessary. Consequently, in Dataset B, the variables D, Cd, Vs, and Vs/Vt are geometric characteristics; the variables Cu and W are geological properties; and the variable Cm is the operational parameter.

3.2.2. Data preparation and analysis

Outliers of settlement in Dataset B are detected and removed using the three-standard deviation (3σ) method, resulting in 39 samples (Fig. 9).
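The 3σ rule keeps a sample only if it lies within three standard deviations of the mean. The sketch below applies it to made-up settlement values; the function name `three_sigma_mask`, the data, and the use of the population standard deviation (NumPy's default) are assumptions, since the paper does not state these details.

```python
import numpy as np

def three_sigma_mask(values):
    """Return True for values within three standard deviations of the mean."""
    mu = values.mean()
    sigma = values.std()          # population standard deviation (an assumption)
    return np.abs(values - mu) <= 3.0 * sigma

# Made-up settlements in mm: 19 ordinary readings plus one extreme value.
settlement = np.concatenate([np.linspace(-5.0, 5.0, 19), [60.0]])
mask = three_sigma_mask(settlement)
cleaned = settlement[mask]
```

Note that the outlier itself inflates both the mean and the standard deviation, so with very small samples the rule can fail to flag even extreme points.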

Table 3 shows the statistical values of variable parameters in pre-processed Dataset B.

The correlation coefficient for each pair of variables among the seven input factors and the output parameter is also determined by the R and the MIC, respectively (Fig. 10).
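The linear part of such a pairwise analysis can be reproduced with `numpy.corrcoef`, as in the sketch below on synthetic data. The variables `x1`, `x2`, `x3`, and `y` are made up for illustration, and the MIC is not computed here because it requires a dedicated third-party implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)          # synthetic predictor
x2 = -x1                           # perfectly inversely related to x1
x3 = rng.normal(size=100)          # unrelated predictor
y = 2.0 * x1 + 1.0                 # synthetic response driven by x1 only

data = np.column_stack([x1, x2, x3, y])
corr = np.corrcoef(data, rowvar=False)     # 4 x 4 matrix of Pearson R values

# Rank the predictors by the strength of their linear link with the response.
r_with_response = corr[:-1, -1]
ranking = np.argsort(-np.abs(r_with_response))
```

Strong off-diagonal entries between predictors (here `x1` and `x2`) flag the kind of inner relationship that the text warns may make a model less stable.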

In Fig. 10a, the volume loss (Vs) and the normalized volume loss (Vs/Vt) are found to be closely related to the settlement, while the other five input factors have a relatively poor correlation. The tunnel diameter (D) and the construction method (Cm) show a nearly independent relationship with the settlement. The cover depth (Cd) and the undrained shear strength (Cu) are negatively correlated with the settlement, meaning that the settlement reduces as these two factors increase. Considering the overall relationships, the volume loss is closely associated with the normalized volume loss. The results presented by the MIC in Fig. 10b are similar to those of the R, where the significance of volume loss and normalized volume loss is emphasized again. The cover depth and undrained shear strength rank third and fourth in their relationship with the settlement, respectively. The other three input variables are not closely associated with the settlement. Considering the inner relationships, the tunnel diameter shows the same level of correlation with the cover depth and normalized volume loss, which may make the intelligent models more complex and unstable.

Fig. 9. Anomaly detection for Dataset B using the three-standard deviation method.

In summary, it can be inferred that except for the volume loss (Vs) and normalized volume loss (Vs/Vt), the other five input factors are weakly associated with the surface settlement. Moreover, the model performance may be influenced by the close inner relationships among cover depth (Cd), normalized volume loss (Vs/Vt), and tunnel diameter (D).

4. Model development and results

In this section, the establishment of the forecast models and the validation of model performance are described in detail. The intelligent models are implemented in Python using the Scikit-learn and Keras packages, with TensorFlow as the backend. The analysis is carried out on a personal computer with an Intel(R) Core(TM) i7-10750H (2.60 GHz) processor and 16 GB of memory.

4.1. Dataset A

4.1.1. Model development

As mentioned, after the process of anomaly detection, there are 288 samples available. The entire dataset is then randomly split into two parts: 80% for training and validation using the 5-fold CV method and 20% for testing. Twelve parameters of 4 categories are taken as input predictors, and the MSS is the output. Before developing the intelligent models, the 12 input parameters and the settlement are normalized to [-1, 1] with the following criterion, which helps to speed up the search for the optimal solution:

where x′ is the normalized value of x; and min(x) and max(x) are the minimum and maximum values of the original dataset x, respectively.
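One common way to map a variable to [-1, 1] from its minimum and maximum is x′ = 2(x - min(x)) / (max(x) - min(x)) - 1. The sketch below assumes this standard form, which is consistent with the symbols defined above but is not copied from the paper; the function names and sample values are hypothetical.

```python
import numpy as np

def scale_to_unit_range(x):
    """Scale each column of x linearly so that min -> -1 and max -> 1."""
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    x_scaled = 2.0 * (x - x_min) / (x_max - x_min) - 1.0
    return x_scaled, x_min, x_max

def inverse_scale(x_scaled, x_min, x_max):
    """Map scaled values back to the original units."""
    return (x_scaled + 1.0) / 2.0 * (x_max - x_min) + x_min

cover_depth = np.array([[10.0], [15.0], [20.0]])   # made-up values in metres
scaled, lo, hi = scale_to_unit_range(cover_depth)
restored = inverse_scale(scaled, lo, hi)
```

The inverse mapping is needed to report predicted settlements in millimetres again. A column whose values are all equal would cause a division by zero and needs separate handling.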

(1) SVM

The kernel function has an important influence on the SVM model, and the RBF is adopted as the kernel function in this work, considering its wide convergence domain. According to Srivastava et al. (2012), three main hyperparameters of the SVM model are selected for tuning with PSO: the penalty factor C, the kernel coefficient γ, and the degree De, with search spaces of [1, 20], [0.1, 0.5] and [1, 7], respectively. The maximal iteration number of the PSO algorithm is set to 100, while the population size is set to 20. The PSO is considered converged if the fitness does not improve for 10 successive iterations or if the maximal iteration number is reached. It should be noted that the PSO settings are fixed in this work. After the search, the best parameters are Cbest = 8.59, γbest = 0.1254, and Dbest = 7; and the time cost is 24.73 s (Table 4).

Table 3 Descriptive statistical values of variable parameters for Dataset B.

Fig. 10. The correlation coefficient between every two parameters of Dataset B: (a)Linear correlation indicated by R; and (b) Nonlinear correlation indicated by MIC.

(2) RF

RF is a type of ensemble learning, which has been widely used to solve classification or regression problems in geotechnical engineering (Zhou et al., 2021b). The number of trees (n_estimator) and the depth of the trees (max_depth) are two important hyperparameters in the RF model (Zhou et al., 2019), as they can significantly affect the model complexity and structure. Therefore, these two parameters are selected for optimization. In this study, the search spaces of the hyperparameters n_estimator and max_depth are both set to [1, 50]. The PSO is then used to search for the best solutions. After the search, the best hyperparameters are n_estimator = 32 and max_depth = 32; and the time cost is 116.19 s (Table 4).

(3) BPNN

The GS method is used to find the best combination of hyperparameters for the BPNN model. In essence, GS is an exhaustive method, as it needs to iterate over every combination of the parameters that have been manually specified for the search. A single hidden layer BPNN model is constructed, and the number of neurons (n_n), batch size (batch_size), activation function (activation_function), kernel initializer (kernel_initializer) and optimizer (optimizer) are considered for optimization (Moayedi et al., 2020). The search spaces of these hyperparameters are set to [128, 64, 32], [8, 16], ['softsign', 'tanh', 'relu'], ['uniform', 'glorot_uniform', 'he_uniform'], and ['sgd', 'rmsprop', 'nadam'], respectively. After the search, the best combination is determined as n_n = 128, batch_size = 8, activation_function = 'softsign', kernel_initializer = 'uniform', and optimizer = 'rmsprop'; and the time cost is 54.85 s (Table 4).

Table 4 Specifications on the best parameters of machine learning methods in Dataset A.

Fig. 11. Best prediction results of four machine learning methods on Dataset A: (a) PSO-SVM, (b) PSO-RF, (c) GS-BPNN, and (d) GS-DNN.

(4) DNN

The most important parameters for developing the DNN are similar to those of the BPNN, mainly including the number of hidden layers (n_l), the number of neurons within each layer (n_n), activation function (activation_function), batch size (batch_size), kernel initializer (kernel_initializer), optimizer (optimizer), and epochs (epochs). GS is also used to tune these hyperparameters; except for n_l, [3, 4, 5], and epochs, [100, 200, 300], the search spaces are the same as those of the BPNN. After the search, the best combination of hyperparameters is n_l = 3; neurons from the first to the last hidden layer of 128, 64, and 32, respectively; activation_function = 'softsign'; batch_size = 8; kernel_initializer = 'glorot_uniform'; optimizer = 'rmsprop'; and epochs = 300. It takes 245.65 s to find the best results (Table 4).

4.1.2. Model performance and comparison

Predicted results on the training and testing datasets using these four ML methods are illustrated in Fig. 11. To compare the predicted results with actual values, a reference line is adopted, which represents the situation where the predicted values equal the measured settlements. MAE and RMSE are used to evaluate the prediction capacity, with lower errors indicating better prediction capacity of the proposed models. The correlation between the measured and predicted values is assessed by R. These three indicators are summarized in Table 5.
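For reference, the three indicators can be computed as in the following sketch, which uses the standard definitions of MAE, RMSE, and the Pearson correlation coefficient. The function names and the sample values are illustrative only and are not taken from either dataset.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient R between two sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Illustrative values only (mm); not data from the paper.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 5.0]
errors = (mae(measured, predicted), rmse(measured, predicted))
```

Points lying exactly on the reference line give MAE = RMSE = 0 and R = 1; R alone can remain high even when predictions are systematically shifted, which is why it is reported together with the two error measures.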

Table 5 Comparison of prediction results generated by four machine learning methods on Dataset A.

Fig. 12. Prediction results of testing dataset of Dataset A using four machine learning methods: (a) PSO-SVM, (b) PSO-RF, (c) GS-BPNN, and (d) GS-DNN.

It can be seen that the RF model shows the best performance on both the training and testing datasets because its results are distributed around the reference line with less dispersion. The indicators support this point, as the RF model yields the lowest RMSE and MAE and the highest R on both the training and testing datasets. The BPNN model has a prediction capacity similar to that of the DNN model; the two models have almost identical results and indicators. Compared with the RF model, the relatively large settlements predicted by the BPNN and DNN models differ significantly from the actual values. The SVM model performs worse than the other three models: it yields the highest RMSE and MAE and the lowest correlation between the measured and predicted values. A comparison of the four models shows that the prediction accuracy decreases as the settlement increases, and settlements ranging from -5 mm to 5 mm can be predicted with relatively low error.

Predicted results generated by these four ML methods on the testing dataset are compared with the measured settlements (Fig. 12). As indicated by the absolute error, the RF model gives more accurate results. In particular, the RF model outperforms the other three models in predicting large settlements. The comparison confirms again that the values predicted by the BPNN model are nearly identical to those of the DNN model. The SVM model yields a relatively high error and, in particular, performs poorly on large settlements.

4.1.3. Sensitivity analysis

Variable importance illustrates each input variable's contribution to the target prediction. It can be used to improve model performance through feature selection. As the RF model shows the best performance on Dataset A, the built-in feature importance in the Scikit-learn package, calculated by the mean decrease in impurity, is used herein. As mentioned, the RF is a set of DTs consisting of internal nodes and leaves. In a DT, every node makes a decision to split the values of a single feature such that similar values of the dependent variable end up in the same set after the split. The split is based on some criterion, for example, Gini impurity or information gain for classification problems and variance reduction for regression trees. Therefore, when training a tree, how much each feature contributes to the decrease of the weighted impurity can be computed. This process can be expressed as follows:

where Ikjf is the importance of the node j on the feature f in a DT; Cj is the measure of the impurity of the node j; Mleft(j) and Mright(j) are the numbers of instances in the left and right subsets of node j, respectively; Mj is the number of instances in the node j; Cleft(j) and Cright(j) are the impurities of the left and right subsets of node j, respectively; and j is the total number of nodes in a DT. The variable importance can then be computed as:

where k is the number of regression trees.
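The computation described above can be illustrated with a small sketch. It assumes one common form of the weighted impurity decrease, with variance as the impurity measure for regression, per-tree normalization, and averaging over trees; this mirrors the general idea of the mean decrease in impurity and is not claimed to reproduce the exact expressions or the Scikit-learn internals. All function names and the sample values are hypothetical.

```python
def variance(values):
    """Impurity of a regression node: variance of its target values."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def node_importance(parent, left, right, n_total):
    """Weighted impurity decrease produced by splitting one node."""
    m_j, m_left, m_right = len(parent), len(left), len(right)
    decrease = (variance(parent)
                - (m_left / m_j) * variance(left)
                - (m_right / m_j) * variance(right))
    # Weight by the share of all training instances that reach this node.
    return (m_j / n_total) * decrease


def tree_importance(splits, n_features, n_total):
    """Sum node importances per feature for one tree, then normalize to 1.

    `splits` is a list of (feature_index, parent, left, right) tuples.
    """
    raw = [0.0] * n_features
    for feature, parent, left, right in splits:
        raw[feature] += node_importance(parent, left, right, n_total)
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else raw


def forest_importance(per_tree):
    """Average the per-tree importances over the k trees of the forest."""
    k = len(per_tree)
    return [sum(column) / k for column in zip(*per_tree)]


# Illustrative tree: the root splits on feature 0, both children on feature 1.
targets = [0.0, 2.0, 10.0, 12.0]
splits = [
    (0, targets, [0.0, 2.0], [10.0, 12.0]),
    (1, [0.0, 2.0], [0.0], [2.0]),
    (1, [10.0, 12.0], [10.0], [12.0]),
]
importance = tree_importance(splits, n_features=2, n_total=len(targets))
```

In this example the root split removes most of the variance, so feature 0 receives a score of 25/26 and feature 1 the remaining 1/26.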

The relative importance score of each input variable is illustrated in Fig. 13. The geological properties Gc and W mainly affect the surface settlement, as they contribute most to the model. The operational parameters Pr, Th, and To can also affect the settlement to some extent; however, their effect is not as strong as that of Gc and W. The geometric characteristic Cd does not significantly influence the tunnel-induced settlement, with a score of around 0.05. The stoppage state, St, is less important than the tunnel cover depth Cd. The geological property MDPT is the least important factor affecting the settlement, scoring below 0.01. These results are consistent with the relationship analysis in Section 3.1, indicating the necessity of analyzing both linear and nonlinear relationships. Again, the pre-stage relationship analysis and the sensitivity analysis results confirm the importance of geological properties in Dataset A.

4.2. Dataset B

4.2.1. Model development

The establishment of intelligent models for Dataset B is the same as that for Dataset A. SVM, RF, BPNN, and DNN models are again constructed in combination with the PSO, GS, and 5-fold CV methods. The search spaces for the hyperparameters are the same as those for Dataset A, so no further details are given here. After the search, the best hyperparameters of each ML model are listed in Table 6, and it can be observed that the time cost is much lower than that for Dataset A.
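The 5-fold CV used to score each hyperparameter candidate can be sketched as follows. This is a generic illustration under stated assumptions: the shuffling, the fold assignment, the use of RMSE as the fold score, and the `fit_predict` callable are all hypothetical choices, since the text does not specify these details.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def cv_rmse(fit_predict, X, y, k=5, seed=0):
    """Average RMSE over k folds.

    `fit_predict(X_train, y_train, X_test)` is a hypothetical callable that
    trains a model and returns its predictions for X_test.
    """
    scores = []
    for train, test in k_fold_indices(len(y), k, seed):
        preds = fit_predict([X[i] for i in train],
                            [y[i] for i in train],
                            [X[i] for i in test])
        mse = sum((p - y[i]) ** 2 for p, i in zip(preds, test)) / len(test)
        scores.append(mse ** 0.5)
    return sum(scores) / k


def mean_model(X_train, y_train, X_test):
    """Trivial baseline: predict the training mean for every test sample."""
    mean = sum(y_train) / len(y_train)
    return [mean for _ in X_test]


splits = list(k_fold_indices(23, k=5))
```

Each candidate proposed by PSO or GS would be passed through a function such as `cv_rmse`, so that the selected hyperparameters are judged on data not used for fitting in that fold.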

4.2.2. Model performance and comparison

The best prediction results produced by these four ML methods on Dataset B are illustrated in Fig. 14, which compares the measured and predicted settlements graphically. The reference line is again used as the baseline on which the predicted settlement is in complete agreement with the measured one. Again, the MAE and RMSE are used to evaluate the prediction capacity. The correlation between the measured and predicted values is assessed by the Pearson correlation coefficient (R). These three indicators are summarized in Table 7.

Fig. 13. The relative importance of input variables for Dataset A.

Table 6 Specifications on the best parameters of machine learning methods used for Dataset B.

Again, the RF model shows its superiority in predicting the tunnel-induced settlement, as it yields the highest R and the lowest RMSE and MAE. In particular, the RF model shows the best performance in predicting settlements higher than 100 mm, implying its applicability to both small and large settlements. The results predicted by the SVM model are similar to those generated by the BPNN model. However, the SVM model takes much less time and computational cost because it has only three hyperparameters to tune. Like the BPNN and SVM models, the DNN model has a good capability in predicting settlements lower than 60 mm; however, it performs poorly on relatively large settlements. The DNN also takes much more time, as it has three hidden layers containing more neurons.

Predicted results by all four ML methods on the testing dataset are compared with measured settlements (Fig. 15). It can be observed that the RF model outperforms the other three models and yields the smallest overall error, confirming again that the RF model has strong applicability in predicting tunnel-induced settlement. The other three models have similar prediction capabilities, but none of them can predict large settlements accurately.

4.2.3. Sensitivity analysis

The variable importance for Dataset B is shown in Fig. 16, where the geometric characteristic Vs is the main contributing factor to the settlement, with a dominant score of more than 0.4. This is consistent with the study of Zhou et al. (2017) and in agreement with the findings of several empirical models. The other two geometric characteristics, D and Vs/Vt, have a similar influence on the settlement (scoring around 0.17), while the last geometric characteristic, Cd, has a relatively small impact. The operational parameter Cm does not significantly influence the tunnel-induced settlement either, with a score lower than 0.05. In terms of geological properties, Cu and W barely affect the surface settlement; in particular, W contributes least to the model. It is worth noting that the relative importance results show good consistency with the relationship analysis in Section 3.2, indicating the good reliability of the R and MIC indices. As a result, the relationship analysis and the sensitivity results both demonstrate the importance of geometric characteristics.

5. Model development and results

5.1. Comparison with original work

For Dataset A, the accuracy of the prediction models proposed in this work is higher than that of the original work by Zhang et al. (2020b). This improvement comes from the fact that outliers in the original database were detected and removed in this paper through the anomaly detection process. Additionally, although the RF model has the best performance in both the previous and current works, overfitting is much better controlled in this study. Because of the differences in the input data and model establishment, the sensitivity results are also different. In the original work, only three predictors (Cp, Pr, To) contributed less to the prediction, and the other nine input features had very similar effects. In contrast, the sensitivity results of this work show that only two geological properties (Gc and W) are dominant in predicting the MSS, which is less affected by the other ten parameters.

Fig. 14. Best prediction results of four machine learning methods on Dataset B: (a) PSO-SVM, (b) PSO-RF, (c) GS-BPNN, and (d) GS-DNN.

Regarding Dataset B, the original work of Neaupane and Adhikari (2006) utilized a two-hidden-layer BPNN model to predict the MSS, with R = 0.881. This result indicated the potential applicability of neural network-based models to such tasks. They also emphasized the importance of excluding some anomalies. The R value of the original work is higher than that of the NN-based models proposed in this work; however, it is slightly lower than that of the RF model, which equals 0.9. Without outlier removal, Zhou et al. (2017) also developed an RF model using the same database and explored the relative importance of predictors. The RMSE between actual and predicted values for the testing dataset was 47.3 mm in their work, which is much higher than the corresponding value in our study. This result also reveals the importance of anomaly detection. It is worth noting that the sensitivity results of this study are in good agreement with the previous work, emphasizing the pivotal influence of geometric properties.

Table 7 Comparison of prediction results generated by four machine learning methods on Dataset B.

5.2. Comparative study

On both Datasets A and B, the RF model outperforms the other three methods, although the two datasets have different sizes and features. We also note that the RF model can predict large and discrete settlements with less error and at an acceptable time cost, which indicates its wide feasibility. Generally, the SVM and BPNN models have similar performances in this work. However, the SVM model requires fewer parameters to tune, thus reducing the time cost and computational effort significantly. Considering this observation, the SVM model is highly recommended for pre-analysis, where it can serve as a baseline for comparison with other methods. Errors produced by the DNN model are acceptable on both datasets, but this model takes the most time and does not achieve the expected performance. Data quantity may account for this, as both datasets are relatively small, which limits the model's performance (Guo et al., 2019). Similarly, the SVM, BPNN, and DNN models show weaknesses in predicting large settlements, limiting their application to datasets with uneven data distribution.

The sensitivity analysis results from the RF model indicate the complexity and uncertainty involved in geotechnical engineering. For Dataset A, geological properties play an important role in predicting the MSS, while operational, geometric, and additional parameters have less influence. For Dataset B, geometric characteristics show a significant influence on the prediction of the tunneling-induced MSS. Geological properties rank second in relative importance for the MSS prediction, while operational parameters do not noticeably affect the MSS. The difference between Datasets A and B shows that intelligent models need to be adjusted to specific conditions, as they rely heavily on local data.

Fig. 16. The relative importance of input variables for Dataset B.

Furthermore, considering both Datasets A and B, each ML model utilized in this work performs better than those in the previous works. This result indicates that model performance is significantly affected by the quantity, quality, and pre-processing of the data.

Fig. 15. Prediction results of testing dataset of Dataset B using four machine learning methods: (a) PSO-SVM, (b) PSO-RF, (c) GS-BPNN, and (d) GS-DNN.

6. Conclusions

This paper utilizes four different ML methods to estimate the MSS caused by tunnel construction. We investigate the relative effects of major influential factors, including geological properties, operational parameters, and geometric characteristics. The developed models are tested and compared by leveraging two datasets collected from previous studies. Our main findings are summarized as follows:

(1) The RF method performs best in capturing complex nonlinear relationships between the ground surface settlement and the combined effect of triggering factors. Considering its applicability range and acceptable computational cost, this method could be one of the best options for predicting tunneling-induced settlement under similar engineering conditions.

(2) The SVM method achieves a balance between prediction accuracy and computational cost, so it can be used for preliminary analysis and as a baseline for other methods.

(3) The BPNN and DNN models take the most time for construction and computation, while the results are relatively poor. However, their performance for other datasets may be improved with more samples, as data quantity and quality can significantly affect ML models.

(4) The sensitivity analysis shows that influential factors contribute differently to the settlement, indicating the complexity and uncertainty of tunneling. Moreover, the good consistency between relationship analysis and sensitivity results reveals the necessity of comprehensively analyzing linear and nonlinear relationships.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to acknowledge researchers for publishing their data, which were used in this work. The research presented was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant (Grant No. RGPIN-2019-06471) and the McMaster University Engineering Life Event Fund. This support is gratefully acknowledged.
