Artificial neural network optimized by differential evolution for predicting diameters of jet grouted columns

2021-12-24 02:49:30 Pierre Guy Atangana Njock, Shui-Long Shen, Annan Zhou, Giuseppe Modoni

Journal of Rock Mechanics and Geotechnical Engineering, 2021, Issue 6

Pierre Guy Atangana Njock a, Shui-Long Shen a,b,*, Annan Zhou c, Giuseppe Modoni d

a Department of Civil and Environmental Engineering, College of Engineering, Shantou University, Shantou, 515063, China

b MOE Key Laboratory of Intelligent Manufacturing Technology, College of Engineering, Shantou University, Shantou, 515063, China

c Discipline of Civil and Infrastructure, School of Engineering, Royal Melbourne Institute of Technology (RMIT), Melbourne, 3001, Australia

d Department of Civil and Mechanical Engineering, University of Cassino and Southern Lazio, Cassino, 03043, Italy

Keywords: Artificial neural network (ANN); Differential evolution (DE); Jet grouting; Model optimization; Regularization

ABSTRACT A novel and effective artificial neural network (ANN) optimized using differential evolution (DE) is introduced to provide robust and reliable forecasting of jet grouted column diameters. The proposed computational method adopts the DE algorithm to tackle the difficulties in the training and performance of neural networks and to optimize the four essential hyper-parameters (i.e. the epoch size, the number of neurons in a hidden layer, the number of hidden layers, and the regularization parameter) that govern the neural network efficacy. This approach is further enhanced by a stochastic gradient optimization algorithm to accommodate ‘expensive’ computation efforts. The ANN-DE is first trained using a prepared jet grouting dataset, then verified and compared with prevalent machine learning tools, i.e. neural networks and support vector machine (SVM). The results show that the ANN-DE outperforms the existing methods for predicting the diameter of jet grouting columns since it balances training efficiency and model performance well. Specifically, the ANN-DE achieved correlation coefficient values of 0.90603 and 0.92813 for the training and testing phases, respectively. The corresponding values were 0.8905 and 0.9006 for the optimized ANN, and 0.87569 and 0.89968 for the optimized SVM. The proposed paradigm is expected to be useful for solving various geotechnical engineering problems regardless of multi-dimension and nonlinearity.

1. Introduction

Jet grouting is a ground treatment technique based on the high-speed jetting of a fluid into the soil body to create rigid soil-cement columns (Liu et al., 2018; Croce et al., 2014; Arroyo et al., 2012). Depending on whether the jet grouting operation takes place in sandy or clayey soils, dissimilar column diameters can be obtained throughout the construction process, as illustrated in Fig. 1. Many studies have been dedicated to this variability, and various models have been developed for forecasting the diameter of jet grouted columns (Ribeiro and Cardoso, 2017; Atangana Njock et al., 2018). These methods can generally be grouped into two main categories: (i) analytical and semi-theoretical models (Croce and Flora, 2000; Modoni et al., 2006; Flora et al., 2013; Shen et al., 2013; Wang et al., 2020a), and (ii) artificial intelligence-based models, which have achieved great success in geotechnical engineering in the past few years (Jin et al., 2020; Jin and Yin, 2020; Shahrour and Zhang, 2020; Zhang et al., 2020a,b,c,d,e,f). Specifically, the machine learning-based models have been developed in response to some critical drawbacks of the analytical approaches. For example, observed engineering phenomena such as the formation of jet grouted columns may not be fully predicted with analytical solutions because of a mismatch between the complexity of the governing mechanisms (e.g. soil disruption) and the unavoidably simplified schematization (e.g. time dependency). These constraints may prevent the analytical models from capturing the role of relevant factors, leading to inaccurate solutions. In contrast, machine learning models trained in a real-world context are more flexible (Zhang et al., 2013, 2016, 2019a,b; Wei et al., 2021) and thus more suitable for predicting the properties of jet grouted columns (e.g. diameter).

Fig. 1. Column diameter obtained by Rodin jet pile method in (a) sandy soil and (b) clayey soil (after Shen et al., 2006).

Numerous machine learning-based models have been developed to predict the diameter of jet grouted columns. In particular, the artificial neural network (ANN) and support vector machine (SVM), which are among the most popular intelligent tools to date, were adopted by Ochmański et al. (2015) and Tinoco et al. (2016), respectively, for forecasting the diameter of jet grouted columns based on soil and treatment characteristics. Generally speaking, the predictions of these approaches are reasonable, despite some hurdles encountered during the execution process. These hurdles stem essentially from the intrinsic architecture and operational functions of the ANN and SVM algorithms (Gori and Tesi, 1992; Goodfellow et al., 2016; Salimans and Kingma, 2016). They require intricate training processes for defining optimal neural network topologies and/or tuning the algorithm parameters. Besides, these models are vulnerable to high variance and solution instability over consecutive runs. In such conditions, the machine learning models fail to generalize well from the training data to other data from the problem domain, resulting in inaccurate predictions. There have been various attempts to solve these issues and optimize the solutions, many of which relied principally on metaheuristic methods (e.g. particle swarm optimization, simulated annealing, genetic algorithm). Consequently, the optimal solutions are achieved only if the parameters of these metaheuristic approaches are chosen on a sound basis, which is relatively arduous (Xue and Xiao, 2016; Xue and Liu, 2017). In this case, even if these models can ensure the accuracy of the solution, they are still prone to solution instability over consecutive runs. In other words, the implementation efficiency and accuracy of machine learning methods are two major problems yet to be adequately resolved for predicting the diameter of jet grouted columns (Zhang et al., 2020g,h; Shen et al., 2021). The need for an adequate prediction tool is made more pressing by the popularity of machine learning models in geotechnical engineering, a field that stands to benefit considerably from the proposed development.

This paper proposes a strategy to guarantee the efficiency of the ANN’s training process while ensuring the stability and accuracy of the solution. Fundamentally, the proposed strategy employs a differential evolution (DE) algorithm enhanced by a stochastic optimization process. The objective is to decode the neural network into the parameters that control its performance and to optimize them to obtain a better result. Specifically, DE is used to optimize the parameters by iteratively testing candidate solutions, whereas a stochastic optimizer is adopted for efficiently adjusting the learning rate and reducing the computation time of the model. The core of the problem in conventional machine learning methods is first reviewed. A detailed description of the proposed ANN model optimized using the DE algorithm (ANN-DE) is then presented. Finally, the proposed model is verified via the prediction of jet grouted column diameters.

2. Crux of the problem in ANN

The SVM and ANN are the leading machine learning techniques for predicting the variability in the diameter of jet grouted columns. In view of this paper’s objective, the literature review in this section focuses only on the ANN and its issues. The ANN is a data processing system that tries to imitate the functioning of the human brain (McCulloch and Pitts, 1943). Its structure is usually composed of interconnected neurons that interact to provide specific outputs based on given inputs (see Fig. 2). When moving forward through the network, each neuron in the next layer is computed by multiplying the inputs it receives (from several other neurons) by the assigned weights w_ij and passing the result through an activation function. In addition, a bias b_ij is used to delay the triggering of the activation function. This operation is carried forward until an output is obtained (Jain et al., 1996). Subsequently, the precision of this predicted output with respect to the target output p is measured using a cost function C (i.e. the logcosh function) as

Based on the first result, a backward pass is carried out to adjust the weights and biases, and thus to optimize the cost function. Mathematically, the weight and bias updating processes can be expressed by (Lippman, 1987):

Fig. 2. Schematic illustration of a typical ANN architecture (recreated based on Werbos, 1988).

where δ stands for the learning rate (Haykin, 1999; Goodfellow et al., 2016). Considering this arrangement, the neural network is continuously trained with new observations from the dataset, while the optimization of the cost function is pursued.
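For orientation, the logcosh cost and the gradient-descent update described above are commonly written as follows. This is a sketch in standard notation: the predicted output \hat{p}_k, the sample index k and the sample count n are symbols assumed here, and the normalization may differ from that of Eqs. (1) and (2).

```latex
C = \sum_{k=1}^{n} \log\!\left(\cosh\left(\hat{p}_k - p_k\right)\right)

w_{ij} \leftarrow w_{ij} - \delta\,\frac{\partial C}{\partial w_{ij}},
\qquad
b_{ij} \leftarrow b_{ij} - \delta\,\frac{\partial C}{\partial b_{ij}}
```

For small residuals the logcosh term behaves like half the squared error, and for large residuals it grows roughly linearly, which is why it is less sensitive to outliers than a purely quadratic cost.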

The design of ANN is a delicate process that has a significant effect on the model performance. Importantly, it has been widely acknowledged that there are two factors that affect how well a model performs, i.e. model selection and model variance.

(1) Model selection

The above-mentioned ‘design’ aims at delineating the optimum topology of the neural network (i.e. the number of hidden neurons and the number of hidden layers). This procedure is a highly challenging task, which mostly involves a thorough training process. Empirical evidence has shown that a large majority of problems can be forecasted using one hidden layer (Min and Lee, 2005). However, multi-dimensional problems with strong nonlinearity (Zhang et al., 2012, 2020i,j,k; Goh et al., 2017, 2018, 2020) may require more hidden layers. Furthermore, since the hidden neurons are the core components of the neural network architecture, the number of neurons that compose a hidden layer is essential for the network performance. Indeed, choosing a large number of neurons may increase the computation effort, whereas choosing too few neurons can reduce the learning ability of the model (Larose, 2005). Although some rules of thumb have been proposed to address this issue (Alibakshi, 2018; Alwosheel et al., 2018), improvements are still needed.

(2) Model variance

The model variance is a critical ANN concern that is generally associated with the timespan of the training process on the one hand (Prechelt, 1998; Srivastava et al., 2014), and the number of dataset features on the other (Hawkins, 2004; Alwosheel et al., 2018). Insufficient training may cause the model to underfit the dataset (low variance), whereas too much training may lead to overfitting (high variance). In either case, the generalization ability of the neural network will be significantly reduced. In addition, omitting some informative features during the training process is also likely to degrade the model performance. A number of methods have been proposed for alleviating the issue of overfitting. Among them, the regularization and early stopping techniques are the most widely used in the field of machine learning. The regularization technique was mainly developed to alleviate the issue related to the number of features and noise in the dataset. This technique consists of adding a regularization term to the cost function to control the variance of the parameters. By using the popular “L2 regularization” scheme (Bisong, 2019), Eqs. (1) and (2) can be rewritten as

where Ψ represents the degree of “L2 regularization”. The other parameters were discussed when introducing Eqs. (1) and (2). Another way to address the overfitting problem is the early stopping approach. This technique consists of training the model on the ‘training set’ and stopping at the point where the performance on the ‘validation set’ starts degrading (e.g. accuracy begins to decrease or loss begins to increase) (Goodfellow et al., 2016). Moreover, recent studies (Hawkins, 2004; Ying, 2019) have shown that controlling the number of training epochs can be highly effective against the overfitting problem. Basically, this strategy involves training the model multiple times and then selecting the number of epochs with the best performance.
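A common way of writing the L2-regularized cost and the corresponding weight update is sketched below, reusing C, w_ij, δ and Ψ from the text. The factor 1/2 and the choice to leave the biases unregularized are assumptions of this sketch and may differ from Eqs. (3) and (4).

```latex
C_{\mathrm{reg}} = C + \frac{\Psi}{2} \sum_{i,j} w_{ij}^{2}

w_{ij} \leftarrow w_{ij} - \delta \left( \frac{\partial C}{\partial w_{ij}} + \Psi\, w_{ij} \right)
```

Under this form, each update shrinks every weight in proportion to Ψ, so a larger Ψ penalizes large weights more strongly.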

Based on the previous observations, it is clear that the performance and training efficiency of an ANN model depend largely upon four parameters, i.e. the number of neurons, the number of hidden layers, the regularization term, and the epoch size. In fact, while the number of hidden layers can provide a reliable mapping of the problem nonlinearity, the network performance can be maximized by properly selecting the number of hidden layer neurons. Additionally, the overfitting problem can be efficiently addressed by using a ‘regularizer’ (explicit regularization) or by setting the epoch size as a hyper-parameter and finding its best value (implicit regularization). The fundamental concept of the proposed approach, i.e. ANN-DE, is to search for the optimum values of the hyper-parameters that are believed to underpin the structure of the ANN. The DE algorithm has the particularly appealing ability to optimize the problem by iteratively testing and improving a candidate solution. Specifically, its vectors can be idealized as a “cloud” that effectively explores the high-value areas of the solution space. This procedure is intended to improve prediction while preserving the model’s simplicity. Because the optimization of these parameters is computationally demanding, a stochastic optimization algorithm is utilized to make the model training easier.

3. Formulation

3.1. DE paradigm

This research adopts the DE model proposed by Storn and Price (1997), which has been acknowledged as a powerful minimizer of stochastic functions in many research fields (Abraham et al., 2006; Pei et al., 2009; Ishaque et al., 2012). In general, the DE model involves two main steps: (i) population initialization, and (ii) hyper-parameter tuning. The first step is based upon an individual x (parameter vector) defined in Eq. (5). The lower and upper bounds are first suitably delineated, and then the initial parameter values are randomly selected within the bounds. This concept was employed here to define the optimum hyper-parameters, and thus to obtain the ideal neural network architecture. In our model, the following four hyper-parameters were regarded as characterizing an individual x: the epoch size Ω, the number of neurons in a hidden layer Φ, the number of hidden layers Γ, and the regularization parameter Ψ. Parameter Ψ is adopted as a safeguard against overfitting (see Eqs. (3) and (4)).

where D is the number of parameters to be optimized, N is the size of the population (no less than 4), and G represents the total number of generations. More details on the initialization process can be found in Storn and Price (1997). Subsequently, the hyper-parameters were optimized using the DE operations advocated by Storn and Price (1997). Each of the N parameter vectors undergoes mutation, recombination, and selection.
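The initialization step described above can be sketched as follows. This is a minimal illustration, assuming uniform sampling within the bounds; the bounds are the ranges quoted in Section 4, while the dictionary keys, the function name `init_population` and the seed are hypothetical. Integer-valued parameters are kept as floats here and would need rounding before a network is built.

```python
import random

# Search bounds for the four hyper-parameters (ranges quoted in Section 4).
# Keys are illustrative names: epochs = Ω, neurons = Φ, layers = Γ, l2 = Ψ.
BOUNDS = {
    "epochs": (1000, 5000),
    "neurons": (1, 10),
    "layers": (1, 5),
    "l2": (0.0, 0.4),
}


def init_population(n_individuals, bounds=BOUNDS, seed=None):
    """Draw every parameter of every individual uniformly within its bounds."""
    rng = random.Random(seed)
    population = []
    for _ in range(n_individuals):
        individual = [rng.uniform(low, high) for (low, high) in bounds.values()]
        population.append(individual)
    return population


# N = 50 individuals, each with D = 4 parameters.
population = init_population(50, seed=0)
```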

(1) Mutation

This operation is undertaken to expand the search space. For a given vector x_{i,G}, it consists of randomly selecting three vectors x_{r1,G}, x_{r2,G}, and x_{r3,G} such that the indices r1, r2, and r3 are distinct. Then, the weighted difference of two of the vectors is added to the third. This operation is defined as (Storn and Price, 1997):

where v_{i,G+1} is the donor vector and F is the mutation factor (F ∈ [0, 2]).
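The mutation rule stated above (the weighted difference of two vectors added to a third) can be sketched as below. The function name `mutate`, the default F = 0.8 and the toy population are assumptions for illustration; the indices are also kept different from i, as in the classic scheme of Storn and Price (1997).

```python
import random


def mutate(population, i, F=0.8, rng=random):
    """Return the donor vector for target index i: x_r1 + F * (x_r2 - x_r3)."""
    candidates = [k for k in range(len(population)) if k != i]
    r1, r2, r3 = rng.sample(candidates, 3)  # three distinct indices, none equal to i
    x1, x2, x3 = population[r1], population[r2], population[r3]
    return [a + F * (b - c) for a, b, c in zip(x1, x2, x3)]


# Toy population of four 2-dimensional vectors (the minimum size for this scheme).
toy_population = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
donor = mutate(toy_population, 0, F=0.5, rng=random.Random(1))
```

Because three indices other than i are needed, the population must contain at least four vectors, which is consistent with the lower limit on N given earlier.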

(2) Recombination

This procedure incorporates successful solutions obtained from the previous generation. The trial vector u_{i,G+1} is built using the elements of both the target vector x_{i,G} and the donor vector v_{i,G+1}. It is assumed (Storn and Price, 1997) that elements of the donor vector enter the trial vector with a probability CR as

where rand_{j,i} ~ U[0, 1], and I_rand is a random integer within the range [1, 2, …, D], which ensures that v_{i,G+1} ≠ x_{i,G}.
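A minimal sketch of this recombination step is given below, assuming the usual binomial crossover of Storn and Price (1997): each element is taken from the donor with probability CR, and one randomly chosen position always comes from the donor. The function name `recombine` and the default CR = 0.9 are hypothetical.

```python
import random


def recombine(target, donor, CR=0.9, rng=random):
    """Binomial crossover: build the trial vector from the target and donor vectors."""
    D = len(target)
    i_rand = rng.randrange(D)  # 0-based position that is always copied from the donor
    trial = []
    for j in range(D):
        if rng.random() <= CR or j == i_rand:
            trial.append(donor[j])
        else:
            trial.append(target[j])
    return trial


target_vector = [0.0, 0.0, 0.0, 0.0]
donor_vector = [1.0, 1.0, 1.0, 1.0]
trial_vector = recombine(target_vector, donor_vector, CR=0.5, rng=random.Random(0))
```

The forced position guarantees that the trial vector inherits at least one element from the donor, even when CR is very small.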

(3) Selection

In this process, the target vector x_{i,G} is compared with the trial vector u_{i,G+1}. Then the vector corresponding to the lowest function value is selected for the next generation by (Storn and Price, 1997):
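The selection rule can be sketched in a few lines. The function name `select` and the sphere test function are illustrative assumptions, and ties are resolved in favour of the trial vector here, which is one common convention.

```python
def select(target, trial, fitness):
    """Greedy selection: keep the trial vector only if it is no worse than the target."""
    return trial if fitness(trial) <= fitness(target) else target


def sphere(vector):
    """Toy objective used only for illustration (lower is better)."""
    return sum(x * x for x in vector)


survivor = select([1.0, 1.0], [0.5, 0.5], sphere)
```

Since a vector is replaced only by one with an equal or lower function value, the best value in the population can never get worse from one generation to the next.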

3.2. Stochastic gradient optimization algorithm

A downside of the implicit regularization strategy used in this study is its low computation efficiency. To overcome this disadvantage and fully exploit this regularization approach, the optimization algorithm Nadam (Dozat, 2016) was adopted in this study. Nadam is a powerful and fast learning algorithm that integrates the Nesterov momentum (Nesterov, 1983) into Adam (Kingma and Ba, 2015). This method was combined with the above-discussed DE algorithm to improve the quality of the training process. More specifically, the Nadam algorithm was utilized to prevent the adaptive learning rate from becoming very small over time. The implementation steps of the Nadam method are presented below (Dozat, 2016).

(1) Step 1: Initialize the parameters of the algorithm as follows: υ = 0.999, ε = 1 × 10^-8, μ = 0.99, and μ_t = μ(1 - 0.5 × 0.96^(t/250)). The step size η was taken equal to 0.002.

(2) Step 2: Update the gradient by

(5) Step 5: Update the parameters
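As an illustration of how these steps fit together, the sketch below applies a Nadam-style update to a scalar parameter. It follows one widely used formulation of Dozat's (2016) algorithm with the constants of Step 1 (υ is written `nu`); the intermediate quantities, the function name `nadam_minimize` and the toy objective are assumptions, and the exact expressions of Steps 2 to 5 may differ in detail.

```python
import math


def nadam_minimize(grad, theta, steps, eta=0.002, mu=0.99, nu=0.999, eps=1e-8):
    """Minimize a scalar function with a Nadam-style update, given its gradient."""
    m = 0.0           # first-moment (momentum) estimate
    n = 0.0           # second-moment estimate
    mu_product = 1.0  # running product mu_1 * ... * mu_t of the momentum schedule
    for t in range(1, steps + 1):
        mu_t = mu * (1.0 - 0.5 * 0.96 ** (t / 250.0))
        mu_next = mu * (1.0 - 0.5 * 0.96 ** ((t + 1) / 250.0))
        mu_product *= mu_t
        g = grad(theta)                                 # current gradient
        g_hat = g / (1.0 - mu_product)                  # bias-corrected gradient
        m = mu * m + (1.0 - mu) * g                     # momentum update
        m_hat = m / (1.0 - mu_product * mu_next)        # bias-corrected momentum
        n = nu * n + (1.0 - nu) * g * g                 # squared-gradient average
        n_hat = n / (1.0 - nu ** t)                     # bias-corrected average
        m_bar = (1.0 - mu_t) * g_hat + mu_next * m_hat  # Nesterov-style look-ahead
        theta -= eta * m_bar / (math.sqrt(n_hat) + eps)
    return theta


# Toy usage: minimize f(x) = x^2 (gradient 2x), starting from x = 1.
x_final = nadam_minimize(lambda x: 2.0 * x, theta=1.0, steps=100)
```

Each step moves the parameter by an amount of the order of η regardless of the raw gradient magnitude, which is the behaviour that keeps the effective learning rate from collapsing.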

Fig. 3. Flowchart of the proposed ANN-DE model.

4. Implementation procedure

The execution sequence of the proposed ANN-DE is illustrated in Fig. 3 and summarized below. In general, after careful processing of the data, the DE process is initialized. Then the mutation and recombination operations are successively applied to each individual, and the result is transferred to the back propagation algorithm. At this stage, the back propagation process is performed. In other words, for each individual decoded into the ANN-DE hyper-parameters (Ω, Φ, Γ, and Ψ), the model is iteratively trained and checked to see whether or not it fulfills the stopping conditions (maximum number of generations and/or convergence criterion). If it does, the ideal network architecture is extracted; otherwise, the results are passed to a new generation to undergo the same cycle. The logcosh function was used as the cost function in this study owing to its ability to provide a certain degree of safety against outliers that are likely to cause incorrect predictions.

Pseudo-code of the proposed ANN-DE model is as follows:

(1) Phase 1: Preprocess the data and initialize the network;

(2) Phase 2: Perform the mutation and recombination operations;

(3) Phase 3: Transmit the hyper-parameters (Ψ (regularization parameter), Φ (number of neurons in a hidden layer), Γ (number of hidden layers), and Ω (epoch size)) to the backpropagation algorithm;

(4) Phase 4: Convert losses of network to fitness;

(5) Phase 5: Carry out the selection process (update the population);

(6) Phase 6: Verify the stopping criteria;

(7) Phase 7: Extract the optimum solution (most accurate network);

(8) Phase 8: End.
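The phases listed above can be put together in a compact sketch. To keep it self-contained, network training is replaced by a hypothetical `toy_loss` function; the values F = 0.8 and CR = 0.9, the small population and generation budget, the clipping of donors to the bounds and the fixed-budget stopping rule are all assumptions of this sketch rather than settings reported in the text.

```python
import random

# Hyper-parameter bounds quoted in the text: Ω (epochs), Φ (neurons), Γ (layers), Ψ (L2 term).
BOUNDS = [(1000, 5000), (1, 10), (1, 5), (0.0, 0.4)]


def toy_loss(individual):
    """Hypothetical stand-in for 'train the ANN and return its validation loss'."""
    arbitrary_optimum = [0.5, 0.3, 0.25, 0.35]  # in normalized [0, 1] coordinates
    total = 0.0
    for value, (low, high), best in zip(individual, BOUNDS, arbitrary_optimum):
        total += ((value - low) / (high - low) - best) ** 2
    return total


def clip(vector):
    """Keep every parameter inside its bounds."""
    return [min(max(v, low), high) for v, (low, high) in zip(vector, BOUNDS)]


def ann_de(loss, pop_size=20, generations=30, F=0.8, CR=0.9, seed=0):
    rng = random.Random(seed)
    # Phase 1: initialize the population within the bounds.
    pop = [[rng.uniform(low, high) for low, high in BOUNDS] for _ in range(pop_size)]
    fit = [loss(ind) for ind in pop]
    history = [min(fit)]
    for _ in range(generations):
        new_pop, new_fit = list(pop), list(fit)
        for i in range(pop_size):
            # Phase 2: mutation and recombination.
            r1, r2, r3 = rng.sample([k for k in range(pop_size) if k != i], 3)
            donor = clip([a + F * (b - c) for a, b, c in zip(pop[r1], pop[r2], pop[r3])])
            j_rand = rng.randrange(len(BOUNDS))
            trial = [d if (rng.random() <= CR or j == j_rand) else x
                     for j, (x, d) in enumerate(zip(pop[i], donor))]
            # Phases 3-4: evaluate the candidate (the toy loss replaces training).
            trial_fit = loss(trial)
            # Phase 5: selection.
            if trial_fit <= fit[i]:
                new_pop[i], new_fit[i] = trial, trial_fit
        pop, fit = new_pop, new_fit
        # Phase 6: a fixed generation budget is the only stopping rule in this sketch.
        history.append(min(fit))
    # Phase 7: extract the best individual.
    best = min(range(pop_size), key=lambda k: fit[k])
    return pop[best], fit[best], history


best_individual, best_loss, history = ann_de(toy_loss)
```

In an actual run, `toy_loss` would be replaced by a routine that builds a network from the four hyper-parameters, trains it and returns its loss, which is where nearly all of the computation time would be spent.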

The initial parameters for ANN-DE adopted in this case are summarized as follows. For the DE algorithm, the population size and the number of generations are set to 50 and 100, respectively. Regarding the neural network, the number of hidden layers (Γ) is in the range of [1, 5], the number of neurons in a hidden layer (Φ) is in the range of [1, 10], the regularization parameter (Ψ) is in the range of [0, 0.4], and the epoch size (Ω) is in the range of [1000, 5000].

Table 1 Descriptive statistics of the jet grouting dataset.

Fig. 4. Model selection process of optimized ANN: Variations of (a) correlation coefficient and (b) RMSE with the number of neurons for different data fractions.

5. Application

The ANN-DE model developed in the previous section was applied to the prediction of jet grouted column diameter. Moreover, to provide a robust and reliable basis of evaluation for the proposed method, it was compared against previous models (ANN, SVM, and theoretical approaches), which were trained using a large jet grouting database. Therefore, we discuss here the data collection and normalization procedures, as well as the selection of baseline evaluation models.

5.1. Data collection and normalization

Fig. 5. Model selection process of optimized SVM.

Table 2 Statistics of the extracted generation.

Two hundred and nine field records of single, double, and triple jet grouting systems were used to validate the aforementioned models. This large database was carefully compiled from the relevant literature; it can be found in Atangana et al. (2021). The core mechanism underpinning the jet grouting method (i.e. jet-soil interaction) was captured through the consideration of four characteristic field parameters. Specifically, following Ochmański et al. (2015), the jetting effect was considered in terms of the specific energy at the nozzle E′_n, as well as the jet grouting variations (single, double, and triple systems). Indeed, E′_n represents the kinetic energy produced at the nozzle per unit length of column (Flora et al., 2013). On the other hand, the soil type (which can be coarse without fine, coarse with fine, or fine soils) and the standard penetration test value of the soil (N_spt) were used to characterize the soil (Zhang et al., 2017; Xiang et al., 2018; Li and Zhang, 2020; Wang et al., 2020b). In this regard, it should be noted that in some references, the strength properties of the soil were given in terms of the cone penetration test (CPT), and thus the empirical correlation proposed by Robertson et al. (1983) was adopted to carry out the conversion from CPT to N_spt. Table 1 gives the descriptive statistics of the jet grouting database used in this study. Moreover, given the qualitative nature of the “jet grouting variations” and “soil type”, values from 1 to 3 were used to characterize these two parameters, as shown in Table 1.

The values of the input and output data were normalized to the range [0, 1] before both the training and testing procedures. This normalization was performed to avoid instabilities that could affect the learning ability of the model. In fact, this procedure helps to avoid correlation among input data and to achieve faster convergence during the training process. It can be expressed as

where X_n represents the normalized data; X is the original data sample; and X_max and X_min are the maximum and minimum values of X, respectively.
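With the symbols defined above, scaling to [0, 1] corresponds to the usual min-max form X_n = (X - X_min) / (X_max - X_min). A minimal sketch follows; the function name and the sample values are illustrative only.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] using X_n = (X - X_min) / (X_max - X_min)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(x - x_min) / (x_max - x_min) for x in values]


# Hypothetical column diameters in metres.
normalized = min_max_normalize([1.0, 2.0, 3.0])
```

The minimum and maximum would normally be computed once and reused, so that predictions can be mapped back to physical units by inverting the same expression.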

5.2. Benchmark models

Together with the theoretical approach, the optimized schemes of ANN and SVM are also considered for comparison. This section demonstrates the relatively laborious processes required for extracting the best solutions from the ANN and SVM methods, which are compared to the proposed strategy in terms of performance and model selection efficiency.

(1) Optimized ANN model

The performance of a neural network model is greatly reliant on its architecture and learning rule. Because it is difficult to define a priori a neural network with good generalization ability, different configurations were defined by varying the proportions of training-testing data as well as the number of hidden layer neurons. In particular, a maximum of 10 neurons was considered, following the rule of thumb 2(N+1), where N is the number of variables. Besides, training-testing fractions of 90-10, 80-20, 70-30, 60-40, and 50-50 (in percent) were utilized. Hence, to define the best network architecture, it was necessary to train approximately 50 different configurations. The training strategy used in this case consisted of updating the weights and biases according to the Levenberg-Marquardt optimization (Marquardt, 1963; More, 1978). This method combines the advantages of the gradient descent and Gauss-Newton algorithms. As can be seen from Fig. 4, the best performance of the optimized ANN was achieved with 2 hidden neurons and a data fraction of 80-20. In fact, this model presents the highest coefficient of determination (R² = 0.9) and the smallest root mean square error (RMSE = 0.2129). Moreover, compared with the other models, it tends to provide rather stable solutions.
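The count of roughly 50 configurations follows directly from the grid described above: five training-testing fractions times up to ten hidden neurons, with N = 4 input variables in the rule of thumb 2(N+1). A short enumeration, using hypothetical dictionary keys, makes this explicit.

```python
from itertools import product

n_inputs = 4                            # four input variables used in this study
max_neurons = 2 * (n_inputs + 1)        # rule of thumb 2(N+1)
train_fractions = [90, 80, 70, 60, 50]  # training share in percent; testing is the rest

configurations = [
    {"train_pct": train, "test_pct": 100 - train, "hidden_neurons": neurons}
    for train, neurons in product(train_fractions, range(1, max_neurons + 1))
]
```

Every entry in this list corresponds to one network that has to be trained and evaluated, which is the manual effort the model selection in Fig. 4 represents.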

Table 3 Statistics of the proposed model’s efficacy.

Fig. 6. Evolutionary curve of the ANN-DE.

Fig. 7. Comparison between the proposed ANN-DE and optimized SVM models: (a) Training and (b) Testing.

(2) Optimized SVM model

As mentioned earlier, the performance of SVM models (Tinoco et al., 2014, 2016) depends on the choice of the regularization and kernel parameters. In this study, a radial Gaussian kernel was adopted because of its prominent ability to deal with highly nonlinear problems. Adopting the cross validation (CV) technique, the Gaussian kernel parameters C (regularization parameter) and γ (kernel scale) were chosen through the following steps. The training data were first split into 10 separate folds. Then, the CV error was computed using different values of C and γ. The C and γ with the lowest CV error were selected and used for training the SVM on the complete dataset. Furthermore, the Bayesian optimization method was adopted to optimize the selection of C and γ. This technique has the advantage of tolerating stochastic noise in function evaluations (Pelikan, 2005; Snoek et al., 2012). The parameter ε is used to regulate the width of the ε-insensitive zone, which is utilized for fitting the training data. That is, larger values of ε bring about more ‘flat’ estimates. Also, the parameter C fixes the compromise between the degree of tolerance to deviations larger than ε in the optimization formulation and the model complexity. Specifically, we adopted a quasi-infinite loop to continuously optimize all the hyper-parameters using the Bayesian optimization scheme. A great many combinations of hyper-parameters were tested over numerous iterations, based on which the best point (corresponding to the best possible solution) can be extracted. As shown in Fig. 5, the best SVM model was obtained for epsilon (ε = 0.0012454), kernel scale (γ = 1.7558), and box constraint (C = 10.869).
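The fold-splitting step of the CV procedure can be sketched as follows. This only illustrates how 10 training/validation splits might be formed; the function names, the shuffling seed and the sample count of 100 are hypothetical, and the SVM fitting and Bayesian search themselves are not reproduced.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cv_splits(n_samples, k=10, seed=0):
    """Yield one (train_indices, validation_indices) pair per fold."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


splits = list(cv_splits(n_samples=100, k=10))
```

For each candidate pair (C, γ), the model would be fitted on every training part and scored on the matching validation part, and the average of the ten scores would serve as the CV error.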

(3) Theoretical approach

The model adopted as the baseline comparison for the proposed ANN-DE is the one established by Flora et al. (2013). The reason is that, unlike other theoretical models, this approach embodies the four variables defined previously. It builds on the jet-soil interaction model established by Modoni et al. (2006) and provides simple and powerful formulae calibrated on empirical evidence. Specifically, this model relates the diameter of jet grouted columns to both the specific energy at the nozzle E′_n and the soil properties (soil type and strength characteristics). The relevant formulae are given by Flora et al. (2013):

Fig. 8. Comparison between the proposed ANN-DE and optimized ANN models: (a)Training and (b) Testing.

Fig. 9. Comparison between the proposed ANN-DE model and theoretical approach by Flora et al. (2013).

where D_ref is the reference diameter obtained for the single fluid system; α_E quantifies the interaction between the jet and the surrounding fluid (either air or grout spoil); Λ* is a parameter that reflects the composition of the injected fluid (either grout or water); and q_c and N_spt are the well-known cone penetration test and standard penetration test values, respectively. Finally, β = 0.2 and δ = -0.25 are two other constants.

5.3. Comparison and discussion

In this section, the capability of the proposed method to predict the diameter of jet grouted columns is assessed and compared with that of the benchmark models defined in the previous section. RMSE was used as the principal metric for measuring the performance of the different models.
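For reference, the two metrics used in the comparison, RMSE and the correlation coefficient, can be computed as sketched below. The function names and the sample values are illustrative; the correlation is taken here to be the Pearson coefficient between predicted and observed values, which is an assumption.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def correlation(predicted, observed):
    """Pearson correlation coefficient between predicted and observed values."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)


# Hypothetical normalized diameters: perfect agreement gives RMSE 0 and correlation 1.
perfect_rmse = rmse([0.2, 0.5, 0.9], [0.2, 0.5, 0.9])
perfect_r = correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

A lower RMSE and a correlation closer to 1 both indicate better agreement, so the two metrics move in opposite directions as a model improves.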

(1) Efficiency of the implementation

Considering the model structure selections performed in the previous section, the selection process remains quite intricate for both the optimized ANN and SVM models. Using these methods, it was found that different results are obtained from different runs (with an accordingly high variance). Besides, it was nearly impossible to guarantee the stability of the solutions (which is difficult to predict or control) of the different ANN schemes (Fig. 4). This aspect is critical because, if the wrong ANN scheme is chosen, suboptimal final solutions are more likely to be obtained. That is to say, even when optimized, these methods still lack efficiency (model selection) and robustness (stability and accuracy of results). In contrast, owing to the strategy adopted in this study, i.e. training the ANN-DE to extract the best architecture, the optimum ANN-DE model was obtained without delay by simply selecting the best individual (best performance) from the generations. Table 2 gives the statistics of generation No. 11, from which the best hyper-parameters were extracted. Specifically, the best individual corresponds to a number of epochs Ω = 3078, regularization parameter Ψ = 0.14153, number of neurons in a hidden layer Φ = 4, and number of hidden layers Γ = 2. Moreover, the statistical results shown in Table 3 further reveal the significance of the proposed model. These statistics were obtained by successively varying the population size from 10 to 50 with an increment of 10 (i.e. 5 cases), and running each case 20 times. Through this calibration process, it was observed that the accuracy of the proposed model increased very slightly with the population size, but this effect became less noticeable for high values of population size. On the other hand, the population size had a more obvious impact on the extracted generation, i.e. the likelihood of obtaining the solution at an early generation. Therefore, considering the computation cost and model efficiency, it is believed that a population size within the interval [20, 30] is optimal for this model. These results indicate that, with only a trivial calibration of the initial population size of the DE algorithm, the ANN-DE performed rather steadily over consecutive runs, supplied higher accuracy and was less prone to overfitting. These properties are crucial and noteworthy, as they demonstrate its robustness and efficiency. Besides, Fig. 6 illustrates the significant performance gain in terms of training speed. The model converges rapidly thanks to the Nadam optimizer, enabling the production of satisfactory solutions at an early stage.

Fig. 10. Comparison of performance between the proposed approach and other machine learning methods.

To sum up, in addition to having the properties of the conventional shallow neural network (ANN), the proposed ANN-DE possesses the capability of a deep neural network in the sense that it can adapt to the problem complexity and involves multiple hidden layers. This simply means that, unlike the existing models that require the key parameters (such as the number of layers or number of neurons) to be specified each time the problem’s configuration changes (variable dimensions or amount of data), the proposed model adapts automatically to any change of the problem state. This automatic adaptation was made possible by the DE optimization process, which is also believed to play a crucial role in the model accuracy. In the same way, the results show that the implicit regularization (epoch size) and explicit regularization (L2 regularization) regulate the stability of the solution in a satisfactory manner. In fact, the relatively higher performance of the proposed method over the others can be ascribed to the fact that the ANN-DE is easier to train since it better handles the issues of high variance (cumbersome local minima) and/or slow convergence rate. In contrast, conventional ANN models may not be able to provide neurons or hidden layers when required by the system.

(2) Performance analysis

Fig. 11. Distributions of predicted column diameters using (a) ANN-DE, (b) SVM, (c) ANN, and (d) theoretical model.

Fig. 12. Design charts for obtaining the diameter of jet grouting column in (a) coarse grained soil without fine, (b) coarse grained soil with fine, and (c) fine grained soil, using the single fluid system.

Fig. 13. Design charts for obtaining the diameter of jet grouting column in (a) coarse grained soil without fine, (b) coarse grained soil with fine, and (c) fine grained soil, using the double fluid system.

Fig. 14. Design charts for obtaining the diameter of jet grouting column in (a) coarse grained soil without fine, (b) coarse grained soil with fine, and (c) fine grained soil, using the triple fluid system.

The obtained network architecture was tested and compared with the results of the optimized SVM, the optimized ANN, and the theoretical approach, as shown in Figs. 7-9, respectively. It can be observed that the proposed method slightly outperformed the other machine learning methods in the training and testing phases, with a maximum deviation of the correlation coefficient equal to 0.030337 and 0.028445, respectively. Considering Fig. 7 for example, it can be seen that the values of the proposed ANN-DE are consistently more concentrated along the line of best fit. The aggregation of these small deviations explains the differences in R² values between the ANN-DE and the SVM. Besides, for some cases (diameters between 1.2 m and 4.5 m), the value predicted by the optimized SVM was significantly lower than that of the ANN-DE. Regarding the accuracy of the proposed model, it can be seen from Fig. 10 that the accuracy of the ANN-DE is slightly better than that of the optimized ANN (deviation of RMSE = -0.03549) and that of the optimized SVM model (deviation of RMSE = -0.0363). Besides, the theoretical model (Flora et al., 2013) performs very well within its calibration range, but its efficiency tends to reduce considerably beyond that range (and/or with more data). From the foregoing, the proposed model is advantageous as it incorporates a larger database and can readily adapt to new data. Moreover, based on the distribution of the predicted diameter values obtained using the abovementioned models (see Fig. 11), it can be found that the generalization ability of the ANN-DE compares well with that of the other machine learning methods. The predicted diameter values follow a normal distribution with a maximum residual standard deviation of 0.02293. Importantly, given the abovementioned problem of finding a trade-off between the efficiency of model selection, accuracy, and stability of machine learning models, the proposed ANN-DE appears to be more advantageous.
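As a quick arithmetic check on the deviations quoted above, the short sketch below subtracts the training and testing scores listed in the abstract for each model. It assumes that those scores are the quantities to which the reported correlation-coefficient deviations refer; the dictionary layout and function name are illustrative.

```python
# (training, testing) scores as quoted in the abstract.
SCORES = {
    "ANN-DE": (0.90603, 0.92813),
    "ANN": (0.8905, 0.9006),
    "SVM": (0.87569, 0.89968),
}

def deviation(reference, other, scores=SCORES):
    """Return (training, testing) differences, rounded to 5 decimals."""
    return tuple(round(r - o, 5) for r, o in zip(scores[reference], scores[other]))

if __name__ == "__main__":
    for name in ("ANN", "SVM"):
        print("ANN-DE vs", name, deviation("ANN-DE", name))
```

Under that assumption, the largest gaps are those with the SVM, about 0.03034 in training and 0.02845 in testing, which agree with the reported maxima of 0.030337 and 0.028445 to within rounding of the quoted scores.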

(3) Design charts

Inspired by Flora et al. (2013) and Ochmański et al. (2015), the present study capitalized on the relatively high accuracy of the proposed model to establish some design charts (Figs. 12-14). The basic idea is to establish a model through ANN-DE training and testing, and then use it to generate different curves by varying the jet grouting parameters (specific energy at the nozzle and jet grouting system) and the soil parameters (soil type and N_SPT values). The specific energy at the nozzle (E′_n) can be calculated by (Flora et al., 2013):

where v_s is the average monitor lifting speed, p is the injection pressure at the pump, and Q is the flow rate.
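The chart-building procedure described above, in which a trained model is queried over a grid of specific energies and N_SPT values for a fixed soil type and jet grouting system, can be sketched as follows. Everything here is an assumption made for illustration: the function names, the argument order, and in particular placeholder_predict, which is an arbitrary monotone stand-in and not the trained ANN-DE or any published relation.

```python
def build_design_chart(predict, energies, nspt_values, soil_type, system):
    """Return {N_SPT: [(energy, diameter), ...]} for one chart panel.

    predict is any callable (soil_type, system, nspt, energy) -> diameter,
    e.g. a wrapper around a trained model.
    """
    chart = {}
    for nspt in nspt_values:
        chart[nspt] = [(e, predict(soil_type, system, nspt, e)) for e in energies]
    return chart

def placeholder_predict(soil_type, system, nspt, energy):
    """Hypothetical stand-in for a trained model (arbitrary formula).

    It only mimics the expected trends: diameter grows with energy and
    shrinks as N_SPT increases. The numbers carry no physical meaning.
    """
    return 0.5 * energy ** 0.3 / (1.0 + 0.02 * nspt)

if __name__ == "__main__":
    panel = build_design_chart(
        placeholder_predict,
        energies=[5, 10, 20, 40],      # illustrative values only
        nspt_values=[5, 15, 30],
        soil_type="fine grained",
        system="single fluid",
    )
    for nspt, curve in panel.items():
        print(nspt, [round(d, 3) for _, d in curve])
```

Each key of the returned dictionary corresponds to one curve of a panel, so one call per combination of soil type and jet grouting system would give the nine panels of the three charts.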

It was important to delineate different cases, not only to suitably simulate the various combinations of soil-fluid interaction mechanisms that underpin the jet grouting method (via soil gradation-jet grouting system combinations), but also to cover a relatively large spectrum of field conditions. Compared with previous studies, these charts offer more confidence for the practical prediction of column diameters, owing not only to the relatively good performance of the proposed model but also to the larger amount of data used to train it. That is, the additional data has extended the range of some parameters, such as the specific energy at the nozzle and the soil properties. This deduction is consistent with the empirical evidence provided by field and experimental investigations (Shibazaki, 2003).

6. Conclusions

This paper has presented an intelligent strategy for improving the confidence in the prediction of jet grouted column diameters. Specifically, an ANN-DE model was proposed to overcome the training efficiency limitations of the existing models. The capability of the proposed approach to forecast the variability of jet grouted columns was assessed by comparing its results with those of optimized SVM and ANN models. It was found that the proposed ANN-DE can achieve a higher degree of accuracy (compared with the conventionally optimized ANN and theoretical models) while ensuring the reliability and efficiency of the training process. In fact, the predictions of the ANN-DE were found to be slightly better than those of the optimized ANN (with a deviation of RMSE = -0.03549) and the optimized SVM model (with a deviation of RMSE = -0.0363). Moreover, keeping in mind that prediction models must concurrently integrate selection efficiency, accuracy, and stability, the proposed ANN-DE model outperforms both the optimized ANN and SVM models in finding a trade-off between these criteria. Overall, the relatively good performance and improved efficiency of the proposed model have allowed delineating some design charts, which are expected to enhance the current jet grouting practice.

Declaration of competing interest

The authors wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.

Acknowledgments

The research work described herein was funded by “The Pearl River Talent Recruitment Program” in 2019 for Professor Shui-Long Shen (Grant No. 2019CX01G338), Guangdong Province, and the Research Funding of Shantou University for New Faculty Member (Grant No. NTF19024-2019).
