Experimental Analysis of Methods Used to Solve Linear Regression Models

2022-11-11 10:48:48

Computers, Materials & Continua, 2022, Issue 9

Mua’ad Abu-Faraj, Abeer Al-Hyari and Ziad Alqadi

1 Computer Information Systems Department, The University of Jordan, Aqaba, 77110, Jordan

2 Electrical Engineering Department, Al-Balqa Applied University, As Salt, 19117, Jordan

3 Computers and Networks Engineering Department, Amman, 15008, Jordan

Abstract: Predicting the value of one or more variables from the values of other variables is an important task in engineering experiments that involve large amounts of data that are difficult to obtain by measurement. Regression is one of the most important types of supervised machine learning, in which labeled data is used to build a prediction model. Regression can be classified into three categories: linear, polynomial, and logistic. In this research paper, different methods are implemented to solve the linear regression problem, where there is a linear relationship between the target and the predicted output. The methods are analyzed using the Mean Square Error (MSE) calculated between the target values and the predicted outputs. A large set of regression samples is used to construct training datasets of selected sizes. A detailed comparison is performed between three methods: least-square fit, Feed-Forward Artificial Neural Network (FFANN), and Cascade Feed-Forward Artificial Neural Network (CFFANN), and recommendations are given. The proposed method has been tested on random data samples, and the results were compared with those of the most common method, multiple linear regression. The procedures for building and testing the neural network remain the same even if another sample of data is used.

Keywords: Linear regression; ANN; CFFANN; FFANN; MSE; training cycle; training set

1 Introduction

In many engineering experiments, the practical results obtained require finding the relationship between some of the inputs and outputs. This relationship can then be applied to find the output values from the input values without resorting to experiment and measurement, provided that the output values are accurate and are obtained at a very low error rate (very close to zero). The process of finding the relationship between the independent variables and the dependent ones (the response), as shown in Fig. 1, is called solving the linear regression model (or the linear prediction process) [1-5].

Figure 1: Linear regression (prediction) model [3]

Methods other than linear regression are also used to approximate the solution of the equation describing a system. In [6], the authors compared two techniques for obtaining analytical solutions of the Time-Fractional Fokker-Planck Equation (TFFPE): the new iterative method and the fractional power series method (FPSM). The experimental results show a good match between the approximated and the exact solutions.

On the other hand, the relation between the target value and the predicting variables is non-linear in most cases. Therefore, more sophisticated techniques must be used in such cases. In [7], the authors used the reduced differential transform method to solve the nonlinear fractional model of the tumor-immune system. The obtained results show that the solutions generated for the nonlinear model are very accurate and simple.

The quality of any method selected to solve the linear regression model can be measured by the Mean Square Error (MSE) and/or the Peak Signal to Noise Ratio (PSNR). These quality parameters can be calculated using Eqs. (1) and (2) [8-12], where:

S: experimental data

R: calculated data
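Eqs. (1) and (2) are not reproduced in this copy of the text. As a sketch only, the widely used definitions of the two measures are given below, on the assumption that they match the authors' intent. The symbols N (the number of samples) and MAX (the peak value of the experimental data S) are introduced here for illustration and do not appear in the original.

```latex
\begin{align*}
\mathrm{MSE}  &= \frac{1}{N}\sum_{i=1}^{N}\left(S_i - R_i\right)^{2} \\
\mathrm{PSNR} &= 10\,\log_{10}\!\left(\frac{\mathrm{MAX}^{2}}{\mathrm{MSE}}\right)
\end{align*}
```

Under these definitions, PSNR grows without bound as MSE approaches zero, so the two measures rank methods in the same order for a fixed dataset.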

The selected method is very accurate when the MSE value is very close to zero and/or the PSNR value tends to infinity [13-16]. The process of solving a linear regression model can be implemented as shown in Fig. 2 by applying the following steps [17]:

Collect the experimental data of measurements.

Analyze the collected data and perform some filtering and normalization if needed.

Select a set of data samples to be used as a training dataset.

Select a method to solve the prediction problem.

Check the MSE or PSNR. If they are acceptable, save the model solution to be used later in the prediction process; otherwise, increase the training dataset size or modify some model parameters and retrain the model.

Predicting the value of dependent variables from a set of values of independent variables is important because it is used in many applications and vital fields, including education, medicine, and industry. The results of the prediction process are used to build future strategies and plans. Accordingly, the mathematical models used for prediction must be highly accurate, so that the error between the calculated values and the expected values is as small as possible. Given the importance of prediction in decision-making, this research paper analyzes some mathematical models whose structure remains largely fixed even if the number of inputs and the number of outputs is changed.

All that matters in this research paper is the accuracy of the results, that is, obtaining calculated values that are very close to the expected values. Accordingly, it was sufficient to use MSE and/or PSNR.

2 Solving Regression Model

The linear regression model can be solved in a simple way using arithmetic calculations (the least square fit method) [18,19]. Solving the model yields the regression coefficients, as shown in Fig. 2.

Figure 2: Solving prediction model [14]

Here, the linear regression solution process is described using a simple example, which is solved using MATLAB. Consider the regression problem shown in Fig. 3:

Figure 3: Linear regression example

To calculate the regression coefficients, we apply the following steps:

1. Generate a regression matrix that contains the values of the independent variables. Every element of the first column of this matrix must equal one, as shown in Fig. 4:

Figure 4: Regression matrix

2. Use the MATLAB backslash (left-division) operator to solve the least-squares system formed by the regression matrix and the output vector. For this example, this gives the coefficient values depicted in Fig. 5:

Figure 5: Coefficient values

3. Use the regression coefficients to construct the output equation according to Eq. (3):

4. Apply Eq. (3) to predict the value of y for any given values of x1 and x2.
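The four steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' MATLAB code: the output equation is assumed to have the form y = b0 + b1*x1 + b2*x2 (suggested by the leading column of ones and the two inputs), the coefficients are obtained from the normal equations rather than from the factorization that the MATLAB backslash operator uses, and the data and all function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve the square system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so that row operations act on both sides at once.
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("regression matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def least_squares_fit(inputs, outputs):
    """Return [b0, b1, ..., bk] for the model y = b0 + b1*x1 + ... + bk*xk."""
    # Step 1: regression matrix whose first column is all ones.
    X = [[1.0] + list(row) for row in inputs]
    k = len(X[0])
    # Step 2: solve the normal equations (X^T X) b = X^T y.
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, outputs)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, row):
    """Steps 3 and 4: evaluate the output equation for one sample."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))


def mse(targets, predictions):
    """Mean square error between target and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2 (not the data of Fig. 3).
samples = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
targets = [1 + 2 * x1 + 3 * x2 for x1, x2 in samples]
coeffs = least_squares_fit(samples, targets)
predictions = [predict(coeffs, row) for row in samples]
```

Because the hypothetical targets are exactly linear in the inputs, the recovered coefficients should agree with the generating ones up to rounding. For ill-conditioned regression matrices, a QR-based solver is usually preferable to the normal equations used in this sketch.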

Fig. 6 shows the experimental and predicted outputs for this example; the predicted values are very close to the true values for all the samples. This is expected, since the dataset is very small and the least square fit works efficiently with such a dataset. However, this method usually gives high MSE values, especially when the experimental dataset is large; this is discussed in Section 4.

3 Artificial Neural Networks

An artificial neural network (ANN) is a powerful computational model that consists of a set of fully connected neurons organized in one or more layers [20-23]. Each neuron is a computational cell that performs two main operations, a weighted summation of its inputs and the application of an activation function, as shown in Fig. 7.

An activation function must be assigned to each layer, and therefore to each neuron in that layer. The output of the neuron is calculated according to the assigned activation function: for a linear activation function the neuron output equals the summation, while for the logsig and tansig activation functions the output is calculated as shown in Fig. 8.

Figure 6: Experimental and predicted outputs (example)

Figure 7: Neuron operation [12]

Figure 8: Neuron output calculation using logsig (Sigmoid) and tansig (TanH) activation functions
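Since Figs. 7 and 8 are not reproduced here, the two neuron operations can be sketched as below. The formulas are the usual definitions of the logistic sigmoid and the hyperbolic tangent, which is what the names logsig and tansig denote in MATLAB. The function names purelin, logsig and tansig follow MATLAB naming, while neuron_output and the numeric values are hypothetical.

```python
import math


def purelin(n):
    """Linear activation: the output equals the summation."""
    return n


def logsig(n):
    """Logistic sigmoid, 1 / (1 + exp(-n)), written to avoid overflow for large |n|."""
    if n >= 0:
        return 1.0 / (1.0 + math.exp(-n))
    e = math.exp(n)
    return e / (1.0 + e)


def tansig(n):
    """Hyperbolic tangent sigmoid, mathematically equal to 2 / (1 + exp(-2n)) - 1."""
    return math.tanh(n)


def neuron_output(inputs, weights, bias, activation):
    """Operation 1: weighted summation plus bias. Operation 2: activation."""
    summation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(summation)


# Hypothetical neuron with two inputs.
linear_out = neuron_output([1.0, 2.0], [0.5, -0.25], 0.1, purelin)
sigmoid_out = neuron_output([1.0, 2.0], [0.5, -0.25], 0.1, logsig)
tanh_out = neuron_output([1.0, 2.0], [0.5, -0.25], 0.1, tansig)
```

The logsig output lies in (0, 1) and the tansig output in (-1, 1), which is one reason the inputs are normalized before training.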

An ANN can be used in many applications, including solving linear regression models, by directly predicting the output values with the input variable values as the ANN input. The ANN model can be treated as a black box with selected inputs and outputs to be predicted. A set of samples from the input data must be selected as training samples and used to train the ANN. Training must yield an acceptable MSE value, so the size of the training set and the number of training cycles both affect ANN performance. Each training cycle computes the neuron outputs starting from the ANN input layer; the final calculated outputs are then compared with the target outputs by computing the MSE between them. If the MSE value is acceptable, the computation stops; otherwise, backpropagation calculations are performed starting from the output layer to find the errors and make the necessary weight updates, as shown in Figs. 9 and 10.

Figure 9: Neuron outputs calculation

Figure 10: Error calculations and weights updating
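The training cycle described above can be illustrated with the smallest possible network, a single linear neuron with two inputs. This is a sketch under assumptions: the weights are updated by plain gradient descent on the MSE, whereas the source does not state which training algorithm was used (MATLAB toolboxes typically default to other algorithms), and the learning rate, the goal, the data, and the function name are hypothetical.

```python
def train_linear_neuron(inputs, targets, goal=1e-10, max_cycles=3000, rate=0.1):
    """Train one linear neuron; return (weights, bias, final MSE, cycles used)."""
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    n = len(inputs)
    error = float("inf")
    for cycle in range(1, max_cycles + 1):
        # Forward pass: compute the neuron output for every training sample.
        outputs = [sum(w * x for w, x in zip(weights, row)) + bias for row in inputs]
        errors = [o - t for o, t in zip(outputs, targets)]
        # Compare the outputs with the targets through the MSE.
        error = sum(e * e for e in errors) / n
        if error <= goal:
            return weights, bias, error, cycle
        # Backward pass: gradient of the MSE with respect to each weight and the bias.
        gradients = [2.0 / n * sum(e * row[j] for e, row in zip(errors, inputs))
                     for j in range(len(weights))]
        bias_gradient = 2.0 / n * sum(errors)
        # Weight update.
        weights = [w - rate * g for w, g in zip(weights, gradients)]
        bias -= rate * bias_gradient
    return weights, bias, error, max_cycles


# Hypothetical normalized data generated from y = 0.5 + 0.3*x1 - 0.2*x2.
train_inputs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
train_targets = [0.5 + 0.3 * x1 - 0.2 * x2 for x1, x2 in train_inputs]
weights, bias, final_mse, cycles = train_linear_neuron(train_inputs, train_targets)
```

The returned cycle count is the quantity that the paper compares between architectures: a network that reaches the MSE goal in fewer cycles trains faster.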

The process of using ANN as a prediction tool can be summarized in the following steps:

Step 1: Data preparation. From the collected data, we must select several samples that include the values of the independent variables and the measured outputs (the true labels to be predicted). These values must be organized in a matrix, with one column for each sample. The input data must be normalized to avoid errors in the results of the logsig or tansig calculations (see Fig. 11).

Step 2: ANN creation and design. In this step, we must create an ANN architecture by selecting the number of layers and the number of neurons in each layer; an activation function must be assigned to each layer. The goal (the acceptable MSE) and the number of training cycles must be determined (see Fig. 12). The ANN must be initialized and trained using the inputs and the target labels to be predicted.

After each training cycle, the MSE is computed and compared with the goal. If the error is acceptable, we can save the net to be used as a prediction tool; otherwise, we must increase the number of training cycles or update the ANN architecture and retrain it.

Figure 11: ANN presentation [16]

Figure 12: ANN design and testing [16]

Step 3: ANN testing. A set of new samples is selected for testing purposes, and the saved ANN model is loaded and run on the test samples. The MSE is calculated between the true values for the test samples and the labels predicted by the ANN model. If the computed MSE value is acceptable, the ANN can be used in the future to predict the outputs for any given inputs; otherwise, the ANN must be modified and retrained.
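The normalization mentioned in Step 1 can be sketched as follows. The source does not say which normalization was applied, so min-max scaling to [0, 1] is an assumption here, as are the function names and the data. Unlike the column-per-sample layout described in Step 1, this sketch stores one sample per row, which is the more common convention in Python.

```python
def fit_min_max(rows):
    """Return the (minimum, maximum) of every variable, taken from the training rows."""
    columns = list(zip(*rows))
    return [(min(c), max(c)) for c in columns]


def apply_min_max(rows, ranges):
    """Scale every variable with the training ranges; constant variables map to 0."""
    scaled = []
    for row in rows:
        scaled.append([(v - lo) / (hi - lo) if hi > lo else 0.0
                       for v, (lo, hi) in zip(row, ranges)])
    return scaled


# Hypothetical samples with two independent variables each.
train_rows = [[10.0, 200.0], [20.0, 400.0], [30.0, 300.0]]
test_rows = [[25.0, 250.0]]

# The ranges come from the training set only and are reused for the test set,
# so the test samples in Step 3 are scaled exactly like the training samples.
ranges = fit_min_max(train_rows)
train_scaled = apply_min_max(train_rows, ranges)
test_scaled = apply_min_max(test_rows, ranges)
```

Reusing the training ranges for new samples keeps the saved network consistent between training and later prediction. A test value outside the training range simply maps outside [0, 1].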

4 Implementation and Experimental Results

A total of 5000 samples of two independent variables and one dependent variable were selected. The linear regression model was solved using MATLAB, and the size of the training dataset was varied from 100 to 2000 samples. The regression coefficients were computed for each training set, and the predicted outputs were then calculated using the associated regression equations. The MSE for each case was calculated. Tab. 1 lists the obtained experimental results; the MSE values are almost stable regardless of the training set size.

Next, the same samples are used to train and test the ANN. Two types of ANN architecture are selected, Feed-Forward ANN (FFANN) and Cascade Feed-Forward ANN (CFFANN); the differences between these two types are shown in Fig. 13 [24-26].

Table 1: Results of linear regression model solving

Figure 13: CFFANN and FFANN architectures
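Since Fig. 13 is not reproduced here, the structural difference can be sketched in code. The sketch assumes the usual meaning of a cascade-forward network, in which each layer also receives the raw inputs through additional weights, and uses one hidden layer for illustration. The parameter names (w_hidden, w_out, w_skip and so on) and all numeric values are hypothetical.

```python
import math


def layer(inputs, weights, biases, activation):
    """Compute the outputs of one fully connected layer."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def identity(n):
    """Linear activation for the output layer."""
    return n


def ffann_forward(x, params):
    """Feed-forward: the output layer sees only the hidden layer outputs."""
    hidden = layer(x, params["w_hidden"], params["b_hidden"], math.tanh)
    return layer(hidden, params["w_out"], params["b_out"], identity)


def cffann_forward(x, params):
    """Cascade feed-forward: the output layer sees the hidden outputs and the raw inputs."""
    hidden = layer(x, params["w_hidden"], params["b_hidden"], math.tanh)
    # Append the direct input-to-output weights to each output neuron's weight row.
    cascade_weights = [w_h + w_s for w_h, w_s in zip(params["w_out"], params["w_skip"])]
    return layer(hidden + list(x), cascade_weights, params["b_out"], identity)


# Hypothetical parameters: 2 inputs, 2 hidden neurons, 1 output neuron.
params = {
    "w_hidden": [[0.4, -0.6], [0.1, 0.3]],
    "b_hidden": [0.0, 0.2],
    "w_out": [[0.7, -0.5]],
    "b_out": [0.1],
    "w_skip": [[0.3, -0.2]],  # direct input-to-output weights, CFFANN only
}
x = [0.5, 1.0]
ff_out = ffann_forward(x, params)[0]
cf_out = cffann_forward(x, params)[0]
```

With a linear output neuron, the direct connections let the cascade network represent a linear input-output relation without passing it through the nonlinear hidden layer.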

The optimal architecture of the selected ANN consists of one input layer with 2 neurons and one output layer with 1 neuron. CFFANN with the selected architecture was trained and tested; Tab. 2 lists the obtained results (for each training set, the ANN was run five times and the best case was selected).

Table 2: Results obtained by CFFANN with 2 neurons input layer and 1 neuron output layer

Table 3: Results obtained by CFFANN expanded to 10 neurons input layer and 1 neuron output layer

Table 4: Results obtained by CFFANN expanded to 2 neurons input layer, 4 neurons hidden layer and 1 neuron output layer

In the previous experiments, the number of training cycles was set to 3000. FFANN with the minimal architecture was also trained and tested; Tab. 5 lists the obtained results, while Tab. 6 shows the training cycles required for CFFANN with different architectures to achieve the goal.

Table 5: Obtained results using FFANN with 2 neurons input layer and 1 neuron output layer

Table 6: Required cycles for CFFANN to achieve the goal

5 Results Analysis

Solving the regression model using the least square fit method gives very poor results: the MSE calculated between the targets and the outputs computed from the regression coefficients was always high regardless of the training sample size, as depicted in Figs. 14 and 15.

Figure 14: Experimental and predicted outputs using least square fit method

To overcome the disadvantages of the least square fit method, ANN is introduced as a prediction tool in two variants: FFANN and CFFANN. Using the FFANN architecture improves the quality of the linear regression solution, but it requires many training cycles and a long training time (as listed in Tab. 5). Expanding the number of neurons in the input layer or adding an extra hidden layer does not improve the MSE value (see Fig. 16).

Figure 15: Computed MSE using various training sets

Figure 16: Calculated MSE using FFANN

To improve the performance of the ANN model, it is better to use CFFANN. The main advantages of CFFANN over FFANN are that it needs fewer training cycles and that it can achieve the goal of a minimal MSE value. Moreover, a small set of training samples can be used to train CFFANN. The trained ANN can be saved and easily used to predict the output for any given inputs. The output generated using CFFANN is much closer to the target, with an MSE value closer to zero, as shown in Fig. 17.

Figure 17: Calculated MSE using FFANN

6 Conclusion

Several methods were implemented to solve the linear regression model. The least square fit method was used to find the regression coefficients, and these coefficients were then used to construct the regression equation, which was used to calculate the predicted output. The least square method showed the poorest results, even when the size of the training set was increased. To overcome the disadvantages of the least square method, two ANN models were proposed: FFANN and CFFANN. The obtained experimental results showed that CFFANN, with any of the tested architectures and with various training set sizes, achieved the best performance by minimizing the number of training cycles required to reach the minimum MSE value. Thus, the CFFANN model is highly recommended for solving the linear regression model.

Recommendation: The proposed procedure remains stable even if another sample of data is used; only a simple modification of the ANN architecture is required to match the number of inputs and the number of targets to be calculated.

Funding Statement: The authors received no specific funding for this study.

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
