A PSO-GRNN model for railway freight volume prediction: Empirical study from China

Purpose: In this study, we aim to design a mathematical model for railway freight volume prediction that can provide accurate guidance for railway freight resource allocation. Based on precise prediction, railway freight enterprises are able to optimize and integrate their limited resources to organize freight transport more efficiently and economically.

Design/methodology/approach: In this study, we design a PSO-GRNN model to predict railway freight volume. In the proposed model, GRNN carries out the nonlinear regression analysis between railway freight volume and its influencing factors, which can be described by detailed and measurable indexes, and outputs the prediction value. A PSO algorithm with a time linearly decreasing inertia weight and time varying acceleration coefficients is applied to optimize the basic GRNN model by searching for the optimal smoothing parameter.

Findings: The simulation results in this study indicate that (1) the PSO-GRNN model is able to predict railway freight volume by using the values of other relevant indexes as its input to fit the variation of railway freight volume; (2) through the optimization of the GRNN model based on the PSO algorithm, the proposed model achieves good prediction accuracy; (3) compared with the RBFNN model and the BPNN model, the proposed model is superior in prediction accuracy and curve fitting capacity.


Introduction
Prediction is an estimation of how things will develop in the future. As essential guidance for social production, prediction has always been a focus of academic research. For railway freight volume prediction, the widely applied methods have been limited to traditional ones, such as time series analysis, regression analysis and the gray prediction model. However, as data sets grow larger and the required prediction accuracy grows higher, the limitations of these traditional methods become obvious. Therefore, many researchers have paid attention to prediction model optimization.
With the development of artificial intelligence in recent years, neural networks have made great contributions to the prediction field, owing to their high prediction accuracy and calculation efficiency. Some relevant studies are as follows. Korres, Anastopoulos, Lois, Alexandridis, Sarimveis and Bafas (2002) applied a radial basis function neural network (RBFNN) model to diesel fuel lubricity prediction. In that study, six fuel properties were used as the input variables. The experimental results verified that the RBFNN model could use other fuel properties as input values to approximate the lubricity. Liang and Noore (2004) gave a Bayesian regularization optimized recurrent neural network modelling method for software reliability prediction. Measured by software cumulative failure time, software reliability was better predicted by the proposed model than by the existing neural network models. Wang, Wang, Jia and Li (2005) designed a 4-layer modified BP neural network (BPNN) model for railway passenger volume time series prediction. In that study, the basic BPNN model was optimized by an adaptive learning rate algorithm and a momentum BP algorithm. The relative errors of the testing samples calculated by the modified BPNN model were smaller than those of the basic BPNN model. Liu, Ji, Ye and Geng (2006)

In brief, substantial accomplishments have been achieved in the prediction field, and the application of neural networks in prediction is becoming mature. The studies above laid a solid foundation for the research in our study.
The remaining sections of this study are organized as follows. First, we design a GRNN model for railway freight volume prediction in Section 2 and apply the PSO algorithm to optimize the basic GRNN model in Section 3. Then an empirical study is given in Section 4 to verify the feasibility of the PSO-GRNN model. In Section 4, a railway freight volume prediction index system containing seventeen indexes is established and the proposed model is constructed to predict the railway freight volume from 2007 to 2011. For comparison, the BPNN model and the RBFNN model are also used in Section 4 to predict the railway freight volume during the same period. Finally, we present the conclusions of this study in Section 5.

Generalized regression neural network and its mathematical basis
Generalized Regression Neural Network (GRNN) was first proposed by Specht (1991). It is a kind of artificial neural network that uses a brain synapse-like structure to manage information (Tsuda, 1992) and is often used for function approximation (Liu, Ji, Ye & Geng, 2006).
GRNN has many advantages, such as fast learning, good nonlinear mapping capacity, flexible network structure, high fault tolerance and good robustness (López-Martín, Isaza & Chavoya, 2012). These advantages become more pronounced when the data set is large. Therefore, the GRNN model has been widely applied in many fields, such as classification, structure analysis and bioengineering.
The mathematical basis of the GRNN model for railway freight volume prediction is the nonlinear regression analysis between railway freight volume and its influencing factors. The regression of the railway freight volume of a year (dependent variable) y on its corresponding index vector of the influencing factors (independent variable) X can be calculated by equation (1).

$$\hat{y}(X) = \frac{\int_{-\infty}^{\infty} y\, f(X,y)\, dy}{\int_{-\infty}^{\infty} f(X,y)\, dy} \tag{1}$$

where ŷ(X) represents the prediction value of y, and f(X,y) is the joint probability density function of X and y.
In equation (1), the expression of f(X,y) is unknown. But it can be estimated from the training sample set by Parzen estimation (Parzen, 1962), as shown in equation (2).

$$\hat{f}(X,y) = \frac{1}{n\,(2\pi)^{(m+1)/2}\,\sigma^{m+1}} \sum_{i=1}^{n} \exp\left(-\frac{dist_i^{2}}{2\sigma^{2}}\right) \exp\left(-\frac{(y-y_i)^{2}}{2\sigma^{2}}\right) \tag{2}$$

where f̂(X,y) is the estimate of f(X,y), n represents the number of training samples, m represents the dimension of the index vector, σ represents the smoothing parameter, Xi and yi are the index vector and the output value of the ith training sample, and disti is the Euclidean distance between X and Xi, which can be calculated by equation (3).

$$dist_i = \sqrt{(X - X_i)^{T}(X - X_i)} \tag{3}$$
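As a worked step, the following sketch shows how the regression estimate collapses to a kernel-weighted average once a Gaussian Parzen estimate of f(X,y) is substituted into the conditional mean of equation (1). It assumes the Gaussian kernel form with a single smoothing parameter σ for both X and y (the form used by Specht, 1991); the constant factors in front of the sums cancel between numerator and denominator.

```latex
\begin{aligned}
\int_{-\infty}^{\infty} \exp\!\left(-\frac{(y-y_i)^2}{2\sigma^2}\right) dy &= \sqrt{2\pi}\,\sigma, \\
\int_{-\infty}^{\infty} y\,\exp\!\left(-\frac{(y-y_i)^2}{2\sigma^2}\right) dy &= \sqrt{2\pi}\,\sigma\, y_i, \\
\Rightarrow\quad \hat{y}(X) &= \frac{\sum_{i=1}^{n} y_i \exp\!\left(-\dfrac{dist_i^{2}}{2\sigma^{2}}\right)}{\sum_{i=1}^{n} \exp\!\left(-\dfrac{dist_i^{2}}{2\sigma^{2}}\right)}.
\end{aligned}
```

The prediction is therefore a weighted mean of the training outputs, where each weight decays with the distance between X and the corresponding training sample, and σ controls how fast it decays.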

Modeling process of GRNN model for railway freight volume prediction
GRNN is a feed forward neural network composed of 4 layers: the input layer, the pattern layer, the summation layer and the output layer. Each layer contains a number of neurons, through which the layers connect with each other. The topological structure of the GRNN model is shown in Figure 1. Based on this topological structure, the modeling process of the GRNN model is as follows.
Step 1. Input the index vector of the influencing factors of a year by the input neurons.
Step 2. Calculate the output value of each pattern neuron.
Weights from input neurons to pattern neurons are set to 1, so the input index vector is transferred to each pattern neuron directly. The ith pattern neuron Pi corresponds to the ith training sample and uses a radial basis function as its transfer function to calculate its output value by equation (5).

$$P_i = \exp\left(-\frac{dist_i^{2}}{2\sigma^{2}}\right), \quad i = 1, 2, \ldots, n \tag{5}$$

Step 3. Calculate the output value of the summation neurons.
There are 2 neurons in the summation layer: the simple arithmetic summation neuron Sa and the weighted summation neuron Sw. Weights from pattern neurons to Sa are set to 1, while the weight from the ith pattern neuron to Sw is set to yi, the output value of the ith training sample. So the output values of the two neurons can be calculated by equation (6) and equation (7), respectively.

$$S_a = \sum_{i=1}^{n} P_i \tag{6}$$

$$S_w = \sum_{i=1}^{n} y_i P_i \tag{7}$$

Step 4. Calculate the output value of the output neuron.
In this step, the output value of Sw is divided by the output value of Sa. The prediction value is obtained by equation (8).

$$\hat{y}(X) = \frac{S_w}{S_a} \tag{8}$$

As we can see from the modelling process above, the smoothing parameter σ is the only parameter that has to be set manually. The prediction accuracy and the generalization capacity of the GRNN model are both sensitive to its setting, as can be seen from Figure 2. It is therefore necessary to search for the optimal smoothing parameter in this study.
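The four modeling steps can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the study (which relies on Matlab); the function name `grnn_predict` and the toy data are hypothetical, and the guard against a zero denominator is an added safety measure, not part of the description above.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma):
    """GRNN forward pass: kernel-weighted average of the training outputs.

    X_train: (n, m) training index vectors, y_train: (n,) training outputs,
    X_query: (q, m) index vectors to predict, sigma: smoothing parameter.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_query = np.atleast_2d(np.asarray(X_query, dtype=float))

    # Steps 1-2: squared Euclidean distance from each query to each
    # training sample, then the Gaussian output of each pattern neuron.
    diff = X_query[:, None, :] - X_train[None, :, :]
    dist_sq = np.sum(diff ** 2, axis=2)
    P = np.exp(-dist_sq / (2.0 * sigma ** 2))

    # Step 3: arithmetic summation neuron S_a and weighted summation neuron S_w.
    S_a = P.sum(axis=1)
    S_w = P @ y_train

    # Step 4: output neuron. The lower bound on S_a avoids division by zero
    # when sigma is so small that every kernel value underflows (assumption).
    return S_w / np.maximum(S_a, np.finfo(float).tiny)

# Hypothetical toy data, not taken from the study.
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0.0, 1.0, 4.0, 9.0])

pred_small = grnn_predict(X_train, y_train, [[1.0]], sigma=0.1)    # close to the nearest sample
pred_large = grnn_predict(X_train, y_train, [[1.0]], sigma=100.0)  # close to the global mean
print(pred_small, pred_large)
```

The two calls illustrate the sensitivity to σ: a very small value reproduces the nearest training output, while a very large value flattens the prediction toward the mean of all training outputs.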

Particle swarm optimization algorithm
In order to optimize the performance of the GRNN model and reduce the prediction error, we apply the particle swarm optimization algorithm to search for the optimal smoothing parameter used to construct the GRNN model.
Particle Swarm Optimization (PSO) algorithm was proposed by Kennedy and Eberhart (1995). It has been widely used in optimization because of its simple operation and because it has fewer parameters that rely on individual experience (Hou, Lu, Xiong, Cheng & Wu, 2004).
PSO algorithm simulates a simplified social model composed of a group of particles. Each particle has two parameters, position and velocity. In the solution process of an optimization problem, the particles modify their positions according to their own learning experience and that of their neighbours, and finally find their best positions (Hashemi & Meybodi, 2004).

Design for the PSO algorithm in the optimization
In the optimization of the GRNN model based on the PSO algorithm, the smoothing parameter is taken as the position of the particles, so the search space is one dimensional. The parameters in the tth iteration of the PSO algorithm are set as follows. The position of the kth particle is pk(t) and its velocity is vk(t). The personal best position of the particle is pbestk(t) and the global best position of the swarm is gbest(t).
In the PSO algorithm, the Mean Square Error (MSE) between the prediction values and the actual values of the testing samples reflects the prediction accuracy of the model and can be used to judge the quality of σ. Therefore, the fitness function can be defined as equation (9).

$$fitness(t) = \frac{1}{Q} \sum_{q=1}^{Q} \left(y_q - \hat{y}_q(t)\right)^{2} \tag{9}$$

where Q represents the number of testing samples, and yq and ŷq(t) represent the actual value and the prediction value of the qth testing sample, respectively.
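The fitness evaluation can be sketched as follows: build the GRNN prediction for a candidate σ and return the MSE over the testing samples. The helper names `grnn_predict` and `mse_fitness` and the toy data are hypothetical and serve only as an illustration.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma):
    """Compact GRNN forward pass (kernel-weighted average of y_train)."""
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    P = np.exp(-d2 / (2.0 * sigma ** 2))
    return (P @ y_train) / np.maximum(P.sum(axis=1), np.finfo(float).tiny)

def mse_fitness(sigma, X_train, y_train, X_test, y_test):
    """Fitness of a candidate smoothing parameter: MSE on the testing samples."""
    y_hat = grnn_predict(X_train, y_train, X_test, sigma)
    return float(np.mean((y_test - y_hat) ** 2))

# Hypothetical toy data: one index, four training samples, one testing sample.
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([0.0, 1.0, 4.0, 9.0])
X_test = np.array([[1.5]])
y_test = np.array([2.25])

for sigma in (0.1, 0.5, 2.0):
    print(sigma, mse_fitness(sigma, X_train, y_train, X_test, y_test))
```

Because each particle position is a single value of σ, one fitness evaluation costs one GRNN pass over the testing samples.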
The value of the fitness function is calculated in every iteration. If the PSO algorithm has not reached the termination condition (usually the maximum number of iterations), the velocity and position of each particle are modified by equation (10) and equation (11) in the (t+1)th iteration, respectively.
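In the basic PSO algorithm, the acceleration coefficients and the inertia weight are usually constants. But the performance of the PSO algorithm is sensitive to the setting of these parameters. The inertia weight and the acceleration coefficients are therefore varied over the iterations, by equation (12) and by equations (13) and (14), respectively.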
A better solution of PSO-GRNN model can be gained by combining these two algorithms above.
When completing the (t+1)th iteration, the best personal position of the particle and the global best position of the swarm will be updated according to the value of the fitness function by equation ( 15) and equation ( 16), respectively.
In summary, the solution process of the PSO-GRNN model is shown in Figure 3.
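The solution process can be sketched as a one-dimensional PSO over σ with a linearly decreasing inertia weight and time varying acceleration coefficients. This is a sketch under stated assumptions, not the study's Matlab code: the velocity and position updates follow the standard PSO form, the default parameter values (0.9 to 0.4 for the inertia weight, 2.5 to 0.5 and 0.5 to 2.5 for the coefficients), the search bounds, the velocity clamp and the synthetic sine data are all illustrative choices, and every function name is hypothetical.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma):
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    P = np.exp(-d2 / (2.0 * sigma ** 2))
    return (P @ y_train) / np.maximum(P.sum(axis=1), np.finfo(float).tiny)

def mse_fitness(sigma, X_train, y_train, X_test, y_test):
    return float(np.mean((y_test - grnn_predict(X_train, y_train, X_test, sigma)) ** 2))

def pso_search_sigma(fit, bounds=(0.01, 2.0), n_particles=20, T=50,
                     w_max=0.9, w_min=0.4, c1_start=2.5, c1_end=0.5,
                     c2_start=0.5, c2_end=2.5, seed=0):
    """Search the smoothing parameter that minimizes fit(sigma)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    p = rng.uniform(lo, hi, n_particles)      # positions = candidate sigmas
    v = np.zeros(n_particles)                 # velocities
    v_max = 0.2 * (hi - lo)                   # velocity clamp (assumption)
    pbest = p.copy()
    pbest_fit = np.array([fit(s) for s in p])
    g = int(np.argmin(pbest_fit))
    gbest, gbest_fit = pbest[g], pbest_fit[g]

    for t in range(T):
        frac = t / T
        w = w_max - (w_max - w_min) * frac            # decreasing inertia weight
        c1 = c1_start + (c1_end - c1_start) * frac    # cognitive coefficient
        c2 = c2_start + (c2_end - c2_start) * frac    # social coefficient
        r1, r2 = rng.random(n_particles), rng.random(n_particles)

        # Standard velocity and position updates, kept inside the bounds.
        v = w * v + c1 * r1 * (pbest - p) + c2 * r2 * (gbest - p)
        v = np.clip(v, -v_max, v_max)
        p = np.clip(p + v, lo, hi)

        # Update personal bests, then the global best.
        f = np.array([fit(s) for s in p])
        improved = f < pbest_fit
        pbest[improved] = p[improved]
        pbest_fit[improved] = f[improved]
        g = int(np.argmin(pbest_fit))
        gbest, gbest_fit = pbest[g], pbest_fit[g]

    return float(gbest), float(gbest_fit)

# Synthetic data for illustration only (not the railway data of the study).
X_train = np.linspace(0.0, 6.0, 25)[:, None]
y_train = np.sin(X_train[:, 0])
X_test = np.linspace(0.1, 5.9, 12)[:, None]
y_test = np.sin(X_test[:, 0])

fit = lambda s: mse_fitness(s, X_train, y_train, X_test, y_test)
best_sigma, best_mse = pso_search_sigma(fit)
print(best_sigma, best_mse)
```

Selecting σ on the testing samples mirrors the fitness definition in the text; in other settings a separate validation set would usually be held out for this purpose.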

Establishment of the railway freight volume prediction index system
There are various factors that influence railway freight volume. In this study, we mainly take five aspects into consideration: social economic factors, policy factors, goods source factors, railway freight supplement factors and other transport mode factors. Each influencing factor can be described by detailed and measurable indexes. The railway freight volume prediction index system is established as shown in Table 1.

Social economic factors
As we can see from Table 2, the initial data of the samples differ from each other in both values and units. So normalization of the initial data by equation (17) is necessary; it not only reduces the prediction error of GRNN but also improves the calculation efficiency and reduces the calculation time (Leeghim, Seo & Bang, 2008).
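A common way to put indexes with different units on one scale is column-wise min-max normalization, sketched below. This is an assumption for illustration: the exact form of equation (17) is not reproduced here, the function names are hypothetical, and the numbers are made up rather than taken from Table 2.

```python
import numpy as np

def fit_min_max(train):
    """Column-wise minimum and maximum of the training samples."""
    train = np.asarray(train, dtype=float)
    return train.min(axis=0), train.max(axis=0)

def apply_min_max(data, col_min, col_max):
    """Scale each column with the training minimum and maximum.

    Constant columns are divided by 1 instead of 0 to avoid NaN values.
    """
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    return (np.asarray(data, dtype=float) - col_min) / span

# Hypothetical values: two indexes with very different magnitudes.
train = np.array([[1.0, 100.0], [3.0, 300.0], [5.0, 500.0]])
test = np.array([[7.0, 200.0]])

col_min, col_max = fit_min_max(train)
train_norm = apply_min_max(train, col_min, col_max)
test_norm = apply_min_max(test, col_min, col_max)
print(train_norm)
print(test_norm)
```

Normalization matters for GRNN in particular because a single smoothing parameter is shared by all indexes, so an index with a large numeric range would otherwise dominate the Euclidean distance.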

Railway freight volume prediction simulation
In this study, we use Matlab R2012b to carry out the simulation on a Lenovo laptop with an Intel Core i5 3235M 2.60GHz CPU and 4GB RAM. First, the PSO-GRNN model is applied to complete the prediction simulation, in which the Matlab function "newgrnn()" is used to construct a GRNN without training error. In the simulation, the parameters of the PSO algorithm are set as shown in Table 3. In the iteration process, the variation of the global best position of the swarm and the fitness value of the normalized testing sample data are shown in Figure 4 and Figure 5, respectively. As we can see from the two figures, the global best position of the swarm (the optimal smoothing parameter) is 0.8233, and the minimum MSE of the testing sample data is 0.003771. The prediction results of the three prediction models are given in Table 4.

The analysis above indicates that the PSO-GRNN model has both better prediction accuracy and better curve fitting capacity than the RBFNN model and the BPNN model, which can also be seen from Figure 7. Therefore, the PSO-GRNN model we design in this study is more appropriate for railway freight volume prediction.

Railway freight volume is influenced by various factors. In the prediction, a railway freight volume prediction index system containing seventeen prediction indexes is established as shown in Table 1. The index vector of the influencing factors in the index system is the input of the PSO-GRNN model and its output is the prediction value of railway freight volume. In the proposed model, GRNN calculates the prediction value and the PSO algorithm is used to optimize the GRNN model, improving its prediction accuracy by searching for the optimal smoothing parameter.
The proposed model is significantly superior to the widely applied RBFNN model and BPNN model. The following conclusions can be drawn from the simulation.
• The PSO-GRNN model we design in this study is able to predict railway freight volume by using the values of other relevant indexes as its input to fit the variation of railway freight volume.
• Through the optimization of the GRNN model based on the PSO algorithm, the proposed model achieves good prediction accuracy, with an MRE of 1.36% and an MSE of 0.003771.
Besides, the railway freight volume prediction in this study is limited to total volume prediction. In order to make railway freight volume prediction more meaningful for railway network construction and for railway operation and management, future research should extend the total volume prediction to the prediction of the spatial distribution of railway freight volume over the railway network.
Liu, Ji, Ye and Geng (2006) compared the prediction performance of the RBFNN model, the BPNN model and the gray prediction model in railway freight volume time series prediction. Using data from 1989 to 2002 as training samples, the RBFNN model had the minimum mean relative error on the testing samples of 2003 and 2004 compared with the other two models. Adeli and Panakkat (2009) analyzed the influencing factors of earthquake magnitudes and presented a probabilistic neural network (PNN) model for predicting earthquake magnitudes using eight elements as the seismicity indexes. The empirical study indicated that the model had different prediction accuracy for earthquakes with magnitudes in different ranges. Brahme, Winning and Raabe (2009) presented a 3-layer BPNN model for the prediction of cold rolling textures of steels by inputting the values of sixteen variables, including carbon content, carbide size and so on. The experimental results stressed the importance of the selection of the training data set for the prediction accuracy of the model. Ömer Faruk (2010) designed an ARIMA and BPNN based hybrid model for water quality time series prediction. Using water temperature, boron and dissolved oxygen as the output variables, the hybrid prediction model could fit the relationship between the output variables and time. The prediction accuracy of the hybrid model was more satisfactory than that of the traditional ARIMA model and the BPNN model. Chen and Vachtsevanos (2012) modified the fuzzy neural network (FNN) and constructed an interval Type-2 FNN model for the prediction of bearing health conditions. The prediction results of the proposed model were compared with those of the widely used adaptive neuro-fuzzy inference system. The comparison indicated that the interval Type-2 FNN model had better prediction accuracy. Okkan (2012) optimized a wavelet neural network (WNN) model by a Levenberg-Marquardt (L-M) algorithm based feed forward neural network (FFNN) and applied the proposed model to predict monthly reservoir inflow. The results showed that the WNN model was a more appropriate tool to model the monthly inflow series of a dam and had better prediction accuracy than the FFNN model and the multiple linear regression model. Mostafa (2012) proposed a multi-layer perceptron neural network (MLPNN) and a generalized regression neural network (GRNN) for KSE closing price movement prediction. The feasibility and accuracy of the neural network models in stock exchange movement prediction proved superior to those of traditional regression analysis and ARIMA. Gnana Sheela and Deepa (2013) constructed a hybrid wind speed prediction model based on MLPNN and a self-organizing map neural network (SOMNN). The proposed model possessed higher prediction accuracy and less error than the traditional MLPNN model, BPNN model and RBFNN model, which verified that the proposed model can improve both the prediction accuracy and the convergence rate.

Figure 2. Influence of smoothing parameter on GRNN model (data from Section 4)

The basic setting does not always satisfy the demand of the optimization problems, which requires parameter optimization for the basic PSO algorithm. The inertia weight controls how much of the previous velocity is inherited. It has a great influence on both the global and the local searching capacity of the PSO algorithm. So in order to give the PSO algorithm good global searching capacity in the initial stage and good local searching capacity in the final stage of the iteration process, we use the time linear decreasing inertia weight algorithm (Nie, Ji & Qin, 2011) to modify the inertia weight in the iteration process. The expression of the algorithm is equation (12).

$$w(t) = w_{max} - \frac{(w_{max} - w_{min})\, t}{T} \tag{12}$$

where wmax and wmin represent the maximum and minimum values of the inertia weight, and T represents the maximum number of iterations. As for the acceleration coefficients, Ratnaweera, Halgamuge and Watson (2004) proposed a time varying acceleration coefficient algorithm. This method can prevent the PSO algorithm from getting trapped in a local minimum in the initial stage of the iteration process and accelerate its convergence to the global best position in the final stage. In this algorithm, C1 and C2 are modified by equations (13) and (14), respectively.
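The two parameter schedules can be illustrated with a short sketch. The linear form of the inertia weight follows the description above; the linear interpolation of the acceleration coefficients and all default values (0.9 to 0.4, 2.5 to 0.5, 0.5 to 2.5) are assumptions based on commonly quoted settings, not the values of Table 3, and the function names are hypothetical.

```python
def inertia_weight(t, T, w_max=0.9, w_min=0.4):
    """Linearly decrease the inertia weight from w_max (t = 0) to w_min (t = T)."""
    return w_max - (w_max - w_min) * t / T

def acceleration_coefficients(t, T, c1_start=2.5, c1_end=0.5,
                              c2_start=0.5, c2_end=2.5):
    """Time varying coefficients: C1 shrinks while C2 grows over the iterations."""
    frac = t / T
    c1 = c1_start + (c1_end - c1_start) * frac
    c2 = c2_start + (c2_end - c2_start) * frac
    return c1, c2

T = 100
for t in (0, 50, 100):
    c1, c2 = acceleration_coefficients(t, T)
    print(t, round(inertia_weight(t, T), 3), round(c1, 3), round(c2, 3))
```

A large inertia weight and a large C1 early on favour exploration by individual particles, while a small inertia weight and a large C2 late in the run pull the swarm toward the global best position.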

Figure 4. Global best position of the swarm in every iteration

Figure 6. Training performance of BPNN model with given parameters

Figure 7. Curve fitting for the target volumes of the three prediction models

Compared with the RBFNN model and the BPNN model, the superiority of the proposed model in prediction accuracy and curve fitting capacity is verified. Therefore, the proposed model can be applied to practical railway freight transport organization.

Although the PSO-GRNN model shows good prediction performance in this study, it consumes much more calculation time than the RBFNN model and the BPNN model. The calculation time mainly depends on two factors: the parameter setting of the PSO algorithm and the scale of the sample data. Given this limitation, future research should focus on improving the calculation efficiency of the proposed model. Two approaches deserve particular attention. One is to modify the parameter setting of the PSO algorithm; the other is to optimize the data processing. The latter can be realized by improving the manually established railway freight volume prediction index system and by decreasing the dimension of the sample data through rough set theory or other methods.

Sample data and normalization of the initial data
In this study, data from 1991 to 2006 are selected as training samples and data from 2007 to 2011 are used as testing samples. Table 2 shows a part of the initial data of the samples. The data come from the China Statistical Yearbook, which is available at http://www.stats.gov.cn/tjsj/ndsj/.

Table 2. Railway freight volume and initial data of railway freight volume prediction indexes

Table 3. Parameter setting of PSO algorithm

The Mean Relative Error (MRE) of the prediction volumes and the MSE of the normalized testing sample data are also presented in Table 4.
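The two error measures can be computed as in the sketch below. The MRE is taken here as the mean of the absolute relative errors, which is the usual definition and an assumption about the study's exact formula; the function names are hypothetical and the numbers are invented for illustration, not results from Table 4.

```python
import numpy as np

def mean_relative_error(actual, predicted):
    """Mean of |actual - predicted| / |actual| over all samples."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted) / np.abs(actual)))

def mean_square_error(actual, predicted):
    """Mean of the squared differences over all samples."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))

# Hypothetical volumes, not the study's data.
actual = [100.0, 200.0]
predicted = [110.0, 190.0]

mre = mean_relative_error(actual, predicted)
mse = mean_square_error(actual, predicted)
print(f"MRE = {mre:.2%}, MSE = {mse:.1f}")
```

In the study the MRE is reported on the predicted volumes, while the MSE is reported on the normalized testing data, so the two figures are not on the same scale.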

Table 4. Railway freight volume prediction results

From Table 4, we can find that the PSO-GRNN model has the minimum MRE and MSE compared with the RBFNN model and the BPNN model. The MRE of the PSO-GRNN model is 0.49 times that of the RBFNN model and 0.47 times that of the BPNN model. The MSE of the PSO-GRNN model is 0.1 times that of both the RBFNN model and the BPNN model. Although the RBFNN model and the BPNN model can achieve very high prediction accuracy in some years, their prediction errors are very large in the other years. The difference between