Intelligent Transportation System Real Time Traffic Speed Prediction with Minimal Data

Purpose: An Intelligent Transportation System (ITS) must be able to predict traffic speed for short time intervals into the future along the branches between the many nodes in a traffic network in near real time, using as few observed and stored speed values as possible. Such predictions support timely ITS reactions to changing traffic conditions such as accidents or volume-induced slowdowns, including re-routing advice and time-to-destination estimates. Design/methodology/approach: Traffic sensors are embedded in the interstate highway system in the Detroit, Michigan, USA, metropolitan area. The set of sensors used in this project is along interstate highway 75 (I-75) southbound from the intersection with interstate highway 696 (I-696). Data from the sensors, including speed, volume, and percent of sensor occupancy, were supplied in one-minute intervals by the Michigan Intelligent Transportation Systems Center (MITSC). Hierarchical linear regression was used to develop a speed prediction model that requires only the current and one previous speed value to predict speed up to 30 minutes in the future. The model was validated by comparison to collected data, with the mean relative error and the median error as the primary metrics. Findings: The model was a better predictor of speed than the minute-by-minute averages alone. The relative error between the observed and predicted values was found to range from 5.9% for predictions 1 minute into the future to 10.9% for predictions 30 minutes into the future for the 2006 data set. The corresponding median errors were 4.0% and 5.4%. Thus, the predictive capability of the model was deemed sufficient for application. Research limitations/implications: The model has not yet been embedded in an ITS, so a final test of its effectiveness has not been accomplished.
Social implications: Travel delays due to traffic incidents, volume-induced congestion, or other causes are annoying to vehicle occupants as well as costly in terms of wasted fuel and unneeded emissions, among other items. One goal of an ITS is to improve the social impact of transportation by reducing such negative consequences. Traffic speed prediction is one factor in enabling an ITS to accomplish such goals. Originality/value: Numerous data-intensive and very sophisticated approaches have been used to develop traffic flow models. As such, these models are not designed for, or well suited to, embedding in an ITS for near real-time computations. Such an application requires a model capable of quickly forecasting traffic speed for numerous branches of a traffic network using only a few data points captured and stored in real time per branch. The model developed and validated in this study meets these requirements.


Introduction
As investment in construction and expansion decreases, making better use of urban traffic infrastructure is necessary.
Intelligent Transportation Systems (ITS), assemblies of advanced components that collect, store, process and transmit traffic information to assist traffic management, have emerged as one way of dealing with this change in approach. An Advanced Traffic Information System (ATIS) is a core component of an ITS that relies on modern technology (e.g., wireless communication) to disseminate real-time traffic information to drivers. Several ATIS systems have been developed, such as Visteon's Navmate System (Visteon, 2000), to provide road users with updated information and guide them in selecting the shortest or fastest routes.
Historical and real-time information are collected and applied in meeting vehicle routing objectives. Historical information presents the state of the transportation system during previous time periods. Such information can be used for the long-term traffic volume prediction needed for transportation infrastructure planning. Real-time information contains the most up-to-date traffic conditions and is suitable as the basis for short-term predictions, ranging from a few minutes to a couple of hours, in support of operational traffic management. In the absence of this predictive information, drivers implicitly project future conditions based on historical (if they have experienced it before) and current traffic information. Therefore, this study addresses short-term speed prediction from one to thirty minutes into the future. Only the current speed and the speed at one preceding time were required, equivalent to the use of current speed and current acceleration alone. Based on the use of the mean relative error and the median error as the primary validation metrics, this approach was found to be effective.

Background
Numerous data-intensive and/or very sophisticated approaches have been used to develop traffic flow models, for example by Min, Wynter and Amemiya (2007) and Romero and Benitez (2010). Zhu and Yang (2011) present a visco-elastic model based on mass and momentum conservation in which the elastic effect provides for a higher-order model. Several different methods have been used to predict and to help mitigate traffic congestion.
One prediction method is based on the Kalman filter algorithm and was first applied by Okutani and Stephanedes (1984) to predict traffic volumes in an urban network. The Kalman filter uses adaptive parameters sensitive to dynamic conditions. The main advantage of this method is that it can update the adaptive parameter to make the predictor reflect the traffic fluctuation promptly.
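The predict-and-update cycle behind this kind of adaptive predictor can be illustrated with a minimal sketch. It is a generic textbook scalar Kalman filter with a random-walk state model, not the formulation used by Okutani and Stephanedes (1984). The function name, the parameter names, and the sample volume counts are assumptions made for illustration.

```python
def kalman_1d(measurements, process_var, measurement_var, x0, p0):
    """Scalar Kalman filter with a random-walk state model (illustrative only).

    x is the state estimate (e.g., a traffic volume), p is its variance.
    Returns the filtered estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict step: a random walk keeps the estimate, but uncertainty grows.
        p = p + process_var
        # Update step: the gain k is the adaptive weight placed on the new
        # measurement relative to the prior estimate.
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


# Hypothetical one-minute volume counts with a sudden drop.
volumes = [30, 31, 29, 30, 18, 17, 19, 18]
filtered = kalman_1d(volumes, process_var=4.0, measurement_var=9.0, x0=30.0, p0=1.0)
print([round(v, 1) for v in filtered])
```

A larger process variance relative to the measurement variance raises the gain, so the estimate follows a sudden fluctuation more promptly at the cost of passing through more measurement noise.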
Innamaa (2001) found that forecasts were better for long links with sub-links than for short links. Nagatani (1993) used a cellular automaton model to study traffic jams induced by an incident which separated traffic flow from traffic stoppages. Computer simulation was used to analyze the model. Arnaout and Bowling (2011) studied the use of vehicle-based adaptive cruise control to avoid traffic jams on a highway using computer simulation.
The issue of routing individual vehicles has been well studied and can be formulated as a shortest path problem that is solved using efficient labeling algorithms (Gallo & Pallottino, 1988; Dijkstra, 1959; Moore, 1959). The labeling algorithm has been improved to solve the shortest path problem with time-dependent link travel times, assuming first-in-first-out (FIFO) vehicle movement (Chabini, 1997).
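As an illustration of the labeling idea, the sketch below is a Dijkstra-style label-setting search extended to time-dependent link travel times. This extension gives correct earliest arrival times only when every link satisfies the FIFO property. It is not the algorithm of any of the cited papers, and the network, the node names, and the travel-time functions are hypothetical.

```python
import heapq


def earliest_arrival(graph, source, target, depart_time):
    """Label-setting search for the earliest arrival time at `target`.

    graph maps a node to a list of (neighbor, travel_time_fn) pairs, where
    travel_time_fn(t) is the link travel time when the link is entered at
    time t. Correctness relies on FIFO links: entering later never means
    arriving earlier. Returns None if the target is unreachable.
    """
    best = {source: depart_time}
    heap = [(depart_time, source)]
    while heap:
        t, node = heapq.heappop(heap)
        if t > best.get(node, float("inf")):
            continue  # stale heap entry; a better label was already set
        if node == target:
            return t
        for neighbor, travel_time_fn in graph.get(node, []):
            arrival = t + travel_time_fn(t)
            if arrival < best.get(neighbor, float("inf")):
                best[neighbor] = arrival
                heapq.heappush(heap, (arrival, neighbor))
    return None


def constant(minutes):
    """Link whose travel time does not depend on the entry time."""
    return lambda t: minutes


# Hypothetical network: the direct link A->C is congested early on and
# clears gradually (arrival time 12 + 0.6*t is increasing, so FIFO holds).
network = {
    "A": [("B", constant(5)), ("C", lambda t: 12 - 0.4 * min(t, 10))],
    "B": [("C", constant(5))],
}

print(earliest_arrival(network, "A", "C", depart_time=0))
print(earliest_arrival(network, "A", "C", depart_time=10))
```

With constant travel-time functions on every link, the search reduces to the classical static shortest path computation.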
The assumption of the basic model for searching the optimal path is that the link travel times are constant (i.e., deterministic and time-independent). Real-time traffic routing has emerged as a promising approach for ATIS with the latest progress in information technology and telecommunication. For these systems, as soon as traffic conditions change, a reliable routing plan can be generated with consideration of predicted travel time information rather than purely current conditions.
The studies discussed above all used sophisticated, computationally intensive methods that may not be consistent with the need of an ITS to make near real-time speed forecasts using little data. One way of meeting this need is discussed in the next section.

Model Building and Validation
Model building involved three types of models: descriptive, explanatory, and predictive.

Explanatory Models
Data from calendar year 2006 was used to estimate the parameters of a multi-level explanatory model (MTM) built as follows. Equation 1 is derived as follows. A more direct use of the available data could be accomplished by using equation 2.
$S_t$ is the current speed at time t. The speed n minutes in the future can be written as equation 3:

$$S_{t+n} = \bar{S}_{t+n} + r_{s(t+n)} \qquad (3)$$

where $\bar{S}_{t+n}$ is the average speed at time t+n and $r_{s(t+n)}$ is the residual of the speed calculated at time t+n.

Combining equations 2 and 3 as well as expressing the result in a form that is useful for regression modeling yields equation 4:

$$r_{s(t+n)} = b_0 + b_{1n}\, r_{s(t)} + b_{2n}\, r_{s(t-1)} \qquad (4)$$

for any n in (0, 30] minutes, n a positive real number, where $r_{s(t)}$ is the residual of the speed calculated at the current time t and $r_{s(t-1)}$ is the residual of the speed calculated at the past time t-1.
Next, the values of the coefficients $b_0$, $b_{1n}$ and $b_{2n}$ must be estimated. The coefficient $b_0$ is the average of the residuals, which by the definition of residuals is zero.
Coefficients $b_{1n}$ and $b_{2n}$ were estimated as follows. Twelve values of the prediction horizon n were considered: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, and 30 minutes. For each value of n, $b_{1n}$ and $b_{2n}$ were estimated using standard linear regression techniques. For each coefficient $b_{1n}$ and $b_{2n}$, a second-degree polynomial was fit to all twelve points, equations 5 and 6. The correlation coefficients were very good: 0.9545 for $b_{1n}$ and 0.9736 for $b_{2n}$.

$$b_{1n} = 0.0001 n^2 - 0.0099 n + 0.467 \qquad (5)$$
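The per-horizon estimation step can be sketched as an ordinary least-squares fit. The sketch assumes the regression has the form: residual at t+n = b1n × residual at t + b2n × residual at t-1, with the intercept fixed at zero. The residual series is synthetic and generated only to exercise the code; it is not the 2006 sensor data, and the function names are hypothetical.

```python
import random


def fit_no_intercept(x1, x2, y):
    """Least-squares fit of y = b1*x1 + b2*x2 via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are collinear")
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2


def estimate_coefficients(residuals, n):
    """Regress the residual n steps ahead on the current and previous residual.

    residuals is a list of one-minute speed residuals; n is a whole number
    of minutes in this sketch.
    """
    rows = range(1, len(residuals) - n)
    r_now = [residuals[t] for t in rows]
    r_prev = [residuals[t - 1] for t in rows]
    r_future = [residuals[t + n] for t in rows]
    return fit_no_intercept(r_now, r_prev, r_future)


# Synthetic, autocorrelated residual series (an assumption, not real data).
random.seed(0)
series = [0.0, 0.0]
for _ in range(5000):
    series.append(0.5 * series[-1] + 0.2 * series[-2] + random.gauss(0.0, 1.0))

for n in (1, 5, 30):
    b1n, b2n = estimate_coefficients(series, n)
    print(n, round(b1n, 3), round(b2n, 3))
```

Repeating the fit for each of the twelve horizons yields the points to which the second-degree polynomials in n are then fitted.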

Predictive Models
The model given in equation 7 can be used for speed prediction at sensor 66305 once it has been validated that the predicted speeds match the actual speeds well. Validation was addressed in several ways.
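A prediction step of this kind can be sketched as follows. The sketch assumes that the predicted speed is the average speed for the target minute plus a predicted residual, and that the predicted residual is b1n times the current residual plus b2n times the previous residual. The polynomial for b1n is the one reported in the text. The value of b2n, the average speeds, and the observed speeds are hypothetical inputs supplied by the caller.

```python
def b1(n):
    """Second-degree polynomial for b1n reported in the text (n in minutes)."""
    return 0.0001 * n ** 2 - 0.0099 * n + 0.467


def predict_speed(n, mean_future, mean_now, mean_prev, speed_now, speed_prev, b2n):
    """Predict the speed n minutes ahead from two stored speed values.

    mean_* are average speeds for the corresponding minutes (assumed to be
    available from historical data); b2n is passed in by the caller.
    """
    r_now = speed_now - mean_now      # residual at the current time t
    r_prev = speed_prev - mean_prev   # residual at time t-1
    r_future = b1(n) * r_now + b2n * r_prev
    return mean_future + r_future


# Hypothetical example: traffic is running below its usual 65 mph.
predicted = predict_speed(
    n=10, mean_future=65.0, mean_now=65.0, mean_prev=65.0,
    speed_now=60.0, speed_prev=62.0, b2n=0.2,
)
print(round(predicted, 2))
```

Only the two most recent speeds need to be stored per sensor; everything else is a fixed coefficient or a historical average, which is what keeps the computation cheap enough for near real-time use.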
Ideally, a simple linear regression between the predicted speeds and the actual speeds should produce a fit with a zero intercept and a slope of exactly 1.0. The closer the regression results are to this ideal, the more valid the prediction model. Figure 2 shows the fit results obtained using JMP software, giving an intercept off by only 1/2 mph (0.8 km/h) and a slope of almost exactly 1.0.
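This check can be reproduced outside JMP with a few lines of code. The sketch below fits actual speed as a linear function of predicted speed by ordinary least squares; the choice of which variable is the response is an assumption, and the sample speeds are hypothetical.

```python
def simple_linear_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical predicted and actual speeds in mph.
predicted_speeds = [62.0, 58.5, 45.0, 30.0, 66.0, 70.5]
actual_speeds = [63.0, 57.0, 44.0, 32.0, 65.5, 71.0]

intercept, slope = simple_linear_regression(predicted_speeds, actual_speeds)
print(round(intercept, 2), round(slope, 3))
```

An intercept near zero and a slope near 1.0 indicate that the predictions are neither systematically offset nor systematically stretched relative to the observations.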
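The relative errors between the observed and predicted values, equation 8, assist in validating the model:

$$e_i = \frac{\left| S_i - \hat{S}_i \right|}{S_i} \qquad (8)$$

where $e_i$ is the relative speed prediction error for observation i, $S_i$ is the observed speed, and $\hat{S}_i$ is the predicted speed. Predicted speeds and the actual speeds at the corresponding times were plotted and can be seen in Figure 3.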
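The two validation metrics can be computed as in the sketch below. It assumes the relative error is the absolute difference between observed and predicted speed divided by the observed speed, and the sample values are hypothetical.

```python
from statistics import mean, median


def relative_errors(observed, predicted):
    """Absolute relative error for each observed/predicted pair."""
    return [abs(o - p) / o for o, p in zip(observed, predicted)]


# Hypothetical observed and predicted speeds in mph.
observed = [60.0, 50.0, 40.0]
predicted = [57.0, 50.0, 44.0]

errors = relative_errors(observed, predicted)
print(f"mean relative error:   {mean(errors):.1%}")
print(f"median relative error: {median(errors):.1%}")
```

Reporting the median alongside the mean is useful because a few large errors during abrupt speed changes raise the mean far more than the median.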
The model predictions tend to lag the actual values slightly, by 15-30 minutes, when changes in speed are abrupt and large, such as during morning rush hour, but the median error is an acceptable 3.1%.
The largest relative median error in Table 1 is about 10%. A 10% error in forecast speed translates into a travel time forecast error of less than one minute (about 0.85 minutes) for a distance of 10 miles with travel at the speed limit of 70 mph and for a distance of 5 miles with travel at 35 mph. Such forecasting errors are insignificant, particularly considering that sensors on an urban freeway network are much closer than 5 miles apart.
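The arithmetic behind this estimate can be checked with a short sketch. It assumes that the forecast speed differs from the actual speed by the stated relative error and that travel time is simply distance divided by speed; the function names are hypothetical.

```python
def travel_time_minutes(distance_miles, speed_mph):
    """Travel time in minutes at a constant speed."""
    return 60.0 * distance_miles / speed_mph


def travel_time_error_minutes(distance_miles, actual_speed_mph, relative_speed_error):
    """Absolute travel time error when the forecast speed is off by the
    given relative error (positive = forecast too fast)."""
    forecast_speed = actual_speed_mph * (1.0 + relative_speed_error)
    forecast_time = travel_time_minutes(distance_miles, forecast_speed)
    actual_time = travel_time_minutes(distance_miles, actual_speed_mph)
    return abs(forecast_time - actual_time)


for distance, speed in ((10.0, 70.0), (5.0, 35.0)):
    base = travel_time_minutes(distance, speed)
    over = travel_time_error_minutes(distance, speed, 0.10)
    under = travel_time_error_minutes(distance, speed, -0.10)
    print(f"{distance} mi at {speed} mph: trip {base:.2f} min, "
          f"error {over:.2f} to {under:.2f} min")
```

Both cases are a trip of about 8.57 minutes, so the error is the same in each and stays under one minute in either direction. The figure quoted in the text corresponds to the first-order approximation of 10% of the trip time.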

Summary and Conclusions
Hierarchical linear regression has been used to develop a model that meets the requirements of an ITS for travel speed prediction: a small number of data values observed and stored, as well as quick computation. The model requires only two data values, which together capture current speed and acceleration, to predict travel speeds in the range (0, 30] minutes in the future. The primary validation metrics, mean relative error and median relative error, show that the model would be effective at predicting travel speeds and thus the resulting travel times. Descriptive models reveal that traffic speed is homogeneous for non-holiday weekdays throughout the year. Thus, one year's worth of data concerning such days was used to support model building and validation.