I am working on a project dealing with the forecasting of traffic congestion
in urban road networks, for use in a real-time driver information system.
I used formulations like

^y(t+1) = a1.y(t) + a2.y(t-1) + ..... +
          b11.q1(t) + b12.q1(t-1) + ..... +
          b21.q2(t) + b22.q2(t-1) + ..... +
          c1.yh(t+1) + c2.yh(t) + ..... +
          d11.qh1(t) + d12.qh1(t-1) + ..... +
          d21.qh2(t) + d22.qh2(t-1) + .....          (1)

where ^y(t+1) is the predicted flow at time t+1 at the link of interest
      y(t)    is the measured flow at the link of interest at time t
      qi(t)   is the measured flow at link i upstream of the link of interest
      yh(t)   is the historical flow for the link of interest at time t
      qhi(t)  is the historical flow for link i at time t
              (historical flow(t) = average flow(t) over the last few days)

and Kalman filter theory to identify the system parameters, which are assumed to
be time varying.
I tried several formulations (different terms in the AR and MA components) but
did not find any formulation that consistently performs better than the others.
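
To make the estimation step concrete, here is a minimal sketch of recursive
parameter tracking of this kind. It is an illustration under assumptions, not
the exact implementation used in the project: it assumes the parameters follow
a random walk, and the names kalman_step and fit_online, the noise variances
q and r, and the initial covariance p0 are all hypothetical choices.

```python
def kalman_step(theta, P, x, y, q=1e-4, r=1.0):
    """One recursive update of time-varying regression parameters.

    Assumed model (illustrative): theta(t) = theta(t-1) + w,  y = x . theta + v,
    with Var(w) = q * I and Var(v) = r.
    Returns the updated (theta, P) and the prediction made before the update.
    """
    n = len(theta)
    # Time update: parameters follow a random walk, so only their covariance grows.
    P = [[P[i][j] + (q if i == j else 0.0) for j in range(n)] for i in range(n)]
    # One-step prediction from the current parameter estimate.
    y_pred = sum(x[i] * theta[i] for i in range(n))
    # Measurement update for a scalar observation.
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    s = sum(x[i] * Px[i] for i in range(n)) + r   # innovation variance
    gain = [v / s for v in Px]                    # Kalman gain
    err = y - y_pred                              # innovation
    theta = [theta[i] + gain[i] * err for i in range(n)]
    P = [[P[i][j] - gain[i] * Px[j] for j in range(n)] for i in range(n)]
    return theta, P, y_pred


def fit_online(regressors, targets, p0=100.0):
    """Run the filter over a sequence of (x, y) pairs, starting from theta = 0."""
    n = len(regressors[0])
    theta = [0.0] * n
    P = [[p0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    predictions = []
    for x, y in zip(regressors, targets):
        theta, P, y_pred = kalman_step(theta, P, x, y)
        predictions.append(y_pred)
    return theta, predictions
```

In this sketch each x would hold the right-hand-side terms of (1) for one time
step, for example [y(t), y(t-1), q1(t), yh(t+1)], and y would be the flow
measured at t+1. A larger q lets the parameters drift faster; a larger r makes
the filter trust each new measurement less.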

I then tried to predict the ratio of flow to historical flow,
^ry(t+1) = ^y(t+1)/yh(t+1), and to recover ^y(t+1) from this ratio,
using formulations like
  ^ry(t+1) = A1.ry(t) + A2.ry(t-1) + ..... +
             B11.rq1(t) + ..... +
             B21.rq2(t) + .....

   ^y(t+1) = ^ry(t+1).yh(t+1)

where ry(t) = y(t)/yh(t) and rqi(t) = qi(t)/qhi(t)
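
As a small worked illustration of the ratio formulation, the sketch below
computes a historical profile, converts measured flows to ratios, and turns a
predicted ratio back into a flow. The function names and the numbers are
hypothetical, and the ratio predictor in the example is plain persistence
(A1 = 1, all other coefficients zero), which is an assumption made only to
keep the example short, not the fitted model.

```python
def historical_flow(past_days):
    """Average flow per time slot over the given days (equal-length lists)."""
    n_days = len(past_days)
    n_slots = len(past_days[0])
    return [sum(day[t] for day in past_days) / n_days for t in range(n_slots)]


def to_ratio(flow, hist):
    """Element-wise ratio of measured flow to historical flow."""
    return [f / h for f, h in zip(flow, hist)]


def forecast_from_ratio(ratio_pred, hist_next):
    """Recover the flow forecast: ^y(t+1) = ^ry(t+1) . yh(t+1)."""
    return ratio_pred * hist_next


# Hypothetical data: two past days with three time slots each.
yh = historical_flow([[100.0, 200.0, 300.0], [120.0, 220.0, 280.0]])
# Flows measured today in the first two slots, both about 10% above historical.
ry = to_ratio([121.0, 231.0], yh[:2])
# Persistence assumption: the next ratio equals the last observed ratio.
y_next = forecast_from_ratio(ry[-1], yh[2])   # about 319
```

Here the forecast for the third slot scales the historical value by today's
observed deviation from the historical pattern.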

I found that the latter approach consistently performs better than the former.
There is often a regularity in the flow patterns observed from day to day, but
given that both formulations use the historical information, I did not expect
significant differences in the forecasts.
Can anybody help me with how I can develop a theoretical proof of this outcome?
Do you know of any papers discussing a related subject?

Thanks very much,

Petros Vythoulkas
Transport Studies Unit
University of Oxford