library(forecast) library(lmtest) library(caret) #State Unemployment seasonally adjusted #http://www.quandl.com/FRED-Federal-Reserve-Economic-Data/WIUR-Unemployment-Rate-in-Wisconsin #Using Quandl data, great little site wi<-read.csv('http://www.quandl.com/api/v1/datasets/FRED/WIUR.csv?&auth_token=gigXwpxd6Ex91cjgz1B7&trim_start=1976-01-01&trim_end=2013-04-01&sort_order=desc', colClasses=c('Date'='Date')) #some minor clean up colnames(wi)<-c('date', 'rate') wi$date<-as.Date(wi$date) summary(wi) #base data, 1-436, test data 437-448 wi.b<-wi[1:436,] wi.p<-wi[437:448,] wi.ts<-ts(wi.b$rate, start=c(1976, 1), frequency=12) wi.p.ts<-ts(wi.p$rate, start=c(2012, 5), frequency=12) plot.ts(wi.ts) #Lets test some models mean<-meanf(wi.ts, 12) naive<-rwf(wi.ts, 12) s.naive<-snaive(wi.ts, 12) drift<-rwf(wi.ts, 12, drift=T) #linear fit m1<-tslm(wi.ts~trend) m2<-tslm(wi.ts~trend+season) #checking for autocorrelation res1 <- residuals(m1) par(mfrow=c(1,2)) plot(res1, ylab="Residuals",xlab="Year") Acf(res1, main="ACF of residuals") res2 <- residuals(m2) par(mfrow=c(1,2)) plot(res2, ylab="Residuals",xlab="Year") Acf(res2, main="ACF of residuals") par(mfrow=c(1,1)) #Durbin-Watson Test dwtest(m1, alt="two.sided") dwtest(m2, alt="two.sided") #yep autocorrelation city! No surprize here, due to the nature of unemployment #STL ETS Decomposition m3<-stl(wi.ts, s.window='periodic') plot(m3) m4<-ets(wi.ts, model='ZZZ') plot(m4) #ARIMA m5<-auto.arima(wi.ts) plot(forecast(m5, h=12))
#neural networks m6<-nnetar(wi.ts) m6 plot(forecast(m6, h=12)) #Testing for accuracy the first 4 models a1<-accuracy(mean, wi.p.ts) a2<-accuracy(naive, wi.p.ts) a3<-accuracy(s.naive, wi.p.ts) a4<-accuracy(drift, wi.p.ts) a.table<-rbind(a1, a2, a3, a4) #Creating the forecast and accuracy for the next 6 models f1<-forecast(m1, h=12) f2<-forecast(m2, h=12) f3<-forecast(m3, h=12) f4<-forecast(m4, h=12) f5<-forecast(m5, h=12) f6<-forecast(m6, h=12) a5<-accuracy(f1, wi.p.ts) a6<-accuracy(f2, wi.p.ts) a7<-accuracy(f3, wi.p.ts) a8<-accuracy(f4, wi.p.ts) a9<-accuracy(f5, wi.p.ts) a10<-accuracy(f6, wi.p.ts) #Combining into a table with row names a.table<-rbind(a.table, a5, a6, a7, a8, a9, a10) row.names(a.table)<-c('Mean', 'Naive', 'S. Naive', 'Drift', 'Lm~Trend', 'Lm~Trend+Sea', 'STL', 'ETS', 'ARIMA', 'Neuro') #make into a data frame so the best model is first, according to MAPE a.table<-as.data.frame(a.table) a.table<-a.table[order(a.table$MAPE),] a.table
Results so far: Looks like the mean like forecasts are doing the best, the fancy models are not doing very well.
Should it not be:
ReplyDeletewi.b<-wi[13:448,]
wi.p<-wi[1:12,]
As the last 12 observations are the earliest on my data set.
Good stuff.
ReplyDeleteWe detected (automatically) a breakpoint in the parameters of the model (using some of the "fancy stuff"-Chow test)beginning at 1987/11. We then did a plot of the data to confirm this finding. Our final model was an AR1(.536) with differencing again developed automatically.
[(1-B**1)]Y(T) =+ [(1- .536B** 1)]**-1 [A(T)]
The MAPE was 1.92 for the 12 wittheld obs with a constant forecast of 6.9 using the last 292 observations to do the analysis.
Here is the Chow test with the most recent being the one used (if significant)
DIAGNOSTIC CHECK #4: THE CHOW PARAMETER CONSTANCY TEST
The Critical value used for this test : .01
The minimum group or interval size was: 142
F TEST TO VERIFY CONSTANCY OF PARAMETERS
CANDIDATE BREAKPOINT F VALUE P VALUE
143 1987/ 11 33.584 .0000000000*
154 1988/ 10 3.5030 .0155049592
165 1989/ 9 2.7945 .0399948481
176 1990/ 8 3.2434 .0219736806
187 1991/ 7 2.4196 .0656220872
198 1992/ 6 2.6833 .0463470469
209 1993/ 5 2.1367 .0949460656
220 1994/ 4 2.5293 .0568032208
231 1995/ 3 2.3000 .0767510868
242 1996/ 2 1.3555 .2559143166
253 1997/ 1 1.4750 .2207140364
* INDICATES THE MOST RECENT SIGNIFICANT BREAK POINT: 99% CONFIDENCE LEVEL.
IMPLEMENTING THE BREAKPOINT AT TIME PERIOD 143: 1987/ 11
THUS WE WILL DROP (DELETE) THE FIRST 142 OBSOLETE OBSERVATIONS
AND ANALYZE THE MOST RECENT 294 STATISTICALLY HOMOGENOUS OBSERVATIONS
Tom Reilly
www.autobox.com