Sunday, October 21, 2007

Results

Three different neural network algorithms were used, and each was tested in a 10-fold cross-validation procedure in order to estimate the generalization error of the model. Both the mean square error (MSE) and the root mean square error (RMSE) were computed, with the RMSE used as the evaluation measure. The results are presented in Tables 2, 3, and 4.
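The 10-fold procedure can be illustrated with a minimal sketch. It is not the code used in the study: the data are synthetic, the names `fit_line` and `cross_validate` are hypothetical, and a one-variable least-squares line stands in for the neural networks so that the example stays self-contained. Only the resampling scheme and the MSE/RMSE averaging follow the description above.

```python
import math
import random


def fit_line(xs, ys):
    """Placeholder model (assumption): a least-squares line, not a neural network."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_validate(xs, ys, k=10, seed=0):
    """Return one (MSE, RMSE) pair per held-out fold."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k folds of nearly equal size
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        # Fit on the k-1 training folds only.
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Evaluate on the fold that was left out.
        squared = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out]
        mse = sum(squared) / len(squared)
        fold_errors.append((mse, math.sqrt(mse)))
    return fold_errors


# Synthetic data (assumption): 450 cases, mirroring the training-sample size.
rng = random.Random(1)
xs = [rng.uniform(0, 1) for _ in range(450)]
ys = [0.5 * x + 0.2 + rng.gauss(0, 0.05) for x in xs]

fold_errors = cross_validate(xs, ys)
avg_mse = sum(m for m, _ in fold_errors) / len(fold_errors)
avg_rmse = sum(r for _, r in fold_errors) / len(fold_errors)
print(f"average MSE = {avg_mse:.6f}, average RMSE = {avg_rmse:.6f}")
```

Note that the average of the per-fold RMSE values is generally not the square root of the average MSE, so the two averages have to be computed separately, as they are in the tables.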



It can be seen from Table 2 that the average MSE of the 10 MLP neural networks tested in the cross-validation resampling procedure was 0.009879, while the average RMSE was 0.089345. As suggested by [9], the average RMSE of all 10 networks is used as the population error. The test result obtained on the whole training sample (450 cases) is referred to as the apparent error. As presented in Table 2, the apparent error of the tested MLP network is 0.289761. The excess error of the MLP network, computed as the difference between the population error and the apparent error [9], was -0.200416. Since the average RMSE obtained on the test sample in the cross-validation resampling procedure contains a bias due to the selected size of the network, the final validation of the network models is conducted on the hold-out test sample. The average RMSE obtained on the hold-out sample was 0.216812. The topology of the NN that produced the best out-of-sample performance consisted of 5 input units, 6 hidden units, and 1 output unit. The number of hidden units was reduced by pruning, while the number of input variables was reduced by sensitivity analysis. The selected input variables were: 3-month treasury rate (3MT), industrial production index (IPI), inflation rate (INFL), consumer price index (CPI), and S&P500 index (SP500). Table 3 presents the results of the RBF network tested in the cross-validation resampling procedure.



The average MSE of all ten RBF networks was 0.006445, while the average RMSE was 0.075413. The apparent error was 0.08979, resulting in an excess error of -0.014376. The RBF network that produced the best out-of-sample RMSE consisted of 5 input units, 15 hidden units, and 1 output unit. The selected input variables were: discount rate (Disc), 3-month treasury rate (3MT), 10-year treasury rate (10YT), oil price (OIL), and production price index (PPI). The results of the GRNN tested in the cross-validation resampling procedure are presented in Table 4.



The average MSE of all ten GRNN networks was 0.008004, while the average RMSE was 0.085957. The apparent error was 0.070212, resulting in an excess error of 0.015745. The GRNN that produced the best out-of-sample RMSE consisted of 15 input units, 270 units in the hidden layer, 2 units in the summation/division layer, and 1 output unit. All fifteen input variables were selected as important for the GRNN model.
It is evident that the RBFN model produced the lowest generalization error, in terms of both the average cross-validation RMSE (0.075413) and the average out-of-sample RMSE (0.053904). Surprisingly, the average cross-validation (i.e. population) errors of the RBFN and MLP networks are lower than their apparent errors, which produces negative excess errors. The opposite holds for the GRNN, whose average cross-validation RMSE is higher than its apparent error.
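The excess-error comparison can be retraced with a short sketch that uses only the figures reported in the text. The dictionary layout and the helper name `excess_error` are illustrative assumptions. The definition (population error minus apparent error) is the one attributed to [9] above.

```python
# Figures reported in the text (Tables 2-4). "population" is the average
# cross-validation RMSE; "apparent" is the RMSE on the whole training sample.
reported = {
    "MLP": {"population": 0.089345, "apparent": 0.289761},
    "RBFN": {"population": 0.075413, "apparent": 0.08979},
    "GRNN": {"population": 0.085957, "apparent": 0.070212},
}


def excess_error(population, apparent):
    """Excess error as defined in the text: population error minus apparent error."""
    return population - apparent


excess = {name: excess_error(**errors) for name, errors in reported.items()}

for name, value in excess.items():
    sign = "negative" if value < 0 else "positive"
    print(f"{name}: excess error = {value:+.6f} ({sign})")

# Model with the lowest average cross-validation RMSE.
best = min(reported, key=lambda name: reported[name]["population"])
print(f"lowest population error: {best}")
```

With the rounded figures above, the RBFN excess error comes out as -0.014377 rather than the reported -0.014376. The difference is presumably due to the apparent error being quoted to only five decimal places.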
As far as the influence of the input variables on the model is concerned, each of the three best NN models extracted a different set of important predictors. The MLP and RBFN each retained only five input variables as important to the model, while the GRNN did not reduce its input space.
