# Feed Forward Neural Networks

From Rave Documentation

# Introduction

The feed-forward neural network created with this model has one hidden layer and one output layer with a linear transfer function. You must have the Neural Network Toolbox installed to create this model, which corresponds to the "newff" function in MATLAB.

# Model Settings

- Improved Generalization
  - Regularization - Check this box to enable regularization, which reformulates the performance function as P = g*mse + (1-g)*msw, where mse is the mean sum of squares of the network errors, msw is the mean sum of squares of the network weights and biases, and g is the performance ratio. You can specify the value of g using the edit box to the right of the word "Regularization". This value must be between 0 and 1; a value of 1 is equivalent to leaving the regularization box unchecked.
  - Note: If you are using the Bayesian Regularization training algorithm, it controls the regularization automatically, so the value you enter in this box is ignored (as is the state of the Regularization checkbox).

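The reformulated performance function above can be sketched in a few lines. This is an illustrative NumPy version, not Rave or Toolbox code; the function name and arguments are assumptions for the example.

```python
import numpy as np

def regularized_performance(errors, weights, g):
    """Regularized performance P = g*mse + (1-g)*msw (illustrative sketch).

    errors  -- network errors (targets minus outputs)
    weights -- all network weights and biases, flattened into one array
    g       -- performance ratio in [0, 1]; g = 1 reduces to plain mse,
               i.e. the same as leaving the Regularization box unchecked
    """
    mse = np.mean(np.square(errors))   # mean sum of squares of network errors
    msw = np.mean(np.square(weights))  # mean sum of squares of weights/biases
    return g * mse + (1 - g) * msw
```

Penalizing msw alongside mse keeps the weights small, which smooths the network response and tends to improve generalization on unseen data.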

# Command Line Output

Depending on the training algorithm you have selected, some of the statistics listed here will be printed to your main MATLAB command window (not your Rave window).

- Validation Checks x/y - x is the number of successive training iterations for which the error on the validation set has increased, and y is the maximum number of such increases allowed. If x>=y, the training algorithm will stop.
- Mu - a weighting coefficient that indicates whether the training algorithm is acting more similarly to Newton's method (small Mu) or gradient descent with small step size (large Mu). Mu is decreased after successful steps and increased if a tentative step would increase the performance function.
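The two statistics above can be sketched as simple bookkeeping. This is an illustrative Python sketch of the logic, not the Toolbox implementation; the factors of 0.1 and 10 and the limit of 6 are assumed defaults for the example.

```python
def update_mu(mu, step_improved, mu_dec=0.1, mu_inc=10.0):
    """Mu adaptation in the style of Levenberg-Marquardt (illustrative).

    A successful step (performance decreased) shrinks mu, moving the
    algorithm toward Newton's method; a failed tentative step grows mu,
    moving it toward gradient descent with a small step size.
    """
    return mu * mu_dec if step_improved else mu * mu_inc

def validation_checks(val_errors, max_fail=6):
    """Count successive iterations where the validation error increased.

    val_errors -- validation-set error at each iteration, in order
    Returns (x, stop): x is the count of successive increases when the
    scan ends; stop is True once x >= max_fail (training would halt).
    """
    x = 0
    for prev, curr in zip(val_errors, val_errors[1:]):
        x = x + 1 if curr > prev else 0  # reset whenever the error improves
        if x >= max_fail:
            return x, True
    return x, False
```

For example, `update_mu(1.0, True)` returns 0.1 (toward Newton's method) while `update_mu(1.0, False)` returns 10.0 (toward gradient descent), and a run of `max_fail` successive validation-error increases triggers the stop.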