|
· |
Regression
Node (Least-Squares): The scoring code that generates the multiple linear
regression estimates from the HMEQ data set. The score code can be
used to calculate new prediction estimates by specifying entirely
different values for the input variables in the multiple linear
regression model. The score code first identifies any missing
values in each of the input variables in the model. If any input
variable has a missing value, the target variable is estimated by
its own average value. Otherwise, the score code evaluates the
least-squares model: each input variable is multiplied by its
parameter estimate, and the intercept term is added to calculate the
predicted value. The residual values are then calculated as the
difference between the target values and the fitted values.
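
The scoring steps described for the least-squares model can be sketched in Python. This is only an illustration of the logic, not the SAS score code that the node generates: the input names `x1` and `x2`, the parameter estimates, the target mean, and the function names are all hypothetical.

```python
import math

# Hypothetical parameter estimates; a fitted model would supply its own.
INTERCEPT = 2.0
COEFFICIENTS = {"x1": 0.5, "x2": -1.5}
TARGET_MEAN = 10.0  # assumed average of the target in the training data


def is_missing(value):
    """Treat None and NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def score_least_squares(row, target=None):
    """Return (prediction, residual) for one observation.

    If any input is missing, the prediction falls back to the target mean.
    The residual is target minus prediction, or None when no target is given.
    """
    if any(is_missing(row.get(name)) for name in COEFFICIENTS):
        prediction = TARGET_MEAN
    else:
        prediction = INTERCEPT + sum(
            beta * row[name] for name, beta in COEFFICIENTS.items()
        )
    residual = None if target is None else target - prediction
    return prediction, residual
```

Calling `score_least_squares({"x1": 4.0, "x2": 2.0}, target=3.0)` scores a complete observation, while a row with a missing input receives the target mean as its estimate.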
|
|
· |
Regression
Node (Logistic): The scoring code that generates the logistic
regression estimates from the HMEQ data set. The logistic regression
model is one of the models under comparison in the Assessment node.
|
|
· |
Tree
Node: The scoring code that generates the decision tree modeling
estimates by fitting the binary-valued target variable, i.e.,
bad creditors, from the HMEQ data set. The SAS scoring code displays
the recursive splits as a series of if-then partitioning rules on
the range of values of the input variables, along with the listed
target proportions, that were performed in creating the decision
tree to predict the binary-valued target variable. Simply copy the
SAS program code into a separate SAS program to calculate entirely
different classification estimates by applying the decision tree
model to new data and a new set of input values. The decision tree
model is one of the models under comparison in the Assessment
node.
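
The if-then partitioning rules can be pictured with a small Python sketch. It is not the SAS code produced by the Tree node: the input names `x1` and `x2`, the split points, the leaf numbers, and the leaf proportions are hypothetical placeholders chosen only to show the shape of the rules.

```python
def score_tree(row):
    """Return (leaf_id, p_bad) for one observation using nested if-then rules.

    The split points and leaf proportions are hypothetical placeholders.
    """
    if row["x1"] < 45.0:
        if row["x2"] < 1.0:
            return 1, 0.05  # leaf 1: low proportion of bad creditors
        return 2, 0.30      # leaf 2
    return 3, 0.70          # leaf 3: high proportion of bad creditors


def classify(row, cutoff=0.5):
    """Label an observation as bad (1) when its leaf proportion reaches the cutoff."""
    _, p_bad = score_tree(row)
    return 1 if p_bad >= cutoff else 0
```

Scoring new data amounts to passing each new observation through the same rules, which is why the generated code can be copied into a separate program.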
|
|
· |
Neural
Networks Node: The scoring code that generates the neural
network estimates from the HMEQ data set. If there are any missing
values in any one of the input variables in the model, the scoring
code sets the estimate of the target variable to its own mean.
The score code then standardizes each input variable in the
model. For each hidden layer unit, the code generates the linear
combination of the input layer weight estimates and the previously
computed standardized input variables, adds the input layer bias
term, and applies the activation function. The hidden layer units
are then multiplied by the hidden layer weight estimates and added
together, along with the hidden layer bias term, to generate the
final neural network estimates.
The neural network model is one of the models under
comparison in the Assessment node.
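
The forward pass described for the neural network can be sketched in Python. This is an illustration under stated assumptions, not the node's generated code: the network size (two inputs, two hidden units), every numeric estimate, the input names, and the choice of the hyperbolic tangent as the activation function are all hypothetical.

```python
import math

# Hypothetical estimates for a network with two inputs and two hidden units.
INPUT_MEANS = {"x1": 10.0, "x2": 5.0}
INPUT_STDS = {"x1": 2.0, "x2": 1.0}
HIDDEN_WEIGHTS = [  # one dict of input layer weights per hidden unit
    {"x1": 0.5, "x2": -0.25},
    {"x1": -1.0, "x2": 0.75},
]
HIDDEN_BIASES = [0.1, -0.2]   # input layer bias term for each hidden unit
OUTPUT_WEIGHTS = [1.5, -0.5]  # hidden layer weight estimates
OUTPUT_BIAS = 0.3             # hidden layer bias term
TARGET_MEAN = 0.2             # assumed fallback when an input is missing


def score_network(row):
    """Return the network estimate for one observation."""
    names = list(INPUT_MEANS)
    if any(row.get(n) is None for n in names):
        return TARGET_MEAN
    # Standardize each input variable.
    z = {n: (row[n] - INPUT_MEANS[n]) / INPUT_STDS[n] for n in names}
    # Linear combination plus bias, then activation (tanh assumed), per unit.
    hidden = [
        math.tanh(bias + sum(w[n] * z[n] for n in names))
        for w, bias in zip(HIDDEN_WEIGHTS, HIDDEN_BIASES)
    ]
    # Weighted sum of the hidden units plus the hidden layer bias term.
    return OUTPUT_BIAS + sum(v * h for v, h in zip(OUTPUT_WEIGHTS, hidden))
```

The sketch returns the weighted sum directly; a model for a binary target would typically pass that sum through an additional output function, which is left out here.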
|
|
· |
Princomp/Dmneural
Node (Dmneural): The scoring code that generates the dmneural
network modeling estimates from the HMEQ data set. The scoring code
first displays the separate dummy variables that are created for each
class level of the categorical-valued input variables in the
model. This is followed by imputing missing values in the
interval-valued input variables in the model. The interval-valued
input variables in the model are then standardized, since the input
variables display a wide range of values. The code then displays
the principal component scores computed from the input variables
at each stage of the iterative model. The code then calculates the
fitted values from the square activation function that is selected
at each stage of the model. The predicted values of the additive
nonlinear model are calculated by adding the fitted values from the
first stage to the fitted values of the residuals from the following
stages of the iterative model.
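
The additive structure of the dmneural estimates can be sketched loosely in Python. This is a simplified illustration and does not reproduce the functional form used by the node: it assumes the inputs are already standardized, that each stage uses a single principal component, and that each stage applies the square function to a linear function of its component score. The loadings, intercepts, and slopes are hypothetical.

```python
# Hypothetical stages: the first fits the target, later ones fit residuals.
STAGES = [
    {"loadings": [[0.6, 0.8]], "intercept": 0.1, "slopes": [0.5]},
    {"loadings": [[0.8, -0.6]], "intercept": -0.05, "slopes": [0.25]},
]


def square(z):
    """Square activation function (assumed selected at every stage)."""
    return z * z


def score_additive(z_inputs):
    """Sum the stage-wise fitted values for one standardized observation."""
    total = 0.0
    for stage in STAGES:
        # Principal component scores: loadings applied to standardized inputs.
        scores = [
            sum(l * x for l, x in zip(loading, z_inputs))
            for loading in stage["loadings"]
        ]
        linear = stage["intercept"] + sum(
            b * s for b, s in zip(stage["slopes"], scores)
        )
        # Each stage adds its fitted value to the running prediction.
        total += square(linear)
    return total
```

Because every later stage is fit to the residuals left by the earlier stages, summing the stage contributions yields the overall prediction.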
|
|
· |
Princomp/Dmneural
Node (Principal Components): The scoring code that generates the
principal components estimates from the 2004 major league
baseball hitters data. The scoring code displays up to two
separate principal components that were selected in the node and
the corresponding scree plots that are generated by the
node.
|
|
· |
User-Defined
Node (PROC GENMOD): The scoring code that generates the user-defined modeling
estimates from the PROC GENMOD procedure by
fitting the logistic regression model to predict the binary-valued
target variable for bad clients, BAD, from the HMEQ data set.
|
|
· |
User-Defined
Node (time series): The scoring code that generates the user-defined modeling
estimates from the PROC ARIMA procedure by fitting the time
series model to predict lead production over time.
|
|
· |
Ensemble
Node (Combined): The scoring code that generates the ensemble modeling
estimates by combining the previous logistic regression, neural
network, and decision tree models. The ensemble model is one of the
models under comparison in the Assessment node.
|
|
· |
Ensemble
Node (Combined): The scoring code that generates the ensemble modeling
estimates by combining the modeling estimates from the multiple
linear regression model and the neural network model from the HMEQ data set.
In other words, the code displays the corresponding scoring code
from the multiple linear regression and neural network models.
The fitted values of the ensemble model are calculated by taking the
average of the two separate fitted values.
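
The averaging step can be illustrated with a brief Python sketch. The two component scoring functions are hypothetical stand-ins for the regression and neural network score code, and the input name `x1` and all coefficients are made up for the example.

```python
def score_regression(row):
    """Hypothetical stand-in for the multiple linear regression score code."""
    return 1.0 + 0.5 * row["x1"]


def score_network(row):
    """Hypothetical stand-in for the neural network score code."""
    return 2.0 + 0.25 * row["x1"]


def score_ensemble(row, models=(score_regression, score_network)):
    """Average the fitted values of the component models for one observation."""
    fitted = [model(row) for model in models]
    return sum(fitted) / len(fitted)
```

With `x1 = 2.0`, the two stand-in models give 2.0 and 2.5, so the ensemble estimate is their average.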
|
|
· |
Ensemble
Node (Stratified): The scoring code that generates the
stratified modeling estimates, which combine multiple linear regression
models that are fit separately to each segment or partition of the
training data set. In other words, separate models are created for
each level of segmentation or partitioning of the data that you want
to fit.
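
A stratified model can be pictured as a lookup of the segment's own regression followed by ordinary scoring. The Python sketch below assumes a hypothetical segmentation variable named `segment` with levels "A" and "B", a single input `x1`, and made-up parameter estimates.

```python
# Hypothetical per-segment regression estimates.
SEGMENT_MODELS = {
    "A": {"intercept": 1.0, "slope": 2.0},
    "B": {"intercept": -1.0, "slope": 0.5},
}


def score_stratified(row):
    """Score one observation with the model that belongs to its segment."""
    model = SEGMENT_MODELS[row["segment"]]
    return model["intercept"] + model["slope"] * row["x1"]
```

Each observation is scored only by the model fit to its own segment, so two observations with the same input value can receive different estimates.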
|
|
· |
Ensemble
Node (Bagging): The scoring code that generates the bagging
estimates. Bagging is analogous to bootstrapping: separate
prediction estimates are created by resampling the data that you want
to fit, and the prediction estimates from the multiple
linear regression model are then combined.
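
The resample-and-combine idea behind bagging can be sketched in Python. For brevity, the sketch assumes a regression with a single input rather than a multiple linear regression, and it combines the estimates by simple averaging; the function names are hypothetical and this is not the code generated by the node.

```python
import random


def fit_line(sample):
    """Least-squares fit of y = a + b*x to a list of (x, y) pairs."""
    n = len(sample)
    mean_x = sum(x for x, _ in sample) / n
    mean_y = sum(y for _, y in sample) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in sample)
    if sxx == 0.0:
        return mean_y, 0.0  # degenerate resample: fall back to the mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in sample)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap(data, rng):
    """Draw len(data) observations from data with replacement."""
    return [rng.choice(data) for _ in data]


def bagged_predict(samples, x):
    """Average the predictions of one fitted line per resample."""
    fits = [fit_line(sample) for sample in samples]
    return sum(a + b * x for a, b in fits) / len(fits)
```

A typical use would build the resamples with a seeded generator, for example `samples = [bootstrap(data, random.Random(0)) for _ in range(25)]`, and then call `bagged_predict(samples, x)` for each new input value.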
|
|
· |
Ensemble
Node (Boosting): The scoring code that generates the boosting
model from the logistic regression model by fitting
the categorical-valued target variable where the observations are
weighted. In other words, the observations are modified by
increasing the weight of each observation that has been
misclassified in the previous fit.
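
The reweighting idea can be shown with a short Python sketch. The text does not give the node's weighting formula, so the multiplicative factor of 2 and the renormalization to a total weight of 1 are assumptions made purely for illustration.

```python
def reweight(weights, actual, predicted, factor=2.0):
    """Raise the weights of misclassified observations, then renormalize.

    The factor and the renormalization are illustrative assumptions.
    """
    raised = [
        w * factor if a != p else w
        for w, a, p in zip(weights, actual, predicted)
    ]
    total = sum(raised)
    return [w / total for w in raised]
```

After each fit, the updated weights would be passed to the next fit so that previously misclassified observations have more influence.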
|
|
· |
Memory-Based
Reasoning Node: The scoring code that generates the nearest
neighbor modeling estimates from the HMEQ data set. The scoring code
displays the PROC PMBR procedure with the listed option settings,
such as the smoothing constant of the nearest neighbor model.
|
|
· |
Two-Stage
Model Node: The scoring code that generates the two-stage
modeling estimates by fitting the decision tree classification
model, then fitting the subsequent multiple linear regression model
from the HMEQ data set.
|
|