|
A new feature implemented in KnowledgeMiner (yX) for Excel is
on-the-fly evaluation of self-organized regression models based on the concept of noise immunity.
In addition, a new model quality measure, Descriptive Power, is introduced. It takes
into account the risk that a given data set yields nothing more than a chance model.
The Problem
A key problem in knowledge discovery from data is the final evaluation of the generated models.
Before a model obtained by data mining is applied to a real system, it is essential to know
whether the model is likely to reflect an input-output relationship that exists in reality
or whether it is just a chance model built on noncausal correlations. However, this
information cannot be obtained from the modeling algorithm that generated the model alone.
Some new, external information is required.
Why
Consider this example: a data set has 2 outputs, y1 and y2, and 10 inputs xi,
and two models, M1 and M2, were generated by some
regression-based modeling method (fig. 1; red line: the model, blue line (almost hidden): the original data).
Fig. 1: Model graph of the two models M1 and M2.
For both models an accuracy of 99.9% is reported (model fit on the learning data: R2,
for example, or a more complex criterion like PSE, AIC or BIC).
Judging from this accuracy and from the graphs in fig. 1, there is no reason not to consider
both models "true" models that reflect a causal relation between
output and input. Now assume you are told that only one model
actually describes a causal relationship while the other model simply reflects
stochastic correlations. Even with this information,
which is usually not available in the real world, you cannot decide
from the available information which of the two models is the true model and which one
is the chance model. Only applying the models to some new data (which adds new information) reveals
model M2 as the only valid model (fig. 2):
Fig. 2: Prediction of models M1 and M2.
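The situation described above can be reproduced with a minimal sketch. It does not use the data behind figures 1 and 2. It assumes synthetic Gaussian data, plain least squares in place of the unspecified regression-based method, and hypothetical names (`fit`, `r_squared`, `causal`, `chance`). With 10 inputs and only 11 learning samples, both models fit the learning data almost perfectly, and only new data separates them.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r_squared(coef, X, y):
    """R^2 of a fitted model evaluated on the data (X, y)."""
    A = np.column_stack([np.ones(len(X)), X])
    residual = y - A @ coef
    return 1.0 - (residual @ residual) / np.sum((y - y.mean()) ** 2)

# Assumption: 10 inputs as in the example, 11 learning samples, 200 new samples.
n_inputs, n_learn, n_new = 10, 11, 200
X_learn = rng.normal(size=(n_learn, n_inputs))
X_new = rng.normal(size=(n_new, n_inputs))

# One output really depends on the inputs (noise-free for simplicity) ...
weights = rng.normal(size=n_inputs)
y_causal_learn, y_causal_new = X_learn @ weights, X_new @ weights
# ... the other output is pure noise, unrelated to the inputs.
y_chance_learn, y_chance_new = rng.normal(size=n_learn), rng.normal(size=n_new)

causal = fit(X_learn, y_causal_learn)
chance = fit(X_learn, y_chance_learn)

fit_causal = r_squared(causal, X_learn, y_causal_learn)
fit_chance = r_squared(chance, X_learn, y_chance_learn)
new_causal = r_squared(causal, X_new, y_causal_new)
new_chance = r_squared(chance, X_new, y_chance_new)

print(f"fit on learning data: causal {fit_causal:.3f}, chance {fit_chance:.3f}")
print(f"fit on new data:      causal {new_causal:.3f}, chance {new_chance:.3f}")
```

The learning-data fit is indistinguishable for the two models; the difference only shows up in the second printed line.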
The Noise Filtering Behavior of an Algorithm
This example shows that no closeness-of-fit criterion measured on a set of learning data
suffices to evaluate a model's predictive and descriptive power. Recent
research has shown that model evaluation requires at least a two-stage validation approach:
Level 1
A noise filtering mechanism integrated into the modeling process
that avoids overfitting the model to the learning
data by employing information not already used for building the
model (the concept of an external criterion).
Level 2
Since the noise filtering of level 1 cannot be regarded as an ideal noise filter, a characteristic is
required that describes the true noise filtering behavior of the modeling
algorithm and so provides additional external information for model validation.
Such a noise sensitivity or noise immunity characteristic has been obtained
for and implemented in KnowledgeMiner (yX) for Excel.
The noise sensitivity characteristic expresses the pretended model quality
Qu that can be obtained from a data set of M potential
inputs and N samples that consists of random data only. It is a pretended model quality (accuracy) because, by definition,
there is no causal relationship between stochastic variables a priori (the true and
best model quality is Q = 0). Any model of quality Q > 0 built from random samples
therefore only pretends to have that quality and to describe the
input-output relationship it found, while we know that this relationship does not actually exist. This means that, given a
number of potential inputs M and a number of samples N, a threshold quality
Qu = f(N, M) can be calculated that any model's quality Q
must exceed for the model to be considered valid, that is, likely to describe a relevant relationship
between input and output.
Otherwise, a model of quality Q <= Qu is assumed invalid, since its
quality Q can also be obtained by a chance model.
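A threshold of this kind can be estimated by simulation. The sketch below is an assumption-laden illustration, not the characteristic implemented in KnowledgeMiner (yX): it uses plain least squares on all M inputs as the modeling algorithm, R2 on the learning data as the quality Q, and the hypothetical names `pretended_quality` and `is_valid`. It fits models to pure noise many times and takes the mean quality as Qu.

```python
import numpy as np

def pretended_quality(n_samples, n_inputs, n_trials=200, seed=0):
    """Estimate Qu = f(N, M): mean learning-data R^2 of OLS fitted to pure noise."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_trials):
        # Inputs and output are independent random samples: the true quality is 0.
        X = rng.normal(size=(n_samples, n_inputs))
        y = rng.normal(size=n_samples)
        A = np.column_stack([np.ones(n_samples), X])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        residual = y - A @ coef
        scores.append(1.0 - (residual @ residual) / np.sum((y - y.mean()) ** 2))
    return float(np.mean(scores))

def is_valid(q, n_samples, n_inputs):
    """A model counts as valid only if its quality Q exceeds the threshold Qu."""
    return q > pretended_quality(n_samples, n_inputs)

qu_short = pretended_quality(n_samples=20, n_inputs=10)
qu_long = pretended_quality(n_samples=200, n_inputs=10)
print(f"Qu(N=20, M=10)  = {qu_short:.2f}")
print(f"Qu(N=200, M=10) = {qu_long:.2f}")
```

For this particular stand-in algorithm the estimate should come out close to M / (N - 1), the known expected R2 of least squares on pure noise, so the threshold falls as the number of samples grows.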
Figure 3 shows the noise sensitivity characteristics of different modeling
algorithms for comparison, and figure 4 gives an example that explains the concept of
the noise sensitivity characteristic.
Fig. 3: Noise sensitivity characteristics of different modeling algorithms.
Fig. 4: The concept of noise immunity.
Descriptive Power
In addition to deciding whether a model appears to be valid or not, the
noise sensitivity characteristic is also a tool for directly calculating the
descriptive power of an input-output model. It introduces a new model
quality measure that is adjusted for model complexity and the algorithm's
noise sensitivity behavior and that is therefore independent of the dimensions of the learning data set.
The Descriptive Power (DP) is defined in terms of Q, the obtained accuracy of
the evaluated model, and Qu(N, L), the reference
accuracy calculated from the number of samples N the model was created on and
the number of input variables L the model is actually composed of
(selected relevant inputs), with L <= M.
Figure 5 shows an example of two models M1 and M2
that have the same accuracy Q = Q1 = Q2 but different
Descriptive Power, because the models were obtained from data sets of different
sample lengths and thus with different noise immunity of the modeling algorithm.
Fig. 5: Descriptive Power of two models.
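The effect illustrated in figure 5 can be sketched numerically. The text here does not spell out the formula for DP, so the sketch assumes one plausible form, DP = (Q - Qu) / (1 - Qu), floored at 0. This normalization is an assumption for illustration and may differ from the definition used in KnowledgeMiner (yX). The reference accuracy is also a stand-in: `reference_accuracy` uses L / (N - 1), the expected R2 of plain least squares on pure noise, rather than the tool's own characteristic.

```python
def reference_accuracy(n_samples, n_selected):
    """Hypothetical stand-in for Qu(N, L): expected R^2 of OLS fitted to pure noise."""
    return n_selected / (n_samples - 1)

def descriptive_power(q, n_samples, n_selected):
    """Assumed form DP = (Q - Qu) / (1 - Qu), floored at 0 (not taken from the source)."""
    qu = reference_accuracy(n_samples, n_selected)
    if qu >= 1.0:
        # A chance model can already reach any accuracy: nothing is left to describe.
        return 0.0
    return max(0.0, (q - qu) / (1.0 - qu))

# Two models with the same accuracy Q and the same number of selected inputs L,
# built on data sets of different sample lengths N.
dp_short = descriptive_power(0.8, n_samples=20, n_selected=5)
dp_long = descriptive_power(0.8, n_samples=200, n_selected=5)
print(f"DP with N=20:  {dp_short:.2f}")
print(f"DP with N=200: {dp_long:.2f}")
```

Under these assumptions the model built on the longer data set receives the higher Descriptive Power, because less of its accuracy can be explained by chance.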
Model Evaluation
The concepts of an algorithm's noise sensitivity and of Descriptive Power provide the
additional external information required to check a model's validity, that is,
whether and to what extent it differs from a chance model. Returning to
the example at the top of this page, this means that suspect models can now be
identified automatically, right after modeling. For model M1
the following evaluation result might have been reported in KnowledgeMiner (yX):
MODEL EVALUATION: INVALID
The requested noise immunity could not be applied for the chosen sample length.
Instead, a POOR noise immunity was used for modeling, only. To get the requested noise
immunity increase the number of samples to at least 116.
The model seems not reflecting a valid relationship. The likelihood that
the data used for modeling is actually random data with no existing
input-output relationship is 66%.
Keep in mind, however, that the model was built using POOR noise immunity.
This makes evaluation of the model more uncertain.
The model was generated by self-organizing high-dimensional modeling.
The implemented model evaluation approach is a powerful tool that helps minimize the risk
of misinterpreting the accuracy and quality of models and of using invalid models
that look great but only pretend to have high quality while they are actually
overfitted.
Noise Immunity Levels in KnowledgeMiner (yX)
In KnowledgeMiner (yX) the noise immunity levels shown in figure 6 are
available to the user for building models of corresponding validity and reliability.
Each level keeps the pretended model accuracy below the limit assigned to that level.
If you choose a GOOD noise immunity for building a model, for example,
KnowledgeMiner ensures that the resulting model
usually does not have a pretended accuracy above 8%, while for a POOR noise
immunity the pretended accuracy can reach up to 30%.
It is important to note that under certain conditions, especially if the
number of samples of the learning data set is small, validity and reliability
of a model on the one hand and model accuracy on the other hand may
become mutually exclusive goals: if you request higher model reliability, model
accuracy may decrease, and vice versa.
Fig. 6: Noise immunity levels for model self-organization.
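The link between a noise immunity level and the number of samples it requires can be sketched as follows. Only the two limits named above (8% for GOOD, 30% for POOR) are taken from the text. Everything else is an assumption: the names `LEVEL_LIMITS`, `pretended_accuracy`, and `min_samples` are hypothetical, and the pretended accuracy is approximated by M / (N - 1), the expected R2 of plain least squares on pure noise. The numbers it produces are therefore not those KnowledgeMiner (yX) would report.

```python
# Limits on pretended accuracy per level; only these two values appear in the text.
LEVEL_LIMITS = {"GOOD": 0.08, "POOR": 0.30}

def pretended_accuracy(n_samples, n_inputs):
    """Hypothetical stand-in for Qu(N, M): expected R^2 of OLS fitted to pure noise."""
    return n_inputs / (n_samples - 1)

def min_samples(level, n_inputs, max_samples=100_000):
    """Smallest N for which the pretended accuracy stays within the level's limit."""
    limit = LEVEL_LIMITS[level]
    for n in range(n_inputs + 2, max_samples + 1):
        if pretended_accuracy(n, n_inputs) <= limit:
            return n
    raise ValueError("no sample length up to max_samples satisfies the limit")

for level in ("POOR", "GOOD"):
    print(level, min_samples(level, n_inputs=10))
```

A stricter level demands more samples for the same number of inputs, which is why a requested level may be unavailable for a short data set.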
|