Does My Model Reflect A Causal Relationship?

One new feature implemented in the PLATINUM edition of KnowledgeMiner 5 is final, on-the-fly evaluation of linear and nonlinear GMDH models. This document shows how this new model evaluation approach actively helps to answer the above question. It also introduces a new model quality measure that takes into account both the noise filtering power of the algorithm and model complexity: Descriptive Power.

The Problem

A key problem in knowledge discovery from data is the final evaluation of generated models. This evaluation is an important precondition for applying models obtained by data mining. From data mining alone, it is impossible to decide whether the estimated model adequately reflects the causal relationship between input and output, or whether it is just a stochastic model with noncausal correlations. In addition to properly working noise filtering that avoids overfitting the learning data, model evaluation needs some new, external information to justify a model's quality, i.e., both its predictive and descriptive power.

Why

Let's have a look at this example: Based on a data set of 2 outputs and a few inputs, KnowledgeMiner creates a GMDH regression model for each output variable Y1 and Y2 (fig. 1).

Fig. 1: Model graph of two models

For model 1, a Coefficient of Determination (R2) of 0.9998, an Approximation Error Variance (AEV) of 0.0002, and a cross-validated Prediction Error Sum of Squares (PESS) of 0.0005 are reported, while model 2 shows an R2 of 0.9997, an AEV of 0.0003, and a PESS of 0.0006. Judging from these or any other common model quality or error criteria, and from the graphs of fig. 1, there is no reason not to classify both models as "true" models that reflect a causal relation between output and input. This assumption is further supported by a most important difference between KnowledgeMiner and the vast majority of data mining tools: its inductive, self-organized model synthesis, which already implements powerful noise filtering during modeling (see also the "Self-Organising Data Mining" book, section 3.2).

However, the person who created the data set for this example states that only one model actually describes a causal relationship, while the other model simply reflects some stochastic correlations, because its output and inputs are completely independent of one another (random numbers). Even with this information given, which is usually not the case for real-world data, the modeler cannot decide from the available information which of the two models is the true model. Only applying the models to some new data (which adds new information) reveals the true model (fig. 2):

Fig. 2: Prediction of the two models
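
The effect in this example can be reproduced outside KnowledgeMiner with a few lines of code. The following is a minimal sketch, not the GMDH algorithm: it assumes an ordinary least-squares model as a stand-in, and the sample sizes, function names, and random seed are illustrative choices only. A model fitted to pure random numbers reaches an almost perfect R2 on the learning data and still fails on new data.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination relative to the mean of y_true."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_linear(X, y):
    """Least-squares fit of y = b0 + X @ b (stand-in for a real modeling tool)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply the fitted coefficients to new inputs."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

# Assumed setup: 10 samples, 9 inputs, output independent of all inputs.
rng = np.random.default_rng(0)
n_train, n_inputs = 10, 9
X_train = rng.standard_normal((n_train, n_inputs))
y_train = rng.standard_normal(n_train)

# New data drawn from the same (noncausal) process.
X_new = rng.standard_normal((200, n_inputs))
y_new = rng.standard_normal(200)

coef = fit_linear(X_train, y_train)
r2_fit = r_squared(y_train, predict_linear(coef, X_train))
r2_new = r_squared(y_new, predict_linear(coef, X_new))

print(f"R2 on learning data: {r2_fit:.4f}")
print(f"R2 on new data:      {r2_new:.4f}")
```

With as many free coefficients as samples, the fit on the learning data is essentially exact, which is why the closeness-of-fit value alone says nothing about causality here.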

This example clearly shows that no "closeness-of-fit" measure suffices on its own to finally evaluate a model's predictive and descriptive power. Recent research has shown that model evaluation requires (at least) a two-stage validation approach:

1. Level
Noise filtering to avoid overfitting the learning data (hypothesis testing), based on external information not used for creating a model candidate (hypothesis), as an integrated part of the "Learning" process. A corresponding tool that KnowledgeMiner has used within "Learning" from the beginning is leave-one-out cross-validation, expressed by the PESS criterion.

2. Level
A characteristic that describes the noise filtering behavior of the "Learning" process, used to justify model quality based on external information not yet used in the first validation level. An approximation of this noise filtering characteristic is implemented in KnowledgeMiner 5 Platinum for the first time for linear and nonlinear GMDH models. The characteristic was obtained by running a Monte Carlo simulation many times, so it expresses a kind of new, independent "common knowledge" against which any model can and must be adjusted (see also "Validation in Self-organising Data Mining"). Figure 3 shows a detail of the characteristic for linear GMDH models.

Fig. 3: Noise filtering characteristic of KnowledgeMiner's GMDH algorithm
M: number of inputs; N: number of samples; Qu: virtual quality of a model
Qu=1: noise filtering does not work at all; Qu=0: ideal filtering
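
The first validation level, leave-one-out cross-validation, can be sketched as follows. This is an illustration under stated assumptions: it uses a plain least-squares model rather than GMDH, the function names `loo_pess` and `fit_error` are hypothetical, and the result is the unnormalized sum of squared leave-one-out errors, which may be scaled differently from the PESS value KnowledgeMiner reports.

```python
import numpy as np

def loo_pess(X, y):
    """Sum of squared leave-one-out prediction errors of a linear model.

    Each sample is predicted by a model fitted on all other samples, so the
    error for that sample is based on information not used for the fit.
    """
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # leave sample i out
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        errors[i] = y[i] - A[i] @ coef    # predict the held-out sample
    return float(np.sum(errors ** 2))

def fit_error(X, y):
    """Sum of squared residuals when all samples are used for the fit."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sum((y - A @ coef) ** 2))

# Assumed demo data: 20 samples, 3 inputs, only the first input is relevant.
rng = np.random.default_rng(1)
X_demo = rng.standard_normal((20, 3))
y_demo = 2.0 * X_demo[:, 0] + 0.1 * rng.standard_normal(20)

pess_demo = loo_pess(X_demo, y_demo)
sse_demo = fit_error(X_demo, y_demo)
print(f"fit error: {sse_demo:.4f}, leave-one-out error: {pess_demo:.4f}")
```

For a least-squares model the leave-one-out error is never smaller than the fit error, which is what makes it usable as an external criterion against overfitting.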

There are two reasons for a second validation level: (1) the noise filtering implemented in level 1 is very unlikely to be an ideal noise filter and thus does not work properly in every case (see example), and (2) it yields a new model quality measure that is adjusted by the noise filtering power of the algorithm.
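
How such a characteristic can be approximated by Monte Carlo simulation is sketched below. The sketch rests on several assumptions that are not taken from KnowledgeMiner: the modeling step is plain least squares without any noise filtering, quality is measured as R2, the virtual quality is taken as the mean over all runs (a high quantile would be a more conservative choice), and the name `virtual_quality` is hypothetical.

```python
import numpy as np

def virtual_quality(n_samples, n_inputs, runs=200, seed=0):
    """Estimate the quality reachable on purely random data.

    Repeatedly fits a linear model to inputs and an output that are
    independent random numbers and averages the resulting R2 values.
    """
    rng = np.random.default_rng(seed)
    scores = np.empty(runs)
    for k in range(runs):
        X = rng.standard_normal((n_samples, n_inputs))
        y = rng.standard_normal(n_samples)       # independent of X
        A = np.column_stack([np.ones(n_samples), X])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        scores[k] = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return float(scores.mean())

# A small grid over N (samples) and M (inputs), similar in spirit to fig. 3.
for n in (20, 50, 100):
    row = [f"{virtual_quality(n, m):.2f}" for m in (2, 5, 10)]
    print(f"N={n:3d}  M=2,5,10 ->", "  ".join(row))
```

Because this stand-in does no noise filtering at all, its values are higher than those of an algorithm that filters noise during learning. For plain least squares on normally distributed random data the expected R2 is M/(N-1), so the estimate grows with the number of inputs and shrinks with the number of samples.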

The noise filtering characteristic expresses a virtual model quality Qu that can be obtained when using a data set of M potential inputs and N random samples. It is a virtual model quality because, by definition, there is no causal relationship between stochastic variables (true model quality Q=0; for a definition of model quality Q, please see "Validation in Self-organising Data Mining"), yet models of quality Q > 0 are actually obtained which, when using random samples (see example above), just reflect stochastic, nonexistent relationships. As a result, given any number of potential inputs M and number of samples N, KnowledgeMiner can calculate a threshold quality Qu=f(N, M) that any model's quality Q must exceed for the model to be considered valid, in that it describes some relevant relationship between input and output. Otherwise, a model of quality Q <= Qu is assumed invalid, since its quality Q can also be reached when simply using independent variables, which means that this model does not differ from a model of just stochastic correlations. It's simply garbage.

In addition to deciding whether a model appears to be valid or not, the noise filtering characteristic is also a tool for quantifying to what extent the data is described by a causal relationship between input and output. This introduces a new model quality measure adjusted for noise filtering and model complexity: Descriptive Power (DP), which is defined as:

[Formula: definition of Descriptive Power (DP)]

with Q as the measured quality of the evaluated model (a closeness-of-fit measure, R2 for example) and Qu(N, L) as the reference quality calculated from the number of samples N the model was created on and from the number of input variables L the model is actually composed of (selected relevant inputs), with L <= M. This means that Descriptive Power is corrected for any virtual quality that may exist and directly accounts for model complexity. For example, two models M1 and M2 show the same quality Q = Q1 = Q2, but M1 uses more relevant inputs than M2 to reach that quality Q, so, with L1 > L2, the Descriptive Power of M2 is higher than that of M1.
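
The formula itself is not reproduced in this text, so the following sketch uses an assumed form that is merely consistent with the properties described here: DP is 0 for Q <= Qu, reaches 100% for Q = 1, and decreases as Qu(N, L) grows. Both the normalized expression and the Qu values in the usage lines are illustrative assumptions, not KnowledgeMiner's actual definition or numbers.

```python
def descriptive_power(q, qu):
    """Assumed form: share of quality above the virtual quality Qu.

    q  : measured model quality, e.g. R2
    qu : reference quality Qu(N, L) reachable on random data (0 <= qu < 1)
    """
    if not 0.0 <= qu < 1.0:
        raise ValueError("qu must lie in [0, 1)")
    if q <= qu:
        # The quality can also be reached with random inputs: no descriptive power.
        return 0.0
    return (q - qu) / (1.0 - qu)

# A model that fits well but not better than random data allows (assumed Qu).
print(descriptive_power(0.8173, 0.90))

# Same quality Q, but the first model needs more inputs, so its Qu is higher
# (assumed values) and its Descriptive Power is lower.
dp_m1 = descriptive_power(0.90, 0.70)
dp_m2 = descriptive_power(0.90, 0.50)
print(dp_m1, dp_m2)
```

The second pair mirrors the M1/M2 comparison above: with identical Q, the model built from fewer inputs is measured against a lower reference quality and therefore scores higher.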

The bottom line

KnowledgeMiner 5 Platinum now evaluates a created model and calculates its Descriptive Power on the fly. You don't have to take care of it yourself. KnowledgeMiner provides all corresponding information in the model report to make you more effective and successful in your data mining and knowledge extraction efforts.

Back to our example above, KnowledgeMiner 5 Platinum will provide this additional information in the report for the two models (fig. 4):

MODEL EVALUATION:
The model cannot be validated, because noise filtering does not work on this unsufficently sized information basis. The obtained model accuracy can also be reached when just using random numbers as input data.
Increase the
number of samples to about 75 and/or decrease the
number of potential input variables to below 3.

a) Report of Model 1 --> status: uncertain

MODEL EVALUATION:
The model appears to reflect a valid relationship. It describes 98 % of the data.
For the chosen model type, modeling is based on a VERY POORLY sized information basis, which may result in a not properly working noise filtering. To improve the noise filtering capability of the algorithm and thus improving model reliability, it is highly recommended to decrease the variables/sample ratio. If possible increase the
Number of Samples: to above 131 and/or decrease the
Number of Variables: to below 3.

b) Report of Model 2 --> status: valid

Fig. 4: Reported evaluation results of the two models

As a result of modeling, we now learn that model 2 really does very well (DP of 98%), while model 1 is uncertain. When we follow the recommendations given in the report of model 1 and decrease the number of potential inputs to 4 in a second modeling run, KnowledgeMiner comes up with this report for model 1 (fig. 5):

MODEL EVALUATION:
The model seems to reflect a random relation only. Either the data used is actually noncausal or the noise-to-signal ratio of the data is too high to validate a relevant relation. If the data is considered noncausal, the best model that describes the data is just the mean value of the target variable: y = 0.2469.

Fig. 5: Evaluation result of model 1 after remodeling --> status: invalid

This result, which is the true result, is all the more interesting because the model still fits the data quite well (R2 = 0.8173; see also the model graph in fig. 6; would you have considered this model invalid when viewing that graph?). Nevertheless, KnowledgeMiner correctly calculates a Descriptive Power of 0% for this model.

Fig. 6: Model graph of model 1 after remodeling
--> looks good, but the closeness-of-fit is misleading: A calculated Descriptive Power
of 0% correctly indicates the true noncausal nature of the data

In only two steps, and using KnowledgeMiner 5 Platinum on some training data only, we have learned that model 2 is a valid model that will generalize well, while model 1 merely feigns high quality and a relation that does not exist in reality and therefore has to be rejected.

For the first time, the implemented two-stage model validation approach provides active decision support in model evaluation, minimizing the risk of misinterpreting models and of using invalid models that just reflect some noncausal correlation.

We appreciate your feedback about your experience and results with this feature, or without it when using other tools.

©2002, Frank Lemke

© 2001-2007 Script Software Intl.