Problems of modelling complex objects can be solved by
deductive logical-mathematical methods or by inductive
sorting-out methods. Deductive methods have advantages for
rather simple modelling problems, where the theory of the object
being modelled is known and a model can therefore be developed
from physically based principles using the user's knowledge
of the process. Besides this information aspect, the practical
applicability of modelling techniques and tools plays
a significant part in how widely and variously they can be
offered as user support. The user is normally interested in the
solution of the initial problem and has barely any expert
knowledge of deductive mathematical modelling.
Efforts to use the known tools of artificial intelligence
were unsuccessful in many past cases. The reason is that
methods of artificial intelligence are based on extracting
knowledge from human skills in a subjective and creative
domain, namely model building. Moreover, this approach cannot
solve the significant problems of modelling complex systems,
such as inadequate a priori information, a great number of
unmeasurable variables, noisy and extremely short data samples,
and ill-defined objects with fuzzy characteristics. In this
case knowledge extraction from data, i.e. deriving a model from
experimental measurements using inductive methods, has
advantages for rather complex objects about which only little
a priori knowledge is available.
One direction of development that takes up these practical
demands is the self-organization of mathematical models,
which can be realized by means of statistical learning networks
such as GMDH algorithms.
In classical GMDH algorithms it has to be chosen, for each
generated layer, whether the partial models are linear or
nonlinear functions. Lemke [1] has developed an
algorithm for the generation of optimum partial models. A
complete polynomial of second degree
$f(x_i, x_j) = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2$
is optimized using various selection criteria such as the PESS
criterion.
In contrast to classical algorithms, this one can synthesize
linear or nonlinear models of optimal complexity depending on
the object structure, and it achieves a meaningful reduction
of model complexity in relation to the noise level of the
data. This results in more flexible modelling in each layer,
because a partial model may contain none ($y = a_0$), one, or
both input variables in every possible combination, depending
on their actual contribution. The aim is to avoid, for short
and very noisy data samples, the inclusion of redundant
variables which, once part of the model, could not be excluded
afterwards. In the end, simpler models can therefore be
expected.
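The selection of an optimum partial model can be illustrated with a minimal sketch. It is not the algorithm referred to above; it assumes that PESS denotes the leave-one-out prediction error sum of squares, that every subset of the five non-constant terms is a candidate, and that the coefficients are estimated by ordinary least squares. The names `TERMS`, `pess` and `best_partial_model` are hypothetical, and the data are synthetic.

```python
import itertools
import numpy as np

# Candidate terms of the complete second-degree polynomial in (xi, xj).
TERMS = {
    "xi": lambda xi, xj: xi,
    "xj": lambda xi, xj: xj,
    "xi*xj": lambda xi, xj: xi * xj,
    "xi^2": lambda xi, xj: xi ** 2,
    "xj^2": lambda xi, xj: xj ** 2,
}

def pess(X, y):
    """Leave-one-out prediction error sum of squares (assumed meaning of PESS)."""
    total = 0.0
    for k in range(len(y)):
        keep = np.arange(len(y)) != k
        coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        total += float((y[k] - X[k] @ coef) ** 2)
    return total

def best_partial_model(xi, xj, y):
    """Return (score, term names, coefficients) of the subset with the lowest PESS."""
    best = None
    names = list(TERMS)
    for r in range(len(names) + 1):  # r = 0 is the constant model y = a0
        for subset in itertools.combinations(names, r):
            cols = [np.ones_like(xi)] + [TERMS[n](xi, xj) for n in subset]
            X = np.column_stack(cols)
            score = pess(X, y)
            if best is None or score < best[0]:
                coef, *_ = np.linalg.lstsq(X, y, rcond=None)
                best = (score, subset, coef)
    return best

# Synthetic illustration: a short, noisy sample generated from a linear law in xi.
rng = np.random.default_rng(0)
xi = rng.normal(size=20)
xj = rng.normal(size=20)
y = 1.0 + 2.0 * xi + 0.1 * rng.normal(size=20)
score, terms, coef = best_partial_model(xi, xj, y)
print(terms, coef)
```

Because the candidates are scored on points left out of the fit, a term that only fits noise tends to raise the score, which is the mechanism by which redundant variables are kept out of a layer in this sketch.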
Successful applications of GMDH algorithms are known
especially in those areas where theoretical systems analysis
is not applicable because of the complexity of the object
being examined, the state of knowledge of the related scientific
theory, and the time required. An important area, especially
for decision support systems, is the analysis and prediction
of systems of characteristics. In the following we present
one recent example in which the KnowledgeMiner tool was used.
3.1. Solvency Checking
The basis for the examination and automatic model synthesis
was a set of 19 anonymous characteristics for each of 81
companies, which a banking establishment had used to decide on
each company's solvency. The bank chose 10 decisions to serve
for checking the results; the other 71 decisions were used as
the learning set for modelling. There are several methodologies
for obtaining the required models using GMDH but, in contrast
to neural networks, each of them delivers statements about the
influence of the individual characteristics on the decision.
A. Model of the dependence of the decision on the
variables
Linear models $y_M = \sum_i a_i x_i$ and nonlinear static
models of the decision variable were generated from the 19
characteristics $x_i$. The decision variable was set to +1
("positive") or -1 ("negative") according to the decision.
All models obtained extracted the variables $x_5$, $x_8$,
$x_{10}$, $x_{15}$ as significant, e.g.
$y_M = -3.4528 + 0.1174 x_5 + 0.1701 x_{15} - 0.551 x_8 + 1.311 x_{10}$.
These four variables can be interpreted as the main decision
variables.
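A minimal sketch shows how a model of this kind could be turned into a decision. The coefficients are those of the example model quoted above, with decimal commas read as decimal points. The decision rule (positive if the model output exceeds 0, the midpoint between the coded values +1 and -1) is an assumption, as are the names `model_a` and `decide` and the input values, which are invented for illustration only.

```python
# Coefficients of the example linear model quoted in the text.
INTERCEPT = -3.4528
COEFFS = {"x5": 0.1174, "x15": 0.1701, "x8": -0.551, "x10": 1.311}

def model_a(x):
    """Evaluate yM for a mapping of characteristic name -> value."""
    return INTERCEPT + sum(a * x[name] for name, a in COEFFS.items())

def decide(x, threshold=0.0):
    """Assumed rule: positive decision if yM exceeds the threshold."""
    return "+" if model_a(x) > threshold else "-"

# Invented characteristic values, not taken from the bank's data.
company = {"x5": 0.0, "x15": 0.0, "x8": 0.0, "x10": 10.0}
print(model_a(company), decide(company))
```

The sign and size of each coefficient also give the explanation component mentioned above: in this example model, a larger $x_{10}$ pushes towards a positive decision and a larger $x_8$ towards a negative one.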
B. Modelling of independent systems of equations
Another, more laborious way is the generation of linear
or nonlinear systems of equations separately for
all positive and all negative decisions. In the case of linear
models this gives
$x^+ = A^+ x^+$; $x^- = A^- x^-$; $A = \{a_{ij}\}$, with $a_{ii} = 0$.
Such systems grasp the spectrum of decisions better because
they have a greater breadth of variation, and they can be
interpreted as well. The corresponding model values $x_i^+$ and
$x_i^-$ are then calculated for the checking set variables
$x_i^c$. Membership of class + or - was decided on the basis
of the deviations $\Delta_i^+ = x_i^c - x_i^+$ and
$\Delta_i^- = x_i^c - x_i^-$, respectively. The results in
Table I were obtained in the following cases:
TABLE I Classifications obtained from systems of equations

|    | c1 | c2  | c3 | c4 | c5 | c6 | c7 | c8 | c9 | c10 |
|----|----|-----|----|----|----|----|----|----|----|-----|
| y  | -  | +/- | +  | -  | +  | -  | -  | +  | +  | -   |
| yM | +  | +   | +  | -  | +  | -  | -  | +  | +  | -   |

a. $s^+ = \sum_i |\Delta_i^+|$; $s^- = \sum_i |\Delta_i^-|$.
b. $s^+ = \sum_{i \in N} |\Delta_i^+|$; $s^- = \sum_{i \in N} |\Delta_i^-|$,
in which $N$ is the set of indices of those variables having
influence in model A.
c. $s^+ = \sum_{i \in M^+} \Delta_i^+ x_i^c$; $s^- = \sum_{i \in M^-} \Delta_i^- x_i^c$,
in which $M^+$, $M^-$ are the sets of indices of those input
variables for which the best-fitting models were obtained
(for positive and negative decisions, respectively).
d. A further way of decision making is to calculate, for the
variables $x_i^c$, their deviations $\Delta_i^+$ and
$\Delta_i^-$ and to classify each variable on the basis of the
minimum deviation. The final decision is made as the sum of
all classifications.
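Cases a, b and d can be sketched in a few lines. The sketch rests on several assumptions that the text does not state: the model values are obtained by applying each coefficient matrix to the checking vector itself, the class with the smaller summed deviation is chosen, "minimum deviation" in case d means the smaller absolute deviation, and ties fall to the negative class. The function names and the two-variable matrices are hypothetical toy values.

```python
def model_values(A, x):
    """Compute A x for a coefficient matrix given as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def deviations(A, x_c):
    """Deviations Delta_i = x_i^c - x_i(model) for one checking-set company."""
    return [xc - xm for xc, xm in zip(x_c, model_values(A, x_c))]

def classify_sum(A_plus, A_minus, x_c, indices=None):
    """Cases a and b: compare summed absolute deviations, optionally over a subset N."""
    idx = range(len(x_c)) if indices is None else indices
    d_plus, d_minus = deviations(A_plus, x_c), deviations(A_minus, x_c)
    s_plus = sum(abs(d_plus[i]) for i in idx)
    s_minus = sum(abs(d_minus[i]) for i in idx)
    return "+" if s_plus < s_minus else "-"

def classify_vote(A_plus, A_minus, x_c):
    """Case d: each variable votes for the class with the smaller absolute deviation."""
    d_plus, d_minus = deviations(A_plus, x_c), deviations(A_minus, x_c)
    votes = sum(1 if abs(p) < abs(m) else -1 for p, m in zip(d_plus, d_minus))
    return "+" if votes > 0 else "-"

# Toy systems with zero diagonal: "+" companies obey x1 = x2, "-" companies x1 = -x2.
A_plus = [[0.0, 1.0], [1.0, 0.0]]
A_minus = [[0.0, -1.0], [-1.0, 0.0]]
print(classify_sum(A_plus, A_minus, [2.0, 2.1]))
print(classify_vote(A_plus, A_minus, [2.0, -1.9]))
```

Passing `indices` restricts the sums to a chosen index set, which corresponds to case b when the set holds the variables that were significant in model A.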
C. Synthesis
A synthesis of different classifications makes it possible to
describe the wide spectrum of possible decisions better without
losing the explanation component. Table II shows a synthesis
on the basis of majority decisions.
TABLE II Synthesis of different classifications

| checking set | target | model A | model B.b | model B.d | synthesis | value |
|--------------|--------|---------|-----------|-----------|-----------|-------|
| c1           | -      | +       | -         | -         | -         | true  |
| c2           | +/-    | +       | +         | +         | +         | true  |
| c3           | +      | +       | +         | +         | +         | true  |
| c4           | -      | -       | -         | -         | -         | true  |
| c5           | +      | +       | +         | +         | +         | true  |
| c6           | -      | -       | -         | -         | -         | true  |
| c7           | -      | -       | -         | -         | -         | true  |
| c8           | +      | +       | +         | +         | +         | true  |
| c9           | +      | +       | +         | -         | +         | true  |
| c10          | -      | -       | -         | -         | -         | true  |
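The majority decision behind Table II can be reproduced with a short sketch. The classifications are copied from the table; the function names are hypothetical, and treating the target "+/-" of c2 as matched by either sign is an assumption suggested by the table marking that row as true.

```python
# Rows of Table II: checking case -> (target, model A, model B.b, model B.d).
TABLE_II = {
    "c1": ("-", "+", "-", "-"),
    "c2": ("+/-", "+", "+", "+"),
    "c3": ("+", "+", "+", "+"),
    "c4": ("-", "-", "-", "-"),
    "c5": ("+", "+", "+", "+"),
    "c6": ("-", "-", "-", "-"),
    "c7": ("-", "-", "-", "-"),
    "c8": ("+", "+", "+", "+"),
    "c9": ("+", "+", "+", "-"),
    "c10": ("-", "-", "-", "-"),
}

def majority(votes):
    """Return the sign chosen by most classifiers (an odd number avoids ties)."""
    return "+" if votes.count("+") > votes.count("-") else "-"

def synthesize(table):
    """Map each case to (synthesis decision, whether it agrees with the target)."""
    result = {}
    for case, (target, *votes) in table.items():
        decision = majority(votes)
        # Assumption: an undecided target "+/-" is matched by either sign.
        result[case] = (decision, decision in target.split("/"))
    return result

for case, (decision, ok) in synthesize(TABLE_II).items():
    print(case, decision, ok)
```

Only c1 and c9 involve a disagreement between the three models; in both cases the two agreeing models outvote the third, which is how the synthesis corrects the misclassification of c1 by model A.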