About KnowledgeMiner® (yX)
Information is not knowledge. - Albert Einstein
What is KnowledgeMiner (yX) for Excel?
KnowledgeMiner (yX) for Excel is a knowledge mining tool that autonomously
builds predictive and descriptive models from data stored in Microsoft Excel.
It supports both major releases of Microsoft Excel, 2004 and 2008. The
modeling engine of KnowledgeMiner (yX) for Excel implements unique modeling
technologies built on the principles of self-organization: it learns an
unknown relationship between the output and the inputs of a given system from
noisy data in an evolutionary way, starting from a very simple model and
growing it into an optimally complex one that generalizes well.
KnowledgeMiner (yX) for Excel implements a completely redesigned and
redeveloped modeling engine called (yX). It is based on the modeling
technologies also found in our established KnowledgeMiner 6.0 software, which
has been successfully used by our customers for more than 10 years.
KnowledgeMiner (yX) in brief
Description
KnowledgeMiner (yX) for Excel is 64-bit parallel software. It employs two
forms of parallelism simultaneously: vector processing (single instruction,
multiple data, or SIMD) and multi-processing (multiple instruction, multiple
data, or MIMD), to take full advantage of multi-processor and multi-core
Macs. It also scales automatically to the number of processing elements found
at runtime (fig. 1).
Fig. 1: KnowledgeMiner (yX) for Excel running at optimal
speed on dual-core MacBooks and iMacs and/or eight-core Mac Pros.
This means that it runs at optimal speed whether the machine is still a
single-processor system or is driven by a dual-core CPU, a quad-core CPU, or
two quad-core CPUs, for example (fig. 2). It also implies that
KnowledgeMiner (yX) for Excel will scale to future many-core hardware that is
not available today. In short, (yX) for Excel, unlike the vast majority of
software on the market today, runs faster with an increasing number of
processing elements and higher clock speeds, and it does so almost linearly
(fig. 2).
Fig. 2: KnowledgeMiner (yX) for Excel scales very well to the
number of processor cores actually available at runtime.
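The runtime scaling described above can be illustrated with a minimal sketch. It is not the (yX) engine: the names `split_evenly`, `score_chunk`, and `score_all` are hypothetical, and `score_chunk` is a placeholder for evaluating a batch of candidate models. A thread pool is used only to keep the example self-contained; CPU-bound work in Python would normally use processes or native code to run truly in parallel.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, parts):
    """Split items into contiguous chunks whose sizes differ by at most one."""
    parts = max(1, min(parts, len(items) or 1))
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def score_chunk(chunk):
    # Hypothetical stand-in for evaluating a batch of candidate models.
    return [x * x for x in chunk]


def score_all(candidates, workers=None):
    # Size the pool from the hardware found at runtime unless told otherwise.
    workers = workers or os.cpu_count() or 1
    chunks = split_evenly(list(candidates), workers)
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        results = list(pool.map(score_chunk, chunks))  # keeps chunk order
    return [score for chunk in results for score in chunk]


if __name__ == "__main__":
    print(score_all(range(10)))
```

Because the number of chunks is derived from `os.cpu_count()` at call time, the same code uses one worker on a single-core machine and eight on an eight-core machine without any change.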
The new 64-bit parallel modeling engine delivers breathtaking performance
gains. According to a recent performance study, KnowledgeMiner (yX) for Excel
runs more than 600 times faster than our traditional KnowledgeMiner 6
software, as shown in figure 3. This makes KnowledgeMiner users even more
productive and gets complex modeling and prediction tasks done in a fraction
of the time previously required.
Fig. 3: Average speedup of KnowledgeMiner (yX) for Excel on an 8-core
64-bit Intel-based Mac Pro compared to KnowledgeMiner 5.4 running the
same modeling tasks on the same machine.
Starting from version 2.0, KnowledgeMiner (yX) for Excel is high-dimensional
modeling software. It is high-dimensional in both data set dimensions:
it works on data sets with a large number of samples and on data sets with a
large number of potential input variables. Furthermore, the software scales
from very small to large data sets in both dimensions, so that the user does
not have to worry much about the reliability of the resulting self-organized
models. Figure 4 shows the ranges that KnowledgeMiner (yX) for Excel High
Dimensional is designed to work in compared with other common data mining
technologies.
Fig. 4: Common modeling technologies and their applicability
to different data set dimensions.
A major concern for models built with any data-driven modeling technology
is model reliability: the risk of obtaining an invalid model grows quickly
with the number of input variables and with model complexity. For this
reason, KnowledgeMiner Software has carried out original research into the
noise immunity of high-dimensional state space modeling for many years. This
research resulted in noise immunity algorithms that are greatly improved over
traditional modeling approaches (see fig. 3). These new algorithms have been
implemented for the first time in KnowledgeMiner (yX) HD. They also enable a
new model evaluation approach that helps the user assess the models obtained.
This concept is described in more detail in the section "Noise Immunity and
Descriptive Power".
Based on our independent, cross-platform (yX) core modeling engine,
self-organizing, parallel, high-dimensional modeling with on-the-fly
model evaluation provides a unique and original tool. It also allows
non-experts to build reliable predictive analytical models of complex
systems from the desktop with unprecedented power, ease of use, and
ease of application.
An example that illustrates the power of KnowledgeMiner (yX) for Excel HD
is a recent study on modeling and prediction of climate change at
global and regional scales.
One model, for instance, was self-organized from 1900 monthly historical air
and sea surface temperature data and over 9300 potential input variables
(obtained by using a time lag of up to 518 months). It is composed of only 18
relevant inputs, and it took less than 9 minutes on an 8-core Mac Pro to
obtain this model (fig. 5). This translates into
- 36 minutes runtime on a dual-core machine,
- 72 minutes when running serially, all in 64-bit mode,
- about 90 minutes serial runtime in 32-bit space, and
- 3 days and 20 hours if it were run with our classical KnowledgeMiner 6 software!
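The runtimes in this list can be cross-checked with simple arithmetic. The sketch below assumes ideal linear scaling across cores, which is an idealization of the "almost linear" behavior claimed earlier. The function name `ideal_runtime` is hypothetical. The 32-bit and KnowledgeMiner 6 runtimes are taken from the list as quoted, not derived.

```python
def ideal_runtime(serial_minutes, cores):
    """Runtime in minutes if the work divided perfectly across `cores`."""
    return serial_minutes / cores


MAC_PRO_MINUTES = 9                             # quoted: under 9 minutes on 8 cores
serial_64bit = MAC_PRO_MINUTES * 8              # 72 minutes when run serially
dual_core = ideal_runtime(serial_64bit, 2)      # 36 minutes on two cores

serial_32bit = 90                               # quoted in the list, not derived
km6_minutes = (3 * 24 + 20) * 60                # 3 days 20 hours in minutes

penalty_32bit = serial_32bit / serial_64bit     # slowdown of 32-bit vs 64-bit
speedup_vs_km6 = km6_minutes / MAC_PRO_MINUTES  # speedup over KnowledgeMiner 6

print(serial_64bit, dual_core, penalty_32bit, round(speedup_vs_km6))
```

Under these assumptions the 32-bit run is 1.25 times slower than the 64-bit serial run. The speedup over KnowledgeMiner 6 comes out at roughly 613, which for this single example is consistent with the "more than 600 times" figure quoted from the performance study.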
Fig. 5: Global temperature prediction model
self-organized from a high-dimensional data set.
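The climate model above draws its potential input variables from time-lagged copies of the measured series. The minimal sketch below shows how lagging multiplies the number of candidate inputs, assuming each series is simply shifted by 1 to `max_lag` time steps. The function name `lagged_inputs` and the toy data are hypothetical, and the actual (yX) procedure may differ.

```python
def lagged_inputs(series, max_lag):
    """Return (column names, rows) with every series lagged by 1..max_lag."""
    length = len(next(iter(series.values())))
    if any(len(values) != length for values in series.values()):
        raise ValueError("all series must have the same length")
    lags = range(1, max_lag + 1)
    names = [f"{name}(t-{lag})" for name in series for lag in lags]
    rows = []
    # Only time steps with a full history of max_lag values yield a row.
    for t in range(max_lag, length):
        rows.append([values[t - lag] for values in series.values() for lag in lags])
    return names, rows


if __name__ == "__main__":
    toy = {"air": [1, 2, 3, 4, 5], "sea": [10, 20, 30, 40, 50]}
    names, rows = lagged_inputs(toy, max_lag=2)
    print(names)
    print(rows)
```

Each series contributes `max_lag` columns, so the number of potential inputs grows as the number of series times the maximum lag, while the number of usable samples shrinks by `max_lag`. This is why a long lag window leads to data sets with far more candidate variables than samples.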
System Requirements
- Mac OS X 10.5 (Leopard) or newer
- Any multi-core Intel-based Mac to run the software in parallel (recommended); PPC Macs run the software sequentially only
- Microsoft Excel 2004 or 2008