Knowledge extraction

Knowledge Discovery in Databases (KDD), in German "Wissensentdeckung in Datenbanken", extends data mining, a term with which it is often used interchangeably, by the preparatory studies and transformations of the data to be evaluated. The goal of KDD is to detect previously unknown domain relationships in existing, usually large, data sets. In contrast to data mining, KDD covers the entire process, including the preparation of the data and the evaluation of the results. The term KDD was coined in scientific circles by Gregory Piatetsky-Shapiro. In practice the term data mining is more common, although it traditionally carries a negative connotation in statistics.

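The KDD process consists of several steps, including data selection, data cleaning, data reduction, modeling (the data mining step proper), and the interpretation of the results.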

Usually, these steps are repeated several times. A common process model is CRISP-DM.
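The iterative character of the process can be illustrated with a small sketch. It is not taken from any of the tools or process models named in this article: the records, the function names, and the choice of model (a simple two-group threshold split, essentially one-dimensional 2-means) are assumptions made purely for illustration. The point is only that selection, cleaning, and reduction prepare the data, and that modeling and evaluation are then repeated until the result no longer changes.

```python
from statistics import mean

# Hypothetical raw records: (customer_id, age, monthly_spend).
# None marks a missing value. All data here is invented for illustration.
RAW = [
    (1, 23, 40.0),
    (2, 31, None),
    (3, 45, 300.0),
    (4, 52, 70.0),
    (5, 27, 55.0),
    (6, None, 60.0),
    (7, 38, 60.0),
]


def select(records):
    """Data selection: keep only the attributes of interest (age, spend)."""
    return [(age, spend) for _, age, spend in records]


def clean(rows):
    """Data cleaning: drop rows that contain a missing value."""
    return [row for row in rows if None not in row]


def reduce_rows(rows):
    """Data reduction: keep a single attribute (monthly spend)."""
    return [spend for _, spend in rows]


def model(values, threshold):
    """Modeling: split the values into a low and a high group."""
    low = [v for v in values if v < threshold]
    high = [v for v in values if v >= threshold]
    return low, high


def evaluate(low, high):
    """Evaluation: propose a new threshold halfway between the group means."""
    return (mean(low) + mean(high)) / 2


def kdd(records, max_rounds=10):
    """Run the preparatory steps once, then repeat modeling and evaluation.

    Assumes the cleaned values are not all identical, so that both
    groups are non-empty in every round.
    """
    values = reduce_rows(clean(select(records)))
    threshold = mean(values)
    for _ in range(max_rounds):
        low, high = model(values, threshold)
        new_threshold = evaluate(low, high)
        if new_threshold == threshold:  # result is stable, stop iterating
            break
        threshold = new_threshold
    return low, high, threshold


if __name__ == "__main__":
    low, high, threshold = kdd(RAW)
    print("low-spend group: ", low)
    print("high-spend group:", high)
    print("threshold:       ", threshold)
```

In a real project each of these functions would be replaced by far more elaborate work, and the loop would typically also reach back into the selection and cleaning steps when the evaluation shows that the prepared data is unsuitable.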

Software

  • RapidMiner is a freely available open source tool for machine learning and data mining that supports the more technical steps of knowledge discovery (data selection, data cleaning, data reduction, modeling, visualization, etc.).
  • WEKA is an open source tool developed at the University of Waikato. It contains an extensive collection of algorithms for knowledge discovery in databases.
  • Environment for Developing KDD-Applications Supported by Index-Structures (ELKI) is a research project of the Ludwig-Maximilians-Universität München. It provides a number of data mining algorithms (especially for cluster analysis and outlier detection, but also index structures) for use in teaching and research.