Weka (machine learning)

Weka (Waikato Environment for Knowledge Analysis) is a software tool that provides a range of techniques from the fields of machine learning and data mining. It was developed at the University of Waikato and is written in Java. It is freely available software, licensed under the GNU General Public License.

The software is an integral part of the book "Data Mining: Practical Machine Learning Tools and Techniques" by Ian H. Witten, Eibe Frank and Mark A. Hall, a standard English-language work on machine learning. In 2005 the Association for Computing Machinery honored the software with the "SIGKDD Service Award" for its significant contribution to research, among other things by making the source code available as open source.

Weka is known for its variety of classifiers, including Bayesian classifiers, artificial neural networks, support vector machines and decision trees (such as ID3 and C4.5), as well as meta-classifiers such as boosting and other ensemble methods. In other areas of data mining, such as cluster analysis, only the most basic methods are offered, for example the k-means algorithm and the EM algorithm.
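To make the k-means algorithm mentioned above concrete, the following is a minimal sketch in plain Java. It is not Weka's implementation and does not use the Weka API; the class name `KMeansSketch`, the method `cluster` and the choice of seeding the centroids with the first k points are assumptions made purely for illustration.

```java
import java.util.Arrays;

/** Illustrative k-means sketch in plain Java; not Weka's implementation. */
final class KMeansSketch {

    /** Returns the cluster index (0..k-1) assigned to each point. */
    static int[] cluster(double[][] points, int k, int maxIterations) {
        if (k < 1 || k > points.length) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        int dim = points[0].length;

        // Assumption: seed the centroids with the first k points (deterministic, not robust).
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }

        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;

            // Assignment step: attach each point to its nearest centroid.
            for (int i = 0; i < points.length; i++) {
                int best = 0;
                double bestDist = Double.POSITIVE_INFINITY;
                for (int c = 0; c < k; c++) {
                    double d = squaredDistance(points[i], centroids[c]);
                    if (d < bestDist) {
                        bestDist = d;
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // no point switched cluster, so the result is stable
            }

            // Update step: move each centroid to the mean of its assigned points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[assignment[i]]++;
                for (int j = 0; j < dim; j++) {
                    sums[assignment[i]][j] += points[i][j];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid
                }
                for (int j = 0; j < dim; j++) {
                    centroids[c][j] = sums[c][j] / counts[c];
                }
            }
        }
        return assignment;
    }

    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            double diff = a[j] - b[j];
            sum += diff * diff;
        }
        return sum;
    }
}
```

Calling `KMeansSketch.cluster(points, 2, 100)` on two well-separated groups of two-dimensional points returns one cluster index per point. Practical implementations usually use a more careful initialization than the fixed seeding assumed here.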


The Weka workbench is divided into the following areas:

  • Preprocessing: Allows, in particular, the selection of the attributes to be analyzed
  • Classification
  • Cluster analysis
  • Association Analysis
  • Attribute selection: Determines the attributes of the data that are most helpful for classification
  • Visualization
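
As an illustration of the attribute selection area listed above, the sketch below scores a nominal attribute by its information gain with respect to the class labels. Information gain is only one common criterion and is chosen here as an assumption; the class `InfoGainSketch` and its methods are hypothetical names in plain Java and are not part of the Weka API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative information-gain scoring for nominal attributes; not Weka code. */
final class InfoGainSketch {

    /** Shannon entropy, in bits, of the label distribution. */
    static double entropy(List<String> labels) {
        Map<String, Integer> counts = new HashMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        double h = 0.0;
        for (int n : counts.values()) {
            double p = (double) n / labels.size();
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    /** Information gain of one nominal attribute with respect to the class labels. */
    static double informationGain(String[] attributeValues, String[] labels) {
        if (attributeValues.length != labels.length) {
            throw new IllegalArgumentException("attribute and label arrays must have equal length");
        }
        // Split the labels into one group per distinct attribute value.
        Map<String, List<String>> partitions = new HashMap<>();
        List<String> all = new ArrayList<>();
        for (int i = 0; i < labels.length; i++) {
            partitions.computeIfAbsent(attributeValues[i], v -> new ArrayList<>()).add(labels[i]);
            all.add(labels[i]);
        }
        // Weighted entropy that remains after splitting on the attribute.
        double remainder = 0.0;
        for (List<String> part : partitions.values()) {
            double weight = (double) part.size() / labels.length;
            remainder += weight * entropy(part);
        }
        return entropy(all) - remainder;
    }
}
```

An attribute that separates the classes perfectly receives the full class entropy as its score, while an attribute that is independent of the class receives a score of zero; ranking attributes by this score gives a simple ordering of how helpful each one is for classification.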