Association rule learning

Association analysis is the search for strong rules. The resulting association rules describe correlations between items that occur together. The purpose of association analysis is therefore to identify items (elements of a set, such as the individual products in a shopping cart) whose occurrence implies the occurrence of other items within a transaction. Such an uncovered relationship between two or more items can then be represented as a rule of the form "If item (set) A, then item (set) B", written A → B.

A typical field of application is the relationships among purchases, the so-called market basket analysis, which is used to initiate targeted advertising. For example, in 80 percent of purchases that include beer, potato chips are bought as well. Both products occur together in 10 percent of all purchases. Such findings are frequently used in cross-marketing.

Characteristics of association rules are:

  • Support: the relative frequency of the transactions to which the rule applies, i.e. those that contain all items of the rule.
  • Confidence: the relative frequency, among the transactions that contain the left-hand side of the rule, of those in which the rule holds.
  • Lift: indicates by how much the confidence of the rule exceeds the expected value; it thus shows the general significance of a rule.


Consider the association rule {toothbrush} → {toothpaste}.

  • Support: The support states for what proportion of all transactions the rule {toothbrush} → {toothpaste} applies. It is calculated as the number of transactions that contain both itemsets of interest, divided by the number of all transactions.
  • Confidence: In what proportion of the transactions in which {toothbrush} occurs does {toothpaste} also occur? The confidence is calculated as the number of transactions that satisfy the rule, divided by the number of transactions that contain {toothbrush}.
  • Lift: Suppose that 10 percent of all customers buy {toothbrush, toothpaste}, 20 percent of all customers buy {toothbrush} and 40 percent of all customers buy {toothpaste}. Then the rule has a lift of 0.10 / (0.20 × 0.40) = 1.25.
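
The three measures can be computed directly from a list of transactions. The following is a minimal Python sketch, not a reference implementation: the toy transactions and the function names `support`, `confidence` and `lift` are invented here for illustration and do not come from any particular library.

```python
from typing import FrozenSet, Iterable, List

# Toy transactions, invented purely for illustration.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"toothbrush", "toothpaste"}),
    frozenset({"toothbrush", "toothpaste", "floss"}),
    frozenset({"toothpaste"}),
    frozenset({"toothpaste", "soap"}),
    frozenset({"soap"}),
    frozenset({"toothbrush"}),
    frozenset({"floss", "soap"}),
    frozenset({"toothpaste", "floss"}),
    frozenset({"soap", "shampoo"}),
    frozenset({"shampoo"}),
]


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of all transactions that contain every item of `itemset`."""
    wanted = frozenset(itemset)
    hits = sum(1 for t in transactions if wanted <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Among transactions containing the antecedent, the fraction that
    also contain the consequent."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent: Iterable[str], consequent: Iterable[str],
         transactions: List[FrozenSet[str]]) -> float:
    """Confidence divided by the support of the consequent, i.e. the
    factor by which the rule beats the expectation under independence."""
    return (confidence(antecedent, consequent, transactions)
            / support(consequent, transactions))


if __name__ == "__main__":
    a, b = {"toothbrush"}, {"toothpaste"}
    print("support   :", support(a | b, TRANSACTIONS))
    print("confidence:", confidence(a, b, TRANSACTIONS))
    print("lift      :", lift(a, b, TRANSACTIONS))
```

A lift above 1 means the two itemsets occur together more often than would be expected if they were independent; a lift of exactly 1 means the rule carries no information beyond the frequency of the right-hand side.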

Algorithms must be designed so that all association rules with a predetermined minimum support and minimum confidence are found. The method should not require any assumptions about which features are relevant. Making such assumptions would in any case not be feasible for, say, a mail-order business with many thousands of items.
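
To make the task concrete, the sketch below enumerates every itemset over a tiny set of baskets and keeps each rule that meets both thresholds, without any prior assumption about which items matter. It is an illustrative brute-force approach only: the function name `mine_rules` and the example baskets are hypothetical, and because the number of itemsets grows exponentially with the number of items, it is usable only on very small inputs.

```python
from itertools import combinations

# Example baskets, invented purely for illustration.
EXAMPLE_BASKETS = [
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"beer", "chips"},
    {"beer", "chips", "bread"},
    {"beer", "bread"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "milk", "chips"},
    {"milk", "salsa"},
    {"bread"},
]


def mine_rules(transactions, min_support, min_confidence):
    """Return all rules (antecedent, consequent, support, confidence)
    that reach the given minimum support and minimum confidence.

    Exhaustive enumeration: exponential in the number of distinct items.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    items = sorted(set().union(*transactions))

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            s = support(itemset)
            # Skip itemsets that never occur or are too rare.
            if s == 0 or s < min_support:
                continue
            # Try every non-empty proper subset as the left-hand side.
            for k in range(1, size):
                for lhs in combinations(combo, k):
                    antecedent = frozenset(lhs)
                    conf = s / support(antecedent)
                    if conf >= min_confidence:
                        rules.append((antecedent, itemset - antecedent, s, conf))
    return rules


if __name__ == "__main__":
    for lhs, rhs, s, c in mine_rules(EXAMPLE_BASKETS, 0.3, 0.7):
        print(sorted(lhs), "->", sorted(rhs), f"support={s:.2f} confidence={c:.2f}")
```

Practical algorithms avoid this exhaustive search by discarding rare itemsets early, so that their supersets never need to be counted.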

The first algorithm for association analysis was the AIS algorithm (named after its developers Agrawal, Imielinski and Swami), from which the Apriori algorithm was developed. Apriori is increasingly being replaced by the much more efficient FP-growth algorithm.
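
The level-wise idea usually associated with Apriori can be sketched as follows: an itemset can only be frequent if all of its subsets are frequent, so candidates of size k are built only from frequent itemsets of size k - 1. This is a simplified, unoptimized illustration rather than the published algorithm in full; the function name `apriori_frequent_itemsets` is chosen here for the example, and only the frequent-itemset step is shown, not the subsequent rule generation.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return a dict mapping each frequent itemset (frozenset) to its support.

    Level-wise search: frequent k-itemsets are built only from
    frequent (k-1)-itemsets.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    current = {}
    for item in set().union(*transactions):
        single = frozenset([item])
        s = support(single)
        if s >= min_support:
            current[single] = s

    frequent = dict(current)
    k = 2
    while current:
        previous = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in previous for b in previous if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in previous for sub in combinations(c, k - 1))
        }
        # Count step: keep the candidates that reach the minimum support.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent
```

The prune step is what saves work compared with exhaustive enumeration: once an itemset is found to be infrequent, none of its supersets is ever counted against the transactions.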

Areas of application

Cross-marketing: when viewing an item in a web shop, the customer is also shown articles that other customers bought together with it (e.g. digital camera → memory card).