Formal concept analysis

Formal concept analysis (FBA ) is a part of the mathematical order theory. Your initial motivation is the concrete representation of complete lattices and their properties by means of formal contexts to study properties of these associations as properties of the associated contexts. Because the English translation Formal Concept Analysis can be found sometimes in German also actually a misnomer Formal Concept Analysis.

The theory in its present form dates back to the Darmstadt research group led by Rudolf Wille, Bernhard Ganter and Peter Burmeister, in which the early 1980s Formal Concept Analysis arose. The mathematical foundations, however, were already created by Garrett Birkhoff in the 1930s as part of the general lattice theory. Before beginning work, the Darmstadt group there have been attempts in various French groups. Philosophical foundations of Formal Concept Analysis rely in particular on Charles S. Peirce and Hartmut von Hentig.

FBA Practical applications, such as data and text mining, knowledge management, semantic web, software development, economics or biology in diverse areas.

  • 3.1 attribute exploration

Motivation and philosophical background

In the article, Restructuring Lattice Theory ( 1982), who founded the Formal concept analysis as a discipline, the discomfort of the lattice theory and pure mathematics is commonly referred to as a motivation: The often achieved by " spiritual high performance sport " production theoretical results is impressive, neighboring the links between areas and even parts of a theory would, however, weaker.

" Restructuring lattice theory is an attempt to strengthen connections to our general culture again by the theory interpreted as specific as possible and thus better communication between the association theorists and potential users of lattice theory is promoted. "

This goal goes directly back to Hartmut von Hentig, the 1972 restructuring of Sciences called, " to make them better learnable, mutually available and general ( ie beyond the expertise ) to criticism. " Thus, FBA aims of their origins, on interdisciplinarity and democratic control of research.

You corrected the original approach of the lattice theory with the development of formal logic in the 19th century. While a term was reduced as a predicate on its periphery (similar also in the model theory ), the theory of concepts should now be back less abstract by taking account also of the content. This FBA is geared to the extension and intension categories of linguistics and classical term logic. My understanding of the term corresponds to the DIN standards 2330 concepts and terms, DIN 2331 conceptual systems and their representation, DIN 2342 terms of terminology theory.

Clarity of terms is desirable in the sense of Charles S. Peirce 's Pragmatic Maxim fact that observable, elementary properties of subsumed objects are deployed. In his later philosophy Peirce assumed that logical thinking is aimed at capturing reality, through the three- step concept, judgment and conclusion. Mathematics abstracts logical thinking, develop forms of possible reality and can therefore support rational communication. Rudolf Wille defined in this context:

"The aim and importance of formal concept analysis as mathematical theory of concepts and concept hierarchies is to support the rational communication of people by developing mathematically appropriate conceptual structures that can be logically enabled. "

Mathematical Foundations

The main objective of Formal Concept Analysis is the representation of complete associations by means of formal contexts. In addition, however it also allows conversely, the study of data in the form of formal contexts with funds from the theory of order. The basic definitions for this will be discussed in this section.

Formal contexts and formal ideas

Given two sets and a relation. The triplet is then described as a formal context, as its object set and as its feature set; for an object and feature means "the object has the feature ". It is often written instead. The quantity is referred to as the relationship of the incidence formal context.

Are the quantities and at last, then formal contexts can well represent in the form of " cross-tabs ". It should be noted here that the articles and features that can be arranged arbitrarily in this representation. This order is then not part of the formal context, but only its representation.

Is set of objects of a formal context, then referred to with

The amount of the common features of the objects in. Correspondingly definitiert for a set of features of the amount

All items that contain all the features of. The amount and are recognized as derivatives of the corresponding quantities and designated and the functions, both of which are named with, called derivative operators of.

The derivative operators meet a number of very basic features. Are quantities of items and quantities of characteristics, shall apply

  • And dual,
  • And dual,
  • And,
  • .

In fact, in order to define the derivative operators a antitone Galoisverbindung between the power set of associations in the object set and the feature set. Conversely, every such Galoisverbindung between power set associations represent as a pair of derivative operators of a formal context.

At a formal context is now called a formal concept of a pair, if

  • A lot of articles from is
  • An amount of characteristics of all is
  • And
  • Applies.

The amount is then called the extent and the amount of content of the concept. The set of all terms is denoted by. If one formal contexts as crosstabs represents and selects an appropriate order on the objects and features, as can be understood as formal concepts maximal rectangles in this crosstab.

Are now, it can be with

Define an ordering on. This order then makes the structure to a complete lattice. In fact, reversed after the main theorem of Formal Concept Analysis every complete lattice is order to a concept lattice.

Concept lattices can be represented as an ordering charts ( line graphs ) and thus develop the data in their structure and their contexts. The articles have it all (joined by edge ) above it features; in the example is 4 straight, composed and square.

Mathematically accurately can initially justifies the simplified labeling of concept lattices. Looking for an object the set of all terms that have in its scope, this quantity has a main filter in the concept lattice. Therefore, only the object is below the smallest term containing to the extent noted. Dually is listed above the largest term, which has a given feature in the feature content. A term in the order graph has exactly then an object in its scope, if it is above the concept of which is labeled with the article. Accordingly, a term has in the order chart a feature in its content if it is below the term, which is labeled with the feature.

Fundamental Theorem of Formal Concept Analysis

It should be a formal context and its concept lattice. One can for objects and features then the terms

Consider. It is called the object concept of and the attribute concept of. Furthermore, applies

Is now a complete lattice, so is isomorphic to if there are pictures in such a way that

Applies. Particular, it is isomorphic.

Implications of the theory of formal contexts

For a formal context theory its implications can be examined. It is an implication of just a pair with what is usually written with. It is said that in true if every object that has all the characteristics of, with all the characteristics of features, so if applies. This condition is equivalent to the fact that applies.

Is a lot of implications from and so we denote by the smallest set containing and closed under. It means a lot under completed if always or applies to all the implications, so if always implied. One then sees that the picture is a closure operator on the power set of.

If an implication of, as follows, if the following holds. This is equivalent to the fact that even the implication always applicable in all formal context, apply in which all implications from.

A base is then a set of valid user of implications, so that each ( semantically ) follows a valid implication of already made ​​. Using appropriate inference rules of syntactic rules as the Armstrong The completed in this new sense set of all implications of a theory, as they also design according to, for example, with respect to the underlying context satisfiable.

The base is called irredundant if it is -minimal with this property. An example of a base is irredundant canonical base (see exploration feature ), which further has the property to be minimal with respect to the size of the base.

It is true that a lot of implications iff is a basis of a context, if the amount of closed sets with exactly what the content of is.

Attribute exploration

It is possible to show the implications of the theory for specific subject areas with the help of a formal context. This means in particular that it is possible to do this with the aid of a sufficient quantity of examples, which are then the objects of the formal context. Theoretically, such a set of examples of a human expert, or a machine are provided.

However, this arises a problem that is not guaranteed a priori that an adequate number of examples is given, or whether some are redundant samples generated as already given examples suffice. Among the aspects that the generation of good examples is difficult, the survey of experts or even new experiments are expensive, and literature search or algorithms can be expensive, this is a serious problem.

This can be remedied, the algorithm of attribute exploration here. Starting from an already known set of implications and an already known set of examples from the topic proposes the algorithm ago implications, which can then by an expert (human or not ) accepted or rejected. This is an implication are then accepts exactly if it is valid in the said topic. If an implication is rejected, the expert must produce a counter-example, the (human or not ) then accepted by an expert or may be rejected. This is an implication are then accepts exactly if it is valid in the said topic. Through an accepted counter-example, the implication is refuted and thus generates the smallest possible amount of accepted implications that at the end of the topic fully describes. In addition, the number of examples is completed.

Applications

The Formal concept analysis can be used as a qualitative method of data analysis, such as data and text mining, knowledge management, semantic web, software development, economics or biology. A direct application is to structure the original data differently and to visualize.

Pictures of Formal concept analysis

112485
de