Latent Dirichlet Allocation

Latent Dirichlet allocation (LDA) is a generative probabilistic model for "documents" such as text or image corpora, introduced in 2002 by David Blei, Andrew Ng, and Michael I. Jordan. Each item in the corpus (usually called a document) is regarded as a mixture of different underlying (latent) topics. Each observed word in a document is in turn assigned to one or more topics. These topics, whose number is fixed in advance, explain the similarities between documents. In image corpora, possible topics would be, for example, meadow or road; in text corpora, they would be abstract subjects such as sports, politics, or education.
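The generative view described above can be illustrated with a small sketch. It is a simplified toy, not a reference implementation: the two topics, their word probabilities, the function names, and the value of the concentration parameter alpha are all hypothetical choices made for illustration. The topic-word distributions are fixed by hand here, whereas the full model treats them as unknown quantities to be inferred from the corpus. Only the Python standard library is used.

```python
import random

# Hypothetical toy topics: each topic is a probability distribution over words.
# In a real application these distributions are unknown and must be inferred.
TOPICS = {
    "sports":   {"goal": 0.5, "team": 0.4, "vote": 0.1},
    "politics": {"vote": 0.5, "party": 0.4, "team": 0.1},
}


def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution by normalizing Gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


def generate_document(n_words, alpha, rng):
    """Generate one document as a list of (topic, word) pairs.

    alpha is an assumed symmetric Dirichlet concentration parameter;
    the number of topics is fixed in advance by the TOPICS table.
    """
    names = list(TOPICS)
    # Per-document topic mixture: how strongly each topic is represented.
    theta = sample_dirichlet([alpha] * len(names), rng)
    words = []
    for _ in range(n_words):
        # Pick a latent topic for this word position according to the mixture.
        topic = rng.choices(names, weights=theta)[0]
        # Pick the observed word from that topic's word distribution.
        dist = TOPICS[topic]
        word = rng.choices(list(dist), weights=list(dist.values()))[0]
        words.append((topic, word))
    return theta, words


rng = random.Random(0)  # fixed seed so repeated runs give the same document
theta, doc = generate_document(8, alpha=0.5, rng=rng)
print("topic mixture:", dict(zip(TOPICS, theta)))
print("words:", [word for _, word in doc])
```

Fitting LDA to real data runs in the opposite direction: only the words are observed, and the topic mixtures and topic-word distributions are estimated from them. A word such as "team" appears in both toy topics, which shows how a single observed word can be associated with more than one topic.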

LDA is used, among other things, for document modeling, text classification, and discovering new content in text corpora. Other applications are found in bioinformatics.