Subject indexing

Indexing (Austria, Bavaria: Beschlagwortung), also called Verstichwortung, is the term used in information retrieval for the assignment of descriptors to a document in order to make its subject content accessible. A distinction is drawn between controlled indexing, which uses prescribed descriptors from a thesaurus or subject catalog or the notations of a classification, and free indexing. In community indexing (also social or collaborative tagging), which is carried out with the help of social software, the terms tagging and tags are used instead of indexing and descriptors.

Methods

Depending on the criterion applied, different types and methods of indexing can be distinguished:

  • Manual, computer-aided and automatic indexing
  • Controlled indexing and free indexing
  • Coordinate indexing and syntactic indexing

Manual indexing

Manual indexing, also called intellectual indexing, is a method in which an indexer assigns representative subject headings (English: "subjects") to a document. It is performed by experts using controlled terminology lists and similar rule-based vocabularies. It allows a linguistic analysis of individual formulations and the assignment of synonyms, but it has several disadvantages: it is expensive, slow and labor-intensive, its quality depends on the staff working consistently, and the predefined descriptor vocabulary is static. In addition, users must know the indexing vocabulary in order to search for documents.

Automatic indexing

A common method of automatic indexing is full-text indexing, in which all words of a text except stop words are included in the index (for example by a search engine). Where appropriate, words are reduced to a common word stem by stemming.
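The full-text approach described above can be sketched as a small inverted index. This is only an illustration under stated assumptions: the stop-word list is a hypothetical placeholder, and the `stem` function is a crude suffix stripper standing in for a real stemming algorithm.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; real systems use longer, language-specific lists.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "are", "to"}

# Crude suffix stripping standing in for a real stemmer (an assumption).
SUFFIXES = ("ing", "es", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(documents):
    """Map each stemmed, non-stop word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                index[stem(word)].add(doc_id)
    return dict(index)


docs = {
    1: "Indexing of documents",
    2: "The index and the documents",
}
index = build_index(docs)
```

In this toy example "Indexing" and "index" end up under the same index entry, while stop words such as "the" never enter the index.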

Statistical methods make a selection by determining word frequencies, so that only words occurring with a certain frequency in the text are included in the index. A simple method of term weighting is the inverse document frequency. In this method, the frequency of the term in a document is determined and then set in relation to the number of documents that contain the term. The result can be read directly as the value, or weight, of the term as a descriptor. The weight of a term is higher the fewer documents in the archive contain it and the more often it occurs in the document being indexed. The frequency of a term indicates its significance. In this document, for example, the word "term" is used often because it is important for the topic. However, "term" is too broad a concept. This shows that frequency alone does not reveal whether a word is a good or a bad descriptor. Significant descriptors can only be produced in combination with the weighting method described above.
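The weighting described above can be sketched as follows. The formula used here, term frequency multiplied by the logarithm of the number of documents divided by the document frequency, is one common variant and an assumption, since the text does not fix an exact formula. The corpus is invented for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return {doc_id: {term: weight}} with weight = tf * log(N / df)."""
    n_docs = len(documents)
    term_counts = {doc_id: Counter(words) for doc_id, words in documents.items()}
    # Document frequency: in how many documents each term occurs at least once.
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    return {
        doc_id: {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in counts.items()}
        for doc_id, counts in term_counts.items()
    }


# Hypothetical, already tokenized corpus.
corpus = {
    "a": ["term", "term", "thesaurus"],
    "b": ["term", "catalog"],
    "c": ["term", "museum", "museum"],
}
weights = tf_idf(corpus)
```

Because "term" occurs in every document of this small corpus, its weight is zero even though it is the most frequent word, whereas "museum" is both frequent in document "c" and rare in the corpus and therefore receives the highest weight.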

With the help of computational linguistics, intelligent automatic methods are possible. They do not match the quality of manual indexing, but they are much more stable in terms of indexing consistency.

In librarianship in particular, automatic index creation is called Verstichwortung, and the keyword catalog is created from it. The term also covers index creation within the multi-part subject strings of a syntactic indexing, which are assigned by qualified staff during manual indexing (subject index). The automatic extraction of keywords from a full text, for example during index creation, is called by the same name.

Computer-aided indexing

In computer-aided indexing, descriptors are proposed by machine and selected manually. The indexing is carried out by the computer, with pre-processing or post-processing by humans, or in interaction with them.
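The division of labor described above can be sketched as two steps: a machine step that proposes descriptors and a human step that selects among them. The vocabulary, the trigger words and the `accept` callback are all hypothetical and serve only to illustrate the workflow.

```python
def propose_descriptors(text, vocabulary):
    """Machine step: propose each descriptor with a trigger word in the text."""
    words = set(text.lower().split())
    return sorted(d for d, triggers in vocabulary.items() if words & triggers)


def index_document(text, vocabulary, accept):
    """Human step: keep only the proposals the indexer accepts."""
    return [d for d in propose_descriptors(text, vocabulary) if accept(d)]


# Hypothetical controlled vocabulary: descriptor -> trigger words.
vocabulary = {
    "Painting": {"painting", "paintings", "canvas"},
    "Museum": {"museum", "gallery"},
    "Photography": {"photo", "photograph"},
}

text = "The gallery shows paintings on canvas"
proposals = propose_descriptors(text, vocabulary)
# The callback stands in for the indexer's decision; here "Museum" is rejected.
selected = index_document(text, vocabulary, accept=lambda d: d != "Museum")
```

In practice the `accept` callback would be replaced by an interactive review step, and the proposal step by one of the automatic methods described earlier.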

Indexing of images

For the subject indexing of images, many museums use the Iconclass classification. The Schlagwortnormdatei is also increasingly used in the museum sector. Many photo agencies and picture archives use the IPTC-NAA standard and the rules it contains for categories and keywords. In-house keyword lists nevertheless also play a major role. In addition, there are various methods by which images can be searched using similarity search and relevance feedback.
