Lexical density

Lexical density is a measure in linguistics, especially computational linguistics, that indicates the proportion of content words to the total number of words, expressed as a percentage. The term derives from the English term for content words, lexical words. Content words are those words that have a lexical meaning of their own; in contrast, function words carry predominantly grammatical meaning.

The lexical density can be calculated using the following formula:

$$L_d = \frac{\text{number of lexical words}}{\text{total number of words}} \times 100$$

The scaling to values between 0 and 100 is not strictly necessary and is not always done, especially in variants where the lexical words are set in proportion not to the total number of words but to the number of grammatical units, such as clauses. In addition, the lexical words may be weighted according to their frequency in the language.
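To make the basic formula concrete, here is a minimal Python sketch. The function name, the representation of tokens as pairs marked lexical or functional, and the example sentence are illustrative assumptions; in practice, a part-of-speech tagger would decide which words count as content words.

```python
def lexical_density(tagged_tokens, scale=100):
    """Proportion of content words among all words.

    tagged_tokens: list of (word, is_lexical) pairs, where is_lexical
    marks content words (nouns, verbs, adjectives, adverbs).
    scale: 100 for a percentage, 1 for a plain ratio.
    """
    if not tagged_tokens:
        return 0.0
    n_lexical = sum(1 for _, is_lexical in tagged_tokens if is_lexical)
    return n_lexical / len(tagged_tokens) * scale

# "The cat sat on the mat": "cat", "sat", and "mat" are content words,
# so 3 of 6 words are lexical and the density is 50 percent.
tokens = [("The", False), ("cat", True), ("sat", True),
          ("on", False), ("the", False), ("mat", True)]
print(lexical_density(tokens))  # 50.0
```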

The measure was introduced by Jean Ure for the description of register variation. Michael Halliday also found that lexical density is lower in spoken than in written language. Lexical density can be applied to text analysis in forensic linguistics (including plagiarism detection).
