Clause

A clause is any sequence of words within a sentence that contains, in addition to one main clause, at least one further main clause or a subordinate clause. Each of these parts, "main clause" and "main clause" or "main clause" and "subordinate clause", is a clause of the whole sentence. A sentence may of course contain further main or subordinate clauses.

An example:

"One could currently demand any sum, says one of them, if one only spares the men the trouble with the tax authorities." (Quoted, in translation from the German, from Der Spiegel, No. 26, 2008, p. 55.) This complex sentence comprises three clauses, clearly marked by the commas.

Criterion for a clause

Only a word sequence that largely meets the minimum requirements for a sentence can be a clause of a sentence. In German this generally means that a subject and a predicate must be present, together with the complements that the predicate requires. Restrictions are permitted insofar as ellipses also count as sentence-shaped. In practical work, a clause is in many cases defined approximately, that is operationally, by a simple convention: a sentence has as many clauses (in quantitative linguistics often referred to, also in German, by the English term "clause") as it has finite verbs (verbs in a personal form). Applying this criterion to the example sentence of the previous section, the finite verbs "could", "says" and "spares" determine the three clauses.
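The operational criterion can be illustrated with a minimal sketch in Python. It assumes the tokens have already been annotated by hand with a flag marking finite verbs; the token list, the flag layout and the function names are illustrative assumptions, not part of any cited study, and a real investigation would take the flags from manual annotation or a part-of-speech tagger.

```python
# Hypothetical hand-annotated tokens for the example sentence:
# each entry is (word, is_finite_verb). The flags are set by hand here.
EXAMPLE = [
    ("One", False), ("could", True), ("currently", False), ("demand", False),
    ("any", False), ("sum", False), (",", False),
    ("says", True), ("one", False), ("of", False), ("them", False), (",", False),
    ("if", False), ("one", False), ("only", False), ("spares", True),
    ("the", False), ("men", False), ("the", False), ("trouble", False),
    ("with", False), ("the", False), ("tax", False), ("authorities", False),
    (".", False),
]


def finite_verbs(tagged_tokens):
    """Return the words flagged as finite verbs, in sentence order."""
    return [word for word, is_finite in tagged_tokens if is_finite]


def count_clauses(tagged_tokens):
    """Operational criterion: number of clauses = number of finite verbs."""
    return len(finite_verbs(tagged_tokens))


if __name__ == "__main__":
    print(finite_verbs(EXAMPLE))
    print(count_clauses(EXAMPLE))
```

The sketch deliberately leaves the hard part, deciding which word forms are finite, to the annotation; it only shows that the clause count follows directly once that decision has been made.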

Linguistic significance of clauses

Like other linguistic units, clauses contribute through their type, complexity and frequency to the stylistic character of texts. Quantitative linguistics focuses on two aspects: the frequency with which clauses of different lengths occur in texts (the distribution of clause lengths), and the ratio of sentence length to clause length, or of clause length to the length of the clauses' constituents (above all phrases, parts of a sentence, and words). Instead of clauses, work is sometimes done with the related concept of Clauselänge (clause length).

As an example, data obtained from medical textbooks are presented here; as in some other text classes, the clause lengths in them follow the positive negative binomial distribution. The data come from Schefe (1975); the fit of the distribution is from Best (2006):

In the table, x is the number of clauses per sentence; n(x) is the number of sentences of length x observed in the evaluated corpus; NP(x) is the number of sentences of length x that is calculated when the positive negative binomial distribution is fitted to the observed data. Result: with a test criterion of P = 0.27, the positive negative binomial distribution is a good model for the observed data. The result of such a test is rated as good when P ≥ 0.05, which is the case here. For more detailed explanations, see the literature cited.
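How values such as NP(x) arise can be sketched in Python. The sketch assumes that "positive negative binomial distribution" means the negative binomial distribution truncated at zero, so that only x = 1, 2, ... can occur. The parameter names k and p and the function names are assumptions chosen for illustration, and the parameter values in the usage lines are hypothetical, not the fitted values from the cited literature.

```python
from math import exp, lgamma, log


def positive_negbin_pmf(x, k, p):
    """P(X = x) for x = 1, 2, ...: negative binomial truncated at zero.

    Untruncated form: C(k + x - 1, x) * p**k * (1 - p)**x for x = 0, 1, 2, ...
    Truncation divides by 1 - P(X = 0) = 1 - p**k.
    """
    if x < 1:
        return 0.0
    q = 1.0 - p
    # Generalized binomial coefficient via log-gamma, so k may be non-integer.
    log_coeff = lgamma(k + x) - lgamma(x + 1) - lgamma(k)
    untruncated = exp(log_coeff + k * log(p) + x * log(q))
    return untruncated / (1.0 - p ** k)


def expected_frequencies(n_sentences, k, p, max_x):
    """NP(x) for x = 1..max_x: corpus size times the model probability."""
    return [n_sentences * positive_negbin_pmf(x, k, p)
            for x in range(1, max_x + 1)]


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    for x, np_x in enumerate(expected_frequencies(100, 1.0, 0.5, 5), start=1):
        print(x, round(np_x, 2))
```

In an actual fit, k and p would be estimated from the observed n(x), and the agreement between n(x) and NP(x) would then be assessed with a goodness-of-fit test that yields the P value mentioned above.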

Development of clause lengths

Like word length and sentence length, clause length is a quantity that changes over time. In German-language scientific and technical texts from 1770 to 1940 there is a trend in which clause lengths, like sentence lengths, first increase and then decrease again from 1850 onward, as Fruehauf noted. The author counts as clauses main and subordinate clauses, but also infinitive and participle constructions. These changes in language use follow the Piotrowski law in its form for reversible language change, as the following table shows.

(Note: t is time, numbered consecutively by decade for the calculation. Fitting the Piotrowski law in its form for reversible language change to the observed data up to 1940 yields the calculated values given. The time point 1960 is disregarded because the available data leave it unclear whether it indicates a trend reversal or is merely an "outlier". Fitting the model yields a coefficient of determination of C = 0.92, where C is considered good if it is greater than or equal to 0.80. For more detailed explanations, see the literature cited.)
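The quantities in this note can be sketched in Python. The function reversible_piotrowski uses one commonly cited form of the law for reversible change, scale / (1 + a * exp(-b*t + c*t^2)); whether the cited fit used exactly this form, and in particular the scale parameter, is an assumption. The coefficient of determination is computed in its usual form, one minus the ratio of the residual to the total sum of squares. All names and the parameter values in the usage lines are hypothetical.

```python
from math import exp


def reversible_piotrowski(t, scale, a, b, c):
    """Assumed model form: scale / (1 + a * exp(-b*t + c*t**2)).

    With b > 0 and c > 0 the curve rises, peaks at t = b / (2*c),
    and then falls again, which is what "reversible" refers to.
    """
    return scale / (1.0 + a * exp(-b * t + c * t * t))


def determination_coefficient(observed, computed):
    """C = 1 - sum((obs - calc)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, computed))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical parameters; t counts decades from the start of the series.
    curve = [reversible_piotrowski(t, 10.0, 1.0, 1.0, 0.1) for t in range(11)]
    print([round(v, 2) for v in curve])
```

A value of C close to 1 means that the computed values reproduce most of the variation in the observed ones; the threshold of 0.80 mentioned in the note is the convention stated there.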
