The analysis of the vocabulary of a text can be done with the following modules:
- word list: a list of all character strings and their frequencies in a text, sorted in ascending alphabetical order.
- word sequence list: similar to a word list, but instead of single words the result is a list of sequences of character strings, e.g. "United States" or "European Union", or a phrase like "it rains cats and dogs".
- word permutation list: each character string within a text unit is combined with each following character string of that unit to form a 2-word sequence.
- vocabulary growth (TTR-dynamics): the result is the development of the TTR values (type-token ratio) over the course of the text. The value always starts at 1 (at the start, each character string has occurred only once) and then tends to decrease as the text grows, without ever reaching 0. Comparing TTR values only makes sense if the number of words/character strings is (nearly) the same.
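
The four modules above can be illustrated with a minimal Python sketch. This is not the program's own implementation: the function names are hypothetical, a "character string" is assumed to be any whitespace-delimited token, and sorting follows Python's default string order (so uppercase sorts before lowercase), which may differ from the program's sort order.

```python
from collections import Counter
from itertools import combinations


def tokenize(text):
    # Assumption: character strings are separated by whitespace only.
    return text.split()


def word_list(tokens):
    # Frequency of every character string, in ascending alphabetical order.
    return sorted(Counter(tokens).items())


def word_sequences(tokens, n=2):
    # Frequency of every run of n adjacent character strings.
    grams = zip(*(tokens[i:] for i in range(n)))
    return sorted(Counter(grams).items())


def word_permutations(tokens):
    # Pair each character string with every later one in the same text unit.
    return list(combinations(tokens, 2))


def ttr_growth(tokens):
    # Type-token ratio after each token: distinct strings seen / tokens read.
    seen = set()
    values = []
    for count, token in enumerate(tokens, start=1):
        seen.add(token)
        values.append(len(seen) / count)
    return values


unit = tokenize("the cat saw the dog")
print(word_list(unit))
print(word_sequences(unit))
print(word_permutations(unit))
print(ttr_growth(unit))
```

For this five-token text unit the TTR series stays at 1 until "the" is repeated; the longer the text, the more repetitions accumulate, which is why TTR values of texts with very different lengths are not comparable.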
Menu of word list:
Menu of word sequences:
Menu of word permutations:
Menu of vocabulary growth:
last change of this page: May 23, 2007