tree-tagger-arm6-4.3.2
* -proto: If this option is specified, the tagger creates a file named
"lexicon-protocol.txt", which contains information about the degree of
ambiguity and about the other possible tags of a word form. The part of
the lexicon in which the word form has been found is also indicated.
'f' means fullform lexicon and 's' means affix lexicon. 'h' means that
the word contains a hyphen and that the part of the word following the
hyphen has been found in the fullform lexicon.
* -eps: Value which is used to replace zero lexical frequencies. A zero
frequency occurs if a word/tag pair is contained in the lexicon but not
in the training corpus. The default is 0.1. The choice of this parameter
has only a minor influence on tagging accuracy.
* -beam: If the tagger is slow, this option can be used to speed it up.
Good values are in the range 0.001 to 0.00001.
* -base: If this option is specified, only lexical information is used
for tagging but no contextual information about the preceding tags.
This option is only useful in order to obtain a baseline result
to which to compare the actual tagger output.
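
The effect of -eps, -base and -beam can be illustrated with a small
Python sketch. It is not the tagger's actual code: the function names,
the dictionary-based lexicon, the "UNK" fallback tag and the pruning
rule (drop every hypothesis whose probability is below the threshold
times the best probability) are assumptions made for illustration only.

```python
from typing import Dict, List


def lexical_probs(tag_counts: Dict[str, float], eps: float = 0.1) -> Dict[str, float]:
    """Replace zero frequencies by eps, then normalise to probabilities.

    Assumption: zero counts are simply replaced by eps (cf. -eps).
    """
    smoothed = {tag: (count if count > 0 else eps)
                for tag, count in tag_counts.items()}
    total = sum(smoothed.values())
    return {tag: value / total for tag, value in smoothed.items()}


def baseline_tag(words: List[str],
                 lexicon: Dict[str, Dict[str, float]],
                 eps: float = 0.1,
                 default_tag: str = "UNK") -> List[str]:
    """Lexical-only tagging (cf. -base): the most probable tag per word.

    No information about preceding tags is used. Words missing from the
    hypothetical lexicon receive default_tag.
    """
    tags = []
    for word in words:
        counts = lexicon.get(word)
        if not counts:
            tags.append(default_tag)
            continue
        probs = lexical_probs(counts, eps)
        tags.append(max(probs, key=probs.get))
    return tags


def beam_prune(hypotheses: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Keep only hypotheses with probability >= threshold * best (cf. -beam).

    Assumption: the threshold is relative to the best hypothesis.
    """
    best = max(hypotheses.values())
    return {tag: p for tag, p in hypotheses.items() if p >= threshold * best}


if __name__ == "__main__":
    # Toy lexicon: "run" was never seen as JJ in the (imaginary) corpus.
    lexicon = {"run": {"NN": 3, "VB": 7, "JJ": 0}}
    print(lexical_probs(lexicon["run"]))
    print(baseline_tag(["run", "xyz"], lexicon))
    print(beam_prune({"NN": 0.5, "VB": 0.01, "JJ": 1e-7}, threshold=0.001))
```

In this sketch a larger threshold prunes more hypotheses, which is why
it trades speed against the risk of discarding the correct tag.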