treetagger-3.1
-
* -token: The words/tokens are printed in addition to the POS tags
* -lemma: Lemmas are printed as well.
* -sgml: This option instructs the tagger to ignore tokens which start
with '<' and end with '>' (SGML tags).
* -lex <f>: The file <f> contains additional lexicon entries to be used
by the tagger. The file format is identical to the format of the
lexicon argument of the training program (see below).
* -no-unknown: If an unknown word is encountered, emit the word form as
lemma. This was previously the default behaviour. Now, the default
behaviour is to print "<unknown>" as lemma.
* -threshold <p>: This option tells the tagger to print all tags of a
word with a probability higher than <p> times the largest probability.
(The tagger will use a different algorithm in this case and the set of
best tags might be different from the tags generated without this
option.)
* -prob: Print tag probabilities (in combination with option -threshold).
* -pt-with-prob: If this option is specified, then each pretagging tag
(see above) has to be followed by a whitespace and a tag probability
value.
* -pt-with-lemma: If this option is specified, then each pretagging tag
(see above) has to be followed by a whitespace and a lemma. Lemmas may
contain blanks. If both -pt-with-prob and -pt-with-lemma have been
specified, then each pretagging tag is followed by a probability and a
lemma, in that order.
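
The selection rules described for -sgml and -threshold can be illustrated with a minimal Python sketch. It is not the tagger's implementation: the function names is_sgml_tag and tags_above_threshold are hypothetical, the tag probabilities are made-up inputs, and the tagger itself switches to a different algorithm when -threshold is given, so its actual output may differ from what this filter produces.

```python
from typing import Dict, List, Tuple


def is_sgml_tag(token: str) -> bool:
    """Mirror the -sgml rule: the token starts with '<' and ends with '>'."""
    return token.startswith("<") and token.endswith(">")


def tags_above_threshold(tag_probs: Dict[str, float], p: float) -> List[Tuple[str, float]]:
    """Keep tags whose probability is higher than p times the largest one.

    tag_probs is a hypothetical mapping from tag to probability for one word.
    """
    if not tag_probs:
        return []
    cutoff = p * max(tag_probs.values())
    kept = [(tag, prob) for tag, prob in tag_probs.items() if prob > cutoff]
    # Most probable tag first.
    return sorted(kept, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Illustrative values only; they do not come from a real parameter file.
    probs = {"NN": 0.5, "VB": 0.25, "JJ": 0.125}
    print(tags_above_threshold(probs, 0.3))
    print([tok for tok in ["<s>", "The", "dog", "</s>"] if not is_sgml_tag(tok)])
```

With a threshold of 0.3 and a best probability of 0.5, the cutoff is 0.15, so only the tags above that value are kept. The most probable tag always survives as long as the threshold is below 1.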