FRDCSA:

treetagger-3.1


* -token: The words/tokens are printed in addition to the POS tags.
* -lemma: Lemmas are printed as well.
* -sgml: This option instructs the tagger to ignore tokens which start with '<' and end with '>' (SGML tags).
* -lex: The file given as the argument of this option contains additional lexicon entries to be used by the tagger. The file format is identical to the format of the lexicon argument of the training program (see below).
* -no-unknown: If an unknown word is encountered, emit the word form as lemma. This was previously the default behaviour. Now, the default behaviour is to print "<unknown>" as lemma.
* -threshold: This option takes a threshold value as its argument and tells the tagger to print all tags of a word with a probability higher than that value times the largest probability. (The tagger will use a different algorithm in this case and the set of best tags might be different from the tags generated without this option.)
* -prob: Print tag probabilities (in combination with option -threshold).
* -pt-with-prob: If this option is specified, then each pretagging tag (see above) has to be followed by a whitespace and a tag probability value.
* -pt-with-lemma: If this option is specified, then each pretagging tag (see above) has to be followed by a whitespace and a lemma. Lemmas may contain blanks. If both -pt-with-prob and -pt-with-lemma have been specified, then each pretagging tag is followed by a probability and a lemma in that order.
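
The selection rule behind the -threshold option can be illustrated with a short Python sketch. This is not TreeTagger's code and says nothing about the different algorithm the tagger switches to. It only shows which tags pass the "higher than threshold times the largest probability" test for a single word. The function name tags_above_threshold and the example probabilities are made up for illustration.

```python
def tags_above_threshold(tag_probs, threshold):
    """Return (tag, probability) pairs whose probability is strictly higher
    than threshold * (largest probability), most probable tag first.

    tag_probs: dict mapping a tag to its probability for one word
               (hypothetical input, not a TreeTagger data structure).
    threshold: value assumed to lie between 0 and 1; with a threshold of 1
               no tag is strictly higher than the cutoff.
    """
    if not tag_probs:
        return []
    cutoff = threshold * max(tag_probs.values())
    kept = [(tag, p) for tag, p in tag_probs.items() if p > cutoff]
    # Sort by descending probability so the best tag comes first.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept


# Made-up probabilities for one ambiguous word.
example = {"NN": 0.70, "VB": 0.25, "JJ": 0.05}
print(tags_above_threshold(example, 0.3))
```

With a threshold of 0.3 the cutoff is 0.3 times 0.70, so NN and VB are kept and JJ is dropped. A lower threshold lets more alternative tags through, and a higher one keeps only tags close to the best one.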