bios-1.1.0
----------
Bios is a suite of syntactico-semantic analyzers that includes the most common
tools needed for the shallow analysis of English text.
Currently the following tools are included:
(*) Smart tokenizer that recognizes abbreviations, SGML tags, etc.
(*) Part-of-speech (POS) tagger.
The POS tagger is implemented as a
wrapper around the TNT tagger by Thorsten Brants.
(*) Syntactic chunking using the labels promoted by the CoNLL
chunking evaluations (http://www.cnts.ua.ac.be/conll2000/chunking).
(*) Named-Entity Recognition and Classification (NERC) for the CoNLL
entity types plus an additional 11 numerical entity types.
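
To illustrate the chunk labels mentioned above, here is a minimal Python
sketch that groups CoNLL-2000-style rows (word, POS tag, chunk tag) into
chunks. It is not part of Bios and does not claim to reproduce Bios's actual
output format: the three-column input layout follows the CoNLL-2000 shared
task data, and the function name group_chunks and the SAMPLE text are
hypothetical names chosen for this example.

```python
def group_chunks(rows):
    """Group (word, pos, tag) rows carrying B-/I-/O chunk tags into chunks.

    Returns a list of (label, [words]) pairs. Tokens tagged "O" lie outside
    any chunk and are dropped. Hypothetical helper, not a Bios function.
    """
    chunks = []
    current_label, current_words = None, []
    for word, _pos, tag in rows:
        # An I- tag with the same label continues the open chunk.
        if tag.startswith("I-") and current_label == tag[2:]:
            current_words.append(word)
            continue
        # Anything else closes the open chunk, if there is one.
        if current_label is not None:
            chunks.append((current_label, current_words))
        if tag == "O":
            current_label, current_words = None, []
        else:
            # A B- tag (or a stray I- tag) starts a new chunk.
            current_label, current_words = tag[2:], [word]
    if current_label is not None:
        chunks.append((current_label, current_words))
    return chunks


# Assumed input: one token per line, columns separated by whitespace.
SAMPLE = """He PRP B-NP
reckons VBZ B-VP
the DT B-NP
current JJ I-NP
account NN I-NP
deficit NN I-NP
. . O"""

rows = [tuple(line.split()) for line in SAMPLE.splitlines()]
for label, words in group_chunks(rows):
    print(label, " ".join(words))
```

Here B-NP opens a noun phrase, I-NP continues it, and O marks a token outside
any chunk, so the sample yields three chunks: two noun phrases and one verb
phrase.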