webnewscrawler-1.0
- WebNews crawler is a Java application that crawls (downloads, fetches) resources over HTTP. You can use it as a generic crawler to download web pages from the Internet. It has a set of filters to limit and focus the crawling process. In addition, WebNews crawler comes with a powerful HTML2XML library that can extract desired data from HTML pages and represent it in XML format. Together with its ability to parse RSS feeds, this makes the crawler useful for acquiring and cleaning web news articles.
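
To illustrate what "filters to limit and focus the crawling process" can mean in practice, here is a minimal sketch of URL filtering in plain Java. It does not use WebNews crawler's own API, which is not described here. The class `CrawlFilterSketch`, the methods `hostIs`, `pathMatches` and `accept`, and the host `news.example.com` are all hypothetical names chosen for the example.

```java
import java.net.URI;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Illustrative only: NOT the WebNews crawler API, just the general idea of crawl filters.
public class CrawlFilterSketch {

    // Hypothetical filter: accept only URLs served by the given host.
    static Predicate<URI> hostIs(String host) {
        return uri -> host.equalsIgnoreCase(uri.getHost());
    }

    // Hypothetical filter: accept only URLs whose path fully matches the regex.
    static Predicate<URI> pathMatches(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return uri -> uri.getPath() != null && pattern.matcher(uri.getPath()).matches();
    }

    // A URL is crawled only if every configured filter accepts it.
    static boolean accept(URI uri, List<Predicate<URI>> filters) {
        return filters.stream().allMatch(filter -> filter.test(uri));
    }

    public static void main(String[] args) {
        List<Predicate<URI>> filters = List.of(
                hostIs("news.example.com"),
                pathMatches("/articles/.*\\.html"));

        // Same host and matching path: accepted.
        System.out.println(accept(URI.create("http://news.example.com/articles/a1.html"), filters));
        // Different host: rejected.
        System.out.println(accept(URI.create("http://ads.example.com/banner.html"), filters));
    }
}
```

Combining small predicates this way keeps each rule independent, so a crawl can be narrowed by adding a filter to the list without changing the others.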