2011-07-25 12:23:21 +02:00
Crawler-Commons Change Log

Release 0.2
2013-01-28 03:45:41 +01:00
- Substantiate Javadoc (lewismc)
2013-01-28 03:43:34 +01:00
- Review default.properties (lewismc)
2013-01-24 00:08:51 +01:00
- add HTTP status code & reason to FetchedResult (Fuad Efendi via kkrugler)
- support for multiple user agent names (Tejas Patil via kkrugler)
- added javadoc generation, publish in /doc/javadoc (kkrugler)
- switch to using eclipse-formatter.properties (kkrugler)
- support robots.txt files that have UTF-16LE and UTF-16BE BOMs (kkrugler)
- support for user agent names that contain spaces (kkrugler)
- fixed handling of BOM in sitemaps (Vivek Magotra via kkrugler)
2011-07-25 12:23:21 +02:00
- refactoring of SiteMap objects (Hannes Schwarz via jnioche)
- added simple support for the file: protocol (kkrugler)
- cleaned up packaging and added "install" target (kkrugler)

Release 0.1
- parsing robots.txt
- parsing sitemaps
- URL analyzer which returns Top Level Domains
- a simple HttpFetcher