mirror of https://github.com/crawler-commons/crawler-commons synced 2024-05-28 20:36:05 +02:00
crawler-commons/CHANGES.txt
lewis.mcgibbney@gmail.com 1fca6c714f clean up for 0.2 release
2013-01-30 04:12:34 +00:00

Crawler-Commons Change Log

Release 0.2
- Move to pure Maven for CC build lifecycle (lewismc)
- Move Javadoc out of core code (lewismc)
- Substantiate Javadoc (lewismc)
- Review default.properties (lewismc)
- add HTTP status code & reason to FetchedResult (Fuad Efendi via kkrugler)
- support for multiple user agent names (Tejas Patil via kkrugler)
- added Javadoc generation, published in /doc/javadoc (kkrugler)
- switch to using eclipse-formatter.properties (kkrugler)
- support robots.txt files that have UTF-16LE and UTF-16BE BOMs (kkrugler)
- support for user agent names that contain spaces (kkrugler)
- fixed handling of BOM in sitemaps (Vivek Magotra via kkrugler)
- refactoring of SiteMap objects (Hannes Schwarz via jnioche)
- added simple support for the file: protocol (kkrugler)
- cleaned up packaging and added "install" target (kkrugler)

Release 0.1
- parsing robots.txt
- parsing sitemaps
- URL analyzer which returns Top Level Domains
- a simple HttpFetcher
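
Note on the BOM entries in Release 0.2: the sketch below illustrates the general idea of choosing a charset from a leading byte order mark before parsing a robots.txt or sitemap body. It is not the crawler-commons implementation. The class BomSniffer, its nested Result type, and the methods sniff and decode are hypothetical names introduced only for this example, and the fallback to UTF-8 when no BOM is present is an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of crawler-commons): picks a charset from a leading BOM. */
final class BomSniffer {

    /** Charset to decode with, plus the number of BOM bytes to skip. */
    static final class Result {
        final Charset charset;
        final int bomLength;

        Result(Charset charset, int bomLength) {
            this.charset = charset;
            this.bomLength = bomLength;
        }
    }

    /** Inspects the first bytes of the content for a UTF-8 or UTF-16 BOM. */
    static Result sniff(byte[] content) {
        if (startsWith(content, 0xEF, 0xBB, 0xBF)) {
            return new Result(StandardCharsets.UTF_8, 3);
        }
        if (startsWith(content, 0xFF, 0xFE)) {
            return new Result(StandardCharsets.UTF_16LE, 2);
        }
        if (startsWith(content, 0xFE, 0xFF)) {
            return new Result(StandardCharsets.UTF_16BE, 2);
        }
        // Assumption: with no BOM, treat the content as UTF-8.
        return new Result(StandardCharsets.UTF_8, 0);
    }

    /** Decodes the content with the sniffed charset, dropping the BOM bytes. */
    static String decode(byte[] content) {
        Result r = sniff(content);
        return new String(content, r.bomLength, content.length - r.bomLength, r.charset);
    }

    /** Returns true if the content begins with the given unsigned byte values. */
    private static boolean startsWith(byte[] content, int... prefix) {
        if (content.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((content[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Calling BomSniffer.decode(bytes) on a fetched body returns text with the BOM removed, so a line-oriented parser sees "User-agent" rather than a stray marker character at the start of the first line. The sketch does not distinguish the UTF-32LE BOM (FF FE 00 00) from UTF-16LE; a fuller version would check for that first.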