mirror of https://github.com/crawler-commons/crawler-commons synced 2024-05-03 22:26:15 +02:00

Updates changelog for #376, #380, #401, #414, #425, #428, #422/#424, #114/#390/#430, #245/#360

Sebastian Nagel 2023-07-12 16:16:30 +02:00
parent 6fb34cf856
commit a62bd80140

@@ -1,6 +1,11 @@
 Crawler-Commons Change Log
 Current Development 1.4-SNAPSHOT (yyyy-mm-dd)
+- [Robots.txt] Implement Robots Exclusion Protocol (REP) IETF Draft: port unit tests (sebastian-nagel, Richard Zowalla) #245, #360
+- [Robots.txt] Close groups of rules as defined in RFC 9309 (kkrugler, garyillyes, jnioche, sebastian-nagel) #114, #390, #430
+- [Robots.txt] Empty disallow statement not to clear other rules (sebastian-nagel, jnioche) #422, #424
+- [Robots.txt] SimpleRobotRulesParser main() to follow five redirects (sebastian-nagel, jnioche) #428
+- [Robots.txt] Add more spelling variants and typos of robots.txt directives (sebastian-nagel, jnioche) #425
 - [Robots.txt] Document effect of rules merging in combination with multiple agent names (sebastian-nagel, Richard Zowalla) #423, #426
 - [Robots.txt] Pass empty collection of agent names to select rules for any robot (wildcard user-agent name) (sebastian-nagel, Richard Zowalla) #427
 - [Robots.txt] Rename default user-agent / robot name in unit tests (sebastian-nagel, Richard Zowalla) #429
@@ -9,16 +14,17 @@ Current Development 1.4-SNAPSHOT (yyyy-mm-dd)
 - [Robots.txt] Deduplicate robots rules before matching (sebastian-nagel, jnioche) #416
 - [Robots.txt] SimpleRobotRulesParser main to use the new API method (sebastian-nagel, jnioche) #413
 - Generate JaCoCo reports when testing (jnioche) #409, #412
+- Push Code Coverage to Coveralls (Richard Zowalla, jnioche) #414
 - [Robots.txt] Path analyse bug with url-decode if allow/disallow path contains escaped wild-card characters (tkalistratov, sebastian-nagel, Richard Zowalla) #195, #408
-- [Robots.txt] Handle allow/disallow directives containing unescaped Unicode characters (sebastian-nagel, Richard Zowalla, aecio) #389
-- Improve readability of robots.txt unit tests (sebastian-nagel, Richard Zowalla) #383
-- Upgrade project to use Java 11 (Avi Hayun, Richard Zowalla, aecio, sebastian-nagel) #320
+- [Robots.txt] Handle allow/disallow directives containing unescaped Unicode characters (sebastian-nagel, Richard Zowalla, aecio) #389, #401
+- [Robots.txt] Improve readability of robots.txt unit tests (sebastian-nagel, Richard Zowalla) #383
+- Upgrade project to use Java 11 (Avi Hayun, Richard Zowalla, aecio, sebastian-nagel) #320, #376
 - [Robots.txt] RFC compliance: matching user-agent names when selecting rule blocks (sebastian-nagel, Richard Zowalla) #362
 - [Robots.txt] Matching user-agent names does not conform to robots.txt RFC (YossiTamari, sebastian-nagel) #192
 - [Robots.txt] Improve robots check draft rfc compliance (Eduardo Jimenez) #351
 - Upgrade dependencies (dependabot) #379, #384, #394, #399, #404, #419
 - Upgrade Maven plugins (dependabot) #377, #381, #386, #396, #397, #398, #400, #402, #403, #405, #406, #407, #415, #418
-- Javadoc: ensure Javascript search is working (sebastian-nagel, Richard Zowalla, aecio) #378
+- Javadoc: ensure Javascript search is working (sebastian-nagel, Richard Zowalla, aecio) #378, #380
 Release 1.3 (2022-07-19)
 - [Sitemaps] Disable support for DTDs in XML sitemaps and feeds by default (Kenneth Wong) #371