mirror of https://github.com/crawler-commons/crawler-commons synced 2024-05-12 16:36:02 +02:00
crawler-commons/src
Sebastian Nagel 6fb34cf856
Implement Robots Exclusion Protocol (REP) IETF Draft: port unit tests (#360)
- port unit tests from https://github.com/google/robotstxt
- adapt "Google-only" unit tests dealing with overlong lines
  and non-standard user-agent names
- adapt unit tests dealing with overlong lines and percent-encoded
  URL paths where the behavior of SimpleRobotRulesParser is not
  wrong and could even be seen as an improvement compared to
  the restrictions put on API input params by the Google robots.txt parser
2023-07-12 15:28:59 +02:00
main Merge pull request #430 from sebastian-nagel/cc-390-114-robots-closing-rule-group 2023-07-12 10:35:48 +02:00
test Implement Robots Exclusion Protocol (REP) IETF Draft: port unit tests (#360) 2023-07-12 15:28:59 +02:00