mirror of
https://github.com/crawler-commons/crawler-commons
synced 2024-06-25 11:57:10 +02:00
6fb34cf856
- port unit tests from https://github.com/google/robotstxt - adapt "Google-only" unit tests dealing with overlong lines and non-standard user-agent names - adapt unit tests dealing with overlong lines and percent-encoded URL paths where the behavior of SimpleRobotRulesParser is not wrong and could even be seen as an improvement compared to the restrictions put on API input params by the Google robots.txt parser |
||
---|---|---|
main | ||
test |