mirror of https://github.com/crawler-commons/crawler-commons synced 2024-05-03 22:26:15 +02:00

[BasicNormalizer] Query parameters normalization in BasicURLNormalizer, closes #308
- add unit test to prove that an empty query is removed
Sebastian Nagel 2023-06-12 22:29:54 +02:00
parent 9261174c6c
commit e5563c3049
2 changed files with 4 additions and 0 deletions

@@ -1,6 +1,7 @@
Crawler-Commons Change Log
Current Development 1.4-SNAPSHOT (yyyy-mm-dd)
- [BasicNormalizer] Query parameters normalization in BasicURLNormalizer (aecio, sebastian-nagel) #308
- [Robots.txt] Handle allow/disallow directives containing unescaped Unicode characters (sebastian-nagel, Richard Zowalla, aecio) #389
- Improve readability of robots.txt unit tests (sebastian-nagel, Richard Zowalla) #383
- Upgrade project to use Java 11 (Avi Hayun, Richard Zowalla, aecio, sebastian-nagel) #320

@@ -142,6 +142,9 @@ http:///////, http:/
http://example.com?,http://example.com/
http://example.com?a=1,http://example.com/?a=1
# empty query #308
http://example.com/?,http://example.com/
# normalizing percent escapes #263
https://www.last.fm/music/Prefuse+73/_/90%+of+My+Mind+Is+With+You,https://www.last.fm/music/Prefuse+73/_/90%25+of+My+Mind+Is+With+You
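
The input,expected pairs above for the empty-query case can be illustrated with a minimal sketch. This is not the BasicURLNormalizer implementation: the class EmptyQuerySketch and the method normalize are hypothetical names, the code relies only on java.net.URI, and it covers just the empty-query and empty-path rules (no percent-escape handling, and any fragment is dropped).

```java
import java.net.URI;

// Hypothetical illustration only, not the crawler-commons code.
public class EmptyQuerySketch {

    // Rebuilds the URL, replacing an empty path with "/" and omitting
    // the "?" when the query is absent or empty.
    static String normalize(String url) {
        URI uri = URI.create(url);

        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }

        StringBuilder sb = new StringBuilder();
        sb.append(uri.getScheme()).append("://")
          .append(uri.getRawAuthority())
          .append(path);

        String query = uri.getRawQuery();
        if (query != null && !query.isEmpty()) {
            sb.append('?').append(query);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Inputs taken from the test data in the diff above.
        System.out.println(normalize("http://example.com?"));
        System.out.println(normalize("http://example.com?a=1"));
        System.out.println(normalize("http://example.com/?"));
    }
}
```

Treating a null query and an empty query the same way keeps the sketch independent of how the parser reports a bare "?".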
