mirror of https://github.com/crawler-commons/crawler-commons synced 2024-05-29 12:46:04 +02:00
crawler-commons/src/test
Sebastian Nagel 774c5c8092
Improvements to BasicURLNormalizer (#292)
- better percent-encoding of URL paths and queries, fixes #263
- hostnames:
  * convert IDNs from Unicode to Punycode, fixes #248
  * remove trailing dot
- normalize path `/..` to `/`
- also normalize path of file:/ URLs
2020-06-22 13:51:39 +01:00
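
The hostname and path items in the commit message above can be illustrated with a small sketch. This is not the code of `BasicURLNormalizer`; the class `UrlNormalizationSketch` and its methods `normalizeHost` and `normalizePath` are hypothetical names made up for this example, and only `java.net.IDN.toASCII` is a real JDK call. The sketch assumes an absolute path and leaves percent-encoding out entirely.

```java
import java.net.IDN;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only, not the crawler-commons implementation.
class UrlNormalizationSketch {

    // Remove one trailing dot, then convert a Unicode IDN to Punycode.
    static String normalizeHost(String host) {
        if (host.endsWith(".")) {
            host = host.substring(0, host.length() - 1);
        }
        return IDN.toASCII(host);
    }

    // Resolve "." and ".." segments of an absolute path.
    // A ".." at the root has nothing to pop, so "/.." becomes "/".
    static String normalizePath(String path) {
        if (!path.startsWith("/")) {
            return path; // assumption: relative paths are left untouched
        }
        Deque<String> kept = new ArrayDeque<>();
        boolean endsWithSlash = false;
        for (String segment : path.substring(1).split("/", -1)) {
            if (segment.equals("..")) {
                kept.pollLast(); // returns null (no-op) when already at the root
                endsWithSlash = true;
            } else if (segment.equals(".")) {
                endsWithSlash = true;
            } else {
                kept.addLast(segment);
                endsWithSlash = false;
            }
        }
        String joined = "/" + String.join("/", kept);
        return (endsWithSlash && !joined.endsWith("/")) ? joined + "/" : joined;
    }

    public static void main(String[] args) {
        System.out.println(normalizeHost("b\u00fccher.example."));
        System.out.println(normalizePath("/.."));
        System.out.println(normalizePath("/a/../b"));
    }
}
```

Stripping the trailing dot before calling `IDN.toASCII` keeps the two hostname steps independent of how the JDK treats an empty final label.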
java/crawlercommons Sitemaps to implement Serializable, fixes #244 (#294) 2020-06-22 12:51:40 +01:00
resources Improvements to BasicURLNormalizer (#292) 2020-06-22 13:51:39 +01:00