1
0
mirror of https://github.com/crawler-commons/crawler-commons synced 2024-09-23 17:33:23 +02:00
Commit Graph

5 Commits

Author SHA1 Message Date
Sebastian Nagel
78d7e7e85f
Sitemaps to implement Serializable, fixes #244 (#294)
- make all sitemap classes including extensions to implement the
  Serializable interface
- extend sitemap parser unit tests to check object serialization
  on various types of sitemaps (index, Atom/RSS feeds, video sitemaps,
  etc.)
2020-06-22 12:51:40 +01:00
Evan Halley
ed0933f2b3
fixing NewsAttribute.equals(), comparing against that.publicationDate, updating the unit teset, added change to Changes.txt (#291) 2020-06-16 14:39:05 +01:00
Evan Halley
c04e3f17e7
Adding asMap to ExtensionMetadata Interface (#288)
* added abstract method to extension metadata

* implemented asmap in image/link/mobile/news attributes

* implemented asmap in videoattributes

* adding video attributes unit test

* added news attributes unit tests

* unit test for link attributes

* unit tests for image and mobile attributes

* added constants to news and link attributes
fixing a small issue in NewsAttributes.toString

* using constants instead of strings in more attributes

* cleaned up the imports

* decreasing the visibility of LinkAttributes.PARAMS_PREFIX
adding a comment explaining it's usage

* added related issue to the changelog

* reverting change to NewsAttributes.equal, that causes a unit test failure
2020-06-15 15:55:20 +01:00
Sebastian Nagel
8522cfdd34
[SiteMapParser] getPublicationDate in VideoAttributes may throw NPE, fixes #283 (#286)
- check for null values before converting ZonedDateTime to Date
2020-02-17 15:33:38 +00:00
Sebastian Nagel
b924bd0828 Sitemap extension support
- optionally parse elements in the namespace of sitemap extensions:
  - Google video sitemaps (resolves #35)
  - Google image sitemaps (resolves #36)
  - Google news sitemaps
  - alternate links in sitemaps (resolves #149)
- the code is taken from Tanguy Moal's (@tuxnco) PR #162
  with the following modifications:
  - port from DOM to SAX parser
  - keep specific extensions separate from the "core" sitemap classes
2018-09-28 12:04:39 +02:00