An NPE is thrown because parseFloat returns a Float object, which is null when a NumberFormatException occurs, whereas VideoPrice accepts only a primitive float.
To work around this and avoid recurring errors, I've changed the VideoPrice price field to a Float object, so that it accepts null when the price cannot be parsed.
This is far from ideal, and parseFloat should eventually be able to parse locale-specific number formats. Still, as a quick first fix, this allows the rest of the file to be parsed,
whereas previously the error caused parsing of the whole file to fail.
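
The idea behind the fix can be sketched as follows. This is a minimal, self-contained illustration and not the actual crawler-commons code: the `Price` class and the `parseFloatOrNull` helper are hypothetical stand-ins for VideoPrice and the parser's parseFloat. The boxed `Float` field is what lets an unparsable value be stored as null instead of failing on unboxing.

```java
import java.util.Locale;

public class PriceParsingSketch {

    /** Hypothetical stand-in for the price holder; not the real VideoPrice class. */
    public static final class Price {
        private final String currency;
        // Boxed Float (not primitive float), so an unparsable price can be kept as null.
        private final Float amount;

        public Price(String currency, Float amount) {
            this.currency = currency;
            this.amount = amount;
        }

        public Float getAmount() {
            return amount;
        }

        @Override
        public String toString() {
            return amount == null
                    ? currency + " (price unavailable)"
                    : String.format(Locale.ROOT, "%s %.2f", currency, amount);
        }
    }

    /** Hypothetical helper: returns null instead of throwing on malformed input. */
    public static Float parseFloatOrNull(String text) {
        if (text == null) {
            return null;
        }
        try {
            return Float.valueOf(text.trim());
        } catch (NumberFormatException e) {
            // e.g. "1,99" written with a decimal comma is not accepted by Float.valueOf
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(new Price("EUR", parseFloatOrNull("1.99"))); // EUR 1.99
        System.out.println(new Price("EUR", parseFloatOrNull("1,99"))); // EUR (price unavailable)
    }
}
```

With a primitive `float` field, assigning the null result would trigger auto-unboxing and throw the NPE described above; callers of `getAmount()` now have to check for null themselves.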
- optionally parse elements in the namespace of sitemap extensions:
- Google video sitemaps (resolves #35)
- Google image sitemaps (resolves #36)
- Google news sitemaps
- alternate links in sitemaps (resolves #149)
- the code is taken from Tanguy Moal's (@tuxnco) PR #162
with the following modifications:
- port from DOM to SAX parser
- keep specific extensions separate from the "core" sitemap classes
* Use the Java 8 date and time API (java.time.*) to parse dates in sitemaps
- use thread-safe DateTimeFormatter instead of ThreadLocal<DateFormat>
- simplify parsing of RSS publication dates
- remove obsolete regex pattern to catch dates with time zone
but without seconds (covered by DateTimeFormatter.ISO_OFFSET_DATE_TIME)
- extend unit tests
* Fix Javadoc error and warnings, update change log
* Remove obsolete dependency on jaxb-api
- import of javax.xml.bind.DatatypeConverter has been removed
by updating to Java 8 date and time API
* Allow for legacy URIs when checking sitemap namespaces
- e.g., allow legacy namespace URI but ignore URLs
from image and video sitemap extensions
- resolve relative namespace URIs
- add namespace URIs of sitemap extensions (news, images, videos)
* Address kkrugler's review comments:
- document addition of sitemap namespace required by sitemap
protocol specification when calling setStrictNamespace(true)
- remove early return on <rss> root element
* Add main to SimpleRobotRulesParser for testing
- implement toString() for robot rules
- fix line breaks in comments
* Do not detect MIME type as Tika dependency has been removed
- Makes SimpleRobotRulesParser._rules property protected
and adds getters for SimpleRobotRulesParser._rules and
RobotRules's properties
- Changes SimpleRobotRulesParser return type from BaseRobotRules
to SimpleRobotRules to allow access to concrete class without
nasty type casts while still obeying super class contract
- ignore query part of URL to determine sitemap location prefix
for URL validation, fixes #202
- resolve relative links in RSS feeds, fixes #203
- allow non-continuous content (containing XML entities or CDATA)
when parsing links in RSS feeds, fixes #204
- extract links from <guid> elements in RSS feeds, fixes #201
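
Two of the URL-related changes above (ignoring the query part when deriving the sitemap location prefix, and resolving relative links in RSS feeds) can be illustrated with plain `java.net.URI`. This is only a sketch under assumptions: the class and method names are hypothetical, and the actual parser code may derive the prefix differently.

```java
import java.net.URI;

public class SitemapUrlSketch {

    /**
     * Hypothetical helper: the location prefix a sitemap may cover, i.e.
     * everything up to the last '/' of the path. The query part is ignored.
     */
    public static String locationPrefix(URI sitemapUri) {
        String path = sitemapUri.getRawPath(); // path only, without query or fragment
        if (path == null) {
            path = "";
        }
        String dir = path.substring(0, path.lastIndexOf('/') + 1);
        if (dir.isEmpty()) {
            dir = "/";
        }
        String port = sitemapUri.getPort() == -1 ? "" : ":" + sitemapUri.getPort();
        return sitemapUri.getScheme() + "://" + sitemapUri.getHost() + port + dir;
    }

    /** Hypothetical helper: resolves a possibly relative feed link against the feed URL. */
    public static URI resolve(URI feedUri, String link) {
        return feedUri.resolve(link.trim());
    }

    public static void main(String[] args) {
        // The '/' characters inside the query must not influence the prefix.
        URI sitemap = URI.create("https://example.com/feeds/sitemap.php?page=/other/dir/");
        System.out.println(locationPrefix(sitemap)); // https://example.com/feeds/

        URI feed = URI.create("https://example.com/feeds/rss.xml");
        System.out.println(resolve(feed, "/news/item1.html")); // https://example.com/news/item1.html
        System.out.println(resolve(feed, "item2.html"));       // https://example.com/feeds/item2.html
    }
}
```

If the prefix were computed on the full URL string, the last '/' would be found inside the query, which yields a wrong prefix and breaks validation of the URLs listed in the sitemap.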