mirror of https://github.com/crawler-commons/crawler-commons synced 2024-09-26 10:10:52 +02:00
Commit Graph

86 Commits

Author SHA1 Message Date
kkrugler_lists@transpac.com
ea67b56e42 Add tests for wildcards (via alparslanavci), and sorting rules 2014-03-13 23:50:17 +00:00
kkrugler_lists@transpac.com
af74ccf44d Add support for wildcards (via alparslanavci), and sorting rules 2014-03-13 23:49:49 +00:00
kkrugler_lists@transpac.com
300d6ebdb7 Roll in patch from Lewis for issue #23 (http://code.google.com/p/crawler-commons/issues/detail?id=23) 2014-01-24 21:16:38 +00:00
kkrugler_lists@transpac.com
dc8f241782 Fix up tests to match latest data file 2014-01-24 21:05:46 +00:00
kkrugler_lists@transpac.com
aa4d410223 Make setProcessed public, was implicitly package private 2014-01-24 20:51:33 +00:00
kkrugler_lists@transpac.com
dbae7e20df Updated comments w/link to actual data Mozilla data file 2014-01-24 20:44:51 +00:00
kkrugler_lists@transpac.com
16e46b0d50 Added a few more suffixes 2014-01-24 20:44:31 +00:00
kkrugler_lists@transpac.com
a98bb030af Updated to latest from http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1 2014-01-24 20:44:12 +00:00
digitalpebble@googlemail.com
9b6bf65b1a cleanup of ANT build remnants [lib and lib-ext] 2013-10-21 15:31:14 +00:00
digitalpebble@googlemail.com
816832b10b [maven-release-plugin] prepare for next development iteration 2013-10-11 15:21:59 +00:00
digitalpebble@googlemail.com
1389cf0066 [maven-release-plugin] prepare release crawler-commons-0.3 2013-10-11 15:21:52 +00:00
digitalpebble@googlemail.com
2e08419852 Fixed scm info in pom 2013-10-11 15:20:53 +00:00
digitalpebble@googlemail.com
ee88e20e4a [maven-release-plugin] rollback the release of crawler-commons-0.3 2013-10-11 15:18:50 +00:00
digitalpebble@googlemail.com
45975212ad [maven-release-plugin] prepare release crawler-commons-0.3 2013-10-11 15:13:19 +00:00
digitalpebble@googlemail.com
464d5c7956 [maven-release-plugin] rollback the release of crawler-commons-0.3 2013-10-11 12:48:50 +00:00
digitalpebble@googlemail.com
6ed2b2da50 [maven-release-plugin] prepare release crawler-commons-0.3 2013-10-11 12:40:15 +00:00
digitalpebble@googlemail.com
315a208b95 re-trying the release 2013-10-11 12:38:24 +00:00
digitalpebble@googlemail.com
92fb22c2a3 [maven-release-plugin] prepare release crawler-commons-0.3 2013-10-11 11:42:18 +00:00
digitalpebble@googlemail.com
097a927868 [maven-release-plugin] rollback the release of crawler-commons-0.3 2013-10-11 11:35:34 +00:00
digitalpebble@googlemail.com
704bf5ba8b [maven-release-plugin] prepare release crawler-commons-0.3 2013-10-11 11:06:20 +00:00
digitalpebble@googlemail.com
add77028cc [maven-release-plugin] rollback the release of crawler-commons-0.3 2013-10-11 10:59:37 +00:00
digitalpebble@googlemail.com
c7554efdcb [maven-release-plugin] prepare release crawler-commons-0.3 2013-10-11 10:58:00 +00:00
digitalpebble@googlemail.com
644254769e [maven-release-plugin] rollback the release of crawler-commons-0.3 2013-10-11 10:48:40 +00:00
digitalpebble@googlemail.com
dea86d57ea [maven-release-plugin] prepare for next development iteration 2013-10-11 10:46:23 +00:00
digitalpebble@googlemail.com
68106fd316 [maven-release-plugin] prepare release crawler-commons-0.3 2013-10-11 10:46:11 +00:00
digitalpebble@googlemail.com
baed790af1 upgraded version of Tika + reverted to 0.3-SNAPSHOT 2013-10-11 10:40:00 +00:00
digitalpebble@googlemail.com
ecdf47221e [maven-release-plugin] prepare for next development iteration 2013-10-03 09:31:50 +00:00
digitalpebble@googlemail.com
4e2b0bac6f [maven-release-plugin] prepare release crawler-commons-0.3 2013-10-03 09:31:44 +00:00
digitalpebble@googlemail.com
14919f77f0 marking version 0.3 in CHANGES 2013-10-03 09:12:38 +00:00
digitalpebble@googlemail.com
5f3ab105ad SiteMap tester can take mime type as argument 2013-10-03 09:04:23 +00:00
digitalpebble@googlemail.com
4ce4b358b6 issue 29 : more robust parsing when loc element is missing 2013-10-02 13:40:50 +00:00
digitalpebble@googlemail.com
d9e3cb4cbb Issue 25:Robots.txt parser should not lowercase sitemap URLs 2013-09-06 12:33:02 +00:00
digitalpebble@googlemail.com
15aa39d41c Added utility class for testing sitemaps 2013-07-18 14:01:37 +00:00
lewis.mcgibbney@gmail.com
a0328358c0 Issue 16: Remove Ant scripts and configurations 2013-07-01 19:18:25 +00:00
digitalpebble@googlemail.com
f4c0186292 Upgraded version of maven javadoc plugin 2013-07-01 10:33:59 +00:00
digitalpebble
7596599e02 issue 26 : default priority correctly implemented in SiteMaps 2013-05-24 14:15:51 +00:00
digitalpebble
40ef1f5a10 issue 27 : [SiteMap] Unnecessary String concatenations when logging + in SiteMapURL.toString() 2013-05-24 14:09:26 +00:00
lewis.mcgibbney@gmail.com
1042aa4436 cleanup pom 2013-01-30 06:35:14 +00:00
lewis.mcgibbney@gmail.com
7066ffee14 trivial commit to pom 2013-01-30 06:15:34 +00:00
lewis.mcgibbney@gmail.com
057e96c8a1 [maven-release-plugin] prepare for next development iteration 2013-01-30 06:05:01 +00:00
lewis.mcgibbney@gmail.com
28e3fe9d08 [maven-release-plugin] prepare release crawler-commons-0.2 2013-01-30 06:04:47 +00:00
lewis.mcgibbney@gmail.com
a184dae67f trivial update to project pom 2013-01-30 06:03:15 +00:00
lewis.mcgibbney@gmail.com
b5583b87f7 trivial update to project pon 2013-01-30 06:02:05 +00:00
lewis.mcgibbney@gmail.com
7f066c745e [maven-release-plugin] prepare release crawler-commons-0.2 2013-01-30 05:58:05 +00:00
lewis.mcgibbney@gmail.com
1fca6c714f clean up for 0.2 release 2013-01-30 04:12:34 +00:00
lewis.mcgibbney@gmail.com
6e7ee690d2 fix maven clean plugin configuration 2013-01-30 04:05:32 +00:00
lewis.mcgibbney@gmail.com
1494fe23eb substantiate project pom and move external jars out of lib 2013-01-30 03:57:47 +00:00
lewis.mcgibbney@gmail.com
df993771c1 Update for 0.2 release 2013-01-28 04:14:09 +00:00
lewis.mcgibbney@gmail.com
51bb23f01c CC 12 Substantiate Javadoc 2013-01-28 02:47:01 +00:00
lewis.mcgibbney@gmail.com
2f34db8056 CC 12 Substantiate Javadoc 2013-01-28 02:45:41 +00:00