public class SimpleRobotRulesParser extends BaseRobotsParser
Constructor and Description |
---|
SimpleRobotRulesParser() |
Modifier and Type | Method and Description |
---|---|
BaseRobotRules | failedFetch(int httpStatusCode) The fetch of robots.txt failed, so return rules appropriate given the HTTP status code. |
int | getNumWarnings() |
BaseRobotRules | parseContent(String url, byte[] content, String contentType, String robotName) Parse the robots.txt file in content. |
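
The summary above says that failedFetch returns rules chosen from the HTTP status code, but it does not state the mapping. The following is a minimal sketch of one plausible policy, not the library's implementation: the class name, the `Outcome` enum, and the rule that 4xx allows everything while any other failure defers crawling are all assumptions made for illustration.

```java
// Hypothetical sketch; names and policy are assumptions, not crawler-commons code.
final class FailedFetchPolicySketch {

    /** Illustrative outcomes; the real method returns a BaseRobotRules object. */
    enum Outcome { ALLOW_ALL, ALLOW_NONE_RETRY_LATER }

    /**
     * Maps a failure status code to an outcome under the assumed policy:
     * 4xx is read as "no usable robots.txt", so nothing is restricted;
     * every other failure is treated as temporary, so crawling is deferred.
     */
    static Outcome outcomeFor(int httpStatusCode) {
        if (httpStatusCode >= 200 && httpStatusCode < 300) {
            // The documented contract is a failure status code (NOT 2xx).
            throw new IllegalArgumentException("not a failure status: " + httpStatusCode);
        }
        if (httpStatusCode >= 400 && httpStatusCode < 500) {
            return Outcome.ALLOW_ALL;
        }
        return Outcome.ALLOW_NONE_RETRY_LATER;
    }

    public static void main(String[] args) {
        // Print the assumed outcome for a missing file and a server error.
        for (int code : new int[] {404, 503}) {
            System.out.println(code + " -> " + outcomeFor(code));
        }
    }
}
```

Rejecting 2xx codes up front mirrors the parameter contract in the method detail; whether the real parser throws or falls back in that case is not stated on this page.
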

public BaseRobotRules failedFetch(int httpStatusCode)
Specified by: failedFetch in class BaseRobotsParser
Parameters:
httpStatusCode - a failure status code (NOT 2xx)

public BaseRobotRules parseContent(String url, byte[] content, String contentType, String robotName)
Specified by: parseContent in class BaseRobotsParser
Parameters:
url - URL that content was fetched from (for reporting purposes)
content - raw bytes from the site's robots.txt file
contentType - HTTP response header (mime-type)
robotName - name of crawler, to be used when processing file contents (just the name portion, w/o version or other details)

public int getNumWarnings()
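
The robotName parameter of parseContent expects only the name portion of the crawler's identity, without version or other details. A minimal sketch of deriving that value from a full User-Agent string follows; the class and method names are hypothetical, and both the choice of delimiters (`/`, `(`, whitespace) and the lowercasing step are assumptions rather than documented behavior of the parser.

```java
import java.util.Locale;

// Hypothetical helper; not part of crawler-commons.
final class RobotNameSketch {

    /**
     * Returns the leading name token of a User-Agent string, lowercased.
     * The token ends at the first '/', '(' or whitespace character (assumed delimiters).
     */
    static String toRobotName(String userAgent) {
        String trimmed = userAgent.trim();
        int end = 0;
        while (end < trimmed.length()) {
            char c = trimmed.charAt(end);
            if (c == '/' || c == '(' || Character.isWhitespace(c)) {
                break;
            }
            end++;
        }
        // Lowercasing is an assumption made so that comparisons are case-insensitive.
        return trimmed.substring(0, end).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Strip the version and contact details from a made-up User-Agent.
        System.out.println(toRobotName("MyCrawler/1.2 (+https://example.com/bot)"));
    }
}
```

The result would then be passed as the fourth argument to parseContent, alongside the URL, the raw bytes, and the response content type.
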
Copyright © 2015. All rights reserved.