Nutch error No agents listed in 'http.agent.name'

I am using Nutch 2.2.1. The log file shows the following error:

ERROR protocol.RobotRulesParser - The agent we advertise (nutch-spider-2.2.1) is not listed first in the http.robots.agents property!

The agent name property in my nutch-site.xml:

<property>
<name>http.agent.name</name>
<value>nutch-spider-2.2.1</value>
</property>

      

The same property in my nutch-default.xml:

<property>
<name>http.agent.name</name>
<value></value>
</property>

      

Where is the actual problem? Please explain it clearly. This question has already been posted here, but I am posting it again in case that is necessary.



1 answer


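The error means the robots parser expects the agent name you advertise to be the first entry in "http.robots.agents". Add the property "http.robots.agents", put the value of http.agent.name as the first agent name, and keep the default * at the end of the list: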



<property>
     <name>http.robots.agents</name>
     <value>nutch-spider-2.2.1,*</value>
</property>
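
For context, here is a minimal sketch of how the two properties might sit together in nutch-site.xml. It assumes the usual <configuration> root element and that no other overrides are needed; the agent name is the one from the question. The first entry of http.robots.agents has to match http.agent.name exactly.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Agent name advertised by the crawler (value taken from the question) -->
  <property>
    <name>http.agent.name</name>
    <value>nutch-spider-2.2.1</value>
  </property>

  <!-- Agent names matched against robots.txt, in order of precedence.
       The first entry mirrors http.agent.name; * stays last as the fallback. -->
  <property>
    <name>http.robots.agents</name>
    <value>nutch-spider-2.2.1,*</value>
  </property>
</configuration>
```

Settings in nutch-site.xml override those in nutch-default.xml, so the empty http.agent.name in nutch-default.xml can stay as it is.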

      
