
Nutch crawl efficiency: setting the crawl delay limit

Date: 2014-09-05 12:37:31


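fetcher.max.crawl.delay defaults to 30 seconds; here it is lowered to 5 seconds, so the fetcher skips any page whose robots.txt Crawl-Delay is longer than 5 seconds instead of waiting for it.
Edit nutch-default.xml: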
<property>
  <name>fetcher.max.crawl.delay</name>
  <value>5</value>
  <description>
    If the Crawl-Delay in robots.txt is set to greater than this value (in
    seconds) then the fetcher will skip this page, generating an error report.
    If set to -1 the fetcher will never skip such pages and will wait the
    amount of time retrieved from robots.txt Crawl-Delay, however long that
    might be.
  </description>
</property>
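
The rule in the property description can be sketched as a small decision function. This is only an illustration of the documented behavior, not Nutch's actual code: the function name, the return values, and the default_delay_s fallback (used when robots.txt gives no Crawl-Delay) are assumptions made for this example.

```python
from typing import Optional, Tuple


def crawl_delay_decision(
    robots_delay_s: Optional[float],
    max_crawl_delay_s: float = 30.0,
    default_delay_s: float = 5.0,  # hypothetical fallback, not taken from the post
) -> Tuple[str, Optional[float]]:
    """Return ("fetch", seconds_to_wait) or ("skip", None)."""
    # robots.txt gives no Crawl-Delay: nothing to compare, use the fallback delay.
    if robots_delay_s is None:
        return ("fetch", default_delay_s)
    # -1 disables the limit: always wait as long as robots.txt asks.
    # (Treating every negative value like -1 is an assumption of this sketch.)
    if max_crawl_delay_s < 0:
        return ("fetch", robots_delay_s)
    # Crawl-Delay is greater than the limit: skip the page.
    if robots_delay_s > max_crawl_delay_s:
        return ("skip", None)
    # Crawl-Delay is within the limit: honor it.
    return ("fetch", robots_delay_s)


if __name__ == "__main__":
    # A host asking for a 10 second Crawl-Delay:
    print(crawl_delay_decision(10, max_crawl_delay_s=30))  # default limit
    print(crawl_delay_decision(10, max_crawl_delay_s=5))   # limit lowered to 5
```

With the limit lowered to 5, pages on a host that asks for a 10 second Crawl-Delay are skipped rather than fetched slowly. That speeds up the crawl, at the cost of not fetching those pages at all.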

 


Original article: http://www.cnblogs.com/i80386/p/3957662.html
