Nutch 2.2.1 doesn't continue after Injector job

Published 2019-01-29 00:57

Question:

I am learning Nutch and trying to crawl as per this tutorial. I am working on an Ubuntu machine with a bash shell. When I run the script, it starts, but nothing happens after this output:

InjectorJob: starting at 2014-03-23 09:28:50
InjectorJob: Injecting urlDir: urls/seed.txt

I have waited for hours, and I tried running the same script with sudo, but the same issue occurs. I have also tried the default URLs given in the tutorial. What could be the probable cause?

Answer 1:

What was missing was the proxy host and port details in nutch-site.xml, as I was accessing the network through a proxy. Setting the proxy for Ant or the JVM alone is not enough.
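
For reference, a minimal sketch of what such proxy settings might look like in conf/nutch-site.xml. The property names http.proxy.host and http.proxy.port are the ones listed in Nutch's nutch-default.xml; the host proxy.example.com and port 8080 are placeholders I am assuming for illustration, so substitute your own proxy's values.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Placeholder host: replace with your actual proxy hostname -->
  <property>
    <name>http.proxy.host</name>
    <value>proxy.example.com</value>
  </property>
  <!-- Placeholder port: replace with your actual proxy port -->
  <property>
    <name>http.proxy.port</name>
    <value>8080</value>
  </property>
</configuration>
```

If nutch-site.xml already contains other properties, only the two property elements need to be added inside the existing configuration element.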