INJECT step keeps retrieving only single URL - trying to crawl CNN. I'm with default config (below is the nutch-site) - what could that be - shouldn't it be 10 docs according to my value?
<configuration>
<property>
<name>http.agent.name</name>
<value>crawler1</value>
</property>
<property>
<name>storage.data.store.class</name>
<value>org.apache.gora.hbase.store.HBaseStore</value>
<description>Default class for storing data</description>
</property>
<property>
<name>solr.server.url</name>
<value>http://x.x.x.x:8983/solr/collection1</value>
</property>
<property>
<name>plugin.includes</name>
<value>protocol-httpclient|urlfilter-regex|index-(basic|more)|query-(basic|site|url|lang)|indexer-solr|nutch-extensionpoints|protocol-httpclient|urlfilter-reg
ex|parse-(text|html|msexcel|msword|mspowerpoint|pdf)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)protocol-http|urlfilter-regex|parse-(html|tika|m
etatags)|index-(basic|anchor|more|metadata)</value>
</property>
<property>
<name>db.ignore.external.links</name>
<value>true</value>
</property>
<property>
<name>generate.max.count</name>
<value>10</value>
</property>
</configuration>