How to crawl images in Nutch? Or is there any other open search engine that produces results with images?
Answer 1:
Change your regex-urlfilter.txt in conf. It contains an exclusion rule like this:
-\.(ico|ICO|css|CSS|sit|SIT|eps|EPS|wmf|WMF|zip|ZIP|ppt|PPT|xls|XLS|gz|GZ|rpm|RPM|tgz|TGZ|exe|EXE|js|JS|gif|GIF|png|PNG|jpg|JPG|jpeg|JPEG|bmp|BMP|mpg|MPG|mov|MOV)$
Delete jpeg, jpg, gif, or whichever other image types you want to crawl from that rule.
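To see what that edit does, here is a rough Python sketch. It is not Nutch code. It assumes the rule starts with an escaped dot (-\.), that a leading - means "reject", and that the pattern is matched anywhere in the URL. The helper names allow_extensions and is_rejected are made up for this illustration.

```python
import re

# Exclusion rule from the answer; the leading "-" is assumed to mean "reject".
ORIGINAL_RULE = (
    r"-\.(ico|ICO|css|CSS|sit|SIT|eps|EPS|wmf|WMF|zip|ZIP|ppt|PPT|xls|XLS"
    r"|gz|GZ|rpm|RPM|tgz|TGZ|exe|EXE|js|JS|gif|GIF|png|PNG|jpg|JPG"
    r"|jpeg|JPEG|bmp|BMP|mpg|MPG|mov|MOV)$"
)

def allow_extensions(rule, extensions):
    """Return the rule with the given extensions (any case) removed."""
    sign, body = rule[0], rule[1:]
    prefix, rest = body.split("(", 1)
    alternatives, suffix = rest.rsplit(")", 1)
    drop = {e.lower() for e in extensions}
    # Skip empty alternatives too, e.g. the one left by a stray "||".
    kept = [a for a in alternatives.split("|") if a and a.lower() not in drop]
    return f"{sign}{prefix}({'|'.join(kept)}){suffix}"

def is_rejected(rule, url):
    """Approximation: a '-' rule rejects a URL when its pattern is found in it."""
    return rule.startswith("-") and re.search(rule[1:], url) is not None

new_rule = allow_extensions(ORIGINAL_RULE, ["jpg", "jpeg", "gif", "png"])
print(new_rule)
print(is_rejected(ORIGINAL_RULE, "http://example.com/photo.jpg"))
print(is_rejected(new_rule, "http://example.com/photo.jpg"))
print(is_rejected(new_rule, "http://example.com/style.css"))
```

With the image extensions removed, the .jpg URL is no longer rejected by this rule, while other excluded types such as .css still are.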
Then change suffix-urlfilter.txt in conf: add # in front of jpeg, gif, or png to comment those suffixes out.
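If you would rather script that step, something like the following Python sketch might do it. It assumes suffix-urlfilter.txt lists one suffix per line (for example .gif) and treats lines starting with # as comments. The sample text and the function name comment_out_suffixes are hypothetical and only for illustration, and the sketch works on a string rather than touching your real file.

```python
def comment_out_suffixes(config_text, suffixes):
    """Prefix '#' to every line whose entry is one of the given suffixes."""
    wanted = {s.lower().lstrip(".") for s in suffixes}
    out = []
    for line in config_text.splitlines():
        # Normalise ".GIF" / "gif" style entries; comment lines never match.
        entry = line.strip().lstrip(".").lower()
        out.append("#" + line if entry in wanted else line)
    return "\n".join(out)

# Hypothetical excerpt, not the real contents of suffix-urlfilter.txt.
SAMPLE = """# hypothetical excerpt
.gif
.jpeg
.png
.zip"""

result = comment_out_suffixes(SAMPLE, ["jpeg", "gif", "png"])
print(result)
```

Entries that are already commented out, and suffixes you did not name (here .zip), are left untouched.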
That worked for me!