What is a good web search and web crawling engine?

Posted 2020-07-21 19:20

Question:

I am working on an application into which I need to integrate a search engine. It should also handle crawling. Please suggest a good Java-based search engine.

Thank you in advance.

Answer 1:

Nutch (built on Lucene) is an open-source engine that should satisfy your needs.
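To make the "crawling plus search" requirement concrete, here is a rough sketch of the two pieces such an engine combines: a crawl frontier that follows links, and an inverted index that answers queries. This is not the Nutch or Lucene API. `MiniSearch` and its methods are hypothetical names, it uses only the JDK, and the `web` map is an assumption that stands in for real HTTP fetching so the example stays self-contained.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy class for illustration; not part of Nutch or Lucene.
class MiniSearch {
    private static final Pattern LINK = Pattern.compile("href=\"([^\"]+)\"");
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Inverted index: term -> URLs of the pages that contain it.
    private final Map<String, Set<String>> index = new HashMap<>();

    // Breadth-first crawl from a seed URL. `web` maps a URL to its HTML and
    // replaces network access in this sketch. Returns pages in visit order.
    List<String> crawl(String seed, Map<String, String> web) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(seed);
        seen.add(seed);
        while (!frontier.isEmpty()) {
            String url = frontier.poll();
            String html = web.get(url);
            if (html == null) {
                continue; // page not available: skip it
            }
            visited.add(url);
            indexPage(url, html);
            Matcher m = LINK.matcher(html);
            while (m.find()) {
                String link = m.group(1);
                if (seen.add(link)) { // enqueue each URL only once
                    frontier.add(link);
                }
            }
        }
        return visited;
    }

    // Strip tags, lowercase, split into terms, and record term -> URL.
    void indexPage(String url, String html) {
        String text = TAG.matcher(html).replaceAll(" ").toLowerCase(Locale.ROOT);
        for (String term : text.split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(url);
            }
        }
    }

    // Return the URLs that contain every query term (boolean AND), sorted.
    Set<String> search(String query) {
        Set<String> result = null;
        for (String term : query.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (term.isEmpty()) {
                continue;
            }
            Set<String> hits = index.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }
}
```

To try it, fill a map with a few URL-to-HTML entries, call `crawl` with one of the URLs as the seed, then call `search` with a query string. A real engine adds a lot on top of this (politeness rules, robots.txt handling, HTML parsing, relevance ranking, on-disk indexes), which is why an existing project is the better choice for production use.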



Answer 2:

In the past I worked with Terrier, a search engine written in Java:

Terrier is a highly flexible, efficient, effective, and robust search engine, readily deployable on large-scale collections of documents. Terrier implements state-of-the-art indexing and retrieval functionalities. Terrier provides an ideal platform for the rapid development of large-scale retrieval applications.



Answer 3:

I've spent the last 2 years developing our own high-performance search engine in C. For Java, I strongly suggest Apache Lucene, as Ajay mentioned above. It's the best option in terms of speed, relevancy, and features.