The ruby folks have Ferret. Someone know of any similar initiative for Python? We're using PyLucene at current, but I'd like to investigate moving to pure Python searching.
问题:
回答1:
Whoosh is a new project which is similar to lucene, but is pure python.
回答2:
The only one pure-python (not involving even C extension) search solution I know of is Nucular. It's slow (much slower than PyLucene) and unstable yet.
We moved from PyLucene-based home baked search and indexing to Solr but YMMV.
回答3:
I recently found pyndexter. It provides abstract interface to various different backend full-text search engines/indexers. And it ships with a default pure-python implementation.
These things can be disastrously slow though in Python.
回答4:
For some applications pure Python is overrated. Take a look at Xapian.
回答5:
lupy was a lucene port to pure python.The lupy people suggest that you use PyLucene. Sorry. Maybe you can use the Java sources in combination with Jython.
回答6:
+1 to the Xapian and Pyndexter answers.
Ferret is actually written in C with Ruby bindings on top. A pure Ruby search engine would be even slower than a pure Python one. I would love to see "someone else" write a Cython/Pyrex layer for Python interface to Ferret, but won't do it myself because why bother when there are Python bindings for Xapian.
回答7:
For non-pure Python, Sphinx Search with Python API works the fastest. From the benchmarks from multiple blogs, Sphinx Search is way faster than Lucene, uses way less memory and it is in C.
I am developing a multi-document search engine based on it, using python and web2py as framework.
回答8:
After weeks of searching for this, I found a nice Python solution: repoze.catalog. It's not strictly Python-only because it uses ZODB for storage, but it seems a better dependency to me than something like SOLR.