I want to build a highly scalable application that uses Lucene as its search engine library. While browsing the docs and FAQs, I learned that Lucene allows only one IndexWriter to be open on a given index location at a time, which it enforces by creating a write.lock file in the index directory. Multiple IndexReaders can be open on the same index.
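To illustrate the single-writer behaviour I mean, here is a minimal sketch in plain Java. It is not Lucene's actual code; the class WriteLockSketch and its methods are hypothetical names I made up, and it only assumes that a lock file in the index directory is guarded by an OS-level file lock.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of a write.lock style guard; not Lucene's implementation.
class WriteLockSketch {

    // Tries to become the single writer for indexDir.
    // Returns the held lock, or null if a writer already holds it.
    static FileLock tryAcquire(Path indexDir) throws IOException {
        Files.createDirectories(indexDir);
        FileChannel channel = FileChannel.open(
                indexDir.resolve("write.lock"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        try {
            FileLock lock = channel.tryLock(); // null if another process holds the lock
            if (lock == null) {
                channel.close();
            }
            return lock;
        } catch (OverlappingFileLockException e) {
            // The lock is already held inside this JVM.
            channel.close();
            return null;
        }
    }

    // Gives up the writer role so that another writer can acquire it.
    static void release(FileLock lock) throws IOException {
        lock.release();
        lock.channel().close();
    }
}
```

With this sketch, a second tryAcquire on the same directory returns null until the first lock is released, which is the restriction that stops several indexers from writing to one index location.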
I am interested in building an architecture in which a number of indexers run on different machines/servers and multiple searchers answer various types of queries against the indexes those indexers create. The searchers and indexers will all run on different computers.
In such a scenario, it would be preferable to have multiple indexers write to the same index storage location. How can I achieve this? Should I go with something like NFS (Network File System)? Has this issue already been taken care of by Solr or some other framework built on top of Lucene?
One obvious solution that comes to mind is to create one index per indexer and have the searchers query across multiple index directories. But this leads to as many index directories as there are indexer servers, which I guess isn't desirable. What I want is (# of index dirs) << (# of indexers) < (# of searchers).
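For concreteness, the one-index-per-indexer idea would have each searcher query every index directory and then merge the results. Below is a rough sketch of that merge step in plain Java, with no Lucene calls. The names MultiIndexSearchSketch, Hit, and mergeTopK are hypothetical, and the sketch assumes that scores coming from different indexes can be compared directly, which may not hold in practice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of fan-out search over several index directories.
class MultiIndexSearchSketch {

    // One scored hit; indexDir records which per-indexer index produced it.
    static final class Hit {
        final String indexDir;
        final String docId;
        final double score;

        Hit(String indexDir, String docId, double score) {
            this.indexDir = indexDir;
            this.docId = docId;
            this.score = score;
        }
    }

    // Merges the per-index result lists into one global top-k list, best score first.
    static List<Hit> mergeTopK(List<List<Hit>> perIndexHits, int k) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perIndexHits) {
            all.addAll(hits);
        }
        // Sort by descending score.
        all.sort((a, b) -> Double.compare(b.score, a.score));
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }
}
```

The cost I am worried about is visible here: every query has to touch one result list per index directory, so the work per search grows with the number of indexers.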
What alternatives do I have in this case?