I am integrating Lucene into our Spring MVC application. It works most of the time, but occasionally we get a "cannot obtain lock" error. When that happens, I have to delete the lock file manually, after which everything works normally again.
How can I set a timeout for locking the index in Java? I don't have any XML configuration for Lucene; I added the library to the project through Maven's pom.xml and instantiate the required classes directly.
Code:
public void saveIndexes(String text, String tagFileName, String filePath, long groupId, boolean type, int objectId) {
    try {
        // path is the indexing directory.
        File testDir;
        Path suggestionsPath;
        Directory suggestionsDir;
        Path phraseSuggestPath;
        Directory phraseSuggestDir;
        Directory directory = org.apache.lucene.store.FSDirectory.open(path);
        IndexWriterConfig config = new IndexWriterConfig(new SimpleAnalyzer());
        IndexWriter indexWriter = new IndexWriter(directory, config);
        org.apache.lucene.document.Document doc = new org.apache.lucene.document.Document();
        if (filePath != null) {
            File file = new File(filePath); // current directory
            doc.add(new TextField("path", file.getPath(), Field.Store.YES));
        }
        doc.add(new StringField("id", String.valueOf(objectId), Field.Store.YES));
        // doc.add(new TextField("id",String.valueOf(objectId),Field.Store.YES));
        if (text == null) {
            if (filePath != null) {
                FileInputStream is = new FileInputStream(filePath);
                BufferedReader reader = new BufferedReader(new InputStreamReader(is));
                StringBuilder stringBuffer = new StringBuilder();
                String line;
                while ((line = reader.readLine()) != null) {
                    stringBuffer.append(line).append("\n");
                }
                stringBuffer.append("\n").append(tagFileName);
                reader.close();
                doc.add(new TextField("contents", stringBuffer.toString(), Field.Store.YES));
            }
        } else {
            FieldType fieldType = new FieldType(TextField.TYPE_STORED);
            fieldType.setTokenized(false);
            doc.add(new Field("contents", text+"\n"+tagFileName, fieldType));
        }
        indexWriter.addDocument(doc);
        indexWriter.commit();
        indexWriter.flush();
        indexWriter.close();
        directory.close();
        StandardAnalyzer analyzer = new StandardAnalyzer();
        AnalyzingInfixSuggester wordSuggester = new AnalyzingInfixSuggester(suggestionsDir, analyzer);
        ArrayList<String> words = new ArrayList<>();
        if (text != null) {
            text = html2text(text);
            Pattern pt = Pattern.compile("[^\\w\\s]");
            Matcher match = pt.matcher(text);
            while (match.find()) {
                String s = match.group();
                text = text.replaceAll("\\" + s, "");
            }
            if (text.contains(" ")) {
                Collections.addAll(words, text.split(" "));
            } else {
                words.add(text);
            }
            SuggestionIterator suggestionIterator = new SuggestionIterator(words.iterator());
            wordSuggester.build(suggestionIterator);
            wordSuggester.close();
            suggestionsDir.close();
        }
        AnalyzingInfixSuggester phraseSuggester = new AnalyzingInfixSuggester(phraseSuggestDir, analyzer);
        if (text != null) {
            text = html2text(text);
            ArrayList<String> phrases = new ArrayList<>();
            phrases.add(text);
            SuggestionIterator suggestionIterator = new SuggestionIterator(phrases.iterator());
            phraseSuggester.build(suggestionIterator);
            phraseSuggester.close();
            phraseSuggestDir.close();
        }
    } catch (Exception ignored) {
    }
}
Thank you.
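The IndexWriter documentation makes two points that matter here. First, opening an IndexWriter creates a lock file for the directory in use, and trying to open another IndexWriter on the same directory leads to a LockObtainFailedException. Second, IndexWriter instances are completely thread safe, so multiple threads can call any of their methods concurrently.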
So you cannot open a second IndexWriter on the index while another one is still open and has not been closed. In your case, the error comes from unlucky timing: two users' requests are inside the same code block at the same time.
You can address this issue in two ways:
1. Designate a critical section: Treat the code that opens, uses, and closes the writer as a critical section and apply Java synchronization to it, using an application-wide singleton bean as the lock object. When another user hits that block, their request waits until the first one is done and the lock is released.
2. Single writer instance: Open the writer once for the lifetime of the application, close it once at shutdown, and pass that single instance to your service code. Any number of users can then call the writer's methods concurrently, because IndexWriter is thread-safe. I believe this can be achieved with a singleton Spring bean injected into your service.
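The first option can be sketched roughly as follows. This is a minimal illustration under assumptions, not the poster's code: `IndexWriteGuard` and `runExclusively` are hypothetical names, and the guard only helps if every code path that opens a writer on the directory goes through the same instance, for example a singleton bean.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical helper: serializes every open-use-close cycle on the index
 * behind one lock. Works only within a single JVM, and only if all callers
 * share the same IndexWriteGuard instance.
 */
final class IndexWriteGuard {
    // Fair lock, so waiting requests proceed roughly in arrival order.
    private final ReentrantLock lock = new ReentrantLock(true);

    /** Runs the given work while holding the lock and returns its result. */
    <T> T runExclusively(Callable<T> indexingWork) throws Exception {
        lock.lock();
        try {
            // The caller opens the writer, adds documents, commits and
            // closes the writer inside this callable.
            return indexingWork.call();
        } finally {
            // Always release the lock, even if indexing throws.
            lock.unlock();
        }
    }
}
```

In `saveIndexes`, everything from `FSDirectory.open(path)` through `directory.close()` would move into the callable passed to `runExclusively`. It is also worth closing the writer in a `finally` block or with try-with-resources inside that callable: if an exception is thrown between opening and closing, the writer is never closed and keeps holding the index lock.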
The drawback of the second approach shows up in multi-server deployments that share a single global index directory, or when other applications also try to open writers on that index. This can be solved by wrapping the writer creation code in some kind of global service that returns the same instance to whichever application asks for it.
This is not a simple issue that you can solve by deleting lock files or introducing timeouts. You have to model your design on the IndexWriter documentation, not the other way round.
Having a single writer instance will bring some performance improvements too.
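One possible shape for such a shared writer is sketched below. `SharedWriterHolder` is a hypothetical class, written generically over `AutoCloseable` so that the sketch compiles without Lucene on the classpath; in the application the type parameter would be `IndexWriter` and the opener would create it from the index directory.

```java
import java.util.concurrent.Callable;

/**
 * Hypothetical holder that opens a writer lazily, hands the same instance to
 * every caller, and closes it once at shutdown.
 */
final class SharedWriterHolder<W extends AutoCloseable> implements AutoCloseable {
    private final Callable<W> opener;
    private W writer; // guarded by "this"

    SharedWriterHolder(Callable<W> opener) {
        this.opener = opener;
    }

    /** Returns the shared writer, opening it on first use. */
    synchronized W get() throws Exception {
        if (writer == null) {
            writer = opener.call();
        }
        return writer;
    }

    /** Closes the writer if it was opened; meant to be called when the application stops. */
    @Override
    public synchronized void close() throws Exception {
        if (writer != null) {
            writer.close();
            writer = null;
        }
    }
}
```

With Lucene, the holder might be created as `new SharedWriterHolder<IndexWriter>(() -> new IndexWriter(FSDirectory.open(path), new IndexWriterConfig(new SimpleAnalyzer())))` and registered as a singleton bean. Request handlers would then call `get().addDocument(doc)` followed by `commit()`, and never close the writer themselves.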
Also, make it a practice to do an empty commit right after creating the writer. This has helped me solve some issues in the past.