Java, Lucene : Set lock timeout for IndexWriter in

Posted 2019-07-03 22:11

I am integrating Lucene into our Spring MVC application. It works, but occasionally we get a "cannot obtain lock" error, after which I have to delete the lock file manually before indexing works again.

How can I set a timeout for locking the index in Java? I have no XML configuration for Lucene; I added the library to the project through Maven's pom.xml and instantiate the required classes directly.

Code :

public void saveIndexes(String text, String tagFileName, String filePath, long groupId, boolean type, int objectId) {
        try {
            // path is the indexing directory. 
            File testDir;
            Path suggestionsPath;
            Directory suggestionsDir;

            Path phraseSuggestPath;
            Directory phraseSuggestDir;

            Directory directory = org.apache.lucene.store.FSDirectory.open(path);
            IndexWriterConfig config = new IndexWriterConfig(new SimpleAnalyzer());
            IndexWriter indexWriter = new IndexWriter(directory, config);

            org.apache.lucene.document.Document doc = new org.apache.lucene.document.Document();
            if (filePath != null) {
                File file = new File(filePath); // current directory
                doc.add(new TextField("path", file.getPath(), Field.Store.YES));
            }
            doc.add(new StringField("id", String.valueOf(objectId), Field.Store.YES));
            //  doc.add(new TextField("id",String.valueOf(objectId),Field.Store.YES));
            if (text == null) {
                if (filePath != null) {
                    FileInputStream is = new FileInputStream(filePath);
                    BufferedReader reader = new BufferedReader(new InputStreamReader(is));
                    StringBuilder stringBuffer = new StringBuilder();
                    String line;
                    while ((line = reader.readLine()) != null) {
                        stringBuffer.append(line).append("\n");
                    }
                    stringBuffer.append("\n").append(tagFileName);
                    reader.close();
                    doc.add(new TextField("contents", stringBuffer.toString(), Field.Store.YES));
                }
            } else {

                FieldType fieldType = new FieldType(TextField.TYPE_STORED);
                fieldType.setTokenized(false);
                doc.add(new Field("contents", text+"\n"+tagFileName, fieldType));
            }
            indexWriter.addDocument(doc);
            indexWriter.commit();
            indexWriter.flush();
            indexWriter.close();
            directory.close();

            StandardAnalyzer analyzer = new StandardAnalyzer();
            AnalyzingInfixSuggester wordSuggester = new AnalyzingInfixSuggester(suggestionsDir, analyzer);

            ArrayList<String> words = new ArrayList<>();
            if (text != null) {
                text = html2text(text);
                Pattern pt = Pattern.compile("[^\\w\\s]");
                Matcher match = pt.matcher(text);
                while (match.find()) {
                    String s = match.group();
                    text = text.replaceAll("\\" + s, "");
                }

                if (text.contains(" ")) {
                    Collections.addAll(words, text.split(" "));

                } else {
                    words.add(text);
                }
                SuggestionIterator suggestionIterator = new SuggestionIterator(words.iterator());
                wordSuggester.build(suggestionIterator);
                wordSuggester.close();
                suggestionsDir.close();
            }

            AnalyzingInfixSuggester phraseSuggester = new AnalyzingInfixSuggester(phraseSuggestDir, analyzer);
            if (text != null) {
                text = html2text(text);
                ArrayList<String> phrases = new ArrayList<>();
                phrases.add(text);
                SuggestionIterator suggestionIterator = new SuggestionIterator(phrases.iterator());
                phraseSuggester.build(suggestionIterator);
                phraseSuggester.close();
                phraseSuggestDir.close();
            }

        } catch (Exception ignored) {
        }
    }

Thank you.

Tags: java lucene
1 answer
Emotional °昔
#2 · 2019-07-03 22:32

Two passages from the IndexWriter documentation are relevant:

Opening an IndexWriter creates a lock file for the directory in use. Trying to open another IndexWriter on the same directory will lead to a LockObtainFailedException.

and

NOTE: IndexWriter instances are completely thread safe, meaning multiple threads can call any of its methods, concurrently. If your application requires external synchronization, you should not synchronize on the IndexWriter instance as this may cause deadlock; use your own (non-Lucene) objects instead.

So you cannot open a second IndexWriter on a directory while another one is still open somewhere else and has not been closed. In your case the error most likely comes from unlucky timing, when two users are inside the same code block at once.

You can address this in two ways:

1. Designate a critical section: Treat the code that opens, uses, and closes the writer as a critical section and guard it with Java synchronization. Synchronize on an application-wide singleton bean, not on the writer itself. When another user reaches that block, they wait until the first user is done and the lock is released.

2. Single writer instance: Open the writer once for the lifetime of the application, close it at shutdown, and pass that single instance to your service code. Any number of users can then call the writer's methods concurrently, because Lucene makes IndexWriter thread-safe. I think this can be achieved with a singleton Spring bean injected into your service.
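
The first approach can be sketched with the JDK alone. This is only a minimal sketch, not Lucene API: IndexLockGuard and runExclusive are hypothetical names, and the lock is an application-level ReentrantLock, so it serialises threads inside one JVM only. A request waits a bounded time for its turn and then fails with an exception instead of racing for Lucene's write lock.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper (not part of Lucene): serialises the open/use/close
// sequence of the writer across threads in one JVM.
final class IndexLockGuard {
    // Fair lock so waiting requests are served roughly in arrival order.
    private final ReentrantLock lock = new ReentrantLock(true);

    // Runs the action while holding the lock, waiting at most the given time.
    <T> T runExclusive(long timeout, TimeUnit unit, Callable<T> action) throws Exception {
        if (!lock.tryLock(timeout, unit)) {
            throw new TimeoutException("index is busy, gave up after " + timeout + " " + unit);
        }
        try {
            // In the real service this is where the IndexWriter would be
            // opened, used, and closed.
            return action.call();
        } finally {
            lock.unlock();
        }
    }
}
```

Registered as a singleton bean, one guard instance would be shared by every caller of saveIndexes. The body passed to runExclusive should still close the writer in a finally block or with try-with-resources, so that a failed request does not leave the index locked.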

The drawback of the second approach shows up in multi-server deployments that share a single global index directory, or when other applications also try to open writers on that index. This can be solved by wrapping the writer creation code in some kind of global service that keeps returning the same instance to whichever application asks for it.

This is not a problem to solve by deleting lock files or by introducing timeouts. Model your design on the IndexWriter documentation, not the other way round.

Having a single writer instance will bring some performance improvements too.
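
A minimal sketch of the single-instance idea, again using only the JDK. SharedWriterHolder is a hypothetical name, and the factory passed to it stands in for the code that would open the IndexWriter; in a Spring application a singleton bean with a destroy callback could play the same role.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical holder (not part of Lucene): opens one shared resource lazily
// and hands the same instance to every caller until close() is called.
final class SharedWriterHolder<W extends Closeable> implements Closeable {
    private final Callable<W> factory; // e.g. code that opens the IndexWriter
    private W writer;                  // guarded by "this"

    SharedWriterHolder(Callable<W> factory) {
        this.factory = factory;
    }

    // Returns the shared instance, opening it on first use.
    synchronized W get() throws Exception {
        if (writer == null) {
            writer = factory.call();
        }
        return writer;
    }

    // Call once at application shutdown so the underlying lock is released.
    @Override
    public synchronized void close() throws IOException {
        if (writer != null) {
            writer.close();
            writer = null;
        }
    }
}
```

The holder synchronizes on itself rather than on the writer, in line with the documentation's advice not to synchronize on the IndexWriter instance.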

Also, make it a habit to do an empty commit right after creating the writer. This has helped me solve some issues in the past.
