- When a new item is added in MySQL, it must also be indexed by Lucene.
- When an existing item is removed from MySQL, it must also be removed from Lucene's index.
The idea is to write a script that is called every x minutes by a scheduler (e.g. a cron job), so that MySQL and Lucene stay synchronized.
What I have managed so far:
- For each new added item in MySQL, Lucene indexes it too.
- For each already added item in MySQL, Lucene does not reindex it (no duplicated items).
This is the part I need help with:
- For each previously indexed item that has since been removed from MySQL, Lucene should also remove it from the index.
Here is the code I used, which tries to index the MySQL table tag (id [PK] | name):
    public static void main(String[] args) throws Exception {
        Class.forName("com.mysql.jdbc.Driver").newInstance();
        Connection connection = DriverManager.getConnection("jdbc:mysql://localhost/mydb", "root", "");
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_36, analyzer);
        IndexWriter writer = new IndexWriter(FSDirectory.open(INDEX_DIR), config);
        String query = "SELECT id, name FROM tag";
        Statement statement = connection.createStatement();
        ResultSet result = statement.executeQuery(query);
        while (result.next()) {
            Document document = new Document();
            document.add(new Field("id", result.getString("id"), Field.Store.YES, Field.Index.NOT_ANALYZED));
            document.add(new Field("name", result.getString("name"), Field.Store.NO, Field.Index.ANALYZED));
            writer.updateDocument(new Term("id", result.getString("id")), document);
        }
        writer.close();
    }
PS: this code is for testing purposes only, no need to tell me how awful it is :)
EDIT:
One solution could be to delete every previously added document and reindex the whole database:
    writer.deleteAll();
    while (result.next()) {
        Document document = new Document();
        document.add(new Field("id", result.getString("id"), Field.Store.YES, Field.Index.NOT_ANALYZED));
        document.add(new Field("name", result.getString("name"), Field.Store.NO, Field.Index.ANALYZED));
        writer.addDocument(document);
    }
I'm not sure this is the most efficient solution. Is there a better way?
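For comparison, a possible alternative to the full rebuild is to delete only the ids that are in the index but no longer in MySQL. The block below is a minimal sketch of that set difference in plain Java, not a tested Lucene integration. The class name StaleIdFinder, the method findStaleIds and the sample ids are hypothetical; in the real job the two sets would be filled from the "id" field of the index and from the SELECT on the tag table, and each stale id would be passed to writer.deleteDocuments(new Term("id", id)).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: works out which indexed ids no longer exist in the database.
class StaleIdFinder {

    // Returns the ids present in the index but absent from the database, in sorted order.
    // Neither input set is modified.
    static Set<String> findStaleIds(Set<String> indexedIds, Set<String> databaseIds) {
        Set<String> stale = new TreeSet<String>(indexedIds);
        stale.removeAll(databaseIds);
        return stale;
    }

    public static void main(String[] args) {
        // Sample data (assumption): ids currently stored in the Lucene index...
        Set<String> indexedIds = new HashSet<String>(Arrays.asList("1", "2", "3", "4"));
        // ...and ids currently returned by "SELECT id FROM tag".
        Set<String> databaseIds = new HashSet<String>(Arrays.asList("1", "3", "5"));

        for (String id : findStaleIds(indexedIds, databaseIds)) {
            // In the real job this line would be:
            // writer.deleteDocuments(new Term("id", id));
            System.out.println("would delete id " + id);
        }
    }
}
```

With this approach the existing updateDocument loop stays unchanged and only the stale ids are deleted, at the cost of reading every id from both sides on each run. For a small table, deleteAll() followed by a full reindex may be just as good.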