This question is related to another SO question of mine.
To keep the IndexWriter open for the duration of a partitioned step, I thought of adding the IndexWriter to the partitioner's ExecutionContext and then closing it in the afterStep(StepExecution stepExecution) method of a StepExecutionListenerSupport.
The challenge I am facing with this approach is that ExecutionContext needs its objects to be serializable.
In light of these two questions, Q1 and Q2, this doesn't seem feasible: I can't add a no-arg constructor to my custom writer, because IndexWriter itself doesn't have a no-arg constructor.
public class CustomIndexWriter extends IndexWriter implements Serializable {
    /*
    private Directory d;
    private IndexWriterConfig conf;
    public CustomIndexWriter(){
        super();
        super(this.d, this.conf);
    }
    */
    public CustomIndexWriter(Directory d, IndexWriterConfig conf) throws IOException {
        super(d, conf);
    }
    /**
     *
     */
    private static final long serialVersionUID = 1L;
    private void readObject(ObjectInputStream input) throws IOException, ClassNotFoundException{
        input.defaultReadObject();
    }
    private void writeObject(ObjectOutputStream output) throws IOException, ClassNotFoundException {
        output.defaultWriteObject();
    }
}
In the code above, I can't add the constructor shown in the comment, because the superclass has no no-arg constructor and the fields of this can't be accessed before the super call.
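The underlying rule can be reproduced with the JDK alone. The sketch below is only an illustration: Resource and SerializableResource are hypothetical stand-ins (not Lucene classes) for a non-serializable class without a no-arg constructor and a Serializable subclass of it. Java serialization requires the nearest non-serializable superclass to have an accessible no-arg constructor, so writing the object succeeds but reading it back fails.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class NoArgSuperclassDemo {

    // Hypothetical stand-in: not serializable and no no-arg constructor
    static class Resource {
        final String name;
        Resource(String name) { this.name = name; }
    }

    // Serializable subclass; its nearest non-serializable superclass is Resource
    static class SerializableResource extends Resource implements Serializable {
        private static final long serialVersionUID = 1L;
        SerializableResource(String name) { super(name); }
    }

    // Serializes obj and reads it back; returns "ok" or the simple name of the exception thrown
    static String roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
            }
            return "ok";
        } catch (IOException | ClassNotFoundException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Deserialization cannot construct the Resource part of the object
        System.out.println(roundTrip(new SerializableResource("index"))); // InvalidClassException
    }
}
```

Adding a no-arg constructor to the subclass alone would not change this result in the sketch, since the check is made against the non-serializable superclass.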
Is there a way to achieve this?
You can always add a parameter-less constructor.
E.g.:
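public class CustomWriter extends IndexWriter implements Serializable {
    private static final long serialVersionUID = 1L;
    // Directory and IndexWriterConfig are not Serializable, hence transient
    private transient Directory lDirectory;
    private transient IndexWriterConfig iwConfig;

    public CustomWriter() throws IOException {
        // Delegate to the other constructor with default values.
        // A constructor may call this(...) or super(...), not both.
        // FSDirectory.open(Path) and the no-arg IndexWriterConfig assume Lucene 5+.
        this(FSDirectory.open(Paths.get(".")), new IndexWriterConfig());
    }

    public CustomWriter(Directory dir, IndexWriterConfig iwConf) throws IOException {
        // IndexWriter has no no-arg constructor, so the arguments must be passed up
        super(dir, iwConf);
        lDirectory = dir;
        iwConfig = iwConf;
    }

    public Directory getDirectory() { return lDirectory; }
    public IndexWriterConfig getConfig() { return iwConfig; }
    public void setDirectory(Directory dir) { lDirectory = dir; }
    public void setConfig(IndexWriterConfig conf) { iwConfig = conf; }
    // ...
}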
EDIT:
Having taken a look at my own code (using Lucene.Net), the IndexWriter needs an analyzer, and a MaxFieldLength.
So the super-call would look something like this:
super(new Directory("." + System.getProperty("path.separator")), new StandardAnalyzer(), MaxFieldLength.UNLIMITED);
So adding these values as defaults should fix the issue. Maybe then add getter- and setter-methods for the analyzer and MaxFieldLength, so you have control over that at a later stage.
I am not sure how, but the following approach works in Spring Batch, and ExecutionContext returns a non-null object in StepExecutionListenerSupport.
public class CustomIndexWriter implements Serializable {
    private static final long serialVersionUID = 1L;
    private transient IndexWriter luceneIndexWriter;
    public CustomIndexWriter(IndexWriter luceneIndexWriter) {
        this.luceneIndexWriter=luceneIndexWriter;
    }
    public IndexWriter getLuceneIndexWriter() {
        return luceneIndexWriter;
    }
    public void setLuceneIndexWriter(IndexWriter luceneIndexWriter) {
        this.luceneIndexWriter = luceneIndexWriter;
    }
}
I put an instance of CustomIndexWriter into the ExecutionContext in the step partitioner. The chunk of the partitioned step obtains the writer by calling getLuceneIndexWriter(), and then I close this writer in StepExecutionListenerSupport.
This way, my Spring Batch partitioned step works with a single instance of the Lucene IndexWriter.
I was expecting a NullPointerException when performing operations on the writer obtained from getLuceneIndexWriter(), but that doesn't happen (despite the field being transient). I am not sure why this works, but it does.
For Spring Batch job metadata, I am using the in-memory repository and not a DB-based repository. I am not sure whether this will continue to work once I start using a DB for metadata.
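A small JDK-only sketch of what transient does may help here. It is not a confirmed account of Spring Batch internals: Resource and Holder are hypothetical stand-ins for IndexWriter and the wrapper above, and a plain HashMap stands in for the context. The assumption is that the step and the listener see the same live wrapper object, in which case the transient field is never cleared. It becomes null only on a copy that has gone through Java serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class TransientHolderDemo {

    // Hypothetical stand-in for a non-serializable resource such as an index writer
    static class Resource {
        final String name;
        Resource(String name) { this.name = name; }
    }

    // Serializable wrapper whose resource field is transient
    static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient Resource resource;
        Holder(Resource resource) { this.resource = resource; }
        Resource getResource() { return resource; }
    }

    // Returns a copy of the holder produced by a Java serialization round trip
    static Holder serializedCopy(Holder original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Holder) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Holder live = new Holder(new Resource("index"));

        // A plain map (stand-in for the context) hands back the same reference
        Map<String, Object> context = new HashMap<>();
        context.put("writer", live);
        Holder fromContext = (Holder) context.get("writer");
        System.out.println(fromContext.getResource() != null);          // true

        // After a serialization round trip the transient field is gone
        System.out.println(serializedCopy(live).getResource() == null); // true
    }
}
```

If a persistent repository ever hands back a deserialized copy of the context (for example on a restart), the wrapper would arrive with a null writer, so a null check before use seems prudent.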