I have docx document with some placeholders. Now I should replace them with other content and save new docx document. I started with docx4j and found this method:
public static List<Object> getAllElementFromObject(Object obj, Class<?> toSearch) {
List<Object> result = new ArrayList<Object>();
if (obj instanceof JAXBElement) obj = ((JAXBElement<?>) obj).getValue();
if (obj.getClass().equals(toSearch))
result.add(obj);
else if (obj instanceof ContentAccessor) {
List<?> children = ((ContentAccessor) obj).getContent();
for (Object child : children) {
result.addAll(getAllElementFromObject(child, toSearch));
}
}
return result;
}
public static void findAndReplace(WordprocessingMLPackage doc, String toFind, String replacer){
List<Object> paragraphs = getAllElementFromObject(doc.getMainDocumentPart(), P.class);
for(Object par : paragraphs){
P p = (P) par;
List<Object> texts = getAllElementFromObject(p, Text.class);
for(Object text : texts){
Text t = (Text)text;
if(t.getValue().contains(toFind)){
t.setValue(t.getValue().replace(toFind, replacer));
}
}
}
}
But that only work rarely because usually the placeholders splits across multiple texts runs.
I tried UnmarshallFromTemplate but it work rarely too.
How this problem could be solved?
You can use VariableReplace
to acheive this which may not have existed at the time of the other answers.
This does not do a find/replace per se but works on placeholders eg $(myField)
java.util.HashMap mappings = new java.util.HashMap();
VariablePrepare.prepare(wordMLPackage);//see notes
mappings.put("myField", "foo");
wordMLPackage.getMainDocumentPart().variableReplace(mappings);
Note that you do not pass $(myField)
as the field name; rather pass the unescaped field name myField
- This is rather inflexible in that as it currently stands your placeholders must be of the format $(xyz)
whereas if you could pass in anything then you could use it for any find/replace. The ability to use this also exists for C# people in docx4j.NET
See here for more info on VariableReplace
or here for VariablePrepare
Good day, I made an example how to quickly replace text to something you need
by regexp. I find ${param.sumname} and replace it in document.
Note, you have to insert text as 'text only'!
Have fun!
WordprocessingMLPackage mlp = WordprocessingMLPackage.load(new File("filepath"));
replaceText(mlp.getMainDocumentPart());
static void replaceText(ContentAccessor c)
throws Exception
{
for (Object p: c.getContent())
{
if (p instanceof ContentAccessor)
replaceText((ContentAccessor) p);
else if (p instanceof JAXBElement)
{
Object v = ((JAXBElement) p).getValue();
if (v instanceof ContentAccessor)
replaceText((ContentAccessor) v);
else if (v instanceof org.docx4j.wml.Text)
{
org.docx4j.wml.Text t = (org.docx4j.wml.Text) v;
String text = t.getValue();
if (text != null)
{
t.setSpace("preserve"); // needed?
t.setValue(replaceParams(text));
}
}
}
}
}
static Pattern paramPatern = Pattern.compile("(?i)(\\$\\{([\\w\\.]+)\\})");
static String replaceParams(String text)
{
Matcher m = paramPatern.matcher(text);
if (!m.find())
return text;
StringBuffer sb = new StringBuffer();
String param, replacement;
do
{
param = m.group(2);
if (param != null)
{
replacement = getParamValue(param);
m.appendReplacement(sb, replacement);
}
else
m.appendReplacement(sb, "");
}
while (m.find());
m.appendTail(sb);
return sb.toString();
}
static String getParamValue(String name)
{
// replace from map or something else
return name;
}
This can be a problem. I cover how to mitigate broken-up text runs in this answer here: https://stackoverflow.com/a/17066582/125750
... but you might want to consider content controls instead. The docx4j source site has various content control samples here:
https://github.com/plutext/docx4j/tree/master/src/samples/docx4j/org/docx4j/samples