I'm using the following code to get the text of a PDF file with org.apache.pdfbox:
    File f = new File(fileName);
    if (!f.isFile()) {
        System.out.println("File " + fileName + " does not exist.");
        return null;
    }
    try {
        parser = new PDFParser(new FileInputStream(f));
    } catch (Exception e) {
        System.out.println("Unable to open PDF Parser.");
        return null;
    }
    try {
        parser.parse();
        cosDoc = parser.getDocument();
        pdfStripper = new PDFTextStripper();
        pdDoc = new PDDocument(cosDoc);
        parsedText = pdfStripper.getText(pdDoc);
    } catch (Exception e) {
        e.printStackTrace();
    }
It works great for the PDFs I've used it on so far. Now I have a PDF form with editable text fields in it, and my code does not return the text inside those fields. Is there a way to get that text using PDFBox?
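For illustration, here is a minimal sketch of one possible approach. It is not a confirmed fix for this particular file. Field values in a PDF form are normally stored in the document's AcroForm rather than in the page content that PDFTextStripper reads, so the sketch walks the AcroForm and collects each field's value. It assumes the PDFBox 2.x API (`PDDocument.load`, `getFieldTree`, `getValueAsString`), which is newer than the parser-based API used above. The class name `FormFieldText` and the helper `readFieldValues` are hypothetical names chosen for this example.

```java
import java.io.File;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;
import org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField;

// Assumption: PDFBox 2.x is on the classpath.
class FormFieldText {

    // Hypothetical helper: maps each field's fully qualified name to its current value.
    static Map<String, String> readFieldValues(PDDocument doc) {
        Map<String, String> values = new LinkedHashMap<>();
        PDAcroForm form = doc.getDocumentCatalog().getAcroForm();
        if (form == null) {
            return values; // the document has no interactive form
        }
        // getFieldTree() iterates over all fields, including nested ones.
        for (PDField field : form.getFieldTree()) {
            if (!(field instanceof PDTerminalField)) {
                continue; // skip container fields that only group other fields
            }
            String value = field.getValueAsString();
            if (value != null && !value.isEmpty()) {
                values.put(field.getFullyQualifiedName(), value);
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: FormFieldText <file.pdf>");
            return;
        }
        // try-with-resources closes the document when done.
        try (PDDocument doc = PDDocument.load(new File(args[0]))) {
            readFieldValues(doc).forEach(
                    (name, value) -> System.out.println(name + " = " + value));
        }
    }
}
```

If this works for the form in question, the same helper could be called on the existing `pdDoc` and its values appended to `parsedText`. On PDFBox 3.x, `PDDocument.load` is replaced by `Loader.loadPDF`; the 1.x form API differs and would need adapting.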