I am working on a Java project that needs to read a PDF file.
I know this is possible with external libraries such as iText.
But is it possible to read a PDF file using Java's built-in features, without any external library?
Not with the standard library alone, as far as I know: the JDK does not ship a PDF parser, so you will need a library. For reading PDF files from Java, have a look at Apache PDFBox. PDFBox is an open-source Java library that allows creation of new PDF documents, manipulation of existing documents, and extraction of content from documents. Apache PDFBox also includes several command-line utilities.
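To show what the standard library gives you on its own, here is a minimal sketch that opens a PDF as raw bytes and reads the `%PDF-x.y` header that a PDF file starts with. The class name `PdfHeaderCheck` and the method `readPdfVersion` are made up for this illustration, and the sketch assumes the header sits at the very first byte of the file and that Java 9 or later is available (for `readNBytes`). It does not extract any text: page content lives in (usually compressed) streams, which is the part a library handles for you.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class PdfHeaderCheck {

    // Hypothetical helper: returns the version from a "%PDF-x.y" header,
    // or an empty Optional if the file does not start with such a header.
    static Optional<String> readPdfVersion(Path path) throws IOException {
        byte[] buf = new byte[8]; // "%PDF-" plus a three-character version such as "1.7"
        int n;
        try (InputStream in = Files.newInputStream(path)) {
            n = in.readNBytes(buf, 0, buf.length); // Java 9+
        }
        String header = new String(buf, 0, n, StandardCharsets.ISO_8859_1);
        if (n == buf.length && header.startsWith("%PDF-")) {
            return Optional.of(header.substring(5));
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PdfHeaderCheck <file.pdf>");
            return;
        }
        Optional<String> version = readPdfVersion(Paths.get(args[0]));
        System.out.println(version
                .map(v -> "PDF version " + v)
                .orElse("No %PDF- header found at the start of the file"));
    }
}
```

Anything beyond this (following the cross-reference table, decompressing streams, decoding fonts) means implementing a large part of the PDF format yourself, which is why a library is the practical route.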
You can extract the text of a PDF file with Apache PDFBox. In a Maven project, add this dependency to pom.xml:
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.8</version>
</dependency>
The code:
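import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

try {
    // Liferay-specific lookup of the stored file; for a plain file on disk, use new File(path) instead
    DLFileEntry fileEntry = DLFileEntryLocalServiceUtil.getFileEntry(folder.getGroupId(), folder.getFolderId(), fileName);
    File file = DLFileEntryLocalServiceUtil.getFile(themeDisplay.getUserId(), fileEntry.getFileEntryId(), fileEntry.getVersion(), true);
    // PDDocument is Closeable, so try-with-resources releases it once the text has been extracted
    try (PDDocument pddDocument = PDDocument.load(file)) {
        PDFTextStripper textStripper = new PDFTextStripper();
        // text now holds the plain-text content of all pages
        String text = textStripper.getText(pddDocument);
    }
} catch (Exception e) {
    // Covers both the Liferay lookup and PDFBox I/O failures
    e.printStackTrace();
}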
To read/create a PDF, see the documentation:
https://pdfbox.apache.org/