Question:
I want to extract vector graphics (lines and points) from a PDF with PDF Clown. I have tried to wrap my head around the graphics sample, but I cannot figure out how the object model works for this. Can anyone explain the relationships?
Answer 1:
You are right: up to and including the PDF Clown 0.1 series, high-level path modelling is not implemented (it would have been derived from ContentScanner.GraphicsWrapper).
The next release (0.2 series, due next month) will support a high-level representation of all graphics contents, including path objects (PathElement), through the new ContentModeller. Here is an example:
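import org.pdfclown.documents.contents.elements.ContentModeller;
import org.pdfclown.documents.contents.elements.GraphicsElement;
import org.pdfclown.documents.contents.elements.PathElement;
import org.pdfclown.documents.contents.objects.ContentMarker;
import org.pdfclown.documents.contents.objects.Path;
import java.awt.geom.GeneralPath;
import java.util.List;

// 'page' is the page whose contents are being inspected.
// Model only the path objects found on the page.
for(GraphicsElement<?> element : ContentModeller.model(page, Path.class))
{
  PathElement pathElement = (PathElement)element;
  // Marked-content markers associated with this path.
  List<ContentMarker> markers = pathElement.getMarkers();
  // Bounding box of the path.
  pathElement.getBox();
  // Geometry of the path as a Java 2D shape.
  GeneralPath path = pathElement.getPath();
  // Painting flags: whether the path is filled and/or stroked.
  pathElement.isFilled();
  pathElement.isStroked();
}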
In the meantime, you can extract the low-level representation of the vector graphics by iterating over the content stream with ContentScanner, as shown in ContentScanningSample (available in the downloadable distribution), and looking for path-related operations (BeginSubpath, DrawLine, DrawRectangle, DrawCurve, ...).
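Once the path-related operations have been collected, they still have to be turned into geometry. The following is only a minimal sketch of that step, not PDF Clown code: PathOp and PathOpsSketch are hypothetical stand-ins invented for illustration, and it assumes that each scanned operation has already been reduced to its PDF operator (m, l, re, c, h, which presumably correspond to BeginSubpath, DrawLine, DrawRectangle, DrawCurve and the close-subpath operation) plus its numeric operands.

```java
import java.awt.geom.GeneralPath;
import java.util.Arrays;
import java.util.List;

class PathOpsSketch {
    /** Hypothetical stand-in for one scanned operation: PDF operator plus numeric operands. */
    static final class PathOp {
        final String operator;
        final double[] operands;

        PathOp(String operator, double... operands) {
            this.operator = operator;
            this.operands = operands;
        }
    }

    /** Replays path-construction operators into a Java 2D path. */
    static GeneralPath build(List<PathOp> ops) {
        GeneralPath path = new GeneralPath();
        for (PathOp op : ops) {
            double[] a = op.operands;
            switch (op.operator) {
                case "m": // begin a new subpath at (x, y)
                    path.moveTo(a[0], a[1]);
                    break;
                case "l": // straight line to (x, y)
                    path.lineTo(a[0], a[1]);
                    break;
                case "c": // cubic Bezier: two control points, then the end point
                    path.curveTo(a[0], a[1], a[2], a[3], a[4], a[5]);
                    break;
                case "re": // rectangle x y w h, expanded into a closed subpath
                    path.moveTo(a[0], a[1]);
                    path.lineTo(a[0] + a[2], a[1]);
                    path.lineTo(a[0] + a[2], a[1] + a[3]);
                    path.lineTo(a[0], a[1] + a[3]);
                    path.closePath();
                    break;
                case "h": // close the current subpath
                    path.closePath();
                    break;
                default:  // painting and other operators are ignored in this sketch
                    break;
            }
        }
        return path;
    }

    public static void main(String[] args) {
        GeneralPath triangle = build(Arrays.asList(
            new PathOp("m", 0, 0),
            new PathOp("l", 10, 0),
            new PathOp("l", 10, 5),
            new PathOp("h")));
        System.out.println(triangle.getBounds2D());
    }
}
```

With the real library, the operands would be read from the operation objects returned by the scanner rather than from PathOp; the coordinates are in the coordinate system in effect at that point of the content stream, so the current transformation matrix still has to be applied to obtain page coordinates.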