Words are split while parsing a PDF with the iText library

Posted 2019-09-09 20:46

Question:

    // Called by iText once for each run of text it finds on the page.
    public void renderText(TextRenderInfo renderInfo) {
        text = renderInfo.getText().toLowerCase();
        System.out.println("@   " + text);
        Vector curBaseline = renderInfo.getBaseline().getStartPoint();
        Vector topRight = renderInfo.getAscentLine().getEndPoint();

        // Box from the baseline start to the ascent-line end; its height approximates the font size.
        Rectangle rect = new Rectangle(curBaseline.get(0), curBaseline.get(1), topRight.get(0), topRight.get(1));
        float curFontSize = rect.getHeight();
        int size = (int) curFontSize;
        at[i][0] = "" + size;
        at[i++][1] = text;
        //System.out.println(text);
    }

I used this code to extract words from a PDF, but some words come out split: for example, "security" is returned as "s", "e", "curity", and words that contain v&e are divided into two words. How can I modify the code so that I get each word exactly as it appears, using the iText library?
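
A likely cause is that a PDF often draws one word in several separate runs (for kerning or font changes), and renderText is called once per run, so treating each call as a word produces the pieces described above. One possible approach, sketched below, is to record each run together with its baseline coordinates and then join neighbouring runs unless there is a real gap between them. The ChunkMerger and Chunk names, the half-space threshold, and the 1-unit line tolerance are all assumptions made for illustration; they are not part of iText. In renderText the values would presumably come from renderInfo.getBaseline() and renderInfo.getSingleSpaceWidth(). iText's own LocationTextExtractionStrategy applies a similar idea and may be enough by itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of iText): merges text runs into whole words. */
final class ChunkMerger {

    /** One run of text as delivered to renderText, with its baseline extent. */
    static final class Chunk {
        final String text;
        final float startX;     // x of the baseline start point
        final float endX;       // x of the baseline end point
        final float baselineY;  // y of the baseline
        final float spaceWidth; // width of a single space in the run's font

        Chunk(String text, float startX, float endX, float baselineY, float spaceWidth) {
            this.text = text;
            this.startX = startX;
            this.endX = endX;
            this.baselineY = baselineY;
            this.spaceWidth = spaceWidth;
        }
    }

    /** Joins runs given in reading order; a new word starts only at a gap, a line change, or whitespace. */
    static List<String> mergeIntoWords(List<Chunk> chunks) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Chunk previous = null;
        for (Chunk chunk : chunks) {
            if (previous != null) {
                // Assumed tolerance: baselines within 1 unit count as the same line.
                boolean sameLine = Math.abs(chunk.baselineY - previous.baselineY) < 1.0f;
                float gap = chunk.startX - previous.endX;
                // Assumed threshold: a gap wider than half a space separates two words.
                if (!sameLine || gap > previous.spaceWidth / 2f) {
                    flush(current, words);
                }
            }
            // Whitespace inside a run also ends the current word.
            for (char c : chunk.text.toCharArray()) {
                if (Character.isWhitespace(c)) {
                    flush(current, words);
                } else {
                    current.append(c);
                }
            }
            previous = chunk;
        }
        flush(current, words);
        return words;
    }

    /** Moves the word built so far (if any) into the result list. */
    private static void flush(StringBuilder current, List<String> words) {
        if (current.length() > 0) {
            words.add(current.toString());
            current.setLength(0);
        }
    }
}
```

With this sketch, renderText would only append a Chunk to a list, and mergeIntoWords would be called after the page has been processed. The threshold probably needs tuning per document, since some PDFs use wide letter spacing.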