How to delete pages from a PDF file using Python?

Posted 2020-05-23 09:01

I have some .pdf files with more than 500 pages each, but I need only a few pages from each file. The documents' title pages must be preserved. I know exactly which page numbers the program should remove. How can I do this using a Python 2.7 environment installed on top of MS Visual Studio?

Tags: python pdf
2 answers
家丑人穷心不美
#2 · 2020-05-23 09:45

Use PyPDF2:

https://github.com/mstamy2/PyPDF2

Documentation is at:

https://pythonhosted.org/PyPDF2/

It seems pretty intuitive.

Emotional °昔
#3 · 2020-05-23 09:48

Try using PyPDF2.

Instead of deleting pages, create a new document and add only the pages you want to keep.

Some sample code (originally adapted from BinPress, which is now dead; an archived copy exists):

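from PyPDF2 import PdfFileWriter, PdfFileReader

pages_to_keep = [1, 2, 10]  # zero-based page indices to keep

# PdfFileReader takes a file object (or path); its second argument is
# `strict`, not a file mode, so the file is opened explicitly here.
# Keep the source open until the output has been written.
with open('source.pdf', 'rb') as src:
    infile = PdfFileReader(src)
    output = PdfFileWriter()

    for i in pages_to_keep:
        p = infile.getPage(i)
        output.addPage(p)

    with open('newfile.pdf', 'wb') as f:
        output.write(f)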

or

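from PyPDF2 import PdfFileWriter, PdfFileReader

pages_to_delete = [3, 4, 5]  # zero-based page indices to drop

# PdfFileReader takes a file object (or path); its second argument is
# `strict`, not a file mode, so the file is opened explicitly here.
# Keep the source open until the output has been written.
with open('source.pdf', 'rb') as src:
    infile = PdfFileReader(src)
    output = PdfFileWriter()

    # Copy every page except the ones marked for deletion.
    for i in range(infile.getNumPages()):
        if i not in pages_to_delete:
            p = infile.getPage(i)
            output.addPage(p)

    with open('newfile.pdf', 'wb') as f:
        output.write(f)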
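
Since the question talks about page numbers, which people usually count from 1, while the code above uses zero-based indices, here is a rough sketch of a wrapper that accepts 1-based page numbers. The names `indices_to_keep` and `delete_pages` are made up for this illustration and are not part of PyPDF2; the PyPDF2 calls are the same ones used above (later PyPDF2 releases renamed these classes, so this assumes an older version such as one that still runs on Python 2.7).

```python
def indices_to_keep(num_pages, pages_to_delete):
    """Return the zero-based indices of the pages to keep.

    pages_to_delete holds 1-based page numbers, as shown in a PDF viewer.
    (Hypothetical helper, not a PyPDF2 function.)
    """
    delete = set(pages_to_delete)
    out_of_range = sorted(n for n in delete if not 1 <= n <= num_pages)
    if out_of_range:
        raise ValueError("page numbers out of range: %s" % out_of_range)
    return [i for i in range(num_pages) if i + 1 not in delete]


def delete_pages(src_path, dst_path, pages_to_delete):
    """Write a copy of src_path to dst_path without the given 1-based pages.

    (Hypothetical helper, not a PyPDF2 function.)
    """
    # Imported here so indices_to_keep() can be used without PyPDF2 installed.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    # Keep the source open until the output has been written.
    with open(src_path, 'rb') as src:
        reader = PdfFileReader(src)
        writer = PdfFileWriter()
        for i in indices_to_keep(reader.getNumPages(), pages_to_delete):
            writer.addPage(reader.getPage(i))
        with open(dst_path, 'wb') as dst:
            writer.write(dst)


# Dropping pages 4-6 of a 6-page document keeps the first three pages.
print(indices_to_keep(6, [4, 5, 6]))  # [0, 1, 2]
```

Usage would look like `delete_pages('source.pdf', 'newfile.pdf', [4, 5, 6])`. As long as page 1 is not in the list, the title page ends up in the output untouched.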