How to generate plain-text source-code PDF examples that work in a document viewer?

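I just found the post "Adobe Forums: Simple Text String Example in specification broken.", so I got interested in finding plain-text source-code PDF examples.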

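So, through that post, I eventually found:

The webpage "PDF Reference and Adobe Extensions to the PDF Specification | Adobe Developer Connection", which contains:
- Document management - Portable document format - Part 1: PDF 1.7, First Edition (PDF32000_2008.pdf)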

The PDF 1.7 spec has, on page 699, the appendix "Annex H (informative) Example PDF files"; and from there, I wanted to try "H.3 Simple Text String Example" (the "classic Hello World").

So, I saved the following as hello.pdf (note that when you copy from PDF32000_2008.pdf, you may get "%PDF-1. 4", that is, with a space inserted after "1."; that space must be removed):

%PDF-1.4
1 0 obj
  << /Type /Catalog
      /Outlines 2 0 R
      /Pages 3 0 R
  >>
endobj

2 0 obj
  << /Type /Outlines
      /Count 0
  >>
endobj

3 0 obj
  << /Type /Pages
      /Kids [ 4 0 R ]
      /Count 1
  >>
endobj

4 0 obj
  << /Type /Page
      /Parent 3 0 R
      /MediaBox [ 0 0 612 792 ]
      /Contents 5 0 R
      /Resources << /ProcSet 6 0 R
      /Font << /F1 7 0 R >>
  >>
>>
endobj

5 0 obj
  << /Length 73 >>
stream
  BT
    /F1 24 Tf
    100 100 Td
    ( Hello World ) Tj
  ET
endstream
endobj

...and I tried to open it:

evince hello.pdf

...however, evince cannot open it: "Unable to open document / PDF document is damaged"; and also:

Error: PDF file is damaged - attempting to reconstruct xref table...
Error: Couldn't find trailer dictionary
Error: Couldn't read xref table

I also checked with qpdf:

$ qpdf --check hello.pdf
WARNING: hello.pdf: file is damaged
WARNING: hello.pdf: can't find startxref
WARNING: hello.pdf: Attempting to reconstruct cross-reference table
hello.pdf: unable to find trailer dictionary while recovering damaged file

Where am I going wrong?

Many thanks in advance for any answers,
Cheers!

Answer 1:

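You should append a (syntactically correct) xref and trailer section to the end of the file. That means: each object in your PDF needs one line in the xref table, even if the byte offset it states is not correct. Then Ghostscript, pdftk or qpdf can re-establish a correct xref and render the file: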

[...]
endobj
xref 
0 8 
0000000000 65535 f 
0000000010 00000 n 
0000000020 00000 n 
0000000030 00000 n 
0000000040 00000 n 
0000000050 00000 n 
0000000060 00000 n 
0000000070 00000 n 
trailer 
<</Size 8/Root 1 0 R>> 
startxref 
555 
%%EOF
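The placeholder table above can also be generated rather than typed by hand. The following is only a sketch in Python: the helper names `xref_entry` and `placeholder_xref` are made up for this illustration, and the offsets (and the `startxref` value) it emits are deliberately dummy values, so the result still depends on a repair tool to rebuild the real table. The one fixed rule it relies on is that each entry in a classic xref table is exactly 20 bytes long: a 10-digit offset, a space, a 5-digit generation number, a space, `n` or `f`, and a 2-byte end-of-line (here a space plus a line feed).

```python
def xref_entry(offset: int, generation: int = 0, in_use: bool = True) -> bytes:
    """Format one 20-byte cross-reference entry."""
    flag = b"n" if in_use else b"f"
    # 10-digit offset, space, 5-digit generation, space, flag, space + LF
    return b"%010d %05d %s \n" % (offset, generation, flag)


def placeholder_xref(num_objects: int, root_obj: int = 1) -> bytes:
    """Build an xref + trailer section with dummy offsets (hypothetical helper).

    The offsets are NOT correct; they only make the section syntactically
    complete so that a repair tool has something to start from.
    """
    size = num_objects + 1  # object 0 (the free-list head) counts too
    parts = [b"xref\n", b"0 %d\n" % size, xref_entry(0, 65535, in_use=False)]
    for number in range(1, size):
        parts.append(xref_entry(number * 10))  # dummy offset, as in the answer
    parts.append(b"trailer\n<< /Size %d /Root %d 0 R >>\n" % (size, root_obj))
    parts.append(b"startxref\n0\n%%EOF\n")  # 0 is a dummy position as well
    return b"".join(parts)


if __name__ == "__main__":
    # Seven objects, as in the Hello World example from the question.
    print(placeholder_xref(7).decode("ascii"), end="")
```

Appending this output to the object listing gives a file of the same shape as the one above; whether a particular tool then recovers it is up to that tool.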

Answer 2:

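Ah, damn - I had copied only part of the code; the code in the OP is the part on pg 701 - after that there is a page footer, which confused me; the code actually continues on pg 702 :/

(EDIT: see also "Introduction to PDF - GNUpdf" (archived) for a similar, more detailed example)

So, here is the complete code: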

%PDF-1.4
1 0 obj
  << /Type /Catalog
      /Outlines 2 0 R
      /Pages 3 0 R
  >>
endobj

2 0 obj
  << /Type /Outlines
      /Count 0
  >>
endobj

3 0 obj
  << /Type /Pages
      /Kids [ 4 0 R ]
      /Count 1
  >>
endobj

4 0 obj
  << /Type /Page
      /Parent 3 0 R
      /MediaBox [ 0 0 612 792 ]
      /Contents 5 0 R
      /Resources << /ProcSet 6 0 R
      /Font << /F1 7 0 R >>
  >>
>>
endobj

5 0 obj
  << /Length 73 >>
stream
  BT
    /F1 24 Tf
    100 100 Td
    ( Hello World ) Tj
  ET
endstream
endobj

6 0 obj
  [ /PDF /Text ]
endobj

7 0 obj
  << /Type /Font
    /Subtype /Type1
    /Name /F1
    /BaseFont /Helvetica
    /Encoding /MacRomanEncoding
  >>
endobj

xref
0 8
0000000000 65535 f
0000000009 00000 n
0000000074 00000 n
0000000120 00000 n
0000000179 00000 n
0000000364 00000 n
0000000466 00000 n
0000000496 00000 n

trailer
  << /Size 8
    /Root 1 0 R
  >>
startxref
625
%%EOF

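Indeed, as the error messages were saying, the xref section was missing!

However, that is not the end of it - while this document will now open in evince, evince will still complain: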

$ evince hello.pdf 
Error: PDF file is damaged - attempting to reconstruct xref table...

...and so will qpdf:

$ qpdf --check hello.pdf
WARNING: hello.pdf: file is damaged
WARNING: hello.pdf (file position 625): xref not found
WARNING: hello.pdf: Attempting to reconstruct cross-reference table
checking hello.pdf
PDF Version: 1.4
File is not encrypted
File is not linearized
WARNING: hello.pdf (object 5 0, file position 436): attempting to recover stream length
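The first qpdf warning boils down to "the number after startxref does not point at the xref keyword". A rough version of that kind of check can be sketched in Python. This is not how qpdf itself works, just an illustration; `check_xref` is a made-up name, and the sketch assumes a classic (non-stream) xref table with a single subsection, which is what the Hello World example has.

```python
import re


def check_xref(pdf: bytes) -> list[str]:
    """Return human-readable problems found in a classic xref table.

    Illustrative sketch only: handles one xref subsection, no xref streams.
    """
    problems = []
    m = re.search(rb"startxref\s+(\d+)\s+%%EOF", pdf)
    if m is None:
        return ["startxref not found"]
    xref_pos = int(m.group(1))
    if pdf[xref_pos:xref_pos + 4] != b"xref":
        problems.append(f"xref keyword not found at position {xref_pos}")
        # Fall back to searching for a line that starts with "xref".
        found = pdf.rfind(b"\nxref")
        if found < 0:
            return problems
        xref_pos = found + 1
    header = re.compile(rb"xref\s+(\d+)\s+(\d+)\s+").match(pdf, xref_pos)
    if header is None:
        problems.append("malformed xref subsection header")
        return problems
    first, count = int(header.group(1)), int(header.group(2))
    entries = re.findall(rb"(\d{10}) (\d{5}) ([nf])", pdf[header.end():])[:count]
    for number, (offset, gen, flag) in enumerate(entries, start=first):
        if flag == b"f":
            continue  # free entries do not point at an object
        expected = b"%d %d obj" % (number, int(gen))
        offset = int(offset)
        if pdf[offset:offset + len(expected)] != expected:
            problems.append(
                f"object {number}: offset {offset} does not point at "
                f"'{expected.decode()}'"
            )
    return problems


if __name__ == "__main__":
    with open("hello.pdf", "rb") as f:
        for problem in check_xref(f.read()):
            print(problem)
```

An empty list means the offsets line up; it does not check stream lengths, which is what the last qpdf warning above is about.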

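So, to actually get a proper example, as the post "Adobe Forums: Simple Text String Example in specification broken." points out, the xref table needs to be reconstructed (with correct byte offsets).

And in order to do that, we can use pdftk to "Repair a PDF's Corrupted XREF Table and Stream Lengths (If Possible)":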

$ pdftk hello.pdf output hello_repair.pdf

...and now hello_repair.pdf opens in evince without a problem - and qpdf reports:

$ qpdf --check hello_repair.pdf
checking hello_repair.pdf
PDF Version: 1.4
File is not encrypted
File is not linearized
No errors found
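As an alternative to repairing a hand-written file, the byte offsets and the stream /Length can be computed while the file is being written. Below is a minimal sketch in Python under a few assumptions: `build_pdf` and `stream_object` are names invented for this example, every object has generation 0, and the object bodies are the ones from the Hello World listing, just collapsed onto fewer lines. I have not checked the result in every viewer, so treat it as a starting point rather than a reference implementation.

```python
def stream_object(data: bytes) -> bytes:
    """Wrap stream data with a /Length that matches its byte count."""
    return b"<< /Length %d >>\nstream\n" % len(data) + data + b"\nendstream"


def build_pdf(objects: list[bytes]) -> bytes:
    """Serialize object bodies (object 1 first) with a computed xref table."""
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where "N 0 obj" starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_pos = len(out)  # byte offset of the "xref" keyword
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry for object 0 (free)
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # 20 bytes per entry
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


# Content stream and objects taken from the Hello World example above.
content = b"BT\n/F1 24 Tf\n100 100 Td\n( Hello World ) Tj\nET"
objects = [
    b"<< /Type /Catalog /Outlines 2 0 R /Pages 3 0 R >>",
    b"<< /Type /Outlines /Count 0 >>",
    b"<< /Type /Pages /Kids [ 4 0 R ] /Count 1 >>",
    b"<< /Type /Page /Parent 3 0 R /MediaBox [ 0 0 612 792 ] /Contents 5 0 R"
    b" /Resources << /ProcSet 6 0 R /Font << /F1 7 0 R >> >> >>",
    stream_object(content),
    b"[ /PDF /Text ]",
    b"<< /Type /Font /Subtype /Type1 /Name /F1 /BaseFont /Helvetica"
    b" /Encoding /MacRomanEncoding >>",
]
pdf_bytes = build_pdf(objects)

if __name__ == "__main__":
    # The output filename is arbitrary.
    with open("hello_generated.pdf", "wb") as f:
        f.write(pdf_bytes)
```

The file is built as bytes rather than text on purpose: the xref offsets count bytes, so any newline translation or re-encoding after the fact would invalidate them again.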

Well, hope this helps someone,
Cheers!

Source: How to generate plain-text source-code PDF examples that work in a document viewer?