I recently started using the IBM Watson Document Conversion API. I always get an encoding error when converting a PDF file!
#!/usr/bin/env python
# coding: utf-8
import json
from watson_developer_cloud import DocumentConversionV1
from io import open

document_conversion = DocumentConversionV1(
    username='{XXXXXXXXXXX}',
    password='{XXXXXXXXXXXXX}',
    version='2015-12-15'
)

config = {
    'conversion_target': 'ANSWER_UNITS',
    # Use a custom configuration.
    'word': {
        'heading': {
            'fonts': [
                {'level': 1, 'min_size': 24},
                {'level': 2, 'min_size': 16, 'max_size': 24}
            ]
        }
    }
}

with open(('sample.pdf'), 'r') as document:
    response = document_conversion.convert_document(document=document, config=config)
    print(json.dumps(response, indent=2))
Your mistake is in the configuration JSON. You are still using the Word configuration rather than the PDF configuration:
{
  "pdf": {
    "heading": {
      "fonts": [
        {"level": 1, "min_size": 24},
        {"level": 2, "min_size": 18, "max_size": 23, "bold": true},
        {"level": 3, "min_size": 14, "max_size": 17, "italic": false},
        {"level": 4, "min_size": 12, "max_size": 13, "name": "Times New Roman"}
      ]
    }
  }
}
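A minimal Python sketch of the same fix, assuming `service` is the `DocumentConversionV1` instance created in the question. `convert_pdf` is a hypothetical helper, not part of the SDK. It also opens the file in binary mode (`'rb'`), because reading PDF bytes in text mode can raise a decoding error on its own; whether that contributes to the error in the question is an assumption.

```python
# PDF heading rules, copied from the JSON above into a Python dict.
pdf_config = {
    'pdf': {
        'heading': {
            'fonts': [
                {'level': 1, 'min_size': 24},
                {'level': 2, 'min_size': 18, 'max_size': 23, 'bold': True},
                {'level': 3, 'min_size': 14, 'max_size': 17, 'italic': False},
                {'level': 4, 'min_size': 12, 'max_size': 13, 'name': 'Times New Roman'},
            ]
        }
    }
}


def convert_pdf(service, path, config):
    """Hypothetical helper: send one PDF to the service with the given config.

    `service` is assumed to expose convert_document(document=..., config=...),
    as the DocumentConversionV1 object in the question does.
    """
    # Binary mode hands the raw PDF bytes to the client without decoding them.
    with open(path, 'rb') as document:
        return service.convert_document(document=document, config=config)
```

With the objects from the question, the call would look like `response = convert_pdf(document_conversion, 'sample.pdf', pdf_config)`.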
If you want to use answer units, add that to your configuration as well:
var config = {
  conversion_target: "answer_units",
  "pdf": {
    "heading": {
      "fonts": [
        {
          "level": 1,
          "min_size": 24,
          "max_size": 80
        },
        {
          "level": 2,
          "min_size": 18,
          "max_size": 24,
          "bold": false,
          "italic": false
        },
        {
          "level": 2,
          "min_size": 18,
          "max_size": 24,
          "bold": true
        },
        {
          "level": 3,
          "min_size": 13,
          "max_size": 18,
          "bold": false,
          "italic": false
        },
        {
          "level": 3,
          "min_size": 13,
          "max_size": 18,
          "bold": true
        },
        {
          "level": 4,
          "min_size": 11,
          "max_size": 13,
          "bold": true,
          "italic": false
        }
      ]
    }
  }
}
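Since the question is written in Python, here is a rough sketch of how the answer-units target could be combined with a PDF configuration there. `with_answer_units` is a hypothetical helper, and the font list is shortened for illustration. The uppercase `'ANSWER_UNITS'` spelling is taken from the question's code; whether the service treats the lowercase form above identically is not confirmed here.

```python
import copy
import json


def with_answer_units(config):
    """Hypothetical helper: return a copy of config that requests answer units."""
    result = copy.deepcopy(config)
    # Uppercase spelling taken from the question's Python code.
    result['conversion_target'] = 'ANSWER_UNITS'
    return result


# Shortened version of the heading rules above, for illustration only.
pdf_only = {
    'pdf': {
        'heading': {
            'fonts': [
                {'level': 1, 'min_size': 24, 'max_size': 80},
                {'level': 2, 'min_size': 18, 'max_size': 24, 'bold': True},
            ]
        }
    }
}

config = with_answer_units(pdf_only)
print(json.dumps(config, indent=2, sort_keys=True))
```

The resulting dict can be passed as the `config` argument of `convert_document`, in place of the `'word'` configuration from the question.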
Documentation: https://www.ibm.com/watson/developercloud/doc/document-conversion/customizing.html