HTML to .doc converter in Python?

Posted 2019-01-15 00:45

I am using pisa, which is an HTML to PDF conversion library for Python.

Does there exist the same thing for a Word document: an HTML to .doc conversion library for Python?

3 answers
可以哭但决不认输i
#2 · 2019-01-15 01:19

You could use win32com from the pywin32 extensions for Windows to let MS Word do the conversion for you. This requires Word to be installed on the machine. A simple example:

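import os
import win32com.client

# Start (or attach to) a Word instance through COM.
word = win32com.client.Dispatch('Word.Application')

# Word may resolve relative paths against its own working directory,
# so pass absolute paths to be safe.
doc = word.Documents.Add(os.path.abspath('example.html'))
doc.SaveAs(os.path.abspath('example.doc'), FileFormat=0)  # 0 = wdFormatDocument (.doc)
doc.Close()

word.Quit()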
Evening l夕情丶
#3 · 2019-01-15 01:19

In case anybody else lands here trying to convert the other way around (.doc to HTML): the above code works, but you need to change the FileFormat value. The possible values are listed here:

http://msdn.microsoft.com/en-us/library/ff839952.aspx

Example: to save as filtered HTML, use FileFormat=10 (wdFormatFilteredHTML) instead of 0.
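
A minimal sketch of that reverse conversion, assuming Windows with MS Word and pywin32 installed. The helper names convert_with_word and run_example, the constant names, and the file names are hypothetical; the numeric values come from Word's WdSaveFormat enumeration.

```python
import os

# Values from Word's WdSaveFormat enumeration.
WD_FORMAT_DOCUMENT = 0        # legacy .doc
WD_FORMAT_HTML = 8            # full HTML
WD_FORMAT_FILTERED_HTML = 10  # HTML without Word-specific markup


def convert_with_word(word, src, dst, file_format):
    """Open src in the given Word application object and save it as dst."""
    # Absolute paths avoid depending on Word's own working directory.
    doc = word.Documents.Open(os.path.abspath(src))
    try:
        doc.SaveAs(os.path.abspath(dst), FileFormat=file_format)
    finally:
        # Close the document even if saving fails.
        doc.Close()


def run_example():
    """Convert example.doc to filtered HTML (Windows with Word only)."""
    import win32com.client  # imported here so the module loads without pywin32

    word = win32com.client.Dispatch("Word.Application")
    try:
        convert_with_word(word, "example.doc", "example.html",
                          WD_FORMAT_FILTERED_HTML)
    finally:
        word.Quit()
```

Calling run_example() on a machine with Word should write example.html next to example.doc. Passing WD_FORMAT_DOCUMENT with an HTML source gives the original HTML to .doc direction.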

别忘想泡老子
#4 · 2019-01-15 01:31

I am not aware of a module that does this conversion directly. However:

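  1. You can convert the HTML to plain text first using the html2text module. Its output is Markdown-style text, so most of the original formatting is lost.
  2. After that, you can use the python-docx module to write the text to a .docx file. Note that python-docx only produces .docx, not the legacy .doc format.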
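
The two steps above can be sketched roughly as follows, assuming html2text and python-docx are installed (pip install html2text python-docx). The helper names split_paragraphs and html_to_docx are hypothetical, and splitting on blank lines is just one simple way to map the text onto Word paragraphs.

```python
def split_paragraphs(text):
    """Split text into paragraphs on blank lines, collapsing inner whitespace."""
    return [" ".join(block.split())
            for block in text.split("\n\n")
            if block.strip()]


def html_to_docx(html, out_path):
    """Convert an HTML string to a text-only .docx file."""
    # Third-party imports are kept local so split_paragraphs works without them.
    import html2text
    from docx import Document

    # Step 1: HTML to Markdown-style plain text.
    text = html2text.html2text(html)

    # Step 2: one Word paragraph per block of text.
    document = Document()
    for paragraph in split_paragraphs(text):
        document.add_paragraph(paragraph)
    document.save(out_path)
```

For example, html_to_docx("<h1>Title</h1><p>Some text.</p>", "example.docx") should produce a document with two paragraphs. Headings, links, and emphasis survive only as Markdown markers in the text.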