Edit and create HTML file using Python

Posted 2019-01-18 15:22

Question:

I am really new to Python. I am currently working on an assignment that involves creating an HTML file using Python. I understand how to open an HTML file in Python and write to it:

table_file = open('abhi.html', 'w')
table_file.write('<!DOCTYPE html><html><body>')
table_file.close()

The problem with the code above is that it replaces the whole HTML file with the string passed to write(). How can I edit the file and at the same time keep its content intact? I mean writing something like this, but inside the body tags:

<link rel="icon" type="image/png" href="img/tor.png">

I need the link to automatically go in between the opening and closing body tags.

Answer 1:

You probably want to read up on BeautifulSoup:

import bs4

# load the file
with open("existing_file.html") as inf:
    txt = inf.read()
    # name the parser explicitly; omitting it makes bs4 guess one and emit a warning
    soup = bs4.BeautifulSoup(txt, "html.parser")

# create new link
new_link = soup.new_tag("link", rel="icon", type="image/png", href="img/tor.png")
# insert it into the document
soup.head.append(new_link)

# save the file again
with open("existing_file.html", "w") as outf:
    outf.write(str(soup))

Given a file like

<html>
<head>
  <title>Test</title>
</head>
<body>
  <p>What's up, Doc?</p>
</body>
</html>  

this produces

<html>
<head>
<title>Test</title>
<link href="img/tor.png" rel="icon" type="image/png"/></head>
<body>
<p>What's up, Doc?</p>
</body>
</html> 

(Note: it has munched the whitespace, and the exact whitespace may vary with the parser and BeautifulSoup version, but it has gotten the HTML structure correct.)
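The code above appends to the head, which is where a link rel="icon" tag normally belongs, while the question asked about the body. The same pattern should work for the body by using soup.body instead of soup.head. The following is a minimal sketch under that assumption; the in-memory HTML string and the added paragraph are made up for illustration, and "html.parser" is just one possible parser choice.

```python
import bs4

# Hypothetical in-memory document standing in for the existing file.
html = (
    "<html><head><title>Test</title></head>"
    "<body><p>What's up, Doc?</p></body></html>"
)
soup = bs4.BeautifulSoup(html, "html.parser")

# Build a new tag and give it some text (illustrative content only).
new_par = soup.new_tag("p")
new_par.string = "Added by Python"

# Append it as the last child of <body>; existing content is untouched.
soup.body.append(new_par)

result = str(soup)
print(result)
```

The new element ends up just before the closing body tag, and everything that was already in the document is kept.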



Answer 2:

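You are using write ('w') mode, which erases the existing file (https://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files). Append ('a') mode keeps the existing content, although it adds new text at the very end of the file, after the closing tags, rather than inside the body: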

table_file = open('abhi.html', 'a')
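To make the difference concrete, here is a small sketch using only the standard library. It first appends to a file and then inserts a snippet before the closing body tag by reading, modifying, and rewriting the file. The helper name insert_before_body_close, the temporary file, and the sample markup are all hypothetical. The plain string replace is case-sensitive and naive, so treat it as an illustration rather than a robust HTML editor.

```python
import os
import tempfile


def insert_before_body_close(path, snippet):
    """Read the file, put snippet before the first </body>, write it back."""
    with open(path, encoding="utf-8") as f:
        html = f.read()
    if "</body>" not in html:
        raise ValueError("no </body> tag found in " + path)
    html = html.replace("</body>", snippet + "\n</body>", 1)
    # 'w' mode truncates the file, which is fine here because we
    # already hold the full (modified) content in memory.
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)


# Create a small sample file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "abhi.html")
with open(path, "w", encoding="utf-8") as f:
    f.write("<html><body>\n<p>Hello</p>\n</body></html>\n")

# Append mode keeps the old content but writes after </html>.
with open(path, "a", encoding="utf-8") as f:
    f.write("<p>appended</p>\n")
with open(path, encoding="utf-8") as f:
    after_append = f.read()

# Read-modify-write places the snippet inside the body.
insert_before_body_close(path, "<p>inserted</p>")
with open(path, encoding="utf-8") as f:
    after_insert = f.read()

print(after_insert)
```

After the append step the new paragraph sits after the closing html tag, whereas the read-modify-write step lands its paragraph inside the body.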