Issue with unclosed img tag

Posted 2019-07-07 17:10

Question:

Data is submitted to the server in HTML format, and the server does some preprocessing on it.

The preprocessing operates on the "src" attribute of each "img" tag.

After preprocessing and saving, none of the preprocessed "img" tags are self-closed.

For example, if the "img" tag was the following:

<img src="image.png" />

after preprocessing with Nokogiri or Hpricot, it will be:

<img src="/preprocessed_path/image.png">

The code is pretty simple:

doc = Hpricot(self.content)
doc.search("img").each do |tag|
  preprocess tag
end
self.content = doc.to_html

For Nokogiri, it looks the same.

How can I resolve this issue?


Update 1

I forgot to mention: I have an HTML5 page, which I'm trying to validate with the W3C Validator.

When the "img" tag is inside a div, it complains about the following:

required character (found d) (expected i)
</div>

For example, try to validate the following code:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta charset="UTF-8" />
</head>
<body>
    <div>
        <img src="image.png">
    </div>
</body>
</html>

You will get the same error:

Line 9, Column 4: required character (found d) (expected i)
</div>

Answer 1:

I think the problem is with your <html> tag, where it declares the xmlns attribute as XHTML. That seems to contradict the fact that this is not an XHTML document. If you remove the xmlns attribute, it should be valid.

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8" />
  <title>something here</title>
</head>
<body>
  <div>
    <img src="image.png">
  </div>
</body>
</html>


Answer 2:

The problem is that your libraries are generating correct HTML, and the trailing "/" is not needed in HTML. Unless you're serving application/xhtml+xml, there's no point in having it there at all: "img" is a void element that never takes a closing tag in any version of HTML, and the "/" is meaningless. If you are serving application/xhtml+xml, I think you'll need to explicitly use Nokogiri's to_xhtml.



Answer 3:

In the preprocess function you are delegating to, do you not have control over each img tag? Can you not return what it already returns and append an explicit closing tag?
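
One way to read this suggestion is to fix up the serialized string after doc.to_html rather than the parsed node. The sketch below uses only the Ruby standard library. The names VOID_TAGS, VOID_TAG_PATTERN and self_close_void_tags are hypothetical, and the regex is an assumption that attribute values never contain "<" or ">", so treat it as a rough workaround rather than a robust HTML rewriter.

```ruby
# Void elements handled by this sketch; extend the list as needed.
VOID_TAGS = %w[img br hr input meta link].freeze

# Matches an opening void tag with its attributes, with or without
# a trailing "/". Assumes attribute values contain no "<" or ">".
VOID_TAG_PATTERN = /<(#{VOID_TAGS.join('|')})\b([^<>]*?)\s*\/?>/i

# Rewrites every matched void tag so that it ends in " />".
def self_close_void_tags(html)
  html.gsub(VOID_TAG_PATTERN) { "<#{$1}#{$2} />" }
end

puts self_close_void_tags('<div><img src="/preprocessed_path/image.png"></div>')
```

In the question's code this would wrap the last line, along the lines of self.content = self_close_void_tags(doc.to_html). Tags that are already self-closed come out unchanged, so running it twice is harmless.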