Use Nokogiri to replace <img> tags with <%= image_tag %>?

Posted 2019-05-18 13:53

Question:

How can I use Nokogiri to replace all img tags with image_tag calls? The goal is to take advantage of Rails' ability to plug in the correct asset server automatically.

require 'nokogiri'

class ToImageTag

  def self.convert
    Dir.glob("app/views/**/*").each do |filename|
      doc = Nokogiri::HTML(File.open(filename))
      doc.xpath("//img").each do |img_tags|
        # grab the src and all the attributes and move them to ERB
      end

      # rewrite the file
    end

  rescue => err
    puts "Exception: #{err}"
  end

end

Answer 1:

Somewhat inspired by maerics' response, I've created a script that does this. It doesn't have an issue with HTML entities because it only uses the Nokogiri output as a guide for replacement. The actual replacement is done on the original file contents with String#gsub!.

https://gist.github.com/1254319



Answer 2:

The closest I can come up with is as follows:

# ......
Dir.glob("app/views/**/*").each do |filename|
  # Skip directories, which the glob also matches.
  next unless File.file?(filename)
  # Convert each "img" tag into a text node.
  doc = Nokogiri::HTML(File.open(filename))
  doc.xpath("//img").each do |img|
    image_tag = "<%= image_tag('#{img['src']}') %>"
    img.replace(doc.create_text_node(image_tag))
  end
  # Replace the new text nodes with ERB markup.
  s = doc.to_s.gsub(/(&lt;%|%&gt;)/) {|x| x=='&lt;%' ? '<%' : '%>'}
  File.open(filename, "w") {|f| f.write(s)}
end

This solution will wreak havoc on any file that contains the sequences "&lt;%" or "%&gt;" (e.g. if you're describing ERB syntax in HTML). The problem is that you're trying to use an XML parser to replace an XML node with text that the parser has to escape when it serializes the document, so I'm not sure you can do much better than this, unless there is some hidden "raw_outer_xml=(str)" method.

Your best overall bet is to write a custom SAX parser which simply echoes the data given to your callbacks (or stores it in a string buffer) unless it is a "start_element" with an "img", in which case it will write the ERB sequence.