I am trying to read an XML file from a third party with Nokogiri in my Rails project.
One of the nodes I have to parse contains a URL with unescaped ampersands (like foo.com/index.html?page=1&query=bar).
I understand that this is considered malformed XML, and that Nokogiri tries to parse it anyway, resulting in foo.com/index.html?page=1=bar.
How can I obtain the full URL? Can I configure Nokogiri to handle this, or would you run a search-and-replace pass over the raw text before parsing? What would be the best practice?
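
For reference, the search-and-replace pre-pass I have in mind would look roughly like the sketch below. The helper name `escape_bare_ampersands` and the `<link>` element are made up for illustration, and the regex assumes that any `&` not followed by something shaped like an entity reference (`&name;`, `&#38;`, `&#x26;`) is a bare ampersand. That assumption would misfire on a URL that happens to contain text like `&b;`.

```ruby
# Matches an "&" that does NOT start something shaped like an entity
# reference (&name;, &#123; or &#x1F;). Assumption: everything else is a
# bare ampersand that should have been written as &amp;.
BARE_AMPERSAND = /&(?!(?:[A-Za-z_][\w.-]*|#\d+|#x\h+);)/

# Hypothetical helper (not part of Nokogiri): escapes bare ampersands in
# the raw XML text so that a strict parser can read it.
def escape_bare_ampersands(raw_xml)
  raw_xml.gsub(BARE_AMPERSAND, "&amp;")
end

# The <link> element name is invented for this example.
raw   = "<link>foo.com/index.html?page=1&query=bar</link>"
fixed = escape_bare_ampersands(raw)

# The cleaned string could then be parsed as usual, e.g.:
#   doc = Nokogiri::XML(fixed)
#   doc.at_xpath("//link").text
```

Already-escaped entities are left alone, so running the pass on a feed that is partly correct should not double-escape anything.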