I'm writing a parser in Haskell for a website, using the Text.XML and Text.XML.Cursor modules.
The page contains unclosed tags, and I get this error:
Main.hs: Error parsing XML file dat.html: 29:1-29:8: Expected end
element for: Name {nameLocalName = "br", nameNamespace = Nothing,
namePrefix = Nothing}, but received: EventEndElement (Name
{nameLocalName = "body", nameNamespace = Nothing, namePrefix =
Nothing})
What should I do? How can I ignore such tags?
A text object with unclosed tags is not well-formed and is therefore not XML.
So, forget about using any XML libraries, parsers, or tools. They are, by definition and design, not able to help you.
You have two options. Either,
- Repair the text to be well-formed by closing the unclosed
tags. You might do this manually or with a tool such as HTML Tidy, or
- Define a new data format that allows unclosed tags, and write a
parser for it from the ground up.
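
For the specific error in the question (an unclosed br), the first option can be sketched in Haskell itself as a small pre-processing pass that runs before the text is handed to the XML parser. This is only a minimal sketch under some assumptions: the names voidTags, isVoidTag and closeVoidTags are made up for illustration, the tag list is a partial, assumed subset of HTML's void elements, and the scan is naive (a '>' inside a quoted attribute value would confuse it). It does not repair other kinds of malformed markup, such as an unclosed p or mismatched nesting.

```haskell
import Data.Char (isAlphaNum, toLower)
import Data.List (isSuffixOf)

-- Assumed, partial list of HTML elements that never take a closing tag.
voidTags :: [String]
voidTags = ["br", "hr", "img", "input", "meta", "link"]

-- True if the text between '<' and '>' starts with a void tag name.
-- A closing tag such as "/br" yields an empty name and is not matched.
isVoidTag :: String -> Bool
isVoidTag body = map toLower (takeWhile isAlphaNum body) `elem` voidTags

-- Rewrite void tags such as <br> into the self-closing form <br/>,
-- leaving tags that are already self-closed untouched.
closeVoidTags :: String -> String
closeVoidTags [] = []
closeVoidTags ('<' : rest)
  | (body, '>' : after) <- break (== '>') rest
  , isVoidTag body
  , not ("/" `isSuffixOf` body)
  = '<' : body ++ "/>" ++ closeVoidTags after
closeVoidTags (c : cs) = c : closeVoidTags cs
```

With this, closeVoidTags "<body>a<br>b</body>" evaluates to "<body>a<br/>b</body>", which an XML parser can then accept. For anything beyond simple void tags, a dedicated repair tool is the safer route.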