I receive HTML pages from our creative team, and then use those to build aspx pages. One challenge I frequently face is getting the HTML I spit out to match theirs exactly. I almost always end up screwing up the nesting of <div>
s between my page and the master pages.
Does anyone know of a tool that will help in this situation -- something that will compare 2 pages and output the structural differences? I can't use a standard diff tool, because IDs change from what I receive from creative, text replaces lorem ipsum, etc..
You can use HTMLTidy to convert the HTML to well-formed XML so you can use XML Diff, as Gulzar suggested.
tidy -asxml index.html
If out output XML compliant HTML. Or at least translate your HTML product into XML compliancy, you at least could then XSL your output to remove the content and id tags. Apply the same transformation to their html, and then compare.
I was thinking on lines of XML Diff since HTML can be represented as an XML Document.
The challenge with HTML is that it might not be always well formed. Found one more here showing how to use XMLDiff class.
A copy of my own answer from here.
What about DaisyDiff (Java and PHP vesions available).
Following features are really nice:
- Works with badly formed HTML that can be found "in the wild".
- The diffing is more specialized in HTML than XML tree differs. Changing part of a text node will not cause the entire node to be changed.
- In addition to the default visual diff, HTML source can be diffed coherently.
- Provides easy to understand descriptions of the changes.
- The default GUI allows easy browsing of the modifications through keyboard shortcuts and links.
winmerge is a good visual diff program