Make output of diff-lcs human readable

Posted 2019-07-21 02:10

Question:

I'm using the diff-lcs gem to output the differences between two bodies of HTML content. Here's the sample content.

Version one:

<p>Paragraph one. Sentence one.</p>

<p>Paragraph two. Another sentence.</p>

<p>Paragraph three. I dare you to change me!</p>

Version two:

<p>Paragraph one. Sentence two.</p>

<p>Paragraph two. Another sentence.</p>

<p>Paragraph three. I dare you to update me!</p>

Using this:

seq1 = @versionOne.body
seq2 = @versionTwo.body

seq = Diff::LCS.diff(seq1, seq2)

You get this monster:

seq => [[#<Diff::LCS::Change:0x0000000be539f8 @action="-", @position=27, @element="t">, #<Diff::LCS::Change:0x0000000be538b8 @action="-", @position=28, @element="w">], [#<Diff::LCS::Change:0x0000000be53520 @action="+", @position=28, @element="n">, #<Diff::LCS::Change:0x0000000be53408 @action="+", @position=29, @element="e">], [#<Diff::LCS::Change:0x0000000be3aa70 @action="-", @position=110, @element="u">, #<Diff::LCS::Change:0x0000000be3a840 @action="-", @position=111, @element="p">, #<Diff::LCS::Change:0x0000000be34ee0 @action="-", @position=112, @element="d">, #<Diff::LCS::Change:0x0000000be349e0 @action="+", @position=110, @element="c">, #<Diff::LCS::Change:0x0000000be348a0 @action="+", @position=111, @element="h">], [#<Diff::LCS::Change:0x0000000be34580 @action="-", @position=114, @element="t">, #<Diff::LCS::Change:0x0000000be34210 @action="+", @position=113, @element="n">, #<Diff::LCS::Change:0x0000000be33f40 @action="+", @position=114, @element="g">], [#<Diff::LCS::Change:0x0000000be331d0 @action="-", @position=124, @element="">]]

The output of sdiff and the other methods in the documentation is similarly horrifying. I understand the structure of the array (of arrays), but there must be a simple way to show the differences in a human-readable, styleable manner.

PS - If someone wants to create a diff-lcs tag, that would be appreciated.

Answer 1:

I was feeding diff-lcs a regular string, which it treats as a sequence of characters. That gives a character-by-character comparison, which is correct but not what I wanted. I wanted something more readable: a comparison of words, lines, or sentences. I chose sentences.

seq1 = @versionOne.body.split('.')
seq2 = @versionTwo.body.split('.')
compareDiff = Diff::LCS.sdiff(seq1, seq2)

This produced much more readable and parseable content. Realistically, I'll also want to split on ! and ?. The result, however, is not a plain array of arrays or a hash: it is an array of Diff::LCS::ContextChange objects, and you can iterate over it like any other array. The output in the browser threw me off at first. This is the YAML-formatted output I got in the Rails console (no idea why it didn't show this in the browser):

---
- !ruby/object:Diff::LCS::ContextChange
  action: "="
  new_element: <p>Paragraph one
  new_position: 0
  old_element: <p>Paragraph one
  old_position: 0
- !ruby/object:Diff::LCS::ContextChange
  action: "!"
  new_element: " Sentence two"
  new_position: 1
  old_element: " Sentence one"
  old_position: 1
- !ruby/object:Diff::LCS::ContextChange
  action: "="
  new_element: |-
    </p>
    <p>Paragraph two
  new_position: 2
  old_element: |-
    </p>
    <p>Paragraph two
  old_position: 2
- !ruby/object:Diff::LCS::ContextChange
  action: "="
  new_element: " Another sentence"
  new_position: 3
  old_element: " Another sentence"
  old_position: 3
- !ruby/object:Diff::LCS::ContextChange
  action: "="
  new_element: |-
    </p>
    <p>Paragraph three
  new_position: 4
  old_element: |-
    </p>
    <p>Paragraph three
  old_position: 4
- !ruby/object:Diff::LCS::ContextChange
  action: "!"
  new_element: " I dare you to update me!</p>"
  new_position: 5
  old_element: " I dare you to change me!</p>"
  old_position: 5
 => nil
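
As noted above, splitting on '.' alone misses sentences that end in ! or ?, and it also drops the period itself. Here is a minimal sketch of one alternative, assuming a zero-width lookbehind split suits the content. The name split_sentences is a hypothetical helper, not part of diff-lcs:

```ruby
# Split text into sentences, keeping the terminating punctuation attached
# to each sentence so nothing has to be re-added when rendering.
# NOTE: split_sentences is a hypothetical helper name, not a diff-lcs method.
def split_sentences(text)
  # (?<=[.!?]) is a zero-width lookbehind: split right *after* ., ! or ?
  text.split(/(?<=[.!?])/)
end

old_body = "<p>Paragraph three. I dare you to change me!</p>"
sentences = split_sentences(old_body)
puts sentences.inspect
```

The resulting arrays can be passed to Diff::LCS.sdiff in place of the split('.') results. Because the punctuation stays attached, the rendering step no longer needs to append a period. Abbreviations such as "e.g." would still be split, so this is only a starting point.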

Super helpful! This will output a wiki-like diff:

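# Old version first, new version second, so that old_element/new_element
# line up with <del>/<add> below.
sdiff = Diff::LCS.sdiff(seq1, seq2)

diffHTML = ''

sdiff.each do |diff|
  case diff.action
  when '='
    # Unchanged sentence; re-append the '.' that split('.') removed.
    diffHTML << diff.new_element + "."
  when '!'
    # strip_tags is only needed on old_element; it removes premature end tags.
    # It is a Rails view helper, not a String method, so call it via the helper proxy.
    old_text = ActionController::Base.helpers.strip_tags(diff.old_element)
    diffHTML << "<del>#{old_text}</del> <add>#{diff.new_element}</add>. "
  end
end

# Mark the string as HTML-safe so Rails does not escape the tags
# (this assumes both bodies are trusted HTML).
@compareBody = diffHTML.html_safe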

...[format do block]

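Then just style <del> and <add> as you wish. Note that <add> is not a standard HTML element; <ins> is the standard counterpart to <del>. If you're looking for something easier, the diffy gem might be it, but this approach is very flexible once you figure it out.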
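
The loop above only handles the '=' and '!' actions, but sdiff can also report '+' (element only in the new sequence, old_element is nil) and '-' (element only in the old sequence, new_element is nil). Below is a rough sketch of a renderer that covers all four. To keep it runnable without the gem, a Struct named Change stands in for Diff::LCS::ContextChange, and render_diff is a hypothetical helper name. It uses the standard <ins> element rather than <add>:

```ruby
# Hypothetical stand-in for Diff::LCS::ContextChange so this sketch runs
# without the gem; in real use, pass the array returned by Diff::LCS.sdiff.
Change = Struct.new(:action, :old_element, :new_element)

# Render sdiff-style changes as HTML with <del>/<ins> markers.
# NOTE: render_diff is an illustrative name, not part of diff-lcs.
def render_diff(changes)
  changes.map do |c|
    case c.action
    when '=' then c.new_element                                       # unchanged
    when '!' then "<del>#{c.old_element}</del><ins>#{c.new_element}</ins>" # replaced
    when '-' then "<del>#{c.old_element}</del>"                       # removed
    when '+' then "<ins>#{c.new_element}</ins>"                       # added
    end
  end.join
end

changes = [
  Change.new('=', 'Paragraph one.', 'Paragraph one.'),
  Change.new('!', ' Sentence one.', ' Sentence two.'),
  Change.new('+', nil, ' A new sentence.'),
  Change.new('-', ' An old sentence.', nil)
]
html = render_diff(changes)
puts html
```

Building the pieces with map and join avoids mutating a string in place, and each action maps to exactly one line, which makes it easy to swap in different tags or CSS classes.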