How can I compute a difference between two strings

2020-02-25 08:41发布

I want to create a function in Delphi that computes different levels of two strings. If two strings are equal (ignoring case), then it should return 0, but if they are not equal, it should return the number of different characters. This function can be very useful for checking spelling.

function GetDiffStringLevel(S1,S2:string):Integer;
begin
  if SameText(S1,S2) then Exit(0);
  // i want get different chars count
end

samples code:

Diff:=GetDiffStringLevel('Hello','Hello');// Diff:=0;
Diff:=GetDiffStringLevel('Hello','2Hello');// Diff:=1;
Diff:=GetDiffStringLevel('Hello','H2ello');// Diff:=1;
Diff:=GetDiffStringLevel('Hello','Hello W');// Diff:=2;
Diff:=GetDiffStringLevel('Hello','World');// Diff:=6; or 5

2条回答
成全新的幸福
2楼-- · 2020-02-25 09:11

What you want is known as the Levenshtein distance (the minimum number of edits to transform one string into the other, where an edit is either a character insertion, character deletion or character substitution). The wikipedia site has a pseudo code implementation.

Delphi implementation:

function LevenshteinDistance(String1 : String; String2 : String) : Integer;

var
  Length1, Length2      : Integer;
  WorkMatrix            : array of array of Integer;
  I, J                  : Integer;
  Cost                  : Integer;
  Val1, Val2, Val3      : Integer;

begin
String1 := TCharacter.ToUpper (String1);
String2 := TCharacter.ToUpper (String2);
Length1 := Length (String1);
Length2 := Length (String2);
SetLength (WorkMatrix, Length1+1, Length2+1);
for I := 0 to Length1 do
  WorkMatrix [I, 0] := I;
for J := 0 to Length2 do
  WorkMatrix [0, J] := J;
for I := 1 to Length1 do
  for J := 1 to Length2 do
    begin
    if (String1 [I] = String2 [J]) then
      Cost := 0
    else
      Cost := 1;
    Val1 := WorkMatrix [I-1, J] + 1;
    Val2 := WorkMatrix [I, J-1] + 1;
    Val3 := WorkMatrix[I-1, J-1] + Cost;
    if (Val1 < Val2) then
      if (Val1 < Val3) then
        WorkMatrix [I, J] := Val1
      else
        WorkMatrix [I, J] := Val3
    else
      if (Val2 < Val3) then
        WorkMatrix [I, J] := Val2
      else
        WorkMatrix [I, J] := Val3;
    end;
Result := WorkMatrix [Length1, Length2];
end;
查看更多
我想做一个坏孩纸
3楼-- · 2020-02-25 09:36

Fast and compact implementation.

About 3 times as fast as smasher's implementation with normal strings. More than 100 times as fast if one of the strings is empty.

Smasher's function is case insensitive though, which can be useful as well.

function LevenshteinDistance(const s, t: string): integer;inline;
var
  d   : array of array of integer;
  n, m, i, j : integer;
begin
  n := length(s);
  m := length(t);
  if n = 0 then Exit(m);
  if m = 0 then Exit(n);

  SetLength(d, n + 1, m + 1);
  for i := 0 to n do d[i, 0] := i;
  for j := 0 to m do d[0, j] := j;

  for i := 1 to n do
    for j := 1 to m do
      d[i, j] := Min(Min(d[i-1, j]+1, d[i,j-1]+1), d[i-1,j-1]+Integer(s[i] <> t[j]));

  Result := d[n, m];
end;

Note: the inline directive reduces the execution time to less than 70% on my machine, but only for the win32 target platform. If you compile to 64bits (Delphi XE2), inlining actually makes it a tiny bit slower.

查看更多
登录 后发表回答