Why does this string have a reference count of 4?

Posted 2019-04-18 16:26

This is a very Delphi-specific question (maybe even Delphi 2007-specific). I am currently writing a simple StringPool class for interning strings. As a good little coder I also added unit tests and found something that baffled me.

This is the code for interning:

function TStringPool.Intern(const _s: string): string;
var
  Idx: Integer;
begin
  if FList.Find(_s, Idx) then
    Result := FList[Idx]
  else begin
    Result := _s;
    if FMakeStringsUnique then
      UniqueString(Result);
    FList.Add(Result);
  end;
end;

Nothing really fancy: FList is a sorted TStringList, so all the code does is look up the string in the list and, if it is already there, return the existing string. If it is not yet in the list, it first calls UniqueString (when FMakeStringsUnique is set) to ensure a reference count of 1 and then adds the string to the list. (I checked the reference count of Result and it is 3 after 'hallo' has been added twice, as expected.)

Now to the testing code:

procedure TestStringPool.TestUnique;
var
  s1: string;
  s2: string;
begin
  s1 := FPool.Intern('hallo');
  CheckEquals(2, GetStringReferenceCount(s1));
  s2 := s1;
  CheckEquals(3, GetStringReferenceCount(s1));
  CheckEquals(3, GetStringReferenceCount(s2));
  UniqueString(s2);
  CheckEquals(1, GetStringReferenceCount(s2));
  s2 := FPool.Intern(s2);
  CheckEquals(Integer(Pointer(s1)), Integer(Pointer(s2)));
  CheckEquals(3, GetStringReferenceCount(s2));
end;

This adds the string 'hallo' to the string pool twice, checks the string's reference count after each step, and verifies that s1 and s2 indeed point to the same string descriptor.

Every CheckEquals passes except the last one, which fails with the error "expected: <3> but was: <4>".

So, why is the reference count 4 here? I would have expected 3:

  • s1
  • s2
  • and another one in the StringList

This is Delphi 2007 and the strings are therefore AnsiStrings.

Oh yes, the function GetStringReferenceCount is implemented as:

function GetStringReferenceCount(const _s: AnsiString): integer;
var
  ptr: PLongWord;
begin
  ptr := Pointer(_s);
  if ptr = nil then begin
    // special case: Empty strings are represented by NIL pointers
    Result := MaxInt;
  end else begin
    // The string descriptor contains the following two longwords:
    // Offset -1: Length
    // Offset -2: Reference count
    Dec(Ptr, 2);
    Result := ptr^;
  end;
end;

In the debugger the same can be evaluated as:

plongword(integer(pointer(s2))-8)^

Just to add to the answer from Serg (which seems to be 100% correct):

If I replace

s2 := FPool.Intern(s2);

with

s3 := FPool.Intern(s2);
s2 := '';

and then check the reference count of s3 (and s1), it is 3 as expected. The phenomenon only occurs because the result of FPool.Intern(s2) is assigned back to s2: s2 is both the argument and the destination of the function result. In that case Delphi introduces a hidden string variable to receive the result, and that variable holds the extra reference.

Also, if I change the function to a procedure:

procedure TStringPool.Intern(var _s: string);

the reference count is 3 as expected because no hidden variable is required.


In case anybody is interested in this TStringPool implementation: It's open source under the MPL and available as part of dzlib, which in turn is part of dzchart:

https://sourceforge.net/p/dzlib/code/HEAD/tree/dzlib/trunk/src/u_dzStringPool.pas

But as said above: It's not exactly rocket science. ;-)

1 answer
chillily · 2019-04-18 17:30

Test this:

function RefCount(const _s: AnsiString): integer;
var
  ptr: PLongWord;
begin
  ptr := Pointer(_s);
  Dec(Ptr, 2);
  Result := ptr^;
end;

function Add(const S: string): string;
begin
  Result:= S;
end;

procedure TForm9.Button1Click(Sender: TObject);
var
  s1: string;
  s2: string;

begin
  s1:= 'Hello';
  UniqueString(s1);
  s2:= s1;
  ShowMessage(Format('%d', [RefCount(s1)]));   // 2
  s2:= Add(s1);
  ShowMessage(Format('%d', [RefCount(s1)]));   // 2
  s1:= Add(s1);
  ShowMessage(Format('%d', [RefCount(s1)]));   // 3
end;

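If you write s1:= Add(s1), the compiler creates a hidden local string variable to receive the function result. That variable holds an extra reference to the string until the routine exits, which is why the reference count is one higher than expected. You need not worry about it.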
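
The hidden variable can be made visible by writing out by hand roughly what the compiler is believed to do. The following is only a sketch under assumptions: AddLowered and Tmp are hypothetical names introduced for illustration, and the real code generated by the compiler may differ in detail. It relies on the fact that a string function result is passed as a hidden var parameter, and it reads the reference count 8 bytes before the character data, as the RefCount function in this thread does.

```pascal
program HiddenTempSketch;

{$APPTYPE CONSOLE}

// Reads the reference count stored two 32-bit words before the
// character data. Returns 0 for an empty (nil) string.
function RefCount(const S: string): Integer;
var
  P: PInteger;
begin
  P := Pointer(S);
  if P = nil then
    Result := 0
  else begin
    Dec(P, 2);
    Result := P^;
  end;
end;

// Hypothetical hand-written equivalent of
//   function Add(const S: string): string;
// with the function result turned into an explicit var parameter.
procedure AddLowered(const S: string; var AResult: string);
begin
  AResult := S;
end;

procedure Demo;
var
  s1: string;
  Tmp: string; // plays the role of the compiler's hidden variable (assumption)
begin
  s1 := 'Hello';
  UniqueString(s1);        // force a heap copy with its own reference count
  WriteLn(RefCount(s1));   // only s1 references the string here

  // Roughly what "s1 := Add(s1)" is assumed to turn into:
  AddLowered(s1, Tmp);     // Tmp now references the same string
  s1 := Tmp;               // s1 still points to it as well
  WriteLn(RefCount(s1));   // Tmp is still alive, so the count includes it
end;

begin
  Demo;
end.
```

Tmp is only finalized when Demo exits, just as the hidden variable is only released at the end of the calling routine, so every check made before that point sees the extra reference.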
