Everything I've read indicates that TRTTIContext is thread-safe.
However, TRTTIContext.FindType seems to fail (returns nil) occasionally when multithreading. Using a TCriticalSection around it fixes the issue. Note that I'm using XE6, and the issue doesn't seem to exist in XE. Edit: Seems to exist in all Delphi editions that have the new RTTI units.
I've worked up a test project you can use to see for yourself. Create a new VCL project, drop a TMemo and a TButton, replace unit1 with below, and assign the Form1.OnCreate, Form1.OnDestroy and Button1.OnClick events. The key CS is the GRTTIBlock in TTestThread.Execute. Currently disabled, I get between 3 and 5 failures when I run with 200 threads. Enabling the GRTTIBlock CS removes the failures.
unit Unit1;
interface
uses
Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants, System.Classes, Vcl.Graphics,
Vcl.Controls, Vcl.Forms, Vcl.Dialogs, Vcl.StdCtrls, SyncObjs, Contnrs, RTTI;
type
TTestThread = class(TThread)
private
FFailed: Boolean;
FRan: Boolean;
FId: Integer;
protected
procedure Execute; override;
public
property Failed: Boolean read FFailed;
property Ran: Boolean read FRan;
property Id: Integer read FId write FId;
end;
TForm1 = class(TForm)
Memo1: TMemo;
Button1: TButton;
procedure Button1Click(Sender: TObject);
procedure FormCreate(Sender: TObject);
procedure FormDestroy(Sender: TObject);
private
FThreadBlock: TCriticalSection;
FMaxThreadCount: Integer;
FThreadCount: Integer;
FRanCount: Integer;
FFailureCount: Integer;
procedure Log(AStr: String);
procedure ThreadFinished(Sender: TObject);
procedure LaunchThreads;
end;
var
Form1: TForm1;
implementation
var
GRTTIBlock: TCriticalSection;
{$R *.dfm}
{ TTestThread }
procedure TTestThread.Execute;
var
ctx : TRTTIContext;
begin
// GRTTIBlock.Acquire;
try
FFailed := not Assigned(ctx.FindType('Unit1.TForm1'));
FRan := True;
finally
// GRTTIBlock.Release;
end;
end;
{ TForm1 }
procedure TForm1.Button1Click(Sender: TObject);
begin
Randomize;
LaunchThreads;
Log(Format('Threads: %d, Ran: %d, Failures: %d',
[FMaxThreadCount, FRanCount, FFailureCount]));
end;
procedure TForm1.FormCreate(Sender: TObject);
begin
FThreadBlock := TCriticalSection.Create;
end;
procedure TForm1.FormDestroy(Sender: TObject);
begin
FThreadBlock.Free;
end;
procedure TForm1.Log(AStr: String);
begin
Memo1.Lines.Add(AStr);
end;
procedure TForm1.ThreadFinished(Sender: TObject);
var
tt : TTestThread;
begin
tt := TTestThread(Sender);
Log(Format('Thread %d finished', [tt.Id]));
FThreadBlock.Acquire;
try
Dec(FThreadCount);
if tt.Failed then
Inc(FFailureCount);
if tt.Ran then
Inc(FRanCount);
finally
FThreadBlock.Release;
end;
end;
procedure TForm1.LaunchThreads;
var
c : Integer;
ol : TObjectList;
t : TTestThread;
begin
FRanCount := 0;
FFailureCount := 0;
FMaxThreadCount := 200;
ol := TObjectList.Create(False);
try
// get all the thread objects created and ready
for c := 1 to FMaxThreadCount do
begin
t := TTestThread.Create(True);
t.FreeOnTerminate := True;
t.OnTerminate := ThreadFinished;
t.Id := c;
ol.Add(t);
end;
FThreadCount := FMaxThreadCount;
// start them all up
for c := 0 to ol.Count - 1 do
begin
TTestThread(ol[c]).Start;
Log(Format('Thread %d started', [TTestThread(ol[c]).Id]));
end;
repeat
Application.ProcessMessages;
FThreadBlock.Acquire;
try
if FThreadCount <= 0 then
Break;
finally
FThreadBlock.Release;
end;
until False;
finally
ol.Free;
end;
end;
initialization
GRTTIBlock := TCriticalSection.Create;
finalization
GRTTIBlock.Free;
end.
I think I found the problem. It is inside
TRealPackage.FindType
andMakeTypeLookupTable
.MakeTypeLookupTable
checks forFNameToType
being assigned. If not it runsDoMake
. This one is protected with TMonitor and checksFNameToType
being assigned again after entering.So far so good. But then happens the mistake as inside
DoMake
FNameToType
gets assigned causing other threads to happily passMakeTypeLookupTable
and return toFindType
which then does return false inFNameToType.TryGetValue
and returns nil.Fix (hopefully for XE8?):
Since
FNameToType
is used outside of the lockedDoMake
as indicator that execution can continue it should not be assigned insideDoMake
until it's properly filled up.Edit: Reported as https://quality.embarcadero.com/browse/RSP-9815
As Stefan explains, the problem is down to a faulty implementation of the double checked locking pattern. I'd like to expand on his answer and try to make it more clear what is wrong.
The erroneous code looks like this:
The fault is that the code that populates the shared resource
FNameToType
is executed afterFNameToType
has been assigned. That code which populates the shared resource needs to execute beforeFNameToType
is assigned.Consider two threads, A and B. They are the first threads to call
MakeTypeLookupTable
. Thread A arrives first, finds thatFNameToType
isnil
and callsDoMake
. Thread A acquires the lock and reaches the code that assignsFNameToType
. Now, before thread A manages to run any more code, thread B arrives inMakeTypeLookupTable
. It testsFNameToType
and finds that it is notnil
, and so returns immediately. The calling code then makes use ofFNameToType
. However,FNameToType
is not yet in a fit state to use. It has not been populated because thread A has not yet returned.The most obvious fix from Embarcadero's side looks like this:
However, take note of the comment that says presumes double-checked locking ok. Well, double checked locking is fine when the machine has a strong enough memory model. So it's all good on x86 and x64. But ARM has a relatively weak memory model. So I have strong doubts as to whether or not this fix is enough on ARM. Indeed I do wonder where else in the RTL that Embarcadero have used double checked locking.
If
TRealPackage
had been declared in the interface section of the code then it would be easy enough to patchTRealPackage.MakeTypeLookupTable
to apply the change above. However, that is not the case. So in order to apply a work around I suggest the following:TRealPackage.MakeTypeLookupTable
. Because initialization happens single threaded you avoid the race condition.Declare the global context like this, say:
And force the call to
TRealPackage.MakeTypeLookupTable
like this:So long as all your RTTI code goes via this single shared context then you cannot fall foul of this race.