I need to split a string to a TStringList with fixed-length sub-strings.
Currently I use:
procedure StrToStringList(ASource: string; AList: TStrings; AFixedLen: Integer);
begin
Assert(Assigned(AList));
while Length(ASource) > AFixedLen do
begin
AList.Add(LeftStr(ASource, AFixedLen));
Delete(ASource, 1, AFixedLen);
end;
AList.Add(ASource);
end;
This works, but seems to be slow. Any better / faster idea?
Edited: Profiling of the results:
The speed gain is quite impressive.
Here are the results of my (subjective) profiling.
Data size: 290KB, FixedLen: 100:
- Original code: 58 ms
- Heffernan: 1 ms
- Deltics: 1 ms
Data size: 2805KB, FixedLen: 100:
- Original code: 5803 ms
- Heffernan: 5 ms
- Deltics: 4 ms
There are some immediately obvious optimisations possible in your code:
Do not modify the source string, simply extract the required
substrings. You can then make the input string parameter const,
allowing the compiler to further optimise calls made to the procedure
Since you are dealing with fixed lengths and an input string of
known length, then you can pre-calculate the required capacity of
the string list and avoid memory re-allocations as the list is added
to.
Here is how I would go about this:
procedure StrToStringList(const aSource: String;
const aList: TStrings;
const aFixedLen: Integer);
var
idx: Integer;
srcLen: Integer;
begin
aList.Capacity := (Length(aSource) div aFixedLen) + 1;
idx := 1;
srcLen := Length(aSource);
while idx <= srcLen do
begin
aList.Add(Copy(aSource, idx, aFixedLen));
Inc(idx, aFixedLen);
end;
end;
The capacity calculation here may result in an excess capacity of 1 where the input string divides exactly by the fixed length, but this is a negligible overhead imho and acceptable where the goal is optimal performance (the alternative is a conditional branch to modify or calculate the capacity differently to cater for what is likely to be the minority of cases).
I think it is wasteful to modify the input string. Avoid this like this:
var
Remaining: Integer;
StartIndex: Integer;
begin
Remaining := Length(ASource);
StartIndex := 1;
while Remaining > 0 do
begin
AList.Add(Copy(ASource, StartIndex, AFixedLen));
inc(StartIndex, AFixedLen);
dec(Remaining, AFixedLen);
end;
end;
This will reduce the amount of heap allocation, and the copying.
However, I would not be surprised if you observed little gain in performance. In order to perform any serious optimisation we'd likely need to see some example input data.
Don't use delete
inside the loop. It causes the whole string to move. Use an integer
based index variable starting at 1 and increase it each time by AFixedLen
after you have used copy
to extract the substring until you reached the end.