Similar questions have been asked here, but none fits my need. I made test cases to see which approach is faster, but I feel my LINQ code is still slow. How do I construct the LINQ query for faster performance?
Others said using a double .ToList() can cause a slower operation, but when I tested it, it turned out to be faster than any of the other tests.
Test:
Preparation
---------------------------------------------------------------
UserInfo(userinf, true) sets the following properties:
UserID = userinf.UserID;
FirstName = userinf.user.FirstName;
MiddleName = userinf.user.MiddleName;
LastName = userinf.user.LastName;
LoginID = userinf.user.LoginID;
Birthday = userinf.Birthday;
skip = 0;
take = 100;
total table records = 304;
LINQ to Entity Framework
Fiddler: v2.4.0.0
https://127.0.0.1/..../RetrieveUserInfo?skip=0&take=100
{
"client":{
"SessionID":"5433ab64-7e0d-444f-b886-a901ea9a0601"
},
"session":{
"SessionID":"35b75daa-25ad-45a4-9f99-0e69ec3b66a4"
}
}
//Test 1
//1) 00:00:15.3068755 -- Attempt1
//2) 00:00:13.8207905 -- Attempt2
//3) 00:00:16.2489294 -- Attempt3
var list = (from usr in dbase.userinfoes
select usr).OrderBy(i => i.UserID).Skip(skip).Take(take).ToList();
userlist = (from i in list
select new UserInfo(i, true)).ToList();
///Test 2
//1) 00:00:15.3908803
//2) 00:00:14.8818512
//3) 00:00:19.4761140
var list = (from usr in dbase.userinfoes.AsEnumerable().OrderBy(i => i.UserID).Skip(skip).Take(take).ToList()
select new UserInfo(usr, true)).ToList();
//Test 3
//1) 00:00:30.1937270
//2) 00:00:24.1003784
//3) 00:00:28.8806519
var list = dbase.userinfoes.OrderBy(i => i.UserID).Skip(skip).Take(take).ToList();
userlist = (from i in list select new UserInfo(i, true)).ToList();
//Test 4
//1) 00:00:57.2652754
//2) 00:00:54.4051118
//3) 00:00:55.3251644
var list = (from usr in dbase.userinfoes
select usr).ToList();
userlist = (from i in list
select new UserInfo(i, true)).OrderBy(i => i.UserID).Skip(skip).Take(take).ToList();
//Test 5
//1) 00:01:06.8378229
//2) 00:01:01.2845053
//3) 00:00:55.0721499
var list = from usr in dbase.userinfoes
select usr;
userlist = (from i in list.AsEnumerable()
select new UserInfo(i, true)).OrderBy(i => i.UserID).Skip(skip).Take(take).ToList();
// Test 6
// VERY LONG. It takes all records first and constructs UserInfo one by one before doing the skip and take
var list = (from usr in dbase.userinfoes.AsEnumerable()
select new UserInfo(usr, true)).OrderBy(i => i.UserID).Skip(skip).Take(take).ToList();
//Test 7
// VERY LONG. It takes all records first and constructs UserInfo one by one before doing the skip and take
var list = from usr in dbase.userinfoes.AsEnumerable()
select new UserInfo(usr);
Proper code for faster retrieval. Thanks to casperOne
for pointing out that it is much faster when the ordering, skip, and take are all performed on the server.
Here is the final code:
var list = (from usr in dbase.userinfoes
.OrderBy(i => i.UserID)
.Skip(skip)
.Take(take)
.AsEnumerable()
select new UserInfo(usr, true)).ToList();
1) 00:00:10.9210513
2) 00:00:10.8270973
3) 00:00:10.8250151
Thanks to Richard Neil Ilagan
for the final code.
Here's why each one is performing as it is and why you're seeing what you're seeing:
Test 1:
//Test 1
//1) 00:00:15.3068755 -- Attempt1
//2) 00:00:13.8207905 -- Attempt2
//3) 00:00:16.2489294 -- Attempt3
var list = (from usr in dbase.userinfoes
select usr).OrderBy(i => i.UserID).Skip(skip).Take(take).ToList();
userlist = (from i in list
select new UserInfo(i, true)).ToList();
This is absolutely the fastest. It's the fastest because the ordering, skip, and take are all performed on the server. Because you probably have indexes, the server is beefy (powerful), etc., it can handle these operations much faster than if you materialized the whole set on the client and then performed the operations there.
UserInfo is constructed only on the post-processed list.
Test 2:
///Test 2
//1) 00:00:15.3908803
//2) 00:00:14.8818512
//3) 00:00:19.4761140
var list = (
from usr in dbase.userinfoes.AsEnumerable().
OrderBy(i => i.UserID).Skip(skip).Take(take).ToList()
select new UserInfo(usr, true)
).ToList();
This should have a performance impact similar to test 7; the call to AsEnumerable
forces all subsequent operations to be performed in memory (and the call to OrderBy
requires you to materialize all of the instances before they're ordered).
This is a bit of an anomaly. I'd be curious to see what SQL is sent to the server (assuming you're using SQL Server or some other SQL-based back end), to make sure that it's selecting all the records back.
UserInfo is constructed only on the post-processed list.
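One way to check what is actually sent (a sketch; it assumes Entity Framework's DbContext API, where calling ToString() on a query returns the generated SQL — with the older ObjectContext API the equivalent is ObjectQuery&lt;T&gt;.ToTraceString()):

```csharp
// AsEnumerable() cuts SQL translation off: only the part of the query
// *before* it reaches the server, so test 2 issues a bare
// "SELECT ... FROM userinfoes" and does the ordering/paging in memory.
Console.WriteLine(dbase.userinfoes.ToString());

// Keeping OrderBy/Skip/Take on the IQueryable pushes them into the SQL;
// an ORDER BY plus ROW_NUMBER/OFFSET paging should appear in the output:
var serverPaged = dbase.userinfoes.OrderBy(i => i.UserID).Skip(skip).Take(take);
Console.WriteLine(serverPaged.ToString());
```

If the first statement prints an unfiltered SELECT, that confirms all 304 rows are coming back over the wire before the skip and take run.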
Test 3:
//Test 3
//1) 00:00:30.1937270
//2) 00:00:24.1003784
//3) 00:00:28.8806519
var list = dbase.userinfoes.OrderBy(i => i.UserID).
Skip(skip).Take(take).ToList();
userlist = (from i in list select new UserInfo(i, true)).ToList();
Again, the order by, skip, and take are taking place on the server. You're materializing the list twice (you have two calls to ToList), which is the only explanation for the overhead that I can see.
UserInfo is constructed only on the post-processed list.
Test 4:
//Test 4
//1) 00:00:57.2652754
//2) 00:00:54.4051118
//3) 00:00:55.3251644
var list = (from usr in dbase.userinfoes
select usr).ToList();
userlist = (from i in list select new UserInfo(i, true)).
OrderBy(i => i.UserID).Skip(skip).Take(take).ToList();
You're materializing the entire list in memory now, so there's more overhead.
UserInfo is constructed on the pre-processed list.
Test 5:
//Test 5
//1) 00:01:06.8378229
//2) 00:01:01.2845053
//3) 00:00:55.0721499
var list = from usr in dbase.userinfoes
select usr;
userlist = (from i in list.AsEnumerable()
select new UserInfo(i, true)).
OrderBy(i => i.UserID).Skip(skip).Take(take).ToList();
Same as test 2: you're doing all the operations on the client side.
UserInfo is constructed on the pre-processed list.
Test 6:
// Test 6
// VERY LONG. It takes all records first and constructs
// UserInfo one by one before doing the skip and take
var list = (from usr in dbase.userinfoes.AsEnumerable()
select new UserInfo(usr, true)).
OrderBy(i => i.UserID).Skip(skip).Take(take).ToList();
UserInfo is constructed on the pre-processed list.
Again, doing all the operations on the client side.
// Test 7
// VERY LONG. It takes all records first and constructs
// UserInfo one by one before doing the skip and take
var list = from usr in dbase.userinfoes.AsEnumerable()
select new UserInfo(usr);
Again, doing all the operations on the client side.
UserInfo is constructed on the pre-processed list.
There is one difference I notice in all of these tests, and that's where you call the constructor for the UserInfo
instance. In the places where the performance is good, you put off constructing the UserInfo
instance as late as possible (after you perform the order, skip, and take operations), whereas when the performance is bad, you construct the UserInfo
instance up front, before those operations take place (when there are usually many more calls to the UserInfo
constructor).
That said, it would seem that your performance issues could lie within the constructor for the UserInfo
class, and not in LINQ. When you let the IQueryable<T>
provider perform the operations on the underlying data source, it's generally faster than doing those operations in memory on the client.
Without seeing the constructor code though, it's impossible to tell, but your numbers certainly suggest that the problem lies there, and not in LINQ.
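One way to check is to time the constructor calls by themselves (a sketch using the names from the question, with System.Diagnostics.Stopwatch):

```csharp
using System.Diagnostics;

// Page on the server first, so only `take` rows come back:
var rows = dbase.userinfoes.OrderBy(i => i.UserID).Skip(skip).Take(take).ToList();

// Time the UserInfo construction in isolation. If this accounts for most of
// the elapsed time, the constructor is the cost, not the LINQ query. A likely
// suspect: reading userinf.user.FirstName etc. triggers a lazy load of the
// related user row, i.e. one extra database round trip per UserInfo
// (assuming lazy loading is enabled on the context).
var sw = Stopwatch.StartNew();
var userlist = rows.Select(i => new UserInfo(i, true)).ToList();
sw.Stop();
Console.WriteLine(sw.Elapsed);
```

If lazy loading does turn out to be the culprit, eager-loading the related row with Include (e.g. dbase.userinfoes.Include("user") before the OrderBy) collapses those per-row queries into the initial one.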