Our application has a process which fetches all users from Active Directory and updates the relevant SQL tables with their information. The process runs at night and was written a few years ago, so it's legacy code which works, and "if it ain't broke, don't fix it". However, we're introducing a new feature to our application which requires modifications to this code, and since it hasn't been touched for years I thought I might as well clean it up a little.
Said process runs ONLY during the night except for rare server faults, in which case we have to run it manually during the day. The process uses the good old System.DirectoryServices library to do its job, and although it works, it runs quite slowly.
I thought about using the newer System.DirectoryServices.AccountManagement library instead, so I started rewriting the whole process (a few hundred lines of code) and I was amazed to see that PrincipalSearcher dramatically outperforms DirectorySearcher.
I've been trying to look for the reason and came upon the following SO answer which gives a comparison between the two, stating that DirectorySearcher should be faster than PrincipalSearcher.
I fired up a test project to make sure I was not hallucinating:
using System;
using System.DirectoryServices;
using System.DirectoryServices.AccountManagement;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        // New stuff
        var context = new PrincipalContext(ContextType.Domain, "mydomain.com");
        var properties = new[] { "cn", "name", "distinguishedname", "surname", "title", "displayname" };
        var i = 0;
        var now = DateTime.Now;

        // Progress reporter; marked as a background thread so the process can exit when Main returns
        new Thread(delegate()
        {
            while (true)
            {
                Console.Write("\r{0} ms, {1} results", (DateTime.Now - now).TotalMilliseconds, i);
                Thread.Sleep(1000);
            }
        }) { IsBackground = true }.Start();

        using (var searcher = new PrincipalSearcher(new UserPrincipal(context)))
        {
            var underlying = searcher.GetUnderlyingSearcher() as DirectorySearcher;
            underlying.PageSize = 1000;
            underlying.PropertiesToLoad.Clear();
            underlying.PropertiesToLoad.AddRange(properties);
            underlying.CacheResults = false;
            using (var results = searcher.FindAll())
            {
                foreach (var result in results)
                {
                    i++;
                }
            }
        }
        Console.WriteLine("It took {0}", (DateTime.Now - now).TotalMilliseconds);

        now = DateTime.Now;
        i = 0;

        // Old stuff
        var root = new DirectoryEntry("LDAP://DC=mydomain,DC=com");
        var filter = "(&(objectCategory=user)(objectClass=user))";
        using (var searcher = new DirectorySearcher(root, filter, properties))
        {
            searcher.PageSize = 1000;
            searcher.CacheResults = false;
            using (var results = searcher.FindAll())
            {
                foreach (var result in results)
                {
                    i++;
                }
            }
        }
        Console.WriteLine("It took {0}", (DateTime.Now - now).TotalMilliseconds);
    }
}
Querying some thousands of users, the results were around 0.9 ms per user with PrincipalSearcher (around 30 seconds for ~34k users) and around 5.2 ms per user with DirectorySearcher (around 2 minutes and 30 seconds for ~34k users), making PrincipalSearcher almost six times faster.
I tried debugging and comparing the PrincipalSearcher's underlying DirectorySearcher with the one I created, and they seemed pretty much the same.
I looked further and it seems that if I use the search root from the PrincipalSearcher's underlying searcher, then the DirectorySearcher I create actually outperforms the PrincipalSearcher:
// ...
DirectoryEntry psRoot;
using (var searcher = new PrincipalSearcher(new UserPrincipal(context)))
{
    var underlying = searcher.GetUnderlyingSearcher() as DirectorySearcher;
    psRoot = underlying.SearchRoot;
    // ...
}
// ...
using (var searcher = new DirectorySearcher(psRoot, filter, properties))
{
    // ...
}
While debugging I found out that the search roots are largely the same - i.e., they represent the same domain.
What could cause the search speed to slow down like this?
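Two roots that "represent the same domain" can still be bound differently, and one way to see that is to compare their Path strings. The sketch below is a hypothetical helper (AdsPathInfo is not part of System.DirectoryServices) that splits a path of the form LDAP://host/DN into its host and distinguished-name parts. It assumes a plain LDAP:// prefix and does not handle escaped slashes inside the DN.

```csharp
using System;

// Hypothetical helper, not a System.DirectoryServices API: splits an ADsPath
// such as "LDAP://host/DC=a,DC=b" into its host and distinguished-name parts.
public static class AdsPathInfo
{
    public static (string Host, string DistinguishedName) Split(string adsPath)
    {
        const string prefix = "LDAP://";
        if (adsPath == null || !adsPath.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            throw new ArgumentException("Expected a path starting with LDAP://", nameof(adsPath));

        var rest = adsPath.Substring(prefix.Length);
        var slash = rest.IndexOf('/');
        if (slash >= 0)
        {
            // Server-bound form: LDAP://host/DN
            return (rest.Substring(0, slash), rest.Substring(slash + 1));
        }

        // No slash: treat it as a serverless DN if it contains '=', otherwise as a bare host.
        return rest.Contains("=") ? ("", rest) : (rest, "");
    }
}
```

Calling AdsPathInfo.Split(root.Path) and AdsPathInfo.Split(psRoot.Path) on the two roots from the test code would show whether one of them names a host explicitly while the other leaves it empty.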
While writing this question I was tinkering with the test code and managed to find the issue. By providing the domain address when constructing the root DirectoryEntry, the search with DirectorySearcher outperformed that of PrincipalSearcher. I'm not exactly sure why (perhaps it has something to do with where the searcher looks for the results), but it definitely boosted the search speed.
Take a look at my question and answer on the differences between the two methods.
PrincipalSearcher is merely a wrapper around DirectorySearcher. It was designed to make it easier to work with Active Directory while providing some automated speed enhancements. DirectorySearcher can be much faster than PrincipalSearcher, but it requires a bit more work.
The main reason you're seeing slow behavior from your "old stuff" code is that when you used PrincipalSearcher in "new stuff", you got the underlying DirectorySearcher and fed its PropertiesToLoad collection. You did not do that in your "old stuff" code. As a result, your "old stuff" code was pulling every AD attribute for matched results (i.e. significantly more data being transferred), while your implementation using PrincipalSearcher was only reading 6 attributes.
Doing that when using PrincipalSearcher is also generally not necessary, as it handles caching and picking of attributes on its own. Really, when working with PrincipalSearcher, the only time you need to get the underlying DirectorySearcher is to set the PageSize, since PrincipalSearcher doesn't provide a standard way to set it.
I suspect the reason you saw an improvement when specifying the domain is that it didn't have to do any work to figure out the domain name. You unfairly gave "new stuff" a head start in that regard because you made the PrincipalContext before you started the clock, so to speak.
Some other things I noticed in your code that would actually skew the timing in the opposite direction are that in "new stuff" you don't do any filtering, and that the initialization of your delegated thread to show the progress occurs AFTER you recorded the start time.
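To keep setup costs from landing on one side of the comparison only, each search could be timed by its own Stopwatch, with everything it needs built inside the timed delegate. The following is a minimal sketch of that idea, not the asker's code: SearchTimer, Measure and MillisecondsPerResult are hypothetical names, and the delegate is assumed to run one complete search and return the number of results it enumerated.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper: measures one search run in isolation.
public static class SearchTimer
{
    // Runs 'search' once and returns the count it reported together with the elapsed time.
    public static (int Count, TimeSpan Elapsed) Measure(Func<int> search)
    {
        if (search == null) throw new ArgumentNullException(nameof(search));

        var stopwatch = Stopwatch.StartNew();
        int count = search();
        stopwatch.Stop();
        return (count, stopwatch.Elapsed);
    }

    // Average cost per result in milliseconds; returns 0 when nothing was found.
    public static double MillisecondsPerResult(int count, TimeSpan elapsed)
    {
        return count == 0 ? 0.0 : elapsed.TotalMilliseconds / count;
    }
}
```

With this shape, the "new stuff" delegate would create its PrincipalContext and PrincipalSearcher inside the lambda, and the "old stuff" delegate would create its DirectoryEntry and DirectorySearcher the same way, so both pay for their own setup. Any progress reporting would be started before the call to Measure rather than after the clock has started.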