We have a benchmarking program for our reporting system that generates a specific report 2,000 times.
We have tried to reduce the measurement to just the time spent in our engine, so:
- The template is read from disk once into memory, then that copy in memory is passed to the engine.
- File-based datasources (XML, JSON) are also read into memory once and then that cached datasource is reused.
- The SQL database is SQL Server on the same machine. So there is I/O, but it should be fast.
- The OData datasource is on a different server so this has I/O that can take a bit of time.
Now here's where it gets interesting. Our engine is written in Java and we use IKVM to create a .NET version of it. IKVM creates DLLs that are true .NET CLR bytecode. It's as though we wrote the same code in C#. So fundamentally we have the same code on Java and C#.
And the JSON (JPath) Java code is 100% the .NET JSON code via IKVM. The XML (XPath) Java code is 98% the same on the .NET side. The SQL code differs in the connector classes (JDBC vs. ADO.NET), but everything outside of those connectors is identical. Same for OData.
So to sum up: same code, with XML/JSON totally CPU bound, SQL a bit I/O bound, and OData a bit more I/O bound.
Here are the results under Java. Total throughput goes up by pretty much 50% each time the number of threads doubles. (Each report generates in a single thread; what we're measuring is the total pages across all reports across all threads.)
This is on a 4-core system that, with hyper-threading, appears as 8 cores to the program.
[Chart: Java Engine Output (Pages Per Second)]
But in .NET the results are not as expected. XML and JSON show a small improvement going from 2 to 4 threads and nothing after that. SQL shows a decent improvement from 2 to 4 threads and again nothing after that. Only OData shows the expected improvement.
Our code to run this under multiple threads is:
ReportWorker[] workers = new ReportWorker[numThreads];
for (int ind = 0; ind < numThreads; ind++)
    workers[ind] = new ReportWorker(ind, new CommandLine(cmdLine));
System.Threading.Thread[] threads = new System.Threading.Thread[numThreads];
for (int ind = 0; ind < numThreads; ind++)
    threads[ind] = new System.Threading.Thread(workers[ind].DoWork);
for (int ind = 0; ind < numThreads; ind++)
    threads[ind].Name = "Report Worker " + ind;
for (int ind = 0; ind < numThreads; ind++)
    threads[ind].Start();
// we wait
for (int ind = 0; ind < numThreads; ind++)
    threads[ind].Join();
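For comparison, a Java harness with the same shape might look like the sketch below. This is not our actual benchmark code: `ThroughputHarness`, `generateReport`, the fixed count of 10 pages per report, and the CPU-burning loop are all hypothetical stand-ins for the real engine call. It only illustrates the start/join pattern and how pages per second is totalled across all threads.

```java
import java.util.concurrent.atomic.AtomicLong;

class ThroughputHarness {
    // Assumption for illustration: every report yields exactly 10 pages.
    static final int PAGES_PER_REPORT = 10;

    // Written to after each report so the JIT cannot discard the busy loop.
    static volatile long sink;

    // Hypothetical stand-in for one CPU-bound report run; returns its page count.
    static int generateReport(int seed) {
        long acc = seed;
        for (int i = 0; i < 200_000; i++) {
            acc = acc * 6364136223846793005L + 1442695040888963407L;
        }
        sink = acc;
        return PAGES_PER_REPORT;
    }

    // Starts numThreads workers, waits for all of them, returns total pages produced.
    static long runWorkers(int numThreads, int reportsPerThread) {
        AtomicLong totalPages = new AtomicLong();
        Thread[] threads = new Thread[numThreads];
        for (int t = 0; t < numThreads; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                long pages = 0;
                for (int r = 0; r < reportsPerThread; r++) {
                    pages += generateReport(id * 31 + r);
                }
                totalPages.addAndGet(pages);
            }, "Report Worker " + t);
        }
        for (Thread th : threads) {
            th.start();
        }
        try {
            for (Thread th : threads) {
                th.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return totalPages.get();
    }

    // Total pages divided by wall-clock seconds.
    static double pagesPerSecond(long pages, long elapsedNanos) {
        return pages / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        System.out.println("Logical processors: " + Runtime.getRuntime().availableProcessors());
        for (int n = 1; n <= 8; n *= 2) {
            long start = System.nanoTime();
            long pages = runWorkers(n, 50);
            long elapsed = System.nanoTime() - start;
            System.out.printf("%d threads: %d pages, %.0f pages/sec%n",
                    n, pages, pagesPerSecond(pages, elapsed));
        }
    }
}
```

Because each worker does a fixed amount of work, the total page count grows with the thread count, so pages per second should climb as threads are added if the threads really do run in parallel.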
When I run the .NET version with 4 threads, all 8 CPUs (according to the Windows performance monitor) are sitting between 50% and 80%. When running 8 threads, all 8 CPUs are running at 80% to 95%.
Is there something in the C# threading code above that would cause these results? This is a command line app. Is there some reason it might not run the threads simultaneously?
Any suggestions about what to change or investigate are appreciated.