SQL Server CLR Threading

Posted 2019-02-10 06:25

Question:

I have been struggling with a SQL Server CLR stored procedure.

Background:

We are using SQL Server 2014 and a CLR stored procedure has been implemented which calls a customer's web service.

Threading was originally introduced so that the web service call would not slow down the main SQL Server CLR thread.

Although I now know that using threading under SQLCLR is not the best idea, it worked correctly for 6 years (since SQL Server 2008). It was recently migrated to SQL Server 2014.

The problem

On my development machine, as well as on the test system, the solution works without problems.

On the customer system, the thread, which calls the web service, is never executed for some reason.

I can see from the log files that everything works correctly up to the point where the thread should start executing.

There is no specific error, nothing.

We have tried changing the permissions, but without success. Therefore I think it's not a permission issue.

Questions

  1. Does anyone know how to change the behavior? We couldn't find any configuration that might do the trick.

  2. Would it be a good idea to remove the threading completely and call the web service directly on the SQL Server main thread?

Thank you for any advice, Petr

Answer 1:

Not sure about Question #1, though it might not matter given the recommendation for Question #2. Still, one difference between SQL Server 2008 (where it is working) and SQL Server 2014 (where it is not working) is the CLR version that SQL Server is linked to. SQL Server 2005 / 2008 / 2008 R2 are linked to CLR v2.0, while SQL Server 2012 and newer are linked to CLR v4.0. Since you are not seeing the error and your client is, I would make sure that their system has been updated to the same .NET Framework version that you are running.
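One way to compare the two environments is to ask each instance which CLR it is hosting. The queries below are only a diagnostic sketch to run on both the working and the failing server; they require VIEW SERVER STATE permission, and the rows returned vary by instance.

```sql
-- Run on both the development instance and the customer instance, then compare.
-- The 'version' row reports the CLR version hosted by this instance.
SELECT [name], [value]
FROM sys.dm_clr_properties;

-- Confirm that CLR integration is enabled on the instance.
SELECT [name], [value_in_use]
FROM sys.configurations
WHERE [name] = N'clr enabled';

-- SQL Server build, for reference when comparing environments.
SELECT @@VERSION AS [sql_server_version];
```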

For Question #2, I would recommend removing the multi-threading. That has too much potential for problems, and requires the Assembly to be UNSAFE. If you remove the threading, you can set the Assembly to EXTERNAL_ACCESS.
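As a rough sketch of that change, the current permission set can be checked in sys.assemblies and then lowered with ALTER ASSEMBLY. The assembly name WebServiceCaller is hypothetical (the question does not give the real one), and the statement is intended for use only after the threading code has been removed.

```sql
-- List user-defined assemblies and their current permission sets.
SELECT [name], [permission_set_desc]
FROM sys.assemblies
WHERE [is_user_defined] = 1;

-- Hypothetical assembly name; substitute the real one.
-- Intended to be run after the threading code has been removed.
ALTER ASSEMBLY [WebServiceCaller]
WITH PERMISSION_SET = EXTERNAL_ACCESS;
```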

If you want to reduce contention then, assuming the Web Service calls are to the same URI, you need to increase the number of allowed concurrent web requests. That can be done by setting the ServicePointManager.DefaultConnectionLimit property. The default value is 2, which means that any additional requests will wait until one of the current 2 is closed.

Also, be sure to properly dispose of the WebResponse (the WebRequest itself is not disposable).


The concern about making external calls (i.e. the Web Service) that can potentially not complete quickly is that SQL Server uses Cooperative Multitasking wherein each thread is responsible for "yielding" control back to the Scheduler (effectively pausing it) at various points so that the Scheduler can shuffle things around and run other things that are currently "sleeping". This concern with regards to SQLCLR code can typically be mitigated by doing at least one of the following:

  • Performing data access / querying the instance
  • Calling Thread.Sleep(0);

However, an external call is not doing data access, and you cannot easily call Thread.Sleep(0) while waiting for the WebResponse to complete. Yes, you can call the Web Service on a separate thread and, while waiting for it to finish (assuming you are just looping and checking), a call to Thread.Sleep(x) in that loop will allow for the yield.

But is doing the Web Service call asynchronously necessary? It certainly has the downside of requiring the Assembly to be marked as WITH PERMISSION_SET = UNSAFE. It greatly depends on how long the call usually takes, and how frequently it is being called. The more frequent the call, the more likely it is that any delays are, at least in part, caused by the low default value for how many concurrent connections are allowed per URI. This relates to the recommendation I made at the top about ServicePointManager.DefaultConnectionLimit.

But if you want to see how SQL Server actually works, this should be fairly easy to test. On my laptop, I went to the Server "Properties" in Object Explorer, went to "Processors", unchecked the "automatically set processor affinity..." option, selected only a single CPU under "Processor Affinity" in the tree view in the middle of the dialog, clicked "OK", and then restarted the service. I then set up a web page that did nothing but call "sleep" for 60 seconds. I have a SQLCLR TVF that calls web pages so I ran that concurrently in two different tabs / sessions. In a 3rd tab / session, I ran:

SELECT SUM(so1.[schema_id]), so1.[type_desc], so2.[type_desc]
FROM sys.objects so1
CROSS JOIN sys.objects so2
CROSS JOIN sys.objects so3
CROSS JOIN sys.objects so4
CROSS JOIN sys.objects so5
WHERE so3.[create_date] <> so4.[modify_date]
GROUP BY so1.[type_desc], so2.[type_desc], so5.[name]
ORDER BY so2.[type_desc], so5.[name] DESC;

And finally, in a 4th tab, after kicking off the first 3, I ran the following to monitor the system:

SELECT * FROM sys.dm_os_schedulers WHERE [scheduler_id] = 0;

SELECT *
FROM sys.dm_exec_requests
WHERE [scheduler_id] = 0
AND [status] <> N'background'
ORDER BY [status] DESC, session_id;
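The two monitoring queries above return every column. A narrower variant, limited to the columns most relevant to scheduling, might look like the following. This is a sketch and not the query used in the test; the column selection is my own.

```sql
-- Scheduler-level view: tasks assigned to the scheduler vs. tasks waiting for CPU.
SELECT [scheduler_id], [cpu_id], [status], [is_idle],
       [current_tasks_count], [runnable_tasks_count],
       [preemptive_switches_count], [yield_count]
FROM sys.dm_os_schedulers
WHERE [scheduler_id] = 0;

-- Request-level view: state and wait information for each session on that scheduler.
SELECT [session_id], [status], [command], [wait_type], [last_wait_type],
       [wait_time], [cpu_time], [total_elapsed_time]
FROM sys.dm_exec_requests
WHERE [scheduler_id] = 0
AND [status] <> N'background'
ORDER BY [status] DESC, [session_id];
```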

The status for the 2 sessions running the SQLCLR function was always "running" and the status for the session running that ugly query in tab 3 was always "runnable". But just to be sure, running that ugly query again, when neither of the SQLCLR functions was executing, took the same 1 minute and 14 seconds that it did when running concurrently with the 2 sessions running the SQLCLR call to the web page that was sleeping for 60 seconds.

Please do not infer that there is no cost to running the SQLCLR code to make the web calls. Since those threads were busy the whole time, had the system been busy, that would have reduced SQL Server's ability to allocate those threads to complete other queries faster. But it does seem safe to conclude that, at least on systems with low to moderate load, the benefit gained by adding the threading doesn't seem to be worth the cost of increased complexity (especially since now there is a not-yet-reproducible problem to debug).