SqlDependency error after a long time

2019-03-27 07:26发布

I got a windows service listening to inserts in a table using the SqlDependency class.

It works fine for several days but then suddenly stops working.

In the normal scenario I receive change events

e.Type = SqlNotificationType.Change
e.Info = SqlNotificationInfo.Insert
e.Source = SqlNotificationSource.Data

If nothing changes I get timeout events every 3600 seconds

e.Type = SqlNotificationType.Change
e.Info = SqlNotificationInfo.Error
e.Source = SqlNotificationSource.Timeout

or (don't know why there are two different timeout events)

e.Type = SqlNotificationType.Change
e.Info = SqlNotificationInfo.Unknown
e.Source = SqlNotificationSource.Timeout

This can work for a week or more but then suddenly I don't receive change events anymore and instead receive an event every 60 seconds with

e.Type = SqlNotificationType.Change
e.Info = SqlNotificationInfo.Error
e.Source = SqlNotificationSource.Client

The msdn documentation for SqlNotificationSource.Client says

A client-initiated notification occurred, such as a client-side time-out or as a result of attempting to add a command to a dependency that has already fired.

I think this means that a timout occured when creating the dependency.

The same code is running all the time and looks like this:

private void CreateDependency() {
    using (var connection = new SqlConnection(_connectionString)) {
        connection.Open();

        var command = new SqlCommand();
        command.CommandText = "SELECT ...";
        command.Connection = connection;

        new SqlDependency(command, "ServiceName", DependencyTimeout).OnChange += OnChange;

        command.ExecuteNonQuery();
    }
}

private void OnChange(object sender, SqlNotificationEventArgs e) {
    ((SqlDependency)sender).OnChange -= OnChange;

    if (e.Type == SqlNotificationType.Change && e.Info == SqlNotificationInfo.Insert) {
        _changeWorkerNotifier.Set(); // AutoResetEvent
    }

    CreateDependency();
}

If I restart my service it works fine again.

I did some investigation and found out that the errors seems to start after a scheduled backup on the server (which happens every day). Got another application where I around the same time get errors like

A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)

My original guess was that SqlDependency.Start() creates a connection to the server which faults during the scheduled backup and never recovers. But the first line in the msdn documentation says

The SqlDependency listener will restart when an error occurs in the SQL Server connection.

Any ideas on how to solve this?
(I can of course let the service fail and have the service manager restart it. The problem is that the service does other things as well which need to close down properly so I can't just do Enviroment.Exit(-1) from the event handler)

1条回答
劳资没心,怎么记你
2楼-- · 2019-03-27 08:11

Found the cause and a solution.

First I found out that the line

The SqlDependency listener will restart when an error occurs in the SQL Server connection.

is only present in the .Net 4 documentation.

Some tests show that it is not just a change in the documentation! The Client/Error event does not appear when running with CLR4.

So the cause is a connection error which is handled inside the SqlDependency in .Net 4 but not in earlier versions.

In .Net 2-3.5 it is is possible to recover after the error with a SqlDependency.Stop() / SqlDependency.Start().

Don't really like the Stop/Start solution because I then need logic to break the loop if the connection error is not recoverable. I decided to just stop the service in case of error and let the service manager restart it. (which makes the problem visible in the event log etc)

My handler now looks like this:

private void OnChange(object sender, SqlNotificationEventArgs e) {
    ((SqlDependency)sender).OnChange -= OnChange;

    if (e.Source == SqlNotificationSource.Timeout) {
        // just restart notification
    }
    else if (e.Source != SqlNotificationSource.Data) {
        Logger.Error("Unhandled change notification {0}/{1} ({2})", e.Type, e.Info, e.Source);
        ServiceRunner.ShutDown(true);
    }
    else if (e.Type == SqlNotificationType.Change && e.Info == SqlNotificationInfo.Insert) {
        _changeWorkerNotifier.Set(); // AutoResetEvent
    }
    else {
        Logger.Log("Ignored change notification {0}/{1} ({2})", e.Type, e.Info, e.Source);
    }

    CreateDependency();
}

** EDIT **

Here is the code I call on startup

private void ClearOldSubscriptions() {
    using (var connection = new SqlConnection(_connectionString))
    using (var command = new SqlCommand()) {
        string sql =
            ////@"DECLARE @UniqueTimeout AS int = 3586; " +
            @"DECLARE @SubscriptionId AS int; " +
            @"DECLARE @Sql AS varchar(max); " +
            @"DECLARE SubscriptionCursor CURSOR LOCAL FAST_FORWARD " +
            @"    FOR " +
            @"        SELECT id " +
            @"        FROM sys.dm_qn_subscriptions " +
            @"      WHERE database_id = DB_ID() " +
            @"            AND timeout = @UniqueTimeout " +
            @"OPEN SubscriptionCursor; " +
            @"FETCH NEXT FROM SubscriptionCursor INTO @SubscriptionId; " +
            @"WHILE @@FETCH_STATUS = 0 " +
            @"BEGIN " +
            @"    SET @Sql = 'KILL QUERY NOTIFICATION SUBSCRIPTION ' + CONVERT(varchar, @SubscriptionId); " +
            @"    EXEC(@Sql); " +
            @" " +
            @"    FETCH NEXT FROM SubscriptionCursor INTO @SubscriptionId; " +
            @"END";

        command.Connection = connection;
        command.CommandType = CommandType.Text;
        command.CommandText = sql;
        command.Parameters.Add("@UniqueTimeout", SqlDbType.Int).Value = DependencyTimeout;

        connection.Open();

        command.ExecuteNonQuery();
    }
}

private void ClearNotificationQueue() {
    using (var connection = new SqlConnection(_connectionString))
    using (var command = new SqlCommand()) {
        string sql = 
            @"DECLARE @Conversation AS uniqueidentifier; " +
            @"DECLARE ConversationCursor CURSOR LOCAL FAST_FORWARD  " +
            @"    FOR " +
            @"        SELECT conversation_handle  " +
            @"        FROM {@Queue} " +
            @"     " +
            @"OPEN ConversationCursor; " +
            @"FETCH NEXT FROM ConversationCursor INTO @Conversation; " +
            @"WHILE @@FETCH_STATUS = 0  " +
            @"BEGIN " +
            @"    END CONVERSATION @Conversation WITH CLEANUP; " +
            @" " +
            @"    FETCH NEXT FROM ConversationCursor INTO @Conversation; " +
            @"END " +
            @"";
        sql = sql.Replace("{@Queue}", NotificationQueue);

        command.Connection = connection;
        command.CommandType = CommandType.Text;
        command.CommandText = sql;

        connection.Open();

        command.ExecuteNonQuery();
    }
}
查看更多
登录 后发表回答