Azure SQL Database sometimes unreachable from Azur

2019-05-25 03:17发布

问题:

I have a asp.net application deployed to Azure websites connecting to Azure SQL Database. This has been working fine for the last year, but last weekend I have started getting errors connecting to the database giving the following stacktrace.

[Win32Exception (0x80004005): Access is denied]

[SqlException (0x80131904): A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)]

This error comes and goes staying a few hours and then goes away for a few hours. The database is always accessible from my machine.

A few things that I have tried are:

  • Adding a "allow all" firewall rule (0.0.0.0-255.255.255.255) has no effect
  • Changing the connection string to use the database owner as credential has no effect.
  • what does have effect is to change the azure hosting level to something else. This resolves the issue temporarily and the website can access the database for a few hours more.

What could have have happened for this error to start showing up? The application hasn't been changed since August.

EDIT: Azure support found that there was socket and port exhaustion on the instance. What the root cause for that is, is still unkown.

回答1:

Craig is correct in that you need to implement SQLAzure Transient Fault Handling. You can find instructions here: https://msdn.microsoft.com/en-us/library/hh680899(v=pandp.50).aspx

From the article

You can instantiate a PolicyRetry object and wrap the calls that you make to SQL Azure using the ExecuteAction method using the methods show in the previous topics. However, the block also includes direct support for working with SQL Azure through the ReliableSqlConnection class.

The following code snippet shows an example of how to open a reliable connection to SQL Azure.

using Microsoft.Practices.EnterpriseLibrary.WindowsAzure.TransientFaultHandling;
using Microsoft.Practices.EnterpriseLibrary.WindowsAzure.TransientFaultHandling.AzureStorage;
using Microsoft.Practices.EnterpriseLibrary.WindowsAzure.TransientFaultHandling.SqlAzure;

...

// Get an instance of the RetryManager class.
var retryManager = EnterpriseLibraryContainer.Current.GetInstance<RetryManager>();

// Create a retry policy that uses a default retry strategy from the 
// configuration.
var retryPolicy = retryManager.GetDefaultSqlConnectionRetryPolicy();


using (ReliableSqlConnection conn = 
  new ReliableSqlConnection(connString, retryPolicy))
{
    // Attempt to open a connection using the retry policy specified
    // when the constructor is invoked.    
    conn.Open();
    // ... execute SQL queries against this connection ...
}

The following code snippet shows an example of how to execute a SQL command with retries.

using Microsoft.Practices.EnterpriseLibrary.WindowsAzure.TransientFaultHandling;
using Microsoft.Practices.EnterpriseLibrary.WindowsAzure.TransientFaultHandling.AzureStorage;
using Microsoft.Practices.EnterpriseLibrary.WindowsAzure.TransientFaultHandling.SqlAzure;
using System.Data;

...

using (ReliableSqlConnection conn = new ReliableSqlConnection(connString, retryPolicy))
{
    conn.Open();

    IDbCommand selectCommand = conn.CreateCommand();
    selectCommand.CommandText = 
      "UPDATE Application SET [DateUpdated] = getdate()";

    // Execute the above query using a retry-aware ExecuteCommand method which 
    // will automatically retry if the query has failed (or connection was 
    // dropped).
    int recordsAffected = conn.ExecuteCommand(selectCommand, retryPolicy);

}


回答2:

The answer from Azure support indicated that the connection issues I was experiencing was due to port/socket exhaustion. This was probably caused by another website on the same hosting plan.

Some answers to why the symptoms were removed by changing hosting service level:

  • Changing the hosting plan helped for a while since this moved the virtual machine and closed all sockets.
  • Changing the hosting plan from level B to level S helped since azure limits the number of sockets on level B.