Missing cache role behavior of Windows Azure Caching

Published 2019-09-01 10:11

Question:

I work on a hosted service that has Windows Azure Cache deployed on the instances of the web role. The cache is enabled in production, but we disable it in the compute emulator because we often experience slowdowns and exceptions with the cache emulator. Specifically, in the compute emulator we do not load the caching module in the csdef. At runtime we detect whether the cache is enabled by creating the DataCacheFactory and catching the specific exception thrown when the role named in the client library configuration is not found in the csdef.

This worked correctly up to Windows Azure Caching 2.0. When we upgraded to Windows Azure Caching 2.1 (and Azure SDK 2.1), the behavior changed:

  • the DataCacheFactory constructor no longer throws the exception;
  • when we try to get the DataCache from the DataCacheFactory, the role seems to hang, and after 3 minutes the call fails with the following exception (the complete text can be found here):

    Microsoft.ApplicationServer.Caching.DataCacheException was unhandled by
    user code
    Message=ErrorCode<ERRCA0017>:SubStatus<ES0006>:There is a temporary failure.
    Please retry later. (<snip>). Additional Information :
    The client was trying to communicate with the server:
    net.tcp://WebRole:24233.
    InnerException: System.Net.Sockets.SocketException
    Message=No such host is known
    

Please note that this is not a duplicate of the following SO questions:

  • Cant get Azure Cache to work. “There is a temporary failure. Please retry later.”
  • Exception while using Windows Azure Caching : No such host is known
  • Azure Caching - Failure after upgrading to SDK 2.1 and caching 2.1

since

  • I'm sure that I'm using Azure SDK 2.1 (I've checked in debugging that the library versions were correct);
  • my problem arises only when I disable the cache role on purpose.

Answer 1:

Using the procedure described in the following SO answer, and with the help of ILSpy, I was able to work out why this exception occurs. In Windows Azure Caching 2.1, when the role specified in the client configuration is not found, its name is treated as an endpoint address and execution continues. Older versions threw an exception instead, which I was catching to detect that the cache was not enabled.

The relevant log messages are:

WaWorkerHost.exe Information: 0 : INFORMATION:
<DistributedCache.CacheFactory.1> TryAutoDiscoverServersWithinDeployment
for Instance 'WebRole' failed to connect as RoleName type with exception
System.Reflection.TargetInvocationException: Exception has been thrown by
the target of an invocation. --->
Microsoft.ApplicationServer.Caching.DataCacheException:
ErrorCode<UnspecifiedErrorCode>:SubStatus<ES0001>:The role WebService
was not found in the current deployment.
at Microsoft.ApplicationServer.Caching.AzureClientHelper.RoleUtility.
  GetCacheRoleIPList(String roleName, String portIdentifier)
--- End of inner exception stack trace ---
at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments,
  Signature sig, Boolean constructor)
at System.Reflection.RuntimeMethodInfo.UnsafeInvokeInternal(Object obj,
  Object[] parameters, Object[] arguments)
at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, 
  BindingFlags invokeAttr, Binder binder, Object[] parameters,
  CultureInfo culture)
at System.Reflection.MethodBase.Invoke(Object obj, Object[] parameters)
at Microsoft.ApplicationServer.Caching.DataCacheFactory.
   AutoDiscoverServersWithinDeployment()
at Microsoft.ApplicationServer.Caching.DataCacheFactory.
   TryAutoDiscoverServersWithinDeployment()
Assuming it as EndPoint.

and

WaWorkerHost.exe Warning: 0 : WARNING: <DistributedCache.SocketClientChannel.1>
Request 1 to host net.tcp://webrolw:24233/ failed 
Status=ChannelOpenFailed[System.Net.Sockets.SocketException (0x80004005):
No such host is known

To resolve this issue you can:

  • inspect the DataCacheFactory right after creating it and check whether the Servers property contains any item whose address is the same as the name of the cache role, which is a sign that the indicated role has no cache configured;
  • in debug configurations of the hosted service, lower the number of retries in the CacheReadyRetryPolicy property of the DataCacheFactory (the retries were causing the 3 minute delay before the exception) and, if an exception is thrown, assume that the cache is unavailable.
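
The logic of the two workarounds can be sketched without touching the caching library. The Python below is only an illustration under stated assumptions: `role_name_used_as_host` and `try_get_cache` are hypothetical helper names, the list of host names stands in for whatever you read from the Servers property, and `open_cache` stands in for the call that gets the DataCache. None of this is the Azure Caching API. The comparison ignores case because the role name appears with different casing in the two log excerpts above.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def role_name_used_as_host(server_hosts: Iterable[str], cache_role_name: str) -> bool:
    """Return True if any server host is just the cache role name itself.

    Hypothetical helper: a match suggests that role discovery failed and the
    client fell back to treating the role name as a network address.
    """
    wanted = cache_role_name.strip().lower()
    return any(host.strip().lower() == wanted for host in server_hosts)


def try_get_cache(open_cache: Callable[[], T], max_attempts: int = 1) -> Optional[T]:
    """Call open_cache at most max_attempts times; return None if all attempts fail.

    Hypothetical helper: open_cache is whatever callable creates the cache client.
    """
    for _ in range(max_attempts):
        try:
            return open_cache()
        except Exception:
            # Assumption: in a debug configuration any failure means "no cache",
            # so the error is swallowed and the next attempt (if any) runs.
            continue
    return None
```

With helpers like these, startup code could first check the host names and skip the cache entirely when the role name shows up as a host, and otherwise fall back to a small number of attempts so a missing cache does not stall the role for minutes.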