SignalR with Redis Backplane Behind F5 - StatusCod

I'm using SignalR version 2.1.2 with SignalR.Redis 2.1.2 on Server 2012 R2, IIS 8.5 with WebSockets enabled.

Everything runs perfectly in my development environment. I can even stand up copies of the site on different servers (e.g. http://machine1/myapp/signalr, http://machine2/myapp/signalr), configured to use the same backplane, and both UIs receive the published messages perfectly.

I then moved "myapp" to our next environment, which is a cluster of 2 machines sitting behind an F5 load balancer, with a DNS alias set up to route to the F5, which then round-robins "myapp". The website itself can connect to SignalR just fine and can receive the published messages it subscribes to, BUT when I try to publish to the site via the alias (e.g. http://myappalias/signalr), I get a 400 Bad Request error response. Here is an example of the error:

  InnerException: Microsoft.AspNet.SignalR.Client.Infrastructure.StartException
       _HResult=-2146233088
       _message=Error during start request. Stopping the connection.
       HResult=-2146233088
       IsTransient=false
       Message=Error during start request. Stopping the connection.
       InnerException: System.AggregateException
            _HResult=-2146233088
            _message=One or more errors occurred.
            HResult=-2146233088
            IsTransient=false
            Message=One or more errors occurred.
            InnerException: Microsoft.AspNet.SignalR.Client.HttpClientException
                 _HResult=-2146233088
                 _message=StatusCode: 400, ReasonPhrase: 'Bad Request', Version: 1.1, Content: System.Net.Http.StreamContent, Headers:
{
  Pragma: no-cache
  Transfer-Encoding: chunked
  X-Content-Type-Options: nosniff
  Persistent-Auth: true
  Cache-Control: no-cache
  Date: Thu, 13 Nov 2014 22:30:22 GMT
  Server: Microsoft-IIS/8.5
  X-AspNet-Version: 4.0.30319
  X-Powered-By: ASP.NET
  Content-Type: text/html
  Expires: -1
}

Here is some test code I'm using to publish test messages to each environment. It fails on "connection.Start().Wait()":

class Program
{
    static void Main(string[] args)
    {
        var connection = new HubConnection("http://myappalias/signalr");

        connection.Credentials = System.Net.CredentialCache.DefaultNetworkCredentials;

        var proxy = connection.CreateHubProxy("MyAppHub");

        connection.Start().Wait();

        ConsoleKeyInfo key = Console.ReadKey();

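        do
        {
            // Wait for the invocation so that a server-side failure surfaces here
            // instead of being lost on an unobserved task.
            proxy.Invoke("NewMessage", new Message() { Payload = "Hello" }).Wait();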

            Console.WriteLine("Message fired.");

            key = Console.ReadKey();

        } while (key.Key != ConsoleKey.Escape);
    }
}

Now, if I don't use "myappalias" and instead hit one of the servers directly, it works perfectly. It appears that either the F5 is the problem, the client needs to be configured differently for this scenario, or I have to do something different when setting up SignalR's startup class. Here is the startup class I'm using:

[assembly: OwinStartup(typeof(MyApp.Startup))]
namespace MyApp
{
    public class Startup
    {
        private static readonly ILog log = LogManager.GetLogger
        (System.Reflection.MethodBase.GetCurrentMethod().DeclaringType);

        public void Configuration(IAppBuilder app)
        {
            try
            {
                log.Debug(LoggingConstants.Begin);

                string redisServer = ConfigurationManager.AppSettings["redis:server"];

                int redisPort = Convert.ToInt32(ConfigurationManager.AppSettings["redis:port"]);

                HubConfiguration configuration = new HubConfiguration();
                configuration.EnableDetailedErrors = true;
                configuration.EnableJavaScriptProxies = false;
                configuration.Resolver = GlobalHost.DependencyResolver.UseRedis(redisServer, redisPort, string.Empty, "MyApp");

                app.MapSignalR("/signalr", configuration);   

                log.Info("SIGNALR - Startup Complete");
            }
            finally
            {
                log.Debug(LoggingConstants.End);
            }
        }

    }

}

I downloaded the client source code and wired it in directly in place of the NuGet package so I could step through everything. It seems the client negotiates successfully and then attempts to "connect" with the Server-Sent Events transport and then the long polling transport, but fails at both.

Question 1.1

Does anyone know of an alternative to SignalR for .NET that supports scaling with load balancing in a less "I want to pull my hair out" kind of way?

Answer 1:

It should not be necessary to configure source address affinity to use SignalR behind a load balancer. It's certainly not wrong to set up session affinity, but that doesn't fix your underlying problem.

If you look closely at the content of the 400 response, you probably see a message similar to "The ConnectionId is in the incorrect format."
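One way to see that response body from the test client is to dig the HttpClientException out of the exception chain shown in the question and read its content. The following is only a minimal sketch: the StartDiagnostics class and its method names are hypothetical, it assumes that the 2.x client's HttpClientException exposes the failed response through a Response property, and it assumes the response body is still readable at that point.

```csharp
using System;
using Microsoft.AspNet.SignalR.Client;

// Hypothetical helper, not part of SignalR.
static class StartDiagnostics
{
    // Walks InnerException links and AggregateException children and
    // returns the first HttpClientException found, or null if there is none.
    static HttpClientException FindHttpClientException(Exception ex)
    {
        if (ex == null) return null;

        var http = ex as HttpClientException;
        if (http != null) return http;

        var aggregate = ex as AggregateException;
        if (aggregate != null)
        {
            foreach (var inner in aggregate.InnerExceptions)
            {
                var found = FindHttpClientException(inner);
                if (found != null) return found;
            }
            return null;
        }

        return FindHttpClientException(ex.InnerException);
    }

    // Starts the connection; on an HTTP-level failure, prints the status
    // code and response body before rethrowing the original exception.
    public static void StartAndReport(HubConnection connection)
    {
        try
        {
            connection.Start().Wait();
        }
        catch (Exception ex)
        {
            var http = FindHttpClientException(ex);
            if (http != null && http.Response != null && http.Response.Content != null)
            {
                // Assumption: the content stream has not been consumed or disposed yet.
                string body = http.Response.Content.ReadAsStringAsync().Result;
                Console.WriteLine("{0}: {1}", (int)http.Response.StatusCode, body);
            }
            throw;
        }
    }
}
```

In the question's test program this would replace the bare "connection.Start().Wait()" call with "StartDiagnostics.StartAndReport(connection)".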

SignalR uses the server's machine key to create an anti-CSRF token, which means that all the servers in your farm must share a machine key for the token to be decrypted properly when SignalR requests hop between servers. The /negotiate request that you see succeed is the request that retrieves the anti-CSRF token. When the SignalR client then uses that token to make a /connect request, the request fails because it is processed by a different server, one that didn't create the token and is unable to decrypt it.

This explains why setting up session affinity fixed your problem, but sharing a machine key will help you avoid this problem even if something goes wrong with session affinity.
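As a rough illustration of sharing a machine key, each server in the farm would carry the same machineKey element in the application's web.config. This is a sketch only: the key values below are placeholders rather than real keys, and the algorithm choices are an assumption, not something stated in the question. Generate your own keys and use identical values on every node.

```xml
<configuration>
  <system.web>
    <!-- Placeholder values: generate your own keys and copy the same
         element to every server behind the load balancer. -->
    <machineKey
      validationKey="REPLACE_WITH_YOUR_OWN_HEX_VALIDATION_KEY"
      decryptionKey="REPLACE_WITH_YOUR_OWN_HEX_DECRYPTION_KEY"
      validation="HMACSHA256"
      decryption="AES" />
  </system.web>
</configuration>
```

With identical keys on both machines, a token issued by one server during /negotiate should be readable by the other when it handles the follow-up /connect request.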

Here is an issue filed on GitHub by someone who experienced a similar problem: https://github.com/SignalR/SignalR/issues/2292.

Answer 2:

The problem was fixed by switching the profile for "MyApp" on the F5 to one that uses the F5's built-in "source_addr" profile as its parent, with a timeout of 1 hour. Here is a description of what that profile does:

Source address affinity persistence: Also known as simple persistence, source address affinity persistence supports TCP and UDP protocols, and directs session requests to the same server based solely on the source IP address of a packet.

EDIT

This ended up "working" for a while, but if I deploy a publisher (something that simply publishes through the SignalR client) without republishing the Hub, the publisher times out trying to connect over and over again. Ugh.