Please excuse the obvious self-Q/A, but this information is widely misunderstood and almost always answered incorrectly, so I wanted to place it here for people searching for a definitive answer to this problem.
Even so, there's still some information I haven't been able to nail down. I will put this towards the end of the question (skip to that if you are not interested in the preamble).
How do I correctly configure a WCF NetTcp Duplex Reliable Session?
There are many questions and answers regarding this topic, and nearly all of them suggest setting `inactivityTimeout="Infinite"` in your configuration. This doesn't really seem to work correctly, particularly for NetTcp (it may work correctly for WSDualHttp bindings, but I have never used those).
There are a number of other issues often associated with this, including: the channel not faulting after the client or server unexpectedly disconnects, the channel disconnecting after 10 minutes, the channel disconnecting at random, the channel throwing an exception when trying to open, and being unable to configure Metadata on the same endpoint.
Information I have figured out through hard-won trial and error:
- `Infinite` does not appear to be a valid configuration setting in all situations (and the Visual Studio validation schema certainly doesn't know about it).
- There are two special configuration converters, `InfiniteIntConverter` and `InfiniteTimeSpanConverter`, which will sometimes convert the value `Infinite` to either `int.MaxValue` or `TimeSpan.MaxValue`, but I haven't yet figured out when this is valid: sometimes it works, and sometimes it doesn't. What's more, some libraries appear to allow `Infinite` in the config while others do not, so you can succeed in one part of a configuration but fail in another.
- You must configure BOTH `inactivityTimeout` and `receiveTimeout`, on both the client and the server. While these values do not HAVE to be the same on both sides, they probably should be, as mismatched values will probably cause confusion. (Technically, you can leave `inactivityTimeout` at its default value if you want, but you should be aware of its value and what it does.)
- `inactivityTimeout` should NEVER be set to a large value, much less `Infinite` or `TimeSpan.MaxValue`.
- `inactivityTimeout` has two functions (and this is not widely understood). The first defines the maximum amount of time that can elapse on a channel without receiving any "infrastructure" or "operation" messages. The second defines the interval at which infrastructure messages are sent (half the time specified). If no infrastructure or operation messages have been received during the timeout period, the connection is aborted.
- `receiveTimeout` specifies the maximum amount of time that can elapse between operation messages only. This value can be set to a large value, such as `TimeSpan.MaxValue` (particularly if your channel runs internally over a trusted network or over a VPN). This value is what defines how long the reliable session will "stay alive" if there is no activity between client and server other than infrastructure messages, i.e., your client does not call any methods of the interface, and your server does not call back into the client.
- Setting a short `inactivityTimeout` and a large `receiveTimeout` keeps your reliable session "tacked up" even when there is no operational activity between your client and server. The short inactivity timeout (I like to keep the default 10 minutes or less) sends infrastructure "ping" messages to keep the TCP connection alive, while the long receive timeout keeps the reliable session active. At the same time, this provides a reasonable timeout in case of disconnection.
- If you set `inactivityTimeout` to a large value, then the reliable session will not be reliable, as it has no way to keep the TCP connection alive, nor any way to verify the integrity of the connection. It won't know that a user has disconnected unexpectedly until you try to send a message to that client and find out the connection is no longer there. This is why many people who use `Infinite` for this setting resort to creating a "Ping" method in their service, which is completely unnecessary if you've configured these settings correctly.
- If you set `inactivityTimeout` to a value larger than `receiveTimeout`, it will likewise be unreliable, as you will still be governed by the `receiveTimeout` for operation messages. I.e., if you forget to set `receiveTimeout` and leave it at the default 10 minutes, then the connection will be aborted once the user has been idle for 10 minutes.
- When the client or server unexpectedly disconnects (app crashes, network failure, someone trips over the power cord, etc.), the other side may not notice right away. I have attached various `ChannelFaulted` event handlers in various test situations, and sometimes the connection is faulted right away, while other times it doesn't seem to fault at all. What I have discovered through trial and error is that when it doesn't seem to fault, it will actually fault after the `inactivityTimeout` expires on that end (so if it's set to 10 minutes, it will call the `ChannelFaulted` event after 10 minutes).
- I have not yet figured out why in some situations it notices the disconnection right away, and in others it waits for the timer to expire. In both cases, I notice internal first-chance communication exceptions thrown and handled by the framework, and there are calls to Abort the connection, but somehow the call to the event handler gets lost and it must wait for the timeout. My suspicion is that this is somehow thread related.
- When trying to configure Metadata to work across the NetTcp channel, I have had sporadic results. Sometimes it works, sometimes it doesn't. I've read many reports that Metadata doesn't work over NetTcp and that you have to use an Http channel for the Metadata, but I have in fact had it work on several occasions using the net.tcp:// URL to generate the proxy. Then I would change something, recompile, and it would no longer work. Changing things back, it still wouldn't work. So it was very confusing what magic incantation was necessary to make Metadata function over net.tcp, shared with the endpoint on the same port (obviously with a different address).
- When configuring both a NetTcp and a Metadata endpoint on the same service, and specifying non-default settings for connection parameters like `listenBacklog` and `maxConnections`, you also need to make sure the Metadata endpoint uses the same settings. This typically means you have to define a custom binding, since these settings are not available from the standard tcp mex binding. This includes setting `listenBacklog` and `maxPendingConnections` on `tcpTransport`, and `groupName` and `maxOutboundConnectionsPerEndpoint` on `connectionPoolSettings`.
- The default for the Ordered setting of ReliableSession is `True`. This uses a lot more overhead than turning it off. If you don't need ordered messages, I would suggest turning it off (this needs to be set on both sides).
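The timeout points above can be sketched as a binding configuration. This is a minimal illustration, not a verified production config: the binding name `reliableDuplex` is hypothetical, and the 7-day `receiveTimeout` is an arbitrary explicit value chosen to avoid relying on `Infinite`. The same binding would go in both the client and the server config.

```xml
<system.serviceModel>
  <bindings>
    <netTcpBinding>
      <!-- "reliableDuplex" is a hypothetical name used only for this sketch. -->
      <!-- receiveTimeout: long, so an idle session (no operation messages)
           is not torn down. 7 days is an arbitrary illustrative value. -->
      <binding name="reliableDuplex" receiveTimeout="7.00:00:00">
        <!-- inactivityTimeout: short (the 10 minute default or less), so
             infrastructure messages keep flowing and a dead peer is noticed.
             ordered="false" only if the contract does not need ordered delivery. -->
        <reliableSession enabled="true"
                         inactivityTimeout="00:10:00"
                         ordered="false" />
      </binding>
    </netTcpBinding>
  </bindings>
</system.serviceModel>
```

An endpoint would then reference it with `binding="netTcpBinding" bindingConfiguration="reliableDuplex"`.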
Configuration I still need to understand:
How do I correctly configure the shared net.tcp Metadata endpoint? (I will add an example when I get a chance.) It's so inconsistent as to why it sometimes works and sometimes does not: I kept getting the error 'The URI Prefix is not recognized' when generating the proxy in Visual Studio. Currently, I'm specifying an http get URL to bypass the problem.
Why does WCF sometimes fault the channel immediately upon disconnect, and sometimes wait for `inactivityTimeout` to expire? What controls or causes one behavior versus the other?
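For reference on the first open question, here is a sketch of the layout I would expect for a net.tcp Metadata endpoint that shares the port with the service endpoint, with the connection settings mirrored in a custom binding as described in the list above. It is not a confirmed fix for the intermittent failure. The service and contract names, the base address, the binding and behavior names, and the value 200 are all hypothetical, and it assumes a transport-only custom binding is an acceptable stand-in for the standard tcp mex binding.

```xml
<system.serviceModel>
  <bindings>
    <netTcpBinding>
      <!-- Service binding with non-default connection settings (200 is illustrative). -->
      <binding name="serviceTcp" listenBacklog="200" maxConnections="200" />
    </netTcpBinding>
    <customBinding>
      <!-- Metadata binding mirroring the same connection settings. -->
      <binding name="mexTcp">
        <tcpTransport listenBacklog="200" maxPendingConnections="200">
          <connectionPoolSettings groupName="default"
                                  maxOutboundConnectionsPerEndpoint="200" />
        </tcpTransport>
      </binding>
    </customBinding>
  </bindings>
  <behaviors>
    <serviceBehaviors>
      <behavior name="withMetadata">
        <!-- Publishes metadata; no HTTP GET address is configured here. -->
        <serviceMetadata />
      </behavior>
    </serviceBehaviors>
  </behaviors>
  <services>
    <!-- MyCompany.MyService and MyCompany.IMyService are placeholder names. -->
    <service name="MyCompany.MyService" behaviorConfiguration="withMetadata">
      <host>
        <baseAddresses>
          <add baseAddress="net.tcp://localhost:8000/MyService" />
        </baseAddresses>
      </host>
      <endpoint address="" binding="netTcpBinding"
                bindingConfiguration="serviceTcp"
                contract="MyCompany.IMyService" />
      <!-- Same port, different relative address. -->
      <endpoint address="mex" binding="customBinding"
                bindingConfiguration="mexTcp"
                contract="IMetadataExchange" />
    </service>
  </services>
</system.serviceModel>
```

With this layout the proxy would be generated against `net.tcp://localhost:8000/MyService/mex`.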