Transferring large payloads of data (Serialized Ob

2019-01-16 14:56发布

问题:

I have a case where I need to transfer large amounts of serialized object graphs (via NetDataContractSerializer) using WCF using wsHttp. I'm using message security and would like to continue to do so. Using this setup I would like to transfer serialized object graph which can sometimes approach around 300MB or so but when I try to do so I've started seeing a exception of type System.InsufficientMemoryException appear.

After a little research it appears that by default in WCF that a result to a service call is contained within a single message by default which contains the serialized data and this data is buffered by default on the server until the whole message is completely written. Thus the memory exception is being caused by the fact that the server is running out of memory resources that it is allowed to allocate because that buffer is full. The two main recommendations that I've come across are to use streaming or chunking to solve this problem however it is not clear to me what that involves and whether either solution is possible with my current setup (wsHttp/NetDataContractSerializer/Message Security). So far I understand that to use streaming message security would not work because message encryption and decryption need to work on the whole set of data and not a partial message. Chunking however sounds like it might be possible however it is not clear to me how it would be done with the other constraints that I've listed. If anybody could offer some guidance on what solutions are available and how to go about implementing it I would greatly appreciate it.

I should add that in my case I'm really not worried about interoperability with other clients as we own and control each side of the communication and use the shared interface pattern for data transfered to either side. So I'm open to any idea that fits inside of the constraints of using wsHttp with message security to transfer object graphs serialized using NetDataContractSerializer and I prefer a solution where I don't have to change my existing services and surrounding infrastructure drastically.

Related resources:

  • Chunking Channel
  • How to: Enable Streaming
  • Large attachments over WCF
  • Custom Message Encoder
  • Another spotting of InsufficientMemoryException
  • Non-Duplex Chunking Channel needed
  • Streaming large content with WCF and deferred execution

I'm also interested in any type of compression that could be done on this data but it looks like I would probably be best off doing this at the transport level once I can transition into .NET 4.0 so that the client will automatically support the gzip headers if I understand this properly.

Update (2010-06-29):

Some history on how I derived at the conclusion that the buffered message being too large was causing my problem. Originally I saw the CommunicationException below while testing.

The underlying connection was closed: The connection was closed unexpectedly.

Eventually after running this and doing some more logging I found the underlying InsufficientMemoryException exception that was causing the problem with the specified message.

Failed to allocate a managed memory buffer of 268435456 bytes. The amount of available memory may be low.

Which originated from the following method.

System.ServiceModel.Diagnostics.Utility.AllocateByteArray(Int32 size)

So in otherwords the failure came from allocating the array. When writing the same data serialized to disk it takes up around 146MB and if I cut it by half then I stop getting the error however I haven't dug much more into the specific threshold that breaks my buffer and whether it specific to my system or not.

Update (2010-12-06):

I guess at this point I'm looking for some clarifcation for the following. My understanding is that by default with WCF wsHttp with message security that a whole message (generally the whole set of data I'm returning) needs to be buffered on the server before the response is sent back to the client and thus causing my problems.

Possible solutions:

  • Constraining data size - By using some form of compression, encoding, or limiting of actual data returned by using some sort of paging like method in order to avoid consuming the maximum capacity of the outgoing buffer.
  • Streaming - Allows large amounts of data to be sent through WCF in a streaming fashion however this is not compatible with wsHttp or MessageSecurity since these techniques require buffering all the data.
  • Chunking Channel - Allows data to be broken up into separate messages but at this point I'm not sure of the constraints this has on service contract design and whether I can still use wsHttp with message binding.

Limiting the data that I can return only works up to a point and as with the Streaming option these options require coding a lot of the lower level work outside of the WCF service calls. So I guess what I need to know is whether any possible implementation of the chunking channel can side-step the large message issues by allowing a single set of data to be broken up into separate messages on the server and then pieced together on the client in such a way that I do not have to change the interface/shape of existing service contracts and in a way that the process is pretty much hidden from the client and server portion of each service implementation while still using message security and wsHttp. If the chunking-channel is going to require me to re-write my service contracts to expose streams then I don't see how this is really any different than the Streaming solution. If somebody can simply answer these questions for me I'll award them with the bounty and mark it as the answer.

回答1:

protobuf-net generally has a significant space-saving (as in: order-of-magnitude) on most data, and can attach to WCF. Unfortunately at the moment it doesn't support full graphs, only trees. However, I have plans there that I simply haven't had time to implement. I can't promise anything, but I could try to bump that work a bit sooner.

Otherwise; there may be ways to tweak your existing code to work with a tree instead of a graph.



回答2:

If you still want to use Message Security, I would recommend you to use MTOM to optimize the network bandwidth that needs be used to transfer the messages, and also the chunking channel for using smaller memory buffers when security is applied. Otherwise, WCF will try to load the whole message in memory to apply security, and therefore you will get the Insufficient memory exception.



回答3:

i used to implement kinda passing big text to/from wcf. my trig is convert it to stream and use GZipStream to compress it then send it as byte[], luckily its never exceed 10 MB.

In your case i recommend do fragmentation. Convert Serialized object to byte[] and then merg it and decompress

psudo

int transferSize = 5000000; // 5MB
byte[] compressed = ...;
var mem = new System.IO.MemoryStream(compressed);

for(int i = 0; i < compressed .length; i+= transferSize )
{
    byte[] buffer = new byte[transferSize];
    mem.Read(buffer, i, compressed);
    mem.Flush();
    sendFragmentToWCF(buffer);
}

edit 08 Dec 2010

ok based on my understanding, the situation is client is download some large serialize object throught WCF. i didn't particularly test on this solution, but guess it should work. The key is save serialized object to file and use Response transmit that file.

[WebMethod]
public void GetSerializedObject()
{
    string path = @"X:\temp.tmp";

    var serializer = new  System.Runtime.Serialization.NetDataContractSerializer();
    var file = new System.IO.FileStream(path, System.IO.FileMode.CreateNew);

    try
    {
        serializer.Serialize(file, ...);
        this.Context.Response.TransmitFile(path);
        this.Context.Response.Flush();
    }
    finally
    {
        file.Flush();
        file.Close();
        file.Dispose();
        System.IO.File.Delete(path);
    }
}

WCF shoud do file streaming automatically and u dont ahve to worry about serialized object size since we use file transmit. Dont forget to the config response limit.



回答4:

Some lighter, but not guaranteed solutions, would be to

  • use the DataContractSerializer instead since you own both sides. This does not require embedded type information, which is significantly large.
  • use [DataMember(EmitDefaultValue = false)] which is discussed in a question I asked - again because you own both sides; doing so will cut down some on the message size (how much depends of course on how many fields in the graph are defaults).
  • use [DataContract(IsReference=true)], especially if you have many repeated value objects or reference data
  • use some sort of throttling on the server to reduce memory pressure of simultaneous results

These are, of course trade-offs, for example with readability.



回答5:

Since nobody has put it out there, perhaps using WebSockets or long polling techniques may be able to solve this issue. I've only looked into these solutions briefly and haven't developed a solution around them but I wanted to propose these concepts for the record and I'll expand upon my answer at a later point if time permits.

The underlying idea would be to achieve something similar to how the idea of how the ChunkingChannel example works but while not requiring a full duplex channel which typically breaks the port 80 web based request/response model that is desirable to avoid having to make firewall and other related configurations for clients.

Other related material:

  • WCF HTTP Long Polling
  • WebSockets with WCF

Update: After researching more on this it appears that by using WebSockets, which is known known as NetHttpBinding, that I would inherently not be solving the original request which is to use wsHttp in WCF with message security. I'm going to keep my answer here however as information for others who may be looking into an alternative.