I have to transfer large files between computers over unreliable connections using WCF.
Because I want to be able to resume a transfer and I don't want WCF to limit my file size, I am chunking the files into 1MB pieces. These chunks are transported as streams, which works quite nicely so far.
My steps are:
1. open filestream
2. read chunk from file into byte[] and create memorystream
3. transfer chunk
4. back to 2. until the whole file is sent
My problem is in step 2. I assume that when I create a memory stream from a byte array, it will end up on the LOH and ultimately cause an OutOfMemoryException. I could not actually reproduce this error, so maybe I am wrong in my assumption.
Now, I don't want to send the byte[] in the message, as WCF will tell me the array size is too big. I can change the max allowed array size and/or the size of my chunk, but I hope there is another solution.
My actual question(s):
- Will my current solution create objects on the LOH, and will that cause me problems?
- Is there a better way to solve this?
Btw.: On the receiving side I simply read smaller chunks from the arriving stream and write them directly into the file, so no large byte arrays are involved.
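For reference, the receiving side described above could look roughly like the following. This is only a sketch under assumptions: `ChunkReceiver`, `WriteChunk`, and the `offset` parameter are hypothetical names, and the 64 KB copy buffer is an arbitrary choice that stays well below the large object heap threshold.

```csharp
using System.IO;

public static class ChunkReceiver
{
    // Hypothetical helper: writes one arriving chunk into the target file at the given offset.
    public static void WriteChunk(Stream incoming, string path, long offset)
    {
        using (var file = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Write))
        {
            file.Position = offset;
            // Stream.CopyTo reads and writes through a small intermediate buffer (64 KB here),
            // so the chunk never has to be held in one large byte array.
            incoming.CopyTo(file, 64 * 1024);
        }
    }
}
```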
Edit:
current solution:
    for (int i = resumeChunk; i < chunks; i++)
    {
        byte[] buffer = new byte[chunkSize];
        fileStream.Position = i * chunkSize;
        int actualLength = fileStream.Read(buffer, 0, (int)chunkSize);
        Array.Resize(ref buffer, actualLength);
        using (MemoryStream stream = new MemoryStream(buffer))
        {
            UploadFile(stream);
        }
    }
I hope this is okay. It's my first answer on StackOverflow.
Yes, absolutely: if your chunk size is 85,000 bytes or more, the array will be allocated on the large object heap. You will probably not run out of memory very quickly, because you are allocating and deallocating contiguous areas of memory that are all the same size, so when memory fills up the runtime can fit a new chunk into an old, reclaimed memory area.
I would be a little worried about the Array.Resize call, as that will create another array (see http://msdn.microsoft.com/en-us/library/1ffy6686(VS.80).aspx). This is an unnecessary step if actualLength == chunkSize, as it will be for all but the last chunk. So I would, as a minimum, suggest:
if (actualLength != chunkSize) Array.Resize(ref buffer, actualLength);
This should remove a lot of allocations. If actualLength is not the same as chunkSize but is still at least 85,000 bytes, then the new array will also be allocated on the large object heap, potentially causing it to fragment and possibly causing apparent memory leaks. I believe it would still take a long time to actually run out of memory, as the leak would be quite slow.
I think a better implementation would be to use some kind of buffer pool to provide the arrays. You could roll your own (it would be too complicated) but WCF does provide one for you. I have rewritten your code slightly to take advantage of that:
    // chunkSize is cast to int where BufferManager expects an int size, as in the Read call below.
    BufferManager bm = BufferManager.CreateBufferManager(chunkSize * 10, (int)chunkSize);
    for (int i = resumeChunk; i < chunks; i++)
    {
        byte[] buffer = bm.TakeBuffer((int)chunkSize);
        try
        {
            fileStream.Position = i * chunkSize;
            int actualLength = fileStream.Read(buffer, 0, (int)chunkSize);
            if (actualLength == 0) break;
            //Array.Resize(ref buffer, actualLength);
            using (MemoryStream stream = new MemoryStream(buffer))
            {
                UploadFile(stream, actualLength);
            }
        }
        finally
        {
            // Always hand the buffer back to the pool, even if the upload throws.
            bm.ReturnBuffer(buffer);
        }
    }
This assumes that the implementation of UploadFile can be rewritten to take an int for the number of bytes to write.
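If changing UploadFile is not an option, one possible alternative is to wrap only the filled part of the pooled buffer, using the MemoryStream(byte[], int, int, bool) constructor, so the receiver of the stream never sees the unused tail. The following is a rough, self-contained sketch rather than a drop-in replacement: it uses ArrayPool<byte>.Shared as a stand-in for BufferManager (so it runs without System.ServiceModel), and `ChunkedSender`, `Send`, and the `upload` delegate are hypothetical names standing in for your own code.

```csharp
using System;
using System.Buffers;
using System.IO;

public static class ChunkedSender
{
    // Sends `source` in chunks of at most chunkSize bytes, starting at chunk index resumeChunk.
    // `upload` is a hypothetical stand-in for the UploadFile service call.
    public static void Send(Stream source, int chunkSize, int resumeChunk, Action<Stream> upload)
    {
        // The pool may hand back an array larger than requested, so never rely on buffer.Length.
        byte[] buffer = ArrayPool<byte>.Shared.Rent(chunkSize);
        try
        {
            source.Position = (long)resumeChunk * chunkSize;
            while (true)
            {
                // Stream.Read may return fewer bytes than asked for, so fill the chunk in a loop.
                int filled = 0;
                int read;
                while (filled < chunkSize &&
                       (read = source.Read(buffer, filled, chunkSize - filled)) > 0)
                {
                    filled += read;
                }
                if (filled == 0) break; // nothing left to send

                // Expose only the bytes actually read; no Array.Resize and no extra copy.
                using (var window = new MemoryStream(buffer, 0, filled, writable: false))
                {
                    upload(window);
                }
                if (filled < chunkSize) break; // short chunk means end of file
            }
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```

With this shape the same buffer is reused for every chunk, and the last (shorter) chunk is simply a smaller window over it.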
I hope this helps
joe
See also RecyclableMemoryStream.
From this article:
Microsoft.IO.RecyclableMemoryStream is a MemoryStream replacement that offers superior behavior for performance-critical systems. In particular it is optimized to do the following:
- Eliminate Large Object Heap allocations by using pooled buffers
- Incur far fewer gen 2 GCs, and spend far less time paused due to GC
- Avoid memory leaks by having a bounded pool size
- Avoid memory fragmentation
- Provide excellent debuggability
- Provide metrics for performance tracking
I'm not so sure about the first part of your question, but as for a better way: have you considered BITS? It allows background downloading of files over HTTP. You can provide it an http:// or file:// URI. It is resumable from the point at which it was interrupted, and it downloads in chunks of bytes using the Range header in its HTTP requests. It is used by Windows Update. You can subscribe to events that give information on progress and completion.
I have come up with another solution for this, let me know what you think!
Since I don't want to have large amounts of data in memory, I was looking for an elegant way to temporarily store byte arrays or a stream.
The idea is to create a temp file (you don't need specific rights to do this) and then use it like a memory stream. Making the class disposable means the temp file is cleaned up after it has been used.
    public class TempFileStream : Stream
    {
        private readonly string _filename;
        private readonly FileStream _fileStream;

        public TempFileStream()
        {
            this._filename = Path.GetTempFileName();
            this._fileStream = File.Open(this._filename, FileMode.OpenOrCreate, FileAccess.ReadWrite);
        }

        public override bool CanRead
        {
            get
            {
                return this._fileStream.CanRead;
            }
        }

        // and so on with wrapping the stream to the underlying filestream
        ...

        // finally override the Dispose method and remove the temp file
        protected override void Dispose(bool disposing)
        {
            base.Dispose(disposing);
            if (disposing)
            {
                this._fileStream.Close();
                this._fileStream.Dispose();
                try
                {
                    File.Delete(this._filename);
                }
                catch (Exception)
                {
                    // if something goes wrong while deleting the temp file we can ignore it.
                }
            }
        }
    }
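A possibly simpler variant of the same idea, offered only as a sketch: FileStream accepts FileOptions.DeleteOnClose, which asks for the file to be removed when the stream is closed, so a wrapper class may not be needed at all. `TempStreams` and `CreateSelfDeleting` are hypothetical names, and the 4096-byte buffer size is an arbitrary choice.

```csharp
using System.IO;

public static class TempStreams
{
    // Hypothetical helper: returns a read/write stream over a fresh temp file
    // that is deleted when the stream is closed or disposed.
    public static FileStream CreateSelfDeleting(out string path)
    {
        // GetTempFileName creates an empty file and returns its full path.
        path = Path.GetTempFileName();
        return new FileStream(
            path,
            FileMode.Open,
            FileAccess.ReadWrite,
            FileShare.None,
            4096,
            FileOptions.DeleteOnClose);
    }
}
```

Usage is the same as for any other stream: wrap it in a using block, write the chunk, rewind with Position = 0, and pass it on.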