I am using VSTS2008 + C# + .Net 3.5 to run this console application on x64 Server 2003 Enterprise with 12G physical memory.
Here is my code, and I find when executing statement bformatter.Serialize(stream, table), there is out of memory exception. I monitored memory usage through Perormance Tab of Task Manager and I find only 2G physical memory is used when exception is thrown, so should be not out of memory. :-(
Any ideas what is wrong? Any limitation of .Net serialization?
static DataTable MakeParentTable()
{
// Create a new DataTable.
System.Data.DataTable table = new DataTable("ParentTable");
// Declare variables for DataColumn and DataRow objects.
DataColumn column;
DataRow row;
// Create new DataColumn, set DataType,
// ColumnName and add to DataTable.
column = new DataColumn();
column.DataType = System.Type.GetType("System.Int32");
column.ColumnName = "id";
column.ReadOnly = true;
column.Unique = true;
// Add the Column to the DataColumnCollection.
table.Columns.Add(column);
// Create second column.
column = new DataColumn();
column.DataType = System.Type.GetType("System.String");
column.ColumnName = "ParentItem";
column.AutoIncrement = false;
column.Caption = "ParentItem";
column.ReadOnly = false;
column.Unique = false;
// Add the column to the table.
table.Columns.Add(column);
// Make the ID column the primary key column.
DataColumn[] PrimaryKeyColumns = new DataColumn[1];
PrimaryKeyColumns[0] = table.Columns["id"];
table.PrimaryKey = PrimaryKeyColumns;
// Create three new DataRow objects and add
// them to the DataTable
for (int i = 0; i <= 5000000; i++)
{
row = table.NewRow();
row["id"] = i;
row["ParentItem"] = "ParentItem " + i;
table.Rows.Add(row);
}
return table;
}
static void Main(string[] args)
{
DataTable table = MakeParentTable();
Stream stream = new MemoryStream();
BinaryFormatter bformatter = new BinaryFormatter();
bformatter.Serialize(stream, table); // out of memory exception here
Console.WriteLine(table.Rows.Count);
return;
}
thanks in advance, George
1) The OS is x64, but is the app x64 (or anycpu)? If not, it is capped at 2Gb.
2) Does this happen 'early on', or after the app has been running for some time (i.e. n serializations later)? Could it maybe be a result of large object heap fragmentation...?
As already discussed this is a fundamental issue with trying to get contiguous blocks of memory in the Gigabyte sort of size.
You will be limited by (in increasing difficulty)
You can find that you run out of space before the CLR limit of
2
because the backing buffer in the stream is expanded in a 'doubling' fashion and this swiftly results in the buffer being allocated in the Large Object Heap. This heap is not compacted in the same way the other heaps are(1) and as a result the process of building up to the theoretical maximum size of the buffer under2
fragments the LOH so that you fail to find a sufficiently large contiguous block before this happens.Thus a mitigation approach if you are close to the limit is to set the initial capacity of the stream such that it definitely has sufficient space from the start via one of the constructors.
Given that you are writing to the memory stream as part of a serialization process it would make sense to actually use streams as intended and use only the data required.
Perhaps if you tell us what you are serializing an object of this size for we might be able to tell you better ways to do it.
Interestingly, it actually goes up to 3.7GB before giving a memory error here (Windows 7 x64). Apparently, it would need about double that amount to complete.
Given that the application uses 1.65GB after creating the table, it seems likely that it's hitting the 2GB
byte[]
(or any single object) limit Marc Gravell is speaking of (1.65GB + 2GB ~= 3.7GB)Based on this blog, I suppose you could allocate your memory using the WINAPI, and write your own MemoryStream implementation using that. That is, if you really wanted to do this. Or write one using more than one array of course :)
Note:
DataTable
defaults to the xml serialization format that was used in 1.*, which is incredibly inefficient. One thing to try is switching to the newer format:Re the out-of-memory / 2GB; individual .NET objects (such as the
byte[]
behind aMemoryStream
) are limited to 2GB. Perhaps try writing to aFileStream
instead?(edit: nope: tried that, still errors)
I also wonder if you may get better results (in this case) using
table.WriteXml(stream)
, perhaps with compression such as GZIP if space is a premium.