Performance optimizing use of generated XmlSeriali

Posted 2019-01-18 17:14

We have a few XML files that are being read by our applications. The XML format is fixed, and thus we can read them very easily with XmlSerializer.

I use this code to read the XML files and convert them to classes:

public static T FromXml<T>(this string xml) where T : class
{
    if (string.IsNullOrEmpty(xml))
    {
        return default(T);
    }

    XmlSerializer xmlserializer = new XmlSerializer(typeof(T));

    XmlTextReader textReader = new XmlTextReader(new StringReader(xml));
    textReader.Normalization = false;

    XmlReaderSettings settings = new XmlReaderSettings();

    T value;

    using (XmlReader reader = XmlReader.Create(textReader, settings))
    {
        value = (T)xmlserializer.Deserialize(reader);
    }

    return value;
}

However, there are some performance issues. The first time this code is called for a specific type T, the XmlSerializer generates a Project.XmlSerializer.dll file.

This is fine, but it costs some precious milliseconds (about 900ms in my case). It can be circumvented by generating that assembly beforehand with the XML Serializer Generator (sgen), which brings the time down to about half. What remains is primarily spent reading and reflecting over the assembly.
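
To isolate the first-use cost described above, a minimal timing sketch with Stopwatch could look like the following. SampleSettings and SerializerTiming are hypothetical names used only for illustration, not the real classes from the application, and the measured values will differ per machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for one of the real settings classes.
public class SampleSettings
{
    public string LastLanguage { get; set; }
}

public static class SerializerTiming
{
    // Deserializes the XML and reports how long serializer construction
    // plus deserialization took, in milliseconds.
    public static SampleSettings Deserialize(string xml, out long elapsedMs)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();

        // Constructing the serializer is where the serializer code is
        // generated or loaded the first time a type is used.
        XmlSerializer serializer = new XmlSerializer(typeof(SampleSettings));

        SampleSettings result;
        using (StringReader reader = new StringReader(xml))
        {
            result = (SampleSettings)serializer.Deserialize(reader);
        }

        stopwatch.Stop();
        elapsedMs = stopwatch.ElapsedMilliseconds;
        return result;
    }
}
```

Calling Deserialize twice with the same XML and comparing the two elapsedMs values should show how much of the total is one-time set-up, since the XmlSerializer(Type) constructor reuses the serializer it generated for a type.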

I want to optimize this further, by bringing the XmlSerializer classes inside the assembly the actual classes are in, but I can't find a way to let XmlSerializer know not to read an external assembly, but use the serializer from the current assembly.

Any thoughts how to do this or an alternative way to make this work? (I can't pre-load them since most of the serialized classes are used at start-up)


Analysis using ANTS Profiler (metrics from another machine, but the same pattern):

Plain. Most of the time (300ms + 400ms = 700ms) is lost in generating and loading the XmlSerializer assembly.

With sgen generated assembly. Most of the time (336ms) is lost in loading the XmlSerializer assembly.

When including the generated serializer source inside the project and calling the serializer directly, the operation drops to 456ms (versus 1s in the first case and 556ms in the second).

2 Answers
chillily
Reply #2 · 2019-01-18 17:50

Note: OP posted a sample config: http://pastebin.com/d67nch3R

Based on the sample config and the type of issue you're experiencing, there are a couple of brute-force approaches that are pretty much guaranteed to do the trick. Both boil down to abandoning the XML serializer altogether.

Route #1

Abandon XML serialization and use XDocument to get data out of the XML.
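
A minimal sketch of Route #1, assuming the XML mirrors the JSON sample in this answer (a root element containing a Connections element with a default attribute, and a LastLanguage element). The root element name and the XDocumentSettingsReader class are assumptions for illustration; the real layout is in the pastebin sample.

```csharp
using System;
using System.Xml.Linq;

public static class XDocumentSettingsReader
{
    // Reads the text of the LastLanguage element; returns null if it is absent.
    public static string ReadLastLanguage(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        // The explicit string conversion on XElement yields null for a missing element.
        return (string)doc.Root.Element("LastLanguage");
    }

    // Reads the "default" attribute of the Connections element; returns null if absent.
    public static string ReadDefaultConnection(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        XElement connections = doc.Root.Element("Connections");
        return connections == null ? null : (string)connections.Attribute("default");
    }
}
```

Because nothing here goes through XmlSerializer, no serializer assembly has to be generated or loaded; the trade-off is that the mapping from elements to properties is written by hand.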

Route #2

Use JSON and Newtonsoft Json.NET to store and load the configs. It should perform a lot better than XmlSerializer.

The sample json counterpart would look like this:

{
  "Connections": {
    "-default": "Local\\SqlServer",
    "-forcedefault": "false",
    "group": {
      "-name": "Local",
      "connection": {
        "-name": "SqlServer",
        "database": {
          "-provider": "SqlServer",
          "-connectionString": "blah"
        }
      }
    }
  },
  "LastLanguage": "en",
  "UserName": "un",
  "SavePassword": "true",
  "AutoConnect": "false",
  "Password": "someObfuscatedHashedPassword==",
  "ConnectionName": "Somewhere\\Database",
  "LastAvailableBandwidth": "0",
  "LastAvailableLatency": "0",
  "DateLastConnectionSuccesful": "2014-08-13T15:21:35.9663654+02:00"
}

And load it:

UserSettings settings = JsonConvert.DeserializeObject<UserSettings>(File.ReadAllText("settings.json"));
做个烂人
Reply #3 · 2019-01-18 18:08

Unless you are doing the serialization at the very start of the app, one option is to force the CLR to load and even JIT-compile the classes you're using ahead of time, possibly on a background thread started as soon as the app launches.

Something like, for example:

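// Requires: using System; using System.Reflection;
// assembliesThatShouldBeCompiled is a collection of Assembly objects you supply.
foreach (Assembly a in assembliesThatShouldBeCompiled)
    foreach (Type type in a.GetTypes())
        if (!type.IsAbstract && type.IsClass)
        {
            foreach (MethodInfo method in type.GetMethods(
                                BindingFlags.DeclaredOnly |
                                BindingFlags.NonPublic |
                                BindingFlags.Public |
                                BindingFlags.Instance |
                                BindingFlags.Static))
            {
                // Generic methods cannot be prepared without type arguments.
                if (method.ContainsGenericParameters || 
                    method.IsGenericMethod || 
                    method.IsGenericMethodDefinition)
                    continue;

                // P/Invoke stubs have no managed body to compile.
                if ((method.Attributes & MethodAttributes.PinvokeImpl) > 0)
                    continue;

                // Force JIT compilation of the method now.
                System.Runtime.CompilerServices
                   .RuntimeHelpers.PrepareMethod(method.MethodHandle);
            }
        }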
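
To run this kind of warm-up in the background, as suggested above, a sketch along the following lines could be used. JitWarmUp, PrepareType and StartInBackground are hypothetical names chosen for this illustration; the additional skipping of open generic types and abstract methods is an assumption added to reduce the chance of PrepareMethod throwing, not part of the loop above.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class JitWarmUp
{
    private const BindingFlags DeclaredMethods =
        BindingFlags.DeclaredOnly | BindingFlags.NonPublic | BindingFlags.Public |
        BindingFlags.Instance | BindingFlags.Static;

    // JIT-compiles the methods declared on one type and returns how many were prepared.
    public static int PrepareType(Type type)
    {
        // Open generic types cannot be prepared without concrete type arguments.
        if (type.ContainsGenericParameters)
            return 0;

        int prepared = 0;
        foreach (MethodInfo method in type.GetMethods(DeclaredMethods))
        {
            if (method.IsAbstract || method.ContainsGenericParameters)
                continue;
            if ((method.Attributes & MethodAttributes.PinvokeImpl) != 0)
                continue;

            RuntimeHelpers.PrepareMethod(method.MethodHandle);
            prepared++;
        }
        return prepared;
    }

    // Prepares every class in the given assemblies on a thread-pool thread.
    public static Task<int> StartInBackground(IEnumerable<Assembly> assemblies)
    {
        return Task.Run(() =>
        {
            int total = 0;
            foreach (Assembly assembly in assemblies)
                foreach (Type type in assembly.GetTypes())
                    if (type.IsClass)
                        total += PrepareType(type);
            return total;
        });
    }
}
```

StartInBackground would be called once, early in start-up, and the returned task can be ignored or awaited later. It only helps when there is other start-up work to overlap with before the first deserialization happens.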

It's strange, however, that your profiling seems to indicate there is not much difference when the SGEN'd code is in a separate assembly, even though loading seems to be the bottleneck. I wonder what the graph looks like for the case where they are in the same assembly?
