Construct a LINQ GroupBy query using expression trees

Posted 2019-03-27 17:33

Question:

I have been stuck on this problem for a week and have found no solution.

I have a POCO like below:

public class Journal {
    public int Id { get; set; }
    public string AuthorName { get; set; }
    public string Category { get; set; }
    public DateTime CreatedAt { get; set; }
}

I want to know, for a specific date span (grouped by month or by year), the number of journals per AuthorName or per Category.

After I send the queried object to a JSON serializer, it should generate JSON like the following (I'm only using JSON to illustrate the data I want; how to serialize an object to JSON is not my problem):

data: {
    '201301': {
        'Alex': 10,
        'James': 20
    },
    '201302': {
        'Alex': 1,
        'Jessica': 9
    }
}

OR

data: {
    '2012': {
        'C#': 230,
        'VB.NET': 120,
        'LINQ': 97
    },
    '2013': {
        'C#': 115,
        'VB.NET': 29,
        'LINQ': 36
    }
}

What I do know is how to write the LINQ query in method syntax, like:

var query = db.GroupBy(x => new 
    {
        Year = x.CreatedAt.Year,
        Month = x.CreatedAt.Month
    }, prj => prj.AuthorName)
    .Select(data => new {
        Key = data.Key.Year * 100 + data.Key.Month, // very ugly code, I know
        Details = data.GroupBy(y => y).Select(z => new { z.Key, Count = z.Count() })
    });

Whether to group by month or by year, and by AuthorName or by Category, will be passed in as two string method parameters. What I don't know is how to use "magic string" parameters in a GroupBy() call. After some googling, it seems I cannot group data by passing a magic string like "AuthorName"; what I should do instead is build an expression tree and pass it to GroupBy().

Any solution or suggestion is appreciated.

Answer 1:

Ooh, this looks like a fun problem :)

So first, let's set up our faux-source, since I don't have your DB handy:

// SETUP: fake up a data source
var folks = new[]{"Alex", "James", "Jessica"};
var cats = new[]{"C#", "VB.NET", "LINQ"};
var r = new Random();
var entryCount = 100;
var entries = 
    from i in Enumerable.Range(0, entryCount)
    let id = r.Next(0, 999999)
    let person = folks[r.Next(0, folks.Length)]
    let category = cats[r.Next(0, cats.Length)]
    let date = DateTime.Now.AddDays(r.Next(0, 100) - 50)
    select new Journal() { 
        Id = id, 
        AuthorName = person, 
        Category = category, 
        CreatedAt = date };    

Ok, so now we've got a set of data to work with, let's look at what we want...we want something with a "shape" like:

public Expression<Func<Journal, ????>> GetThingToGroupByWith(
    string[] someMagicStringNames, 
    ????)

That has roughly the same functionality as (in pseudo code):

GroupBy(x => new { x.magicStringNames })

Let's dissect it one piece at a time. First, how the heck do we do this dynamically?

x => new { ... }

The compiler does the magic for us normally - what it does is define a new Type, and we can do the same:

    var sourceType = typeof(Journal);

    // define a dynamic type (read: anonymous type) for our needs
    var dynAsm = AppDomain
        .CurrentDomain
        .DefineDynamicAssembly(
            new AssemblyName(Guid.NewGuid().ToString()), 
            AssemblyBuilderAccess.Run);
    var dynMod = dynAsm
         .DefineDynamicModule(Guid.NewGuid().ToString());
    var typeBuilder = dynMod
         .DefineType(Guid.NewGuid().ToString());
    var properties = groupByNames
        .Select(name => sourceType.GetProperty(name))
        .Cast<MemberInfo>();
    var fields = groupByNames
        .Select(name => sourceType.GetField(name))
        .Cast<MemberInfo>();
    var propFields = properties
        .Concat(fields)
        .Where(pf => pf != null);
    foreach (var propField in propFields)
    {        
        typeBuilder.DefineField(
            propField.Name, 
            propField.MemberType == MemberTypes.Field 
                ? (propField as FieldInfo).FieldType 
                : (propField as PropertyInfo).PropertyType, 
            FieldAttributes.Public);
    }
    var dynamicType = typeBuilder.CreateType();

So what we've done here is define a custom, throwaway type that has one field for each name we pass in, which is the same type as the (either Property or Field) on the source type. Nice!
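To poke at that idea in isolation, here is a minimal sketch that builds such a throwaway type and inspects it with reflection. The names DynamicTypeSketch, BuildType and SketchKey are made up for illustration, and the sketch assumes C# 7 tuples plus the static AssemblyBuilder.DefineDynamicAssembly overload (which, unlike the AppDomain method above, is also available outside .NET Framework).

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class DynamicTypeSketch
{
    // Hypothetical helper: builds a class with one public field per (name, type) pair.
    public static Type BuildType(params (string Name, Type Type)[] members)
    {
        var asm = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("Sketch_" + Guid.NewGuid().ToString("N")),
            AssemblyBuilderAccess.Run);
        var mod = asm.DefineDynamicModule("SketchModule");
        var typeBuilder = mod.DefineType(
            "SketchKey",
            TypeAttributes.Public | TypeAttributes.Class);

        foreach (var member in members)
        {
            typeBuilder.DefineField(member.Name, member.Type, FieldAttributes.Public);
        }

        // No constructor was defined, so a public parameterless one is supplied.
        return typeBuilder.CreateType();
    }
}
```

Calling DynamicTypeSketch.BuildType(("AuthorName", typeof(string)), ("CreatedAt", typeof(DateTime))) should give back a type with exactly those two public fields. One thing worth checking before relying on a type like this as a grouping key: unlike a compiler-generated anonymous type, it does not override Equals or GetHashCode, so two instances holding the same field values still compare as different objects under the default comparer.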

Now how do we give LINQ what it wants?

First, let's set up an "input" for the func we'll return:

// Create and return an expression that maps T => dynamic type
var sourceItem = Expression.Parameter(sourceType, "item");

We know we'll need to "new up" one of our new dynamic types...

Expression.New(dynamicType.GetConstructor(Type.EmptyTypes))

And we'll need to initialize it with the values coming in from that parameter...

Expression.MemberInit(
    Expression.New(dynamicType.GetConstructor(Type.EmptyTypes)),
    bindings), 

But what the heck are we going to use for bindings? Hmm...well, we want something that binds to the corresponding properties/fields in the source type, but remaps them to our dynamicType fields...

    var bindings = dynamicType
        .GetFields()
        .Select(p => 
            Expression.Bind(
                 p, 
                 Expression.PropertyOrField(
                     sourceItem, 
                     p.Name)))
        .OfType<MemberBinding>()
        .ToArray();
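If the bindings are hard to picture, the same pattern can be tried against a hand-written target type first. This is only a sketch: Source, KeyHolder and BindingSketch are hypothetical stand-ins for the source type and the dynamic type, not part of the solution itself.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical source type with properties, like Journal.
public class Source
{
    public string AuthorName { get; set; }
    public int Id { get; set; }
}

// Hypothetical target type with public fields, standing in for the dynamic type.
public class KeyHolder
{
    public string AuthorName;
    public int Id;
}

public static class BindingSketch
{
    // Builds: item => new KeyHolder { AuthorName = item.AuthorName, Id = item.Id }
    public static Func<Source, KeyHolder> BuildMapper()
    {
        var item = Expression.Parameter(typeof(Source), "item");

        // One binding per target field, fed from the same-named member on the source.
        var bindings = typeof(KeyHolder)
            .GetFields()
            .Select(f => (MemberBinding)Expression.Bind(
                f,
                Expression.PropertyOrField(item, f.Name)))
            .ToArray();

        var body = Expression.MemberInit(
            Expression.New(typeof(KeyHolder)),
            bindings);

        return Expression.Lambda<Func<Source, KeyHolder>>(body, item).Compile();
    }
}
```

Running BindingSketch.BuildMapper() and applying the result to a Source instance copies each same-named member across, which is the same job the bindings do for the dynamic type.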

Oof...nasty looking, but we're still not done - so we need to declare a return type for the Func we're creating via Expression trees...when in doubt, use object!

Expression.Convert( expr, typeof(object))
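The Convert is not just decoration. As a rough illustration (Item and ConvertSketch are made-up names), a lambda typed as returning object will not accept an int-typed body unless it is boxed explicitly:

```csharp
using System;
using System.Linq.Expressions;

public static class ConvertSketch
{
    // Hypothetical type with a value-type property.
    public class Item
    {
        public int Id { get; set; }
    }

    // Builds: item => (object)item.Id
    public static Func<Item, object> BoxedId()
    {
        var item = Expression.Parameter(typeof(Item), "item");
        var id = Expression.Property(item, nameof(Item.Id));
        var boxed = Expression.Convert(id, typeof(object));
        return Expression.Lambda<Func<Item, object>>(boxed, item).Compile();
    }

    // Tries the same lambda without the Convert and reports whether it was rejected.
    public static bool ThrowsWithoutConvert()
    {
        var item = Expression.Parameter(typeof(Item), "item");
        var id = Expression.Property(item, nameof(Item.Id));
        try
        {
            Expression.Lambda<Func<Item, object>>(id, item);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

For a reference-typed body, such as a string property or the dynamic type here, the lambda would be accepted without the Convert, but keeping it makes the same code path work for either kind of body.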

And finally, we'll bind this to our "input parameter" via Lambda, making the whole stack:

    // Create and return an expression that maps T => dynamic type
    var sourceItem = Expression.Parameter(sourceType, "item");
    var bindings = dynamicType
        .GetFields()
        .Select(p => Expression.Bind(p, Expression.PropertyOrField(sourceItem, p.Name)))
        .OfType<MemberBinding>()
        .ToArray();

    var fetcher = Expression.Lambda<Func<T, object>>(
        Expression.Convert(
            Expression.MemberInit(
                Expression.New(dynamicType.GetConstructor(Type.EmptyTypes)),
                bindings), 
            typeof(object)),
        sourceItem);                

For ease of use, let's wrap the whole mess up as an extension method, so now we've got:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;
using System.Reflection.Emit;

public static class Ext
{
    // Science Fact: the "Grouper" (as in the Fish) is classified as:
    //   Perciformes Serranidae Epinephelinae
    public static Expression<Func<T, object>> Epinephelinae<T>(
         this IEnumerable<T> source, 
         string [] groupByNames)
    {
        var sourceType = typeof(T);

        // define a dynamic type (read: anonymous type) for our needs
        // (AppDomain.DefineDynamicAssembly is .NET Framework only; elsewhere, the
        // static AssemblyBuilder.DefineDynamicAssembly takes the same two arguments)
        var dynAsm = AppDomain
            .CurrentDomain
            .DefineDynamicAssembly(
                new AssemblyName(Guid.NewGuid().ToString()), 
                AssemblyBuilderAccess.Run);
        var dynMod = dynAsm
             .DefineDynamicModule(Guid.NewGuid().ToString());
        var typeBuilder = dynMod
             .DefineType(Guid.NewGuid().ToString());

        // look each name up as a property and as a field; misses come back null
        var properties = groupByNames
            .Select(name => sourceType.GetProperty(name))
            .Cast<MemberInfo>();
        var fields = groupByNames
            .Select(name => sourceType.GetField(name))
            .Cast<MemberInfo>();
        var propFields = properties
            .Concat(fields)
            .Where(pf => pf != null);

        // one public field on the dynamic type per matched member
        foreach (var propField in propFields)
        {        
            typeBuilder.DefineField(
                propField.Name, 
                propField.MemberType == MemberTypes.Field 
                    ? (propField as FieldInfo).FieldType 
                    : (propField as PropertyInfo).PropertyType, 
                FieldAttributes.Public);
        }
        var dynamicType = typeBuilder.CreateType();

        // Create and return an expression that maps T => dynamic type
        var sourceItem = Expression.Parameter(sourceType, "item");
        var bindings = dynamicType
            .GetFields()
            .Select(p => Expression.Bind(
                    p, 
                    Expression.PropertyOrField(sourceItem, p.Name)))
            .OfType<MemberBinding>()
            .ToArray();

        var fetcher = Expression.Lambda<Func<T, object>>(
            Expression.Convert(
                Expression.MemberInit(
                    Expression.New(dynamicType.GetConstructor(Type.EmptyTypes)),
                    bindings), 
                typeof(object)),
            sourceItem);                
        return fetcher;
    }
}

Now, to use it:

// What you had originally (hand-tooled query)
var db = entries.AsQueryable();
var query = db.GroupBy(x => new 
    {
        Year = x.CreatedAt.Year,
        Month = x.CreatedAt.Month
    }, prj => prj.AuthorName)
    .Select(data => new {
        Key = data.Key.Year * 100 + data.Key.Month, // very ugly code, I know
        Details = data.GroupBy(y => y).Select(z => new { z.Key, Count = z.Count() })
    });    

var func = db.Epinephelinae(new[]{"CreatedAt", "AuthorName"});
var dquery = db.GroupBy(func, prj => prj.AuthorName);

This solution lacks the flexibility of "nested statements", like "CreatedAt.Month", but with a bit of imagination, you could possibly extend this idea to work with any freeform query.
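As one possible direction for the nested case, here is a sketch that walks a dotted path such as "CreatedAt.Month" and produces a single-key selector. PathSelector and For are hypothetical names, and this is a simplification rather than an extension of the dynamic-type approach: it handles only one key per call and boxes it to object.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public class Journal
{
    public int Id { get; set; }
    public string AuthorName { get; set; }
    public string Category { get; set; }
    public DateTime CreatedAt { get; set; }
}

public static class PathSelector
{
    // Hypothetical helper: turns "CreatedAt.Month" into
    //   item => (object)item.CreatedAt.Month
    public static Expression<Func<T, object>> For<T>(string path)
    {
        var item = Expression.Parameter(typeof(T), "item");

        // Walk the path one segment at a time: item -> item.CreatedAt -> item.CreatedAt.Month
        Expression body = path
            .Split('.')
            .Aggregate(
                (Expression)item,
                (expr, name) => Expression.PropertyOrField(expr, name));

        return Expression.Lambda<Func<T, object>>(
            Expression.Convert(body, typeof(object)),
            item);
    }
}
```

With this in place, something like db.GroupBy(PathSelector.For<Journal>("CreatedAt.Month"), PathSelector.For<Journal>("AuthorName")) takes both choices from strings. Because the keys here are boxed ints or strings, they compare by value, so grouping over an in-memory source behaves as expected; whether a given LINQ provider can translate the boxed key to SQL is something to verify against your own database.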