Generic method to calcuate Euclidean Distance usin

2019-08-17 14:54发布

问题:

I'd like a method that can calculate the Euclidean distance using expressions and order an IQueryable:

sqrt[(q1 - p1)^2 + (q2 - p2)^2 + ... + (qn - pn)^2]

This is the method signature I've come up with:

public static IOrderedQueryable<T> EuclideanDistanceOrder<T>(
    this IQueryable<T> query, IEnumerable<Expression<Func<T, double>>> expressions)
    {
        var orderedQuery = query.OrderBy(i => Math.Sqrt(expressions.Aggregate((total, item) => total + Math.Pow(item, 2))));
        return orderedQuery;
    }

I'm not sure what to do with item and total (since they are Expression<Func<T, double>>). I've tried this a few different ways, including using Expression.Power and Expression.Add. I've tried defining the expressions to be composed separately:

Expression<Func<double, double>> power = i => Math.Pow(i, 2);
Expression<Func<List<Expression<Func<T, double>>>, double>> dist = (items) => Math.Sqrt(items.Sum(power));

But I still don't know what to do with power.

Can I get some help? Is there a better way to approach this?

回答1:

To get it working with EF or LinqToSQL you would have to pass all information as Expressions even property accessors for P and Q. So that is why I have modified your method declaration:

public static class Extension
{
    public static IOrderedQueryable<T> EuclideanDistanceOrder<T>(
        this IQueryable<T> query, 
        IEnumerable<Expression<Func<T, double>>> pExpressions, 
        IEnumerable<Expression<Func<T, double>>> qExpressions)
    {
        var parameter = Expression.Parameter(typeof(T));
        var pBodies = pExpressions
            .Select(x => ReplaceParameter(x.Body, parameter))
            .ToArray();

        var qBodies = qExpressions
            .Select(x => ReplaceParameter(x.Body, parameter))
            .ToArray();

        var distances = pBodies
            .Select((x, i) => CreateDistance(x, qBodies[i]))
            .ToArray();

        var squers = distances
            .Select(x => CreateSquerExpression(x))
            .ToArray();

        var sum = squers.First();
        for (int i = 1; i < squers.Count(); i++)
        {
            sum = Expression.Add(sum, squers[i]);
        }
        var funcExpression = Expression.Lambda<Func<T, double>>(sum, parameter);
        //the sqrt is irrelevant to order of this sequence
        return query.OrderBy(funcExpression);
    }

    private static Expression CreateDistance(Expression p, Expression q)
    {
        return Expression.Subtract(q, p);
    }

    private static Expression CreateSquerExpression(Expression x)
    {
        var method = typeof(Math).GetMethod("Pow", BindingFlags.Static | BindingFlags.Public);
        return Expression.Call(method, x, Expression.Constant(2.0));
    }

    private static Expression ReplaceParameter(Expression expression, ParameterExpression parameter)
    {
        var unaryExpression = expression as UnaryExpression;
        MemberExpression memberExpression;
        if (unaryExpression != null)
        {
            memberExpression = unaryExpression.Operand as MemberExpression;
        }
        else
        {
            memberExpression = expression as MemberExpression;
        }

        if (memberExpression == null)
            throw new NotImplementedException();

        if (!(memberExpression.Expression is ParameterExpression) || !(memberExpression.Member is PropertyInfo))
            throw new NotImplementedException();

        return Expression.Property(parameter, (PropertyInfo)memberExpression.Member);
    }
}

While invoking like this:

    var list = new[]{ new Item
    {
        P1 = 0,
        Q1 = 0,
        P2 = 3,
        Q2 = 1,
    },
    new Item
    {
        P1 = 0,
        Q1 = 0,
        P2 = 2,
        Q2 = 1,
    }
};

var query = list.AsQueryable();

var result = query.EuclideanDistanceOrder(new Expression<Func<Item, double>>[]{
    x => x.P1,
    x => x.P2
},
new Expression<Func<Item, double>>[]{
    x => x.Q1,
    x => x.Q2
}).ToArray();

internal class Item
{
    public double P1 { get; set; }
    public double Q1 { get; set; }
    public double P2 { get; set; }
    public double Q2 { get; set; }
}

It works for liq to objects. I'm only not sure if EF or linqtoSql will map Math.Power method to sql. If not it is easy enough to change to multiplication.



回答2:

I haven't been able to test this, but it seems that it ought to work. There's no square-root at the end, but the order should be the same either way.

public static IOrderedQueryable<T> EuclideanDistanceOrder<T>(this IQueryable<T> query, IEnumerable<Expression<Func<T, double>>> expressions)
{
    var parameter = Expression.Parameter(typeof(T), "item");
    var seed = Expression.Lambda<Func<T, double>>(Expression.Constant((double)0), parameter);
    return query.OrderBy(expressions.Aggregate(seed, GetAggregateExpression));
}

private static Expression<Func<T, double>> GetAggregateExpression<T>(Expression<Func<T, double>> sum, Expression<Func<T, double>> item)
{
    var parameter = Expression.Parameter(typeof(T), "item");
    return Expression.Lambda<Func<T, double>>(Expression.Add(Expression.Invoke(sum, parameter), Expression.Power(Expression.Invoke(item, parameter), Expression.Constant((double)2))), parameter);
}

Edit:

Since you can't use Expression.Invoke(), you'll need to inline the bodies of the Expressions passed into EuclideanDistanceOrder. There doesn't seem to be any "nice" way to do this, so I've written a Replace method to do it. I've only implemented Replace for some of the more common Expression types, hopefully this will be enough to cover your usage, but you may need to implement it for other Expression types.

public static IOrderedQueryable<T> EuclideanDistanceOrder<T>(this IQueryable<T> query, IEnumerable<Expression<Func<T, double>>> expressions)
{
    var parameter = Expression.Parameter(typeof(T), "item");
    var seed = Expression.Constant((double)0);
    var agg = expressions.Aggregate((Expression)seed, (s, item) => Expression.Add(s, Expression.Power(Replace(item.Body, item.Parameters[0], parameter), Expression.Constant((double)2))));
    return query.OrderBy(Expression.Lambda<Func<T, double>>(agg, parameter));
}

private static Expression Replace(Expression expression, ParameterExpression original, ParameterExpression replacement)
{
    if (expression is BinaryExpression)
    {
        var binaryExpression = (BinaryExpression)expression;
        return Expression.MakeBinary(expression.NodeType, Replace(binaryExpression.Left, original, replacement), Replace(binaryExpression.Right, original, replacement), binaryExpression.IsLiftedToNull, binaryExpression.Method, binaryExpression.Conversion);
    }
    if (expression is ConditionalExpression)
    {
        var conditionalExpression = (ConditionalExpression)expression;
        return Expression.Condition(Replace(conditionalExpression.Test, original, replacement), Replace(conditionalExpression.IfTrue, original, replacement), Replace(conditionalExpression.IfFalse, original, replacement), conditionalExpression.Type);
    }
    if (expression is ConstantExpression)
    {
        return expression;
    }
    if (expression is MemberExpression)
    {
        var memberExpression = (MemberExpression)expression;
        return Expression.MakeMemberAccess(Replace(memberExpression.Expression, original, replacement), memberExpression.Member);
    }
    if (expression is ParameterExpression)
    {
        var parameterExpression = (ParameterExpression)expression;
        return parameterExpression == original ? replacement : parameterExpression;
    }
    if (expression is UnaryExpression)
    {
        var unaryExpression = (UnaryExpression)expression;
        return Expression.MakeUnary(unaryExpression.NodeType, Replace(unaryExpression.Operand, original, replacement), unaryExpression.Type, unaryExpression.Method);
    }
    throw new Exception(string.Format("Unsupported expression type: {0}", expression.NodeType));
}

So if for example, our input expressions are:

p => p.X1 - p.X2
p => p.Y1 - p.Y2

The original implementation would've constructed:

i => 0 + expressions[0](i) ^ 2 + expressions[1](i) ^ 2

The new implementation takes the original expression, and replaces the input parameter (p in the above) with the parameter that will be passed to the final lambda (i), and uses the body of the expression directly in the output:

i => 0 + (i.X1 - i.X2) ^ 2 + (i.Y1 - i.Y2) ^ 2