Algorithm for rating the monotonicity of an array-第2页回答

EDIT: Wow, many great responses. Yes, I am using this as a fitness function for judging the quality of a sort performed by a genetic algorithm. So cost-of-evaluation is important (i.e., it has to be fast, preferably O(n).)

As part of an AI application I am toying with, I'd like to be able to rate a candidate array of integers based on its monotonicity, aka its "sortedness". At the moment, I'm using a heuristic that calculates the longest sorted run, and then divides that by the length of the array:

public double monotonicity(int[] array) {
    if (array.length == 0) return 1d;

    int longestRun = longestSortedRun(array);
    return (double) longestRun / (double) array.length;
}

public int longestSortedRun(int[] array) {

    if (array.length == 0) return 0;

    int longestRun = 1;
    int currentRun = 1;

    for (int i = 1; i < array.length; i++) {
        if (array[i] >= array[i - 1]) {
            currentRun++;
        } else {
            currentRun = 1;
        }

        if (currentRun > longestRun) longestRun = currentRun;
    }

    return longestRun;
}

This is a good start, but it fails to take into account the possibility that there may be "clumps" of sorted sub-sequences. E.g.:

{ 4, 5, 6, 0, 1, 2, 3, 7, 8, 9}

This array is partitioned into three sorted sub-sequences. My algorithm will rate it as only 40% sorted, but intuitively, it should get a higher score than that. Is there a standard algorithm for this sort of thing?

标签： math artificial-intelligence genetic-algorithm information-theory

11条回答

Ridiculous、

2楼-- · 2019-03-26 03:30

It highly depends on what you're intending to use the measure for, but one easy way to do this is to feed the array into a standard sorting algorithm and measure how many operations (swaps and/or comparisons) need to be done to sort the array.

0人赞添加讨论(0) 举报

甜甜的少女心

3楼-- · 2019-03-26 03:31

I have the same problem (monotonicity scoring), and I suggest you to try Longest Increasing Subsequence. The most efficient algorithm run in O(n log n), not so bad.

Taking example from the question, the longest increasing sequence of {4, 5, 6, 0, 1, 2, 3, 7, 8, 9} is {0, 1, 2, 3, 7, 8, 9} (length of 7). Maybe it rate better (70%) than your longest-sorted-run algorithm.

0人赞添加讨论(0) 举报

趁早两清

4楼-- · 2019-03-26 03:32

I expect that the choice of function to use depends very strongly on what you intend to use it for. Based on your question, I would guess that you are using a genetic system to create a sorting program, and this is to be the ranking function. If that is the case, then speed of execution is crucial. Based on that, I bet your longest-sorted-subsequence algorithm would work pretty well. That sounds like it should define fitness pretty well.

0人赞添加讨论(0) 举报

Fickle 薄情

5楼-- · 2019-03-26 03:33

This seems like a good candidate for ~~Levenshtein~~ Damerau–Levenshtein distance - the number of swaps needed to sort the array. This should be proportional to how far each item is from where it should be in a sorted array.

Here's a simple ruby algorithm that sums the squares of the distances. It seems a good measure of sortedness - the result gets smaller every time two out-of-order elements are swapped.

ap = a.sort
sum = 0
a.each_index{|i| j = ap.index(a[i])-i 
  sum += (j*j)
}
dist = sum/(a.size*a.size)

0人赞添加讨论(0) 举报

Bombasti

6楼-- · 2019-03-26 03:38

I would suggest looking at the Pancake Problem and the reversal distance of the permutations. These algorithms are often used to find the distance between two permutations (the Identity and the permuted string). This distance measure should take into account more clumps of in order values, as well as reversals (monotonically decreasing instead of increasing subsequences). There are also approximations that are polynomial time[PDF].

It really all depends on what the number means and if this distance function makes sense in your context though.

0人赞添加讨论(0) 举报

上一页 1 2

Algorithm for rating the monotonicity of an array

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间