Algorithm for rating the monotonicity of an array

EDIT: Wow, many great responses. Yes, I am using this as a fitness function for judging the quality of a sort performed by a genetic algorithm. So the cost of evaluation matters: it has to be fast, preferably O(n).

As part of an AI application I am toying with, I'd like to be able to rate a candidate array of integers based on its monotonicity, aka its "sortedness". At the moment, I'm using a heuristic that calculates the longest sorted run, and then divides that by the length of the array:

public double monotonicity(int[] array) {
    if (array.length == 0) return 1d;

    int longestRun = longestSortedRun(array);
    return (double) longestRun / (double) array.length;
}

public int longestSortedRun(int[] array) {

    if (array.length == 0) return 0;

    int longestRun = 1;
    int currentRun = 1;

    for (int i = 1; i < array.length; i++) {
        if (array[i] >= array[i - 1]) {
            currentRun++;
        } else {
            currentRun = 1;
        }

        if (currentRun > longestRun) longestRun = currentRun;
    }

    return longestRun;
}

This is a good start, but it fails to take into account the possibility that there may be "clumps" of sorted sub-sequences. E.g.:

{ 4, 5, 6, 0, 1, 2, 3, 7, 8, 9}

This array splits into two sorted runs, {4, 5, 6} and {0, 1, 2, 3, 7, 8, 9}. My algorithm scores only the longest one (7 of 10 elements, or 70%) and ignores the other entirely, but intuitively every sorted clump should count toward the score. Is there a standard algorithm for this sort of thing?

Tags: math artificial-intelligence genetic-algorithm information-theory

11 answers

一纸荒年 Trace。

#2 · 2019-03-26 03:11

How about counting the number of steps with increasing value vs. the total number of steps? That's O(n).
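A minimal sketch of this idea in Python, under two assumptions that are not in the answer: the function name `ascending_step_ratio` is made up for illustration, and equal neighbours count as "increasing" to match the `>=` in the question's code.

```python
def ascending_step_ratio(values):
    """Fraction of adjacent pairs that are in non-decreasing order."""
    steps = len(values) - 1
    if steps <= 0:
        # Zero or one element: treated as fully sorted (an assumption).
        return 1.0
    # Count steps where the next value is not smaller than the previous one.
    ascending = sum(1 for prev, cur in zip(values, values[1:]) if cur >= prev)
    return ascending / steps
```

For the array in the question, only the step from 6 to 0 goes down, so the score is 8/9. Note that this measure ignores how far out of place the elements are; it only counts descents.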


Evening l夕情丶

#3 · 2019-03-26 03:12

What you're probably looking for is Kendall Tau. It's a one-to-one function of the bubble sort distance between two arrays. To test whether an array is "almost sorted", compute its Kendall Tau against a sorted array.


Bombasti

#4 · 2019-03-26 03:14

Some experiments with a modified Ratcliff & Obershelp matcher (Python's difflib.SequenceMatcher):

>>> from difflib import SequenceMatcher as sm
>>> a = [ 4, 5, 6, 0, 1, 2, 3, 7, 8, 9 ]
>>> c = [ 0, 1, 9, 2, 8, 3, 6, 4, 7, 5 ]
>>> b = [ 4, 5, 6, 0, 1, 2, 3, 7, 8, 9 ]
>>> b.sort()
>>> s = sm(None, a, b)
>>> s.ratio()
0.69999999999999996
>>> s2 = sm(None, c, b)
>>> s2.ratio()
0.29999999999999999

So it kind of does what it needs to. I'm not too sure how to prove it, though.


贼婆χ

#5 · 2019-03-26 03:23

Something like these? http://en.wikipedia.org/wiki/Rank_correlation


The star"

#6 · 2019-03-26 03:26

Compute the lengths of all sorted sub-sequences, then square them and add them up. If you want to calibrate how much emphasis you put on the longest runs, use a power other than 2.

I'm not sure what the best way to normalize this by length is; maybe divide by the length squared?
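A rough Python sketch of this scoring, taking "sorted sub-sequences" to mean maximal non-decreasing runs and using the suggested division by the length raised to the same power. The names `run_lengths` and `run_power_score` and the `power` parameter are illustrative assumptions.

```python
def run_lengths(values):
    """Lengths of the maximal non-decreasing runs, left to right."""
    if not values:
        return []
    lengths = []
    current = 1
    for prev, cur in zip(values, values[1:]):
        if cur >= prev:
            current += 1
        else:
            # A descent ends the current run and starts a new one.
            lengths.append(current)
            current = 1
    lengths.append(current)
    return lengths


def run_power_score(values, power=2.0):
    """Sum of run lengths raised to `power`, divided by n ** power."""
    n = len(values)
    if n == 0:
        return 1.0  # empty array treated as sorted (an assumption)
    return sum(length ** power for length in run_lengths(values)) / n ** power
```

With this normalization a fully sorted array scores 1, and a strictly descending array of length n scores n / n^2 = 1/n when `power` is 2. The array in the question has runs of length 3 and 7, so it scores (9 + 49) / 100 = 0.58. The power should be greater than 1; with a power of exactly 1 every array scores 1.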


爷的心禁止访问

#7 · 2019-03-26 03:28

Here's one I just made up.

For each pair of adjacent values, calculate the numeric difference between them. If the second is greater than or equal to the first, add that to the sorted total, otherwise add to the unsorted total. When done, take the ratio of the two.
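A possible Python sketch of this heuristic. One deliberate deviation, flagged as an assumption: instead of the raw ratio of the two totals (which is undefined for a fully sorted array, where the unsorted total is zero), it returns sorted / (sorted + unsorted), which stays in [0, 1]. The name `rise_fall_score` is hypothetical.

```python
def rise_fall_score(values):
    """Share of total adjacent movement that goes upward (or stays level)."""
    rising = 0
    falling = 0
    for prev, cur in zip(values, values[1:]):
        diff = cur - prev
        if diff >= 0:
            rising += diff      # in-order step: add to the sorted total
        else:
            falling += -diff    # out-of-order step: add to the unsorted total
    total = rising + falling
    if total == 0:
        # Empty, single-element, or constant array: nothing is out of order.
        return 1.0
    return rising / total
```

For the array in the question the upward steps sum to 11 and the single downward step is 6, giving 11/17, roughly 0.65. Unlike the count-based measures, this one is sensitive to the magnitude of the values, so a single large drop can dominate the score.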

