Finding median of list in Python

Posted 2019-01-03 05:01

How do you find the median of a list in Python? The list can be of any size and the numbers are not guaranteed to be in any particular order.

If the list contains an even number of elements, the function should return the average of the middle two.

Here are some examples (sorted for display purposes):

median([1]) == 1
median([1, 1]) == 1
median([1, 1, 2, 4]) == 1.5
median([0, 2, 5, 6, 8, 9, 9]) == 6
median([0, 0, 0, 0, 4, 4, 6, 8]) == 2
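The examples above can double as a quick acceptance check for any candidate implementation. The following is only a sketch: `EXAMPLES` and `check_median` are hypothetical names introduced here, and the standard library's `statistics.median` (Python 3.4+) is used purely as a stand-in implementation to run the check against.

```python
import random
import statistics

# The examples from the question as (input, expected median) pairs.
EXAMPLES = [
    ([1], 1),
    ([1, 1], 1),
    ([1, 1, 2, 4], 1.5),
    ([0, 2, 5, 6, 8, 9, 9], 6),
    ([0, 0, 0, 0, 4, 4, 6, 8], 2),
]


def check_median(median_func, seed=0):
    """Return True if median_func matches every example, in any input order."""
    rng = random.Random(seed)
    for values, expected in EXAMPLES:
        shuffled = list(values)
        rng.shuffle(shuffled)  # the question says the input order is arbitrary
        if median_func(shuffled) != expected:
            return False
    return True


print(check_median(statistics.median))  # True
```

Shuffling each input before the call guards against an implementation that only works on pre-sorted data.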

18 answers
爱情/是我丢掉的垃圾
#2 · 2019-01-03 05:41

I posted my solution at "Python implementation of 'median of medians' algorithm", which is a little faster than using sort(). My solution uses 15 numbers per column, for a cost of roughly 5N, which is faster than the roughly 10N of using 5 numbers per column. I believe the optimal cost is about 4N, but I could be wrong about that.

Per Tom's request in his comment, I have added my code here for reference. I believe the critical part for speed is using 15 numbers per column instead of 5. Note that the function returns the i-th smallest item, so for an even-length list the call at the end yields the lower of the two middle values, not their average.

#!/bin/pypy
#
# TH @stackoverflow, 2016-01-20, linear time "median of medians" algorithm
#
import sys, random


items_per_column = 15


def find_i_th_smallest(A, i):
    t = len(A)
    if t <= items_per_column:
        # If A is a small list with at most items_per_column items, then:
        #
        # 1. sort A
        # 2. return the i-th smallest item of A (0-based)
        #
        return sorted(A)[i]
    else:
        # 1. partition A into columns of items_per_column items each
        #    (an odd number, say 5).
        # 2. find the median of every column
        # 3. put all medians in a new list, say, B
        #
        # Integer division (//) keeps each index an int on Python 3 as well.
        columns = [A[j:(j + items_per_column)] for j in range(0, len(A), items_per_column)]
        B = [find_i_th_smallest(k, (len(k) - 1) // 2) for k in columns]

        # 4. find M, the median of B
        #
        M = find_i_th_smallest(B, (len(B) - 1) // 2)

        # 5. split A into 3 parts by M, { < M }, { == M }, and { > M }
        # 6. find which above set has A's i-th smallest, recursively.
        #
        P1 = [j for j in A if j < M]
        if i < len(P1):
            return find_i_th_smallest(P1, i)
        P3 = [j for j in A if j > M]
        L3 = len(P3)
        if i < (t - L3):
            return M
        return find_i_th_smallest(P3, i - (t - L3))


# How many numbers should be randomly generated for testing?
#
number_of_numbers = int(sys.argv[1])


# create a list of random positive integers
#
L = [ random.randint(0, number_of_numbers) for i in range(0, number_of_numbers) ]


# Show the original list
#
# print(L)


# This is for validation
#
# print(sorted(L)[(len(L) - 1) // 2])


# This is the result of the "median of medians" function.
# Its result should be the same as the above.
#
print(find_i_th_smallest(L, (len(L) - 1) // 2))
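Since a selection routine like the one above returns a single order statistic, matching the question's requirement for even-length lists takes two selections and an average. Here is a minimal sketch of that idea. To stay self-contained it uses a plain randomized quickselect as a stand-in for the median-of-medians function; `select` and `median_by_selection` are hypothetical names, not part of the answer's code.

```python
import random


def select(items, i):
    """Return the i-th smallest item (0-based) using randomized quickselect."""
    pivot = random.choice(items)
    smaller = [x for x in items if x < pivot]
    larger = [x for x in items if x > pivot]
    n_equal = len(items) - len(smaller) - len(larger)
    if i < len(smaller):
        return select(smaller, i)
    if i < len(smaller) + n_equal:
        return pivot
    return select(larger, i - len(smaller) - n_equal)


def median_by_selection(items):
    """Median that averages the two middle items for even-length input."""
    n = len(items)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    lower = select(items, (n - 1) // 2)
    if n % 2:
        return lower
    upper = select(items, n // 2)
    return (lower + upper) / 2.0


print(median_by_selection([4, 1, 2, 1]))  # 1.5
```

Swapping `select` for a deterministic linear-time selection would keep the wrapper unchanged, since it only relies on "give me the i-th smallest item".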
劫难
#3 · 2019-01-03 05:42
def median(array):
    """Calculate median of the given list.
    """
    # TODO: use statistics.median in Python 3
    array = sorted(array)
    half, odd = divmod(len(array), 2)
    if odd:
        return array[half]
    return (array[half - 1] + array[half]) / 2.0
萌系小妹纸
#4 · 2019-01-03 05:43

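It is very simple: sort the list and look at the midpoint.

def median(alist):
    ordered = sorted(alist)
    first = 0
    last = len(ordered) - 1
    midpoint = (first + last) // 2
    if len(ordered) % 2:
        return ordered[midpoint]
    # Even length: average the two middle items.
    return (ordered[midpoint] + ordered[midpoint + 1]) / 2.0

You can use the return value like this: result = median(anyList). Avoid assigning it to a variable named median, since that would shadow the function.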

混吃等死
#5 · 2019-01-03 05:47

You can use list.sort to sort the list in place and avoid creating a new list with sorted. Be aware that this reorders the list the caller passed in.

Also, you should not use list as a variable name, because it shadows Python's built-in list.

def median(l):
    half = len(l) // 2
    l.sort()
    if not len(l) % 2:
        return (l[half - 1] + l[half]) / 2.0
    return l[half]
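Whether the in-place sort is acceptable depends on whether the caller still needs the original order. The sketch below contrasts the two approaches; `median_in_place` and `median_copy` are hypothetical names used only for this illustration.

```python
def median_in_place(values):
    """Median via list.sort(); reorders the caller's list as a side effect."""
    values.sort()
    half = len(values) // 2
    if len(values) % 2 == 0:
        return (values[half - 1] + values[half]) / 2.0
    return values[half]


def median_copy(values):
    """Median via sorted(); leaves the caller's list untouched."""
    ordered = sorted(values)
    half = len(ordered) // 2
    if len(ordered) % 2 == 0:
        return (ordered[half - 1] + ordered[half]) / 2.0
    return ordered[half]


data = [5, 1, 4]
print(median_copy(data), data)      # 4 [5, 1, 4]
print(median_in_place(data), data)  # 4 [1, 4, 5]
```

The in-place version saves one list allocation, at the cost of a visible side effect on the argument.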
姐就是有狂的资本
#6 · 2019-01-03 05:50
def midme(list1):
    list1.sort()
    mid = len(list1) // 2
    if len(list1) % 2 > 0:
        x = list1[mid]
    else:
        # Divide by 2.0 so that Python 2 does not truncate the average.
        x = (list1[mid - 1] + list1[mid]) / 2.0
    return x


midme([4,5,1,7,2])
太酷不给撩
#7 · 2019-01-03 05:52

For Python versions before 3.4 (which lack the statistics module):

Use numpy.median() to make a one-line function:

>>> from numpy import median
>>> median([1, -4, -1, -1, 1, -3])
-1.0

Or, to write a function:

def median(lst):
    n = len(lst)
    if n < 1:
        return None
    ordered = sorted(lst)
    if n % 2 == 1:
        return ordered[n//2]
    else:
        return sum(ordered[n//2-1:n//2+1])/2.0

>>> median([-5, -5, -3, -4, 0, -1])
-3.5

For Python 3.4+, use statistics.median:

>>> from statistics import median
>>> median([5, 2, 3, 8, 9, -2])
4.0
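The statistics module also provides median_low and median_high, which return an actual element of the data instead of averaging when the length is even. A short sketch with made-up sample data:

```python
from statistics import median, median_low, median_high

data = [7, 1, 5, 3]  # sorted: [1, 3, 5, 7], so the middle pair is 3 and 5

print(median(data))       # 4.0 (average of the two middle values)
print(median_low(data))   # 3   (lower of the two middle values)
print(median_high(data))  # 5   (higher of the two middle values)
```

For an odd-length list all three functions return the same middle element.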