I have a list of names, e.g. ['Agrajag', 'Colin', 'Deep Thought', ... , 'Zaphod Beeblebrox', 'Zarquon']
. Now I want to partition this list into approximately equally sized sublists, so that the boundaries of the subgroups are at the first letter of the names, e.g A-F, G-L, M-P, Q-Z, not A-Fe, Fi-Mo, Mu-Pra, Pre-Z.
I could only come up with a statically sized parition that doesn't take size of the subgroups into account:
import string, itertools
def _group_by_alphabet_key(elem):
char = elem[0].upper()
i = string.ascii_uppercase.index(char)
if i > 19:
to_c = string.ascii_uppercase[-1];
from_c = string.ascii_uppercase[20]
else:
from_c = string.ascii_uppercase[i/5*5]
to_c = string.ascii_uppercase[i/5*5 + 4]
return "%s - %s" % (from_c, to_c)
subgroups = itertools.groupby(name_list, _group_by_alphabet_key)
Any better ideas?
P.S.: this may sound somewhat like homework, but it actually is for a webpage where members should be displayed in 5-10 tabs of equally sized groups.
Here's something that might work. I feel certain there's a simpler way though... probably involving
itertools
. Note thatnum_pages
only roughly determines how many pages you'll actually get.EDIT: Whoops! There was a bug -- it was cutting off the last group! The below should be fixed, but note that the length of the last page will be slightly unpredictable. Also, I added
.upper()
to account for possible lowercase names.EDIT2: The previous method of defining letter_groups was inefficient; the below dict-based code is more scalable:
Since your
name_list
has to be sorted forgroupby
to work, can't you just check every Nth value and build your divisions that way?And using
"A"
as your leftmost endpoint and"Z"
as your rightmost endpoint, you can construct the N divisions accordingly and they should all have the same size.right_endpoints[0]
.right_endpoints[0]
, the next right endpoint would beright_endpoints[1]
.The issue you may run into is what if two of these
right_endpoints
are the same...edit: example