I have four values
age = 23
gender = "M"
city ="Delhi"
religion = "Muslim"
I need these arranged by every combination with empty values like -
23 * * *
23 M * *
23 M Delhi *
23 M Delhi Muslim
* M * *
* M Delhi *
* M Delhi Muslim
* * Delhi *
* * Delhi Muslim
* * * Muslim
* * * *
I need this arranged by the number of dimensions in ascending order in a list. So combinations with one value should be on top. I have some 30+ attributes, so i need an automated way to do this in Python
Any ideas ?
NPE's answer solves the problem by constructing the full list of subsets in memory and then sorting it. This requires O(2n) space and O(n2 2n) time. If this is unacceptable, then here's a way to generate the subsets in O(n) space and O(n 2n) time.
from itertools import combinations
def subsets(s, placeholder = None):
"""
Generate the subsets of `s` in order of size.
Use `placeholder` for missing elements (default: None).
"""
s = list(s)
n = len(s)
r = range(n)
for i in range(n + 1):
for c in combinations(r, i):
result = [placeholder] * n
for j in c:
result[j] = s[j]
yield result
>>> from pprint import pprint
>>> pprint(list(subsets([23, 'M', 'Delhi', 'Muslim'])))
[[None, None, None, None],
[23, None, None, None],
[None, 'M', None, None],
[None, None, 'Delhi', None],
[None, None, None, 'Muslim'],
[23, 'M', None, None],
[23, None, 'Delhi', None],
[23, None, None, 'Muslim'],
[None, 'M', 'Delhi', None],
[None, 'M', None, 'Muslim'],
[None, None, 'Delhi', 'Muslim'],
[23, 'M', 'Delhi', None],
[23, 'M', None, 'Muslim'],
[23, None, 'Delhi', 'Muslim'],
[None, 'M', 'Delhi', 'Muslim'],
[23, 'M', 'Delhi', 'Muslim']]
How about the following:
In [21]: attrib = (23, "M", "Delhi", "Muslim")
In [25]: comb = list(itertools.product(*((a, None) for a in attrib)))
In [26]: comb
Out[26]:
[(23, 'M', 'Delhi', 'Muslim'),
(23, 'M', 'Delhi', None),
(23, 'M', None, 'Muslim'),
(23, 'M', None, None),
(23, None, 'Delhi', 'Muslim'),
(23, None, 'Delhi', None),
(23, None, None, 'Muslim'),
(23, None, None, None),
(None, 'M', 'Delhi', 'Muslim'),
(None, 'M', 'Delhi', None),
(None, 'M', None, 'Muslim'),
(None, 'M', None, None),
(None, None, 'Delhi', 'Muslim'),
(None, None, 'Delhi', None),
(None, None, None, 'Muslim'),
(None, None, None, None)]
Now, if I understood your sorting requirements correctly, the following should do it:
In [27]: sorted(comb, key=lambda x:sum(v is not None for v in x))
Out[27]:
[(None, None, None, None),
(23, None, None, None),
(None, 'M', None, None),
(None, None, 'Delhi', None),
(None, None, None, 'Muslim'),
(23, 'M', None, None),
(23, None, 'Delhi', None),
(23, None, None, 'Muslim'),
(None, 'M', 'Delhi', None),
(None, 'M', None, 'Muslim'),
(None, None, 'Delhi', 'Muslim'),
(23, 'M', 'Delhi', None),
(23, 'M', None, 'Muslim'),
(23, None, 'Delhi', 'Muslim'),
(None, 'M', 'Delhi', 'Muslim'),
(23, 'M', 'Delhi', 'Muslim')]
I've used None
where you used *
, but it's trivial to use the latter.
Of course with 30 attributes you're looking at ~1 billion combinations, so the flattening of the list with subsequent sorting probably won't work. However, what can you usefully do with 1 billion entries anyway?