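I'm trying to figure out whether my problem is solvable using the builtin sorted() function or whether I need to do it myself. Doing it old school with cmp would have been relatively easy.
My data set looks like: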
x = [
    ('business', Set('fleet','address'))
    ('device', Set('business','model','status','pack'))
    ('txn', Set('device','business','operator'))
    ....
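The sorting rule should basically be: for all values of N and Y where Y > N, x[N][0] is not in x[Y][1].
Although I'm using Python 2.6, where the cmp argument is still available, I'm trying to make this Python 3 safe.
So, can this be done using some lambda magic and the key argument?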
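-== UPDATE ==-
Thanks Eli and Winston! I really didn't think using key would work, and if it could, I suspected it would be a shoehorned solution, which isn't ideal.
Because my problem concerns database table dependencies, I had to make a minor addition to Eli's code to remove an item from its own dependency list (in a well-designed database this wouldn't happen, but who lives in that magical perfect world?).
My solution: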
def topological_sort(source):
    """perform topo sort on elements.
    :arg source: list of ``(name, set(names of dependencies))`` pairs
    :returns: list of names, with dependencies listed first
    """
    pending = [(name, set(deps)) for name, deps in source]
    emitted = []
    while pending:
        next_pending = []
        next_emitted = []
        for entry in pending:
            name, deps = entry
            deps.difference_update(set((name,)), emitted)  # <-- pop self from dep, req Py2.6
            if deps:
                next_pending.append(entry)
            else:
                yield name
                emitted.append(name)  # <-- not required, but preserves original order
                next_emitted.append(name)
        if not next_emitted:
            raise ValueError("cyclic dependency detected: %s %r" % (name, (next_pending,)))
        pending = next_pending
        emitted = next_emitted
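What you want is called a topological sort. While it's possible to implement one using the builtin sort(), it's rather awkward, and it's better to implement a topological sort directly in Python.
Why is it going to be awkward? If you study the two algorithms on the wiki page, they both rely on a running set of "marked nodes", a concept that's hard to contort into a form sort() can use, since key=xxx (or even cmp=xxx) works best with stateless comparison functions, particularly because timsort doesn't guarantee the order in which the elements will be examined. I'm (pretty) sure that any solution which does use sort() is going to end up redundantly calculating some information on each call to the key/cmp function, in order to get around the statelessness issue.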
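To make the "marked nodes" idea concrete, here is a minimal sketch of a depth-first topological sort. It is not the code from this answer; the function name `dfs_topological_sort` and the dict-based input format are assumptions chosen for illustration. The point is that the `marked` set is state that persists across the whole traversal, which is exactly what a stateless key or cmp function cannot carry.

```python
def dfs_topological_sort(graph):
    """Depth-first topological sort (illustrative sketch).

    `graph` maps each name to an iterable of the names it depends on.
    Names that appear only as dependencies are treated as having none.
    Returns a list with dependencies placed before their dependents.
    """
    marked = set()        # nodes already emitted: the running state
    in_progress = set()   # nodes on the current DFS path, for cycle detection
    order = []

    def visit(node):
        if node in marked:
            return
        if node in in_progress:
            raise ValueError("cyclic dependency involving %r" % (node,))
        in_progress.add(node)
        for dep in graph.get(node, ()):
            visit(dep)
        in_progress.discard(node)
        marked.add(node)
        order.append(node)  # emitted only after all of its dependencies

    for node in graph:
        visit(node)
    return order


# Hypothetical input, based on the data in the question.
deps = {
    'business': ['fleet', 'address'],
    'device': ['business', 'model', 'status', 'pack'],
    'txn': ['device', 'business', 'operator'],
}
result = dfs_topological_sort(deps)
```

Every lookup in `marked` depends on what was visited earlier, so the position of an element cannot be computed from that element alone.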
The following is the algorithm I've been using (to sort some JavaScript library dependencies):
Edit: reworked this greatly, based on Winston Ewert's solution.
def topological_sort(source):
    """perform topo sort on elements.
    :arg source: list of ``(name, [list of dependencies])`` pairs
    :returns: list of names, with dependencies listed first
    """
    pending = [(name, set(deps)) for name, deps in source]  # copy deps so we can modify set in-place
    emitted = []
    while pending:
        next_pending = []
        next_emitted = []
        for entry in pending:
            name, deps = entry
            deps.difference_update(emitted)  # remove deps we emitted last pass
            if deps:  # still has deps? recheck during next pass
                next_pending.append(entry)
            else:  # no more deps? time to emit
                yield name
                emitted.append(name)  # <-- not required, but helps preserve original ordering
                next_emitted.append(name)  # remember what we emitted for difference_update() in next pass
        if not next_emitted:  # all entries have unmet deps, one of two things is wrong...
            raise ValueError("cyclic or missing dependency detected: %r" % (next_pending,))
        pending = next_pending
        emitted = next_emitted
Side note: it is possible to shoehorn a cmp() function into key=xxx, as outlined in this Python bug tracker message.
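As an illustration of the cmp-to-key idea, the standard library provides `functools.cmp_to_key` (Python 2.7 and 3.2+, so not the 2.6 mentioned in the question), which wraps an old-style comparison function for use with key=. The sketch below uses a made-up comparison, `compare_by_length`, purely as an example. It only helps when the comparison is a consistent total ordering, which a dependency relation generally is not.

```python
from functools import cmp_to_key  # available in Python 2.7 and 3.2+


def compare_by_length(a, b):
    """Old-style cmp function: returns a negative, zero, or positive int."""
    return (len(a) > len(b)) - (len(a) < len(b))


names = ['device', 'txn', 'business']
# cmp_to_key wraps the cmp function in a key class whose comparison
# operators delegate to compare_by_length.
by_length = sorted(names, key=cmp_to_key(compare_by_length))
```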
I do a topological sort something like this:
class TopologicalSortFailure(Exception):
    """Not defined in the original snippet; a plain Exception subclass is assumed."""

def topological_sort(items):
    provided = set()
    while items:
        remaining_items = []
        emitted = False
        for item, dependencies in items:
            if dependencies.issubset(provided):
                yield item
                provided.add(item)
                emitted = True
            else:
                remaining_items.append((item, dependencies))
        if not emitted:
            raise TopologicalSortFailure()
        items = remaining_items
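I think it's a little more straightforward than Eli's version; I don't know about efficiency.
Overlooking the bad formatting and this strange Set type... (I've kept them as tuples and delimited the list items correctly...) ... and using the networkx library to make things convenient...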
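import networkx as nx

x = [
    ('business', ('fleet','address')),
    ('device', ('business','model','status','pack')),
    ('txn', ('device','business','operator'))
]
g = nx.DiGraph()
for key, vals in x:
    for val in vals:
        g.add_edge(key, val)  # edge from an item to each thing it depends on
# With edges pointing item -> dependency, each item is listed before its
# dependencies; reverse the edges (or the result) to list dependencies first.
# list() keeps this working whether topological_sort returns a list or a generator.
print(list(nx.topological_sort(g)))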
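This is Winston's suggestion, with a docstring and a tiny tweak: replacing dependencies.issubset(provided) with provided.issuperset(dependencies). That change lets you pass the dependencies in each input pair as an arbitrary iterable rather than necessarily a set.
My use case involves a dict whose keys are the item strings, with the value for each key being a list of the names of the items on which that key depends. Once I've established that the dict is non-empty, I can pass its iteritems() to the modified algorithm.
Thanks again to Winston.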
class TopologicalSortFailure(Exception):
    """Not defined in the original snippet; a plain Exception subclass is assumed."""

def topological_sort(items):
    """
    'items' is an iterable of (item, dependencies) pairs, where 'dependencies'
    is an iterable of values of the same type as 'item'.
    If 'items' is a generator rather than a data structure, it should not be
    empty. Passing an empty generator for 'items' (zero yields before return)
    will cause topological_sort() to raise TopologicalSortFailure.
    An empty iterable (e.g. list, tuple, set, ...) produces no items but
    raises no exception.
    """
    provided = set()
    while items:
        remaining_items = []
        emitted = False
        for item, dependencies in items:
            if provided.issuperset(dependencies):
                yield item
                provided.add(item)
                emitted = True
            else:
                remaining_items.append((item, dependencies))
        if not emitted:
            raise TopologicalSortFailure()
        items = remaining_items
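For the dict use case described above, a usage sketch might look like the following. The `depends_on` data is hypothetical, and the function is repeated only so the example runs on its own; `TopologicalSortFailure` is an assumed plain Exception subclass. In Python 3, items() plays the role of iteritems(). Note that with this algorithm every dependency must itself appear as an item, otherwise it is never "provided" and the sort fails.

```python
class TopologicalSortFailure(Exception):
    """Assumed definition; the answer does not show this class."""


def topological_sort(items):
    # Same algorithm as in the answer, repeated so this example is self-contained.
    provided = set()
    while items:
        remaining_items = []
        emitted = False
        for item, dependencies in items:
            if provided.issuperset(dependencies):
                yield item
                provided.add(item)
                emitted = True
            else:
                remaining_items.append((item, dependencies))
        if not emitted:
            raise TopologicalSortFailure()
        items = remaining_items


# Hypothetical data: each key maps to the list of names it depends on,
# and every dependency is also a key.
depends_on = {
    'business': [],
    'device': ['business'],
    'txn': ['device', 'business'],
}

if depends_on:  # the non-empty check mentioned in the answer
    order = list(topological_sort(depends_on.items()))
else:
    order = []
```

Because the dependencies are plain lists here, this relies on the issuperset() form; the issubset() form would need each value to be a set.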