Confidence calculation in association rule [closed]

Posted 2019-09-30 08:34

Question:

supportData = {('ELF'): 0.75, ('CAT'): 0.75, ('BAT', 'CAT', 'ELF'): 0.5, ('ARK', 'BAT'): 0.25, ('ARK', 'ELF'): 0.25, ('CAT', 'ELF'): 0.5, ('DOG'): 0.25, ('BAT', 'CAT'): 0.5, ('BAT', 'ELF'): 0.75, ('ARK'): 0.5, ('ARK', 'CAT'): 0.5, ('BAT'): 0.75}

L = [('ARK'), ('CAT'), ('CAT'), ('ELF'),('ARK', 'CAT'), ('BAT', 'ELF'), ('BAT', 'CAT'), ('CAT', 'ELF'),('BAT', 'CAT', 'ELF')]
for freqSet in L:
    H = list(freqSet)
    if len(H) == 1:
        pass
    else:
        for conseq in H:
            freqsetlist = list(freqSet)
            freqsetlist.remove(conseq)
            if len(freqsetlist) == 1:
                conf = supportData[freqSet]/supportData[tuple(freqsetlist)[0]]
                if conf >= 0.1:
                    print freqsetlist,'-->',conseq,'conf:',conf
            else:
                conf = supportData[freqSet]/supportData[tuple(freqsetlist)[:]]
                if conf >= 0.1:
                    print freqsetlist,'-->',conseq,'conf:',conf

Output

KeyError: ('R','K')

Can someone point out why I am getting this error? It seems to occur when len(freqsetlist) is greater than 1, that is, when computing the confidence for the tuple with 3 elements.

Answer 1:

That is the representation of the object. If you want a different representation, you will have to construct it yourself:

>>> k = ['van']
>>> "({})".format(", ".join(k))
'(van)'

Note that this implies you are using Python's representation of an object as part of your program. This is a bad idea: construct the output you need explicitly rather than relying on Python's representation, which is intended for debugging.

Edit: The comma is Python's way of showing that it's a tuple, because parentheses on their own signify grouping of operations rather than a tuple. You could make your own tuple subclass and change the __repr__()/__str__() if you really wanted to, but that would be pointless (and unpythonic in the case of __repr__(), whose result should ideally evaluate back to the object).
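To make the point about the comma concrete, here is a small sketch. The names not_a_tuple, one_tuple, chars and leftover_key are chosen only for illustration. It also seems to account for the KeyError: ('R','K') in the question: ('ARK') is a plain string, so list() splits it into characters instead of items.

```python
# Parentheses alone only group an expression; the trailing comma makes the tuple.
not_a_tuple = ('ARK')    # just the string 'ARK'
one_tuple = ('ARK',)     # a tuple holding one string

print(type(not_a_tuple).__name__)
print(type(one_tuple).__name__)

# list() on a string yields its characters, which is what the question's
# loop ends up iterating over for the one-item "sets".
chars = list(not_a_tuple)      # ['A', 'R', 'K']
chars.remove('A')
leftover_key = tuple(chars)    # the key the question's code then looks up
print(leftover_key)
```

Writing the one-item sets as ('ARK',), ('CAT',) and so on, both in L and as dictionary keys, would keep every itemset a real tuple.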



Answer 2:

supportData = {('nas','fat'): 0.5, ('nas'): 1.0, ('fat'): 0.6, ('van'): 0.72, ('jos'): 0.55, ('van','jos'): 0.10}

itemSets = [('nas','fat'), ('van','jos')]

for freqSet in itemSets:
    # each item is already a string, so the join leaves it unchanged
    H = [''.join(list(item)) for item in freqSet]
    for conseq in H:
        freqsetlist = list(freqSet)
        freqsetlist.remove(conseq)
        # one item remains; single-item keys such as ('nas') are plain strings
        conf = supportData[freqSet]/supportData[tuple(freqsetlist)[0]]
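The snippet above only handles two-item sets. As a rough sketch of a more general variant, the version below stores every itemset, including the one-item sets, as a real tuple, so no special case is needed for the antecedent. The function name rule_confidences, the min_conf parameter and the snake_case variable names are hypothetical and not from the original posts. It assumes the dictionary keys list their items in the same order as the itemsets they are derived from, and it uses the 0.1 threshold from the question.

```python
# Every key is a real tuple; note the trailing commas on the one-item sets.
support_data = {
    ('nas', 'fat'): 0.5, ('nas',): 1.0, ('fat',): 0.6,
    ('van',): 0.72, ('jos',): 0.55, ('van', 'jos'): 0.10,
}
item_sets = [('nas', 'fat'), ('van', 'jos')]


def rule_confidences(item_sets, support_data, min_conf=0.1):
    """Return (antecedent, consequent, confidence) for single-item consequents."""
    rules = []
    for freq_set in item_sets:
        if len(freq_set) < 2:
            continue  # a one-item set cannot be split into a rule
        for conseq in freq_set:
            # Drop the consequent; the remaining items keep their order,
            # so the result matches a key in support_data.
            antecedent = tuple(item for item in freq_set if item != conseq)
            conf = support_data[freq_set] / support_data[antecedent]
            if conf >= min_conf:
                rules.append((antecedent, conseq, conf))
    return rules


rules = rule_confidences(item_sets, support_data)
for antecedent, conseq, conf in rules:
    print(antecedent, '-->', conseq, 'conf:', conf)
```

Because the antecedent is always built as a tuple, the same lookup works whether one item or several remain after removing the consequent.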