Plot actual set items in Python, not the number of items

Posted 2020-04-21 05:21

I wrote this small function:

import random

def sets():
    # Draw two independent samples of 10 distinct integers from 1..49.
    set1 = random.sample(range(1, 50), 10)
    set2 = random.sample(range(1, 50), 10)
    return set1, set2

sets()

The output of this function looks like this:

([24, 29, 43, 42, 45, 28, 26, 3, 8, 21],
 [22, 37, 38, 44, 25, 42, 29, 7, 35, 9])

I want to plot this in a two-way Venn diagram. I know how to plot the number of overlapping items between the sets with matplotlib (i.e. using this exact code); however, I want to show the actual values in the plot instead.

That is, the overlap between the two should read 29, 42, as these are the two items in common, and not the number 2, which only says how many items overlap.

Would someone know how to do this?

2 Answers
爷的心禁止访问
Reply #2 · 2020-04-21 05:28

One possible solution is to replace each region's label, which shows the set size by default, with the values themselves. With the matplotlib_venn package, you can do something like this:

import matplotlib.pyplot as plt
from matplotlib_venn import venn2
import random

set1 = set(random.sample(range(1, 50), 10))
set2 = set(random.sample(range(1, 50), 10))
venn = venn2([set1,set2], ('Group A', 'Group B'))

# Replace the default size labels with the members of each region, one per line.
venn.get_label_by_id('100').set_text('\n'.join(map(str,set1-set2)))
venn.get_label_by_id('110').set_text('\n'.join(map(str,set1&set2)))
venn.get_label_by_id('010').set_text('\n'.join(map(str,set2-set1)))
plt.axis('on')
plt.show()

We access the labels by a binary ID that denotes the region: '100' is the part only in the first set, '010' the part only in the second set, and '110' the overlap.
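As a small refinement of the code above, here is a sketch that builds the label text separately from the plotting. Sets have no defined order, so the values may come out unsorted, and one value per line can overflow a small region. The helper name region_labels and its per_line parameter are my own choices for illustration, not part of matplotlib_venn.

```python
def region_labels(set1, set2, per_line=3):
    """Return {region_id: label_text} for a two-set Venn diagram.

    Hypothetical helper, not part of matplotlib_venn. The IDs follow the
    answer above: '100' only in set1, '010' only in set2, '110' in both.
    """
    regions = {
        '100': set1 - set2,
        '110': set1 & set2,
        '010': set2 - set1,
    }
    labels = {}
    for region_id, values in regions.items():
        ordered = [str(v) for v in sorted(values)]
        # Group the sorted values into rows of `per_line` items each.
        rows = [', '.join(ordered[i:i + per_line])
                for i in range(0, len(ordered), per_line)]
        labels[region_id] = '\n'.join(rows)
    return labels


# The two lists from the question, as sets.
set1 = {24, 29, 43, 42, 45, 28, 26, 3, 8, 21}
set2 = {22, 37, 38, 44, 25, 42, 29, 7, 35, 9}
labels = region_labels(set1, set2)
print(labels['110'])  # 29, 42
```

Each text can then be applied with venn.get_label_by_id(region_id).set_text(text). As far as I know, get_label_by_id returns None for a region that was not drawn (for example when the two samples share no values), so it is probably worth checking for None before calling set_text.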

chillily
Reply #3 · 2020-04-21 05:28

The default behaviour of the venn2 function is to print the size of each region, including the overlap of the two sets. Here's the line of the source code where those sizes are added to the Venn diagram plot: https://github.com/konstantint/matplotlib-venn/blob/master/matplotlib_venn/_venn2.py#L247

To print the values themselves, you would have to change the compute_venn2_subsets(a,b) function in that file. Replace its return value with:

([val for val in a if val not in b], [val for val in a if val in b], [val for val in b if val not in a])

instead of the set sizes that it currently returns. If you only want to print the overlapping values, make compute_venn2_subsets(a,b) return

("", [val for val in a if val in b], "")
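To see what that replacement return value would contain without editing the library, the same comprehensions can be run as a standalone function. This is only a sketch: subsets_as_lists is a hypothetical name, and it does not show how the rest of the library reacts to receiving lists instead of sizes.

```python
def subsets_as_lists(a, b):
    """Return (only in a, in both, only in b), keeping the input order.

    Hypothetical stand-in for the modified return value described above.
    """
    return ([val for val in a if val not in b],
            [val for val in a if val in b],
            [val for val in b if val not in a])


# The two lists from the question.
a = [24, 29, 43, 42, 45, 28, 26, 3, 8, 21]
b = [22, 37, 38, 44, 25, 42, 29, 7, 35, 9]
only_a, both, only_b = subsets_as_lists(a, b)
print(both)  # [29, 42]
```

Unlike set operations, the comprehensions keep the order in which the values appear in the input lists.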