
Question:

I am trying to create a word cloud in Python after cleaning a text file.

I get the required results, i.e. the words used most often in the text file, but I am unable to plot them.

My code:

import collections
from wordcloud import WordCloud
import matplotlib.pyplot as plt

file = open('example.txt', encoding = 'utf8' )
stopwords = set(line.strip() for line in open('stopwords'))
wordcount = {}

for word in file.read().split():
    word = word.lower()
    word = word.replace(".","")
    word = word.replace(",","")
    word = word.replace("\"","")
    word = word.replace("“","")
    if word not in stopwords:
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1

d = collections.Counter(wordcount)
for word, count in d.most_common(10):
    print(word , ":", count)

#wordcloud = WordCloud().generate(text)
#fig = plt.figure()
#fig.set_figwidth(14)
#fig.set_figheight(18)

#plt.imshow(wordcloud.recolor(color_func=grey_color, random_state=3))
#plt.title(title, color=fontcolor, size=30, y=1.01)
#plt.annotate(footer, xy=(0, -.025), xycoords='axes fraction', fontsize=infosize, color=fontcolor)
#plt.axis('off')
#plt.show()

Edit: I tried to plot the word cloud with the following code:

wordcloud = WordCloud(background_color='white',
                          width=1200,
                          height=1000
                         ).generate((d.most_common(10)))


plt.imshow(wordcloud)
plt.axis('off')
plt.show()

But I get TypeError: expected string or buffer.

When I tried the above code with .generate(str(d.most_common(10))) instead, the word cloud was drawn, but it shows an apostrophe (') after several words.

I am using Jupyter Notebook with Python 3 (IPython).

Answer 1:

First, download the Symbola.ttf font file into the same folder as the following script.

File layout:

file.txt Symbola.ttf my_word_cloud.py

file.txt:

foo buzz bizz foo buzz bizz foo buzz bizz foo buzz bizz foo buzz bizz
foo foo foo foo foo foo foo foo foo foo bizz bizz bizz bizz foo foo

my_word_cloud.py:

import io
from collections import Counter
from os import path

import matplotlib.pyplot as plt
from wordcloud import WordCloud

d = path.dirname(__file__)

# It is important to pass the encoding so that io.open loads the file as UTF-8
text = io.open(path.join(d, 'file.txt'), encoding='utf-8').read()

words = text.split()
print(Counter(words))

# Generate a word cloud image
# The Symbola font includes most emoji
font_path = path.join(d, 'Symbola.ttf')
word_cloud = WordCloud(font_path=font_path).generate(text)

# Display the generated image:
plt.imshow(word_cloud)
plt.axis("off")
plt.show()

Result:

Counter({'foo': 17, 'bizz': 9, 'buzz': 5})

I created this simple example for you; many more examples are available here:

https://github.com/amueller/word_cloud/tree/master/examples
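
The script above counts raw tokens, while the question also strips punctuation and filters stop words before counting. A minimal sketch of that step using only the standard library; count_words is a hypothetical helper name, not part of wordcloud, and the set of quote characters to strip is an assumption based on the question's replace() calls:

```python
import string
from collections import Counter


def count_words(text, stopwords=frozenset()):
    """Lower-case the text, trim punctuation from each token, and count non-stop words."""
    # ASCII punctuation plus the curly quotes the question removes by hand.
    strip_chars = string.punctuation + "“”‘’"
    # strip() only trims the ends of a token, so "don't" keeps its apostrophe.
    words = (token.strip(strip_chars) for token in text.lower().split())
    return Counter(word for word in words if word and word not in stopwords)
```

Because the result is a Counter, most_common(10) works on it directly, so the manual dictionary bookkeeping from the question is not needed.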

Answer 2:

most_common(x) is not a method of WordCloud. However, you can pass the max_words parameter when creating the WordCloud, and this should do what you are attempting.
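
For context, generate() expects a single string, which is why passing the list returned by most_common() raises the TypeError, and why wrapping that list in str() leaks the quote characters around each word into the cloud. A minimal sketch of one way around both problems, assuming the wordcloud package is installed: top_frequencies and build_cloud are hypothetical helper names, while generate_from_frequencies is the WordCloud method that accepts a word-to-count mapping.

```python
from collections import Counter


def top_frequencies(counts, n=10):
    """Return the n most common words as a plain {word: count} dict."""
    return dict(Counter(counts).most_common(n))


def build_cloud(frequencies, max_words=10):
    """Build a word cloud from a {word: count} mapping."""
    # Imported here so that top_frequencies can be used without wordcloud installed.
    from wordcloud import WordCloud

    cloud = WordCloud(background_color="white", width=1200, height=1000,
                      max_words=max_words)
    # Takes the counts directly, so no string round-trip and no stray quotes.
    return cloud.generate_from_frequencies(frequencies)
```

With the Counter d from the question, plt.imshow(build_cloud(top_frequencies(d))) followed by plt.axis('off') and plt.show() would display the ten most frequent words.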
