Does anyone know where this ImportError comes from and how to fix it? I'm doing some text mining on a CSV file. At this point I'm simply trying to tokenize the words in the job descriptions in the file, vectorize them, and check the dimensions of the resulting matrix. However, importing CountVectorizer from scikit-learn fails with the error below. The original code follows the error message. I've tried uninstalling and reinstalling Anaconda along with all of the packages. The same code runs absolutely fine on my PC (an old Gateway) but does not run on my Mac (2012) running OS X Lion.

    ---------------------------------------------------------------------------
    ImportError                               Traceback (most recent call last)
    <ipython-input-49-7fcd55a48eba> in <module>()
    ----> 1 from sklearn.feature_extraction.text import CountVectorizer
          2 cv = CountVectorizer(lowercase=True)
          3 vector = cv.fit_transform(words).toarray()
          4 print vector.shape

    //anaconda/lib/python2.7/site-packages/sklearn/__init__.py in <module>()
         35     # process, as it may not be compiled yet
         36 else:
    ---> 37     from . import __check_build
         38     from .base import clone
         39     __check_build  # avoid flakes unused variable error

    ImportError: cannot import name __check_build

Original code:

    from nltk.tokenize import word_tokenize

    # create a list of words for all postings
    # (postList is built earlier from the CSV rows; p[2] is the job description text)
    words = []
    for p in postList[:100]:
        temp = word_tokenize(p[2])
        temp2 = [w.lower() for w in temp]
        # join the tokens into one comma-separated string per posting
        string = ', '.join(temp2)
        words.append(string)
    print words

    from sklearn.feature_extraction.text import CountVectorizer

    cv = CountVectorizer(lowercase=True)
    vector = cv.fit_transform(words).toarray()
    print vector.shape
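
In case it helps narrow things down, here is a minimal sketch of an environment check that could be run on both the PC and the Mac to compare them. It uses only the standard library and never imports sklearn, so it should run even while the import is broken. The helper names `find_package_dir` and `describe_environment` are made up for this illustration, and the idea that the contents of the `__check_build` folder are worth comparing is an assumption based on the traceback above, not a confirmed diagnosis.

```python
import os
import platform
import sys


def find_package_dir(name):
    """Return the directory of the first `name` package found on sys.path, or None.

    Only the filesystem is inspected, so the package itself is never imported.
    (Hypothetical helper, written for this illustration.)
    """
    for entry in sys.path:
        # An empty sys.path entry means the current working directory.
        candidate = os.path.join(entry or os.getcwd(), name)
        if os.path.isfile(os.path.join(candidate, "__init__.py")):
            return candidate
    return None


def describe_environment(package="sklearn"):
    """Collect interpreter details and the files in the package's __check_build folder."""
    pkg_dir = find_package_dir(package)
    build_files = []
    if pkg_dir is not None:
        build_dir = os.path.join(pkg_dir, "__check_build")
        if os.path.isdir(build_dir):
            build_files = sorted(os.listdir(build_dir))
    return {
        "executable": sys.executable,
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "package_dir": pkg_dir,
        "check_build_files": build_files,
    }


if __name__ == "__main__":
    # Print one "key: value" line per item so the two machines can be compared.
    for key, value in sorted(describe_environment().items()):
        print("{0}: {1}".format(key, value))
```

Comparing the `package_dir` and `check_build_files` lines from the two machines should at least show whether the same sklearn installation is being picked up and whether the folder named in the traceback contains the same files on the Mac.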