ValueError with spacy.load('en_core_web_sm')

Posted 2019-08-17 06:25

Question:

I'm getting ValueError: could not broadcast input array from shape (96) into shape (128) for spacy.load('en_core_web_sm')

I manually downloaded and installed the model, as I'm working on a work computer with download restrictions.

I have followed the instructions to download and copy from this link: https://github.com/explosion/spaCy/issues/3113

  1. Copy the folder Python35\lib\site-packages\en_core_web_sm. Create a folder named en in Python35\Lib\site-packages\spacy\data, paste the copied contents into en, and rename the pasted folder to en_core_web_sm-2.0.0.

  2. Copy the __init__.py file in en_core_web_sm and paste it in en (that is, the __init__.py file must be in both Python35\Lib\site-packages\spacy\data\en and Python35\Lib\site-packages\spacy\data\en\en_core_web_sm-2.0.0).

I am able to run spacy.load('en_core_web_sm'), but I am getting a ValueError instead of a loaded model. Appreciate all help. Thanks!

Answer 1:

I had the same error. Updating spaCy to version 2.1.3 fixed it, and it is now working properly.

If you are using Anaconda: conda install -c conda-forge spacy
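A shape mismatch like (96) vs (128) usually points to a model built for a different spaCy release than the one installed, which would explain why upgrading helped. Below is a minimal sketch for comparing the two versions. It assumes that both spaCy and the model were installed as packages (so version metadata exists; a manually copied folder may have none) and that spaCy 2.x models share their major.minor version with the compatible spaCy release. The helper names installed_version and same_minor are made up for illustration.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def same_minor(version_a, version_b):
    """True if both versions share major.minor, e.g. '2.1.3' and '2.1.0'."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]


spacy_version = installed_version("spacy")
model_version = installed_version("en_core_web_sm")
print("spaCy:", spacy_version, "| model:", model_version)

# Assumption: a differing major.minor means the model needs to be reinstalled
# for the current spaCy release (or spaCy changed to match the model).
if spacy_version and model_version and not same_minor(spacy_version, model_version):
    print("Version mismatch: reinstall the model matching this spaCy release.")
```

If the spaCy command line is available, python -m spacy validate should report the same kind of incompatibility.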



Answer 2:

In order to let you use the en_core_web_sm model via the shortcut link 'en', spaCy creates a symlink. This means you need to have permissions to do this. See here for more details: https://spacy.io/usage/models#usage-link

A note in case others come across this issue later: Copy-pasting the folder and renaming it is really only the last resort if you can't run the command with admin permissions and you need to be able to load the model via spacy.load('en'). This is usually not the case – you can just install the model and load it via its full name, spacy.load('en_core_web_sm'). In fact, I often prefer this syntax, since it's more explicit and you immediately know which model is loaded.

(Quoted from the GitHub issue linked in the question. No copyright violation intended.)
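To illustrate loading the model without the 'en' shortcut link or any copied folders: an installed spaCy model is an ordinary Python package that exposes a load() function, so it can be imported directly. The sketch below assumes en_core_web_sm was installed with pip (for example from a manually downloaded archive); the helper name load_from_package is only illustrative.

```python
import importlib


def load_from_package(package_name="en_core_web_sm"):
    """Import an installed model package and call its load() function."""
    module = importlib.import_module(package_name)
    return module.load()


if __name__ == "__main__":
    try:
        # Equivalent to: import en_core_web_sm; nlp = en_core_web_sm.load()
        nlp = load_from_package("en_core_web_sm")
        print("Loaded pipeline:", nlp.pipe_names)
    except ImportError:
        # The model package is not installed in this environment.
        print("en_core_web_sm is not installed as a package.")
```

Calling spacy.load('en_core_web_sm') with the full package name should behave the same way. Neither form needs a symlink, so admin permissions are not required.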