How to write code and run Python files using sp

Posted 2019-08-20 05:19

Question:

I want to implement a new language model for spaCy. I have installed spaCy on Windows by following the guide on the official website, but I don't understand where and how I should write and run my future files. Please help, thanks.

Answer 1:

I hope I understand your question correctly: If you only want to use spaCy, you can simply create a Python file, import spacy and run it.

However, if you want to add things to the spaCy source – for example to add new language data that doesn't yet exist – you need to compile spaCy from source. On Windows, this needs a little more preparation – but it's not that difficult:

  1. Install the Visual C++ Build Tools, which include the compiler you need.
  2. Fork and clone the spaCy repository on GitHub.
  3. Navigate to that directory and install spaCy's dependencies (other packages plus developer requirements like Cython) by running pip install -r requirements.txt.
  4. Then run python setup.py build_ext --inplace from the same directory. This will build and compile spaCy into the directory.
  5. Make sure your PYTHONPATH is set to the new spaCy directory. This is important so Python knows that you want to execute this exact version of spaCy, and not some other one you have installed somewhere else. On Windows, I normally use this command: set PYTHONPATH=C:\path\to\spacy\directory. There's also this thread with more info. (I'm no Windows expert, though – so if anyone reads this and disagrees, feel free to correct me here.)
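To confirm that step 5 worked, it can help to check which copy of spaCy Python would actually pick up. The following is only a small sketch using the standard library; the helper name locate_package is made up for this example and is not part of spaCy.

```python
import importlib.util
import os


def locate_package(name="spacy"):
    """Return the file Python would import `name` from, or None if it is not found.

    Hypothetical helper for illustration only; it is not part of spaCy.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Show the search-path override and the location Python resolves spaCy to.
    print("PYTHONPATH =", os.environ.get("PYTHONPATH", "(not set)"))
    print("spacy resolves to:", locate_package("spacy") or "not importable")
```

If the printed location is inside your cloned repository, Python is using the version you built. If it points into site-packages instead, PYTHONPATH is probably not set in the current terminal session.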

You can now edit the source, add files and run them. If you want to add a new language, I'd recommend starting by adding a new directory to spacy/lang and creating an __init__.py. You can find more info on how this should look in the usage guide on adding languages.
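As a rough illustration of that first step, the script below creates the directory and a bare-bones __init__.py. Treat it as a sketch under assumptions: the function name scaffold_language and the template are made up for this example, and the template only contains a minimal Language subclass, imported the way the bundled languages import it. A real language normally needs more (stop words, tokenizer exceptions and so on), so follow the usage guide for the details.

```python
from pathlib import Path

# Assumed minimal skeleton for spacy/lang/<code>/__init__.py.
# Real languages usually define more; see the usage guide on adding languages.
TEMPLATE = '''from ...language import Language


class {class_name}(Language):
    lang = "{code}"


__all__ = ["{class_name}"]
'''


def scaffold_language(repo_root, code, class_name):
    """Create spacy/lang/<code>/__init__.py under repo_root and return its path.

    Hypothetical helper for illustration only; it is not part of spaCy.
    """
    lang_dir = Path(repo_root) / "spacy" / "lang" / code
    lang_dir.mkdir(parents=True, exist_ok=True)
    init_file = lang_dir / "__init__.py"
    if not init_file.exists():  # never overwrite a file that is already there
        init_file.write_text(
            TEMPLATE.format(code=code, class_name=class_name), encoding="utf-8"
        )
    return init_file
```

For the Icelandic example you would call scaffold_language with the path to your cloned repository, the code "is" and the class name "Icelandic". The script only writes files; it does not need spaCy to be importable.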

To test if everything works, start the Python interpreter and import and initialise your language. For example, let's assume you've added Icelandic. You should then be able to do this:

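import importlib
# "is" is a reserved word in Python, so "from spacy.lang.is import Icelandic" raises a SyntaxError.
Icelandic = importlib.import_module("spacy.lang.is").Icelandic
nlp = Icelandic()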