My working environment: Windows 7 x64, Python 2.7 x64, Scrapy 0.22, cx_Freeze 4.3.2.
First, I developed a simple crawl spider, and it works fine. Then, using the core Scrapy API, I created an external script, main.py, which runs the spider; it also works as required. Here is the code of the script:
# external main.py using the Scrapy core API; 'test' is a placeholder for my project's real name
from twisted.internet import reactor
from scrapy.crawler import Crawler
from scrapy import log, signals
from test.spiders.testSpider import TestSpider
from test import settings, pipelines
from scrapy.utils.project import get_project_settings
spider = TestSpider(domain='test.com')
settings = get_project_settings()
crawler = Crawler(settings)
crawler.signals.connect(reactor.stop, signal=signals.spider_closed)
crawler.configure()
crawler.crawl(spider)
crawler.start()
log.start()
reactor.run()
Now I'm trying to build a binary from all of this with cx_Freeze, using a setup.py like the one in another topic here. Here is the code:
from cx_Freeze import setup, Executable

includes = ['scrapy', 'pkg_resources', 'lxml.etree', 'lxml._elementpath']

build_options = {'compressed': True,
                 'optimize': 2,
                 'namespace_packages': ['zope', 'scrapy', 'pkg_resources'],
                 'includes': includes,
                 'excludes': []}

executable = Executable(script='main.py',
                        copyDependentFiles=True,
                        includes=includes)

setup(name='Stand-alone scraper',
      version='0.1',
      description='Stand-alone scraper',
      options={'build_exe': build_options},
      executables=[executable])
It compiles into an exe file without errors. The problems start when I try to run it:
Traceback (most recent call last):
File "C:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in <module>
exec code in m.__dict__
File "main.py", line 2, in <module>
from scrapy.crawler import Crawler
File "C:\Python27\lib\site-packages\scrapy\__init__.py", line 6, in <module>
__version__ = pkgutil.get_data(__package__, 'VERSION').strip()
File "C:\Python27\lib\pkgutil.py", line 591, in get_data
return loader.get_data(resource_name)
IOError: [Errno 2] No such file or directory: 'scrapy\\VERSION'
I solved this problem by moving the scrapy\VERSION file from the original source (python\lib\site-packages\scrapy) to library.zip\scrapy in the build folder. When I ran main.exe a second time, I got another message:
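The manual step described above could be scripted so it does not have to be repeated after every build. This is only a sketch using the standard-library zipfile module: the helper name add_file_to_zip is made up for illustration, and the paths in the trailing comment are assumptions about a typical Python 2.7 install and build folder, not values taken from my machine.

```python
import zipfile


def add_file_to_zip(zip_path, source_file, arcname):
    """Append source_file to the archive at zip_path under arcname.

    Returns False and leaves the archive untouched if an entry with
    that name already exists, True if the file was added.
    (Hypothetical helper, not part of cx_Freeze or Scrapy.)
    """
    with zipfile.ZipFile(zip_path, 'a') as archive:
        if arcname in archive.namelist():
            return False
        archive.write(source_file, arcname)
        return True


# Example call with assumed paths (adjust to the real install and build folder):
# add_file_to_zip(r'build\exe.win-amd64-2.7\library.zip',
#                 r'C:\Python27\Lib\site-packages\scrapy\VERSION',
#                 'scrapy/VERSION')
```

Running it after each build puts the data file back in place, but it only addresses this first error, not the missing-module error that follows.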
Traceback (most recent call last):
File "C:\Python27\lib\site-packages\cx_Freeze\initscripts\Console.py", line 27, in <module>
exec code in m.__dict__
File "main.py", line 11, in <module>
crawler = Crawler(settings)
File "C:\Python27\lib\site-packages\scrapy\crawler.py", line 20, in __init__
self.stats = load_object(settings['STATS_CLASS'])(self)
File "C:\Python27\lib\site-packages\scrapy\utils\misc.py", line 42, in load_object
raise ImportError("Error loading object '%s': %s" % (path, e))
ImportError: Error loading object 'scrapy.statscol.MemoryStatsCollector': No module named statscol
I couldn't find a solution for this, so I just tried importing the module named in the error message in my main.py. In short, it didn't work: after every new import I got a new message naming another module (in total I tried importing 15 modules :)), until I got an error about the aes module in cryptography. I also tried cx_Freeze alternatives such as py2exe and PyInstaller, with the same result.
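A likely reason for the chain of missing modules, judging from the traceback, is that load_object resolves classes from dotted-path strings held in the settings, so the module names never appear in an import statement that a freezer could detect. The sketch below imitates that mechanism with the standard library only. It is a simplified illustration, not Scrapy's actual code, and both function names are hypothetical.

```python
from importlib import import_module


def load_object_sketch(path):
    """Resolve a dotted path such as 'package.module.ClassName' to the object.

    The module name exists only inside a string, so nothing in the
    calling script imports it explicitly.
    """
    module_name, _, attr_name = path.rpartition('.')
    module = import_module(module_name)
    return getattr(module, attr_name)


def modules_for(paths):
    """Return the sorted, de-duplicated module names the dotted paths refer to."""
    return sorted(set(path.rpartition('.')[0] for path in paths))


# The path from the traceback maps to the module the frozen exe could not find:
# modules_for(['scrapy.statscol.MemoryStatsCollector'])
```

A list built with something like modules_for over the class paths in the settings could then be fed to the 'includes' option in setup.py, though I have not verified that this covers every module Scrapy loads this way.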
Can anybody help me to solve this problem? Thank you for reading to this point.