I downloaded DrQA, a reading-comprehension system for open-domain question answering developed by Facebook's research group. The setup included downloading the provided trained models and data for Wikipedia question answering. Yet when I tried to launch the interactive session, this AttributeError appeared:
(drqa_env) mike@mike-thinks:~/Programing/DrQA$ python scripts/pipeline/interactive.py
07/16/2018 11:38:35 PM: [ Running on CPU only. ]
07/16/2018 11:38:35 PM: [ Initializing pipeline... ]
07/16/2018 11:38:35 PM: [ Initializing document ranker... ]
07/16/2018 11:38:35 PM: [ Loading /home/mike/Programing/DrQA/data/wikipedia/docs-tfidf-ngram=2-hash=16777216-tokenizer=simple.npz ]
Traceback (most recent call last):
File "scripts/pipeline/interactive.py", line 83, in <module>
tokenizer=args.tokenizer
File "/home/mike/Programing/DrQA/drqa/pipeline/drqa.py", line 109, in __init__
self.ranker = ranker_class(**ranker_opts)
File "/home/mike/Programing/DrQA/drqa/retriever/tfidf_doc_ranker.py", line 37, in __init__
matrix, metadata = utils.load_sparse_csr(tfidf_path)
File "/home/mike/Programing/DrQA/drqa/retriever/utils.py", line 33, in load_sparse_csr
loader = np.load(filename)
File "/home/mike/Programing/drqa_env/lib/python3.6/site-packages/numpy/lib/npyio.py", line 414, in load
pickle_kwargs=pickle_kwargs)
File "/home/mike/Programing/drqa_env/lib/python3.6/site-packages/numpy/lib/npyio.py", line 173, in __init__
_zip = zipfile_factory(fid)
File "/home/mike/Programing/drqa_env/lib/python3.6/site-packages/numpy/lib/npyio.py", line 103, in zipfile_factory
return zipfile.ZipFile(file, *args, **kwargs)
File "/usr/lib/python3.6/zipfile.py", line 1108, in __init__
self._RealGetContents()
File "/usr/lib/python3.6/zipfile.py", line 1175, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
Exception ignored in: <bound method NpzFile.__del__ of <numpy.lib.npyio.NpzFile object at 0x7fa18140e9e8>>
Traceback (most recent call last):
File "/home/mike/Programing/drqa_env/lib/python3.6/site-packages/numpy/lib/npyio.py", line 210, in __del__
self.close()
File "/home/mike/Programing/drqa_env/lib/python3.6/site-packages/numpy/lib/npyio.py", line 201, in close
if self.zip is not None:
AttributeError: 'NpzFile' object has no attribute 'zip'
It seems to mean that NumPy's NpzFile object has no attribute zip. I think it tried to unzip the models file but doesn't have the proper tools.
Do you know how to cope with it? I already had a problem with this repository while downloading its very large trained-models archive, which I describe in a GitHub issue.
Update
According to abarnet, it looks like the real problem is that the zipfile module can't unzip the file (and then NumPy's error handling isn't quite as beautiful as it could be), so the AttributeError is only a side effect of the BadZipFile error above it. The file might be corrupted, so I tested the file that triggered the error with unzip:
mike@mike-thinks:~/Programing/DrQA/data/wikipedia$ unzip -t docs-tfidf-ngram=2-hash=16777216-tokenizer=simple.npz
Archive: docs-tfidf-ngram=2-hash=16777216-tokenizer=simple.npz
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of docs-tfidf-ngram=2-hash=16777216-tokenizer=simple.npz or
docs-tfidf-ngram=2-hash=16777216-tokenizer=simple.npz.zip, and cannot find docs-tfidf-ngram=2-hash=16777216-tokenizer=simple.npz.ZIP, period.
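The same check can be done from Python before calling np.load. What follows is only a minimal sketch: diagnose_npz is a hypothetical helper name of mine (it is not part of DrQA or NumPy), and it assumes the .npz file was written as an ordinary zip archive, which normally starts with the bytes PK\x03\x04 and ends with an end-of-central-directory record.

```python
import os
import zipfile


def diagnose_npz(path):
    """Hypothetical helper: describe why `path` might fail to load as .npz."""
    if not os.path.exists(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    with open(path, "rb") as f:
        magic = f.read(4)
    # Assumption: a non-empty zip archive begins with a local file header.
    if magic != b"PK\x03\x04":
        return "not a zip (unexpected header)"
    # is_zipfile looks for the end-of-central-directory record near the end
    # of the file; a download that stopped early will not have one.
    if not zipfile.is_zipfile(path):
        return "truncated or corrupted zip"
    return "ok"


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(diagnose_npz(sys.argv[1]))
```

Run it with the path of the .npz file as the only argument. If it reports a truncated or corrupted zip, the download probably did not finish, and comparing the file size with the size of the published archive would be my next step; if it reports an unexpected header, the file may be something else entirely (for example an error page saved under that name).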