I am building a Python 3.6 AWS Lambda deployment package and am facing an issue with SQLite.
In my code I am using nltk, which has an import sqlite3 in one of its files.
Steps taken so far:
1. The deployment package has only the Python modules that I am using in the root. I get the error: Unable to import module 'my_program': No module named '_sqlite3'
2. Added the _sqlite3.so from /home/my_username/anaconda2/envs/py3k/lib/python3.6/lib-dynload/_sqlite3.so into the package root. Then my error changed to: Unable to import module 'my_program': dynamic module does not define module export function (PyInit__sqlite3)
3. Added the SQLite precompiled binaries from sqlite.org to the root of my package, but I still get the same error as in point #2.
My setup: Ubuntu 16.04, python3 virtual env
AWS Lambda env: python3
How can I fix this problem?
Depending on what you're doing with NLTK, I may have found a solution.
The base nltk module imports a lot of dependencies, many of which are not used by substantial portions of its feature set. In my use case, I'm only using nltk.sent_tokenize, which carries no functional dependency on sqlite3 even though sqlite3 gets imported as a dependency.
I was able to get my code working on AWS Lambda by changing my plain nltk import so that empty modules for sqlite and sqlite.dbapi2 are created dynamically before nltk is imported. When nltk.corpus.reader.panlex_lite tries to import sqlite, it will get our empty module instead of the standard library version. That means the import will succeed, but it also means that when nltk tries to use the sqlite module it will fail.
If you're using any functionality that actually depends on sqlite, I'm afraid I can't help. But if you're trying to use other nltk functionality and just need to get around the lack of sqlite, this technique might work.
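The empty-module technique described above could be sketched roughly as follows. This is not the answerer's original code, which did not survive in this thread. The helper name install_empty_modules is hypothetical, and the module names sqlite3 and sqlite3.dbapi2 are an assumption based on what the standard library calls them.

```python
import sys
import types


def install_empty_modules(names):
    """Register an empty module under each dotted name so later imports succeed."""
    created = {}
    for name in names:
        module = types.ModuleType(name)
        sys.modules[name] = module
        created[name] = module
        # Expose a submodule as an attribute of its (stubbed) parent as well.
        parent_name, _, child_name = name.rpartition(".")
        if parent_name in created:
            setattr(created[parent_name], child_name, module)
    return created


# Assumed module names; this must run before anything imports nltk.
install_empty_modules(["sqlite3", "sqlite3.dbapi2"])

# import nltk  # place the real import here, after the stubs are installed
```

Any code path that later calls into the stubbed module will fail with an AttributeError, which matches the caveat above.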
You need the _sqlite3.so file (as others have pointed out), but the most robust way to get it is to pull it from the (semi-official?) AWS Lambda Docker images available in lambci/lambda. For example, for Python 3.7, here's an easy way to do this:
First, let's grab the _sqlite3.so (library file) from the Docker image:
Next, we'll make a zipped executable with our requirements and code:
Finally, we add the library file to our image:
If you want to use AWS SAM build/packaging, instead copy it into the top-level of the lambda environment package (i.e., next to your other python files).
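The final step, adding the library file to an already built zip, could be sketched in Python as below. This is only an illustration using the standard zipfile module, not the command from the original answer. The function name add_to_zip and the file names in the usage note are hypothetical.

```python
import zipfile
from pathlib import Path


def add_to_zip(zip_path, file_path, arcname=None):
    """Append file_path to the zip at zip_path, at the archive root by default."""
    file_path = Path(file_path)
    arcname = arcname or file_path.name
    # Mode "a" appends to an existing archive instead of overwriting it.
    with zipfile.ZipFile(str(zip_path), mode="a",
                         compression=zipfile.ZIP_DEFLATED) as archive:
        archive.write(str(file_path), arcname=arcname)
    return arcname
```

With a hypothetical package.zip and a library file copied out of the image, the call would look like add_to_zip("package.zip", "_sqlite3.so"), which places the file at the top level of the archive.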
This is a bit of a hack, but I've gotten this working by dropping the _sqlite3.so file from Python 3.6 on CentOS 7 directly into the root of the project being deployed with Zappa to AWS. This should mean that if you can include _sqlite3.so directly in the root of your ZIP, it should work, because it can then be imported by this line in cpython: https://github.com/python/cpython/blob/3.6/Lib/sqlite3/dbapi2.py#L27
Not pretty, but it works. You can find a copy of _sqlite3.so here: https://github.com/Miserlou/lambda-packages/files/1425358/_sqlite3.so.zip
Good luck!
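To check whether a _sqlite3.so dropped into the package root is visible to the interpreter at all, something like the following might help. It is a minimal sketch using importlib from the standard library, and the helper name locate_module is hypothetical. It only reports where the import system would find the module. It does not prove that the binary loads, so an ABI mismatch would still show up at import time.

```python
import importlib.util


def locate_module(name):
    """Return the path the import system would load `name` from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

Calling locate_module("_sqlite3") inside the Lambda handler and logging the result would show whether the copy in the ZIP root is the one being picked up.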
My solution may or may not apply to you (as it depends on Python 3.5), but hopefully it sheds some light on similar issues.
sqlite3 comes with the standard library, but is not built with the python3.6 that AWS uses, for the reason explained by apathyman and other answers.
The quick hack is to include the shared object .so in your lambda package:
find ~ -name _sqlite3.so
In my case:
/home/user/anaconda3/pkgs/python-3.5.2-0/lib/python3.5/lib-dynload/_sqlite3.so
However, that is not totally sufficient. You will get:
ImportError: libpython3.5m.so.1.0: cannot open shared object file: No such file or directory
Because the _sqlite3.so is built with python3.5, it also requires the python3.5 shared object. You will also need that in your deployment package:
find ~ -name libpython3.5m.so*
In my case:
/home/user/anaconda3/pkgs/python-3.5.2-0/lib/libpython3.5m.so.1.0
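For anyone who prefers to script the search, the two find commands above could be mirrored in Python roughly like this. It is a sketch built on pathlib, and the helper name find_files is hypothetical.

```python
from pathlib import Path


def find_files(root, pattern):
    """Return sorted paths of files under root whose name matches the glob pattern."""
    base = Path(root).expanduser()
    return sorted(str(path) for path in base.rglob(pattern) if path.is_file())
```

For example, find_files("~", "_sqlite3.so") and find_files("~", "libpython3.5m.so*") correspond to the two searches in this answer.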
This solution will likely not work if you are using a _sqlite3.so that is built with python3.6, because the libpython3.6 built by AWS will likely not support this. However, this is just my educated guess. If anyone has done this successfully, please let me know.
This isn't a solution, but I have an explanation why.
Python 3 has support for sqlite in the standard library (stable to the point of pip knowing and not allowing installation of pysqlite). However, this library requires the sqlite developer tools (C libs) to be on the machine at runtime. Amazon's linux AMI does not have these installed by default, which is what AWS Lambda runs on (naked ami instances). I'm not sure if this means that sqlite support isn't installed or just won't work until the libraries are added, though, because I tested things in the wrong order.
Python 2 does not support sqlite in the standard library; you have to use a third-party lib like pysqlite to get that support. This means that the binaries can be built more easily without depending on the machine state or path variables.
My suggestion, which you've already done I see, is to just run that function in python 2.7 if you can (and make your unit testing just that much harder :/).
Because of the limitations (it being something baked into python's base libs in 3), it is more difficult to create a lambda-friendly deployment package. The only thing I can suggest is to either petition AWS to add that support to lambda or (if you can get away without actually using the sqlite pieces in nltk) copy anaconda by putting in blank libraries that have the proper methods and attributes but don't actually do anything.
If you're curious about the latter, check out any of the fake/_sqlite3 files in an anaconda install. The idea is only to avoid import errors.
From AusIV's answer, this version works for me in AWS Lambda with NLTK: I created a dummysqllite file to mock the required references.
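A dummy module that mocks named references, in the spirit of the blank libraries mentioned above, could look something like the sketch below. The original dummy file is not shown in this thread, so everything here is an assumption: the helper names, the choice of sqlite3 as the module name, and connect as the only mocked attribute.

```python
import sys
import types


def make_placeholder(module_name, attr_name):
    """Build a callable that imports cleanly but fails loudly when called."""
    def placeholder(*args, **kwargs):
        raise RuntimeError(
            f"{module_name}.{attr_name} is a stub; sqlite is not available here"
        )
    placeholder.__name__ = attr_name
    return placeholder


def install_dummy_module(module_name, attr_names):
    """Register a module whose listed attributes are placeholder callables."""
    dummy = types.ModuleType(module_name)
    for attr_name in attr_names:
        setattr(dummy, attr_name, make_placeholder(module_name, attr_name))
    sys.modules[module_name] = dummy
    return dummy


# Assumed names; run this before importing nltk.
install_dummy_module("sqlite3", ["connect"])
```

Compared with a completely empty module, the placeholders make it obvious in the logs which sqlite call was attempted if some code path does end up needing the real library.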