Question:
I've been trying to run code that imports pandas in AWS Lambda. Here is what I've done.
I have a Python file containing the following simple code (this file holds the Lambda handler):
import json
print('Loading function')
import pandas as pd
def lambda_handler(event, context):
    return "Welcome to Pandas usage in AWS Lambda"
- I zipped this Python file together with the numpy, pandas and pytz libraries as a deployment package (I did all of this on an Amazon EC2 Linux machine)
- Then I uploaded the package to S3
- I created a Lambda function (runtime=python3.6) and uploaded the deployment package from S3
But when I test the lambda function in AWS Lambda, I get the below error:
Unable to import module 'lambda_function': Missing required dependencies ['numpy']
I already have numpy in the zipped package, but I still get this error. I tried to follow the hints given at Pandas & AWS Lambda, but with no luck.
Has anyone run into the same issue? I would appreciate any hints or suggestions for solving this problem.
Thanks
Answer 1:
EDIT: I finally figured out how to run pandas & numpy in an AWS Lambda python 3.6 runtime environment.
I have uploaded my deployment package to the following repo:
git clone https://github.com/pbegle/aws-lambda-py3.6-pandas-numpy.git
Simply add your lambda_function.py to the zip file by running:
zip -ur lambda.zip lambda_function.py
Upload it to S3 and use it as the source for your Lambda function.
ORIGINAL:
The only way I have gotten Pandas to work in a lambda function is by compiling the pandas (and numpy) libraries in an AWS Linux EC2 instance following the steps from this blog post and then using the python 2.7 runtime for my lambda function.
Answer 2:
To include numpy in your Lambda zip, follow the instructions on this page in the AWS docs:
How do I add Python packages with compiled binaries to my deployment package and make the package compatible with AWS Lambda?
To paraphrase the instructions using numpy as an example:
- Open the module pages at pypi.org.
https://pypi.org/project/numpy/
Choose Download files.
Download:
For Python 2.7, module-name-version-cp27-cp27mu-manylinux1_x86_64.whl
e.g. numpy-1.15.2-cp27-cp27mu-manylinux1_x86_64.whl
For Python 3.6, module-name-version-cp36-cp36m-manylinux1_x86_64.whl
e.g. numpy-1.15.2-cp36-cp36m-manylinux1_x86_64.whl
- Uncompress the wheel file in the /path/to/project-dir folder.
You can use the unzip command on the command line to do this, although other tools work too.
unzip numpy-1.15.2-cp36-cp36m-manylinux1_x86_64.whl
When the wheel file is uncompressed, your deployment package will be compatible with Lambda.
Hope that all makes sense ;)
The end result might look something like this.
Note: you should not include the whl file in the deployment package.
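Picking the right file from the PyPI download list is the step that is easiest to get wrong, so here is a minimal sketch of how the wheel filename can be checked before unzipping. It relies only on the standard wheel naming scheme (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The function names `parse_wheel_filename` and `matches_lambda_runtime` are hypothetical helpers made up for this illustration, and the "manylinux + x86_64" rule is an assumption that fits the python3.6 runtime discussed here.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its naming-scheme components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[:-len(".whl")].split("-")
    # A wheel name has 5 parts, or 6 when the optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    return WheelTags(parts[0], parts[1], parts[-3], parts[-2], parts[-1])


def matches_lambda_runtime(filename: str, python_tag: str = "cp36") -> bool:
    """Hypothetical check: same CPython tag and a 64-bit manylinux platform."""
    tags = parse_wheel_filename(filename)
    return (tags.python_tag == python_tag
            and tags.platform_tag.startswith("manylinux")
            and tags.platform_tag.endswith("x86_64"))
```

For a Python 2.7 function you would pass `python_tag="cp27"` instead.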
Answer 3:
After doing a lot of research I was able to make it work with Lambda layers.
Create or open a clean directory and follow the steps below:
Prerequisites: Make sure you have Docker up and running
- Create a requirements.txt file with the following:
pandas==0.23.4
pytz==2018.7
- Create a get_layer_packages.sh file with the following:
#!/bin/bash
export PKG_DIR="python"
rm -rf ${PKG_DIR} && mkdir -p ${PKG_DIR}
docker run --rm -v "$(pwd)":/foo -w /foo lambci/lambda:build-python3.6 \
    pip install -r requirements.txt --no-deps -t ${PKG_DIR}
- Run the following commands in the same directory:
chmod +x get_layer_packages.sh
./get_layer_packages.sh
zip -r pandas.zip .
Upload the layer zip to an S3 bucket.
Publish the layer to AWS by running the command below:
aws lambda publish-layer-version --layer-name pandas-layer --description "Description of your layer" \
    --content S3Bucket=<bucket name>,S3Key=<layer-name>.zip \
    --compatible-runtimes python3.6 python3.7
Go to the Lambda console and upload your code as a zip file, or use the inline editor.
Click Layers > Add a layer, search for the layer (pandas-layer) in the Compatible layers list, and select the version.
Also add the AWSLambda-Python36-SciPy1x layer, which is available by default, so that numpy can be imported.
- Test the code. It should work now!!!!
Thanks to this medium article https://medium.com/@qtangs/creating-new-aws-lambda-layer-for-python-pandas-library-348b126e9f3e
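The reason the script installs into a folder named python is that Lambda unpacks a layer under /opt and the Python runtimes look for packages in /opt/python, so every path inside the layer zip has to start with python/. As a rough sketch of that layout rule (not a replacement for the zip command above), the archive could also be built with the standard library. `build_layer_zip` is a hypothetical helper name, and it assumes the zip file is written somewhere outside the package directory.

```python
import os
import zipfile


def build_layer_zip(pkg_dir: str, zip_path: str, prefix: str = "python") -> list:
    """Zip everything under pkg_dir so that archive names start with prefix/."""
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(pkg_dir):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                rel_path = os.path.relpath(full_path, pkg_dir)
                # Zip member names always use forward slashes.
                arcname = "/".join([prefix] + rel_path.split(os.sep))
                archive.write(full_path, arcname)
                written.append(arcname)
    return written
```

Calling `build_layer_zip("python", "pandas.zip")` from the clean directory would produce an archive whose entries all begin with python/.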
Answer 4:
To use additional libraries in Lambda, we need to compile them on Amazon Linux (this is important if the underlying library is based on C or C++, as Numpy is) and package them in a ZIP file together with the Python script you want to run in Lambda.
To get the Amazon Linux compiled version of the libraries, you can either find a version that someone has already compiled, like the one by @pbegle, or compile it yourself. To compile it yourself there are two options:
- compile the libraries on an EC2 instance https://streetdatascience.com/2016/11/24/using-numpy-and-pandas-on-aws-lambda/
- compile the libraries on a docker version of Lambda environment
https://serverlesscode.com/post/scikitlearn-with-amazon-linux-container/
Following the last option with Docker, it is possible to make it work using the instructions in the blog post above and by adding:
pip install --use-wheel pandas
in the script to compile the libraries:
https://github.com/ryansb/sklearn-build-lambda/blob/master/build.sh#L21
Answer 5:
This is close to a duplicate of "Cannot find MySQL in NodeJS using AWS Lambda".
You need to package your libraries with your Lambda function. Because Lambda runs on a public cloud, you cannot configure the environment yourself.
In your case, since you are using pandas, you need to package pandas in your zip. Find the path to pandas (for example: /Users/dummyUser/anaconda/lib/python3.6/site-packages) and copy the library to the place where you keep your Lambda function code. Inside your code, refer to pandas from your local copy. When uploading, zip the whole set (code + libraries) and upload it as usual. It should work.
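If the copied libraries sit directly next to the handler file, a plain import finds them. If you would rather keep them in a subfolder, one way to "refer to the local copy" is to put that folder on sys.path before the import. The sketch below assumes a subfolder called vendor, which is a name chosen for this example, and `add_vendor_dir` is a hypothetical helper, not part of any AWS API.

```python
import os
import sys


def add_vendor_dir(base_dir: str, name: str = "vendor") -> str:
    """Put base_dir/name at the front of sys.path and return that path."""
    vendor_dir = os.path.join(os.path.abspath(base_dir), name)
    if vendor_dir not in sys.path:
        sys.path.insert(0, vendor_dir)
    return vendor_dir


# In the handler file this would be called before importing pandas, e.g.:
# add_vendor_dir(os.path.dirname(__file__))
# import pandas as pd
```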
Answer 6:
I've been struggling with a similar error while trying to use the python3.6 engine. When I switched to 2.7 it worked fine for me. I used Amazon AMI to create my zip file, but it has only python3.5, not 3.6. I guess the version mismatch was the reason. But it's just a guess, I haven't tried the process on a python3.6 installation yet.
Answer 7:
AWS Lambda uses the Amazon Linux operating system. The idea is to download builds of Pandas and NumPy that are compatible with Amazon Linux. What pip downloads on a Windows or Mac machine is specific to that platform. You need the Linux-compatible version so that your Lambda function can load it. These files are distributed as wheel files.
Create a new local directory containing your lambda_function.py file. Install Pandas into that directory with pip:
$ pip install -t . pandas
Navigate to https://pypi.org/project/pandas/#files. Search for and download the newest *manylinux1_x86_64.whl package. In my case, I'm using Python 3.6 on my Lambda function, so I downloaded the cp36 builds.
Download the whl files to the directory with lambda_function.py. Remove the pandas, numpy, and *.dist-info directories. Unzip the whl files.
$ rm -r pandas numpy *.dist-info
$ unzip numpy-1.16.1-cp36-cp36m-manylinux1_x86_64.whl
$ unzip pandas-0.24.1-cp36-cp36m-manylinux1_x86_64.whl
Remove the whl files, *.dist-info, and __pycache__. Prepare the zip.zip archive:
$ rm -r *.whl *.dist-info __pycache__
$ zip -r zip.zip .
Upload the zip.zip file to your Lambda function.
Source: https://medium.com/@korniichuk/lambda-with-pandas-fd81aa2ff25e
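Since a wheel is just a zip archive, the unzip / rm / zip steps above can also be sketched in Python. This is only an illustration under a few assumptions: `build_package` is a hypothetical helper name, the wheels are already downloaded into the project directory, and instead of deleting the *.whl, *.dist-info and __pycache__ entries it simply leaves them out of the archive.

```python
import glob
import os
import zipfile


def build_package(project_dir: str, zip_name: str = "zip.zip") -> list:
    """Unpack every *.whl in project_dir, then zip the directory for upload."""
    for wheel in glob.glob(os.path.join(project_dir, "*.whl")):
        with zipfile.ZipFile(wheel) as archive:
            archive.extractall(project_dir)

    zip_path = os.path.abspath(os.path.join(project_dir, zip_name))
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, dirs, files in os.walk(project_dir):
            # Prune metadata and bytecode caches from the walk.
            dirs[:] = [d for d in dirs
                       if d != "__pycache__" and not d.endswith(".dist-info")]
            for name in files:
                full_path = os.path.join(root, name)
                # Skip the wheels themselves and the archive being written.
                if name.endswith(".whl") or os.path.abspath(full_path) == zip_path:
                    continue
                arcname = os.path.relpath(full_path, project_dir).replace(os.sep, "/")
                archive.write(full_path, arcname)
                names.append(arcname)
    return sorted(names)
```

The returned list shows exactly which files ended up in the archive, which makes it easy to confirm that lambda_function.py sits at the top level.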
Answer 8:
This is similar to Randeep's answer but you don't need to use Lambda Layers if you don't want to do that.
As others have stated, this is not working because pandas/numpy require binaries to be built and the operating system of your build machine (Linux, Mac, Windows) does not match the operating system of Lambda (Amazon Linux).
To solve this, you can use Docker to download/build your dependencies and package them up on Amazon Linux. Amazon provides a Docker image for this purpose. See below for how I built my Python package for the Python 3.6 runtime (there are other Docker images for the other runtimes):
Put all of your dependencies into a requirements.txt file, for example:
openpyxl
boto3
pandas
Create a script (e.g. named build.sh) that will build your package. Here is what mine looked like:
#!/bin/bash
# remove old build artifacts
rm -rf build
rm -f lambda_package.zip
# make build dir and copy my lambda handler file into it
mkdir build
cp lambda_daily_util_gen.py build/
# Use requirements file to download/build dependencies into the build folder
cd build
pip install -r ../requirements.txt --target .
# Create a Lambda package with my files and all dependencies
zip -r9 ../lambda_package.zip .
Ensure you have the Amazon Linux lambda build image pulled:
$ docker pull lambci/lambda
Run your build script inside of the docker container:
Mac/Linux:
$ docker run --rm -v "$PWD":/var/task lambci/lambda:build-python3.6 /var/task/build.sh
Windows:
docker run --rm -v ${PWD}:/var/task lambci/lambda:build-python3.6 chmod +x build.sh;./build.sh
You should now see a file named lambda_package.zip that was built on Amazon Linux, which you can upload to AWS.
Hope that helps.
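If you want a quick sanity check that a package really was built for Linux before uploading it, you can look at the names of the compiled files inside the zip. The sketch below is a heuristic only: it assumes the usual naming habits (Windows extensions end in .pyd or .dll, macOS builds end in .dylib or carry "darwin" in the .so name), and `foreign_binaries` is a hypothetical helper name.

```python
import zipfile


def foreign_binaries(zip_path: str) -> list:
    """Return archive members that look like non-Linux compiled extensions."""
    suspicious = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            lower = name.lower()
            if lower.endswith((".pyd", ".dll", ".dylib")):
                suspicious.append(name)  # Windows or macOS binary
            elif lower.endswith(".so") and "darwin" in lower:
                suspicious.append(name)  # macOS build of a CPython extension
    return suspicious
```

An empty list from `foreign_binaries("lambda_package.zip")` does not prove the package will import, but a non-empty one is a strong hint that it was built on the wrong operating system.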
Answer 9:
With the Serverless Framework, you can easily package and deploy your dependencies correctly.
You only need to:
Install serverless:
npm install -g serverless
Create a serverless.yml in the root of your project with the following:
service: numpy-test
# define the environment of your lambda
provider:
  name: aws
  runtime: python3.6
# specify the function you want to deploy
functions:
  numpy:
    # path to your lambda_handler function
    handler: path/to/function.lambda_handler
# add a plugin that allows serverless to package python libraries
# specified in the requirements.txt or Pipfile
plugins:
  - serverless-python-requirements
# this section makes sure your libraries get built correctly
# for an aws lambda environment
custom:
  pythonRequirements:
    dockerizePip: non-linux
Adjust path/to/function.lambda_handler to point at your handler.
Make sure Docker is running and execute:
serverless deploy
Once the deployment is finished, go to the AWS console, look for the function numpy-test-dev-numpy, and test it.
This article explains the necessary steps in detail.
Answer 10:
Your code always gives this error because Lambda does not include any external libraries; it only has the libraries that ship with Python by default.
If you use an external library such as pandas or numpy, you need to install that library on AWS Lambda before using it.
Look at your code:
import json
print('Loading function')
import pandas as pd
def lambda_handler(event, context):
    return "Welcome to Pandas usage in AWS Lambda"
There is no installation of the pandas library here, so your code does not work.
My suggestion is to structure your code as follows, with all of your code inside the Lambda function:
import json
def lambda_handler(event, context):
    # install python library here
    print('Loading function')
    import pandas as pd
    return "Welcome to Pandas usage in AWS Lambda"
So the final code looks as follows:
def lambda_handler(event, context):
    import pip

    def install(package):
        # older pip versions expose main() at the top level
        if hasattr(pip, 'main'):
            pip.main(['install', package])
        else:
            pip._internal.main(['install', package])

    if __name__ == '__main__':
        install('pandas')
    # install python library here
    print('Loading function')
    import pandas as pd
    return "Welcome to Pandas usage in AWS Lambda"
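Whichever way the libraries end up in the function, it helps to see why an import fails. The "Missing required dependencies ['numpy']" message from the question is pandas' summary of a failed numpy import, so importing numpy on its own usually shows the underlying error. Here is a minimal sketch of a diagnostic handler along those lines. `import_report` is a hypothetical helper name, and the module list is simply the three libraries mentioned in the question.

```python
import importlib


def import_report(names):
    """Try to import each module; map name -> None on success or the error text."""
    report = {}
    for name in names:
        try:
            importlib.import_module(name)
            report[name] = None
        except ImportError as exc:
            report[name] = f"{type(exc).__name__}: {exc}"
    return report


def lambda_handler(event, context):
    # numpy and pytz come first so their own error text is captured,
    # rather than only pandas' summary of the failure.
    return import_report(["numpy", "pytz", "pandas"])
```

Running this as a test event returns one entry per library, with the original error text for anything that could not be loaded.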