I am experiencing an issue where Lambda functions occasionally time out without any error message other than a notification that the function timed out.
To find the root of the issue, I added logging at various points throughout the function and determined that everything works properly until the first getItem() request to read data from DynamoDB. That read appears to take longer than the function's 3-second timeout.
Naturally, I checked my DynamoDB table to see if there were any throttled reads or errors. DynamoDB's metrics show no throttles or errors, and read times remain in the double-digit milliseconds at most.
Clearly something is going wrong or getting dropped along the way. How can I fix this issue or at least catch it and retry the read?
This is a read-oriented function for a web API, so response times are critical. Hence, an increased timeout will not solve the issue.
var AWS = require('aws-sdk');
var dynamodb = new AWS.DynamoDB();

dynamodb.getItem({
    "TableName": "tablename",
    "Key": { "keyname": { "S": "keyvalue" } },
    "AttributesToGet": [ "attributeA", "attributeB" ]
}, function(err, data) {
    if (err) {
        // Pass DynamoDB errors back to the caller.
        context.done(err);
    } else if ("Item" in data) {
        // Item found; continue processing the request.
        nextFunction(event, context);
    } else {
        // The read succeeded but no item exists for this key.
        context.done("Invalid key");
    }
});
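For illustration, a shorter connection/socket timeout plus the SDK's built-in retries would at least let the read fail fast and be retried within the 3-second window. This is only a sketch of the client construction using standard aws-sdk options (the values are illustrative), not a fix for the underlying problem; the handler above is unchanged.

var AWS = require('aws-sdk');

// Fail fast so the SDK's built-in retry logic can run within the
// 3-second Lambda timeout (timeout values here are illustrative).
var dynamodb = new AWS.DynamoDB({
    maxRetries: 2,
    httpOptions: {
        connectTimeout: 1000, // ms allowed to establish the connection
        timeout: 1000         // ms of socket inactivity before aborting
    }
});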
If you are launching your Lambda function in a VPC, try launching it in a private subnet instead of a public subnet. I had the same problem, and launching the Lambda in a private subnet worked for me.
After significantly increasing the timeout, I found that a network error is eventually thrown.

This appears to be caused by an incompatibility between Node.js and OpenSSL, according to this thread. It sounds like the problem affects Node.js 4.x and up but not 0.10, which means you can either resolve it by downgrading the Lambda runtime to Node.js 0.10 or by adjusting the HTTPS agent that aws-sdk uses, as sketched below.
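The following is a sketch of that kind of workaround, assuming the fix is to give the DynamoDB client an explicit https.Agent that pins the TLS method and cipher list; adjust the agent options to whatever the linked thread recommends.

var AWS = require('aws-sdk');
var https = require('https');

// Work around the Node.js/OpenSSL handshake problem by supplying an
// explicit HTTPS agent to the DynamoDB client (options are illustrative).
var dynamodb = new AWS.DynamoDB({
    httpOptions: {
        agent: new https.Agent({
            rejectUnauthorized: true,
            secureProtocol: "TLSv1_method",
            ciphers: "ALL"
        })
    }
});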
Ran into random Lambda timeout issues while putting data from Lambda to DynamoDB. The Lambda function resides in a VPC (per organization policy).
Issue: Some (random) Lambda containers would consistently fail while putting data and time out (timeout set to 30 seconds), while other containers finished the put in a few milliseconds.
Root cause: Two subnets were configured (as suggested by AWS): one private and one public. When a new Lambda container is spun up, it randomly selects one of the subnets. If it chose the public subnet, it would consistently fail; if it chose the private subnet, it finished in a few milliseconds. A Lambda ENI in a public subnet has no public IP, so traffic routed through the internet gateway never reaches DynamoDB, whereas a private subnet routes outbound traffic through a NAT gateway.
Solution: Remove the public subnet and configure two private subnets instead.
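If the function's networking is managed programmatically, the same change can be made with a single UpdateFunctionConfiguration call. The sketch below uses aws-sdk for Node.js; the function name, subnet IDs, and security group ID are placeholders.

var AWS = require('aws-sdk');
var lambda = new AWS.Lambda();

// Attach the function to private subnets only (all IDs are placeholders).
lambda.updateFunctionConfiguration({
    FunctionName: "my-function",
    VpcConfig: {
        SubnetIds: ["subnet-private-1", "subnet-private-2"],
        SecurityGroupIds: ["sg-12345678"]
    }
}, function(err, data) {
    if (err) console.error(err);
    else console.log("Updated VPC config:", data.VpcConfig);
});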