We have a Play 1.2.5 application that has had problems with becoming unresponsive.
After setting proper memory settings for the application the problem hasn't recurred (for a couple of days at the moment), but I'd like to get an idea of the actual cause and whether there's some way to see it in the logs.
In our setup, we have:
- Play 1.2.5 application running on AWS (Ubuntu 12.04)
- MySQL RDS database
- Apache server working as a proxy (handling SSL etc).
This has happened for various calls, but I have an example from a monitoring healthcheck with a simple renderText implementation (just 200 & "OK"). We've had these every now and then, and the application has become responsive again without a restart.
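For reference, the healthcheck controller is essentially along these lines (a minimal sketch from memory; the class name is illustrative, not necessarily our exact code):

package controllers;

import play.mvc.Controller;

// Minimal healthcheck action: just answers HTTP 200 with the body "OK".
public class Monitor extends Controller {

    public static void healthcheck() {
        renderText("OK");
    }
}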
Apache access log had:
(IP addr) - - [01/Mar/2013:09:31:16 +0200] "GET /monitor/healthcheck HTTP/1.1" 502 4305 "-" "NING/1.0"
Apache error log had:
[Fri Mar 01 09:36:16 2013] [error] [client (IP addr)] (70007)The timeout specified has expired: proxy: error reading status line from remote server localhost:8080
[Fri Mar 01 09:36:16 2013] [error] [client (IP addr)] proxy: Error reading from remote server returned by /monitor/healthcheck
(Apache has a 300 s = 5 min proxy timeout.)
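The proxy part of the Apache configuration is roughly like this (a sketch, assuming a plain mod_proxy reverse proxy to the Play port; the actual vhost may differ in details):

# Illustrative only: forward everything to the Play instance on localhost:8080,
# give up (502) if the backend hasn't answered within 300 s.
ProxyPreserveHost On
ProxyTimeout 300
ProxyPass        / http://localhost:8080/
ProxyPassReverse / http://localhost:8080/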
The Play logs haven't had anything for these requests (we have request URL logging in the controller, so either the request never made it that far OR the logging itself had problems).
My first thought was running out of threads. That seems pretty unlikely to me, since:
- We are still under development, so traffic is pretty low
- This has also occurred in cases where the logs show no previous traffic for a couple of hours
- We have 10 threads (play.pool=10)
- We don't use async WS calls (those seem to be somewhat buggy with Play 1.2.X)
- No calls block for a long time
- With random checks after various kinds of usage there don't seem to be hanging threads (examined with jstack, everything looks ~OK)
(Maybe related, maybe not): One time jstack didn't respond when we called it:
$ jstack 7842
7842: Unable to open socket file: target process not responding or HotSpot VM not loaded
The -F option can be used when the target process is not responding
However, before trying -F we tried again and got a proper response, so if the JVM was in some unresponsive state, it recovered pretty quickly.
With some assistance we set up proper memory settings, and since then (last Friday, 2013-03-01) we haven't had this problem:
jvm.memory=-Xms64m -Xmx512m -XX:PermSize=64m -XX:MaxPermSize=256m
However, no memory issues were printed in the logs. I'm still a bit worried, since I don't have a clue about the actual cause, so:
- What might be the cause?
  - Some memory issue? If so, why wasn't it found in the logs?
  - Some (nondeterministic) thing that leaves threads blocked for a long time?
- Is there some way to see the cause in the logs if this happens again?
  - Are some settings needed to get memory issues into the log? (see the flag sketch below)
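For the last point, what I have in mind is something along these lines, appended to the existing jvm.memory line (just a sketch of flags I'm considering, not something we currently run with):

# Illustrative only: GC logging plus a heap dump on OutOfMemoryError
jvm.memory=-Xms64m -Xmx512m -XX:PermSize=64m -XX:MaxPermSize=256m -verbose:gc -XX:+PrintGCDetails -Xloggc:logs/gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=logs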
UPDATE: The cause seems likely to be MySQL connection testing hanging. I created another, more focused question and will try to remember to update this one as well once the issue is solved.
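In the meantime, one thing I may try is making the healthcheck touch the database explicitly, so a hanging connection test would show up on this endpoint too. A sketch only (the healthcheckDb action and the SELECT 1 query are hypothetical, not our current code):

package controllers;

import java.sql.Connection;
import java.sql.Statement;

import play.db.DB;
import play.mvc.Controller;

public class Monitor extends Controller {

    // Hypothetical healthcheck variant that also exercises the MySQL connection.
    public static void healthcheckDb() {
        try {
            Connection conn = DB.getConnection(); // connection is managed per request by Play
            Statement stmt = conn.createStatement();
            try {
                stmt.execute("SELECT 1");
            } finally {
                stmt.close();
            }
        } catch (Exception e) {
            response.status = 500;
            renderText("DB ERROR: " + e.getMessage());
        }
        renderText("OK");
    }
}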