Elasticsearch unclosed client. Live threads after

2019-09-09 06:13发布

I am using Elasticsearch 1.5.1 and Tomcat 7. Web application creates a TCP client instance as Singleton during server startup through Spring Framework.

Just noticed that I failed to close the client during server shutdown. Through analysis on various tools like VisualVm, JConsole, MAT in Eclipse, it is evident that threads created by the elasticsearch client are live even after server(tomcat) shutdown.

Note: after introducing client.close() via Context Listener destroy methods, the threads are killed gracefully.

But my query here is,

  1. how to check the memory occupied by these live threads?
  2. Memory leak impact due to this thread?

We have got few Out of memory:Perm gen errors in PROD. This might be a reason but still I would like to measure and provide stats for this.

Any suggestions/help please.

1条回答
Root(大扎)
2楼-- · 2019-09-09 06:57

Typically clients run in a different process than the services they communicate with. For example, I can open a web page in a web browser, and then shutdown the webserver, and the client will remain open.

This has to do with the underlying design choices of TCP/IP. Glossing over the details, under most cases a client only detects it's server is gone during the next request to the server. (Again generally speaking) it does not continually poll the server to see if it is alive, nor does the server generally send a "please disconnect" message on shutting down.

The reason that clients don't generally poll servers is because it allows the server to handle more clients. With a polling approach, the server is limited by the number of clients running, but without a polling approach, it is limited by the number of clients actively communicating. This allows it to support more clients because many of the running clients aren't actively communicating.

The reason that servers typically don't send an "I'm shutting down" message is because many times the server goes down uncontrollably (power outage, operating system crash, fire, short circuit, etc) This means that an protocol which requires such a message will leave the clients in a corrupt state if the server goes down in an uncontrolled manner.

So losing a connection is really a function of a failed request to the server. The client will still typically be running until it makes the next attempt to do something.

Likewise, opening a connection to a server often does nothing most of the time too. To validate that you really have a working connection to a server, you must ask it for some data and get a reply. Most protocols do this automatically to simplify the logic; but, if you ever write your own service, if you don't ask for data from the server, even if the API says you have a good "connection", you might not. The API can report back a good "connections" when you have all the stuff configured on your machine successfully. To really know if it works 100% on the other machine, you need to ask for data (and get it).

Finally servers sometimes lose their clients, but because they don't waste bandwidth chattering with clients just to see if they are there, often the servers will put a "timeout" on the client connection. Basically if the server doesn't hear from the client in 10 minutes (or the configured value) then it closes the cached connection information for the client (recreating the connection information as necessary if the client comes back).

From your description it is not clear which of the scenarios you might be seeing, but hopefully this general knowledge will help you understand why after closing one side of a connection, the other side of a connection might still think it is open for a while.

There are ways to configure the network connection to report closures more immediately, but I would avoid using them, unless you are willing to lose a lot of your network bandwidth to keep-alive messages and don't want your servers to respond as quickly as they could.

查看更多
登录 后发表回答