ArangoDB Too many open files

2019-06-25 23:31发布

问题:

since a few days we encounter a problem with our ArangoDB installation. A few minutes/up to an hour after start up all connections to the database are refused. The arango log file says that there are "Too many open files". A "lsof | grep arango | wc -l" shows that the database has around 50,000 open file handles, which is a lot under the max. allowed by the linux system (around 3m). Has anyone an idea where this error comes from?

We are using a Ubuntu Linux with a 3.13 kernel. 30 GB RAM and three cores. The database is still very small with around 1,5m entries and a size of 50GB.

Thx, secana

EDIT: "netstat -anpt | fgrep 2480" shows:

root@syssec-graphdb-001-test:~# netstat -anpt | fgrep 2480
tcp        0      0 10.215.17.193:2480      0.0.0.0:*               LISTEN               7741/arangod
tcp        0      0 10.215.17.193:2480      10.215.50.30:53453      ESTABLISHED          7741/arangod
tcp        0      0 10.215.17.193:2480      10.215.50.31:49299      ESTABLISHED          7741/arangod
tcp        0      0 10.215.17.193:2480      10.215.50.30:53155      ESTABLISHED          7741/arangod

"ulimit -n" has a result of 1024, so I think that the ~50,000 are all arango processes together.

Last lines in log file before the database died:

2015-05-26T12:20:43Z [9672] ERROR cannot open datafile '/data/arangodb/databases/database-235999516/collection-28464454696/datafile-18806474509149.db': 'Too many open files'
2015-05-26T12:20:43Z [9672] ERROR cannot open datafile '/data/arangodb/databases/database-235999516/collection-28464454696/datafile-18806474509149.db': Too many open files
2015-05-26T12:20:43Z [9672] DEBUG [arangod/VocBase/collection.cpp:1632] cannot open '/data/arangodb/databases/database-235999516/collection-28464454696', check failed
2015-05-26T12:20:43Z [9672] ERROR cannot open document collection from path '/data/arangodb/databases/database-235999516/collection-28464454696'

回答1:

It looks like it will make sense to increase the max. number of open files a process is allowed to manage. Given the stated database size of around 50 GB, the (presumably default) value of 1024 seems to be too low.

arangod will require one file descriptor for each parallel client connection. That may not be many, but in the face of HTTP keep-alive connections this could already account for several file descriptors.

Additionally, each datafile of an active collection will need to be memory-mapped and cost one file descriptor as well. With the default datafile size of 32 MB, a database size of 50 GB (on disk) will already consume 1,600 file descriptors:

50 GB database size / (32 MB default size / 1 datafile) = 1600 datafiles

Increasing the ulimit -n value for the arangod user and environment therefore will make sense. You can confirm that arangod can actually use the configured number of file descriptors by starting it with option --server.descriptors-minimum <value>, e.g.

--server.descriptors-minimum 32768 

for that many file descriptors. If arangod cannot effectively use that specified amount of file descriptors, it will fail at start with a fatal error. Of course that option can also be put into the arangod.conf file.

Additionally, the default size for (new) datafiles can be increased via the journalSize parameter for collections. That won't help right now, but will lower the number of required file descriptors for data saved in the future.



回答2:

For emergencies when you can't restart the database, like in my case, you will find very useful this blog post that explains how you can change the ulimit of a running process.

If your distribution has util-linux-2.21, you can use the "prlimit" tool, or you can compile the small example C program in the blog post that worked great for me.

To check the actual limits of a process you can use:

cat /proc/<PID>/limits

Good luck!



标签: arangodb