I'm using rufus-scheduler to run a number of frequent jobs that do some various tasks with ActiveRecord objects. If there is any sort of network or postgresql hiccup, even after recovery, all the threads will throw the following error until the process is restarted:
ActiveRecord::ConnectionTimeoutError (could not obtain a database connection within 5 seconds (waited 5.000122687 seconds). The max pool size is currently 5; consider increasing it.
The error can easily be reproduced by restarting postgres. I've tried playing (up to 15) with the pool size, but no luck there.
That leads me to believe the connections are just in a stale state, which I thought would be fixed with the call to clear_stale_cached_connections!
.
Is there a more reliable pattern to do this?
The block that is passed is a simple select and update active record call, and happens to matter what the AR object is.
The rufus job:
scheduler.every '5s' do
db do
DataFeed.update #standard AR select/update
end
end
wrapper:
def db(&block)
begin
ActiveRecord::Base.connection_pool.clear_stale_cached_connections!
#ActiveRecord::Base.establish_connection # this didn't help either way
yield block
rescue Exception => e
raise e
ensure
ActiveRecord::Base.connection.close if ActiveRecord::Base.connection
ActiveRecord::Base.clear_active_connections!
end
end
Rufus scheduler starts a new thread for every job.
ActiveRecord on the other hand cannot share connections between threads, so it needs to assign a connection to a specific thread.
When your thread doesn't have a connection yet, it will get one from the pool.
(If all connections in the pool are in use, it will wait untill one is returned from another thread. Eventually timing out and throwing ConnectionTimeoutError)
It is your responsibility to return it back to the pool when you are done with it, in a Rails app, this is done automatically. But if you are managing your own threads (as rufus does), you have to do this yourself.
Lucklily, there is an api for this:
If you put your code inside a with_connection block, it will get a connection form the pool, and release it when it is done
ActiveRecord::Base.connection_pool.with_connection do
#your code here
end
In your case:
def db
ActiveRecord::Base.connection_pool.with_connection do
yield
end
end
Should do the trick....
http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/ConnectionPool.html#method-i-with_connection
The reason can be that you have many threads which are using all connections, if DataFeed.update
method takes more than 5 seconds, than your block can be overlapped.
try
scheduler.every("5s", :allow_overlapping => false) do
#...
end
Also try release connection instead of closing it.
ActiveRecord::Base.connection_pool.release_connection
I don't really know about rufus-scheduler, but I got some ideas.
The first problem could be a bug on rufus-scheduler that does not checkout database connection properly. If it's the case the only solution is to clear stale connections manually as you already do and to inform the author of rufus-scheduler about your issue.
Another problem that could happen is that your DataFeed operation takes a really long time and because it is performed every 5 secondes Rails is running out of database connections, but it's rather unlikely.