How do I clear stuck/stale Resque workers?

Posted 2019-01-15 23:59

As you can see from the attached image, I've got a couple of workers that seem to be stuck. Those processes shouldn't take longer than a couple of seconds.


I'm not sure why they won't clear or how to manually remove them.

I'm on Heroku using Resque with Redis-to-Go and HireFire to automatically scale workers.

15 answers
爷、活的狠高调
#2 · 2019-01-16 00:42

If you are using newer versions of Resque, you'll need to use the following command as the internal APIs have changed...

Resque::WorkerRegistry.working.each {|work| Resque::WorkerRegistry.remove(work.id)}
萌系小妹纸
#3 · 2019-01-16 00:43

Here's how you can purge them from Redis by hostname, where hostname is a string holding the name of the affected server. This happens to me when I decommission a server and its workers do not exit gracefully.

Resque.workers.each { |w| w.unregister_worker if w.id.start_with?(hostname) }
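
A slightly stricter variant of the one-liner above, sketched so it can be tried without a live Redis. Worker IDs have the form hostname:pid:queues, so matching on the hostname plus a colon keeps web1 from also matching web10. The method name unregister_workers_on is hypothetical; it only assumes each worker responds to id and unregister_worker, as in the one-liner.

```ruby
# Hypothetical helper: unregister every worker whose ID belongs to `hostname`.
# Worker IDs look like "hostname:pid:queues", so we match "hostname:" exactly.
def unregister_workers_on(workers, hostname)
  prefix = "#{hostname}:"
  stale = workers.select { |w| w.id.to_s.start_with?(prefix) }
  stale.each(&:unregister_worker)
  # Return the IDs that were unregistered so the caller can log them.
  stale.map { |w| w.id.to_s }
end
```

In a console with Resque loaded, this would be called as unregister_workers_on(Resque.workers, hostname).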
干净又极端
#4 · 2019-01-16 00:49

I've cleared them out from redis-cli directly. Luckily, redistogo.com allows access from environments outside Heroku. Get the dead worker's ID from the list; mine was:

55ba6f3b-9287-4f81-987a-4e8ae7f51210:2

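Run this command in redis-cli. The trailing * is part of the worker ID (the queue list the worker was started with), not a wildcard; DEL does not expand patterns.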

del "resque:worker:55ba6f3b-9287-4f81-987a-4e8ae7f51210:2:*"
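
Note that the del above removes only one key. Resque 1.x's own unregister_worker appears to also remove the worker from the resque:workers set and to delete a :started key plus two per-worker stat keys. A minimal sketch that lists those keys for a given worker ID, assuming the default resque namespace; the helper name worker_cleanup_keys is made up for illustration, and the key names should be checked against your Resque version before relying on them.

```ruby
# Hypothetical helper: list the Redis keys tied to one Resque worker ID.
# Assumes the default "resque" namespace and Resque 1.x key names.
def worker_cleanup_keys(worker_id, namespace: "resque")
  {
    # Member to SREM from the set of registered workers.
    set_member: ["#{namespace}:workers", worker_id],
    # Plain keys to DEL.
    delete: [
      "#{namespace}:worker:#{worker_id}",
      "#{namespace}:worker:#{worker_id}:started",
      "#{namespace}:stat:processed:#{worker_id}",
      "#{namespace}:stat:failed:#{worker_id}"
    ]
  }
end

# Print the redis-cli commands for the worker ID from this answer.
keys = worker_cleanup_keys("55ba6f3b-9287-4f81-987a-4e8ae7f51210:2:*")
puts "SREM \"#{keys[:set_member][0]}\" \"#{keys[:set_member][1]}\""
keys[:delete].each { |k| puts "DEL \"#{k}\"" }
```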

You can monitor the Redis database to see what Resque is doing behind the scenes.

redis xxx.redistogo.com> MONITOR
OK
1380274567.540613 "MONITOR"
1380274568.345198 "incrby" "resque:stat:processed" "1"
1380274568.346898 "incrby" "resque:stat:processed:c65c8e2b-555a-4a57-aaa6-477b27d6452d:2:*" "1"
1380274568.346920 "del" "resque:worker:c65c8e2b-555a-4a57-aaa6-477b27d6452d:2:*"
1380274568.348803 "smembers" "resque:queues"

The second-to-last line deletes the worker.

仙女界的扛把子
#5 · 2019-01-16 00:50

I ran into this issue and started down the path of implementing a lot of the suggestions here. However, I discovered that the root cause was the redis-rb gem, version 3.3.0. Downgrading to redis-rb 3.2.2 prevented these workers from getting stuck in the first place.

够拽才男人
#6 · 2019-01-16 00:50

I had stuck/stale Resque workers here too, or should I say 'jobs', because the worker is actually still there and running fine; it's the forked process that is stuck.

I chose the brutal solution: a bash script kills any forked process that has been "Processing" for more than 5 minutes. The worker then simply spawns the next job in the queue, and everything keeps going.

Have a look at my script here: https://gist.github.com/jobwat/5712437
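
The gist is not reproduced here; what follows is only a sketch of the idea described above, in Ruby. It assumes the forked child sets its process title to something like "resque: Processing <job> since <unix timestamp>". The name stuck_pids and the 300-second default are illustrative, not taken from the gist.

```ruby
# Matches lines of `ps -e -o pid,command` such as:
#   92102 resque: Processing ProcessNumbers since 1253142769
PROCESSING = /^\s*(\d+)\s+resque[^:]*:\s+Processing\s+.*?\s+since\s+(\d+)/

# Hypothetical helper: return the PIDs of Resque children that have been
# "Processing" for longer than max_age seconds.
def stuck_pids(ps_output, now: Time.now.to_i, max_age: 300)
  ps_output.each_line.map do |line|
    m = PROCESSING.match(line)
    next unless m                      # not a "Processing" line
    pid = m[1].to_i
    since = m[2].to_i
    pid if now - since > max_age       # nil when the job is still young
  end.compact
end
```

Feeding it live data would look like stuck_pids(`ps -e -o pid,command`); what to do with the returned PIDs (which signal to send) is left to the caller.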

地球回转人心会变
#7 · 2019-01-16 00:51

Run this command wherever you ran the command to start the server

$ ps -e -o pid,command | grep [r]esque

You should see something like this:

92102 resque: Processing ProcessNumbers since 1253142769

Make a note of the PID (process ID); in my example it is 92102.

Then you can stop the process in one of two ways:

  • Gracefully: kill -QUIT 92102

  • Forcefully: kill -TERM 92102
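
The same signals can also be sent from Ruby with Process.kill, which is part of core Ruby. A minimal sketch, assuming you already have the PID from the ps output; the method name stop_resque_process is made up for illustration.

```ruby
# Hypothetical helper: send QUIT (graceful) or TERM (forceful) to a PID.
# Returns false if no process with that PID exists any more.
def stop_resque_process(pid, graceful: true)
  Process.kill(graceful ? "QUIT" : "TERM", pid)
  true
rescue Errno::ESRCH
  false
end
```

For the example above, that would be stop_resque_process(92102) for the graceful variant or stop_resque_process(92102, graceful: false) for the forceful one.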

Let me know if you have any trouble.
