Benefits and Hindrances of Regular Server Reboots

Posted 2019-04-07 07:03

In my years of working on multiple teams, I've met several infrastructure managers who instituted a policy of weekly server reboots. As a developer, I was always against the policy - it seems to me a hack to work around software bugs and hardware instabilities instead of correcting them.

What are people's opinions on the policy, both positive and negative?

8 answers
▲ chillily
#2 · 2019-04-07 07:42

Another possibility to consider is that in some environments, such as retail stores that are open 24 hours a day, a "store close" event is still run so that servers can be updated, backed up, etc.

Even though the servers need to run "24x7", they're really offline for at least a few minutes every day.

That effectively amounts to a daily server reboot, even though the store is still operating when it happens.

forever°为你锁心
#3 · 2019-04-07 07:47

This is a foolish policy.

Here's why:

  • If you need to reboot a server weekly (and somehow it adds to your infrastructure's stability), you are covering up the real problem with the server or its software. A memory leak? A bad driver? The solution to these problems is to fix them, not to cover them up with a lazy policy.

  • Servers often get rebooted for updates, at least in the Windows world. Rebooting for critical kernel updates happens anyway.

  • Database servers cache a lot of information in RAM. When you reboot your server, this cache is emptied and goes cold. Assuming you have a typical usage pattern, a cold, empty cache will result in slow performance for users when they run their queries after a reboot. It may also increase the time needed for some types of maintenance, such as backups, because more data has to be read from disk.

  • Your servers go down! Your maintenance windows for backups and other things get shortened because your server is off for some nonzero period of time. You also may end up having to tell your users that you will have downtime, depending on your systems' architecture.

  • Assuming you have some sort of notification system for alerting, you will have to configure it to ignore your downtime window. This can mask problems that happen around the time your server reboots, and adds to the amount of configuration you will need to do on your servers.
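
To illustrate the alerting point, here is a minimal Python sketch of a blanket suppression rule. It is not taken from any real monitoring product: the window (Sundays at 03:00 for 30 minutes), the alert names, and the functions `in_reboot_window` and `should_notify` are all hypothetical, chosen only to show how an unrelated alert can be silenced along with the expected one.

```python
from datetime import datetime, timedelta

# Hypothetical weekly reboot window: Sundays at 03:00, lasting 30 minutes.
WINDOW_WEEKDAY = 6  # datetime.weekday(): Monday is 0, Sunday is 6
WINDOW_START_HOUR = 3
WINDOW_LENGTH = timedelta(minutes=30)


def in_reboot_window(ts: datetime) -> bool:
    """Return True if ts falls inside the weekly reboot window."""
    if ts.weekday() != WINDOW_WEEKDAY:
        return False
    start = ts.replace(hour=WINDOW_START_HOUR, minute=0, second=0, microsecond=0)
    return start <= ts < start + WINDOW_LENGTH


def should_notify(alert_name: str, ts: datetime) -> bool:
    """Blanket suppression: every alert inside the window is dropped,
    regardless of what the alert is about."""
    return not in_reboot_window(ts)


# Made-up alerts for illustration only (2019-04-07 was a Sunday).
alerts = [
    ("host_down", datetime(2019, 4, 7, 3, 5)),      # expected: the reboot itself
    ("disk_failure", datetime(2019, 4, 7, 3, 10)),  # unrelated, but also silenced
    ("disk_failure", datetime(2019, 4, 7, 4, 0)),   # outside the window, delivered
]
delivered = [(name, ts) for name, ts in alerts if should_notify(name, ts)]
```

In this sketch the disk failure reported at 03:10 never reaches anyone, because the rule only looks at the time. Narrowing the rule to the one alert the reboot is expected to cause would reduce the masking, at the cost of yet more configuration to maintain.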

That being said, reboots are sometimes beneficial as a last resort on resources that you don't necessarily have full control over (old vendor-written software, "black box" devices where a reboot is explicitly prescribed by the vendor, etc.). But this should be handled on a case-by-case basis, and not with a naive blanket policy.
