The IT department where I work is trying to move to 100% virtualized servers, with all the data stored on a SAN. They haven't done it yet, but the plan eventually calls for moving the existing physical SQL Server machines to virtual servers as well.
A few months ago I attended the Heroes Happen Here launch event, and in one of the SQL Server sessions the speaker mentioned in passing that this is not a good idea for production systems.
So I'm looking for a few things:
- What are the specific reasons why this is or is not a good idea? I need references, or don't bother responding. I could come up with a vague "I/O bound" response on my own via google.
- The HHH speaker recollection alone probably won't convince our IT department to change their minds. Can anyone point me directly to something more authoritative? And by "directly", I mean something more specific than just a vague Books OnLine comment. Please narrow it down a little.
I can say this from personal experience because I am dealing with this very problem as we speak. The place I am currently working as a contractor has this type of environment for their SQL Server development systems. I am trying to develop a fairly modest B.I. system on this environment and really struggling with the performance issues.
TLB misses and emulated I/O are very slow on a naive virtual machine. If your O/S has paravirtualisation support (which is still not a mature technology on Windows) you use paravirtualised I/O (essentially a device driver that hooks into an API in the VM). Recent versions of the Opteron have support for nested page tables, which removes the need to emulate the MMU in software (which is really slow).
Thus, applications that run over large data sets and do lots of I/O like (say) ETL processes trip over the achilles heel of virtualisation. If you have anything like a data warehouse system that might be hard on memory or Disk I/O you should consider something else. For a simple transactional application they are probably O.K.
Put in perspective the systems I am using are running on blades (an IBM server) on a SAN with 4x 2gbit F/C links. This is a mid-range SAN. The VM has 4GB of RAM IIRC and now two virtual CPUs. At its best (when the SAN is quiet) this is still only half of the speed of my XW9300, which has 5 SCSI disks (system, tempdb, logs, data, data) on 1 U320 bus and 4GB of RAM.
Your mileage may vary, but I'd recommend going with workstation systems like the one I described for developing anything I/O heavy in preference to virtual servers on a SAN. Unless your resource usage requirements are beyond this sort of kit (in which case they are well beyond a virtual server anyway) this is a much better solution. The hardware is not that expensive - certainly much cheaper than a SAN, blade chassis and VMWare licencing. SQL Server developer edition comes with V.S. Pro and above.
This also has the benefit that your development team is forced to deal with deployment right from the word go - you have to come up with an architecture that's easy to 'one-click' deploy. This is not as hard as it sounds. Redgate SQL Compare Pro is your friend here. Your developers also get a basic working knowledge of database administration.
A quick trip onto HP's website got me a list price of around $4,600 for an XW8600 (their current xeon-based model) with a quad-core xeon chip, 4GB of RAM and 1x146 and 4x73GB 15k SAS hard disks. Street price will probably be somewhat less. Compare this to the price for a SAN, blade chassis and VMware licensing and the cost of backup for that setup. For backup you can provide a network share with backup where people can drop compressed DB backup files as necessary.
EDIT: This whitepaper on AMD's web-site discusses some benchmarks on a VM. From the benchmarks in the back, heavy I/O and MMU workload really clobber VM performance. Their benchmark (to be taken with a grain of salt as it is a vendor supplied statistic) suggests a 3.5x speed penalty on an OLTP benchmark. While this is vendor supplied one should bear in mind:
It benchmarks naive virtualisation and compares it to a para-virtualised solution, not bare-metal performance.
An OLTP benchmark will have a more random-access I/O workload, and will spend more time waiting for disk seeks. A more sequential disk access pattern (characteristic of data warehouse queries) will have a higher penalty, and a memory-heavy operation (SSAS, for example, is a biblical memory hog) that has a large number of TLB misses will also incur additional penalties. This means that the slow-downs on this type of processing would probably be more pronounced than the OLTP benchmark penalty cited in the whitepaper.
What we have seen here is that TLB misses and I/O are very expensive on a VM. A good architecture with paravirtualised drivers and hardware support in the MMU will mitigate some or all of this. However, I believe that Windows Server 2003 does not support paravirtualisation at all, and I'm not sure what level of support is delivered in Windows 2008 server. It has certainly been my experience that a VM will radically slow down a server when working on an ETL process and SSAS cube builds compared to relatively modest spec bare-metal hardware.
I wanted to add this series of articles by Brent Ozar:
It's not exactly authoritative in the sense I was hoping for (coming from the team that builds the server, or an official manual of some kind), but Brent Ozar is pretty well respected and I think he does a great job covering all the issues here.
You are looking at this from the wrong angle. First, you are not going to find White Papers from Vendors why you should "not" virtualize or why you should Virtualize.
Every environment is different and you need to do what works in your Environment. With that said, there are some servers that are perfect for virtualization and there are some that should not be virtualized. For example, if your SQL Server/s are doing millions and millions of transactions per second, like if your server was located at the NYSE or the NASDAQ and millions and millions of dollars depend on it, you probably should not virtualize it. Make sure you understand the ramifications of virtualizing an SQL server.
I've seen where people virtualize SQL over and over just because Virtualization is cool. Then complain later on when the VM server does not perform as expected.
What you need to do is set a benchmark, fully test the solution you want to deploy and show what it can and can't do so you don't run into any surprises. Virtualization is great, it is good for the envronment and saves through consolidation, but you need to show why your supervisors why you should not virtualize your SQL Servers and only you can do this.
Here's some VMWARE testing on it.. http://www.vmware.com/files/pdf/SQLServerWorkloads.pdf
Granted, they do not compare it to physical machines. But, you could probably do similar testing with the tools they used for your environment.
We currently run SQL Server 2005 in a VMWARE environment. BUT, it is a very lightly loaded database and it is great. Runs with no problems.
As most have pointed out, it will depend on your database load.
Maybe you can convince the IT Department to do some good testing before blindly implementing.
Security issues that can be introduced when dealing with Vitalization should also be considered. Virtualization Security is a good article by PandaLabs that covers some of the concerns.
No, I can't point to any specific tests or anything like that, but I can say from experience that putting your production database server on a virtual machine is a bad idea, especially if it has a large load.
It's fine for development. Possibly even testing (on the theory that if it runs fine under load on virtual box, it's going to run fine on prodcution) but not in production.
It's common sense really. Do you want your hardware running two operating systems and your sql server or one operating system and sql server?
Edit: My experience biased my response. I have worked with large databases under heavy constant load. If you have a smaller database under light load, virtualization may work fine for you.