Doubts Even Here: the point of disbelief
Posted by Ceri Davies Sat, 30 May 2009 21:29:00 GMT
It’s slightly late in the day, but as I’m currently picking apart what parts of VMware’s VI/vSphere stack are actually useful in *my* Real World, I’m going to respond to Chuck Hollis’ blog post Why Oracle Doesn’t Like VMware, even though it was posted nearly a month ago.
To state my position clearly, I agree with Chuck on 80% of his points, particularly with the general sentiment that it would be trivial for Oracle to support VMware as a platform but for the fact that they may lose business if they do. However, his point #4, “VMware Functionality Competes With Oracle DBMS Features”, is completely disingenuous.
There is *no* VMware functionality that competes with Oracle DBMS features, although VMware marketing would like you to believe that there is. Let’s break Chuck’s examples down.
Chuck says about RAC:
On one hand, we've got a multi-server configuration running
Oracle's latest (and most expensive) RAC product. It's doing load
balancing, high availability, and making the hardware function as
a giant pool.
Let’s think about this RAC setup. It will be serving the same database via multiple instances. The clients know about each instance and will choose another when the first fails.
and about VMware:
On the other hand, we've got the same multi-server configuration
running the much cheaper Oracle SE on VMware.
It too is load balancing, offers high availability, and makes the
hardware function as a single giant pool. Many of the management
tasks are handled quite well outside of Oracle's domain.
Now I’m thinking that Chuck doesn’t know, or more likely has conveniently forgotten, how RAC works, and that he also seems to have made similar mistakes regarding VMware’s features. Either that, or he’s completely believing VMware’s hype, much as they’d both like us all to do.
VMware is not like RAC
A RAC configuration consists of multiple instances serving the same database. This isn’t even conceptually similar to multiple VMs running multiple Oracle SE instances with, necessarily, multiple databases.
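To make the difference concrete, here is a minimal toy sketch in Python. It is not Oracle or VMware code; the `Instance` and `Client` classes and the `order:1` key are hypothetical names invented for illustration, and a plain dict stands in for a database. It only models the point above: a client that fails over between instances of one shared database still sees its data, while a client that fails over between separate databases does not.

```python
class InstanceDown(Exception):
    """Raised when a client talks to an instance that is not running."""


class Instance:
    """A toy database instance: a process in front of some storage."""

    def __init__(self, name, storage):
        self.name = name
        self.storage = storage  # a dict standing in for the database
        self.up = True

    def put(self, key, value):
        if not self.up:
            raise InstanceDown(self.name)
        self.storage[key] = value

    def get(self, key):
        if not self.up:
            raise InstanceDown(self.name)
        return self.storage.get(key)


class Client:
    """Knows about every instance and tries the next one on failure."""

    def __init__(self, instances):
        self.instances = instances

    def _call(self, method, *args):
        for inst in self.instances:
            try:
                return getattr(inst, method)(*args)
            except InstanceDown:
                continue  # fail over to the next known instance
        raise InstanceDown("no instance available")

    def put(self, key, value):
        self._call("put", key, value)

    def get(self, key):
        return self._call("get", key)


# RAC-like: two instances in front of ONE shared database.
shared = {}
rac = Client([Instance("rac1", shared), Instance("rac2", shared)])
rac.put("order:1", "paid")
rac.instances[0].up = False          # first instance fails
rac_result = rac.get("order:1")      # second instance sees the same data

# Separate VMs: two instances, each with its OWN database.
vms = Client([Instance("vm1", {}), Instance("vm2", {})])
vms.put("order:1", "paid")
vms.instances[0].up = False          # first VM fails
vm_result = vms.get("order:1")       # second VM has never seen the row

print(rac_result, vm_result)
```

The failover logic is identical in both cases; the only thing that changes the outcome is whether the instances share a database.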
“load-balancing”
I’ve only been administering tens of Oracle DBMS databases for 4 years, but I have no idea how one would load balance read/write clients across multiple database+instances in anything approaching a productive way. I’m going to go as far as to say that, at least generally, you cannot.
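A rough illustration of why, as a toy Python sketch rather than anything resembling a real load balancer: `run_workload` is a hypothetical helper, dicts stand in for databases, and the round-robin policy is an assumption chosen for simplicity. The same read/write workload is spread over two targets, first with both targets being the same database, then with two independent ones.

```python
from itertools import cycle


def run_workload(databases, operations):
    """Send each operation to the next database in turn; return read results."""
    targets = cycle(databases)
    results = []
    for op, key, value in operations:
        db = next(targets)
        if op == "write":
            db[key] = value
        else:
            results.append(db.get(key))
    return results


workload = [
    ("write", "balance", 100),
    ("read", "balance", None),
    ("write", "balance", 50),
    ("read", "balance", None),
]

# Two "instances" that are really the same database.
one_db = {}
shared_reads = run_workload([one_db, one_db], workload)

# Two independent databases behind the same balancer.
split_reads = run_workload([{}, {}], workload)

print(shared_reads)
print(split_reads)
```

With a shared database every read sees the preceding write. With independent databases the reads land on a database that never received the writes, so the balancer has spread the load but broken the application.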
“high availability”
I’ve already mentioned what I think about VMware HA. It doesn’t offer good protection against network failures and it doesn’t offer any protection against FC storage failures. In fact, you can’t even mirror FC storage with VMware unless you get your SAN to do it[1].
Additionally, even when a failure is detected, the only fix is to restart the VM and thereby the Oracle instance, implying the loss of in-flight transactions.
“hardware functioning as a single giant pool”
While both RAC and VMware can be argued to make the hardware they run on function as a single pool, these two pools have entirely different purposes. The “RAC pool” will take a database query and do the same thing regardless of which node handles it, while the “VMware pool” will not (unless, of course, you happen to be running RAC in it).
There’s more: Fault Tolerance.
Chuck then goes on to say, regarding the VMware setup:
And VMware brings a few very cool features to the table
that Oracle doesn't, like real fault tolerance[...]
That’s a little shocking.
While RAC can be used to provide high availability, there are RAC customers (probably a large proportion of them) who use it in order to scale past a single server. VMware Fault Tolerance doesn’t even allow you to scale past a single virtual CPU.
VMware Fault Tolerance has other issues, such as a limited list of supported CPUs, the requirement to reboot a VM on most of those CPUs in order to enable it (and since you have to turn FT off in order to patch the ESX cluster it’s running on, that’s a big deal), lack of support for thin-provisioned VMs, an inability to support physical RDM, the requirement for a dedicated gigabit NIC and some other more minor ones. However, I’m also worried that it might use the same algorithm to determine failure of the Primary VM as VMware HA does - the documentation certainly mentions heartbeats, file locking on shared storage to prevent mistaken failover and that “failover occurs if the host running the Primary VM fails” which is the same terminology as the VMware HA documentation uses. I wonder if the loss of FC connectivity or the VM network will cause a failover here?
Don’t believe the hype
So, much as I agree with Chuck’s main point, I’m annoyed at the over-hyping of VMware’s availability features because I don’t believe that they’re as good as Chuck would like me to believe. At the end of the day, as an Oracle and VMware customer I’d love to see Oracle’s database supported in VMware, but I’m well aware of the limitations of both, and I need this kind of misleading information being disseminated like I need a hole in the head.
[1] Note that this means that if you use Raw Device Mapping (RDM) to try to mirror in your OS instead, you’re just as at risk as if you hadn’t bothered, because the mapping is stored in the non-fault-tolerant .vmx file.
Ceri - Thanks for the de-hyping. It's a good read. ;) --Mike