I’ve been with Varrow for over 5 years now, and it seems like just last week we were rolling out the initial run of EMC VNX arrays. We’ve had outstanding success with them…everywhere. From small environments to large, they’ve been solid, dependable, and fast while being much, much simpler and easier to manage than their predecessors. Now it’s time to move to the next generation.
As you would expect from a new generation, almost everything is faster. Faster CPUs with more cores. More RAM. Better I/O. It’s all here. The benefit is obvious: things get done quicker, but we also get a larger “CPU and memory budget” that allows for more advanced features and offerings. The new arrays scale up to over 1 million IOPS and up to 3PB of capacity, and can do 200,000 IOPS in a 3U package.
We’ll go through most of the major enhancements and changes throughout this post but you’ll find that many of them are internal. They aren’t big items on a feature checklist but they make significant changes in how the array performs and how we can architect simpler storage systems.
The New Models
Here is a table that outlines the new model numbering scheme and sizing. One thing that is missing is the new VNX5200. EMC was initially not going to offer another VNX with a max capacity of 125 drives, but that has changed. The specs are the same as the VNX5400 but, again, limited to 125 drives. If you’re interested in a VNX5200, I suggest you talk to your favorite EMC partner (Varrow!) about one soon, as EMC is offering great pricing on the VNX5400s until the VNX5200 starts to ship.
Another note on the VNX5200: the VNX5100 was block only, and that option is gone. Everything is unified now. In my opinion that’s a good move. It simplifies the lineup, code releases, and installation, and it keeps people from buying an array for one purpose and then getting stuck when they need more capabilities. You may also notice a larger array on the high end, the new VNX8000. At launch it will do up to 1,000 drives (hence the asterisk) but will be capable of going to 1,500. Why? Why not a VMAX at that point?
That’s a common discussion point for us. But to be blunt…you buy a VMAX for resiliency. If you lose a component, the VMAX continues on without much of a hiccup. But many people want high performance without the cost of that resiliency. You can build a VERY fast VNX…as fast or faster than many VMAX arrays out there, so there is a need for a larger offering, and that’s the VNX8000. Some people may not agree, but one thing is certain about EMC…you’re rarely short on options.
The hardware gets simpler on most models. On everything but the VNX8000, DPEs (Disk Processor Enclosures) are used instead of SPEs (Storage Processor Enclosures). DPEs shrink the physical footprint because the storage processors live in the first tray of disks rather than a separate enclosure. SPSs (Standby Power Supplies) are now integrated as well.
Faster Hardware Is Nice, But What Else?
The biggest change is the underlying operating system/environment (known as the OE) on the new VNX arrays. If you’ve used a CLARiiON or VNX you are probably familiar with FLARE…well, maybe less so on the VNX, but it was still in there. FLARE has been the OS on EMC mid-range arrays for a long time. It was great in its day, but it was quickly showing its age. It wasn’t built for these newfangled multi-core CPUs and it just wasn’t scaling. It was time to fix that.
That graphic is an example of how FLARE would handle 16 cores…not well. Often you’d see Core 0 pegged while the other cores were far from fully utilized. FLARE is gone, and now we have MCx. MCx stands for MultiCore Everything, and it’s very different from FLARE.
That diagram gives you an idea of how the overall architecture was redesigned. Processes and tasks are now broken out…unlike the more monolithic FLARE. This again allows for better scaling. Here is another example from an actual array showing MCx.
What Does That Get Me?
Let’s talk about how these new arrays are different from the last generation. Sure, the code running on them is new, but what does that mean for you and what you can do? In no real particular order….
Active/Active Storage Processors
Woah. Yeah, that’s a big jump. Previous EMC mid-range arrays always had Active/Passive Storage Processors (SPs). The VNX was capable of ALUA functionality, which basically meant you could access a LUN via the owning SP (the optimized path) as well as the non-owning SP (the non-optimized path). There was some performance hit (not a lot…but some) when using the non-optimized path. Many long-time EMC admins would carefully balance LUNs across SPs and re-balance when needed.
We now have true Active/Active storage processors…well…mostly. The reason I say mostly is that Classic LUNs (those not carved from a pool) are handled as Active/Active, while LUNs from a pool are still Active/Passive, for now. That will change in a later software release. And as an FYI, there are hidden (private) LUNs within pools, and those are Active/Active. If you’re like most people with VNX arrays you’re probably using pools and pool LUNs, so you won’t see this benefit right away…but soon.
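The ALUA behavior described above boils down to a simple path-preference rule on the host side. Here’s a toy Python sketch of that rule (the path names and state strings are hypothetical, not from any EMC host kit):

```python
# Illustrative sketch (not EMC code) of ALUA-aware path selection:
# prefer the optimized path via the owning SP; only fall back to the
# non-optimized path via the peer SP when no optimized path is up.

def pick_path(paths):
    """Return the name of the best available path, or None."""
    optimized = [p for p in paths if p["state"] == "active/optimized" and p["up"]]
    if optimized:
        return optimized[0]["name"]
    fallback = [p for p in paths if p["state"] == "active/non-optimized" and p["up"]]
    return fallback[0]["name"] if fallback else None

paths = [
    {"name": "spa-0", "state": "active/optimized", "up": False},      # owning SP path down
    {"name": "spb-0", "state": "active/non-optimized", "up": True},   # peer SP path
]
print(pick_path(paths))  # falls back to spb-0, the non-optimized path
```

With true Active/Active, both paths would report optimized and this preference logic (and the trespass dance behind it) stops mattering.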
Think you’ll miss trespassed LUNs? I won’t.
New Caching Methods
EMC calls this Multi-Core Cache (MCC)…going along with the MCx theme. But it just means they’ve revamped how the arrays handle cache on the front end. The first major change is that the front-end RAM cache is now dynamic. In previous arrays we would set a percentage of available memory as Read Cache and Write Cache, and depending on the I/O profile of your workload you might need to adjust those. Not anymore.
Now we have Adaptive Cache. Basically, the array monitors I/Os going in and out and adjusts the thresholds in real time as needed. A spike in incoming writes will cause it to shrink the read cache and dedicate more to write, and vice versa. Another new feature is Write Throttling, where the array holds back acknowledgements to the host to manage the write cache and align it with the performance capabilities of the underlying disks.
The idea is to minimize forced flushes of front-end cache, which disrupt I/O and can cause further problems. You want smooth incoming I/O…I/O that comes in just as fast as the back-end disks can handle it, and Write Throttling provides that.
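As a rough mental model of Write Throttling, you can think of the array computing a small acknowledgement delay whenever dirty write-cache pages outrun what the back end can drain. This Python toy is my own sketch of that idea; the thresholds and formula are invented, not EMC’s actual algorithm:

```python
# Toy model (my own sketch, not EMC's) of write throttling: delay host
# acknowledgements when the write cache fills faster than the back-end
# disks can flush, so the cache never hits a forced-flush condition.

def ack_delay_ms(dirty_pages, cache_pages, backend_flush_rate, incoming_rate):
    """Return an artificial ack delay that grows as dirty pages pile up
    faster than the back end can drain them. All numbers illustrative."""
    fill = dirty_pages / cache_pages
    if fill < 0.8 or incoming_rate <= backend_flush_rate:
        return 0.0                             # plenty of headroom: ack immediately
    # Slow hosts down in proportion to how far writes outrun the disks.
    overload = incoming_rate / backend_flush_rate
    return round((fill - 0.8) * 10 * overload, 2)

print(ack_delay_ms(700, 1000, 5000, 4000))    # under 80% dirty: 0.0 ms delay
print(ack_delay_ms(950, 1000, 5000, 20000))   # nearly full at 4x overload: throttled
```

The point of the sketch: hosts see slightly slower acks instead of the array hitting a hard forced-flush cliff later.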
Better FAST Cache
One of the best features on a VNX, if not the best, is FAST Cache. With FAST Cache you can use SSDs as front-end read AND write cache. It’s not uncommon for us to pull performance stats off of customer arrays and see the majority, often the vast majority, of I/Os serviced from those SSDs. It greatly increases performance and reduces the load on the back-end spindles. This lets us build smaller arrays that are cheaper…yet faster. Since the introduction of FAST Cache I don’t think we’ve sold a VNX without at least two SSDs for cache.
Multi-Core FAST Cache (MCF) is the next iteration of that. The first big change is how the cache is warmed. With FLARE, a 64KB block had to be accessed 3 times before it was put into that SSD cache. There were very good reasons for that…but SSDs are now larger and capable of holding more of the working dataset. Now it takes just 1 access. The first time you touch data it goes into those SSDs…until they are 80% full. After that, the array goes back to caching blocks that have been read 3 times.
How much FAST Cache space each LUN gets is also different with MCx. All LUNs equally share 50% of the FAST Cache pages (capacity). The other 50% is available to any LUN on an as-needed basis. The idea is to keep a couple of busy LUNs from using all of the cache and starving the less-busy LUNs.
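The warming policy and the 80% threshold can be sketched as a toy model. This is my own illustration of the rules described above, not EMC code; the block addresses and counters are simplified:

```python
# Sketch of the MCx FAST Cache warming policy (my own toy model, not
# EMC's implementation): promote a block on its first access while the
# SSD cache is under 80% full; above that, revert to the classic FLARE
# rule of three accesses before promotion.

class FastCache:
    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.cached = set()        # block addresses currently on SSD
        self.hits = {}             # access counts for uncached blocks

    def access(self, block):
        if block in self.cached:
            return "ssd hit"
        self.hits[block] = self.hits.get(block, 0) + 1
        fill = len(self.cached) / self.capacity
        threshold = 1 if fill < 0.8 else 3     # the 80% rule
        if self.hits[block] >= threshold and len(self.cached) < self.capacity:
            self.cached.add(block)
            return "promoted"
        return "disk"

fc = FastCache(capacity_blocks=10)
print(fc.access("A"))       # promoted on the very first touch (cache near empty)
for b in "BCDEFGH":
    fc.access(b)            # fill the cache to the 80% mark
print(fc.access("Z"))       # disk: a new block now needs 3 touches
```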
The diagram shows the FAST Cache capacity limits per model. Also note that there are two types of SSDs in VNX arrays: SLC, used for FAST Cache, and eMLC, normally used for regular data. This is due to the longer life and heavier write activity expected when SSDs are used for FAST Cache. You can use SLC drives for regular data as well, but they are slightly more expensive. Given that you can’t use an SLC drive as a hot spare for eMLC, or eMLC for SLC, the cost difference may be very minor once you factor in needing only one type of hot spare drive. Do your math.
Multi-Core RAID (MCR)
Yes, even good ol’ RAID got some enhancements here. Well…I guess you’d say it’s RAID. The first is Permanent Sparing. Traditionally, when a drive in a RAID set failed, the array would grab a designated hot spare and use it to rebuild the RAID set. When you replaced the failed drive, the array would copy the data from the hot spare to the new drive and then make the hot spare a hot spare once again. That’s no longer the case. Now the array just keeps using the hot spare drive. Big deal? I don’t think so. Just be aware.
How hot spares are specified has also changed…and by changed I mean gone away. You no longer designate drives as hot spares; any unbound drive can act as one. The array is smart in how it chooses which drive to use (capacity, rotation speed, bus, etc.) so that it doesn’t pick an odd drive on a different bus unless it has to. MCx also has a timeout for RAID rebuilds. If a drive goes offline, fails, or gets pulled out for some reason, the array now waits 5 minutes before activating a spare and rebuilding the set. It does this to make sure you didn’t do something accidentally and that you’re not just moving drives around.
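Here’s a sketch of what that drive-selection logic might look like. The drive attributes and their priority order are my assumptions based on the description above, not EMC’s actual code:

```python
# Hypothetical sketch of "pick the best unbound drive as a spare":
# the spare must match the failed drive's type and be at least as large;
# among candidates, prefer an exact capacity match, then the same
# rotation speed, then the same back-end bus.

def pick_spare(failed, unbound):
    candidates = [d for d in unbound
                  if d["type"] == failed["type"]
                  and d["capacity"] >= failed["capacity"]]
    if not candidates:
        return None
    def score(d):
        return (
            d["capacity"] == failed["capacity"],  # exact size beats larger
            d["rpm"] == failed["rpm"],            # same rotation speed
            d["bus"] == failed["bus"],            # same back-end bus
        )
    return max(candidates, key=score)["slot"]

failed = {"type": "SAS", "capacity": 600, "rpm": 15000, "bus": 0}
unbound = [
    {"slot": "1_0_14", "type": "NL-SAS", "capacity": 2000, "rpm": 7200,  "bus": 1},
    {"slot": "0_1_10", "type": "SAS",    "capacity": 900,  "rpm": 10000, "bus": 1},
    {"slot": "0_0_24", "type": "SAS",    "capacity": 600,  "rpm": 15000, "bus": 0},
]
print(pick_spare(failed, unbound))  # the exact match on the same bus wins
```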
Wait. What? Moving drives around? Yes. MCx supports Drive Mobility.
You can now pull a drive from one slot, put it in another, and the array will detect it and bring it back online without triggering a rebuild…as long as you do it within 5 minutes. You can also shut down the array and re-cable the back-end buses if you want, and it will still know which drives belong where. Let’s be clear here: don’t just do this without planning. You’re still moving drives and changing things, so do it for a purpose. Also, you can’t move drives, or whole RAID groups, between arrays…even between MCx arrays. It only works within the same array. Use caution.
MCx also does parallel rebuilds on RAID 6 if you lose two drives. FLARE would rebuild the set for one drive…then rebuild it again for the second. MCx is more intelligent: if you fail two drives, it rebuilds both at once.
More Efficient FAST VP
With FLARE you had auto-tiering storage pools, called FAST VP (Virtual Provisioning). Data was broken up into “slices” that were moved up or down the performance tiers. Each slice was 1GB in size, which is large, especially when you may only have a few hundred GB of Tier 0 (SSD). With MCx the slice size is now 256MB, making tiering much more granular and efficient.
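The practical effect of the smaller slice is easy to quantify. With a hypothetical 200GB SSD tier (just an example figure), here’s how many independently movable slices each slice size gives you:

```python
# Quick arithmetic on the FAST VP slice-size change: for the same SSD
# tier, 256MB slices track hot data 4x more precisely than 1GB slices.

GB = 1024  # MB per GB

def slices(tier_gb, slice_mb):
    """Number of slices that fit in a tier of tier_gb gigabytes."""
    return (tier_gb * GB) // slice_mb

tier_gb = 200                    # a modest, hypothetical Tier 0 of SSD
print(slices(tier_gb, 1024))     # FLARE's 1GB slices:  200 slices
print(slices(tier_gb, 256))      # MCx's 256MB slices:  800 slices
```

Four times as many slices means far less cold data dragged up to Tier 0 alongside each hot spot.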
Block Deduplication
It’s nice to finally get this feature, and I think it’s something that’s been lacking, even though we’ve had file-based deduplication for a while. It’s available on all new VNX models, and no additional license is required. It is not an inline process; it’s a post-process that runs twice per day, up to 4 hours at a time. You can enable deduplication on a per-LUN basis for pool LUNs (not Classic LUNs). Deduplication happens among the LUNs in the same pool that have it enabled. Meaning, if you have two LUNs with a lot of similar data and only enable deduplication on one, you will get no benefit. It has to be enabled on both.
Much of the benefit of deduplication is obvious, but you gain in other ways as well. Features like FAST Cache are deduplication-aware, so they can better utilize SSD capacity by keeping only one copy of redundant data in cache at a time. The block size for deduplication is 8KB.
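A toy illustration of that per-LUN rule (my own sketch, not EMC’s implementation): 8KB blocks only deduplicate against other LUNs in the pool that have the feature enabled.

```python
# Toy model of pool-level block dedup: identical 8KB blocks collapse to
# one copy, but only across LUNs with dedup enabled. A non-enabled LUN
# keeps every block, even if the same data exists elsewhere in the pool.

import hashlib

def blocks_stored(luns):
    """luns: {name: {"enabled": bool, "blocks": [bytes, ...]}}.
    Returns how many blocks remain after the post-process dedup pass."""
    seen = set()
    stored = 0
    for lun in luns.values():
        for block in lun["blocks"]:
            if lun["enabled"]:
                digest = hashlib.sha256(block).digest()
                if digest in seen:
                    continue            # duplicate within the enabled set
                seen.add(digest)
            stored += 1                 # non-enabled LUNs keep everything
    return stored

data = [b"A" * 8192, b"B" * 8192]       # the same two 8KB blocks on each LUN
both_on = {"lun1": {"enabled": True,  "blocks": data},
           "lun2": {"enabled": True,  "blocks": data}}
one_on  = {"lun1": {"enabled": True,  "blocks": data},
           "lun2": {"enabled": False, "blocks": data}}
print(blocks_stored(both_on))  # 2: duplicates collapse across both LUNs
print(blocks_stored(one_on))   # 4: the non-enabled LUN sees no benefit
```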
Licensing & Software
For the most part things are the same, though you now get more for free. Included with each new generation VNX are:
- VNX Monitoring and Reporting (Wow!)
- Unisphere Remote for monitoring multiple arrays
- Unisphere Quality of Service
- Unisphere Analyzer
To me, and you, this is great. You get everything you need to monitor and see all performance information for your storage system.
New Drives and Form Factors
Earlier I mentioned we now have eMLC drives for FAST VP and other general-use data volumes. We also now have the option of a 2.5″ 300GB 15K SAS drive and a 2.5″ 1TB 7.2K NL-SAS drive, which will further increase density. You can put 500 of those drives in a rack! We’re going to keep seeing the move to all 2.5″ drives as well as the squeezing out of spinning SAS. We’re quickly headed toward an SSD and NL-SAS world. As before, the disk enclosures are connected via 24Gb SAS buses.
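A quick back-of-the-envelope check on that 500-drive figure, assuming 25-slot 2U 2.5″ DAEs in a standard 42U rack (the enclosure dimensions are my assumption, and a real config also burns rack units on SPs and power):

```python
# Rough density math: how many 2.5" drive slots fit in one rack if every
# rack unit went to 25-slot 2U enclosures. Real racks lose some units to
# storage processors, SPS, and cabling, which is why ~500 is the claim.

rack_u, dae_u, slots_per_dae = 42, 2, 25
enclosures = rack_u // dae_u
print(enclosures * slots_per_dae)  # 525 raw slots; ~500 after overhead
```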
What About File?
For the most part, nothing changes here. Performance is up, but that’s mainly due to the improvements in the underlying array. The X-Blades/Data Movers/NAS heads are the same as before. That isn’t a bad thing; they just didn’t need to be revised. The heavy lifting of features such as FAST VP, FAST Cache, and other CPU-heavy processes is handled by the upgraded storage processors. That’s a big benefit of EMC’s architecture: the SPs handle the underlying block storage functions and the X-Blades handle the NAS functions, so the X-Blades benefit from the better performance and scaling of the SPs.
When I first saw the next-generation models on the roadmap, I expected them to just be VNXs with more CPU, larger drives, and bigger configuration maximums, but EMC went well beyond that. While not in your face, the change from FLARE to MCx is huge. The amount of CPU power required for things like FAST Cache, compression, and deduplication is often very underestimated. FLARE just couldn’t take advantage of the latest CPUs to deliver the performance these new features need. MCx will allow the EMC mid-range line to scale for a good while.
What’s missing? Not a lot, really. I’d like to have seen Data At Rest Encryption (DARE) at launch, but it’s not here. Word is we’ll see it in a later software release, but don’t hold me to that. The addition of block deduplication is very nice, but right now you can’t schedule when that process runs; it always runs at low priority. I’d like to schedule it the way we do FAST VP tiering, and if I had my wish it would be inline, but I’m sure we’ll get there as CPUs continue to get faster. There is no free lunch; everything has a cost.
In my opinion, the biggest challenge to the VNX in many customer environments right now is the crop of storage startups such as Tintri and Nimble, hyper-converged solutions like Nutanix, and even VMware’s new VSAN (currently in beta). If you’re in a pure virtual environment, or even a siloed deployment such as VDI, it can be hard to ignore these options. They are simple, fast (enough), and well integrated, but they don’t offer the breadth or depth of the VNX outside the virtual space, or its many types of external connectivity. What you choose depends on what you need, but I’d like to see EMC do more to compete in this area. With the intellectual property EMC owns and uses in Data Domain, Isilon, RecoverPoint, and ScaleIO, they could create some very compelling offerings if they wanted to dedicate the R&D and money to it. Will they? I don’t know, and my gut says no. They seem to feel that hybrid array solutions are a temporary measure until flash gets cheap enough to make all-flash arrays the standard. We’ll see. But in the end, the VNX is still the king of mid-range storage.