Archive for March, 2012
So one of the Nexenta systems that I’ve been working on had its memory quadrupled, and ever since then it has been having some issues (as detailed in the previous post, which was actually supposed to go live a few weeks ago). Lots of time spent on Skype with Nexenta support has led us in a few directions. Yesterday, we made a breakthrough.
We have been able to successfully correlate VMware activity with the general wackiness of our Nexenta system: it occurs at the end of a snapshot removal, or at the end of a Storage vMotion. Yesterday, we stumbled across something that we hadn’t noticed before. After running the Storage vMotion, the Nexenta freed up the same amount of RAM from the ARC cache as the size of the VMDK that had just been moved. This told us some very interesting things.
1 – There is no memory pressure at all. The entire VMDK got loaded into the ARC cache as it was being read out. And it wasn’t replaced.
2 – Even after tuning the arc_shrink_shift variable, we were still freeing up GOBS of memory. 50GB in this case.
3 – When we free up that much RAM, Nexenta performs some sort of cleanup, and gets _very_ busy.
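To put rough numbers on why the freeing is so dramatic on a big-memory box: in the ZFS code of this era, a single ARC shrink pass reduces the target by roughly arc_c right-shifted by arc_shrink_shift. A quick back-of-the-envelope sketch (the 160GB ARC target here is an assumed figure for illustration, not a measurement from our system):

```shell
# Back-of-the-envelope: one ARC shrink pass frees roughly arc_c >> arc_shrink_shift.
# The 160GB target below is an assumed figure for illustration, not a measurement.
arc_c=$((160 * 1024 * 1024 * 1024))   # assumed ARC target size, in bytes
arc_shrink_shift=5                    # the historical default shift
echo "$(( (arc_c >> arc_shrink_shift) / 1024 / 1024 )) MB freed per shrink pass"
```

So even at the default shift, one pass on a large ARC lets go of multiple gigabytes at once, and repeated passes add up fast.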
After reviewing the facts of the case, we started running some dtrace scripts that I’ve come across. arcstat.pl (from Mike Harsch) showed that as the data was being deleted from disk, ARC usage was plummeting, and as soon as it settled down, the ARC target size was reduced by the same amount. When that target size was reduced, bad things happened.
At the same time, I ran mpstat to show what was going on with the CPU. While this was going on, we consistently saw millions of cross-calls from one processor core to another, and 100% system time. The system was literally falling over trying to free up RAM.
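For anyone who wants to watch for the same symptom, these are the sorts of one-liners we were running (column layout varies a bit by release; treat this as a sketch, not a recipe):

```shell
# Per-CPU stats at 1-second intervals; watch the xcal (cross-calls) and
# sys (% system time) columns spike together during the event.
mpstat 1

# Aggregate the kernel stacks generating cross-calls, via the DTrace
# sysinfo provider, sampling for 10 seconds then exiting.
dtrace -n 'sysinfo:::xcalls { @[stack()] = count(); } tick-10s { exit(0); }'
```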
Currently the solution that we have put into place is setting arc_c_min to arc_max -1GB. This has so far prevented arc_c (target size) from shrinking aggressively and causing severe outages.
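For reference, this is roughly how a change like that gets applied (the 0x2FC0000000 value is 191GB, which assumes arc_c_max is near 192GB on a box like ours; verify the symbol names against your own kernel before writing anything with mdb -kw):

```shell
# Live change: pin arc_c_min just below arc_c_max in the running kernel.
# 0x2FC0000000 = 191GB -- an assumed value for a 192GB system; adjust to taste.
echo "arc_c_min/Z 0x2FC0000000" | mdb -kw

# Persistent form: the matching /etc/system tunable, applied at next boot.
echo "set zfs:zfs_arc_min = 0x2FC0000000" >> /etc/system
```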
There still appears to be a bit of a hiccup when we do those Storage vMotions, but the settings that we are using now appear to at least be preventing the worst of the outages.
Good question. One would think that there’s no such thing as too much memory, but in some cases you’d be dead wrong (at least, without tuning). I’m battling that exact issue today. On a system that I’m working with, we upgraded the RAM from 48GB to 192GB. The ZFS Evil Tuning Guide says don’t worry, we auto-tune better than Chris Brown. I’m starting to not believe that. We’ve been intermittently seeing the system go dark (literally dropping portchannels to Cisco Nexus 5010 switches), then roaring back to life. Standard logging doesn’t appear to give much insight, but after digging through ZenOSS logs and multiple dtrace scripts, I think we’ve found a pattern.
It appears as though, by default, Nexenta will deallocate a certain percentage of your memory when it does memory cleanup related to the ARC cache. When you get to larger memory systems, the amount of memory it frees grows with them. I monitored an event where it freed up something to the tune of 8GB of RAM, which happened to coincide with a portchannel dropping.
Through all of this, support has been great. We’ve been tuning the amount of memory it frees up. We’ve tuned the minimum amount of RAM to free (in an effort to get it to free memory more often). We’ve allocated more memory to ARC metadata. Pretty much we’ve thrown the kitchen sink at it. The last tweak was made today, and I’m monitoring the system to see if the problems continue. Hopefully, once this is all done, I can post some tuneables for larger memory systems.
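In the meantime, here’s the shape of the knobs we’ve been turning. The values below are illustrative placeholders only, not recommendations, and the variable names come from the open-source arc.c of this era; confirm them against your own kernel before touching anything:

```shell
# /etc/system fragment -- persistent forms of two of the knobs (placeholder
# values, not recommendations):
#   set zfs:zfs_arc_min = 0x2FC0000000         # floor the ARC target (191GB here)
#   set zfs:zfs_arc_meta_limit = 0x1000000000  # more headroom for ARC metadata (64GB)

# arc_shrink_shift we set live via mdb; a larger shift means each reclaim
# pass frees a smaller slice of the ARC.
echo "arc_shrink_shift/W 0t11" | mdb -kw
```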