Thursday, January 19, 2012

ZFS SAN Build: Part 1 - The Hardware

Before putting a ZFS SAN into production for our new VM environment, I wanted to build a proof-of-concept server. If you have no idea what ZFS is, where have you been? Check out Sun's official documentation on it HERE, or head over to Serve the Home, Hard Forum, or Anandtech.

Why ZFS?
- Data security thanks to real data checksums, no RAID write-hole problem, and online scrubbing against silent corruption.
- Nearly unlimited data and system snapshots plus writable clones, with no initial space consumption thanks to copy-on-write.
- Ease of use: it just works and is stable, with no special RAID controller or complex setup.
- It scales to petabyte size with off-the-shelf hardware.
- It is FREE.

Let me repeat the two most important parts: it is FREE and it just WORKS. You can build and configure a 20TB+ system in a weekend with minimal headaches, and be enjoying all the great features of ZFS before you could even finish building a comparable RAID 5 array.
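To give you a taste of that simplicity, here is a rough sketch of day-to-day ZFS administration. The pool name (tank), filesystem (tank/vmstore), and snapshot name are just placeholders for illustration:

# take a snapshot, then spin up a writable clone - both are instant and use no space up front
zfs snapshot tank/vmstore@before-upgrade
zfs clone tank/vmstore@before-upgrade tank/vmstore-test

# kick off an online scrub, which re-checks every block against its checksum
zpool scrub tank
zpool status tank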

When planning the ZFS SAN, I had the following things in mind:
- Reliable hardware
- Off-the-shelf parts
- Easy to service

Let's start with the chassis:
There are a dozen great chassis that can fit a number of different users' requirements. Norco and Supermicro are two great choices. On a budget? Norco. 24 bays in the $300 price range? Can't beat that for home use! Looking for redundancy? Supermicro; you can't beat the build quality for the price point.
I went with neither. Originally I was going to use the Supermicro 4U 24-bay hot-swap chassis HERE. It has redundant 900W or 1200W power supplies, great build quality, and a price point around $1000 shipped. It was perfect, until I found this monster at a local e-waste recycler:

34-bay hot-swap SATA chassis with 4 redundant power supplies for $200

It was originally designed for a dual-Opteron board with PCI-X RAID controllers. This chassis ran around $5k with no drives back in its day, and it even still worked. It fit every requirement and helped take a significant chunk out of the budget. The one downside? It required custom cables for the backplane, SFF-8470 to SFF-8087, and those cost a cool $300... It was weeks after I bought it that I found this out, and honestly, I should have bought the Supermicro. No one makes rails for this monster anymore either. Lesson 1 learned.



The motherboard, processor, and RAM:
I wanted something stable and cheap. I have had great luck with Supermicro motherboards, and I found this board open-box on Newegg for $200 off: the Supermicro X8DAH+-F-O. It includes some very nice features: IPMI, dual processors, support for 256GB of RAM, and tons of PCIe x16, x8, and x4 slots, all at nearly half off. It uses cheap unregistered ECC RAM at around $40 per 4GB stick.

Supermicro X8DAH+-F-O
16GB Kingston DDR3 ECC
(1) Xeon E5420 2.5GHz Quad

Controller cards:
I wanted to go with LSI from the start, and with the goal of providing the best bandwidth possible, I chose the following:
Two LSI SAS9201-16i controllers
Each card has four internal SFF-8087 connectors and provides up to 430,000 I/Os per second when connected to a PCIe x8 slot. I chose two controllers instead of an expander due to concerns over funneling all the drives through the single SFF-8087 link between an expander and its controller.



Hard drives:
2x 250GB WD hard drives (mirrored rpool)
20x 1TB WD RE3 hard drives
2x Intel 320 80GB ZIL drives
2x Intel 320 160GB L2ARC cache drives


Now I know using RE3 drives defeats the purpose of using cheap hard drives, but at $65 each, brand new off eBay, these were hard to beat. The RE3s are RAID-class enterprise drives designed for 24x7x365 operation and have a long, proven track record.
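The actual pool layout is a topic for another day, but as a rough sketch, twenty drives could be split into two 10-disk raidz2 vdevs striped into one pool. The pool name and device names below are placeholders, and this layout is just one reasonable option:

# two 10-disk raidz2 vdevs, striped together into a single pool
zpool create tank \
  raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0 c2t8d0 c2t9d0 \
  raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 c3t6d0 c3t7d0 c3t8d0 c3t9d0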

The ZIL drives are dedicated to the ZFS intent log; set up as a mirror, they should make a fast, safe log device, which is important for write performance in ZFS.
The L2ARC drives hold the most commonly accessed data; striped together, they should provide a super-fast 320GB read cache for the hottest data.
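Here is a rough sketch of how the SSDs get attached to the pool, again with placeholder device names:

# mirror the two 80GB SSDs as a dedicated intent log
zpool add tank log mirror c4t0d0 c4t1d0

# add both 160GB SSDs as L2ARC; ZFS stripes reads across all cache devices
zpool add tank cache c4t2d0 c4t3d0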




Networking:
2x quad-port Intel NICs
With a machine this fast, I wanted an equally solid connection to the SAN network. Two quad-port NICs let me run two 4Gb LACP links to two separate switches. Combined with each server's two dual-port gigabit cards, every server gets a super fast, redundant back end for storage traffic.
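Assuming the OS ends up being an illumos/Solaris derivative, each 4Gb LACP link is only a couple of commands with dladm and ipadm. The interface names and address here are placeholders and would match whatever the quad-port cards show up as:

# bond four gigabit ports into one LACP aggregation
dladm create-aggr -L active -l igb0 -l igb1 -l igb2 -l igb3 aggr1

# bring it up with a static address on the SAN network
ipadm create-ip aggr1
ipadm create-addr -T static -a 10.0.0.10/24 aggr1/san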
