Friday, April 04, 2014

Oracle ZFS storage appliance configuration

Oracle is incorporating its ZFS storage appliances into more and more engineered systems. Even if you are not a pure storage administrator or consultant and are more into Oracle software and engineered systems, it is good to have a basic understanding of how a ZFS storage appliance works and what you can potentially do with it to provide a better performing and more maintainable solution.

The common issue with hardware-based solutions is that you cannot just play with them without ordering the device. This holds a lot of people back from gaining experience before they get involved in a project where this specific hardware solution is used. The Oracle ZFS storage appliance is a bit different: Oracle has decided to create a virtual appliance you can use to play with the solution. The virtual appliance provides you all the options to test and work with the storage appliance in an Oracle VirtualBox image, in the same manner as if you had purchased the real physical hardware.

Oracle ZFS storage appliance


The virtual Oracle ZFS storage appliance can be downloaded from the Oracle site. After unpacking it and importing it into Oracle VirtualBox you will be up and running in a matter of minutes. One thing to keep in mind: this is a system to play around with; it is not intended to be used in any serious solution, only for playing and testing. When the initial boot has completed you will notice that the welcome screen of the host tells you where to point your browser.

Only a minimal setup is done during the initial boot process; the full configuration and setup will be done via the browser. This is exactly the same manner as with a real physical ZFS appliance in your datacenter. The primary things you need to complete during the initial setup are:
  • Host Name
  • DNS Domain
  • Default Router
  • DNS server
  • Password

After completing those steps you will be pointed to an https://{ip}:215 address, which will be the main URL for maintaining the ZFS storage appliance, or rather the ZFS storage appliance simulator in this case.
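As a small illustration, the handful of values gathered during the initial CLI setup can be sanity-checked before you type them in, and the management URL derived from them. This is just a sketch using the Python standard library; the IP address and hostnames below are made-up examples, not values from an actual appliance.

```python
import ipaddress

# Hypothetical first-boot settings; substitute your own values.
settings = {
    "host_name": "zfs-sim01",
    "dns_domain": "example.local",
    "default_router": "192.168.56.1",
    "dns_server": "192.168.56.1",
    "appliance_ip": "192.168.56.101",
}

def management_url(ip: str, port: int = 215) -> str:
    """Validate the IP address and return the BUI address the appliance reports."""
    ipaddress.ip_address(ip)  # raises ValueError on a malformed address
    return f"https://{ip}:{port}"

print(management_url(settings["appliance_ip"]))  # https://192.168.56.101:215
```

Validating the router and DNS server addresses the same way before the initial setup saves you a round of fixing typos later in the browser interface.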

Oracle ZFS configuration STEP 1:
Before you can configure the machine you will have to log in; for this you can use the root account in combination with the password you entered during the initial CLI configuration.


After logging in you will be shown the welcome screen, which again tells you that this system is only to be used for demonstration purposes. You can use it for some very small-scale testing; however, remember this system is not a solution for a real storage need and is just to play with.


Oracle ZFS configuration STEP 2:
The next step is to ensure you have all the correct networking in place to be able to use your ZFS appliance in the right manner within your corporate infrastructure.

Oracle ZFS appliance network configuration

As you can see from the above screenshot there is a datalink and an interface already configured; however, they are still named "Untitled", which is a hint that you need to do some configuration before they become usable. By clicking the pencil icon you can edit the details of both the datalinks and the interfaces, as shown below.

Oracle ZFS storage configuration

After configuring the ZFS storage appliance interfaces and datalinks you will be asked to configure the routing tables, DNS and NTP.




With this, the pure network configuration steps are done. Optionally, you can now select how to embed the new storage into your corporate authentication and authorization solution. You can use solutions like NIS, LDAP or the Active Directory implementation you might already have in place within your corporate IT infrastructure.


More information on how to connect a new ZFS appliance to an already existing Microsoft Active Directory can be found in the Oracle documentation.

Oracle ZFS configuration STEP 3:
In step 3 the actual storage configuration will be done. Here you will have to select how the disks will be used and what type of data profile to apply. The previous steps were mostly about fitting the appliance into your existing IT infrastructure; this step concerns how you will actually configure and use the appliance on a storage level. It is advisable to give this some thorough thought before you do the actual implementation of the appliance.

The first decision you will have to make is how many storage pools your device will have (initially).


During this implementation we will only be using a single storage pool. The next important decision is what kind of storage profile to use within your pool or pools; you can have a different storage profile per pool. The following storage profiles are available:




Double parity
RAID in which each stripe contains two parity disks. This yields high capacity and high availability, as data remains available even with the failure of any two disks. The capacity and availability come at some cost to performance: parity needs to be calculated on writes (costing both CPU and I/O bandwidth) and many concurrent I/Os need to be performed to access a single block (reducing available I/O operations). The performance effects on read operations are often greatly diminished when cache is available.

Mirrored
Data is mirrored, reducing capacity by half, but yielding a highly reliable and high-performing system. Recommended when space is considered ample, but performance is at a premium (for example, database storage).

Single Parity, Narrow stripes
RAID in which each stripe is kept to three data disks and a single parity disk. At normal stripe widths, single parity RAID offers few advantages over double parity RAID -- and has the major disadvantage of only being able to survive a single disk failure. However, at narrow stripe widths, this single parity RAID configuration can fill a gap between mirroring and double parity RAID: its narrow width offers better random read performance than the wider stripe double parity configuration, but it does not have quite the capacity cost of a mirrored configuration. While this configuration may be an appropriate compromise in some situations, it is generally not recommended unless capacity and random read performance must be carefully balanced: those who need more capacity are encouraged to opt for a wider, double-parity configuration; those for whom random read performance is of paramount importance are encouraged to consider either a mirrored configuration or (if the workload is amenable to it) a double parity RAID configuration with sufficient memory and dedicated cache devices to service the workload without requiring disk-based I/O.

Striped
Data is striped across disks, with no redundancy whatsoever. While this maximizes both performance and capacity, it comes at great cost: a single disk failure will result in data loss. This configuration is not recommended, and should only be used when data loss is considered to be an acceptable trade off for marginal gains in capacity and performance.

Triple mirrored
Data is triply mirrored, reducing capacity to one third, but yielding a very highly reliable and high-performing system. This configuration is intended for situations in which maximum performance and availability are required while capacity is much less important (for example, database storage). Compared with a two-way mirror, a three-way mirror adds additional protection against disk failures, and latent disk failures in particular, during reconstruction for a previous failure.

Triple parity, wide stripes
RAID in which each stripe has three disks for parity, and for which wide stripes are configured to maximize capacity. Wide stripes can exacerbate the performance effects of parity RAID: while bandwidth will be acceptable, the number of I/O operations that the entire system can perform will be greatly diminished. Resilvering data after one or more drive failures can take significantly longer due to the wide stripes and low random I/O performance. As with other RAID configurations, the presence of cache can mitigate the effects on read performance.

Which profile to apply depends on a number of variables, such as the type of performance you need and, for example, how "secure" your data should be with respect to data loss and hardware failure. The decision you make has a direct impact on performance as well as on the usable storage of your appliance. It is of the highest importance that, before you do the installation, you discuss the options with the consumers of your storage; these can be, for example, database and application administrators or even the business.
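To make the capacity side of that trade-off concrete, here is a rough back-of-the-envelope calculator. This is my own sketch, not Oracle tooling, and the stripe widths are illustrative assumptions rather than the appliance's exact layout choices: double parity is modelled as a 10-disk stripe (8 data + 2 parity), single-parity narrow stripes as 3 data + 1 parity, and triple parity wide stripes as 12 data + 3 parity.

```python
def profile_summary(disks: int, disk_tb: float) -> dict:
    """Approximate usable capacity (TB) and the number of disk failures
    survived per stripe or mirror group, for each storage profile.

    Stripe widths are illustrative assumptions (see lead-in), so treat the
    parity-profile numbers as rough estimates only.
    """
    raw = disks * disk_tb
    return {
        "striped":              (raw,           0),  # no redundancy at all
        "mirrored":             (raw / 2,       1),  # two copies of the data
        "triple mirrored":      (raw / 3,       2),  # three copies of the data
        "single parity narrow": (raw * 3 / 4,   1),  # 3 data + 1 parity
        "double parity":        (raw * 8 / 10,  2),  # 8 data + 2 parity
        "triple parity wide":   (raw * 12 / 15, 3),  # 12 data + 3 parity
    }

# Example: 20 disks of 1 TB each.
for name, (usable, failures) in profile_summary(20, 1.0).items():
    print(f"{name:22s} usable={usable:5.2f} TB, survives {failures} failure(s) per group")
```

Running this for a small configuration shows the spread clearly: the same 20 disks yield 20 TB striped with no protection, 16 TB with double parity, but only about 6.7 TB triple mirrored, which is exactly the kind of number you want on the table when discussing the profile choice with your database and application administrators.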

After having completed this section of the setup you should have a situation similar to the one shown below.


This completes the primary initial setup, and you will be able to start distributing the storage to the servers and users who will make use of the new ZFS appliance within your corporate IT infrastructure.
