6.2 RAID

Nobody likes to lose data. And since disks eventually die, often with little warning, it's wise to consider setting up a RAID (Redundant Array of Inexpensive^[1] Disks) array on your database servers to prevent a disk failure from causing unplanned downtime and data loss. But there are many different types of RAID to consider: RAID 0, 1, 0+1, 5, and 10. And what about hardware RAID versus software RAID?

^[1] The "I" in RAID has meant, at various times, either "Inexpensive" or "Independent." It started out as "Inexpensive," but started being referred to as "Independent" because drives weren't really all that inexpensive. By the time people actually started using "Independent," the price of disks had plummeted and they really were "Inexpensive." Murphy at work.

From a performance standpoint, some options are better than others. The faster ones will sacrifice something to gain that performance—usually price or durability. In all cases, the more disks you have, the better performance you'll get. Let's consider the benefits and drawbacks of each RAID option.^[2]

^[2] For a more complete treatment of this topic, consult Derek Vadala's Managing RAID on Linux published by O'Reilly.

RAID 0

Of all the RAID types, RAID 0, or striping, offers the biggest performance improvement. Writes and reads are both faster in RAID 0 than in any other configuration. Because there are no spare or mirrored disks, it's inexpensive. You're using every disk you pay for. But the performance comes at a high price. There's no redundancy at all. Losing a single disk means that your whole array is dead.

RAID 0 should be used only when you don't care about data loss. For example, if you're building a cluster of MySQL slaves, it's entirely reasonable to use RAID 0. You'll reap all the performance benefits, and if a server does die, you can always clone the data from one of the other slaves.

RAID 1

Moving up the scale, RAID 1, or mirroring, isn't as fast as RAID 0, but it provides redundancy; you can lose a disk and keep on running. The performance boost applies only to reads. Since all the data is on every disk in the mirrored volume, the system may decide to read data in parallel from the disks. The result is that in the optimal case it can read the same amount of data in roughly half the time.

Write performance, however is only as good as a single disk. It can even be half as good depending on whether the RAID controller performs the writes in parallel or sequential order. Also, from a price point of view, you're paying for twice as much space as you're using. RAID 1 is a good choice when you need redundancy but have space or budget for only two disks—such as in a 1-U rackmount case.

RAID 5

From a performance standpoint, RAID 5, which is striping (RAID 0) with distributed parity blocks, can be beneficial. There are two disks involved in every operation, so it's not substantially faster than RAID 1 until you have more than three disks total. Even then, its other benefit, size, shines through. Using RAID 5, you can create rather large volumes without spending a lot of cash because you sacrifice only a single disk. By using more smaller disks, such as eight 36-GB disks instead of four 72-GB disks, you increase the number of spindles in the array and therefore boost seek performance and throughput.

RAID 5 is the most commonly used RAID implementation. When funds are tight, and redundancy is clearly more important than performance, it's the best compromise available.

RAID 10 (also known as RAID 1+0)

To get the best of both worlds (the performance benefits of RAID 0 along with the redundancy of RAID 1), you need to buy twice as many disks. RAID 10 is the only way to get the highest performance on your database server without sacrificing redundancy. If you have the budget to justify it, you won't be disappointed.

JBOD

The configuration sometimes called "Just a Bunch of Disks" (JBOD) provides no added performance or redundancy. It's simply a combination of two or more smaller disks to produce a single, larger virtual disk.

Table 6-1 summarizes various RAID features.

Table 6-1. Summary of various RAID features

Level

Redundancy

Disks required

Faster reads

Faster writes

RAID 0

No

N

Yes

Yes

RAID 1

Yes

2^[3]

Yes

No

RAID 5

Yes

N+1

Yes

No

RAID 10

Yes

N*2

Yes

Yes

JBOD

No

N/A

No

No

^[3] Typically, RAID 1 is used with two disks. but it's possible to use more than two. Doing so will boost read performance but doesn't change write performance.

6.2.1 Mix and Match

When deciding how to configure your disks, consider the possibility of multiple RAID arrays. RAID controllers aren't that expensive, so you might benefit from using RAID 5 or RAID 10 for your databases and a separate RAID 1 array for your transaction and replication logs. Some multichannel controllers can manage multiple arrays, and some can even bind several channel controllers together into a single controller to support more disks.

Doing this isolates most of the serial disk I/O from most of the random, seek-intensive I/O. This is because transaction and replication logs are usually large files that are read from and written to in a serial manner, usually by a small number of threads. So it's not necessary to have a lot of spindles available to spread the seeks across. What's important is having sufficient bandwidth, and virtually any modern pair of disks can fill that role nicely. Meanwhile, the actual data and indexes are being read from and written to by many threads simultaneously in a fairly random manner. Having the extra spindles associated with RAID 10 will boost performance. Or, if you simply have too much data to fit on a single disk, RAID 5's ability to create large volumes works to your advantage.

6.2.1.1 Sample configuration

To make this more concrete, let's see what such a setup might look like with both InnoDB and MyISAM tables. It's entirely possible to move most of the files around and leave symlinks in the original locations (at least on Unix-based systems), but that can be a bit messy, and it's too easy to accidentally remove a symlink (or accidentally back up symlinks instead of actual data!). Instead, you can adjust the my.cnf file to put files where they belong.

Let's assume you have a RAID 1 volume on which the following filesystems are mounted: /, /usr, and swap. You also have a RAID 5 (or RAID 10) filesystem mounted as /data. On this particular server, MySQL was installed from a binary tarball into /usr/local/mysql, making /usr/local/mysql/data the default data directory.

The goal is to keep the InnoDB logs and replication logs on the RAID-1 volume, while moving everything else to /data. These my.cnf entries can accomplish that:

datadir = /data/myisam

log-bin = /usr/local/mysql/data/repl/bin-log

innodb_data_file_path = ibdata1:16386M;ibdata2:16385M

innodb_data_home_dir = /data/ibdata

innodb_log_group_home_dir = /usr/local/mysql/data/iblog

innodb_log_arch_dir = /usr/local/mysql/data/iblog

These entries provide two top-level directories in /data for MySQL's data files: ibdata for the InnoDB data and myisam for the MyISAM files. All the logs remain in or below /usr/local/mysql/data on the RAID 1 volume.

6.2.2 Hardware Versus Software

Some operating systems can perform software RAID. Rather than buying a dedicated RAID controller, the operating system's kernel splits the I/O among multiple disks. Many users shy away from using these features because they've long been considered slow or buggy.

In reality, software RAID is quite stable and performs rather well. The performance differences between hardware and software RAID tend not to be significant until they're under quite a bit of load. For smaller and medium-sized workloads, there's little discernible difference between them. Yes, the server's CPU must do a bit more work when using software RAID, but modern CPUs are so fast that the RAID operations consume a small fraction of the available CPU time. And, as we stressed earlier, the CPU is usually not the bottleneck in a database server anyway.

Even with software RAID, you can use multiple disk controllers to achieve redundancy at the hardware level without actually paying for a RAID controller. In fact, some would argue that having two non-RAID controllers is better than a single RAID controller. You'll have twice the available I/O bandwidth and have eliminated a single point of failure if you use RAID 1 or 10 across them.

Having said that, there is one thing that can be done with hardware RAID that simply can't be done in software: write caching. Many RAID controllers can add battery-backed RAM that caches reads and writes. Since there's a battery on the card, you don't need to worry about lost writes even when the power fails. If it does, the data stays in memory on the controller until the machine is powered back up. Most hardware RAID controllers can also read cache as well.

6.2.3 IDE or SCSI?

It's a perpetual question: do you use IDE or SCSI disks for your server? A few years ago, the answer was easy: SCSI. But the issue is further muddied by the availability of faster IDE bus speeds and IDE RAID controllers from 3Ware and other vendors. For our purposes, Serial-ATA is the same as IDE.

The traditional view is that SCSI is better than IDE in servers. While many people dismiss this argument, there's real merit to it when dealing with database servers. IDE disks handle requests in a sequential manner. If the CPU asks the disk to read four blocks from an inside track, followed by eight blocks from an outside track, then two more blocks from an inside track, the disk will do exactly what it's told; even if it's not the most efficient way to read all that data. SCSI disks have a feature known as Tagged Command Queuing (TCQ). TCQ allows the CPU to send several read/write requests to the disk at the same time. The disk controller then tries to find the optimal read/write pattern to minimize seeks.

IDE also suffers from scaling problems; you can't use more than one drive per IDE channel without suffering a severe performance hit. Because most motherboards offer only four IDE channels at most, you're stuck with only four disks unless you add an additional controller. Worse yet, IDE has rather restrictive cable limits. With SCSI, you can typically add 7 or 14 disks before purchasing a new controller. Furthermore, the constant downward price pressure on hard disks has affected SCSI as much as IDE.

On the other hand, SCSI disks still cost more than their IDE counterparts. When you're considering four or more disks, the price difference is significant enough that you might be able to purchase IDE disks and be able to afford another controller, possibly even an IDE RAID controller. Many MySQL users are quite happy using 3Ware IDE RAID controllers with 4-12 disks on them. It costs less than a SCSI option, and the performance is reasonably close to that of a high-end SCSI RAID controller.

6.2.4 RAID on Slaves

As we mentioned in the discussion of RAID 0, if you're using replication to create a cluster of slaves for your application, it's likely that you can save money on the slaves by using a different form of RAID. That means using a higher-performance configuration that doesn't provide redundancy (RAID 0), using fewer disks (RAID 5 instead of RAID 10), or using software rather than hardware RAID, for example. If you have enough slaves, you may not necessarily need the redundancy on the slaves. In the event that one slave suffers the loss of a disk, you can always synchronize it with another nearby slave to get it started again.

< Day Day Up >