In the previous post I talked about different RAID levels and in this post I will explain about the math about RAID 6. It is a little complicated so please feel free to skip this post. Just remember that RAID 6 allow up to 2 disk to fail while having all data intact.
The math behind RAID 6 is Galois field, GF(2^8) (finite fields of characteristic 2 with 8 terms). By using GF(2^8), we can have up to a total of 257 disk, 255 data disk and 2 parity disk. Here are a summary of things we will be using, it is not an exhaustive properties.
- Additive identity is 0.
- Addition and subtraction is simply a bitwise XOR operation, thus A + B = A - B. *Note that A + A = A - A = 0.
- Multiplicative identity is 1. A * 1 = 1 * A = A
- Multiplication is implemented by the the following relations:
- bit 7 of (A * 2) = bit 6 of (A)
- bit 6 of (A * 2) = bit 5 of (A)
- bit 5 of (A * 2) = bit 4 of (A)
- bit 4 of (A * 2) = bit 3 of (A) + bit 7 of (A)
- bit 3 of (A * 2) = bit 2 of (A) + bit 7 of (A)
- bit 2 of (A * 2) = bit 1 of (A) + bit 7 of (A)
- bit 1 of (A * 2) = bit 0 of (A)
- bit 0 of (A * 2) = bit 7 of (A)
- Basic algebra applies, i.e., A * B = B * A, A * ( B + C) = A* B + A * C.
- For all A such that A is not 0, there exist an inverse A (1/A or A^-1) such that A * A^-1 = 1.
- A/B = A * 1/B = A * B^-1, for B not equal 0. Thus if A * B = C, then C/B = A for B not equal 0.
- Anything multiply by 0 is 0.
- Raising power can apply the distributive law:
- *Number are in hexadecimal*
- 02^2 = 02^1 * 02 = 04
- 02^3 = 02^2 * 02 = 04 * 2 = 08
- 02^4 = 02^3 * 02 = 08 * 2 = 10
- 02^5 = 02^4 * 02 = 10 * 2 = 20
- 02^6 = 02^5 * 02 = 20 * 2 = 40
- 02^7 = 02^6 * 02 = 40 * 2 = 80
- 02^8 = 02^7 * 02 = 80 * 2 = 1d (see number 4)
- If you are multiplying by a number that is not a power of 2, for example 1F, you can express it like this: 1E = 10 + 08 + 04 + 02, thus A * 1E = A*10 + A*08 + A*04 + A*02.
- Raising power, the power is congruent mod 255. Base cannot be 0.
- A^256 = A^(256 mod 255) = A^1 = A
- A^255 = A^(255 mod 255) = A^0 = 1
- A^254 = A^(255-1) = A^255 / A^1 = 1/A
- A^-2 = 1 / A^2 = A^255 / A^2 = A^253
- Element, g, is called generator if g is an element in GF(q) and all non zero element in GF(q) can be written as (g^n mod q)
- 2 is a generator for GF(3). GF(3) has 2 non zero element, 1 2.
- 2^1 mod 3 = 2
- 2^2 mod 3 = 1
- 2 is not a generator for GF(7). There exists no i such that (2^i mod 7) = 3.
- Any generator g, defines a function from non zero elements in GF(2^8) to elements in 255 (0 to 254 mod 255). For example g = 02 (Hex), 02^6 = 40, log02 40 = 6.
- For any non zero element A and B in Galois field
- *logg => log base g, (+) and (-) are normal addition and subtraction followed by mod 255*
- A*B = C <==> logg A (+) logg B = logg C
- thus A*B = C <==> C = g^(logg A (+) logg B)
- A/B = C <==> logg A (-) logg B = logg C
- thus A/B = C <==> C = g^(logg A (-) logg B)
Please note that I am only be simply listing the properties for the sole purpose to explain how RAID 6 works, if you are interested how the properties are derived, you can read up on finite fields.
How RAID 6 works? RAID 6 requires two parity data, P and Q, to ensure all data are intact even after suffering up to 2 disk failure. Please note that P is the same as even parity of RAID 5.
Formula of P and Q, for n is the number of data disk and n <= 255 disk:
P = Disk0 + Disk1 + Disk2 + ... + Disk(n-1)
Q = (
g^0 * Disk0) + (
g^1 * Disk1) + (
g^2 * Disk2) + ... +(
g^(n-1) * Disk(n-1))
*g is any generator and we will be operating byte by byte
For example a 3 data disk setup, and using 02 (Hex) for our generator:
P = Disk0 + Disk1 + Disk2
= 1010 1010 + 1100 1100 + 0001 0110
= 0111 0000
g^0 = 02^0 = 0000 0001
g^1 = 02^1 = 0000 0010
g^2 = 02^2 = 0000 0100
Q = (g^0 * Disk0) + (g^1 * Disk1) + (g^2 * Disk2)
= (0000 0001 * 1010 1010) + (0000 0010 * 1100 1100) + (0000 0100 * 0001 0110)
= 1010 1010 + 1000 0101 + 0101 1000
= 0111 0111
Disk 0: 1010 1010
Disk 1: 1100 1100
Disk 2: 0001 0110
Disk 3: 0111 0000 (P)
Disk 4: 0111 0111 (Q)
For 1 disk failure:
If we lose a data disk, we can reconstruct the lost disk using P (see previous post about RAID 5 parity).
If we lose P or Q, we can reconstruct them using the data disks.
For 2 disk failure:
If we lose P and Q, we can simply reconstruct them using the data disks.
If we lose a data disk and Q, we can reconstruct the data disk using P (see previous post about RAID 5 parity), then reconstruct Q using the data disks.
If we lose a data disk and P, we can reconstruct the data disk using Q, then reconstruct P using the data disks
To reconstruct the data using Q, we first construct a Q' by assuming the lost data disk contains 0000 0000
By property 1, 5 and 8, we get:
Q' + (g^x * Dx) = Q (x denote the lost drive number, Dx is the data)
By property 2, we can rewrite it as:
(g^x * Dx) = Q + Q'
By property 6 and 7 and g^x is non since g is 02 (Hex), we can rewrite it as:
Dx = (Q + Q')/g^x = (Q + Q') * g^-x
By property 10, we can rewrite it as:
Dx = (Q + Q') * g^(255-x)
Here, Q, Q' and x are known, so we solve Dx
Here is an example:
Assume we lose disk 2 (data) and disk 3 (P), x = 2
Q' = (0000 0001 * 1010 1010) + (0000 0010 * 1100 1100) + (0000 0100 * 0000 0000)
= 1010 1010 + 1000 0101 + 0000 0000
= 0010 1111
Q + Q' = 0111 0111 + 0010 1111
= 0101 1000
Dx = (Q + Q') * g^(255-x)
= 0101 1000 * g^(255-2)
= 0101 1000 * g^(253)
= 0101 1000 * 0100 0111 (apply property 9)
= (0101 1000 * 0100 0000) + (0101 1000 * 0000 0100) + (0101 1000 * 0000 0010) + (0101 1000 * 0000 0001)
= 1000 0011 + 0111 1101 + 1011 0000 + 0101 1000
= 0001 0110 (disk 2's data, with this we can easily compute P)
If we lose 2 data disk, we can reconstruct both the data using P and Q.
First we need to compute P' and Q' by assuming the 2 lost data disk contains 0000 0000.
By property 1, 5 and 8, we get 2 equations:
(1) P' + Dx + Dy = P (x and y denote the lost drive number, Dx and Dy denote the data)
(2) Q' + (g^x * Dx) + (g^y * Dy) = Q
We have 2 unknown variable, Dx and Dy, and we have 2 equations, so yes, simultaneous equations.
Divide (2) by g^x
(3) (Q' * g^-x) + Dx + (g^(y-x) * Dy) = (Q * g^-x)
By property 2, we rearrange (3)
Dx = (Q * g^-x) + (Q' * g^-x) + (g^(y-x) * Dy)
(4) Dx = ((Q + Q') * g^-x) + (g^(y-x) * Dy)
By property 2, we rearrange (1)
(5) Dy = P + P' + Dx
Sub (5) into (4)
Dx = ((Q + Q') * g^-x) + (g^(y-x) * (P + P' + Dx))
Dx = ((Q + Q') * g^-x) + (g^(y-x) * (P + P')) + (g^(y-x) * Dx)
Dx + (g^(y-x) * Dx) = ((Q + Q') * g^-x) + (g^(y-x) * (P + P')) (by property 2)
(6) Dx * (1 + g^(y-x)) = ((Q + Q') * g^-x) + (g^(y-x) * (P + P'))
By property 6 and 7, if (1 + g^y-x) is not equal to 0, we can divide by it. So, g^(y-x) cannot equal to 1 which means y and x are not congruent mod 255. Since we do not allow more than 255 disk, and by our assumption, the 2 disk are different disk, we do not violate the constraint.
Now we rewrite (6)
(7) Dx = (((Q + Q') * g^-x) + (g^(y-x) * (P + P'))) / (1 + g^(y-x))
All the variable on the right hand side of (7) are all known variable, thus we can solve Dx. After we solve Dx, we can simply use (5) to solve for Dy.
Here is an example:
Assume we lose disk 1 (data) and disk 2 (data), x = 1, y = 2
Q' = (0000 0001 * 1010 1010) + (0000 0010 * 0000 0000) + (0000 0100 * 0000 0000)
= 1010 1010 + 0000 0000 + 0000 0000
= 1010 1010
Q + Q' = 0111 0111 + 1010 1010
= 1101 1101
P' = 1010 1010 + 0000 0000 + 0000 0000
= 1010 1010
P + P' = 0111 0000 + 1010 1010
= 1101 1010
(Q + Q') * g^-x = 1101 1101 * g^-1
= 1101 1101 * g^255-1
= 1101 1101 * g^254
= 1101 1101 * 1000 1110 (apply property 9)
= (1101 1101 * 1000 0000) + (1101 1101 * 0000 1000) + (1101 1101 * 0000 0100) + (1101 1101 * 0000 0010)
= 1011 0010 + 1010 0110 + 0101 0011 + 1010 0111
= 1110 0000
g^(y-x) * (P + P') = g^(2-1) * 1101 1010
= g^1 * 1101 1010
= 0000 0010 * 1101 1010
= 1010 1001
Dx = (((Q + Q') * g^-x) + (g^(y-x) * (P + P'))) / (1 + g^(y-x))
= 0100 1001 / 0000 0011 (apply property 6 and 7)
= 0100 1001 * 1111 0100 (apply property 9)
= 0100 0000 * 1111 0100 + 0000 1000 * 1111 0100 + 0000 0001 * 1111 0100
= 1100 1011 + 1111 0011 + 1111 0100
= 1100 1100 (Disk 1 data, with this we can easily compute Disk 2)