Lecture-06: RAID in Database Management Systems (DBMS)

 RAID (Database Management Systems)

A Redundant Array of Independent Disks (RAID) combines multiple physical disks into one logical storage unit to improve performance, data redundancy, and reliability.

What is the problem with single disk storage?

1)    No backup disk: If the single disk fails, the whole system stops working and its data may be lost. With data redundancy, a copy of the data is stored on more than one disk, so even if one disk fails the system can still fetch the data from a redundant disk.

2)      Performance: If a large amount of data is stored on a single disk, every request must be served by that one disk, which degrades the performance of the system. Using multiple disks lets requests be served in parallel, and redundant copies give reads more than one disk to choose from.

RAID Levels in DBMS:

The different RAID levels used in DBMS are:

1)      RAID 0.

2)      RAID 1.

3)      RAID 2.

4)      RAID 3.

5)      RAID 4.

6)      RAID 5.

7)      RAID 6.

8)      RAID 10.

Need for RAID:

1)    To increase performance.

2)    To increase reliability.

3)    To give better throughput.

4)    To allow data to be restored after a disk failure.

 

RAID – Level 0:

  1. RAID level 0 provides data striping, i.e., data is split into blocks that are placed across multiple disks. Because no redundant copy or parity is kept, if one disk fails then all data in the array is lost.
  2. This level doesn't provide fault tolerance but increases the system performance.

Example:

 

Disk 0 | Disk 1 | Disk 2 | Disk 3
20     | 21     | 22     | 23
24     | 25     | 26     | 27
28     | 29     | 30     | 31
32     | 33     | 34     | 35
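
The round-robin placement in the example can be sketched in a few lines of Python. This is only an illustration: it assumes one block per stripe unit and consecutively numbered blocks, and the function name `raid0_locate` is hypothetical, not part of any real RAID tool.

```python
def raid0_locate(block, num_disks, first_block=0):
    """Return (disk, row) for a logical block under round-robin striping.

    Assumption: each stripe unit holds exactly one block, and blocks are
    numbered consecutively starting at first_block.
    """
    offset = block - first_block
    return offset % num_disks, offset // num_disks


# Rebuild the example layout: blocks 20..35 striped over 4 disks.
layout = {}
for block in range(20, 36):
    disk, row = raid0_locate(block, num_disks=4, first_block=20)
    layout.setdefault(row, {})[disk] = block

for row in sorted(layout):
    print([layout[row][d] for d in range(4)])
```

Each printed row is one stripe. Losing any one disk removes one block from every stripe, which is why a single failure destroys the whole array.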

 

Pros of RAID 0

1)      All the disk space is available for data, because none of it is spent on redundancy.

2)      Data requests are spread over multiple disks instead of a single disk, which improves throughput.

Cons of RAID 0

1)      Failure of one disk leads to complete data loss in the array.

2)     No data redundancy is implemented, so there is no way to recover from a disk failure.

  RAID – Level 1:

1)    Uses mirroring techniques.

2)    All data in the drive is duplicated to another drive.

3)    It provides 100% redundancy in case of a failure.

Example:

 

Disk 0 | Disk 1 | Disk 2 | Disk 3
20     | 20     | 22     | 22
24     | 24     | 26     | 26
28     | 28     | 30     | 30
32     | 32     | 34     | 34
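
Mirroring can be illustrated with a toy model. The class below is a hypothetical sketch, not a real driver: each "disk" is a Python dict, every write goes to both disks of the pair, and a read is served by whichever disk is still healthy.

```python
class MirroredPair:
    """Toy RAID 1 pair: every write goes to both disks."""

    def __init__(self):
        # Each "disk" is modelled as a dict from block number to data.
        self.disks = [{}, {}]
        self.failed = [False, False]

    def write(self, block, data):
        # Duplicate the write on every disk that is still working.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block] = data

    def fail(self, i):
        # Simulate a disk failure: its contents are gone.
        self.failed[i] = True
        self.disks[i] = {}

    def read(self, block):
        # Serve the read from the first disk that has not failed.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[block]
        raise IOError("both mirrors have failed")


pair = MirroredPair()
pair.write(20, b"block-20")
pair.fail(0)            # lose the first disk
print(pair.read(20))    # the read is served from the second disk
```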

 

Pros of RAID 1

1)      Failure of one Disk does not lead to system failure as there is redundant data in another disk.

Cons of RAID 1

1)      Extra space is required, because each disk's data is also copied to another disk; only half of the total capacity is usable.

 RAID – Level 2:

1)    Uses bit-level striping and stores error-correcting codes (ECC) for the striped data on additional disks. Unlike RAID 1, it does not mirror the data.

2)    Each data bit in a word is recorded on a separate disk, and the ECC bits of the data words are stored on a different set of disks.

3)    Due to its complex structure and high cost, RAID 2 is rarely used in practice.

  Example:

Disk 0 | Disk 1 | Disk 2 | Disk 3 | Disk 4 | Disk 5
20     | 21     | 22     | P(20)  | P(21)  | P(22)
24     | 25     | 26     | P(24)  | P(25)  | P(26)
28     | 29     | 30     | P(28)  | P(29)  | P(30)
32     | 33     | 34     | P(32)  | P(33)  | P(34)

 

Here Disk 3, Disk 4, and Disk 5 store the parity bits of the data stored on Disk 0, Disk 1, and Disk 2 respectively. Parity bits are used to detect errors in the data.
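
The table shows one parity value per data value, which is a simplification. RAID 2 is usually described with a Hamming code, so the sketch below assumes Hamming(7,4): four data bits (four data disks) plus three check bits (three ECC disks). The function names are hypothetical. The check bits let the array locate and flip a single wrong bit in a word.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit word: [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(word):
    """Fix at most one flipped bit; return (data_bits, error_position)."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s4 = w[3] ^ w[4] ^ w[5] ^ w[6]
    position = s1 + 2 * s2 + 4 * s4   # 0 means no error was detected
    if position:
        w[position - 1] ^= 1          # flip the faulty bit back
    return [w[2], w[4], w[5], w[6]], position


word = hamming74_encode([1, 0, 1, 1])   # one bit per disk, 7 disks in total
word[4] ^= 1                            # the disk holding position 5 returns a bad bit
data, position = hamming74_correct(word)
print(data, position)
```

With this code each 4-bit word needs 3 extra bits, which illustrates the storage overhead listed under the cons below.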

Pros of RAID 2

  1. It checks for errors at the bit level for every data word.
  2. Dedicated disks store the parity (ECC) bits, which helps in detecting errors.

Cons of RAID 2

  1. Large extra space is used for parity bit storage.

RAID – Level 3:

1)    It consists of byte-level striping with dedicated parity. In this level, the parity information is computed for each stripe and written to a dedicated parity drive.

2)    Parity is a technique that checks whether data has been lost or written over when it is moved from one place in storage to another.

3)    In the case of disk failure, the parity disk is accessed and data is reconstructed from the remaining devices.

4)    Once the failed disk is replaced, the missing data can be restored on the new disk.

Example:

 

Disk 0 | Disk 1 | Disk 2 | Disk 3
20     | 21     | 22     | P(20,21,22)
24     | 25     | 26     | P(24,25,26)
28     | 29     | 30     | P(28,29,30)
32     | 33     | 34     | P(32,33,34)

 

Here Disk 3 contains the parity bits for Disk 0, Disk 1, and Disk 2. If the data on any one of these disks is lost, it can be reconstructed using the parity bits on Disk 3 together with the data on the surviving disks.
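
The following sketch illustrates byte-level striping with a dedicated parity disk and the rebuild of one lost disk. It assumes XOR parity and a payload whose length is a multiple of the number of data disks; `stripe_bytes` and `parity_disk` are hypothetical helper names.

```python
from functools import reduce


def stripe_bytes(data, num_data_disks):
    """Byte-level striping: byte i goes to data disk i % num_data_disks."""
    disks = [bytearray() for _ in range(num_data_disks)]
    for i, byte in enumerate(data):
        disks[i % num_data_disks].append(byte)
    return disks


def parity_disk(disks):
    """XOR the bytes found at the same position on every given disk."""
    return bytearray(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


data = b"RAID3-demo!!"        # 12 bytes, a multiple of 3 data disks
d = stripe_bytes(data, 3)
p = parity_disk(d)

# Simulate losing Disk 1 and rebuild it from the survivors plus parity.
rebuilt = parity_disk([d[0], d[2], p])
print(rebuilt == d[1])
```

The rebuild reuses `parity_disk` because XOR-ing the surviving data with the parity cancels everything except the missing disk's bytes.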

Pros of RAID 3

1)      Data can be recovered with the help of parity bits.

Cons of RAID 3

1)      Extra space for storing parity bits is used.


RAID – Level 4:

1)    RAID 4 implements block-level striping of data with a dedicated parity drive. If the data on any one disk is lost, it can be reconstructed with the help of the parity drive.

2)    Parity is calculated by XOR-ing the corresponding blocks of the data disks.

 Example:

 

DISK 0 | DISK 1 | DISK 2 | DISK 3
0      | 1      | 0      | P0
1      | 1      | 0      | P1

 

Here P0 is calculated as XOR(0,1,0) = 1 and P1 as XOR(1,1,0) = 0. If the inputs contain an even number of 1s the XOR is 0, and for an odd number of 1s the XOR is 1.

Suppose the data on Disk 0 is lost in the first row. Since the parity P0 = 1 and Disk 1 and Disk 2 hold 1 and 0, Disk 0 must have held 0; a 1 on Disk 0 would have made P0 = 0, which contradicts the stored parity value.
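
The same calculation can be written as a short Python sketch using the bits from the example. The helper names `parity` and `recover` are illustrative only.

```python
from functools import reduce
from operator import xor


def parity(bits):
    """XOR of all data bits in one row."""
    return reduce(xor, bits)


def recover(surviving_bits, parity_bit):
    """The lost bit is the XOR of the parity with every surviving bit."""
    return reduce(xor, surviving_bits, parity_bit)


p0 = parity([0, 1, 0])   # first row of the example
p1 = parity([1, 1, 0])   # second row of the example
print(p0, p1)

# Disk 0 is lost in the first row; Disk 1 and Disk 2 still hold 1 and 0.
print(recover([1, 0], p0))
```

Recovery is just one more XOR, which is why a single parity disk can stand in for any one failed data disk but not for two.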

Pros of RAID 4

1)      Parity bits help to reconstruct the data if at most one disk's data is lost.

Cons of RAID 4

1)      Extra space for Parity is required.

2)      If data is lost from more than one disk, the parity cannot help us reconstruct it.

 RAID – Level 5:

RAID 5 is similar to RAID 4 with only one difference: the parity rotates among the disks.

Example:

DISK 0 | DISK 1 | DISK 2 | DISK 3
0      | 1      | 0      | P0
1      | 1      | P1     | 0
1      | P2     | 0      | 1
P3     | 1      | 0      | 0

Here we can see the rotation of the parity blocks from Disk 3 to Disk 0.
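
The rotation can be expressed as a small formula. The sketch below assumes the direction read off the example (parity starts on the last disk and moves one disk to the left per row); real arrays use several different layouts, and the function names are hypothetical.

```python
def raid5_parity_disk(row, num_disks):
    """Disk index that holds the parity block of the given row."""
    return (num_disks - 1 - row) % num_disks


def raid5_row_layout(row, num_disks):
    """Label each disk in a row as data ("D") or parity ("P<row>")."""
    p = raid5_parity_disk(row, num_disks)
    return [f"P{row}" if disk == p else "D" for disk in range(num_disks)]


for row in range(4):
    print(raid5_row_layout(row, 4))
```

Because the parity block lands on a different disk in each row, parity updates are spread over all disks instead of always hitting the same one.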

Pros of RAID 5

1)      Parity is distributed over the disks, so no single parity disk becomes a bottleneck, which improves performance.

2)      Data can be reconstructed using parity bits.

Cons of RAID 5

1)      Parity is useful only when data is lost on at most one disk. If more than one disk is lost, the parity is of no use.

2)      Extra space for parity is required.

 

 RAID – Level 6:

1)    RAID 6 is an extension of level 5.

2)    In this level, two independent parities are generated and stored in distributed fashion among multiple disks.

3)     Two parities provide additional fault tolerance.

4)    This level requires at least four disk drives.

Example:

DISK 0 | DISK 1 | DISK 2 | DISK 3
0      | 1      | Q0     | P0
1      | Q1     | P1     | 0
Q2     | P2     | 0      | 1
P3     | 1      | 0      | Q3

Here P0, P1, P2, P3 and Q0, Q1, Q2, Q3 are two independent sets of parity, used to reconstruct the data if at most two disks fail.
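
The notes do not say how the second parity Q is computed. A common approach, assumed here, is to keep P as a plain XOR and to compute Q as a weighted sum in the Galois field GF(2^8), so that P and Q form two independent equations. The sketch below works on one byte per data disk and uses three data bytes to show the weighting; all function names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) with reducing polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result


def gf_pow2(i):
    """Return 2 raised to the power i in GF(2^8)."""
    value = 1
    for _ in range(i):
        value = gf_mul(value, 2)
    return value


def gf_inv(a):
    """Brute-force multiplicative inverse (fine for a 256-element field)."""
    return next(b for b in range(1, 256) if gf_mul(a, b) == 1)


def pq_parity(data):
    """P is plain XOR; Q weights data byte i by 2**i before XOR-ing."""
    p = q = 0
    for i, byte in enumerate(data):
        p ^= byte
        q ^= gf_mul(gf_pow2(i), byte)
    return p, q


def recover_two(data, x, y, p, q):
    """Rebuild data[x] and data[y] (both unknown) from the survivors, P and Q."""
    pxy, qxy = p, q
    for i, byte in enumerate(data):
        if i not in (x, y):
            pxy ^= byte                       # leaves Dx ^ Dy
            qxy ^= gf_mul(gf_pow2(i), byte)   # leaves 2^x*Dx ^ 2^y*Dy
    gx, gy = gf_pow2(x), gf_pow2(y)
    dy = gf_mul(qxy ^ gf_mul(gx, pxy), gf_inv(gx ^ gy))
    return pxy ^ dy, dy


stripe = [0x12, 0x34, 0x56]          # one byte from each data disk
p, q = pq_parity(stripe)
damaged = [None, 0x34, None]         # data disks 0 and 2 have failed
print(recover_two(damaged, 0, 2, p, q) == (0x12, 0x56))
```

Two unknowns need two independent equations, which is why a single XOR parity (RAID 5) cannot survive a second failure while P plus Q can.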

Pros of RAID 6

1)      The second parity makes it possible to reconstruct the data of up to 2 failed disks.

Cons of RAID 6

1)      Extra space is used for both parities (P and Q).

2)      More than 2 disk failures cannot be corrected.

 

 RAID 10:

RAID 10, also known as RAID 1+0, combines data striping (RAID 0) with mirroring (RAID 1) for both high performance and data redundancy. It requires a minimum of four disks and protects against a single drive failure, and against multiple drive failures as long as both disks of the same mirrored pair do not fail.

What it is:

1)    RAID 10 combines the principles of RAID 0 (striping) and RAID 1 (mirroring) to achieve a balance between performance and data redundancy. 

 

How it works:

1)    Data is first striped (split into blocks) across multiple disks, similar to RAID 0, which enhances read and write speeds. 

2)    Each striped set is then mirrored, meaning each stripe is duplicated on another disk, providing redundancy and protection against drive failures. 
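
The two steps above can be sketched as follows. The model assumes that disks 2k and 2k+1 form mirrored pair k and that blocks are striped round-robin over the pairs; this numbering and the function names are hypothetical.

```python
def raid10_locate(block, num_pairs):
    """Map a logical block to (pair index, the two disks of that pair, row)."""
    pair = block % num_pairs
    return pair, (2 * pair, 2 * pair + 1), block // num_pairs


def raid10_survives(failed_disks, num_pairs):
    """The array survives as long as no mirrored pair loses both disks."""
    failed = set(failed_disks)
    return all(not {2 * k, 2 * k + 1} <= failed for k in range(num_pairs))


# Four disks: pair 0 = disks 0 and 1, pair 1 = disks 2 and 3.
print(raid10_locate(5, num_pairs=2))
print(raid10_survives([0, 2], num_pairs=2))   # one disk lost from each pair
print(raid10_survives([0, 1], num_pairs=2))   # both disks of pair 0 lost
```

The last two calls show why RAID 10 tolerates some two-disk failures but not all of them: what matters is which disks fail, not only how many.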

 

Minimum requirement:

1)    RAID 10 requires a minimum of four disks to function, with each mirrored pair consisting of two disks. 

 

Use Cases:

 

RAID 10 is well-suited for applications requiring high performance, data redundancy, and fault tolerance, such as:

1)    Database servers

2)    Virtualization platforms

3)    High-performance computing

4)    Network Attached Storage (NAS) devices

 

 

 


 

 


 
 The factors to be taken into account in choosing a RAID level are:

1)    Performance requirements in terms of the number of I/O operations.

2)    Performance when a disk has failed.

3)    Performance during rebuild.

 

 

 

 

 

 
