Monthly Archives: November 2013

Breathing Underwater

When you are diving underwater, the pressure of the gases in your lungs must be the same as the ambient pressure under water: if it were lower your lungs would implode, and if it were higher they would explode. As you go deeper and deeper, this increased pressure in your lungs forces nitrogen to dissolve into your bloodstream and from there into your tissues.

If you surface too quickly, this nitrogen is released from the tissues in the form of bubbles, and these large bubbles cause a variety of symptoms, most commonly pain in large joints (e.g. shoulders, hips an knees) and itching in the skin. (Bending these joints lessens the pain, giving the condition its colloquial name: “the bends”.) This decompression sickness can be avoided by surfacing slowly (as calculated by decompression tables or dive computers), which allows the nitrogen to leave blood and tissues in a controlled manner.

In order to lessen the risk of the bends, divers can use a different mix of gases, known as nitrox, in their scuba equipment. Nitrox contains less nitrogen (and therefore more oxygen), and this reduces the amount of nitrogen available to dissolve into the blood.

air-nitrox-comparison

The difference between air (L) and nitrox (R). Nitrogen is coloured blue, oxygen is coloured red and other gases (carbon dioxide, etc.) are coloured orange.

For diving below about thirty metres nitrox is not suitable, for reasons best explained by the ideal gas law:

pV=nRT

p is the pressure in your lungs, V is the volume of your lungs, n is the number of gas molecules (or rather the number of moles of gas), R is a constant known as the molar gas constant (R=8.31 \mathrm{J/K/mol} and T is the temperature of the gas. Only the pressure and the number of gas molecules are variable, so as you dive deeper and the pressure increases this means that you will have more and more molecules of each gas in your lungs.

At a depth of fifty metres the pressure is about five times what it is at the surface, and your lungs would therefore contain five times as many nitrogen and oxygen (and carbon dioxide, argon, etc.) molecules as they normally would. This can lead to nitrogen narcosis, an impairment of cognitive function thought to be caused by disruption of nerve impulses. At this depth it’s not enough to simply replace the nitrogen with more oxygen, as in the case of nitrox; the extra oxygen in the lungs will lead to oxygen toxicity, a condition characterised by altered vision, drowsiness and disorientation.

To dive at this depth both the nitrogen and oxygen must be replaced by an inert gas that does not have narcotic effects. In most cases this is helium, producing a gas mix known as trimix. Different trimix “recipes” are used depending on dive depth: a high-oxygen variety for shallow depths (30-60 metres) and a lower-oxygen variety for deeper dives (below sixty metres).

trimix-comparison

Shallow-dive (L) and deep-dive (R) trimix recipes. Nitrogen is shown in blue, oxygen in red and helium in green.

Other breathing gases are also used. During very deep dives (below 150 metres) high-pressure nervous syndrome (also known as helium tremors) is a serious risk, and thus helium is either partially or entirely eliminated, to form hydreliox or hydrox respectively. These gases are highly dangerous, as they contain both hydrogen and oxygen, which forms an explosive mixture.

RAID Types

In computer science, RAID stands for Redundant Array of Independent (or Inexpensive) Disks. A RAID array takes multiple hard drives and treats them as one logical volume – the computer’s BIOS causes your operating system to treat an array of different drives as one single drive.

RAID 0

In RAID 0 data is split across two drives at the block level, in a process called striping. That is, the first part of a file is stored on Drive 1, the second on Drive 2, the third on Drive 1, the fourth on Drive 2, and so on. It offers no redundancy, so if one drive fails the data on both drives is lost, but it does offer improvements in terms of performance. Data can be read and written faster than with one single drive as the first part of a file can be read/written from Drive 1 at the same time as the second part is being read/written from Drive 2.

RAID 0 is most commonly used with two drives of the same size, as the size of the combined volume is limited to twice the size of the smallest drive.

RAID 1

In a RAID 1 setup the data is mirrored on each drive: every block of every file is written identically to each drive (i.e. the data is not striped). This offers the same performance boost as in RAID 0 when reading data, but writing data has to take place on every drive, so is limited by the slowest drive. RAID 1 offers redundancy in that if one drive fails, the data can be recovered from the other still-functioning drives in the array.

RAID 1 is also most commonly used with two drives, but any number of drives can be used in theory. However, there is usually little advantage to this, as it is highly unlikely that more than one drive would fail simultaneously.

RAID 2, RAID 3, RAID 4

RAID 2, RAID 3 and RAID 4 are modifications of RAID 1, but are now obsolete and not commonly used. They stripe data at the bit (RAID 2), byte (RAID 3) or block (RAID 4) level, and also include parity data, which is explained below.

RAID 5

A RAID 5 array must consist of at least three drives. Data is striped across n-1 drives, where n is the number of drives, and parity data (explained below) is stored on the final drive. Thus an array of  n identically sized drives yields a combined volume n-1 times the size of each drive, with the striping providing improved performance and the parity data providing redundancy. Parity data is spread out across all drives, which increases performance because during large file writing operations the RAID controller does not have to wait for a dedicated parity drive (as used in RAID 2, RAID 3 and RAID 4) to become available.

For a given file, blocks are stored on different drives and can therefore be read/written simultaneously, which offers some of the speed improvements of RAID 0 and RAID 1 (this also applies if the blocks of different files are located on different drives).

raid5

A diagram of a RAID 5 array using four drives.

Parity

For each series of bits parity is calculated by running an exclusive OR (XOR) on the data. An XOR takes individual bits and compares them: a 0 and 0 yields 0, a 0 and 1 or a 1 and 0 yield a 1, and a 1 and 1 yields a 0.

Imagine that we have two blocks of data: 10101010 and 11110000. These two blocks are stored on Drive 1 and Drive 2, and then the two blocks are XORed to yield the parity data which is stored on Drive 3.

Drive 1: 10101010
Drive 2: 11110000
Drive 3: 01011010

Now imagine Drive 2 fails:

Drive 1: 10101010
Drive 2: ????????
Drive 3: 01011010

We know to yield the first parity bit (0) the first bit of Drive 1 (1) must have been XORed with a 1; to yield the second parity bit (1) the bit on Drive 2 must have been a 1, and so on.  Thus we can reconstruct the missing data on a new Drive 2. We can also reconstruct missing parity data on a failed drive by simply repeating our original XOR process. However, if two of the three drives fail simultaneously, the data on the two failed drives is lost.

RAID 6

RAID 6 is similar to RAID 5, but distributes parity data across two drives, meaning that it can cope with two drives failing simultaneously and requires a minimum of four drives. RAID 6 is recommended in place of RAID 5 when using high capacity drives in large arrays, as rebuilding an array after a failure puts every other drive in the array under stress, which could cause a second drive to fail.

RAID 10 (aka RAID 1+0)

RAID 10 is  a nested array, combining striping and mirroring (and the advantages of both). In the minimal four-drive case, data is striped (RAID 0) between two mirrored (RAID 1) arrays. That is, the first block is mirrored on Drives 1 and 2, and the second block on Drives 3 and 4. A large RAID 10 array can sustain a large number of simultaneous drive failures, providing that no mirror loses all its drives (e.g. in the four-drive example above Drives 1 and 3, 2 and 3, 1 and 4 or 2 and 4 could fail and no data would be lost).

Other nested RAID arrays, such as RAID 50 (combining RAID 5 and RAID 0), or RAID 60 (combining RAID 6 and RAID 0) are also used. You can also use other configurations of drives, such as JBOD (Just a Bunch Of Disks), SPAN/BIG or MAID (Massive Array of Idle Drives).

Sorting Algorithms

Sorting lists of items into order is an important problem in computer science. What most people don’t realise is that there are many possible ways to do this.

Big O Notation

But before we look at individual algorithms, I need to explain big O notation. Big O notation is a measure of the efficiency or power of a particular algorithm. If an algorithm is O(n) then increasing the number of items to be sorted from 10 to 100 also increases the time taken by the same factor of ten. But if an algorithm is O(log(n)) then increasing the number of items to be sorted from 10 to 100 only doubles the time taken, as log(10) is 1 and log(100) is 2.

Bubble Sort

The bubble sort is the beginners sorting algorithm. It works by comparing each consecutive pair of items in the list and swapping them if they are the wrong way around.

Imagine that we are sorting the following six items: 536, 362, 653, 937, 351, 331.

The first operation compares 536 and 362 and because 362 is less than 536 the two items are swapped. The second operation compares 536 and 653, and because they are already the correct way around no swap takes place.

  1. 536, 362, 653, 937, 351, 331.
  2. 362, 536, 653, 351, 331, 937.
  3. 362, 536, 351, 331, 653, 937.
  4. 362, 351, 331, 536, 653, 937.
  5. 351, 331, 362, 536, 653, 937.
  6. 331, 351, 362, 536, 653, 937.

At this point, after five run-throughs and a total of ten swaps (but many more comparisons) the list is ordered.

The bubble sort is, on average, O(n2). This means that if the number of items increases by a factor of ten from 10 to 100 then the time taken increases by a factor of one hundred. The position of items in the list is particularly important in the bubble sort. Large items at the start of the list (rabbits) quickly move to the end of the list, but small items at the end of the list (turtles) take much longer to move to the correct position.

A modification of the bubble sort, known as the cocktail shaker sort, aims to reduce the effect of turtles by alternating direction on each run, running forwards and then backwards. For this reason it is sometimes known as the bidirectional bubble sort.

Running the list above through the cocktail shaker sort requires one fewer run than the bubble sort, because 331 acts as a turtle in the previous example:

  1. 536, 362, 653, 937, 351, 331.
  2. 362, 536, 653, 351, 331, 937.
  3. 331, 362, 536, 653, 351, 937.
  4. 331, 362, 536, 351, 653, 937.
  5. 331, 351, 362, 536, 653, 937.

Continue reading