Episode 8.01 – Intro to Error Detection


Welcome to the Geek Author series on Computer Organization and Design Fundamentals. I’m David Tarnoff, and in this series, we are working our way through the topics of Computer Organization, Computer Architecture, Digital Design, and Embedded System Design. If you’re interested in the inner workings of a computer, then you’re in the right place. The only background you’ll need for this series is an understanding of integer math, and if possible, a little experience with a programming language such as Java. And one more thing. Due to the computational nature of this episode, you might want to visit the transcript page found at intermation.com to download the episode worksheet.

For as long as humans have been transmitting and storing data, we’ve been trying to control the effects of errors. For example, the analog transmission of images was susceptible to noise, that is, electrical interference picked up as the signal was sent from point A to point B. The errors were not correctable, but we could manage them. Typically, the noise added to the signal, making the signal level higher than it ought to have been for a specific image element. The solution? Some formats inverted the analog signal to lessen the effects of the noise. By inverting the signal, in other words, representing the bright pixels with low levels and the dark pixels with high levels, the noise spikes appeared darker rather than lighter, thus lessening the visual impact of the noise.

On the one hand, digital data is less susceptible to noise. Provided we are able to keep noise from changing the numbers, the data received should be identical to the data transmitted. On the other hand, corrupted digital data can have a far more detrimental effect. A single bit flip in a downloaded executable could render the whole file unusable. Can we determine if a bit has flipped?

Let’s begin by looking at a simple application. Devices that interact with the real world depend on sensors. Systems such as automotive airbags or anti-lock braking systems, where a failure could be disastrous, need to have mechanisms in place to ensure accuracy. Often, two or three independent parallel sensors are used in these systems to determine if the received input is accurate. If two sensors agree, the signal can be trusted. If the sensor signals are different, then we know that one is in error or the data has been corrupted.

A two-input exclusive-OR gate can be used here to indicate when an error has occurred. Remember that the output of a two-input exclusive-OR gate is a logic zero if both inputs are the same and a logic one if the inputs are different. This gives us a simple circuit to detect when two signals, which should be identical, are not and should not be trusted.
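As a rough software sketch of that circuit, the snippet below uses Python's bitwise XOR operator on two one-bit readings. The function name `sensors_disagree` is made up for illustration, and the readings are assumed to already be reduced to a single 0 or 1.

```python
def sensors_disagree(a: int, b: int) -> int:
    """Return 1 if the two one-bit readings differ, 0 if they match (XOR)."""
    return a ^ b


# Print the truth table: the last column is the "do not trust" flag.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, sensors_disagree(a, b))
```

The flag is one only on the two rows where the inputs differ, which matches the exclusive-OR behavior described above.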

That was simple enough. Of course, the problem now lies with determining which sensor is in error. For critical applications, one solution may be to use three sensors in parallel. Sensors A and B are connected to one exclusive-OR gate, and sensors B and C are connected to a second exclusive-OR gate. This arranges our sensors into two groups, with no two sensors belonging to the same set of groups. Sensor A only belongs to the first exclusive-OR “group”, sensor B belongs to both exclusive-OR “groups”, and sensor C belongs only to the second exclusive-OR “group”. Assuming that at most one sensor is wrong at a time, the two gate outputs tell us which one. If both exclusive-OR gates are outputting logic zeros, then no group has an error and all three sensors agree. If the A-B exclusive-OR gate outputs a logic zero, while the B-C exclusive-OR gate outputs a logic one, then group one is okay while group two has a problem. Which sensor is only a member of the second group? Sensor C, which means it disagrees and we need to go with the output from sensor A or B. If the A-B exclusive-OR gate outputs a logic one, while the B-C exclusive-OR gate outputs a logic zero, then group two is okay while group one has a problem. Which sensor is only a member of the first group? Sensor A, which means it disagrees and we need to go with the output from sensor B or C. And what if both exclusive-OR gates output a logic one? Then both groups have a problem. Sensor B is the only one connected to both exclusive-OR gates, so it is in disagreement, and we should go with the reading from sensor A or sensor C.
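The four cases for the three-sensor arrangement can be sketched in a few lines of Python. This is only an illustration: the function name `locate_faulty_sensor` is hypothetical, each reading is assumed to be a single bit, and at most one sensor is assumed to be wrong at a time.

```python
def locate_faulty_sensor(a: int, b: int, c: int):
    """Return (faulty_sensor, trusted_value) for three one-bit readings.

    faulty_sensor is None when all three agree. Assumes at most one
    sensor is in error.
    """
    ab = a ^ b  # first exclusive-OR "group": sensors A and B
    bc = b ^ c  # second exclusive-OR "group": sensors B and C

    if ab == 0 and bc == 0:
        faulty = None   # no group reports a problem
    elif ab == 0 and bc == 1:
        faulty = "C"    # only the second group has a problem
    elif ab == 1 and bc == 0:
        faulty = "A"    # only the first group has a problem
    else:
        faulty = "B"    # both groups have a problem

    # Sensor A can be trusted unless it is the one in error.
    trusted = b if faulty == "A" else a
    return faulty, trusted


print(locate_faulty_sensor(1, 1, 0))
print(locate_faulty_sensor(1, 0, 1))
```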

When we move beyond sensors and into the realm of intelligent devices communicating with digital data, we need to have a more robust system – something that can identify if an error has occurred in the data without having to transmit duplicate data.

One of the simplest and most primitive forms of error detection is to add a single bit called a parity bit to each data element. A parity bit identifies whether the data has an odd or even number of ones. It is considered a poor method of error detection as it is not capable of detecting when an even number of bit inversions have occurred. When combined with other methods of error detection, however, it can improve their overall performance. In addition, an understanding of its behavior will help us correct bit errors later.

There are two primary types of parity: odd and even. Even parity requires the parity bit to be set so that the sum of the ones across both the bits of the data element and the parity bit is an even number. With odd parity, the parity bit is set so that the sum of ones across the bits of the data element and the parity bit is an odd number. When using a digital system incorporating parity, all devices must agree in advance which type of parity they will be using: odd or even.

One of the primary problems with parity is that if two bits are inverted, the parity bit appears to be correct, and the receiving system assumes that the data is error free. That means that parity can only detect an odd number of bit errors.
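To see that blind spot in action, here is a small Python sketch. The byte value and the helper names (`count_ones`, `even_parity_bit`, `passes_even_parity`) are arbitrary choices for illustration, and even parity is assumed.

```python
def count_ones(value: int) -> int:
    """Count the ones in the binary representation of value."""
    return bin(value).count("1")


def even_parity_bit(data: int) -> int:
    """Parity bit that makes the total number of ones even."""
    return count_ones(data) % 2


def passes_even_parity(data: int, parity: int) -> bool:
    """True when the ones across data and parity bit sum to an even number."""
    return (count_ones(data) + parity) % 2 == 0


data = 0b00110101               # arbitrary byte containing four ones
parity = even_parity_bit(data)  # 0, since four is already even

one_flip = data ^ 0b00000001     # invert one bit
two_flips = data ^ 0b00000011    # invert two bits
three_flips = data ^ 0b00000111  # invert three bits

print(passes_even_parity(data, parity))         # True: data is intact
print(passes_even_parity(one_flip, parity))     # False: error detected
print(passes_even_parity(two_flips, parity))    # True: error slips through
print(passes_even_parity(three_flips, parity))  # False: error detected
```

The two-bit case passes the check even though the data has changed, which is exactly the weakness described above.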

Assume that a system uses even parity. If an error has occurred and one of the bits in either the data element or the parity bit has been inverted, then counting the number of ones across the data element and the parity bit results in an odd number. From the information available, the digital system cannot determine which bit was inverted or even if only one bit was inverted. It can only tell that an error has occurred.

For example, a system might store an even parity bit with each data element in memory. Each time it reads data, it checks to make sure that the sum of ones across the data element and the parity bit is even. If it detects that the parity is odd, it knows that an error has occurred. When this happens, the hardware sends a signal to the operating system that the data is corrupted.

Let’s calculate some parity bits. Assume we are using a system that stores an even parity bit along with each eight-bit data element. The first data element we are going to store is the Unicode representation of a capital K, which in binary is 01001011. The binary representation of K contains four ones, an even number. Therefore, the even parity bit should be zero so that the sum of ones across the binary representation of K and the parity bit is even. In the next memory location, we need to store the eight-bit binary representation of the integer 25, which is 00011001. This bit pattern has three ones, an odd number, so the associated even parity bit must be one to make the sum of ones across the data element and parity bit even.
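If you would like to check these two calculations, the following Python sketch reproduces them. The function name `parity_bit` and its `even` parameter are invented for this example. The odd-parity results printed at the end are not part of the worked example and are included only for comparison.

```python
def parity_bit(data: int, even: bool = True) -> int:
    """Return the parity bit for data under even (default) or odd parity."""
    ones = bin(data).count("1")
    bit = ones % 2                   # even parity: 1 only when the count is odd
    return bit if even else bit ^ 1  # odd parity is simply the opposite bit


k = ord("K")  # code point of capital K, which is 75

print(format(k, "08b"), parity_bit(k))    # 01001011 0
print(format(25, "08b"), parity_bit(25))  # 00011001 1

# For comparison only: the same data under odd parity.
print(parity_bit(k, even=False), parity_bit(25, even=False))  # 1 0
```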

Now we’re going to read some data and check to see if an error has occurred using even parity. Assume the first byte we read is 01100001 while its associated parity bit is one. Counting the ones across both the data element and the parity bit gives us four ones. This is an even number, so we assume the data is good. From the next memory location, we read 11011000 with a parity bit of one. Counting the ones from both the data element and the parity bit gives us five ones. This is an odd number, so we know that an error has occurred.
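The same two reads can be checked with a short Python sketch. Here the data is kept as a string of bits so it lines up with the values above. The name `check_even_parity` is hypothetical.

```python
def check_even_parity(data_bits: str, parity: int) -> bool:
    """Return True if the ones across data_bits and parity sum to an even number."""
    total_ones = data_bits.count("1") + parity
    return total_ones % 2 == 0


print(check_even_parity("01100001", 1))  # True: four ones, data assumed good
print(check_even_parity("11011000", 1))  # False: five ones, error detected
```

Keep in mind that a result of True only means no error was detected. It does not prove the data is correct.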

Digital circuitry can be used to detect an error in parity. Remember that the output of an exclusive-OR gate is a logic one if the number of ones at its inputs is odd and a logic zero if the count is even. If we feed all the bits of the data element plus the even parity bit into an exclusive-OR circuit, the output will be one when the parity check fails, indicating the data is in error. To verify data with an odd parity bit, use the exclusive-NOR gate, which will output a one when the number of ones at its inputs is even, meaning the data is in error.
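A software stand-in for that circuitry might look like the Python sketch below, which chains two-input XOR operations across all the bits using `functools.reduce`. The function names are made up for illustration, and the exclusive-NOR is modeled by inverting the XOR result.

```python
from functools import reduce
from operator import xor


def xor_all(bits):
    """XOR every bit together: 1 if the number of ones is odd, else 0."""
    return reduce(xor, bits, 0)


def even_parity_error(data_bits, parity):
    """Exclusive-OR check: 1 means an even-parity error was detected."""
    return xor_all(list(data_bits) + [parity])


def odd_parity_error(data_bits, parity):
    """Exclusive-NOR check: 1 means an odd-parity error was detected."""
    return 1 - xor_all(list(data_bits) + [parity])


# Even parity: 01100001 with parity bit 1 has four ones, so no error.
print(even_parity_error([0, 1, 1, 0, 0, 0, 0, 1], 1))  # 0
# Even parity: 11011000 with parity bit 1 has five ones, so an error.
print(even_parity_error([1, 1, 0, 1, 1, 0, 0, 0], 1))  # 1
# Odd parity: 01001011 with parity bit 1 has five ones, so no error.
print(odd_parity_error([0, 1, 0, 0, 1, 0, 1, 1], 1))   # 0
```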

In our next episode, we are going to take parity to the next level by describing its use in Hamming codes. Hamming codes allow us to correct single bit errors in our data elements by storing the data with multiple parity bits. Hamming codes can also detect when a two-bit error has occurred.

For episode transcripts, worksheets, links, or other podcast notes, please visit us at intermation.com where you will also find links to our Instagram, Twitter, Facebook, and Pinterest pages. Until the next episode, remember that while the scope of what makes a computer is immense, it’s all just ones and zeros.