Fundamentals of block coding
In this essay the basic fundamentals of block coding as a type of forward error correction code, as well as an example of such a code, are examined, in order to highlight the importance of error correction in digital communication systems. In the first part, the theory around error correction codes and types is presented with special emphasis on the block codes, their properties and the problems they encounter. In the second part the most popular block code, Reed-Solomon code, is discussed along with its mathematical formulation and the most common applications that implement it.
Over the past years, there has been an extraordinary development in digital communications especially in the areas of mobile phones, personal computers, satellites, and computer communication. In these digital communication systems, data is represented as a sequence of 0s and 1s. These binary bits are expressed as analog signal waveforms and then transmitted over a communication channel. Communication channels, though, induce interference and noise to the transmitted signal and corrupt it. At the receiver, the corrupted transmitted signal is modulated back to binary bits. The received binary data is an evaluation of the binary data being transmitted. Bit errors may occur because of the transmission and that number of errors depends on the communication channels interference and noise amount.
Channel coding is used in digital communications to protect the digital data and reduce the number of bit errors caused by noise and interference. Channel coding is mostly achieved by adding redundant bits into the transmitted data. These additional bits allow the detection and correction of the bit errors in the received information, thus providing a much more reliable transmission. The cost of using channel coding to protect the transmitted information is a reduction in data transfer rate or an increase in bandwidth.
1. FORWARD ERROR CORRECTION BLOCK CODES
1.1 ERROR DETECTION – CORRECTION
Error detection and correction are methods to make sure that information is transmitted error free, even across unreliable networks or media.
Error detection is the ability to detect errors due to noise, interference or other problems to the communication channel during transmission from the transmitter to the receiver. Error correction is the ability to, furthermore, recreate the initial, error-free information.
There are two basic protocols of channel coding for an error detection-correction system:
Automatic Repeat-reQuest (ARQ): In this protocol, the transmitter, along with the data, sends an error detection code, that the receiver then uses to check if there are errors present and requests retransmission of erroneous data, if found. Usually, this request is implicit. The receiver sends back an acknowledgement of data received correctly, and the transmitter sends again anything not acknowledged by the receiver, as fast as possible.
Forward Error Correction (FEC): In this protocol, the transmitter implements an error-correcting code to the data and sends the coded information. The receiver never sends any messages or requests back to the transmitter. It just decodes what it receives into the “most likely” data. The codes are constructed in a way that it would take a great amount of noise to trick the receiver interpreting the data wrongly.
1.2 FORWARD ERROR CORRECTION (FEC)
As mentioned above, forward error correction is a system of controlling the errors that occur in data transmission, where the sender adds additional information to its messages, also known as error correction code. This gives the receiver the power to detect and correct errors (partially) without requesting additional data from the transmitter. This means that the receiver has no real-time communication with the sender, thus cannot verify whether a block of data was received correctly or not. So, the receiver must decide about the received transmission and try to either repair it or report an alarm.
The advantage of forward error correction is that a channel back to the sender is not needed and retransmission of data is usually avoided (at the expense, of course, of higher bandwidth requirements). Therefore, forward error correction is used in cases where retransmissions are rather costly or even impossible to be made. Specifically, FEC data is usually implemented to mass storage devices, in order to be protected against corruption to the stored data.
However, forward error connection techniques add a heavy burden on the channel by adding redundant data and delay. Also, many forward error correction methods do not quite respond to the actual environment and the burden is there whether needed or not. Another great disadvantage is the lower data transfer rate. However, FEC methods reduce the requirements for power variety. For the same amount of power, a lower error rate can be achieved. The communication in this situation remains simple and the receiver alone has the responsibility of error detection and correction. The sender complexity is avoided and is now entirely assigned to the receiver.
Forward error correction devices are usually placed close to the receiver, in the first step of digital processing of an analog signal that has been received. In other words, forward error correction systems are often a necessary part of the analog to digital signal conversion operation that also contain digital mapping and demapping, or line coding and decoding. Many forward error correction coders can also produce a bit-error rate (BER) signal that can be used as feedback to optimize the received analog circuits. Software controlled algorithms, such as the Viterbi decoder, can receive analog data, and output digital data.
The maximum number of errors a forward error correction system can correct is initially defined by the design of the code, so different FEC codes are suitable for different situations.
The three main types of forward error correction codes are:
Block codes that work on fixed length blocks (packets) of symbols or bits with a predefined size. Block codes can often be decoded in polynomial time to their block size.
Convolutional codes that work on symbol or bit streams of indeterminate size. They are usually decoded with the Viterbi algorithm, though other algorithms are often used as well. Viterbi algorithm allows infinite optimal decoding efficiency by increasing limited length of the convolutional code, but at the cost of greatly increasing complexity. A convolutional code can be transformed into a block code, if needed.
Interleaving codes that have alleviating properties for fading channels and work well combined with the other two types of forward error correction coding.
1.3 BLOCK CODING
Block coding was the first type of channel coding implemented in early mobile communication systems. There are many types of block coding, but among the most used ones the most important is Reed-Solomon code, that is presented in the second part of the coursework, because of its extensive use in famous applications. Hamming, Golay, Multidimensional parity and BCH codes are other well-known examples of classical block coding.
The main feature of block coding is that it is a fixed size channel code (in contrary to source coding schemes such as Huffman coders, and channel coding techniques as convolutional coding). Using a preset algorithm, block coders take a k-digit information word, S and transform it into an n-digit codeword, C(s). The block size of such a code will be n. This block is examined at the receiver, which then decides about the validity of the sequence it received.
1.3.2 FORMAL TYPE
As mentioned above, block codes encode strings taken from an alphabet set S into codewords by encoding each letter of S independently. Suppose (k1, k2,, km) is a sequence of natural numbers that each one less than |S| . If S=s1,s2,,sn and a specific word W is written as W = sk1 sk2 skn , then the codeword that represents W, that is to say C(W), is:
C(W) = C(sk1) C(sk2) C (skm)
1.3.3 HAMMING DISTANCE
Hamming Distance is a rather significant parameter in block coding. In continuous variables, distance is measured as length, angle or vector. In the binary field, distance between two binary words, is measured by the Hamming distance. Hamming distance is the number of different bits between two binary sequences with the same size. It, basically, is a measure of how apart binary objects are. For example, the Hamming distance between the sequences: 101 and 001 is 1 and between the sequences: 1010100 and 0011001 is 4.
Hamming distance is a variable of great importance and usefulness in block coding. The knowledge of Hamming distance can determine the capability of a block code to detect and correct errors. The maximum number of errors a block code can detect is: t = dmin 1, where dmin is the Hamming distance of the codewords. A code with dmin = 3, can detect 1 or 2 bit errors. So the Hamming distance of a block code is preferred to be as high as possible since it directly effects the codes ability to detect bit errors. This also means that in order to have a big Hamming distance, codewords need to be larger, which leads to additional overhead and reduced data bit rate.
After detection, the number of errors that a block code can correct is given by: t(int) = (dmin 1)/2
1.3.4 PROBLEMS IN BLOCK CODING
Block codes are constrained by the sphere packing problem that has been quite significant in the last years. This is easy to picture in two dimensions. For example, if someone takes some pennies flat on the table and push them together, the result will be a hexagon pattern like a bee’s nest. Block coding, though, relies on more dimensions which cannot be visualized so easily. The famous Golay code, for instance, applied in deep space communications uses 24 dimensions. If used as a binary code (which very often it is,) the dimensions refer to the size of the codeword as specified above.
The theory of block coding uses the N-dimensional sphere model. For instance, what number of pennies can be packed into a circle on a tabletop or in 3-dimensional model, what number of marbles can be packed into a globe. Its all about the codes choice. Hexagon packing, for example, in a rectangular box will leave the four corners empty. Greater number of dimensions means smaller percentage of empty spaces, until eventually at a certain number the packing uses all the available space. These codes are called perfect codes and there are very few of them.
The number of a single codewords neighbors is another detail which is usually overlooked in block coding. Back to the pennies example again, first pennies are packed in a rectangular grid. Each single penny will have four direct neighbors (and another four at the four corners that are farther away). In the hexagon formation, each single penny will have six direct neighbors. In the same way, in three and four dimensions there will be twelve and twenty-four neighbors, respectively. Thus, increasing the number of dimensions, the close neighbors increase rapidly. This results in that noise finds numerous ways to make the receiver choose a neighbor, hence an error. This is a fundamental constraint of block coding, and coding in general. It may be more difficult to cause an error to one neighbor, but the number of neighbors can be so big that the probability of total error actually suffers.