Modified Huffman Coding Schemes Information Technology Essay
CHAPTER 2
Document compression is a digital process. Therefore, before compressing the data, information about the document should be known. The CCITT algorithms deal with a page of size 8.5 x 11 inches. The page is divided into horizontal and vertical lines. The horizontal lines are known as scan lines.
Dots per inch and pixels per inch are two standards for image resolution. An 8.5 x 11 inch page is 1728 x 2200 pixels. One scan line is 1728 pixels long. The normal resolution is 200 x 100 dpi and the fine resolution is 200 x 200 dpi.
Figure 2.1
Each pixel is represented by 1 bit, so the number of bits that form the above page is 3,801,600. Sending this data uncompressed through an ISDN line takes approximately 7 min. If the resolution of the page is increased, the transmission time increases as well. Thus it is not practical to transfer every bit of the binary page information as it is. The most commonly used encoding for CCITT compression is Modified Huffman (MH), which is supported by all the fax compression techniques. The other options are Modified READ (MR) and Modified Modified READ (MMR). The following table gives an overview of these encoding/decoding techniques.
Characteristics | MH | MR | MMR
Compression efficiency | Good | Better | Best
Standard | T.4 | T.4 | T.6
Dimension | 1-D | 2-D | 2-D (extended)
Algorithm | Huffman and RLE | Similarities between two successive lines | More efficient MR
Table 2.1: Comparison of MH, MR and MMR
2.1.1 Modified Huffman
Fax pages contain many runs of white and black pixels, which makes run-length encoding (RLE) efficient for representing them. The run lengths are then coded with Huffman coding. Combining RLE with Huffman coding gives a simple and efficient algorithm known as Modified Huffman (MH). The run lengths are coded with terminating and makeup codes.
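The run-length step can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function name `run_lengths` and the pixel values (0 for white, 1 for black) are assumptions made for the example, and the rule that every line starts with a white run (of length 0 if the first pixel is black) follows the T.4 convention.

```python
from itertools import groupby

WHITE, BLACK = 0, 1  # assumed pixel values for this sketch


def run_lengths(scan_line):
    """Return the run lengths of a scan line, alternating white, black, white...

    The list always starts with a white run, so a line that begins with a
    black pixel gets a leading white run of length 0.
    """
    runs = [len(list(group)) for _, group in groupby(scan_line)]
    if scan_line and scan_line[0] == BLACK:
        runs.insert(0, 0)
    return runs


# A 1728-pixel line: 4 black, 3 white, 2 black, then 1719 white pixels.
line = [BLACK] * 4 + [WHITE] * 3 + [BLACK] * 2 + [WHITE] * 1719
print(run_lengths(line))  # [0, 4, 3, 2, 1719]
```

The five run lengths replace 1728 individual pixel values, and it is these run lengths that MH then maps to code words.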
MH coding uses specified tables for terminating and makeup codes. Terminating codes represent shorter runs while makeup codes represent longer runs. White and black pixel runs from 0 to 63 are represented by terminating codes alone. Runs greater than 63 are represented with makeup codes, which are defined for multiples of 64 pixels, followed by a terminating code. These tables are given in chapter 4. A scan line with a long run gives a makeup code for the largest multiple of 64 that is less than or equal to the pixel run, and the difference is then given by the terminating code. The example in Figure 2.2 shows how this works.
There are three different types of bit pattern in MH coding:
Pixel information (data)
Fill
EOL
The term Fill refers to extra 0 bits that are added to a short data line to fill the remaining space. The Fill pattern brings a highly compressed scan line up to a preferred minimum scan line time (MSLT), which makes it complete and transmittable. Consider a transmission rate of 4800 bps with an MSLT of 10 ms, so the minimum number of bits per scan line is 48. A 1728-pixel scan line is compressed to 31 data bits, which with 12 EOL bits gives 43 bits in total. The remaining space is filled by 5 Fill bits, as follows.
Scan line (1728 pixels): 0W | 4B | 3W | 2B | 1719W | EOL
Bit pattern: 00110101 | 011 | 1000 | 11 | 011000 01011000 | 00000 (Fill) | 000000000001 (EOL)
31 data bits + 12 EOL bits = 43 bits; with 5 Fill bits the line is 48 bits long
Figure 2.2 Modified Huffman structure
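The arithmetic of Figure 2.2 can be checked with a short sketch. The code tables below are only the handful of T.4 entries that this one example needs, and the names `encode_run` and `encode_line` are hypothetical helpers, not part of any library. Runs longer than 2623 pixels are not handled here, since a scan line is at most 1728 pixels wide.

```python
# Subset of the T.4 code tables: only the entries used in Figure 2.2.
WHITE_CODES = {0: "00110101", 3: "1000", 55: "01011000", 1664: "011000"}
BLACK_CODES = {2: "11", 4: "011"}
EOL = "0" * 11 + "1"  # eleven 0 bits followed by a 1


def encode_run(length, table):
    """Encode one run as an optional makeup code plus a terminating code."""
    bits = ""
    if length >= 64:
        makeup = (length // 64) * 64  # largest multiple of 64 <= length
        bits += table[makeup]
        length -= makeup
    return bits + table[length]


def encode_line(runs, bit_rate=4800, mslt_ms=10):
    """Encode alternating white/black runs and work out the Fill bits."""
    data = ""
    for i, run in enumerate(runs):
        table = WHITE_CODES if i % 2 == 0 else BLACK_CODES
        data += encode_run(run, table)
    min_bits = bit_rate * mslt_ms // 1000  # 4800 bps * 10 ms = 48 bits
    fill = "0" * max(0, min_bits - len(data) - len(EOL))
    return data, fill, EOL


data, fill, eol = encode_line([0, 4, 3, 2, 1719])
print(len(data), len(fill), len(eol))  # 31 5 12
print(data + fill + eol)               # the complete 48-bit line
```

The 1719-pixel white run is split into the makeup code for 1664 and the terminating code for 55, which is where the 14-bit group in the figure comes from.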
In addition to these, another special bit pattern used in MH coding is EOL. EOL patterns serve several identification functions:
An EOL at the start of a scan line indicates the start of the scan line.
An EOL at the end of a scan line consists of eleven 0s followed by a 1. It stops an error in one scan line from propagating into other scan lines, so each line is coded independently.
At the end of each page an RTC signal is given, which consists of six EOL patterns and identifies the end of the page.
Modified Read
MR is also known as Modified Relative Element Address Designate (READ). MR exploits the correlation between successive lines. Because of the high resolution of the images, a very high percentage of the transitions in two consecutive lines differ by only a single pixel. Using this phenomenon, instead of coding each scan line on its own as done in MH, MR takes a reference line into account and then encodes each scan line that follows relative to it. In fact it is appropriate to say that MR is a more complex extension of the MH algorithm.
MR encoding uses both the MH and MR coding techniques. The reference line is encoded using MH and the subsequent lines are encoded using MR until the next reference line appears. When the next reference line occurs is decided by a parameter K, whose value depends on the resolution of the image.
MR is a 2-dimensional algorithm. The value of K defines the number of lines that use the 2-dimensional technique, which is K-1 lines. The reference line, however, is coded with the 1-dimensional MH algorithm. For a normal-resolution image K is set to 2, so a reference line is encoded every second scan line. For a higher resolution K is set to 4, so a reference line is MH encoded every fourth line, making the coding more complex but more compressed. The following figure shows the scan lines for K set to 2 and to 4.
Normal resolution, K = 2 (1 MH line, 1 MR line): MH | MR | MH | MR
Higher resolution, K = 4 (1 MH line, 3 MR lines): MH | MR | MR | MR | MH | MR | MR | MR
Figure 2.3 Modified Read structure
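The role of K can be illustrated with a few lines of code. This is only a sketch of the line schedule shown in the figure; `line_modes` is a hypothetical helper name.

```python
def line_modes(num_lines, k):
    """Return the coding mode of each scan line for a given K parameter.

    Every K-th line, starting with the first, is a 1-D (MH) reference line;
    the K - 1 lines after it are coded two-dimensionally (MR).
    """
    return ["MH" if i % k == 0 else "MR" for i in range(num_lines)]


print(line_modes(4, 2))  # ['MH', 'MR', 'MH', 'MR']
print(line_modes(8, 4))  # ['MH', 'MR', 'MR', 'MR', 'MH', 'MR', 'MR', 'MR']
```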
The advantage of a lower K (normal resolution) over a higher K is that error propagation into subsequent lines is reduced, because fewer scan lines depend on each reference line. However, in MR encoding the value of K can be set as high as 24.
The change between two subsequent lines, i.e. the reference line and the next scan line coded by MR, can be shown as follows.
Reference line: b1 b2
Scan line: a0 a1 a2
Figure 2.4 MR 2-D coding
The elements shown in the figure above are described as follows:
a0 is the starting changing element on the coding line, which is also the reference for the next changing elements
a1 is the first transition on the coding line to the right of a0
a2 is the second transition on the coding line, to the right of a1
b1 is the first transition on the reference line to the right of a0 that has the opposite color to a0
b2 is the next transition on the reference line to the right of b1
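The definitions above can be made concrete with a small sketch. The function names are hypothetical, pixels are assumed to be 0 for white and 1 for black, and the start-of-line value a0 = -1 (an imaginary white pixel just before the line) follows the T.4 convention rather than anything stated in this chapter.

```python
WHITE, BLACK = 0, 1  # assumed pixel values for this sketch


def changing_elements(line):
    """Positions where the pixel color differs from the previous pixel.

    An imaginary white pixel is assumed just before the start of the line.
    """
    previous = WHITE
    positions = []
    for i, pixel in enumerate(line):
        if pixel != previous:
            positions.append(i)
        previous = pixel
    return positions


def find_b1_b2(reference, a0, a0_color):
    """Locate b1 and b2 on the reference line for a given a0.

    b1 is the first changing element to the right of a0 whose color is
    opposite to a0; b2 is the next changing element after b1. A position
    equal to len(reference) means "no such element" (just past the line end).
    """
    end = len(reference)
    changes = changing_elements(reference)
    for idx, pos in enumerate(changes):
        if pos > a0 and reference[pos] != a0_color:
            b2 = changes[idx + 1] if idx + 1 < len(changes) else end
            return pos, b2
    return end, end


reference = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0]
coding = [0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
print(changing_elements(coding))          # [2, 7]  -> a1 = 2, a2 = 7
print(find_b1_b2(reference, -1, WHITE))   # (3, 6)  -> b1 = 3, b2 = 6
```

In this example a1 and b1 are only one pixel apart, which is the kind of small shift between successive lines that MR is designed to exploit.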
In the above figure the reference line is coded with MH while the next scan line is coded with MR. It can be seen that there are only very minor changes between the two scan lines. MR takes advantage of this and encodes only the changing elements a0, a1 and a2 instead of the complete scan line. There are three functional encoding modes in MR, which decide how to code these changing elements of the scan line with respect to the reference line. These modes are
Pass mode
Vertical mode
Horizontal mode
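As a rough preview, the choice between the three modes can be sketched as below. The selection rule used here (pass mode when b2 lies to the left of a1, vertical mode when a1 is within three pixels of b1, horizontal mode otherwise) is taken from Recommendation T.4 and is an assumption as far as this chapter is concerned; `select_mode` is a hypothetical name.

```python
def select_mode(a1, b1, b2):
    """Pick the 2-D coding mode from the positions of a1, b1 and b2."""
    if b2 < a1:
        return "pass"        # the reference run ends before a1 is reached
    if abs(a1 - b1) <= 3:
        return "vertical"    # a1 is coded relative to b1
    return "horizontal"      # fall back to run lengths a0a1 and a1a2


print(select_mode(a1=2, b1=3, b2=6))    # vertical
print(select_mode(a1=10, b1=3, b2=6))   # pass
print(select_mode(a1=12, b1=3, b2=20))  # horizontal
```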
It is these different modes that make MR a more complex algorithm. The MR functional modes are discussed in detail in chapter 3, after which one can refer back to this part to understand it completely. The structure of MR is given as follows.
EOL+1 | Data 1-D | Fill | EOL+0 | Data 2-D | EOL+1 | Data 1-D | Fill | EOL+0 | Data 2-D | EOL+1 | EOL+1 | EOL+1 | EOL+1 | EOL+1 | EOL+1
K = 2
EOL+1: MH coding of next line
EOL+0: MR coding of next line
Fill: extra 0 bits
RTC: end of page with 6 EOLs
Figure 2.5 Structure of MR data in a page
Modified Modified Read
ITU-T Recommendation T.6 defines the Modified Modified READ (MMR) encoding algorithm. MMR is an upgraded version of MR. Both are 2-dimensional algorithms, but MMR is an extended version of the 2-dimensional scheme. The fundamentals of MMR are the same as those of MR apart from a few minor changes to the algorithm; the modes of MR, i.e. pass mode, vertical mode and horizontal mode, are the same for MMR encoding.
The major change in MMR with respect to MR is the K parameter. The MMR algorithm does not use the K parameter or a recurring reference line. Instead, MMR uses an imaginary scan line consisting of all white pixels as the first line at the start of each page, and 2-dimensional lines follow until the end of the page. This introduced all-white scan line serves as the reference line, as in MR.
Error propagation in MMR is very likely because all the scan lines are coded in connection with one another. Thus ECM must be enabled for MMR. ECM guarantees error-free MMR data. MMR therefore does not require any EOL; however, an EOFB (end of facsimile block) is required at the end of the page, which plays the same role as RTC in MH. The organization of data in MMR and the EOFB bit sequence are given as follows.
Data 2-D | Data 2-D | ... | Data 2-D | EOFB (scan lines of page)
EOFB bit sequence: 000000000001 000000000001
Figure 2.6 Scan lines in MMR page
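The chaining of reference lines in MMR can be sketched as follows. This is an illustration only: `reference_pairs` is a hypothetical helper, white is assumed to be 0, and the actual 2-D coding of each pair is left out.

```python
WHITE = 0                 # assumed pixel value for white
EOL = "0" * 11 + "1"
EOFB = EOL + EOL          # end of facsimile block: two EOL patterns


def reference_pairs(page):
    """Yield (reference_line, coding_line) pairs for MMR-style coding.

    The first line is coded against an imaginary all-white line; every
    later line is coded against the line directly above it.
    """
    if not page:
        return
    reference = [WHITE] * len(page[0])
    for line in page:
        yield reference, line
        reference = line


page = [[0, 1, 1, 0], [0, 1, 1, 1], [0, 0, 1, 1]]
for reference, line in reference_pairs(page):
    print(reference, "->", line)
print(len(EOFB))  # 24
```

Because every line depends on the one before it, a single corrupted line affects all lines that follow on the page, which is why the text ties MMR to ECM.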
Tagged Image File Format
Tagged Image File Format (TIFF) is purely a graphical format, i.e. pixelated, bitmap or rasterized. TIFF is a common file format that is found in most imaging programs. The discussion here mainly covers TIFF revision 6.0, which is the latest. Revision 6.0 includes all the specifications of the earlier versions with a few additions. TIFF is flexible and powerful but at the same time more complex. The extensibility of TIFF makes it more difficult to design and understand. TIFF is, as its name says, a tagged file that holds information about the image. The TIFF structure is organized into three parts:
Image file header (IFH)
Bit map data (black and white pixels)
Image File Directory(IFD)
IFH | Bitmap data | IFD | EOB
Figure 2.7 File organization of TIFF
Consider an example of three TIFF file structures. These three structures hold the same data in three possible arrangements. The IFH, or header, of the TIFF comes first in all three arrangements. In the first arrangement the IFDs are written first, followed by the image data, which is efficient if the IFD data needs to be read quickly. In the second structure each IFD is followed by its particular image, which is the most common internal structure of a TIFF. In the last example the image data is followed by its IFDs. This structure is applicable if the image data is available before the IFDs.
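Structure 1: Header | IFD 0 | IFD 1 | ... | IFD n | Image 0 | Image 1 | ... | Image n
Structure 2: Header | IFD 0 | Image 0 | IFD 1 | Image 1 | ... | IFD n | Image n
Structure 3: Header | Image 0 | Image 1 | ... | Image n | IFD 0 | IFD 1 | ... | IFD n
Figure 2.8 Different TIFF structures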
Image File Header
A TIFF file header is an 8-byte structure at the start of a TIFF file. The bytes are organized in the following order:
The first two bytes define the byte order, which is either little endian (II) or big endian (MM). Little endian byte order runs from the least significant byte to the most significant byte, and big endian is vice versa.
II = 4949H
MM = 4D4DH
The third and fourth bytes hold the value 42 (002AH), which identifies the file as a TIFF file.
The next four bytes hold the offset value for the IFD. The IFD may be at any location after the header but must begin on a word boundary.
Byte order (2 bytes) | 42 (2 bytes) | Byte offset for IFD (4 bytes)
Figure 2.9 IFH structure
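Reading the header can be sketched with the standard `struct` module. `parse_tiff_header` is a hypothetical helper written for this illustration, and the sample bytes are built by hand rather than taken from a real file.

```python
import struct


def parse_tiff_header(header):
    """Parse the 8-byte TIFF image file header.

    Returns (byte_order, first_ifd_offset), where byte_order is "<" for
    little endian (II) or ">" for big endian (MM), as used by struct.
    """
    if len(header) < 8:
        raise ValueError("a TIFF header is 8 bytes long")
    order_mark = header[:2]
    if order_mark == b"II":
        byte_order = "<"
    elif order_mark == b"MM":
        byte_order = ">"
    else:
        raise ValueError("unknown byte order mark")
    # Bytes 2-3: the number 42; bytes 4-7: offset of the first IFD.
    magic, ifd_offset = struct.unpack(byte_order + "HI", header[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file")
    return byte_order, ifd_offset


sample = b"II" + struct.pack("<HI", 42, 8)
print(parse_tiff_header(sample))  # ('<', 8)
```

The byte order found in the first two bytes has to be applied to every later multi-byte value in the file, which is why it is returned alongside the offset.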
Image File Directory
An image file directory (IFD) is a collection of 12-byte entries that hold information about the image, including the color, type of compression, length, width, physical dimension, location of the data and other such information.
An IFD starts with a 2-byte tag counter, which holds the number of entries in the IFD. It is followed by the 12-byte entries and, after the last entry, by four bytes that hold the offset of the next IFD, or four 0 bytes if there is none. Each IFD entry has the following format:
The first two bytes of the entry hold the identification field. This field tells which characteristic of the image the entry describes. It is also known as the tag.
The next two bytes give the type of the entry, i.e. short, long etc.
The next four bytes hold the count of values of the defined type.
The last four bytes hold the value offset, which is always an even number, so the value it points to starts on a word boundary. This offset can point anywhere in the file, even after the image data.
The IFD entries are sorted in ascending order according to the tag number. Thus a TIFF field is a logical entity which consists of a tag number and its value.
Tag entry count (2 bytes) | Tag 0 (12 bytes) | Tag 1 (12 bytes) | ... | Tag n (12 bytes) | Next IFD offset or null bytes (4 bytes)
Figure 2.10 IFD structure
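The layout in Figure 2.10 can be walked with a short sketch. `parse_ifd` is a hypothetical helper; the last four bytes of each entry are returned as a raw number without deciding whether they hold the value itself or an offset to it. The sample uses tag 256 (ImageWidth) and tag 259 (Compression, where 3 stands for CCITT T.4) from the TIFF 6.0 specification, both of type 3 (SHORT).

```python
import struct


def parse_ifd(data, offset, byte_order="<"):
    """Parse one IFD starting at `offset` in `data`.

    Returns (entries, next_ifd_offset); each entry is a tuple
    (tag, field_type, count, value_or_offset).
    """
    (count,) = struct.unpack_from(byte_order + "H", data, offset)
    entries = []
    pos = offset + 2
    for _ in range(count):
        # 2-byte tag, 2-byte type, 4-byte count, 4-byte value or offset
        entries.append(struct.unpack_from(byte_order + "HHII", data, pos))
        pos += 12
    (next_ifd,) = struct.unpack_from(byte_order + "I", data, pos)
    return entries, next_ifd


sample = (
    struct.pack("<H", 2)
    + struct.pack("<HHII", 256, 3, 1, 1728)  # ImageWidth = 1728
    + struct.pack("<HHII", 259, 3, 1, 3)     # Compression = 3 (CCITT T.4)
    + struct.pack("<I", 0)                   # no further IFD
)
entries, next_ifd = parse_ifd(sample, 0)
print(entries)   # [(256, 3, 1, 1728), (259, 3, 1, 3)]
print(next_ifd)  # 0
```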
The IFD is the basic tag structure that holds information about the image data in a complete TIFF file. The data is either found in the IFD or retrieved from an offset location pointed to by the IFD. Using offsets to other locations instead of fixed positions makes TIFF more complex. Offset values appear in three places in a TIFF file:
The last four bytes of the header, which indicate the position of the first IFD.
The last four bytes of the IFD, which give the offset of the next IFD.
The last four bytes of a tag entry, which may contain an offset to the data it represents or possibly the data itself.
Figure 2.11
CCITT Encoding
This type of compression is used for facsimile and document imaging files. It is a lossless type of image compression. The CCITT (International Telegraph and Telephone Consultative Committee) is an organization which provides standards for communication protocols for sending black and white images over telephone or other low data rate lines. The standards given by the ITU are T.4 and T.6. These standards are the CCITT Group 3 and Group 4 compression methods respectively. The CCITT group compression algorithms are designed specifically for encoding 1-bit images. CCITT is a non-adaptive compression algorithm: the CCITT algorithms use fixed tables. The coded values in these tables were taken from a reference set of documents containing both text and graphics.
The CCITT algorithms typically compress an image to well under a quarter of its original size. The compression ratio achieved for a 200 x 200 dpi image with Group 3 is 5:1 to 8:1, which increases with Group 4 to up to 15:1 at the same image resolution. However, the complexity of the algorithms increases with the compression ratio. Thus Group 4 is much more complex than Group 3.
The CCITT algorithms are specifically designed for typed or handwritten scanned images. Other images, with a composition different from that targeted by CCITT, will have different runs of black and white pixels, so compressing such bi-level images will not give the required results. The compression will be minimal, or the compressed image may even be larger than the original. Such images can achieve a ratio of 3:1 at most, which is very low given the time taken by the compression algorithm.
The CCITT has three algorithms for compressing bi-level images:
Group 3 one dimensional
Group 3 two dimensional
Group 4 two dimensional
When Group 3 one-dimensional was first designed, it was targeted at the bi-level, black and white data processed by fax machines. Group 3 encoding and decoding tends to be fast and has a reputation for a very high compression ratio. Error handling in Group 3 is done by the algorithm itself and no extra hardware is required. This is done with special data handled inside the Group 3 decoder. Group 3 makes use of the MH algorithm for encoding.
MMR encoding tends to be much more efficient. Hence Group 4 achieves a much higher compression than Group 3, producing almost half the size of Group 3 data, but it is a much more time-consuming algorithm. Its complexity is much higher than that of Group 3, and it has no error detection of its own, so errors propagate; a special hardware configuration is required for this purpose. This makes it a poor choice for image transfer protocols.
Document imaging systems that store these images have adopted CCITT compression algorithms to save disk space. Even in an age of good processing speeds and plentiful memory, CCITT encoding is still needed for printing and viewing data, as done with Adobe files. The transmission of data through modems with lower data rates also still requires these algorithms.
Group 3 One Dimensional (G31D)
The main features of G31D are as follows:
G31D is a variation of Huffman encoding known as Modified Huffman encoding.
G31D encodes a bi-level image of black and white pixels, with black pixels given by 1s and white pixels by 0s in the bitmap.
G31D encodes the length of each run of same-colored pixels in a scan line with variable-length binary codes.
The variable-length binary codes are taken from predefined tables, which are separate for black and white pixels.
The variable-length code tables are defined in the ITU-T T.4 and T.6 specifications. These tables were determined by taking a number of typed and handwritten documents, which were statistically analyzed to show the average frequency of these bi-level pixel runs. Run lengths occurring more frequently were assigned short codes while the others were given longer codes.
As G31D is an MH coding scheme, which was explained earlier in the chapter, we give some examples of how the coding is carried out for longer runs of the same pixel. The code tables have continuous values from 0 to 63, which are single terminating codes, while greater values are coded by adding makeup codes for the same pixel color, for run lengths that are not in the tables themselves. Runs from 64 to 2623 have one makeup code and one terminating code, while runs greater than 2623 have multiple makeup codes. Hence we have two types of tables: one from 0 to 63 and the other from 64 to 2560. The latter table is selected by statistical analysis as explained above.
Consider a run of 20 black pixels. It is below the 63 mark of the table, so we look up the value 20 in the black pixel table, which is 00001101000. This is the terminating code for the 20 black pixel run; at 11 bits it is about half the size of the original 20 bits. Thus a ratio of about 2:1 is achieved.
Let us take a white run of 120 pixels, which is greater than 63 and is not present among the statistically selected pixel runs. Here we need a makeup code and a terminating code. The pixel run is broken into 64, which is the largest makeup value in the table that fits this run, and 56, which together give the 120 pixel run:
120 = 64 + 56
The coded value for 64 is 11011
The coded value for 56 is 01011001
Hence 120 is coded as the makeup code 11011 followed by the terminating code 01011001, as shown in figure 2.11a.
Now consider a bigger run of black pixels, 8800. This can be given as a sum of four makeup codes and one terminating code:
8800 = 2560 + 2560 + 2560 + 1088 + 32
which is 000000011111, 000000011111, 000000011111, 0000001110101 and 000001101010,
as shown in figure 2.11b.
Makeup code: 11011 | Terminating code: 01011001
Figure 2.11a Makeup and terminating codes for 120
Makeup: 000000011111 | Makeup: 000000011111 | Makeup: 000000011111 | Makeup: 0000001110101 | Terminating: 000001101010
Figure 2.11b Makeup and terminating codes for 8800
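The splitting of a run into makeup and terminating lengths can be sketched in general form. The tables below contain only the T.4 entries needed for the three example runs (20 black, 120 white and 8800 black), and `split_run` and `encode_run` are hypothetical helper names; a full encoder would need the complete tables from chapter 4.

```python
# Subset of the T.4 code tables: only the entries needed for the examples.
WHITE_CODES = {56: "01011001", 64: "11011"}
BLACK_CODES = {20: "00001101000", 32: "000001101010", 1088: "0000001110101"}
SHARED_MAKEUP = {2560: "000000011111"}  # extended makeup code, both colors


def split_run(length):
    """Split a run into makeup lengths followed by one terminating length."""
    parts = []
    while length > 2623:              # 2560 + 63 is the longest two-code run
        parts.append(2560)
        length -= 2560
    if length >= 64:
        makeup = (length // 64) * 64  # largest multiple of 64 <= length
        parts.append(makeup)
        length -= makeup
    parts.append(length)              # terminating length, 0..63
    return parts


def encode_run(length, table):
    """Return the list of code words for one run of the given color."""
    return [SHARED_MAKEUP.get(p) or table[p] for p in split_run(length)]


for color, run, table in [("black", 20, BLACK_CODES),
                          ("white", 120, WHITE_CODES),
                          ("black", 8800, BLACK_CODES)]:
    codes = encode_run(run, table)
    print(color, run, split_run(run), sum(len(c) for c in codes), "bits")
# black 20 [20] 11 bits
# white 120 [64, 56] 13 bits
# black 8800 [2560, 2560, 2560, 1088, 32] 61 bits
```

The longer the run, the better the ratio: 20 pixels shrink to 11 bits, while 8800 pixels shrink to 61 bits.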