BMP Images With LSB Embedding

Advances in computer technology have opened new horizons for human progress while also creating new criminal opportunities. These computer-based crimes are difficult to handle because of their rapid growth. Information hiding in images is a relatively new forensics challenge. Information can be embedded in images using different techniques, of which the Least Significant Bit (LSB) technique is the most common. It has several variants, such as LSB matching and LSB replacement. Windows bitmap (BMP) images are the most common carriers of hidden information because of their uncompressed, lossless format. Steganalysis techniques are used to find stego images and recover the hidden messages. In this paper we focus on steganalysis of 32 bit BMP images and their forensic analysis from a software programmer's perspective. A forensics model for steganalysis is presented and validated through steganalysis software developed during this research. Our model consists of format comparison, metadata comparison, message to image size ratio, mean to message size ratio, standard deviation of color values to message size ratio, unique colors analysis and hash value differences.

Keywords: Computer forensics, Digital evidence, Steganalysis, Information hiding in images

Introduction

In the modern era, computer technology has a strong impact on every aspect of daily life. Criminals have spread a complex and widespread web across computer technology. To deal with these crimes and to prevent them in future, obtaining legally acceptable evidence is the core objective of computer forensics.

The term “Computer Forensics” was used as early as 1991, in the first training session held by the International Association of Computer Investigative Specialists (IACIS).

Forensic means suitable as evidence in court, and computer evidence simply means computer-related findings that are admissible in a court of law. Computer forensics involves evidence of a complex nature, which makes it weaker in the eyes of the law [1]. There are many inter-organizational and legal issues involved in an investigation. Evidence collected from a computer device must be legally acceptable, which requires a dedicated, flexible and comprehensive forensics framework.

Information hiding is not a new technique among criminals. In Greek history, information was hidden by tattooing it on a shaved head and then letting the hair grow back to cover it.

Hiding information in this way is called steganography. In the modern digital world it is applied to digital images, where the hidden information may be text, signs or secret images within images.

There is a related but different term, cryptography. Cryptography is the art of making information unreadable. Its main component is a ‘key’, which is required to encrypt or decrypt information. In cryptography the information is not hidden; it is just unreadable. In steganography, on the other hand, the information is hidden but not unreadable. The worst scenario for forensics experts is when hidden information is also encrypted.

Neil and Sushil [2] in 1998 suggested ways to overwrite the image bits to destroy the embedded hidden message. In computer forensics the destruction of data is not permitted, because the ultimate goal is not to destroy data but to preserve it for presentation before a court of law [3][4][5].

Ahmad [6] in 2007 stated that it is very difficult for a forensic analyst to detect and retrieve a hidden message without destroying it.

A. Westfeld and A. Pfitzmann [7] in 2000 suggested a method based on statistical analysis of the pairs of values that are exchanged during message embedding.

Andrew D. Ker [8] in 2005 proposed using the histogram characteristic function with adjacency in grayscale images.

Andreas W. and Andreas P. [9] in 1999 proposed a method to attack all the steganographic tools. They used a statistical approach to perform these attacks.

Tao Zhang and Xijian Ping [10] in 2003 proposed a method that uses the histogram as a measure of the weak correlation between the LSB plane and the remaining bit planes.

Shreelekshmi R, Wilscy M and C E Veni Madhavan [11] in 2010 proposed an improved LSB replacement technique that works without knowing the cover image, on the basis of statistics of the image file.

R. Chandramouli, N M [12] in 2005 proposed a method of false alarm detection of steganographic images in terms of the number of bits.

Xiaoyi Yu, Tieniu Tan and Yunhong Wang [13] in 2005 proposed a method based on estimation and detection.

N. Meghanathan and L. Nayak [14] in 2010 summarized the concept of steganalysis in audio, video and images.

R. Chandramouli [15] in 2002 proposed a mathematical formulation of steganalysis covering active and passive approaches.

Roshidi D [16] in 2006 proposed a method for detecting an image's capability of carrying a hidden message, based on the file size and other image attributes.

Ismail A, Mehdi K, Nasir M and Bulent S [17] in 2005 proposed a technique that uses binary similarity measures and the correlation between them to detect a steganographic image.

Problem defined

BMP images consist of pixel matrices. In a 32 bit BMP image, each pixel contains 4 bytes (32 bits) of data and represents one color value. Colors are formed by combining three base colors, Red, Green and Blue, each represented by one byte. The fourth byte holds the alpha value. This color mechanism for a 32 bit pixel is described using RGBAX length notation. In total, 2^n color combinations are possible, where n is the number of bits per pixel.

The following table shows the bit placement in the four bytes of one pixel in 32 bit BMP images. Alpha takes 5 bits of the last byte, while the R, G and B values each occupy one full byte (8 bits).

Colors    Bits 31-24    Bits 23-16    Bits 15-8    Bits 7-0

Alpha     00011111      00000000      00000000     00000000

Red       00000000      11111111      00000000     00000000

Green     00000000      00000000      11111111     00000000

Blue      00000000      00000000      00000000     11111111

Table 1: Bitwise color allocation using RGBAX notation in 32 bit BMP.

To store a hidden message, each bit of each character is stored separately in the last bit, called the Least Significant Bit (LSB), of each of the four bytes of a pixel.

10111110

00001001

01001001

01010010

01001001

10111110

01010010

00001001

The LSB encoding technique is widely used because a change in the LSB does not noticeably change the picture quality. To store one character of 8 bits, we need 8 bytes of the image in which we want to hide the message. To recover the hidden message, the reverse process is applied.
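As a minimal sketch of the embedding and its reverse process, the Python below hides one character in the eight sample bytes listed above. The function names and the most-significant-bit-first ordering are assumptions made for illustration; the paper does not specify a bit order.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Hide each message bit in the least significant bit of one cover byte."""
    # Assumption: bits are written most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("message needs 8 cover bytes per character")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the message bit
    return bytes(stego)


def extract_lsb(stego: bytes, n_chars: int) -> bytes:
    """Reverse process: collect 8 LSBs per character, most significant bit first."""
    out = bytearray()
    for c in range(n_chars):
        value = 0
        for byte in stego[8 * c: 8 * c + 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)


# The eight sample bytes shown in the text, used here as a tiny cover.
cover = bytes([0b10111110, 0b00001001, 0b01001001, 0b01010010,
               0b01001001, 0b10111110, 0b01010010, 0b00001001])
stego = embed_lsb(cover, b"A")
print(extract_lsb(stego, 1))  # prints b'A'
```

Each cover byte changes by at most 1, which is why the visual effect on the image is negligible.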

To calculate the maximum possible size of a hidden message, we have to look at the format structure of 32 bit BMP images:

14 bytes of BMP file header

40 bytes of the popular Windows DIB header

Pixel matrix containing color values

Image size = 14 bytes of file header + 40 bytes of Windows DIB header + (image width * image height * 4) bytes of pixel matrix

Maximum message size = (image size - 54) / 8
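The capacity calculation can be sketched in a few lines of Python. This assumes uncompressed 32 bit pixel data, so that the pixel matrix occupies width * height * 4 bytes, and one hidden bit per image byte; the function names are illustrative only.

```python
FILE_HEADER_BYTES = 14   # BMP file header
DIB_HEADER_BYTES = 40    # Windows DIB header
BYTES_PER_PIXEL = 4      # 32 bits per pixel


def image_size_bytes(width: int, height: int) -> int:
    """Total file size of an uncompressed 32 bit BMP with the two headers above."""
    return FILE_HEADER_BYTES + DIB_HEADER_BYTES + width * height * BYTES_PER_PIXEL


def max_message_chars(width: int, height: int) -> int:
    """Each 8 bit character needs 8 pixel bytes, one LSB per byte."""
    pixel_bytes = image_size_bytes(width, height) - 54
    return pixel_bytes // 8


print(max_message_chars(100, 100))  # 40000 pixel bytes / 8 = 5000 characters
```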

There are different techniques for embedding and retrieving hidden messages. LSB matching is the simplest of them.

In LSB matching, least significant bits are simply replaced by the message bits without any other bit manipulation.

If (LSB != Message Bit)
{
    LSB = Message Bit;
}
Else { move to next bit; }

LSB replacement is a stronger technique than LSB matching. In LSB replacement, bits are not just replaced with the message bits but are also changed arithmetically using a message key. This amounts to encryption plus steganography: the key alters the hidden bits before they are stored, so the technique offers two layers of protection.

If (LSB != Message Bit)
{
    LSB = Message Bit;
    LSB = LSB XOR Key Message Bit;
}
Else { move to next bit; }
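A hedged Python sketch of this keyed variant follows. It assumes the key is a byte string whose bits repeat cyclically over the message, which the paper does not specify, and all function names are hypothetical.

```python
from itertools import cycle


def to_bits(data: bytes) -> list[int]:
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def from_bits(bits: list[int]) -> bytes:
    """Pack a list of bits (MSB first) back into bytes, ignoring a trailing partial byte."""
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def embed_keyed(cover: bytes, message: bytes, key: bytes) -> bytes:
    """Store (message bit XOR key bit) in the LSB of successive cover bytes."""
    if not key:
        raise ValueError("key must not be empty")
    msg_bits = to_bits(message)
    if len(msg_bits) > len(cover):
        raise ValueError("cover too small for message")
    stego = bytearray(cover)
    # Assumption: the key bits repeat cyclically for the length of the message.
    for i, (m, k) in enumerate(zip(msg_bits, cycle(to_bits(key)))):
        stego[i] = (stego[i] & 0xFE) | (m ^ k)
    return bytes(stego)


def extract_keyed(stego: bytes, n_chars: int, key: bytes) -> bytes:
    """Read the LSBs back and undo the XOR with the same repeating key bits."""
    lsbs = [b & 1 for b in stego[:8 * n_chars]]
    return from_bits([s ^ k for s, k in zip(lsbs, cycle(to_bits(key)))])
```

With the correct key, `extract_keyed(embed_keyed(cover, b"hi", b"k"), 2, b"k")` returns the original message; with a different key the recovered bytes are scrambled.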

LSB random is the most powerful technique: the hidden message is not only embedded and encrypted, but its position is also randomized and scattered across the image.

If (LSB != Message Bit)
{
    LSB = Message Bit;
    LSB = LSB XOR Key Message Bit;
    Put LSB in Key generated Byte location;
}
Else { move to next bit; }
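The key-driven positioning step can be illustrated as below. This sketch assumes the key seeds a pseudo-random generator that selects distinct byte positions; that is one plausible reading of "key generated byte location", not necessarily the scheme used by the authors. The XOR step is left out to keep the focus on scattering, and the function names are hypothetical.

```python
import random


def key_positions(key: str, cover_len: int, n_bits: int) -> list[int]:
    """Derive n_bits distinct byte positions from the key (assumed scheme)."""
    rng = random.Random(key)  # the same key always yields the same positions
    return rng.sample(range(cover_len), n_bits)


def embed_scattered(cover: bytes, message: bytes, key: str) -> bytes:
    """Write each message bit into the LSB of a key-selected cover byte."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    positions = key_positions(key, len(cover), len(bits))
    stego = bytearray(cover)
    for pos, bit in zip(positions, bits):
        stego[pos] = (stego[pos] & 0xFE) | bit
    return bytes(stego)


def extract_scattered(stego: bytes, n_chars: int, key: str) -> bytes:
    """Regenerate the same positions from the key and read the LSBs in order."""
    positions = key_positions(key, len(stego), 8 * n_chars)
    out = bytearray()
    for c in range(n_chars):
        value = 0
        for pos in positions[8 * c: 8 * c + 8]:
            value = (value << 1) | (stego[pos] & 1)
        out.append(value)
    return bytes(out)
```

Without the key, an analyst cannot tell which bytes carry message bits or in what order to read them.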

In this technique both bit replacement and LSB positioning are driven by the key. If the key is unknown, then forensic experts must rely on the other aspects of image steganalysis, because the key cannot be regenerated by any means unless it is already known.

Formulation of solution:

Detecting the possibility of a hidden message is the greatest challenge in forensic steganalysis. Our model comprises two levels of steganalysis:

Detection of steganographic images in 32 bit BMP format

Un-hiding the hidden message

For the first level of forensic steganalysis, a 6-step process was adopted.

Get the exact copy of the image

In computer forensics, the first step in the forensic process is to preserve the actual suspected object or system. Hence, in steganalysis of images, the actual image must be preserved and all operations must be performed on an exact copy of the steganographic picture.

In pseudo code:

Bitmap imageCopy = new Bitmap(originalImage);
SendToSteganalysisProcess(imageCopy);
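One way to realize this step at the file level is sketched below: copy the suspect file byte for byte and confirm that the copy has the same digest as the original before any analysis starts. The function names and the choice to verify with MD5 are assumptions for illustration.

```python
import hashlib
import shutil


def md5_of_file(path: str) -> str:
    """Return the MD5 digest of a file as 32 hexadecimal digits."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(original_path: str, copy_path: str) -> str:
    """Copy the suspect image byte for byte and confirm the copy is identical."""
    shutil.copyfile(original_path, copy_path)
    original_hash = md5_of_file(original_path)
    if md5_of_file(copy_path) != original_hash:
        raise IOError("working copy differs from the original image")
    return original_hash
```

All later steps would then read only from `copy_path`, leaving the original untouched.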

Check format vulnerability

There are many popular digital picture formats, such as JPEG, GIF and BMP. Some picture formats are compressed and some are not. In compressed picture formats, one cannot hide a message securely. For example, JPEG uses the Discrete Cosine Transform (DCT) compression technique. This technique works as a signature when it comes to steganography: when a slight change is made to the data of a JPEG image, a signature format comparison easily detects that there is a hidden message inside.

In pseudo code we write:

Switch (image format)
{
    Bmp:  output (“high possibility of steganography”);
    Jpeg: output (“low possibility of steganography”);
}

BMP is a format that uses a pixel matrix, which means the data is present in raw bit form. In 32 bit BMP images, no compression technique is used. Another perspective is that in the JPEG compressing and uncompressing process some data is lost, which is why JPEG is a lossy image compression format.
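The format check can be sketched in Python by reading the leading signature bytes of the file rather than trusting the file extension. The signatures used here (`BM` for BMP, `FF D8 FF` for JPEG, `GIF87a`/`GIF89a` for GIF) are the standard ones; the function names and the wording of the risk labels are illustrative.

```python
def detect_format(header: bytes) -> str:
    """Identify the image format from the first bytes of the file."""
    if header.startswith(b"BM"):
        return "bmp"
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if header[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    return "unknown"


def steganography_possibility(header: bytes) -> str:
    """Map the detected format to the coarse rating used in the pseudo code."""
    ratings = {
        "bmp": "high possibility of steganography",
        "jpeg": "low possibility of steganography",
    }
    return ratings.get(detect_format(header), "not assessed")


print(steganography_possibility(b"BM" + bytes(52)))
```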

Compare meta data

Metadata from the BMP file header and the Windows DIB header is very important in detecting whether the file is steganographic or not. Image source information such as owner name, creation date, last modified date and image size is very important in judging whether the image has been changed to embed a hidden message.
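As a rough sketch of the header part of this comparison, the Python below unpacks the 14 byte file header and the 40 byte Windows DIB header and flags sizes that do not agree with each other. The specific consistency checks are assumptions chosen for illustration; owner and date information is not stored in these headers and would have to come from the file system.

```python
import struct


def read_bmp_headers(data: bytes) -> dict:
    """Unpack the 14 byte file header and the 40 byte DIB header (little-endian)."""
    signature, file_size, _res1, _res2, pixel_offset = struct.unpack_from("<2sIHHI", data, 0)
    (dib_size, width, height, planes, bpp, compression, image_size,
     _xppm, _yppm, _clr_used, _clr_important) = struct.unpack_from("<IiiHHIIiiII", data, 14)
    return {"signature": signature, "file_size": file_size, "pixel_offset": pixel_offset,
            "dib_size": dib_size, "width": width, "height": height,
            "bpp": bpp, "compression": compression, "image_size": image_size}


def header_mismatches(data: bytes) -> list[str]:
    """List inconsistencies between the header fields and the actual file contents."""
    h = read_bmp_headers(data)
    problems = []
    if h["signature"] != b"BM":
        problems.append("signature is not BM")
    if h["file_size"] != len(data):
        problems.append("declared file size differs from actual size")
    if h["bpp"] == 32:
        expected = h["pixel_offset"] + h["width"] * abs(h["height"]) * 4
        if expected != len(data):
            problems.append("pixel data length does not match width and height")
    return problems
```

An empty list means the header fields are self-consistent; it does not by itself rule out LSB embedding, which leaves these fields unchanged.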

Color noise calculation

RGB colors generate 2^n colors in 32 bit BMP images, where n is the number of bits per pixel (bpp). This means 2^32 pixel color combinations are possible. Every image, however, contains only a small number of unique colors. In high quality 32 bit BMP images, the ratio of unique colors to the total number of pixels lies between 1:2 and 1:6.
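The unique color ratio is straightforward to compute from the raw pixel bytes. The sketch below treats every group of 4 bytes as one color; the function name is hypothetical, and the tiny example only illustrates that a single LSB change creates a new color.

```python
def unique_color_ratio(pixel_data: bytes) -> float:
    """Ratio of distinct 4 byte colors to the total number of pixels."""
    n_pixels = len(pixel_data) // 4
    if n_pixels == 0:
        raise ValueError("no complete pixels in data")
    colors = {pixel_data[i:i + 4] for i in range(0, n_pixels * 4, 4)}
    return len(colors) / n_pixels


flat = bytes([10, 20, 30, 255]) * 8          # eight identical pixels
touched = bytearray(flat)
touched[0] ^= 1                              # flip one LSB, as embedding might
print(unique_color_ratio(flat))              # 0.125 (1 color in 8 pixels)
print(unique_color_ratio(bytes(touched)))    # 0.25  (2 colors in 8 pixels)
```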

Hash value check

Hash values are used to authenticate the originality of an object or system. The Message-Digest Algorithm 5 (MD5) is the most widely used hash algorithm for verifying the authenticity of software and files. An MD5 hash value is a string of 32 hexadecimal digits that acts as a fingerprint of the entire bit content of the file. A slight change in the binary data results in a very different hash value.
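The sensitivity of the hash to a single bit can be checked with Python's standard `hashlib` module. The byte strings below are made-up stand-ins for an image before and after embedding.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the data as 32 hexadecimal digits."""
    return hashlib.md5(data).hexdigest()


original = bytes(range(64))      # stand-in for the original image bytes
altered = bytearray(original)
altered[10] ^= 1                 # flip a single least significant bit

print(md5_hex(original))
print(md5_hex(bytes(altered)))   # a completely different digest
```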

Standard deviation / Mean / Median value analysis

In statistics, the median is the value that separates the higher half of a sample from the lower half. In images, the color values form the sample, and a change in them results in a different median. The following formula gives the standard deviation when all sample values are equally probable:

\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}

Here, x_i stands for an LSB-carrying byte, N is the finite number of bytes, and \mu is their mean. The data set has a finite number of values, and each bit has the same probability of being a bit of the hidden message.
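A small sketch of these statistics, assuming the population form of the standard deviation (every byte weighted equally) and using Python's `statistics` module. The function name and the sample values are illustrative, not data from the experiments.

```python
from statistics import fmean, median, pstdev


def color_statistics(pixel_data: bytes) -> tuple[float, float, float]:
    """Return (mean, median, population standard deviation) of the byte values."""
    values = list(pixel_data)
    return fmean(values), median(values), pstdev(values)


sample = bytes([2, 4, 4, 4, 5, 5, 7, 9])
mean, mid, sd = color_statistics(sample)
print(mean, mid, sd)  # 5.0 4.5 2.0
```

Comparing these values between a suspect image and a known clean version gives the differences the model relies on.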

Increasing the message embedding capacity decreases steganographic security. For example, a message of 200 characters is almost twice as likely to be detected as a message of 100 characters. This means the message stego points are directly proportional to the message size.

Steganalysis and steganography have almost the same relationship as decryption and encryption. Decryption is a reactive concept: encryption techniques come first and decryption follows. Similarly, in steganography images are embedded with hidden messages, and steganalysis follows. There is no better defense than using a signature compression algorithm, such as the one in the JPEG format, which makes hiding messages difficult if not impossible.

To un-hide a secret hidden message we implemented a different technique that combines all the steganalysis approaches with some improved processes. Our technique, named “ZABsteg”, is a combination of the LSB matching, replacement and random algorithms along with image transformation techniques.

Figure 1: Steganographic key message in rotated cover image

In this technique the suspected image is transformed (rotated and flipped) in all four directions, and then the LSB matching, LSB replacement and LSB random algorithms are all applied to each result. The key message most likely resides in the starting bits of the image, and the image may have been transformed before or after message embedding.

Loop until transform image ends
{
    If (LSB != Message Bit)
    {
        LSB = Message Bit;
        LSB = LSB XOR Key Message Bit;
        Put LSB in Key generated Byte location;
    }
    Else { move to next bit; }
}
Loop end;
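The transformation loop can be sketched as follows. This is not the ZABsteg implementation: it only shows the idea of reading the LSB stream from every rotation and flip of the pixel matrix, using plain LSB reading and leaving out the XOR and key position steps. The pixel matrix is assumed to be a list of rows, each row a list of 4 byte pixels, and all function names are hypothetical.

```python
def rotate90(matrix):
    """Rotate a row-major pixel matrix 90 degrees clockwise."""
    return [list(row) for row in zip(*matrix[::-1])]


def flip_horizontal(matrix):
    """Mirror every row left to right."""
    return [list(row[::-1]) for row in matrix]


def orientations(matrix):
    """Yield the 4 rotations of the matrix, each with and without a flip (8 in total)."""
    current = [list(row) for row in matrix]
    for _ in range(4):
        yield current
        yield flip_horizontal(current)
        current = rotate90(current)


def lsb_bytes(matrix, n_chars):
    """Read up to n_chars characters from the LSBs of the pixel bytes, row by row."""
    bits = [byte & 1 for row in matrix for pixel in row for byte in pixel]
    out = bytearray()
    for start in range(0, min(len(bits), 8 * n_chars) - 7, 8):
        value = 0
        for bit in bits[start:start + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def candidate_messages(matrix, n_chars):
    """Collect the LSB reading of every orientation for later inspection."""
    return [lsb_bytes(view, n_chars) for view in orientations(matrix)]
```

If the cover image was rotated after embedding, one of the eight candidates restores the original byte order, so the message appears in that candidate's reading.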

Validation

MD5 hash value results have shown that, if the hash value of the original image is known, the difference between the two hash values makes it obvious that the image has been altered by the steganographic process.

2b4932b85b1f29bb832546f9c9c1c599

MD5 hash value before steganographic process

These values are not affected by the size of the hidden message; even a change in a single pixel changes the MD5 hash value.

514d821bc7e64e9e2d775f4aca5b9009

MD5 hash value after steganographic process

Before the start of the steganalysis process, these values alert a forensic expert that something has been altered, and that the alteration may be a steganographic embedding of a hidden message.

Mean values of the pixel matrices have shown that, as the length of the hidden message increases, the mean values decrease. This helps in detection: when mean values are lower than expected, there is a higher probability that a hidden message is present.

By computing the standard deviation of the color matrix data sample, it was experimentally observed that the standard deviation increases as the size of the message increases.

Noise in the image affects the values of the mean and standard deviation. Noise is calculated from the number of unique pixels in the image and the total size of the image. On the other hand, the size of the image is not affected by the size of the hidden message. The same image was used in this regard, and messages of different sizes were embedded inside it. The experiment shows that increasing or decreasing the message size has no effect on the size of that 32 bit BMP image.

Interpretation of Results

Hash values provide an initial indication of the authenticity of image files, especially when the original image's hash value is known.

The size of the message does not change the size of the original image, because the image height, width and bits per pixel values are not changed in the embedding process. The message size in characters is always less than the total number of bytes of pixel data (the 54 bytes of header are not included).

The mean of the pixel array decreases with the noise produced by embedded hidden messages.

The standard deviation calculated over the color samples decreases as the hidden image size increases.

Preserving the original image protects the forensic result from any change that may occur during the steganalysis process, so the authenticity of the original suspected image remains intact.

Conclusions

The implemented process model gives very good results in making steganalysis accurate. All steps in the steganalysis process, from original image preservation to hash value analysis, help in determining whether an image has a hidden message or not. ZABsteg, which is the combination of LSB matching, LSB replacement and LSB random, is an effective technique for un-hiding the hidden message. Mean and standard deviation calculations on the color values in the pixel matrix give us the probability of the presence of an embedded message.

Future Directions

There are many future possibilities for research in forensic steganalysis. A lot of research is needed on retrieving hidden messages. These research areas are:

Retrieving a hidden message embedded in more than one image file.

Retrieving a hidden message from color combinations instead of color bit values.

Retrieving a hidden message from color patterns.

Creating compressed image formats with signature of bitmap pixel data.
