The History Of Reverse Engineering Information Technology Essay
Reverse engineering most probably started with computer games for DOS (disk operating system) machines. The player's aim was to keep full lives and weapons in order to finish the final stage of the game. Reverse engineering came into the picture as a way to find the memory locations where the number of lives and weapons were stored and to modify the values held there, so that the player could change the values and get through the final stage. That is why memory cheating tools, such as memory editors, came into existence.
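The search-and-modify idea described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the "game memory" is simulated with a plain int array, and the class name MemoryScanSketch, the helper names scan and narrow, and all the values are hypothetical. A real memory tool would read another process's memory through operating-system facilities that are not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the game's memory is simulated with an int array.
class MemoryScanSketch {

    // Return every address (array index) whose current value equals the target.
    static List<Integer> scan(int[] memory, int target) {
        List<Integer> hits = new ArrayList<>();
        for (int addr = 0; addr < memory.length; addr++) {
            if (memory[addr] == target) {
                hits.add(addr);
            }
        }
        return hits;
    }

    // Keep only the earlier candidates that now hold the new value.
    static List<Integer> narrow(int[] memory, List<Integer> candidates, int target) {
        List<Integer> hits = new ArrayList<>();
        for (int addr : candidates) {
            if (memory[addr] == target) {
                hits.add(addr);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumption for the demo: address 3 holds the lives counter.
        int[] memory = {7, 3, 0, 3, 12, 3};
        List<Integer> candidates = scan(memory, 3);   // several addresses hold 3

        memory[3] = 2;                                // the player loses a life
        candidates = narrow(memory, candidates, 2);   // only one candidate changed to 2

        memory[candidates.get(0)] = 99;               // overwrite the lives counter
        System.out.println("lives address = " + candidates.get(0)
                + ", value = " + memory[3]);
    }
}
```

Repeating the scan after the value changes is what narrows many candidate addresses down to the one that really stores the lives.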
Reverse Engineering:-
Reverse engineering is the process of understanding particular aspects of a program, with the following aims:
To identify the components of the system and the interrelationships between them.
To enhance the components of the system and improve the performance and scalability of the system or subsystem.
Software reverse engineering is a technique that converts the machine code of a program (the string of 0s and 1s sent to the processor) back into programming-language statements, that is, source code. It is done to learn how particular parts of the program perform particular operations, in order to improve the program's functionality, to fix bugs, or to find malicious blocks of statements in the software, if any. Historically, reverse engineering was practised on machines in older industries, but it is now frequently applied to computer hardware and software. Through reverse engineering, important content such as data formats, the algorithms the programmer used to implement the software, and the ideas of the programmer or company can be revealed to a third party, violating security and privacy.
“Reverse engineering is evolving as a major link in the software lifecycle, but its growth is hampered by confusion” (Elliot J. Chikofsky & James H. Cross II, Jan 1990).
Reverse engineering is generally carried out to improve the quality of a product or to study competitors' products.
Forward engineering is the process of moving from high-level abstractions, that is, from the initial requirements stage (objectives, constraints and a proper solution to the problem), through logical, implementation-independent designs (the specification of the solution) to the final product, i.e. the implementation (coding and testing). Reverse engineering is the process of moving from the final product back towards the initial requirements stage in order to understand the system logically: why a particular function or action is performed. Once the system is understood logically, its flaws and errors can be rectified and its functionality improved even when the source code of the application is not available. Reverse engineering techniques evolved for this purpose.
Fig 1: Reverse engineering and related processes are transformations between or within abstraction levels, represented here in terms of life cycle phases. (Elliot J. Chikofsky & James H. Cross II, Jan 1990)
Reverse engineering in and of itself does not mean changing the subject system or developing a new system based on the existing one. It is a process of examining and understanding the program or software, not of replicating or changing it. Reverse engineering covers a very broad range of activities: starting from the existing implementation, it recreates or recaptures the design ideas and extracts the actual requirements of the existing system. Design recovery is the most vital subset of reverse engineering. In design recovery, domain knowledge, external information, and deduction or fuzzy reasoning are added to the observations of the subject system in order to find high-level abstractions of the system that cannot normally be obtained by observing the system directly. According to Ted Biggerstaff: “design recovery recreates design abstractions from a combination of code, existing design documentation (if available), personal experience, and general knowledge about problem and application domains. Design recovery must reproduce all of the information required for a person to fully understand what a program does, how it does it, why it does it, and so forth. Thus, it deals with a far wider range of information than found in conventional software-engineering representation of code.” (T.J. Biggerstaff, 1989).
Re-engineering, also termed renovation or reclamation, is the examination and alteration of the subject system to reconstitute it in a new form, followed by the implementation of that new form. Re-engineering involves some form of reverse engineering, to obtain a high-level abstraction of the existing system, followed by forward engineering. It may include changes to meet new requirements that were not previously implemented in the system. Re-engineering is not a supertype of forward engineering and reverse engineering, but it makes use of both.
Objectives:-
The primary goal of reverse engineering is to increase the overall comprehensibility of the system, for both maintenance and new development.
Cope with complexity. To cope with the complexity and sheer volume of systems, we have to develop better methods, i.e. automated support. To extract the relevant information, reverse engineering methods and tools should be combined with CASE environments, so that decision makers can control the process and the product as the system evolves.
Generate alternative views. Comprehension aids such as graphical representations have long been accepted. However, creating and maintaining them is becoming difficult. Reverse engineering facilitates the generation or regeneration of graphical representations in other forms. While many designers work from a single type of diagram, such as data flow diagrams, reverse engineering tools can produce other graphical representations, such as control flow diagrams, entity relationship diagrams and structure charts, to aid the review and verification process.
Identify side effects. Both a haphazard initial design and successive modifications to the system can lead to unintended ramifications and side effects that affect system performance. Reverse engineering can provide observations beyond those available from a forward-engineering perspective, which helps us to resolve such ramifications and anomalies before users report them as bugs.
Component reuse. Software reusability is becoming an essential part of developing new products in the software field. Reverse engineering can help to detect candidates for reusable components in the present system.
Recover lost information. The continuous evolution of a long-lived system leads to loss of information about its design. The “design recovery” techniques of reverse engineering are used to preserve the old information about the system design.
Many reverse engineering tools try to extract the structure of legacy systems with the intention of passing this information to software engineers so that they can re-engineer or reuse the existing components.
Code reverse engineering:-
During the evolution of software, many changes are applied to the code: to add new functionality, to rectify defects, and to enhance the system's performance or quality. For systems with poor documentation, the code is the only reliable source of information about the system. As a result, the process of reverse engineering focuses on understanding the code.
Thus reverse engineering has good and bad ends.
Obfuscation:-
Java provides platform independence, so that software programs run on any platform. All programs are compiled to an intermediate code format, the class file format. A class file contains a very large amount of information about the program's methods, variables and constants, enough to carry out reverse engineering. Suppose a company develops software in Java and sells the product to another organisation in intermediate code format, without handing over the original source. The organisation that buys the software can simply modify it by applying reverse engineering techniques, violating the security and privacy of the company that authored it. This reverse engineering can be carried out by software developers using automated tools and decompilers. Java byte code can be decompiled easily, which makes reverse engineering easier in Java.
In a programming context, obfuscation means making program code more difficult to read and understand, for the security and privacy of the software. Decompilers can easily extract the source code from the compiled code, so keeping the code secret is practically impossible. The use of obfuscators has therefore grown rapidly, in order to put an effective smoke screen around the code. Code obfuscation is one of the most prominent methods of protecting Java code. It makes the program difficult to understand, so that the code is more resistant to reverse engineering.
There are several obfuscation techniques to prevent Java byte code from being decompiled.
For example, consider a set of class files, S, that becomes another set of class files, S’, when passed through an obfuscator. The class files in S and S’ are different, but they produce the same output.
Example:-
class OHello {
    public OHello() {
        int num=1;
    }
    public String gHello(String hname){
        return hname;
    }
}
When the above code is passed through a simple obfuscator (such as KlassMaster), the following code is generated.
class aa {
    public static boolean aa;
    public aa() {
        int aa=1;
    }
    public String aa(String ba){
        return ba;
    }
}
In the code above, the class name OHello has been changed to aa and the method name gHello has also been changed to aa. A program full of aa is more difficult to read than one that uses OHello, so the reverse engineer can interpret and understand less of it. This is just a simple example that renames the class, its variables and its methods.
Obfuscation techniques:-
One way for an obfuscator to obfuscate a program is to replace a symbol in a class file with an illegal string. The replacement might be a keyword such as private or, even worse, a string such as ***.
Another technique, which obfuscators use to target specific decompilers (Mocha and Jode), is to insert a bad instruction into the code.
As an example with a bad instruction, take the following original code (decompiled):
Method void main(java.lang.String[])
0 new #4
3 invokespecial #10
6 return
After obfuscation the code is as follows (names are left unchanged to keep the example simple):
Method void main(java.lang.String[])
0 new #4
3 invokespecial #10
6 return
7 pop
In the routine above, a pop instruction has been added after the return instruction. Normally the last instruction of a method is a return instruction, but here a pop follows it. The pop can never be reached, so the behaviour of the program does not change; the extra instruction is intended to make the targeted decompilers fail on the routine.
Layout obfuscation:-
Layout obfuscation deals with changing the layout structure of the program. It is done by 2 basic methods:
Renaming the identifiers.
Removing the debugging information.
Both make the program code less informative to reverse engineers. Layout obfuscation techniques are one-way transformations, such as renaming identifiers to random symbols and removing comments, unused methods and debugging information. Reverse engineers can still understand code obfuscated in this way, but doing so raises the cost of reverse engineering. Layout obfuscation techniques are the most commonly used form of code obfuscation, and almost all Java obfuscators use them.
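The identifier-renaming step can be illustrated with a minimal sketch. The class RenamerSketch and its method names are hypothetical and are not taken from any real obfuscator. The sketch assumes the identifiers have already been extracted from the program and simply hands out short, meaningless names in order. Unlike the aa example earlier, it gives each identifier a distinct name, which keeps the sketch simple.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of identifier renaming (layout obfuscation).
class RenamerSketch {
    // Remembers which meaningless name was handed out for each identifier.
    private final Map<String, String> mapping = new LinkedHashMap<>();

    // Turn a counter into a two-letter name: 0 -> "aa", 1 -> "ab", 26 -> "ba".
    static String shortName(int n) {
        char first = (char) ('a' + (n / 26) % 26);
        char second = (char) ('a' + n % 26);
        return "" + first + second;
    }

    // Each distinct identifier gets the next short name; repeats reuse the old one.
    String rename(String identifier) {
        String existing = mapping.get(identifier);
        if (existing != null) {
            return existing;
        }
        String fresh = shortName(mapping.size());
        mapping.put(identifier, fresh);
        return fresh;
    }

    public static void main(String[] args) {
        RenamerSketch renamer = new RenamerSketch();
        for (String id : new String[] {"OHello", "gHello", "hname", "OHello"}) {
            System.out.println(id + " -> " + renamer.rename(id));
        }
    }
}
```

The mapping is one-way in practice: once the original names are discarded, nothing in the renamed program allows them to be recovered.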
Control obfuscation:-
Control obfuscation changes the control flow of the program. It is one of the easiest techniques to apply, and it makes it hard for the reverse engineer to find out what the code actually does. For example, consider a program that contains a method A(). A new method called A_Dummy() is created, and the call to A() in the program is replaced by:
If(Predicate)
A_Dummy()
Else
A();
End;
The predicate is constructed so that it always evaluates to false, so the routine A() is always executed. The reverse engineer cannot easily tell which branch is the real one, which makes the code harder to understand.
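A runnable version of this pattern might look like the sketch below. The predicate chosen here is an assumption made for illustration: the product of two consecutive integers is always even, so the test for an odd product is always false, even when int arithmetic overflows. The names OpaquePredicateSketch, a and aDummy are hypothetical.

```java
// Hypothetical sketch of control obfuscation with an always-false predicate.
class OpaquePredicateSketch {

    // x * (x + 1) is a product of consecutive ints, so it is always even
    // (also after overflow); the comparison below is therefore always false.
    static boolean predicate(int x) {
        return (x * (x + 1)) % 2 != 0;
    }

    // The real routine, corresponding to A() in the text.
    static String a() {
        return "real work";
    }

    // The decoy routine, corresponding to A_Dummy() in the text.
    static String aDummy() {
        return "decoy";
    }

    // The obfuscated call site: the decoy branch is never taken.
    static String run(int x) {
        if (predicate(x)) {
            return aDummy();
        } else {
            return a();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(41)); // prints "real work"
    }
}
```

A reader who does not spot the arithmetic identity has to treat both branches as live code, which is the extra cost the technique aims for.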
Data Obfuscation:-
Data obfuscation mainly deals with breaking up the data structures used in the program and encrypting the literals. This includes changing the inheritance relations, restructuring the arrays, making the variable names constant, and so on. In this way data obfuscation affects the data structures of the program and makes it much harder to recover the original source code.
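Encrypting a literal can be sketched as follows. This is a toy illustration, not a recommended scheme: it hides a string with a single-byte XOR key, which is trivial to undo once the key is found. The class name LiteralHidingSketch, the key value and the sample string are all hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of hiding a string literal (data obfuscation).
class LiteralHidingSketch {
    private static final byte KEY = 0x5A; // assumed single-byte key, for illustration only

    // Done ahead of time by the obfuscator: the plain text becomes XOR-ed bytes.
    static byte[] encode(String plain) {
        byte[] bytes = plain.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] ^= KEY;
        }
        return bytes;
    }

    // Done at run time by the program: the bytes are XOR-ed back into text.
    static String decode(byte[] hidden) {
        byte[] bytes = hidden.clone();
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] ^= KEY;
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] hidden = encode("Registered");
        // Only the hidden bytes would be stored in the class file.
        System.out.println(decode(hidden)); // prints "Registered"
    }
}
```

In an obfuscated program only the encoded bytes and the decode routine would ship, so a search of the class file for the readable string finds nothing.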
More viable source code obfuscation methods are based on composite functions: Array Index Transformation, Method Argument Transformation, and Hiding Constant. These techniques make the computation more complex, and extensive use of them makes the software respond slowly. Some source code obfuscation methods are directed at object-oriented concepts: Class Coalescing, Class Splitting, and Type Hiding. Other source code obfuscation techniques include false refactoring, restructuring arrays, inlining and outlining methods, cloning methods, splitting variables, converting static data to procedural data, and merging scalar variables. The techniques that work on object-oriented concepts, and others such as restructuring arrays, splitting variables and merging scalar variables, may distort the logic of the software, so they must be used carefully. Techniques such as outlining methods, cloning methods and converting static data to procedural data increase the size of a class file without providing any significant advantage. Inlining a method results in an unresolved method call when some other class calls the inlined method.
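As a rough illustration of Array Index Transformation, the sketch below stores logical element i in physical slot f(i) instead of slot i. The function f(i) = (i * 3) mod 7 is an assumption chosen for the example; it works because 7 is prime, so f is a permutation of 0..6. The class and method names are hypothetical, and indices are assumed to lie between 0 and 6.

```java
// Hypothetical sketch of an array index transformation (data obfuscation).
class IndexTransformSketch {
    private static final int SIZE = 7; // prime, so any multiplier 1..6 permutes the slots
    private static final int MULT = 3; // assumed multiplier, for illustration only
    private final int[] storage = new int[SIZE];

    // Map a logical index (0..6) to the physical slot actually used.
    static int f(int i) {
        return (i * MULT) % SIZE;
    }

    // Write logical element i into its transformed slot.
    void set(int i, int value) {
        storage[f(i)] = value;
    }

    // Read logical element i back from its transformed slot.
    int get(int i) {
        return storage[f(i)];
    }

    // Look at a physical slot directly, as a reverse engineer would see it.
    int rawSlot(int slot) {
        return storage[slot];
    }

    public static void main(String[] args) {
        IndexTransformSketch data = new IndexTransformSketch();
        for (int i = 0; i < SIZE; i++) {
            data.set(i, i * 10);
        }
        System.out.println("logical 2 = " + data.get(2) + ", stored in slot " + f(2));
    }
}
```

Every access pays for the extra multiplication and remainder, which is the slowdown mentioned above when such transformations are used extensively.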
Advanced obfuscation techniques for byte code:-
There are several obfuscation techniques to prevent Java byte code from being decompiled. Many of the tools simply replace the names of the identifiers stored in the byte code with meaningless names. Many crackers can still understand the actual source code even though the identifier names have been changed, but it takes them more time.
Traditionally, when a program is compiled to machine code, most of the symbolic information is stripped off during compilation. The variables and functions of the program are then denoted by their addresses rather than by identifiers. Decompiling such compiled code is difficult, but it is still possible.
We say that a protection technique is effective if and only if cracking the software costs the cracker a great deal of time and effort. If cracking a piece of software takes longer than rewriting the program, cracking it is a waste of time and of no value.
Java became very popular because of the benefits it provides. One of the major benefits is portability: a compiled program can run on any platform, i.e. it is platform independent. When a program is compiled, it produces platform-independent byte code. Java uses symbolic references rather than traditional memory addresses. Therefore, the names of methods, variables and types are stored in a constant pool within the byte code file.
There are many commercial decompilers (P & C, 2001, Vliot 1996, hoeniche 2001 etc.). When a program is decompiled, the result is almost identical to the original source code. The use of a decompiler to extract the source code thus becomes a lethal weapon for intellectual property piracy.
Obfuscation is used to hinder decompilation of the byte code. Its main aim is to make the decompiled program harder to understand, i.e. to increase the time and effort needed to understand the obfuscated code.
Obfuscation scope:-
A Java application consists of one or more packages. A programmer might divide the program into packages, and can also use packages from the standard library and from proprietary libraries. Only the part of the program written by the developer is distributed; the proprietary libraries are not distributed, because of copyright restrictions. The obfuscation scope is the part of the program that is obfuscated by the obfuscation techniques, i.e. the part of the software written by the developer is protected, not the entire software. The packages that belong to the standard library and to proprietary libraries are not obfuscated.
Candidates considered for identifiers scrambling:-
An identifier will denote the following terms in java
A package
A top level type (either class (or) interface)
A nested type (either class (or) interface)
A field
A method
A parameter (of a method, constructor or exception handler)
A local variable
After compilation, not all seven of the above are kept in the byte code file; only identifiers 1 to 5 from the list are stored. By default, the names of local variables and parameters are removed from the byte code. They are stored in the LocalVariableTable of the byte code only if debug information is enabled, and by default the Java compiler does not emit this local-variable debug information. If the names are not found, the decompiler creates names for the local variables and parameters itself, which leaves the decompiled program reasonably understandable. Even if we rename the variables and parameters in the LocalVariableTable, a good decompiler will simply ignore the renamed entries, create new names, and produce a program equivalent to the original.
For the reasons given in the paragraph above, parameters and local variables are not treated as candidates for identifier scrambling, because a decompiler can still extract the source code successfully by simply creating new names.
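That class, field and method names survive compilation can be observed with the standard reflection API, which reads those names from the loaded class. The sketch below is only an illustration: the wrapper class SymbolSketch and its helpers hasField and hasMethod are hypothetical, and OHello is a slightly extended copy of the earlier example class.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Hypothetical sketch: names kept in the class file are visible through reflection.
class SymbolSketch {

    // Extended copy of the earlier example class, with one field added.
    static class OHello {
        int counter;

        public String gHello(String hname) {
            String local = hname; // a local variable: its name is not a member name
            return local;
        }
    }

    // True if the class declares a field with this name.
    static boolean hasField(Class<?> type, String name) {
        for (Field f : type.getDeclaredFields()) {
            if (f.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    // True if the class declares a method with this name.
    static boolean hasMethod(Class<?> type, String name) {
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("class:  " + OHello.class.getSimpleName());
        System.out.println("field counter present: " + hasField(OHello.class, "counter"));
        System.out.println("method gHello present: " + hasMethod(OHello.class, "gHello"));
    }
}
```

These surviving names are exactly what a renaming obfuscator replaces, whereas the local variable name never appears as a member of the class at all.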
Copy right issues:-
Reverse engineering helps us to learn the structure and logic of a program, i.e. how a particular function carries out its task. Anyone who understands the program's logic can change its logical flow. Technically this is called patching, because it involves placing new code over the original code, like a patch on clothes. Patching allows the reverse engineer to add code to the original that changes how a particular method works. It thus makes it possible to maintain code without having its source: to delete or disable the functionality of a particular method or class, and to fix security bugs.
Because reverse engineering involves reconstructing the code, it falls under intellectual property law. Software companies fear reverse engineering because it reveals their secret algorithms and methods to outsiders far more directly than external observation of the running program does, and those outsiders might copy and use them.
Reverse engineering can be used to remove copy-protection schemes from software. Patching software to delete or defeat copy-protection schemes or digital rights management is illegal, but reverse engineering in itself is not. The main reason software vendors forbid reverse engineering is that their secret code is revealed to outsiders. This seems a bit silly, because a person who can understand the compiled code already understands the problem. To prevent such disclosure, encryption has to be applied to the secret parts of the program.
Software companies also forbid reverse engineering because researchers can find security flaws in their code and make those flaws public. This may harm the image and reputation of the company. If reverse engineering were made illegal, researchers would stop examining the quality of the code that companies produce. People would then have to accept that software is fully secure even when the code is neither secure nor correct.
Software security:-
In the present market, virtually all software programs are protected by various techniques. Some software is accessible to users if and only if they have registered the product. Reverse engineering is the technique that allows the protection on a program to be removed, which is called cracking.
In general terms, cracking can be described as follows. When we develop a software program, we build the executable file from the source code. Reverse engineering is a technique that allows the source code to be recovered from the executable file. Using reverse engineering techniques, we can understand how the program performs a particular action and can bypass the protection. In simple terms, reverse engineering here means making the program work the way the reverse engineer wants rather than the way it was originally intended to work.
Various software protections
Hard coded serial
Serial number, name protection
Nag screen
Time trial
Dongle(hardware protection)
Commercial protection.
Hard coded serial:-
This is the simplest technique: one serial key is given to all users. When the user enters the serial key, the software compares it with the original key stored in the program. If the user has entered the correct key, the software is successfully registered; otherwise it will not work.
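A hard coded serial check can be as small as the sketch below. The class name and the serial value are hypothetical. The sketch also shows why this protection is the weakest: the key is a string constant inside the program, so it sits in the compiled class file where a reverse engineer can read it.

```java
// Hypothetical sketch of a hard coded serial check.
class HardCodedSerialSketch {
    // The single key shared by all users; made-up value for illustration.
    private static final String SERIAL = "ABCD-1234-EFGH";

    // Registration succeeds only if the entered text matches the stored key.
    static boolean register(String entered) {
        return entered != null && SERIAL.equals(entered.trim());
    }

    public static void main(String[] args) {
        System.out.println(register("ABCD-1234-EFGH")); // prints true
        System.out.println(register("0000-0000-0000")); // prints false
    }
}
```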
Serial number with name protection:-
In this technique the user has to enter both a name and a serial number. As with the hard coded serial, the serial entered by the user is checked against the original serial number, but here the original is derived from the user's name by an algorithm. This protection can be easy or difficult to break, depending on the algorithm the programmer uses. This kind of technique is seen in WinZip.
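The name-based scheme can be sketched in the same way. The derivation used here, a simple multiply-and-add hash printed as eight hexadecimal digits, is an assumption made purely for illustration and is not the algorithm of any real product. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical sketch of a serial number derived from the user's name.
class NameSerialSketch {

    // Derive the serial from the name; the same routine runs at check time.
    static String serialFor(String name) {
        int h = 17;
        for (char c : name.toUpperCase(Locale.ROOT).toCharArray()) {
            h = h * 31 + c; // int overflow simply wraps around, which is fine here
        }
        return String.format("%08X", h);
    }

    // Registration succeeds if the entered serial matches the derived one.
    static boolean register(String name, String serial) {
        return serial != null && serialFor(name).equalsIgnoreCase(serial.trim());
    }

    public static void main(String[] args) {
        String serial = serialFor("Alice");
        System.out.println(register("Alice", serial)); // prints true
        System.out.println(register("Bob", serial));   // prints false
    }
}
```

Because the derivation routine ships inside the program, anyone who reverse engineers it can generate valid serials for any name, so the strength of this protection depends on how hard the algorithm is to follow.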
Nag screen:-
In this protection technique, every time the user starts the application a window appears showing the number of days left in the subscription, a reminder to activate the software, or some other information. It is hard to remove, and newcomers find it somewhat difficult to understand. This is used by WinZip.
Time trial:-
According to +ORC, the following kinds of time-trial protection are used:
Cinderella protection, in which a predetermined number of days is given, say 60 days from the day of installation.
‘Count down’ protection, in which a certain amount of time, say 5 minutes or seconds, is given to the user to use the application, after which it asks for product registration. This is mostly seen in game applications.
‘BEST_BEFORE’ protection, in which there is a particular end date independent of the starting date.
Protection in which the user can run the application only a predetermined number of times. It is strictly independent of time and depends only on how many times the user executes the program.
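The four time-trial checks can each be expressed as a one-line test. The sketch below assumes that the installation date, the elapsed time and the run counter are supplied by the caller; where a real product would store them is not covered here. All names and limits are hypothetical.

```java
import java.time.LocalDate;

// Hypothetical sketch of the four time-trial checks.
class TrialSketch {

    // Cinderella: valid for a fixed number of days counted from installation.
    static boolean cinderellaValid(LocalDate installed, LocalDate today, int days) {
        return !today.isAfter(installed.plusDays(days));
    }

    // Count down: valid while the session has run for less than the allowance.
    static boolean countdownValid(long elapsedSeconds, long allowedSeconds) {
        return elapsedSeconds < allowedSeconds;
    }

    // BEST_BEFORE: valid up to a fixed end date, whenever the program was installed.
    static boolean bestBeforeValid(LocalDate today, LocalDate bestBefore) {
        return !today.isAfter(bestBefore);
    }

    // Run count: valid for a fixed number of launches, regardless of dates.
    static boolean runCountValid(int runsSoFar, int maxRuns) {
        return runsSoFar < maxRuns;
    }

    public static void main(String[] args) {
        LocalDate installed = LocalDate.of(2024, 1, 1);
        LocalDate today = LocalDate.of(2024, 2, 15);
        System.out.println("Cinderella (60 days): " + cinderellaValid(installed, today, 60));
        System.out.println("Count down (5 min):   " + countdownValid(120, 300));
        System.out.println("Best before:          " + bestBeforeValid(today, LocalDate.of(2024, 12, 31)));
        System.out.println("Run count (max 10):   " + runCountValid(3, 10));
    }
}
```

Each check ends in a single boolean decision, which is why time trials are a common target for the patching described earlier.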
Dongle protection:-
Dongle protection is the toughest technique to crack. It uses an EPROM that is connected to a port of the computer. When a person wants to access the software, it first checks the User ID and the Hardware ID, 2 unique IDs that cannot be changed. If the correct IDs are supplied, the user can access the software. An algorithm such as RSA is used for data protection. This kind of protection is difficult to implement, so it is used where the software is particularly valuable. The protection is implemented through the I/O LPT hardware: the registered device must be attached to the PC's parallel port in order to access the complete software, which cannot be used otherwise. HASP and Sentinel are the most commonly used dongles. DLLs and VxDs are used by the dongle to perform the “is registered” check.
Commercial protection:-
Most software programmers do not want to spend their time developing security algorithms for their software, because it is time consuming: developing them can take as much time as developing the actual software, or more. This is where commercial protection comes in. Instead of the developer writing the security algorithm for the software to be protected, several companies specialise in developing security algorithms or protection software for other developers' products. Companies that use commercial protection include Macromedia and Symantec. Commercial protection turns the fully functional software into an unregistered version, i.e. the software's functionality is not exposed to the outside world until the user has registered it. After successful registration, the full functionality of the software becomes available to the user or company that wants to use it.
Other protections:-
The other most common types of software protection are disabling certain functions in the software and CD-ROM protection. CD-ROM protection is familiar to many computer users: the program can be run only while the CD is in the drive, even though the content of the CD has been saved on the PC. This kind of protection is mainly applied to games.
The other kind of protection disables functions, so that, for example, we cannot save our work on the PC or cannot use certain functions at all.