Transparency and Security in Distributed Systems

Nowadays, millions of people use the World Wide Web for purposes such as accessing information stored on Web servers located anywhere on the globe, sending email, reading news, shopping online, downloading music or playing games. This gives the illusion that all the information is stored locally on the user’s computer. In fact, the Web is a huge distributed system that appears to its users as a single system [4].

A distributed system has been defined in many ways. Coulouris describes it as “a system in which hardware and software components located at networked computers communicate and coordinate their actions only by message passing” [5]. Tanenbaum and van Renesse propose [1]: “A distributed operating system is one that looks like an ordinary centralized operating system, but runs on multiple independent CPUs. The key concept here is transparency; in other words, the use of multiple processors should be invisible to the user. Another way of expressing the same idea is to say that the user views the system as a ‘virtual uniprocessor’, not as a collection of distinct machines”.

It is a common misconception that a distributed system is just another name for a network of computers, but there is an important distinction [4]. A network is a medium through which explicitly addressable entities communicate by exchanging messages according to well-known protocols. A distributed system is built on top of a network; it hides the existence of multiple autonomous computers and appears as a single entity. The scale of the challenges faced by designers is reflected in a remark by Leslie Lamport, a famous researcher on message ordering, clock synchronization and timing: a distributed system is one in which you cannot get any work done because some machine you have never heard of has crashed.

Issues of Distributed System

A distributed system is a collection of independent computers that appears to its users as a single coherent system. Although such systems are found in many applications, designing one is difficult, and many issues have to be considered during implementation. The issues faced depend on the requirements of the particular system, but most systems need to handle the following in general [1] [5]:

Heterogeneity

Various entities must be able to interoperate with each other in the system, without being affected by differences in operating systems, hardware architectures, communication protocols, programming languages, security models, data formats, and software interfaces.

Transparency

The system should appear as a single unit and the interactions and complexity between the components should be hidden from the end user.

Fault tolerance and failure management

Failure of one or more components should be isolated and should not degrade the entire system.

Scalability

Adding a resource should improve the performance of the system, and the system should continue to work efficiently as the number of users increases.

Concurrency

It should be possible for several users to share resources at the same time.

Openness and Extensibility

Interfaces must be publicly available and cleanly separated so that existing components can be extended and new components added easily.

Migration and load balancing

Allow the movement of tasks in a system without affecting the operation of applications or users and distribute load among available resources to improve performance.

Security

Ensure that access to resources is secure, so that only known users are able to perform the operations they are allowed to perform.

Synchronization

Ensure that components are able to synchronize with each other.

In this paper, our focus is on understanding the types of transparency, the security issues, and the fault tolerance and synchronization issues involved in distributed systems.

Transparency

Users and application programmers should see a distributed system as a single system rather than as a collection of cooperating computers and processors [6]. Moving from a local machine to a remote one should be transparent, and users should be unaware of where services are located. An ideal system that provides transparent access to all resources does not yet exist, but sub-systems built on distributed architectures that provide transparency for particular resources such as disks, memory or files do exist. The different kinds of transparency encountered in distributed systems are described below [2] [3].

Access Transparency

The method used to access an object should be hidden, so that users are unaware that files are distributed. The files may be provided by several quite different servers that are physically far apart, and a single set of operations should be provided to access remote as well as local files. Examples are the file system in Network File System (NFS), navigation of the Web, and SQL queries.

Location Transparency

The physical location of the object being accessed should be hidden, so that users see a uniform file name space. Files can then be relocated without changing their pathnames. A location-transparent name contains no information about where the named resource or service is located. Location and access transparency together are sometimes referred to as network transparency. Examples that illustrate this property are Web pages and the file system in NFS.

Concurrency Transparency

The presence of other users should be hidden, so that users do not notice that they compete for and share a single resource. With concurrency transparency, users and applications can access shared data or objects without interfering with each other. Because a distributed system exhibits true concurrency rather than the simulated concurrency of a centralized system, and shared objects are accessed simultaneously, this requires complex mechanisms; concurrency control is difficult to implement. Examples are automated teller machine (ATM) networks and NFS.

Replication Transparency

The fact that multiple instances of the same object may exist should be hidden, so that users do not notice that data has been replicated. Users should also expect an operation to return only one set of values. This type of transparency matters mainly for distributed file systems, which replicate data at more than one site for reliability. Examples that illustrate this property are mirroring of Web pages and distributed DBMSs.

Failure Transparency

The occurrence of faults should be hidden, so that users and application programs can complete their tasks despite the failure of hardware or software components. To achieve this transparency, fault tolerance is provided by mechanisms related to access transparency. A distributed system is more prone to failure because any one of its components may fail, which can degrade a service or make it unavailable altogether. Because the intricacies are hidden, it is difficult to distinguish a failed process from a slow one. An example of a system that implements this transparency is a database management system.

Migration Transparency

The fact that the object being accessed may change its physical location should be hidden, so that information and processes can be moved to a different physical or logical location in the system without users noticing and without affecting running applications. This mechanism allows load to be balanced away from any part of the system that might be overloaded.

Performance Transparency

This transparency allows a distributed system to be reconfigured to improve performance as the load varies. Variations in load should not lead to performance degradation, which is difficult to achieve.

Scaling Transparency

A system should be able to grow without its application algorithms being affected. Graceful evolution and growth are very important for most enterprises. A distributed system should also be able to scale down to small environments where required, and be space and time efficient as required. An example is the World Wide Web.

Security Transparency

Negotiating cryptographically secure access to resources must require a minimum of user intervention; otherwise users will circumvent security in favor of productivity.

Security Issues

With the widespread adoption of distributed systems, experts are pointing out security issues that can seriously hurt users [7]. Security may well be the greatest challenge associated with distributed systems, because malicious adversaries can reside anywhere and everything is a potential target.

4.1 Authentication

Ensuring that an individual really is the person he or she claims to be is important in any computing scenario. Authentication is a fundamental step that is required before any person or entity is allowed to perform an operation on a computer or to access a system. The person, also called the principal, interacts with the system by providing a secret or a piece of information that the principal alone knows or is able to generate.

4.2 Authorization

Authorization is a common security requirement: it provides different levels of access, such as deny or permit, to different parts of or operations in a computing system. The type of access is dictated by the identity of the person or entity needing access and by the kind of operation or part of the system to be accessed. Access control can be enforced in several forms, such as discretionary access control, role-based access control and mandatory access control, described as follows:

Discretionary Access Control

Different principals normally need different levels of access to the same component. This is supported by recording different permissions and privileges for different principals; commonly, such permissions are granted to individuals and to groups. This principal-focused form of access control is called discretionary access control. It is enforced with access-control lists (ACLs), which record the permissions that each principal holds. ACLs are maintained in databases or in file systems.
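The ACL lookup described above can be sketched as follows. This is a minimal illustration, not a description of any particular system: the table layout, the `group:` naming convention and the names `ACL`, `GROUPS` and `is_allowed` are all assumptions made for the example.

```python
# Hypothetical ACL table: resource -> {principal or group: permissions}.
ACL = {
    "payroll.db": {
        "alice": {"read", "write"},
        "group:auditors": {"read"},
    },
}

# Hypothetical group membership: principal -> groups it belongs to.
GROUPS = {"bob": {"group:auditors"}}


def is_allowed(principal, resource, permission):
    """Return True if the principal, or one of its groups, holds the permission."""
    entries = ACL.get(resource, {})
    subjects = {principal} | GROUPS.get(principal, set())
    return any(permission in entries.get(s, set()) for s in subjects)
```

Here `is_allowed("bob", "payroll.db", "read")` succeeds through group membership, while a write request by the same principal is denied because no entry grants it.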

Role-Based Access Control

A typical enterprise user performs a specific role at any point in time, and access to any system or operation depends on the role of the principal. For example, a principal acting as an administrator needs a different level of access from regular users. An access-control mechanism based on the role of the principal is called role-based access control (RBAC). RBAC requires a list of roles to be maintained, together with mappings from each role to a group of users.
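The two mappings that RBAC relies on, the list of roles and the assignment of users to roles, might look like the sketch below. The role names, user names and the function `role_allows` are hypothetical and chosen only for illustration.

```python
# Hypothetical list of roles with the permissions each role carries.
ROLE_PERMISSIONS = {
    "administrator": {"read", "write", "create_user"},
    "regular_user": {"read"},
}

# Hypothetical mapping from users to the roles they currently hold.
USER_ROLES = {
    "alice": {"administrator"},
    "bob": {"regular_user"},
}


def role_allows(user, permission):
    """Grant access if any role held by the user carries the permission."""
    roles = USER_ROLES.get(user, set())
    return any(permission in ROLE_PERMISSIONS.get(r, set()) for r in roles)
```

Changing what administrators may do then means editing one entry in `ROLE_PERMISSIONS` rather than touching every administrator's individual permissions.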

Mandatory Access Control

In many access-control scenarios, access to resources has to be granted on the basis of discrete levels associated with the principal. A level is also associated with each resource, and access is granted if the principal’s level is at least as high as the level of the resource. This kind of access control is called mandatory access control (MAC). Because detailed ACLs do not need to be kept, it is simpler to enforce than RBAC; only a hierarchy of access-control levels has to be maintained.
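A level comparison of this kind can be sketched in a few lines. The three level names are hypothetical, and the sketch assumes that a principal whose level equals the level of the resource is also admitted.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical hierarchy of access-control levels, lowest first."""
    PUBLIC = 0
    CONFIDENTIAL = 1
    SECRET = 2


def may_access(principal_level, resource_level):
    # Assumption: equal levels are sufficient for access.
    return principal_level >= resource_level
```

The only state to maintain is the ordering of the levels and one level per principal and per resource.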

4.3 Data Integrity

An online system needs to make sure that a piece of data arrives at its destination without having been tampered with while in transit from one location to another. It also needs to ensure that the data that arrives is valid, correct, sound and in line with the sender’s expectations. This requirement is called data integrity. It can be compromised by factors such as transmission errors, viruses, deliberate tampering in transit or even problems caused by natural disasters. Common techniques for achieving data integrity are message authentication codes (MACs) and digital signatures.
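As an illustration of a message authentication code, the sketch below uses HMAC-SHA256 from the Python standard library. It assumes that sender and receiver already share a secret key; how that key is distributed is outside the scope of the example, and the function names are chosen for illustration.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag that the sender transmits with the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Recompute the tag at the receiver and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), received_tag)
```

If the message is altered in transit, the tag recomputed by the receiver no longer matches the one that was sent, so `verify` returns `False`.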

4.4 Confidentiality

Confidentiality is the most important requirement in business transactions. It restricts access to information to authorized persons only and prevents others from having access to it. Breaches of confidentiality must be avoided so that unauthorized parties do not gain access to data. For example, we do not say our passwords out loud when talking on the phone, nor do we transmit them via any written or online medium.

Besides that, it is important that, even though it may be theoretically possible for unauthorized persons to get access to data, this is made as difficult as possible. The aim of encryption techniques is precisely to achieve this.

4.5 Availability

When authorized users need information, it must be available. The availability requirement ensures that no piece of information is denied to authorized users for any reason. Availability can sometimes be thwarted by physical or communication factors such as a disk crash or improper communication channels. From an attack perspective, however, the typical hindrance to the availability of a service is the well-known family of denial-of-service (DoS) attacks.

Trust

The maxim ‘Trust then verify’ should be applied to distributed systems.

Trust has always been one of the most significant influences on customer confidence in systems, brands, services and products. Research shows that trust in companies and brands correlates directly with customer retention and loyalty: trust begets customer loyalty. Consumers who are likely to perform a wider variety of more complicated online banking tasks, such as automated bill payment or applying for new products or services, have a high level of trust in their bank.

Surveys have shown that consumers who have a high level of trust in their primary bank are loyal and are not seeking services from other institutions; a majority have not even visited another bank’s Web site. However, the surveys also clearly show that most consumers with high trust in their primary bank would stop all online services with that bank in the event of a single privacy breach. That could translate into the potential loss of millions of customers, making even a single breach a very costly problem for banks. Gaining and maintaining consumer trust is challenging, but it must be a priority. Building consumer trust in the Web channel will affect customer acquisition and retention rates.

People often tend to be transparent within a known ‘circle of trust’. It is well documented that people share their ATM PINs, e-mail, passwords and so on within their perceived ‘circle of trust’. However, it has always been difficult to design systems for this ‘perceived trust’.

Privacy

Privacy is a broader issue than confidentiality: it concerns the ability of a person to keep information about themselves from others and to reveal it selectively. Privacy issues are sweeping the information-security landscape, as individuals demand accountability from corporations, organizations and others that handle their data. In today’s world of outsourcing and off-shoring, customers are very concerned about the privacy of their personal data, and enterprises are taking many measures to make sure that their customers continue to trust them. Consumer surveys have frequently shown that the number one reported answer is to limit the sharing of personal information with third parties [9].

Many targeted e-mail phishing scams have been reported recently. It is not surprising that one survey [9] points to identity theft as the biggest customer concern in the event of a breach or violation of personal information. The recent controversies around the loss of personal information such as credit card and social security details have also been linked to antisocial activities. For example, in 2004 a dishonest employee at AOL sold approximately 92 million private customer e-mail addresses to a spammer marketing an offshore gambling Web site [10]. In response to such high-profile exploits, the collection and management of private data is becoming increasingly regulated, and new laws are being established to ensure information security. As an information-security professional, you should understand the basics of data privacy law and know the legal requirements your organization faces if it enters a line of business regulated by these laws.

The Health Insurance Portability and Accountability Act (HIPAA) provides a substantial Privacy Rule that affects organizations which process the medical records of individual citizens. HIPAA’s ‘covered entities’ include health-care clearing houses, health-care plans and health-care providers.

The HIPAA Privacy Rule requires covered entities to inform patients about their privacy rights, train employees in the handling of private information, adopt and implement appropriate privacy practices, and provide appropriate security for patient records.

The Gramm-Leach-Bliley Act of 1999 (GLBA [11]) is a more recent addition to privacy law in the United States. Aimed at financial institutions, this law contains a number of specific provisions that regulate how covered organizations may handle private financial information, the safeguards they must put in place to protect that information, and prohibitions against their gaining such information under false pretenses.

Data privacy is a rapidly-changing and complex field. The legal landscape surrounding it is fluid and subject to new legislation and interpretation by government agencies and the courts. Organizations using, creating and managing the Internet will often need to state their privacy policies and require that incoming requests make claims about the senders’ adherence to these policies. By providing privacy statements within the service policy, basic privacy issues can be solved. More sophisticated scenarios such as delegation and authorization will be covered in specifications specific to those scenarios.

A customer can state a set of ‘privacy preferences’ which could set the limits of the underlying contexts and acceptability. The customer can also decide the parameters allowing applications dealing with their personal information to act on their behalf.

Identity Management

In identity management, every person or resource is provided with unique identifier credentials which are used to identify that entity uniquely. Identity management is used to control access to any resource or system through the associated user rights and restrictions of the established identity.

For example, the identity management systems of daily life include driving licenses, citizenship cards and passports. When citizens enter a country and display their passport, they are admitted by virtue of the citizenship rights associated with their identity. The passport also gives them access to the resources of the country and allows them to perform certain operations, such as voting.

In today’s enterprises, identity management systems are important for managing large numbers of employees or users. These systems automate a large number of tasks such as the creation and deletion of user identities, password synchronization, password resetting and overall management of the identity life cycle of users. An advantage of such systems is that when new applications are provisioned in an enterprise, they can leverage the data in the identity management system without needing specialized data of their own.

Some specialized identity management system functionalities include a single sign-on usability imperative, wherein a user need not log in multiple times when invoking multiple applications and can reuse the logged-in status of a previous application in the same session. Single sign-on is a key imperative for enterprise usage of identity management solutions.

Fault Tolerance

From the distributed systems perspective, fault tolerance is both an opportunity and a threat. It is an opportunity because distributed systems bring with them natural redundancy, which can be used to provide fault tolerance. It is a threat because the issue is complex, and extensive research has been carried out to tackle it effectively. Failures can occur in processing nodes and transmission media, and can arise from distributed agreement.

5.1 Processing Sites

The processing sites of a distributed system are independent of each other and are therefore independent points of failure. While this is a complex problem for developers, it presents an advantage from the viewpoint of the user of the system. In a centralized system, the failure of the processing site implies the failure of all the software. In contrast, in a fault-tolerant distributed system, a processing site failure means that the software on the remaining sites needs to detect and handle that failure. This may involve redistributing the functionality from the failed site to other, operational sites, or switching to some emergency mode of operation.

5.2 Communication Media

The communication medium present in most distributed systems is another source of failure. The most obvious is a permanent hard failure of the entire medium, which makes communication between processing sites impossible. In the most serious cases, this type of failure partitions the system into multiple parts that are completely isolated from each other, with the result that the different parts may undertake activities that conflict with each other. Intermittent failures are more difficult to detect and correct, especially if the medium is wireless.

5.3 Errors due to Transmission Delays

Message delays can lead to two different types of problem. The first is that the time a message takes to travel from source to destination can vary significantly, which is called variable delay or jitter. It can be caused by factors such as the route taken through the communication medium, congestion in the medium, congestion at the processing sites, intermittent hardware failures and so on. If the transmission delay is constant, it is easy to tell when a message has been lost. For this reason, some communication networks are designed as synchronous networks, in which the delay values are fixed and known in advance. The second problem is that information can be out of date even when the transmission delay is constant. Messages are used to convey information about state changes between components, so if the delays experienced are greater than the time required to change from one state to the next, the information in these messages will be out of date. This can have repercussions that lead to unstable systems.
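Both observations reduce to simple comparisons once the delay is fixed and known. The sketch below is only an illustration; the function names and the use of seconds as the unit are assumptions.

```python
def is_presumed_lost(sent_at, now, fixed_delay, arrived):
    """With a constant, known delay, a message that has not arrived
    by sent_at + fixed_delay can be presumed lost."""
    return (not arrived) and now > sent_at + fixed_delay


def is_stale_on_arrival(fixed_delay, state_change_interval):
    """State information is out of date when the sender changes state
    faster than a message can be delivered."""
    return state_change_interval < fixed_delay
```

With variable delay, the first check no longer works as written: a receiver cannot tell whether a late message has been lost or is merely slow.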

5.4 Distributed Agreement

There are several types of this problem such as consistent distributed state, distributed transaction commit, distributed election, time synchronization, distributed mutual exclusion, distributed termination and so on. However, all of these reduce to the common problem of reaching agreement in a distributed environment in the presence of failures.

Synchronization

Synchronization is one of the most complex and well-studied problems in the area of distributed systems. The problem of synchronizing concurrent events also occurs in non-distributed systems, but it is amplified many times in distributed systems. The absence of global shared memory, the absence of a globally shared clock in most cases, and the presence of partial failures make synchronization a complex problem to deal with. Many critical issues, such as clock synchronization, collecting global states, leader election, mutual exclusion and distributed transactions, have been studied in detail in the literature.

6.1 Clock Synchronization

Time is very important in a distributed system: it may be necessary to execute a given action at a given time, or to timestamp data or objects so that all nodes or machines see the same global state. Many algorithms for clock synchronization have been proposed, including synchronization of all clocks with a central clock and synchronization through agreement. In the first case, the time server or external clock periodically sends clock information to all the nodes, through either broadcast or multicast mechanisms, and the nodes adjust their clocks based on the received information and a round-trip time calculation. In the second, the nodes exchange information so that the time can be calculated in a peer-to-peer (P2P) fashion. Clock skew always needs to be considered when designing such a system, and clock synchronization remains a major issue in distributed systems.
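The two approaches can be sketched as follows. The first function follows the style of Cristian's algorithm and assumes that the network delay is the same in both directions, so that half the round-trip time has elapsed since the server read its clock. The second simply averages the readings exchanged between peers, which is one possible agreement rule among many. The text does not name these specific techniques; they are used here only as illustrations.

```python
from statistics import mean


def estimate_server_time(request_sent, reply_received, server_time):
    """Estimate the server's current time when the reply arrives.

    request_sent and reply_received are read from the local clock;
    server_time is the timestamp carried in the reply.
    """
    round_trip = reply_received - request_sent
    # Assumption: symmetric delay, so the reply took round_trip / 2.
    return server_time + round_trip / 2.0


def agreed_time(readings):
    """Peer-to-peer variant: settle on the average of the exchanged readings."""
    return mean(readings)
```

For a request sent at local time 100.0 and answered at 100.5 with a server timestamp of 105.0, the estimate is 105.25, so the node would move its clock forward by 4.75.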

6.2 Leader Election

Leader election is another critical synchronization problem that arises in many distributed systems. Many solutions are available, ranging from the old leader imposing a new leader on the group members according to certain selection criteria, to polls or votes in which the node receiving the maximum number of votes is elected leader.
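The voting variant can be sketched as a simple tally. The sketch assumes that all votes have already been collected in one place and that ties are broken in favor of the smallest node identifier; both are assumptions made for illustration, and a real protocol would also have to cope with lost votes and failed nodes.

```python
from collections import Counter


def elect_leader(votes):
    """votes maps each voter's id to the id of the candidate it voted for."""
    tally = Counter(votes.values())
    most_votes = max(tally.values())
    # Assumed tie-break rule: the smallest id among the top candidates wins.
    return min(node for node, count in tally.items() if count == most_votes)
```

For `{"n1": "n2", "n2": "n2", "n3": "n1"}`, node `n2` receives two of the three votes and becomes the leader.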

6.3 Collecting Global State

Knowledge of the global state is useful in some applications, especially when debugging a distributed system. The global state is defined as the sum of the local states and the states in transit in a distributed system. One mechanism is to obtain a distributed snapshot, which represents a consistent global state in which the distributed system could have been. There are many challenges in moving a process to the consistent state.
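A small worked example shows why the states in transit must be counted. The scenario is hypothetical: two nodes hold account balances and a transfer is still on the wire when the snapshot is taken. The sketch only shows what a snapshot contains, not how a snapshot algorithm gathers it.

```python
def global_state(local_states, in_transit):
    """Combine the local states with the messages still in transit."""
    return {"local": dict(local_states), "in_transit": list(in_transit)}


def total_money(snapshot):
    """Sum the balances held at the nodes and the amounts still in flight."""
    held = sum(snapshot["local"].values())
    in_flight = sum(msg["amount"] for msg in snapshot["in_transit"])
    return held + in_flight
```

If node A started with 80 and has just sent 10 to B, the local states alone (`A: 70`, `B: 20`) add up to 90; only by including the transfer in transit does the snapshot account for the full 100.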

6.4 Mutual Exclusion

In some conditions, certain processes are required to access critical data or sections in a mutually exclusive manner. One way to tackle this problem is to emulate a centralized system by having a server manage the lock through the use of tokens. Tokens can also be managed in a distributed manner using a ring or a P2P system, which increases the complexity.
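The centralized variant can be sketched as a server that hands out a single token and queues the processes waiting for it. The class and method names are hypothetical, and the sketch leaves out message passing and failure handling; in particular, it does not deal with a token holder that crashes.

```python
from collections import deque


class TokenServer:
    """Grants one token at a time; only the holder may enter the critical section."""

    def __init__(self):
        self.holder = None
        self.waiting = deque()

    def request(self, pid):
        """Return True if the token is granted now, False if the request is queued."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        """Pass the token to the next waiting process and return the new holder."""
        if self.holder != pid:
            raise ValueError("only the token holder may release it")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder
```

Requests are served in arrival order, so a process that asks for the token is not overtaken by later requests.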

Conclusion

Implementing a distributed system is very complex, and many issues have to be considered. Because the need for this kind of system is expected to be high, there is good scope for future research in the field.

Because a distributed system is very complex in the way it is implemented and in the way it handles different situations, and because using it would otherwise require knowledge of its intrinsic details, transparency is very important. With a high degree of transparency, users need not worry about the complexities of the design and implementation of the distributed system.

Besides that, the security of distributed systems is a significant challenge, as mentioned above, for several reasons [8]. First, the security features of an application may depend on the environment in which the application is operating, such as the type of data exchanged and the capabilities of the end-points of communication. Second, security mechanisms can be deployed at both the application and communication layers of the system, which makes it difficult to understand and manage overall system security. Therefore, a policy-based framework has to be developed to meet these requirements.
