Importance Of Grid Computing

Fast Internet access is now widely expected, yet downloading many files at once can slow a system down or hang it, forcing the whole download to restart from the beginning. This is a serious problem that deserves the attention of researchers. We have taken it up in our research, and in this paper we present a layout for implementing our proposed Grid model, which is intended to speed up access to the Internet. With our Grid, any number of files can be downloaded quickly, at a rate that depends on the number of systems employed in the Grid. We use the concept of Grid computing for this purpose. The Grid we have formulated uses the standard Globus architecture, which is widely used worldwide for developing Grids. We also propose an algorithm for laying out our Grid model, which we consider a blueprint for further implementation. When implemented in practice, our Grid should let the user download multiple files over the Internet at lightning speed.

What’s Grid computing? Grid computing is a technique in which idle systems in a network, and their “wasted” CPU cycles, are put to efficient use by uniting pools of servers, storage systems and networks into a single large virtual system whose resources are shared dynamically at runtime. These systems can be distributed across the globe; they are heterogeneous (some PCs, some servers, maybe mainframes and supercomputers) and somewhat autonomous (a Grid can potentially access resources in different organizations).

2. Grid computing (or the use of a computational grid) is the application of several computers to a single problem at the same time — usually to a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data. According to John Patrick, IBM’s vice president for Internet strategies, “the next big thing will be grid computing.” Although Grid computing is firmly ensconced in the realm of academic and research activities, more and more companies are starting to turn to it for solving hard-nosed, real-world problems.

3.IMPORTANCE OF GRID COMPUTING: Grid computing is emerging as a viable technology that businesses can use to wring more profits and productivity out of IT resources, and it is going to be up to developers and administrators to understand Grid computing and put it to work. It is really more about bringing a problem to the computer (or Grid) and getting a solution to that problem. Grid computing is flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources. It enables the virtualization of distributed computing resources such as processing, network bandwidth, and storage capacity to create a single system image, granting users and applications seamless access to vast IT capabilities. Just as an Internet user views a unified instance of content via the World Wide Web, a Grid user essentially sees a single, large, virtual computer. Grid computing gives worldwide access to a network of distributed resources: CPU cycles, storage capacity, devices for input and output, services, whole applications, and more abstract elements like licenses and certificates. For example, to solve a compute-intensive problem, the problem is split into multiple tasks that are distributed over local and remote systems, and the individual results are consolidated at the end. Viewed from another perspective, these systems are connected to form one big computing Grid. The individual nodes can have different architectures, operating systems, and software versions. Some of the target systems can themselves be clusters of nodes or high-performance servers.
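The split, distribute and consolidate pattern described above can be illustrated with a minimal Python sketch. It is not part of our Grid: local threads stand in for Grid nodes, and the names `split` and `run_on_grid` are hypothetical helpers introduced only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Split data into at most `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when parts > len(data)
            chunks.append(data[start:end])
        start = end
    return chunks


def run_on_grid(data, task, nodes=4):
    """Run `task` on every chunk in parallel and return the partial results.

    Assumption: worker threads stand in for the remote Grid nodes.
    """
    chunks = split(data, nodes)
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        return list(pool.map(task, chunks))


numbers = list(range(1, 101))
partials = run_on_grid(numbers, sum, nodes=4)  # one partial sum per node
total = sum(partials)                          # consolidate at the end
```

In a real Grid the chunks would be shipped to remote machines by the middleware; only the overall shape (split, run in parallel, combine) carries over from this sketch.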

4. BEGINNINGS OF THE GRID

Parallel computing in the 1980s focused researchers’ efforts on the development of algorithms, programs and architectures that supported simultaneity. During the 1980s and 1990s, software for parallel computers focused on providing powerful mechanisms for managing communication between processors, and development and execution environments for parallel machines. Successful application paradigms were developed to leverage the immense potential of shared and distributed memory architectures. Initially it was thought that the Grid would be most useful in extending parallel computing paradigms from tightly coupled clusters to geographically distributed systems. However, in practice, the Grid has been utilized more as a platform for the integration of loosely coupled applications – some components of which might be running in parallel on a low-latency parallel machine – and for linking disparate resources (storage, computation, visualization, instruments). Coordination and distribution are thus two fundamental concepts in Grid computing.

The first modern Grid is generally considered to be the Information Wide Area Year (I-WAY). Developing infrastructure and applications for the I-WAY provided a seminal and powerful experience for the first generation of modern Grid researchers and projects. This was important, as the development of Grid research requires a very different focus than distributed computing research. Grid research focuses on addressing the problems of integration and management of software. I-WAY opened the door for considerable activity in the development of Grid software.

5.TYPES OF GRID:

The three primary types of grids are summarized below:

5.1 Computational Grid

A computational grid is focused on setting aside resources specifically for computing power. In this type of grid, most of the machines are high-performance servers.

5.2 Scavenging grid

A scavenging grid is most commonly used with large numbers of desktop machines. Machines are scavenged for available CPU cycles and other resources. Owners of the desktop machines are usually given control over when their resources are available to participate in the grid.

5.3 Data Grid

A data grid is responsible for housing and providing access to data across multiple organizations. Users are not concerned with where this data is located as long as they have access to the data.

6.OUR PROPOSED GRID MODEL:

We use the Scavenging Grid for our implementation because our Grid is built from large numbers of desktop machines; we plan to extend it later by combining the Scavenging and Data Grids. Figure 1 gives an idea of the Grid that we have proposed.


CPU-scavenging, cycle-scavenging, cycle stealing, or shared computing creates a “grid” from the unused resources in a network of participants (whether worldwide or internal to an organization). Typically this technique uses desktop computer instruction cycles that would otherwise be wasted at night, during lunch, or even in the scattered seconds throughout the day when the computer is waiting for user input or slow devices.

7.PROBLEMS DUE TO MULTIPLE DOWNLOADING:

While accessing the Internet, most of us have faced the burden of multiple downloads, particularly of huge files: the system can fail abruptly when it is given a heavy task. It may hang and have to be rebooted when only part of the download has completed. After the reboot, the file must be downloaded again from the beginning, which is a major problem for many users today.

Consider N files of different sizes (on the order of several MB each) being downloaded on a single system (a PC). With a single CPU and an Internet connection of normal speed, this can take minutes or even hours. Downloading multiple files at the same time is a tedious task for the user, and this is where our Grid plays a major role.

8.CONCEPT OF OUR PROPOSED GRID:

To avoid this problem we have formulated our own Grid, which accesses the Internet via an intranet (LAN). With our Grid, the large number of files is distributed evenly across all the systems in the network.

For example, consider a small LAN of around 20 systems, of which 10 are idle and 5 are using only a small amount of CPU (for our purposes), so that their CPU cycles are wasted. Our work begins here: we aim to turn those “wasted CPU cycles” into “working cycles”.
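The selection of idle and lightly loaded systems in such a LAN might look like the following Python sketch. The `Host` type, the load figures and the two thresholds (5% and 30%) are assumptions made for illustration; the paper does not fix how "idle" or "less CPU" is measured.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    cpu_load: float  # fraction of CPU in use, 0.0 to 1.0 (assumed metric)


def usable_hosts(hosts, idle_below=0.05, light_below=0.30):
    """Return (idle, lightly_loaded) hosts; thresholds are illustrative."""
    idle = [h for h in hosts if h.cpu_load < idle_below]
    light = [h for h in hosts if idle_below <= h.cpu_load < light_below]
    return idle, light


# A LAN of 20 systems: 10 idle, 5 lightly loaded, 5 busy (made-up loads).
lan = (
    [Host(f"pc{i:02d}", 0.01) for i in range(1, 11)]
    + [Host(f"pc{i:02d}", 0.20) for i in range(11, 16)]
    + [Host(f"pc{i:02d}", 0.85) for i in range(16, 21)]
)
idle, light = usable_hosts(lan)
```

Here the 15 hosts in `idle` and `light` are the candidates whose spare cycles the Grid could put to work, while the 5 busy hosts are left alone.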

FIGURE 1: LAYOUT OF OUR INTRANET GRID

8.1 WORKING OF THE PROPOSED GRID:

When we download multiple files from the Internet, the Grid we have formulated comes into action. A dialog box appears on the desktop asking the user whether or not to use the Grid. If the user selects “use the Grid”, the available system resources in the network are obtained automatically by the Globus Toolkit. The configurations of the idle systems are noted, and the system with the highest configuration gets the highest priority in the priority queue.

E.g., suppose a supercomputer with 8 CPUs, another supercomputer with 5 CPUs, and PCs with P3-2.0GHz, P4-2.0GHz, P4-2.5GHz, P3-1.0GHz, P3-1.3GHz, P4-1.5GHz, P3-1.13GHz and P4-2.4GHz processors are found in the network. Then the order of priority will be: 1. Supercomputer-8 CPUs, 2. Supercomputer-5 CPUs, 3. P4-2.5GHz, 4. P4-2.4GHz, 5. P4-2.0GHz, 6. P3-2.0GHz, 7. P4-1.5GHz, 8. P3-1.3GHz, 9. P3-1.13GHz, 10. P3-1.0GHz.
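The ordering in this example is consistent with one simple rule, sketched below in Python: rank by number of CPUs first, then by clock speed, then by processor generation as a tie-break (so a P4-2.0GHz comes before a P3-2.0GHz). This rule is our reading of the example rather than a stated policy, and the `Machine` type and `rank` function are hypothetical names.

```python
from typing import NamedTuple


class Machine(NamedTuple):
    label: str
    cpus: int
    clock_ghz: float  # 0.0 where the example gives no clock speed
    generation: int   # 4 for P4, 3 for P3, 0 where not applicable


def rank(machines):
    """Sort best-first by (CPU count, clock speed, processor generation)."""
    return sorted(
        machines,
        key=lambda m: (m.cpus, m.clock_ghz, m.generation),
        reverse=True,
    )


found = [
    Machine("P3-2.0GHz", 1, 2.0, 3),
    Machine("P4-2.0GHz", 1, 2.0, 4),
    Machine("Supercomputer-5 CPUs", 5, 0.0, 0),
    Machine("P4-2.5GHz", 1, 2.5, 4),
    Machine("P3-1.0GHz", 1, 1.0, 3),
    Machine("P3-1.3GHz", 1, 1.3, 3),
    Machine("Supercomputer-8 CPUs", 8, 0.0, 0),
    Machine("P4-1.5GHz", 1, 1.5, 4),
    Machine("P3-1.13GHz", 1, 1.13, 3),
    Machine("P4-2.4GHz", 1, 2.4, 4),
]
priority_order = [m.label for m in rank(found)]
```

A production scheduler would likely also weigh memory, network bandwidth and current load, which the example does not mention.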

Now the user can select any number of files to download. The size of each file is obtained and stored in a priority queue, with the largest file given the highest priority. The highest-priority file is then matched with the highest-priority system in the network, and the files are distributed evenly among their matched idle systems. The downloads are completed on those systems and the files are stored in the common database. An authenticated user can access this database and retrieve the files that he has downloaded.

The various processes that are taking place in our Grid such as authentication, availability of resources, scheduling, data management and finally job and resource management are viewed by following a standard architecture – The Globus Architecture.

9.EMPLOYING THE GLOBUS ARCHITECTURE IN OUR GRID:

While planning to implement a Grid project, we must address issues like security, managing and brokering the workload, and managing data and resources information. Most Grid applications contain a tight integration of all these components.


The Globus Project provides open source software tools that make it easier to build computational Grids and Grid-based applications. These tools are collectively called the Globus Toolkit, the open source Grid technology for computing and data Grids. On the server side, Globus Toolkit 2.2 provides interfaces in C. On the client side, it provides interfaces in C, Java, and other languages; the Java interfaces are provided by the Java Commodity Grid (CoG) Kit. Globus runs on Linux, AIX, HP-UX, Solaris, and also on Windows operating systems. The Globus architecture represents a multiple-layer model. The local services layer contains the operating system services and network services like TCP/IP. In addition, there are the services provided by cluster scheduling software (like IBM LoadLeveler): job submission, query of queues, and so forth. The cluster scheduling software allows better use of the existing cluster resources. The higher layers of the Globus model enable the integration of multiple or heterogeneous clusters.

10.ACCESSING THE INTRANET GRID:

When a user wants to access our proposed Intranet Grid in order to download multiple files over the Internet, he should follow certain procedures that we consider necessary for the security of our Grid. The main requirements for processing in a Grid environment are:

Security: single sign-on, authentication, authorization, and secure data transfer.

Resource Management: remote job submission and management.

Data Management: secure and robust data movement.

Information Services: directory services of available resources and their status.

Fault Detection: Checking the intranet.

Portability: C bindings (header files) needed to build and compile programs.

11.EXISTING ALGORITHM FOR GLOBUS ARCHITECTURE:

Step [1]. Create security_proxy via GSI services

Step [2]. Access an MDS-GIIS server

Step [3]. Search for required machine(s)

Step [4]. Rank the machine list based on a scheduling policy

Step [5]. Prepare the data

Step [6]. Transfer the data to the target machine by using GASS services

Step [7]. Prepare an RSL document

Step [8]. Submit the program using GRAM services

Step [9]. React to status changes from GRAM

Step [10]. Get results via GASS

Here, the resources available in the network have been obtained automatically by having the Globus Toolkit on the server. When we want to download a file, this information has to be matched with the client module, and the downloading then has to be carried out on the clients. For this we have added some modules to the Grid architecture.

ADDED MODULE:

Step [11]. Get the information about the files to be downloaded.

Step [12]. Match the files with appropriate machines.

Step [13]. Store the files in the common database.

Step [14]. Retrieve data from the database after proper authentication.
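Steps 13 and 14 of the added module can be pictured with the following Python sketch. It is only an illustration under loud assumptions: an in-memory dictionary stands in for the common database, and a salted password hash stands in for the Grid's real authentication, which would go through GSI. The class `CommonStore` and its methods are hypothetical names, not Globus APIs.

```python
import hashlib
import hmac
import os


class CommonStore:
    """In-memory stand-in for the common database of downloaded files."""

    def __init__(self):
        self._files = {}      # (owner, filename) -> file contents
        self._pw_hashes = {}  # user -> (salt, password hash)

    @staticmethod
    def _hash(password, salt):
        # Salted PBKDF2 hash; a placeholder for GSI-based authentication.
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def register(self, user, password):
        salt = os.urandom(16)
        self._pw_hashes[user] = (salt, self._hash(password, salt))

    def authenticate(self, user, password):
        if user not in self._pw_hashes:
            return False
        salt, expected = self._pw_hashes[user]
        return hmac.compare_digest(expected, self._hash(password, salt))

    def store(self, owner, filename, data):
        """Step 13: store a downloaded file in the common database."""
        self._files[(owner, filename)] = data

    def retrieve(self, user, password, filename):
        """Step 14: hand a file back only after proper authentication."""
        if not self.authenticate(user, password):
            raise PermissionError("authentication failed")
        return self._files[(user, filename)]


db = CommonStore()
db.register("alice", "s3cret")
db.store("alice", "dataset.iso", b"downloaded bytes")
payload = db.retrieve("alice", "s3cret", "dataset.iso")
```

Keying the store by owner means a user can only ask for files downloaded on his own behalf, which matches the requirement that retrieval follows authentication.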

You’ll also see how Grid services and the very framework it all rests on is very much like object-oriented programming.

12.PROPOSED ALGORITHM FOR OUR INTRANET GRID:

Steps to perform multiple downloading on the Grid. The host details are obtained from the LAN server in order to identify the various hosts. The host information is fetched whenever needed, on a priority-queue basis.

//module for downloading files

[1]. Start lookup // look up file size and resource information

[2]. Declare nres, nfiles // number of resources available and number of files

[3]. Input nres, nfiles

[4]. Input size // the file size

[5]. Initialize P1 with res info // store the resource information in priority queue P1, with the highest system configuration as priority

[6]. Initialize P2 with file size // store the file information in priority queue P2, with the maximum file size as priority

[7]. If (nfiles == nres) // check whether the number of resources equals the number of files

[8]. Initialize counter

[9]. For (counter = 1; counter <= nres; counter++) // loop to assign the files

[10]. Assign the 1st file of P2 to the 1st node in P1 // the first node is the one with the highest configuration and the first file is the one with the maximum size

[11]. Start processing // files are directed to the appropriate system to use its wasted CPU cycles

[12]. Loop

[13]. Else:

[14]. Start timer

[15]. Delay for 1 min

[16]. Collect incoming files // the files that the user clicked to download during this interval

[17]. Assign the files to P2

[18]. Goto step 8

[19]. Goto step 1

[20]. End // when the user exits from the proposed Grid
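One possible reading of the steps above is sketched below in Python. Several points are our assumptions rather than part of the algorithm as written: machines are given as `(name, score)` pairs and files as `(name, size_mb)` pairs, the one-minute collection window is replaced by a `collect_more` callback so the sketch runs instantly, and files left over when there are more files than machines are returned for a later round. The function names are hypothetical.

```python
import heapq


def build_queue(items, priority):
    """Build a max-priority queue as a heap of (-priority, index, item)."""
    heap = [(-priority(item), i, item) for i, item in enumerate(items)]
    heapq.heapify(heap)
    return heap


def assign(machines, files):
    """Pair the largest remaining file with the best remaining machine.

    Returns (assignments, leftover_files); leftovers wait for a later round.
    """
    p1 = build_queue(machines, lambda m: m[1])  # P1: resources by score
    p2 = build_queue(files, lambda f: f[1])     # P2: files by size
    assignments = []
    while p1 and p2:
        _, _, machine = heapq.heappop(p1)
        _, _, file = heapq.heappop(p2)
        assignments.append((file[0], machine[0]))
    leftover = [item for _, _, item in sorted(p2)]
    return assignments, leftover


def run_round(machines, files, collect_more):
    """Assign at once if the counts match, else collect more files first."""
    if len(files) != len(machines):
        # Stands in for steps 14-17: wait, then add newly requested files.
        files = files + collect_more()
    return assign(machines, files)


machines = [("pc-a", 2.5), ("pc-b", 1.0), ("pc-c", 2.0)]
files = [("small.zip", 40), ("big.iso", 700)]
late_requests = lambda: [("mid.tar", 300), ("tiny.txt", 1)]
assignments, leftover = run_round(machines, files, late_requests)
```

With these inputs the largest file goes to the highest-scored machine and `tiny.txt` remains queued, since only three machines are available for four files.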

13. CHALLENGES OF GRID

A word of caution should be given to the overly enthusiastic. The grid is not a silver bullet that can take any application and run it 1000 times faster without the need to buy any more machines or software. Not every application is suitable or enabled for running on a grid. Some kinds of applications simply cannot be parallelized. For others, it can take a large amount of work to modify them to achieve faster throughput. The configuration of a grid can greatly affect the performance, reliability, and security of an organization’s computing infrastructure. For all of these reasons, it is important for us to understand how far the grid has evolved today and which features are coming tomorrow or in the distant future.

14.Job flow in a grid environment

When enabling an application for a grid environment, it is important to keep in mind the components of the grid and how they relate to and interact with one another. Depending on your grid implementation and application requirements, there are many ways in which these pieces can be put together to create a solution.

15.Cloud Computing vs Grid Computing:

For some, the comparison between these two types of computing can be hard to understand, since they are not mutually exclusive. Rather, both are used to improve the utilization of available resources.

The main differentiating factor between the two is the method each adopts for computing tasks within its own environment. In grid computing, a single big task is split into multiple smaller tasks, which are then distributed to different computing machines. Upon completion of these smaller tasks, the results are sent back to the primary machine, which in turn produces a single output.

A cloud computing architecture, by contrast, is intended to let users use different services without investing in the underlying architecture. A grid offers a similar facility for computing power, but cloud computing is not restricted to that. With a cloud, users can avail themselves of various services such as website hosting.

In some aspects cloud computing will beat Grid computing; in other aspects Grid computing will beat cloud computing.

16.Grid Usage :

1)Overview of AppLogic

2)Application Configuration

3)Application Provisioning

4)Application Template and Provisioning with AppLogic 2.3.9

5)Custom Application Development

6)Application Migration

7)Hands-on Custom Appliances

8)Creating Custom Appliances Catalogs

9)Building Appliances with the new APK

10)New Linux Distro Appliances

11)Application Architecture and Development

12)Building Application Scalability

13)Creating Assemblies

14)Installing cPanel on AppLogic

15)Volume Maintenance

16)Failure Handling Recovery

17)High Availability

18)Scalable cPanel Application overview

19)Backup and Disaster Recovery Strategies

17.ADVANTAGES

Some advantages are quite obvious.

No need to buy large symmetric multiprocessor (SMP) servers for applications that can be split up and farmed out to smaller servers (which cost far less than SMP servers). Results can then be concatenated and analyzed upon job completion.

Much more efficient use of idle resources. Jobs can be farmed out to idle servers or even idle desktops. Many of these resources sit idle, especially outside business hours.

Grid environments are much more modular and don’t have single points of failure. If one of the servers or desktops within the grid fails, there are plenty of other resources able to pick up the load. Jobs can automatically restart if a failure occurs.

This model scales very well. If more compute resources are needed, just plug them in by installing the grid client on additional desktops or servers. They can be removed just as easily on the fly.

Upgrading can be done on the fly without scheduling downtime. Since there are so many resources, some can be taken offline while leaving enough for work to continue. This way upgrades can be cascaded so as not to affect ongoing projects.

Jobs can be executed in parallel, speeding up performance. Using tools like MPI allows message passing to occur among compute resources.

Can solve larger, more complex problems in a shorter time

Easier to collaborate with other organizations

Make better use of existing hardware

18.DISADVANTAGES

Grid software and standards are still evolving

Learning curve to get started

Non-interactive job submission

19.CURRENT PROJECTS AND APPLICATIONS

The Enabling Grids for E-sciencE (EGEE) project, which is based in the European Union and includes sites in Asia and the United States, is a follow-up project to the European DataGrid (EDG) and is arguably the largest computing grid on the planet. This, along with the LHC Computing Grid (LCG), has been developed to support the experiments using the CERN Large Hadron Collider. The LCG project is driven by CERN’s need to handle huge amounts of data, where storage rates of several gigabytes per second (10 petabytes per year) are required. A list of active sites participating within LCG can be found online, as can real-time monitoring of the EGEE infrastructure. The relevant software and documentation are also publicly accessible.

Another well-known project is distributed.net, which was started in 1997 and has run a number of successful projects in its history.

The NASA Advanced Supercomputing facility (NAS) has run genetic algorithms using the Condor cycle scavenger running on about 350 Sun and SGI workstations.

Until April 27, 2007, United Devices operated the United Devices Cancer Research Project based on its Grid MP product, which cycle-scavenges on volunteer PCs connected to the Internet. As of June 2005, the Grid MP ran on about 3.1 million machines.

Another well-known project is the World Community Grid. The World Community Grid’s mission is to create the largest public computing grid that benefits humanity. This work is built on the belief that technological innovation combined with visionary scientific research and large-scale volunteerism can change our world for the better. IBM Corporation has donated the hardware, software, technical services, and expertise to build the infrastructure for World Community Grid and provides free hosting, maintenance, and support.

20.CONCLUSION:

Grid computing was once said to be fading out, but thanks to technological convergence it is blooming once again. The Intranet Grid we have proposed is a milestone toward the globalization of Grid architecture, leading to the fast computing that we expect to become widespread in the near future. With our proposed Intranet Grid implemented, multiple files can be downloaded quickly, and security is addressed by authenticating every step that takes place in our Grid, in particular user access to the database. Further implementation could be carried out in the near future.

ACKNOWLEDGEMENTS

Thank you for reviewing our paper; we await your comments and suggestions.


Authors Biography

A. Rajashree is a second-year B.Tech IT student at Bannari Amman Institute of Technology, Sathyamangalam, Erode District. Mail id: [email protected]. Mobile number: 9965546234

Tharani V. is a second-year B.Tech IT student at Bannari Amman Institute of Technology, Sathyamangalam, Erode District. Mail id: [email protected]. Mobile number: 8508538449
