Emule Peer To Peer File Sharing Information Technology Essay

eMule is a free peer-to-peer file sharing application for Microsoft Windows. Started in May 2002 as an alternative to eDonkey2000, eMule now connects to both the eDonkey network and the Kad network. The distinguishing features of eMule are the direct exchange of sources between client nodes, fast recovery of corrupted downloads, and the use of a credit system to reward frequent uploaders. Furthermore, eMule transmits data in zlib-compressed form to save bandwidth.

Each eMule client is preconfigured with a list of servers and a list of shared files on its local file system. A client uses a single TCP connection to an eMule server to log into the network and to get information about desired files and available clients. The client also uses several hundred TCP connections to other clients, over which it uploads and downloads files. Each eMule client maintains an upload queue for each of its shared files. Downloading clients join the queue at the bottom and advance gradually until they reach the top and begin downloading the file. A client may download the same file from several other eMule clients, getting different fragments from each one. A client may also upload chunks of a file that it has not yet finished downloading. Finally, eMule extends the eDonkey capabilities and allows clients to exchange information about servers, other clients and files. Note that both client and server communication is TCP based. The server employs an internal database in which it stores information about clients and files. An eMule server does not store any files; it acts as a centralized index of file locations. An additional function of the server, which is becoming deprecated, is to bridge between clients that connect through a firewall and cannot accept incoming connections. This bridging functionality considerably increases the server load. eMule also employs UDP to enhance the client’s capabilities toward both the server and other clients. The ability to send and receive UDP messages is not mandatory for the client’s correct operation: the client functions normally even when a firewall blocks UDP messages.

Figure 1.1: eMule high level network diagram

1.1.1 Client to server connection

Upon startup the client connects using TCP to a single eMule server. The server provides the client with a client ID (section 1.2), which is valid only for the lifetime of the client-server connection (note that a client with a high ID receives the same ID from all servers until its IP address changes). Following connection establishment, the client sends the server its list of shared files. The server stores the list in its internal database, which usually contains several hundred thousand available files and active clients. The eMule client also sends its download list, which contains the files that it wishes to download. Section 2 provides a detailed description of the eMule client and server TCP message exchange. After the connection is established, the eMule server sends the client a list of other clients that possess files which the connecting client wishes to download (these clients are called ‘sources’). From this point on, the eMule client begins to establish connections with other clients as described in section 1.1.2 below.

Note that the client/server TCP connection is kept open during the client’s entire session. After the initial handshake, transactions are triggered mainly by user activity. From time to time the client sends file search requests, which the server answers with search results. A search transaction is usually followed by a query for the sources of a specific file; the reply to this query is a list of sources (IP and port) from which the requester can download the file. UDP is used for communication with servers other than the one to which the client is currently connected. The purpose of the UDP messages is file search enhancement, source search enhancement and keep-alive (making sure that all the eMule servers in the client’s server list are still valid).

1.1.2 Client to client connection

An eMule client connects to another eMule client (a source) in order to download a file. A file is divided into parts, which are further fragmented. A client may download the same file from several different clients, getting different fragments from each one.


When two clients connect they exchange capability information and then negotiate the start of a download (or upload, depending on perspective). Each client has a download queue, which holds the list of clients that are waiting to download files from it. When the queue is empty, a download request will most probably result in the download starting (unless, for example, the requester is banned). When the queue is not empty, a download request results in the requesting client being added to the queue. A client does not attempt to serve more than a few clients at any given moment, so that each one receives a minimum bandwidth of 2.4 KB/s. A downloading client may be preempted by a waiting client with a higher queue ranking; to prevent thrashing, the queue ranking of a downloading client is boosted during the first 15 minutes of its download session. When a waiting client reaches the head of the queue, the uploading client initiates a connection in order to send it the file parts it needs. An eMule client may be on the waiting queues of several other clients, registered to download the same file parts from each of them. When the waiting client completes downloading the parts from one of them, it does not notify the rest that they can remove it from their queues; it simply rejects their upload attempts when it reaches the head of their queues. eMule employs a credit system (see section 1.3) in order to encourage uploads. To prevent impersonation, eMule secures the credit system using RSA public-key cryptography. Client connections may use a set of messages not defined by the eDonkey protocol; these messages are called the extended protocol. The extended protocol is used to implement the credit system, to exchange general information (such as updates to the lists of servers and sources) and to improve performance by sending and receiving compressed file fragments.
The eMule client connection uses UDP in a limited manner: while a client is waiting to start downloading a file, it periodically checks its status on the upload queues of its peer clients.

1.2 Client ID

The client ID is a 4-byte identifier provided by the server during the connection handshake. A client ID is valid only for the lifetime of a client-server TCP connection, although a client with a high ID is assigned the same ID by all servers until its IP address changes. Client IDs are divided into low IDs and high IDs. The eMule server typically assigns a client a low ID when the client cannot accept incoming connections. Having a low ID restricts the client’s use of the eMule network and might result in the server rejecting the client’s connection. A high ID is calculated on the basis of the client’s IP address. This section describes the client ID assignment and its significance from the eMule protocol point of view. A high ID is given to clients that allow other clients to connect freely to eMule’s TCP port on their host machine (the default port number is 4662). A client with a high ID has no restrictions on its use of the eMule network. When the server cannot open a TCP connection to the client’s eMule port, the client is given a low ID. This happens mainly with clients that have set up a firewall on their machine that denies incoming connections. A client might also receive a low ID in the following cases:

• When the client is connected through a NAT or a proxy server.

• When the server is too busy (causing the server’s reconnection timer to expire).
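
The text does not spell out how a high ID is derived from the IP address. The following is a minimal sketch of the rule commonly documented for the eDonkey/eMule protocol, and it should be read as an assumption rather than as part of this essay: for an address A.B.C.D the ID is A + 2^8*B + 2^16*C + 2^24*D, and IDs below 2^24 (16777216) are treated as low IDs. The function and constant names are hypothetical.

```python
import ipaddress

# Assumption: IDs below 2**24 are low IDs (commonly documented threshold).
LOW_ID_LIMIT = 1 << 24


def high_id_from_ip(ip: str) -> int:
    """Return the 4-byte client ID for the IPv4 address A.B.C.D.

    Assumed formula: A + 2**8 * B + 2**16 * C + 2**24 * D.
    """
    a, b, c, d = ipaddress.IPv4Address(ip).packed
    return a + (b << 8) + (c << 16) + (d << 24)


def is_low_id(client_id: int) -> bool:
    """True when the ID falls in the assumed low ID range."""
    return client_id < LOW_ID_LIMIT


if __name__ == "__main__":
    cid = high_id_from_ip("1.2.3.4")
    print(cid, "low" if is_low_id(cid) else "high")
```

Under this formula the ID depends only on the IP address, which is consistent with the statement that a high ID client receives the same ID from every server until its address changes.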

1.3 User ID

eMule supports a credit system in order to encourage users to share files. The more files a user uploads to other clients, the more credit the user receives and the faster the user advances in their waiting queues. The user ID is a 128-bit (16-byte) GUID created by concatenating random numbers, except that the 6th and 15th bytes are not randomly generated: their values are 14 and 111 respectively. While the client ID is valid only for a client’s session with a specific server, the user ID (also called the user hash) is unique and is used to identify a client across sessions (the user ID identifies the workstation). The user ID plays an important part in the credit system, which gives ‘hackers’ a motive to impersonate other users in order to receive the privileges granted by their credits. eMule supports an encryption scheme designed to prevent fraud and user impersonation.
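
The user ID layout described above can be illustrated with a short sketch. It assumes that the "6th and 15th bytes" are counted from 1, so they sit at zero-based offsets 5 and 14; the function names are hypothetical, and the choice of `os.urandom` as the random source is an assumption made for the example.

```python
import os


def new_user_id() -> bytes:
    """Build a 16-byte user ID: random bytes with two fixed marker bytes."""
    raw = bytearray(os.urandom(16))
    raw[5] = 14     # 6th byte (offset 5) is fixed to 14
    raw[14] = 111   # 15th byte (offset 14) is fixed to 111
    return bytes(raw)


def is_plausible_user_id(uid: bytes) -> bool:
    """Check only the length and the two marker bytes, nothing else."""
    return len(uid) == 16 and uid[5] == 14 and uid[14] == 111


if __name__ == "__main__":
    uid = new_user_id()
    print(uid.hex(), is_plausible_user_id(uid))
```

The marker bytes only make an ID recognizable; they do nothing to prove ownership, which is why the credit system relies on the encryption scheme mentioned above.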


1.4 File ID

File IDs are used both to uniquely identify files in the network and for file corruption detection and recovery. Note that eMule does not rely on a file’s name to identify and catalog it; a file is identified by a globally unique ID computed by hashing the file’s content. There are two kinds of file IDs: the first is used mainly for generating the unique file ID, and the second is useful for corruption detection and recovery.

1.4.1 File hash

Files are uniquely identified by a 128-bit GUID hash calculated by the client and based on the file’s contents. The GUID is calculated by applying the MD4 algorithm to the file’s data. When calculating the file ID, the file is divided into parts, each 9.28 MB long. A hash is calculated separately for each part, and then all the part hashes are combined into the unique file ID. When a downloading client finishes downloading a file part, it calculates the part hash and compares it against the part hash sent by its peer. Should the part be found corrupted, the client tries to recover by gradually replacing blocks of the part (180 KB each) until the calculated hash matches.
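
The part-wise hashing described above can be sketched as follows. Several details are assumptions, not statements from this essay: the part size is taken as 9,728,000 bytes, the part hashes are "combined" by hashing their concatenation, and a file that fits in one part uses its single part hash as the file ID. Real clients may treat edge cases differently. The digest is passed in as a parameter because `hashlib.new("md4")` is unavailable on some Python builds (it depends on the underlying OpenSSL configuration).

```python
import hashlib
from typing import Callable, List

PART_SIZE = 9_728_000  # assumed byte count for the 9.28 MB part size


def md4(data: bytes) -> bytes:
    """MD4 via hashlib; raises ValueError if this build has MD4 disabled."""
    return hashlib.new("md4", data).digest()


def part_hashes(data: bytes, digest: Callable[[bytes], bytes] = md4,
                part_size: int = PART_SIZE) -> List[bytes]:
    """Hash each fixed-size part of the file separately."""
    parts = [data[i:i + part_size] for i in range(0, len(data), part_size)]
    return [digest(p) for p in parts] or [digest(b"")]


def file_id(data: bytes, digest: Callable[[bytes], bytes] = md4,
            part_size: int = PART_SIZE) -> bytes:
    """Combine the part hashes into one file ID (assumed combination rule)."""
    hashes = part_hashes(data, digest, part_size)
    if len(hashes) == 1:
        return hashes[0]
    return digest(b"".join(hashes))
```

Keeping the per-part hashes around is what lets a downloader verify each part as soon as it arrives instead of waiting for the whole file.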

1.4.2 Root hash

The root hash is calculated for each part using the SHA1 algorithm, based on blocks of 180 KB each. It provides a higher level of reliability and fault recovery.
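
To show why block-level hashes help with fault recovery, here is a minimal sketch that hashes each 180 KB block of a part with SHA1 and reports which blocks differ from a trusted list. It does not reproduce how eMule derives the root hash from the block hashes, which this text does not describe; the function names are hypothetical and 180 KB is taken as 180 * 1024 bytes.

```python
import hashlib
from typing import List

BLOCK_SIZE = 180 * 1024  # assumed byte count for a 180 KB block


def block_hashes(part: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """SHA1 digest of every block in a part, in order."""
    return [hashlib.sha1(part[i:i + block_size]).digest()
            for i in range(0, len(part), block_size)]


def corrupted_blocks(part: bytes, trusted: List[bytes],
                     block_size: int = BLOCK_SIZE) -> List[int]:
    """Indices of blocks whose hash differs from the trusted hash."""
    mine = block_hashes(part, block_size)
    return [i for i, (got, want) in enumerate(zip(mine, trusted))
            if got != want]
```

With block hashes available, a client that detects a corrupted part needs to fetch only the listed blocks again rather than the whole 9.28 MB part.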

1.5 eMule protocol extensions

Although eMule is completely compatible with eDonkey, it implements several extensions that allow two eMule clients to provide additional functionality to their users. The extensions focus on client-to-client communication, especially in the areas of security and UDP utilization.

1.6 Soft and hard limits

The server configuration includes two kinds of limits on the number of active users: soft and hard. The hard limit is greater than or equal to the soft limit. When the number of active users reaches the soft limit, the server stops accepting new low ID client connections. When the user count reaches the hard limit, the server is full and does not accept any client connections.
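
The two limits amount to a small admission rule, sketched below. The function name and parameters are hypothetical; only the rule itself comes from the paragraph above.

```python
def accepts_connection(active_users: int, soft_limit: int,
                       hard_limit: int, low_id: bool) -> bool:
    """Decide whether a server admits one more client."""
    if hard_limit < soft_limit:
        raise ValueError("hard limit must be >= soft limit")
    if active_users >= hard_limit:
        return False          # server is full for everyone
    if low_id and active_users >= soft_limit:
        return False          # past the soft limit, low ID clients are refused
    return True


if __name__ == "__main__":
    for users, low in [(50, True), (100, True), (100, False), (200, False)]:
        print(users, low, accepts_connection(users, 100, 200, low))
```

Setting both limits to the same value makes the server treat low ID and high ID clients alike.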

2 Client server TCP Communication

Each client connects to exactly one server using a TCP connection. The server assigns the client an ID, which is used to identify the client for the rest of its session with that server (a high ID client is always assigned an ID based on its IP address). The eMule GUI client requires an established server connection in order to operate. The client cannot be connected to several servers at the same time, nor can it change servers dynamically without user intervention.

2.1 Connection establishment

Figure 2.1: High ID login sequence

When establishing a connection to a server, the client may try to connect to several servers in parallel, abandoning all but one upon a successful login sequence.

There are several possible connection establishment use cases:

1. High ID connection – the server assigns a high ID to the connecting client

2. Low ID connection – the server assigns a low ID to the connecting client

3. Rejection session – the server rejects the client

Figure 2.1 describes the message sequence that leads to a high ID connection. In this case, the client establishes a TCP connection to the server and then sends a login message to the server. The server connects using another TCP connection to the client and performs a client-to-client handshake to make sure that the connecting client has the capability to accept connections from other eMule clients. After completing the client handshake the server closes the second connection and completes the client-server handshake by sending the ID change message.

Figure 2.2: Low ID login sequence

Figure 2.2 describes the message sequence that leads to a low ID connection. In this case, the server fails to connect to the requesting client and the client is assigned a low ID.


The server message usually contains a warning like “Warning [server details] – You have a Low ID. Please review your network configuration and your settings.”

Both low and high ID handshakes complete with the ID change message, which assigns the client its client ID for the upcoming session with the server.

Figure 2.3: Reject session sequence

Figure 2.3 describes the rejected session sequence. Servers might reject sessions due to the client’s having a low ID or when reaching their hard capacity limit. The server message will contain a short string describing the rejection reason.

2.2 Connection startup message exchange

Figure 2.4: Connection startup sequence

After a successful connection establishment, the client and server exchange several setup messages. The purpose of these messages is to update each party about its peer’s state. The client starts by offering the server its list of shared files (see section 6.2.4) and then asks to update its list of servers. The server sends its status and version (sections 6.2.6 and 6.2.2), then sends its list of known eMule servers and provides some further self-identification details. Finally, the client asks for sources (other clients that can be accessed to download the files in its download list), and the server replies with a series of messages, one for each file in the client’s download list, until the complete sources list has been downloaded to the client.

2.3 File search

Figure 2.5: File search sequence

The file search is initiated by the user. The operation is simple: a search request (see section 6.2.9) is sent to the server, which answers with a search result (section 6.2.10). When there are many results, the search result message is compressed. Next, the user chooses one or more files to download; the client then requests sources for the chosen files, and the server replies with a list of sources (see section 6.2.12) for each requested file. The server may optionally send a status message just before the found sources reply. The status message (section 6.2.6) contains information about the current number of users and files supported by the server. Note that a complementary sequence of UDP messages enhances the client’s ability to locate sources for the files in its search list; for more details see section 3. After verifying that the sources are new, the eMule client adds them to its sources list and initiates connection attempts. Sources are contacted in the order in which the eMule client received them.

The eMule client connects to sources in the order in which they were added to its list. There is no priority mechanism for deciding which source to connect to. There is, however, a complicated mechanism for resolving situations in which the same source could be asked for several of the files on the client’s download list (note that eMule allows only a single upload connection between two clients). The selection algorithm is based on the priorities specified by the user and defaults to alphabetical ordering when no priority is specified. A detailed description of the handling of a source that can upload more than one file is available on the website.
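
The "verify that sources are new, then contact them in arrival order" behaviour can be sketched as a small list merge. The names below are hypothetical, and a source is modelled simply as the (IP, port) pair carried in the server's reply.

```python
from typing import List, Tuple

Source = Tuple[str, int]  # (IP, port) as returned in a sources reply


def add_new_sources(known: List[Source],
                    received: List[Source]) -> List[Source]:
    """Append unseen sources to `known` and return them in arrival order.

    The returned list is the order in which connection attempts would
    be started; no priority is applied.
    """
    seen = set(known)
    added: List[Source] = []
    for src in received:
        if src not in seen:
            seen.add(src)
            known.append(src)
            added.append(src)
    return added
```

Duplicates inside a single reply are dropped as well, so each source is contacted at most once per file.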

2.4 Callback mechanism

Figure 2.6: Callback sequence

The callback mechanism is designed to overcome the inability of low ID clients to accept incoming connections and thus to let them share their files with other clients. The mechanism is simple: when clients A and B are connected to the same eMule server and A requires a file that is located on B, but B has a low ID, A can send the server a callback request (see section 6.2.13), asking the server to have B call it back. The server, which already has an open TCP connection to B, sends B a callback requested message (section 6.2.14), providing it with A’s IP and port. B can then connect to A and send it the file without further overhead on the server. Obviously, only a high ID client can request a callback from a low ID client (a low ID client cannot accept incoming connections).
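
The conditions under which a callback applies can be summarized in a small decision function. This is a sketch under the rules stated above; the `Client` type, its fields and the returned labels are hypothetical and only model who can open a connection to whom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Client:
    name: str
    server: str    # identifier of the server the client is logged into
    low_id: bool   # True when the client cannot accept incoming connections


def how_to_reach(requester: Client, source: Client) -> str:
    """Return 'direct', 'callback' or 'unreachable' for requester -> source."""
    if not source.low_id:
        return "direct"        # the source accepts incoming connections
    if requester.low_id:
        return "unreachable"   # neither side can accept a connection
    if requester.server != source.server:
        return "unreachable"   # callback needs a shared server
    return "callback"          # the server asks the source to connect back


if __name__ == "__main__":
    a = Client("A", "server-1", low_id=False)
    b = Client("B", "server-1", low_id=True)
    print(how_to_reach(a, b))
```

In the callback case the server relays only the request; the file itself travels over the connection that B opens to A.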
