The Transaction Oriented Middleware

Middleware is a class of software technologies designed to help manage the complexity and heterogeneity inherent in distributed systems. It is defined as a layer of software above the operating system but below the application program that provides a common programming abstraction across a distributed system. In doing so, it provides a higher-level building block for programmers than Application Programming Interfaces (APIs) such as sockets that are provided by the operating system. This significantly reduces the burden on application programmers by relieving them of this kind of tedious and error-prone programming.

Middleware frameworks are designed to mask some of the kinds of heterogeneity that programmers of distributed systems must deal with. They always mask heterogeneity of networks and hardware. Most middleware frameworks also mask heterogeneity of operating systems or programming languages, or both. A few such as CORBA also mask heterogeneity among vendor implementations of the same middleware standard. Finally, programming abstractions offered by middleware can provide transparency with respect to distribution in one or more of the following dimensions: location, concurrency, replication, failures, and mobility.

The classical definition of an operating system is “the software that makes the hardware useable.” Similarly, middleware can be considered to be the software that makes a distributed system programmable. Just as a bare computer without an operating system can be programmed only with great difficulty, programming a distributed system is in general much more difficult without middleware, especially when heterogeneous operation is required. Likewise, it is possible to program an application in assembly language or even machine code, but most programmers find it far more productive to use high-level languages for this purpose, and the resulting code is also more portable.

Usage of Middleware

Many different kinds of middleware have been developed. They vary in the programming abstractions they provide and in the kinds of heterogeneity they mask beyond network and hardware.

Generally, middleware services provide a more functional set of application programming interfaces that allow an application to:

Be located transparently across the network, so that it can interact with another service or application

Filter data to make them usable or fit for publication, for example through an anonymization process for privacy protection

Be independent from network services

Be reliable and always available

Add complementary attributes like semantics

Transaction Oriented Middleware (TOM) (or Distributed Tuples)

A distributed relational database offers the abstraction of distributed tuples (i.e. particular instances of an entity), and is the most widely deployed kind of middleware today. It uses Structured Query Language (SQL), which allows programmers to manipulate sets of these tuples in an English-like language that nevertheless has intuitive semantics and rigorous mathematical foundations based on set theory and predicate calculus. Distributed relational databases also offer the abstraction of a transaction (which can be expressed, for example, in Transact-SQL, or T-SQL). Distributed relational database products typically offer heterogeneity across programming languages, but most do not offer much, if any, heterogeneity across vendor implementations. Transaction Processing Monitors (TPMs) are commonly used for end-to-end resource management of client queries, especially server-side process management and managing multi-database transactions. As an example consider the JINI framework (built on top of JavaSpaces) which is tailored for intelligent networked devices, especially in homes.
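
The tuple and transaction abstractions can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for a distributed database (SQLite itself is a local, embedded engine), and the account table, its rows, and the transfer helper are hypothetical names chosen for illustration.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two account tuples inside one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")  # triggers the rollback
        return True
    except ValueError:
        return False

def balances(conn):
    """Return every (id, balance) tuple as a dictionary."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

# Hypothetical schema and data, held in memory for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
with conn:
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 20)])
```

Calling transfer(conn, "alice", "bob", 30) updates both tuples together, while a transfer that would overdraw the source account leaves both tuples unchanged, which is the all-or-nothing behavior a transaction is meant to give.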

Advantages

Users can access virtually any database for which they have proper access rights from anywhere in the world (as opposed to their deployment in closed environments where users access the system only via a restricted network or intranet)

They address the problem of varying levels of interoperability among different database structures.

They facilitate transparent access to legacy database management systems (DBMSs) or applications via a web server without regard to database-specific characteristics.

Disadvantages

This is the oldest form of middleware, hence it lacks many features of more recent forms of middleware.

Does not provide failure transparency

Tight coupling between client and server

Remote Procedure Calls

A Remote Procedure Call (RPC) is an inter-process communication that allows a computer program to cause a subroutine or procedure to execute in another address space (commonly on another computer on a shared network) without the programmer explicitly coding the details for this remote interaction. That is, the programmer writes essentially the same code whether the subroutine is local to the executing program, or remote. When the software in question uses object-oriented principles, RPC is called remote invocation or remote method invocation.

Remote Procedure Call Middleware (RPCM) extends the procedure call interface familiar to virtually all programmers to offer the abstraction of being able to invoke a procedure whose body is across a network. RPC systems are usually synchronous, and thus offer no potential for parallelism without using multiple threads, and they typically have limited exception handling facilities.
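
As a rough illustration of the procedure-call abstraction, the sketch below uses the XML-RPC modules from Python's standard library, which is only one of many RPC systems. The add procedure and the start_server and remote_add helpers are hypothetical names, and the server runs in a background thread on the loopback interface purely so the example is self-contained.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    """The procedure body that lives on the server side."""
    return a + b

def start_server():
    """Start an XML-RPC server on a free loopback port in a background thread."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def remote_add(port, a, b):
    """Client side: the call reads like a local one but crosses the network."""
    with ServerProxy(f"http://127.0.0.1:{port}/") as proxy:
        return proxy.add(a, b)  # blocks until the reply arrives (synchronous)
```

After server = start_server(), a call such as remote_add(server.server_address[1], 2, 3) marshals the arguments, sends them over HTTP, and blocks until the result comes back. That blocking is the synchronous behavior described above.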

Advantages

Language-level function-call pattern that is easy for programmers to understand

Synchronous request/reply interaction

• Natural from a programming language point-of-view

• Matches replies to requests

• Built in synchronization of requests and replies

Distribution transparency (in the no-failure case)

• Hides the complexity of a distributed system

Various reliability guarantees

• Deals with some distributed systems aspects of failure

Failure transparency is provided

• Failures may be due to network and/or server congestion, or to client, network and/or server failure

• In such situations an error may be returned to the programmer, either at once or after the RPC library has retried the operation several times.
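
The retry behavior mentioned in the last point can be sketched as follows. This is an assumption-laden illustration rather than the behavior of any particular RPC library: RpcError, call_with_retries, and make_flaky are hypothetical names, and a ConnectionError stands in for whatever timeout or transport failure a real library would detect.

```python
import time

class RpcError(Exception):
    """Raised to the caller when the call still fails after all retries."""

def call_with_retries(remote_call, attempts=3, delay=0.0):
    """Invoke `remote_call`, retrying on transport failures before giving up."""
    last_error = None
    for _ in range(attempts):
        try:
            return remote_call()
        except ConnectionError as exc:
            last_error = exc
            time.sleep(delay)  # wait before the next attempt
    raise RpcError(f"gave up after {attempts} attempts") from last_error

def make_flaky(failures, result):
    """Hypothetical stand-in for a remote procedure that fails `failures` times."""
    state = {"left": failures}
    def call():
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("simulated timeout")
        return result
    return call
```

Note that retrying may cause the remote procedure to run more than once if only the reply was lost, which is one reason failures cannot be hidden completely.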

Disadvantages

Synchronous request/reply interaction

• Tight coupling between client and server

• Client may block for a long time if the server is loaded, hence the need for a multi-threaded client

• Slow or failed clients may delay servers when replying, so multi-threading is essential at servers

Distribution Transparency

• Not possible to mask all problems

RPC paradigm is not object-oriented

• Invoke functions on servers as opposed to methods on objects

Message Oriented Middleware

Message-Oriented Middleware (MOM) provides the abstraction of a message queue that can be accessed across a network. It is a generalization of the well-known operating system construct: the mailbox. It is very flexible in how it can be configured with the topology of programs that deposit and withdraw messages from a given queue. Many MOM products offer queues with persistence, replication, or real-time performance.
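
A minimal sketch of the queue abstraction follows, assuming an in-process queue.Queue as a stand-in for a networked, persistent message queue. The run_demo function and the None sentinel are hypothetical choices for the example.

```python
import queue
import threading

def run_demo(messages):
    """Deposit messages in a mailbox, then let a consumer withdraw them later."""
    mailbox = queue.Queue()  # the message queue shared by both sides
    received = []

    def consumer():
        while True:
            msg = mailbox.get()  # blocks until a message is available
            if msg is None:      # sentinel: no more messages
                break
            received.append(msg.upper())

    # The producer deposits every message before the consumer thread starts,
    # so sender and receiver do not need to be active at the same time.
    for m in messages:
        mailbox.put(m)
    mailbox.put(None)

    worker = threading.Thread(target=consumer)
    worker.start()
    worker.join()
    return received
```

The point of the design is that the producer never waits for the consumer: messages simply accumulate in the queue until something withdraws them.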

Advantages

Asynchronous interaction

Client and server are only loosely coupled

Messages are queued

Good for application integration

Support for reliable delivery service

Keep queues in persistent storage

Processing of messages by intermediate message server(s)

May do filtering, transforming, logging, etc.

Networks of message servers

Natural for database integration

Disadvantages

1) Poor programming abstraction (but has evolved)

• Rather low-level (cf. Packets)

• Request/reply more difficult to achieve, but can be done

2) Message formats originally unknown to middleware

• No type checking (but JMS addresses this in its implementation)

3) Queue abstraction only gives one-to-one communication

• Limits scalability (JMS addresses this with its publish/subscribe model)

Java Message Service

The Java Message Service (JMS) API is a Java Message Oriented Middleware (MOM) API for sending messages between two or more clients. JMS is a part of the Java Platform, Enterprise Edition, and is defined by a specification developed under the Java Community Process as JSR 914. It is a messaging standard that allows application components based on the Java 2 Platform, Enterprise Edition (J2EE) to create, send, receive, and read messages. It allows the communication between different components of a distributed application to be loosely coupled, reliable, and asynchronous.

Web Services

A web service is a method of communication between two electronic devices. The W3C defines a “web service” as a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (specifically the Web Services Description Language, WSDL). Other systems interact with the web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction with other Web-related standards.

There are two major classes of Web services: REST-compliant Web services and arbitrary Web services. In REST-compliant Web services, the primary purpose is to manipulate XML representations of Web resources using a uniform set of “stateless” operations, whereas in arbitrary Web services the service may expose an arbitrary set of operations.

“Big web services” use Extensible Markup Language (XML) messages that follow the SOAP standard and have been popular with traditional enterprises. In such systems, there is often a machine-readable description of the operations offered by the service written in the Web Services Description Language (WSDL). The latter is not a requirement of a SOAP endpoint, but it is a prerequisite for automated client-side code generation in many Java and .NET SOAP frameworks.
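
To make the shape of a SOAP message concrete, here is a small sketch that builds and parses a SOAP 1.1 envelope with Python's standard XML library. The service namespace, the GetPrice operation, and the helper names are hypothetical, and a real client would normally generate this code from a WSDL description rather than write it by hand.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.com/stock"                # hypothetical service namespace

def build_request(operation, params):
    """Wrap an operation and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")

def parse_request(xml_text):
    """Extract the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    op = body[0]  # first child of Body is the operation element
    operation = op.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in op}
    return operation, params
```

For instance, parse_request(build_request("GetPrice", {"symbol": "IBM"})) recovers the operation name and its parameters from the XML text that would travel over HTTP.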

IBM MQ Series

IBM WebSphere MQ (formerly known as IBM MQSeries) is a message-oriented middleware platform that is part of IBM’s WebSphere suite for business integration. Messages are stored in message queues that are handled by queue managers. A queue manager is responsible for the delivery of messages through server-to-server channels to other queue managers. A message has a header and an application body that is opaque to the middleware. No type-checking of messages is done by the middleware. Several programming language bindings of the API to send and receive messages to and from queues exist, among them a JMS interface.

WebSphere MQ comes with advanced messaging features, such as transactional support, clustered queue managers for load-balancing and availability, and built-in security mechanisms. Having many features of a request/reply middleware, WebSphere MQ is a powerful middleware, whose strength lies in the simple integration of legacy applications through loosely-coupled queues. Nevertheless, it cannot satisfy the more complex many-to-many communication needs of modern large-scale applications, as it lacks natural support for multi-hop routing and expressive subscriptions.

Object Oriented Middleware (OOM) or Distributed Object Middleware (DOM)

Object Oriented Middleware provides the abstraction of an object that is remote yet whose methods can be invoked just like those of an object in the same address space as the caller. Distributed objects make all the software engineering benefits of object-oriented techniques (encapsulation, inheritance, and polymorphism) available to the distributed application developer.

Every object-oriented middleware has an interface definition language (IDL) and supports object types as parameters, exception handling and inheritance. It also presents the concept of client stubs and server skeletons, which act as proxies for servers and clients respectively. The stubs and skeletons are created using the IDL compiler that is provided by the middleware. In addition, the OOM presentation layers need to map object references to the transport format. This is done via marshalling and unmarshalling of serialized objects.
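
The roles of stub, skeleton, and marshalling can be sketched in a few lines. This is a simplified illustration under several assumptions: JSON stands in for the middleware's wire format, a direct function call stands in for the network, the classes are written by hand rather than generated by an IDL compiler, and Account, Stub, and Skeleton are hypothetical names.

```python
import json

class Account:
    """Server-side object whose methods are invoked remotely."""
    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance

class Skeleton:
    """Server-side proxy: unmarshals a request and invokes the real object."""
    def __init__(self, target):
        self._target = target

    def handle(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        try:
            method = getattr(self._target, request["method"])
            reply = {"result": method(*request["args"])}
        except Exception as exc:  # report any server-side failure to the client
            reply = {"error": str(exc)}
        return json.dumps(reply).encode("utf-8")

class Stub:
    """Client-side proxy: turns method calls into marshalled requests."""
    def __init__(self, transport):
        self._transport = transport  # callable taking and returning bytes

    def __getattr__(self, name):
        def invoke(*args):
            request = json.dumps({"method": name, "args": args}).encode("utf-8")
            reply = json.loads(self._transport(request).decode("utf-8"))
            if "error" in reply:
                raise RuntimeError(reply["error"])  # surface the remote exception
            return reply["result"]
        return invoke
```

With account = Stub(Skeleton(Account(100)).handle), the call account.deposit(50) looks like an ordinary method invocation, yet the arguments and the result both pass through a serialized form on the way.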

Advantages

Support for object-oriented programming model

Objects, methods, interfaces, encapsulation, etc.

Exception handling is supported

Synchronous request/reply interaction – same as RPC

Location Transparency – system (ORB) maps object references to locations

Services comprising multiple servers are easier to build with OOM

RPC programming is in terms of server-interface (operation)

RPC system looks up server address in a location service

Disadvantages

Synchronous request/reply interaction only, which leads to tight coupling. Asynchronous Method Invocation (AMI) therefore had to be added to the technologies.

Distributed garbage collection is needed to automatically release the memory held by unused remote objects, which adds complexity and overhead

OOM is rather static and heavy-weight. This is bad for ubiquitous systems and embedded devices

Common Object Request Broker Architecture (CORBA)

CORBA is a standard for distributed object computing. It is part of the Object Management Architecture (OMA), developed by the Object Management Group (OMG), and is the broadest distributed object middleware available in terms of scope. It encompasses not only CORBA’s distributed object abstraction but also other elements of the OMA which address general purpose and vertical market components helpful for distributed application developers. CORBA offers heterogeneity across programming language and vendor implementations.

Distributed Component Object Model (DCOM)

DCOM is a distributed object technology from Microsoft that evolved from its Object Linking and Embedding (OLE) and Component Object Model (COM). DCOM’s distributed object abstraction is augmented by other Microsoft technologies, including Microsoft Transaction Server and Active Directory. DCOM provides heterogeneity across language but not across operating system or tool vendor. COM+ is the next-generation DCOM that greatly simplifies the programming of DCOM.

Remote Method Invocation (RMI)

Remote Method Invocation (RMI) is a facility provided by Java which is similar to the distributed object abstraction of CORBA and DCOM. RMI provides heterogeneity across operating system and Java vendor, but not across language. However, supporting only Java allows closer integration with some of its features, which can ease programming and provide greater functionality.

The RMI compiler generates stubs and skeletons for the coded client and server programs. The server class usually inherits from the pre-coded UnicastRemoteObject server class, and a security manager is installed. An instance of this class is then registered using the RMI Naming service. Any client can look up a remote server object in the registry, provided its name is known.
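
The register-then-look-up pattern can be sketched independently of Java. The following is a language-neutral illustration in Python and not the RMI API: Registry, NotBoundError, and EchoServer are hypothetical names, and a real registry would hand out remote stubs rather than local objects.

```python
class NotBoundError(KeyError):
    """Raised when a client looks up a name that no server has registered."""

class Registry:
    """Maps well-known names to server objects."""
    def __init__(self):
        self._bindings = {}

    def bind(self, name, remote_object):
        if name in self._bindings:
            raise ValueError(f"{name!r} is already bound")
        self._bindings[name] = remote_object

    def lookup(self, name):
        try:
            return self._bindings[name]
        except KeyError:
            raise NotBoundError(name) from None

class EchoServer:
    """Hypothetical server object that a client wants to reach."""
    def echo(self, text):
        return text
```

A server would call registry.bind("EchoService", EchoServer()) once at start-up, after which any client that knows the name can obtain the object with registry.lookup("EchoService").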

Reflective Middleware

Reflective middleware is simply a middleware system that provides inspection and adaptation of its behavior through an appropriate causally connected self-representation (CCSR).

It is a type of flexible object-oriented middleware for mobile and context-aware applications. It adapts to context by monitoring and substituting components. It also provides interfaces for reflection and customizability.

Objects can inspect the middleware's behavior, and the middleware can be dynamically reconfigured in response to that behavior.
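
Inspection and adaptation can be sketched with a channel whose components are visible and replaceable at run time. This is an illustration under stated assumptions rather than the design of any specific reflective middleware: ReflectiveChannel, identity, adapt_to_context, and the 100 kbps threshold are all hypothetical.

```python
import zlib

def identity(data):
    """Default encoder: pass the payload through unchanged."""
    return data

class ReflectiveChannel:
    """Sends payloads through named components that can be inspected and replaced."""
    def __init__(self):
        # The self-representation: changing this table changes the behavior.
        self._components = {"encode": identity}

    def inspect(self):
        """Report which component currently fills each role."""
        return {role: getattr(fn, "__name__", repr(fn))
                for role, fn in self._components.items()}

    def adapt(self, role, component):
        """Substitute a component at run time."""
        self._components[role] = component

    def send(self, payload):
        return self._components["encode"](payload)

def adapt_to_context(channel, bandwidth_kbps):
    """Hypothetical policy: switch to compression when the link is slow."""
    if bandwidth_kbps < 100:
        channel.adapt("encode", zlib.compress)
```

Because send always consults the component table, replacing an entry through adapt immediately changes what the channel does. This is the sense in which the self-representation is causally connected to the behavior.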

Advantages

It is more adaptable to its environment and better able to cope with change

Useful in hostile and/or dynamic environments

More suited for multimedia, group communication, real-time and embedded environments, handheld devices and mobile computing environments

Event Driven Middleware

This is a new underlying communication paradigm for building large-scale distributed systems on top of a middleware. Event-based communication is a viable alternative to the above-mentioned middleware types, and it uses “events” as the basic communication mechanism.

First, event subscribers, i.e. clients, express their interest in receiving certain events in the form of an event subscription. Then event publishers, i.e. servers, publish events which will be delivered to all interested subscribers. As a result, this model naturally supports a decoupled, many-to-many communication style between publishers and subscribers. A subscriber is usually indifferent to which particular publisher supplies the event that it is interested in. Similarly, a publisher does not need to know about the set of subscribers that will receive a published event.
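
The subscribe-then-publish flow can be sketched with a single in-process broker. It is a minimal illustration in which the Broker class, the topic names, and the predicate-based filter are hypothetical, and delivery is a direct function call rather than a network message.

```python
from collections import defaultdict

class Broker:
    """Routes published events to every subscriber whose subscription matches."""
    def __init__(self):
        self._subscriptions = defaultdict(list)  # topic -> [(predicate, callback)]

    def subscribe(self, topic, callback, predicate=lambda event: True):
        """Register interest in a topic, optionally filtered by event content."""
        self._subscriptions[topic].append((predicate, callback))

    def publish(self, topic, event):
        """Deliver the event to all matching subscribers; return how many received it."""
        delivered = 0
        for predicate, callback in self._subscriptions.get(topic, ()):
            if predicate(event):
                callback(event)
                delivered += 1
        return delivered
```

A publisher only ever talks to the broker, so it never learns which subscribers exist. Adding a second subscriber to the same topic requires no change on the publishing side.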

Advantages

Asynchronous communication

• Publishers and subscribers are loosely coupled

Many-to-many interaction between publishers and subscribers

• Scalable scheme for large-scale systems

• Publishers do not need to know subscribers, and vice versa

• Publishers, subscribers, and brokers can join and leave dynamically

Topic-based and content-based publish/subscribe are very expressive

• Filtered information delivered only to interested parties

• Efficient content-based routing through a broker network

Hermes

This is a scalable, event-based middleware architecture that facilitates the building of large-scale distributed systems. Hermes has a distributed implementation that adheres to its underlying design models. It is based on an implementation of a peer-to-peer routing layer to create a self-managed overlay network of event brokers for routing events. Its content-based routing algorithm is highly scalable because it does not require global state to be established at all event brokers. Hermes is also resilient against failure through the automatic adaptation of the overlay broker network and the routing state at event brokers. An emphasis is put on the middleware aspects of Hermes so that its typed events support a tight integration with an application programming language. Two versions of Hermes exist that share most of the codebase: an implementation in a large-scale, distributed systems simulator, and a full implementation with communication between distributed event brokers.

Advantages

Logical Network of Self-Organizing Event Brokers (P2P)

Scalable Design and Routing Algorithms

Expressive Content-Based Filtering

Clean Layered Design

Cambridge Event Architecture (CEA)

The Cambridge Event Architecture (CEA) was created in the early 90s to address the emerging need for asynchronous communication in multimedia and sensor-rich applications. It introduced the publish-register-notify paradigm for building distributed applications. This design paradigm allows the simple extension of synchronous request/reply middleware, such as CORBA, with asynchronous publish/subscribe communication. Middleware clients that become event sources (publishers) or event sinks (subscribers) are standard middleware objects.

First, an event source has to advertise (publish) the events that it produces, for example in a name service. In addition to the regular methods in its synchronous interface, an event source has a special register method so that event sinks can subscribe (register) to events produced by this source. Finally, the event source performs an asynchronous callback to the event sink’s notify method according to a previous subscription. Note that event filtering happens at the event sources, thus reducing communication overhead. The drawback of this is that the implementation of an event source becomes more complex since it has to handle event filtering.
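
The three steps (publish, register, notify) can be sketched as follows. This is a simplified illustration and not the CEA implementation: a plain dictionary stands in for the name service, the callback is made synchronously for brevity although the architecture describes it as asynchronous, and EventSource, EventSink, and emit are hypothetical names.

```python
class EventSink:
    """Subscriber: receives events through its notify method."""
    def __init__(self):
        self.received = []

    def notify(self, event):
        self.received.append(event)

class EventSource:
    """Publisher: advertises itself, accepts registrations, and notifies sinks."""
    def __init__(self, name_service, event_type):
        self._registrations = []
        name_service[event_type] = self  # publish: advertise in the name service

    def register(self, sink, event_filter=lambda event: True):
        """Register: a sink subscribes, optionally with a filter."""
        self._registrations.append((sink, event_filter))

    def emit(self, event):
        """Notify: call back every sink whose filter accepts the event."""
        for sink, event_filter in self._registrations:
            if event_filter(event):  # filtering happens at the source
                sink.notify(event)
```

Because the filter runs inside emit, events that no sink wants are never sent at all. The cost is that the source has to store and evaluate every filter itself.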

Despite the low latency, direct communication between event sources and sinks causes a tight coupling between clients. To address this, the CEA includes event mediators, which can decouple event sources from sinks by implementing both the source and sink interfaces, acting as a buffer between them. Chaining of event mediators is supported but general content-based routing, as done by other distributed publish/subscribe systems, is not part of the architecture.
