Chapter 24

Distributed Component Object Model

by Hiro Ruo



This chapter introduces the reader to the future of computing according to Microsoft: the distributed component object model (DCOM). DCOM allows COM objects to be distributed across remote servers and allows the entire operating system to consist of COM objects.

Information "Not-so-Super" Highway

The Information Superhighway is indeed a powerful tool, giving a user access to a plethora of resources and information all over the world. This is especially evident in the explosive growth in the number and variety of Web sites on the World Wide Web. The Web is a set of encyclopedias, a mail system, an enormous database, a museum, a newsroom, a telephone, an entertainment system, a broadcast network, a worldwide address book, and many more things. However, its true potential has yet to be exploited.

The technology growth on the Web is tremendous. Just yesterday, a text browser was a piece of software only a doctoral student could manage. Information available on the Internet was limited to the most obscure scientific research topics. Today, anyone with a modern PC can view high-quality art and maneuver the Internet at the touch of a mouse button. Video, audio, and interactive play are all part of the Internet technology toolkit. Technologies such as COM will make the Internet more powerful and flexible. But a superhighway it is not. Why not?

The limiting factor is the connection. Today's typical home computer consists of PCI video, Soundblaster-compatible audio, storage space close to 1GB, and the latest and greatest modem, which transmits and receives at an unimpressive 28Kbps (kilobits per second). This is because the only ubiquitous connection available in the typical house is a telephone line, so a typical home user connecting to the Internet must wait excruciatingly for the information to download to his or her local computer in order to use it. This situation will only worsen as the amount of information increases.

So, what is the solution? Installing ISDN or T-1 lines in a home is unreasonable for the average home computer owner. New communications technologies will take years to become ubiquitous. A better connectivity model for the Web that can utilize the currently available hardware is required, and the answer may be distributed computing.

The Current Programming Model

A good place to start in understanding the distributed computing model is to look at the current programming model. The average program requires three major components: a way for the user to interface with the program, a functional body that actually performs the tasks required by the program, and a method of storing the program. Figure 24.1 illustrates this model.

Figure 24.1 : The current programming model.

A user accesses the program through the user interface, which can be as simple as "Press Any Key to Continue" or as complex as a window graphical user interface. With this interface, the user has control of where to take the program next, as well as a method of entry to and exit from the program.

The processing element is the main computational component of the program. Any task that the program actually performs resides in this block. These tasks can range from the actual mathematical operations in a calculator program to the rendering of complex 3D images in a virtual reality program. The processing element takes instructions from the user through the user interface to determine what tasks to perform and how they will be carried out. It also returns the results of the computation to the user through the user interface.

The storage component is where the program itself and any transient data and information the program needs are kept. This includes the memory used to run the program and the hard drive where the program is kept. The user can access storage through the user interface and the processing element.

This model applies to any program run on a standalone computer. The user starts the program stored on his PC. He controls the program, enters any data, receives the results, and exits the program through the user interface. The program runs in the memory contained in the PC and utilizes the swap space and file system for storage of intermediate and final results.

This programming model is also implemented in the environment of the Web as a local phenomenon. For example, to run a nifty game found while surfing the Web, the user must download the program to his or her local computer's memory in order to run it. In this case, the Web is used only as an information retrieval system. And with the low bandwidth of a telephone line, this retrieval can be inefficient and lengthy. Therein lie the unnecessarily imposed limitations on the use of the Web today: Programmers are writing applications intended for local use only and distributing them across the Web as is.

Distributed Computing

So, what is distributed computing? And how can it help solve the bandwidth problem of today's Web?

The word "distributed" gives an image of many pieces of an item strewn in all directions, possibly in a well-organized manner. And that is the essence of the concept of distributed computing: the capability to utilize a dispersed set of resources to perform the act of computation.

In the current programming model described earlier, a program runs on the local computer on which it was started. The local computer multitasks the different functions of the program to emulate the simultaneous occurrence of all its events. In distributed computing, each task of the program may be assigned to another computer connected to the local computer. In this way, the execution of the program is shared by the local computer and others on the network to which it is connected. This is distributed computing.

Let's look at writing a book as an example to demonstrate this concept. Say you want to write a book to teach people how to use Visual J++. You could sit down, parse through the enormous amount of information required to understand how to use Visual J++, and then proceed with typing the hundreds of pages, drawing the many diagrams and figures, writing the numerous example programs for the book, authoring the CD-ROM of examples and help files, formatting the entire set of text and graphics into the finished product, drawing the catchy cover art, and compiling a listing of all the related books and articles to read. Obviously, this is not the most efficient way to accomplish the goal of writing a book. But writing a standalone program is performing just such a task: The writer probably does not have the resources to perform all the tasks necessary in a timely manner, even if he or she were superhuman.

Using a distributed computing model to write your book, you could seek out several other writers and have them write as many chapters in which they have expertise as they can. Then you could make arrangements with a publisher to communicate with the distributors, hire a trucking company and an airline pilot, have a lawyer work out the contract details, and so on. You get the picture.

In a similar manner, a program can run on multiple computers using the same programming model described previously. The local user interface is used to call various processing elements stored on different remote storage components to execute a program. This distribution across multiple computers allows more flexibility, more efficient resource allocation/sharing, more modularity, and more computing power. Figure 24.2 illustrates the distributed computing environment.

Figure 24.2 : The distributed programming model.
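
As a minimal sketch of this model, the following Python fragment splits one job into independent tasks, hands each task to a separate worker, and combines the partial results locally. The worker threads merely stand in for remote computers, and the task itself (counting words in chunks of text) and every name used are hypothetical illustrations rather than part of any DCOM interface.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Processing element: the work one (simulated) remote node performs."""
    return len(chunk.split())


def distributed_word_count(chunks, nodes=3):
    """Hand each chunk to a worker and combine the partial results locally."""
    # Each worker thread stands in for one remote computer on the network.
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    return sum(partial_counts)


chunks = ["the quick brown fox", "jumps over", "the lazy dog"]
print(distributed_word_count(chunks))
```

The local program keeps only the user-facing call and the final summation; everything else could, in principle, run wherever spare capacity exists.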

Distributed Web

The distributed computing model can be easily integrated into the Web. The Internet is a perfect infrastructure for such a model. A Web browser, or any program that facilitates access to the Internet, is the user interface. From here, the Web and all its resources are at the user's disposal.

The Web is distributed in the sense that it is a means by which one computer can connect to many other computers to access these computers' information. Too often, however, the resources on the Web act only as storage media. A user should have the processing power and storage capacity of any resource on the Internet. A local PC is not tailored to perform highly computationally intensive tasks such as large searches or database management, but there are many servers and other comparable nodes out there that have this capability. It would only be logical to utilize these resources. The Web PC and other similar low-cost Internet boxes were designed with the intent of harnessing this capability.

Figure 24.3 shows a possible collaboration of several Web servers to perform a complex 3D modeling requested by the PC connecting to the Internet. The search engine Web server is utilized to parse the resources of the Web for graphical objects to be used in the 3D model. The database Web server stores and facilitates the processing of the graphical object definition used. The graphical Web server generates the actual 3D model, collecting information found by the search and database servers. The computational Web server is utilized to perform the complex calculations required in the rendering of the 3D model. Although incomplete, this example demonstrates the potential of a distributed program on the Web.

Figure 24.3 : An example of a distributed program on the World Wide Web.

Currently, the de facto method for using the Web in a somewhat distributed manner is with common gateway interface (CGI) programs. CGI programs run on remote servers, giving users additional processing power for computing-intensive applications such as database access and searching algorithms. The intent of this protocol is to enable access to security-restricted information that otherwise cannot be utilized by the PC. It is also a good first step toward implementing and utilizing the potential of distributed computing power on the Web.
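
To make the CGI idea concrete, here is a minimal sketch of a CGI-style program in Python. The Web server passes the request to the program through environment variables such as QUERY_STRING, and the program writes its response headers, a blank line, and the body. The catalog contents, the "term" parameter, and the function name are assumptions made for illustration only.

```python
import os
from urllib.parse import parse_qs


def handle_request(environ):
    """Build a CGI response from the QUERY_STRING supplied by the Web server."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    term = params.get("term", [""])[0]
    # Hypothetical server-side data that the client PC cannot read directly.
    catalog = ["COM overview", "DCOM protocol", "RPC basics"]
    matches = [title for title in catalog if term.lower() in title.lower()]
    body = "\n".join(matches) or "no matches"
    # A CGI program emits headers, then a blank line, then the body.
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    print(handle_request(os.environ), end="")
```

The search runs entirely on the server; only the short text result travels back over the slow connection.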

As the Internet evolves into a more ideal distributed computing environment, better methods of sharing resources are needed. The distributed Web model is a possible solution. At a logical block level, this concept seems simple, but as we know, its implementation is no trivial matter. There are many questions to be answered regarding security, protocol, communication services, and operating system design. A possible answer to these questions is DCOM.

DCOM

DCOM is currently being developed by Microsoft Corporation as a follow-up to its COM implementations. DCOM is an application-level protocol for object-oriented remote procedure calls (RPCs) intended for a distributed system. It is based on the distributed computing environment (DCE) RPC specification developed by the Open Software Foundation (OSF). Through the use of COM, DCOM will support standard functionality of any distributed environment, including the Web, regardless of computer platform, communication protocol, and operating system.

The vision of Microsoft for COM, as previously described in this section, is a universal infrastructure for implementing highly portable, object-oriented software that can be used by any of its operating systems, including future versions of Windows 95 and Windows NT. All software, including operating system modules, applications, and networking, will be written so that they conform to the COM specification and therefore can be easily transplanted to any platform without customization.

DCOM is an extension of this idea. Just as future software implementations will use COM to access the local features of a computer, DCOM will allow these same software implementations to use remote features of other connected computers. Any resource, barring security restrictions, available to a local PC can then also be made available to remote PCs over a connection. A perfect implementation would be on the World Wide Web.

DCOM Key Features

A DCOM implementation is synonymous with an RPC implementation that conforms to the DCE specification. Key features of DCOM are portability, runtime binding across the network, transparency to requesting applications, and most importantly, a well-defined interface to allow remote task execution. Some details of DCOM features follow.

DCOM uses the network data representation (NDR) for the arbitrary data types supported by DCE RPC. A single common representation removes the complexity of developing separate support for a growing number of formats and task-specific syntaxes. With today's explosion of new formats and syntax, the effort to upgrade existing software for these features is tremendous. DCOM will allow one common method to support these and future technologies.

Another feature of DCOM is the inherent support for a secure distributed environment. DCOM implements the security provided by DCE RPC, including its capability for authentication, authorization, and data integrity. This is an absolute requirement in today's Web, due to the popularity of applications to transmit confidential data and the adoption of intranets by companies requiring limited, monitored access to confidential information.

Today, information is updated on the Web by simply overwriting existing data and programs. To keep the integrity of the existing interface, both server and client software must be updated simultaneously. With DCOM, this is not necessary. You can label an interface version with universally unique IDs (UUIDs). You can then update an existing interface by publishing a new UUID with the interface, so that both the old and new interfaces can be supported. And two parties can simultaneously update an existing interface without fear of conflict between the two updates.

To understand the implementation of DCOM, one must first understand the RPC protocol as defined by the DCE. The following is a list of terminology that requires definition and clarification:

The basis for RPC protocol is in a client/server implementation. An RPC is the protocol by which a client can call a resource available at a remote node (server) through its local RPC manager. The following steps illustrate how a client interacts with the server using RPC (see Figure 24.4):

Figure 24.4 : RPC client perspective.

  1. Before making the actual RPC call, the client must get a compatible binding, one that provides the required interface using a mutually supported protocol; it does this by searching a name service through the RPC API routines. Typically, the client first specifies the interface desired, and the RPC runtime uses this information to find bindings with compatible protocol sequences. The client can also specify a protocol sequence or a particular object UUID.
  2. After finding such a binding, the client must then import the binding information from that element. For each binding the client imports, the runtime provides a handle to the indexed server binding that refers to the binding information maintained by the client RPC runtime. A server binding handle, in this client case, may also include a pointer to a particular object UUID, if requested.
  3. If a compatible binding is found, the client can then make the actual RPC call using the server binding handle returned.
  4. The client runtime now has the binding information and any object UUID involved in the establishment of the binding handle, as well as the interface ID and the operation number of the requested routine.
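
The client-side steps above can be sketched in a few lines of Python. This is an illustrative model only: NameService, Binding, and make_call are hypothetical names, the name service is a simple in-memory list, and no real RPC runtime is involved.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Binding:
    interface_id: uuid.UUID
    protocol_sequence: str            # for example "ncacn_ip_tcp"
    network_address: str
    object_uuid: Optional[uuid.UUID] = None


class NameService:
    """Hypothetical in-memory stand-in for the name service."""

    def __init__(self):
        self._entries = []

    def export(self, binding):
        self._entries.append(binding)

    def import_bindings(self, interface_id, supported_protocols, object_uuid=None):
        """Steps 1-2: find bindings offering the interface over a shared protocol."""
        return [
            b for b in self._entries
            if b.interface_id == interface_id
            and b.protocol_sequence in supported_protocols
            and (object_uuid is None or b.object_uuid == object_uuid)
        ]


def make_call(binding, operation_number, args):
    """Steps 3-4: the request carries interface ID, operation number, object UUID."""
    return {
        "to": binding.network_address,
        "interface_id": binding.interface_id,
        "object_uuid": binding.object_uuid,
        "operation": operation_number,
        "args": args,
    }


CALC_IF = uuid.uuid4()                # hypothetical interface ID
names = NameService()
names.export(Binding(CALC_IF, "ncacn_ip_tcp", "server.example"))
names.export(Binding(CALC_IF, "ncadg_ip_udp", "server.example"))

handles = names.import_bindings(CALC_IF, {"ncacn_ip_tcp"})
request = make_call(handles[0], operation_number=0, args=(2, 3))
print(request["to"], request["operation"])
```

Only the binding whose protocol sequence the client also supports is returned, which is what "compatible binding" means in step 1.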

Now let's look at the same RPC call from the perspective of the server (see Figure 24.5):

Figure 24.5 : RPC server perspective.

  1. The server API defines a manager entry point vector (EPV) for each manager available on the server. When an RPC request arrives from a client, the operation number for the request is used to index an element from one of the manager EPVs.
  2. The server registers object UUID, interface ID, type UUID, and EPV associations with the RPC runtime to establish an interface mapping that allows the correct selection of the manager.
  3. The server returns the appropriate protocol sequences for the request, and the runtime establishes a set of endpoints for these sequences. The runtime will also return a set of binding handles that refer to this set of endpoints if requested to do so.
  4. If a call is received with a partial binding (one that lacks an endpoint), the endpoint mapper can use information registered by the server, including interface ID, binding information, and object UUID, to select an endpoint capable of handling the call.
  5. The server can export the binding information, minus an endpoint if one was available, through one or more name service entries.
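
The server-side steps above can be modeled with a small Python class. This is a sketch under stated assumptions: RpcServerRuntime and its methods are hypothetical, a manager EPV is represented as a plain list of callables indexed by operation number, and None stands in for "no particular type UUID."

```python
import uuid


class RpcServerRuntime:
    """Hypothetical stand-in for the server-side RPC runtime."""

    def __init__(self):
        self._managers = {}       # (interface ID, type UUID) -> manager EPV
        self._object_types = {}   # object UUID -> type UUID
        self._endpoints = {}      # interface ID -> endpoint

    def register_interface(self, interface_id, type_uuid, epv, endpoint):
        """Step 2: associate interface ID, type UUID, and EPV with the runtime."""
        self._managers[(interface_id, type_uuid)] = epv
        self._endpoints[interface_id] = endpoint

    def register_object(self, object_uuid, type_uuid):
        self._object_types[object_uuid] = type_uuid

    def map_endpoint(self, interface_id):
        """Step 4: complete a partial binding by supplying an endpoint."""
        return self._endpoints[interface_id]

    def dispatch(self, interface_id, object_uuid, operation_number, args):
        """Step 1: select the manager, then index its EPV by operation number."""
        type_uuid = self._object_types.get(object_uuid)
        epv = self._managers[(interface_id, type_uuid)]
        return epv[operation_number](*args)


CALC_IF = uuid.uuid4()                                 # hypothetical interface ID
calc_epv = [lambda a, b: a + b, lambda a, b: a * b]    # operations 0 and 1

runtime = RpcServerRuntime()
runtime.register_interface(CALC_IF, None, calc_epv, endpoint="1067")
print(runtime.map_endpoint(CALC_IF))
print(runtime.dispatch(CALC_IF, None, 1, (2, 3)))
```

The client never names a function directly; it sends an operation number, and the EPV chosen for the object decides which routine actually runs.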

The binding information is a key element of the transactions that occur during an RPC call. It is the means by which a client can access the server and its resources. The binding information format is not a rigid definition, however; Figure 24.6 illustrates the differences among sets of binding information.

Figure 24.6 : Binding information.
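
One textual form that binding information takes in DCE RPC is the string binding, which combines an optional object UUID, a protocol sequence, a network address, and an optional endpoint. The Python sketch below parses a simplified version of that form and reports whether the binding is partial (lacking an endpoint). The regular expression is an assumption that covers only the common cases, and the addresses shown are made-up examples.

```python
import re

# Simplified pattern: [object_uuid@]protocol_sequence:address[[endpoint,options]]
_STRING_BINDING = re.compile(
    r"^(?:(?P<object_uuid>[0-9a-fA-F-]{36})@)?"
    r"(?P<protocol_sequence>\w+):"
    r"(?P<network_address>[^\[\]]*)"
    r"(?:\[(?P<endpoint>[^\],]*)(?:,(?P<options>[^\]]*))?\])?$"
)


def parse_string_binding(text):
    """Split a string binding into its parts and flag partial bindings."""
    match = _STRING_BINDING.match(text)
    if match is None:
        raise ValueError(f"not a string binding: {text!r}")
    info = match.groupdict()
    # A binding without an endpoint is partial; the endpoint mapper fills it in.
    info["is_partial"] = not info["endpoint"]
    return info


full = parse_string_binding("ncacn_ip_tcp:192.0.2.10[1067]")
partial = parse_string_binding(
    "6f9619ff-8b86-d011-b42d-00c04fc964ff@ncadg_ip_udp:192.0.2.10"
)
print(full["endpoint"], full["is_partial"])
print(partial["object_uuid"], partial["is_partial"])
```

Both strings are valid binding information, yet they carry different sets of fields, which is the sense in which the format is not rigid.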

DCOM Functionality

A DCOM call, also called object RPC (ORPC), is actually an RPC call as specified by the DCE. With this implementation, DCOM inherits the security, reliability, and robustness inherent in the RPC protocol. This section discusses the few minor differences between DCOM and RPC.

DCOM utilizes an interface pointer identifier (IPID). This 128-bit identifier is synonymous with the UUID of RPC and is used to identify a specific interface of an object on a server. In fact, the static type of IPID is a UUID.

In a DCOM implementation, each call on an interface carries two additional implicit arguments: ORPCTHIS, sent with the request, and ORPCTHAT, returned with the response. An ORPCTHAT argument may also be present in a "fault," which is the result of calling an ORPC on a server that does not support the specified interface to the requested object.

The IPID may contain pertinent information to identify the server, object, and interface requested with an ORPC, but it cannot specify the binding information required to execute the ORPC. An additional object exporter identifier (OXID) is used to indicate the scope of an object. An OXID is used to determine the RPC string binding necessary to complete an ORPC by connecting to the desired IPID. The OXID is translated into a map of bindings that the referenced application can use to determine whether it is within the scope of the object. Attached to each OXID is an OXID object through which remote management of interface requests is returned.
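
The relationship between an IPID, its OXID, and the string bindings can be sketched as two lookup tables. The Python class below is a hypothetical model, not the DCOM resolver interface: the OXID is an arbitrary integer, the IPID is generated as a random UUID, and the string bindings are made-up examples.

```python
import uuid


class OxidResolver:
    """Hypothetical tables mapping IPIDs to OXIDs and OXIDs to string bindings."""

    def __init__(self):
        self._bindings_by_oxid = {}
        self._oxid_by_ipid = {}

    def register_exporter(self, oxid, string_bindings):
        self._bindings_by_oxid[oxid] = list(string_bindings)

    def export_interface(self, oxid):
        ipid = uuid.uuid4()       # an IPID has the static type of a UUID
        self._oxid_by_ipid[ipid] = oxid
        return ipid

    def resolve(self, ipid, client_protocols):
        """Return the bindings the client can use to reach this IPID."""
        oxid = self._oxid_by_ipid[ipid]
        return [
            binding for binding in self._bindings_by_oxid[oxid]
            if binding.split(":", 1)[0] in client_protocols
        ]


resolver = OxidResolver()
resolver.register_exporter(
    oxid=42,
    string_bindings=["ncacn_ip_tcp:192.0.2.10[1067]", "ncadg_ip_udp:192.0.2.10[1068]"],
)
ipid = resolver.export_interface(oxid=42)

print(resolver.resolve(ipid, {"ncacn_ip_tcp"}))
print(resolver.resolve(ipid, {"some_other_protocol"}))
```

An empty result means the client shares no protocol with the exporter and is therefore outside the scope of the object.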

An extension of DCOM is the concept of marshaled interface reference of an object. There are several types of marshaled interfaces: NULL, a reference to nothing; STANDARD, a standard reference to a remote object; CUSTOM, which gives an object control of references to itself; and HANDLER, which allows a reference through a proxy.

The DCOM protocol also allows remote reference counting on a per-interface basis. The incrementing and decrementing of the reference count are implemented using the RemAddRef and RemRelease calls. These calls are not processed immediately but are cached until all local references to a remote object are released. This makes for more efficient network communication, resulting in lower network traffic.
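
One way to picture this caching is a client-side counter that touches the network only when the first local reference is taken and when the last one is released. The Python sketch below models that reading of the behavior; RemoteRefCache and the send callback are hypothetical, and the actual batching rules of a DCOM implementation may differ.

```python
from collections import Counter


class RemoteRefCache:
    """Hypothetical client-side cache of per-interface reference counts."""

    def __init__(self, send):
        self._send = send              # callable standing in for the network
        self._local_refs = Counter()   # IPID -> number of local references

    def add_ref(self, ipid):
        if self._local_refs[ipid] == 0:
            self._send("RemAddRef", ipid)      # first reference: tell the server
        self._local_refs[ipid] += 1

    def release(self, ipid):
        if self._local_refs[ipid] == 0:
            raise ValueError("release without a matching add_ref")
        self._local_refs[ipid] -= 1
        if self._local_refs[ipid] == 0:
            self._send("RemRelease", ipid)     # last reference gone: tell the server


sent = []
cache = RemoteRefCache(lambda call, ipid: sent.append((call, ipid)))
for _ in range(3):
    cache.add_ref("ipid-1")
for _ in range(3):
    cache.release("ipid-1")
print(sent)
```

Six local operations produce only two network messages, which is the source of the reduced traffic.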

To handle abnormal termination of remote objects, DCOM supports a pinging system. Associated with an object are a pingPeriod and numPingsToTimeOut. These values allow a certain number of pings and lapse of time before an object that isn't responding is considered terminated. DCOM also supports Delta Pinging, which allows pings to be grouped into sets; a ping is then performed on a set of IDs instead of pinging for each ID requested. Again, this greatly reduces network traffic.
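
The arithmetic behind the timeout is simple: an object is presumed gone once more than pingPeriod multiplied by numPingsToTimeOut has elapsed without a ping. The Python sketch below models a ping set under that assumption. The class, its fields, and the numeric values are illustrative choices, not values taken from the DCOM specification.

```python
from dataclasses import dataclass, field


@dataclass
class PingSet:
    """Hypothetical server-side record for one set of pinged object IDs."""

    ping_period: float                 # seconds between expected pings
    num_pings_to_timeout: int          # missed pings tolerated before timeout
    last_ping: float = 0.0
    object_ids: set = field(default_factory=set)

    def ping(self, now, add=(), remove=()):
        """One ping refreshes the whole set and carries only the changes to it."""
        self.object_ids |= set(add)
        self.object_ids -= set(remove)
        self.last_ping = now

    def expired(self, now):
        allowed_silence = self.ping_period * self.num_pings_to_timeout
        return now - self.last_ping > allowed_silence


ping_set = PingSet(ping_period=120.0, num_pings_to_timeout=3)
ping_set.ping(now=0.0, add={"obj-a", "obj-b"})
ping_set.ping(now=100.0, add={"obj-c"}, remove={"obj-a"})

print(sorted(ping_set.object_ids))
print(ping_set.expired(now=400.0), ping_set.expired(now=500.0))
```

A single ping keeps every object in the set alive, so the number of messages grows with the number of sets rather than the number of objects.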

DCOM also utilizes causality IDs, in which a string of ORPCs can be linked by this UUID to indicate that they are causally related. With this information, an application receiving a string of ORPCs will have visibility into the transient relations of ORPCs, so the application can have some knowledge of possible deadlocks due to misprocessing an ORPC of a causal set.
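
As an illustration, a receiver can use the causality ID to tell a genuinely new request from a call that arrives as a consequence of one it is already processing. Queuing the latter behind the former would deadlock, because the original call cannot finish until the nested one is served. The Python sketch below is a hypothetical model of that bookkeeping and does not reproduce any DCOM data structure.

```python
import uuid
from collections import defaultdict


class CausalityTracker:
    """Hypothetical record of calls in progress, keyed by causality ID."""

    def __init__(self):
        self._in_progress = defaultdict(int)

    def begin_call(self, causality_id):
        """Return True if this call is nested inside one already in progress."""
        nested = self._in_progress[causality_id] > 0
        self._in_progress[causality_id] += 1
        return nested

    def end_call(self, causality_id):
        self._in_progress[causality_id] -= 1
        if self._in_progress[causality_id] == 0:
            del self._in_progress[causality_id]


tracker = CausalityTracker()
chain = uuid.uuid4()                             # one causally related string of calls

first = tracker.begin_call(chain)                # the original request
callback = tracker.begin_call(chain)             # a call caused by the first
unrelated = tracker.begin_call(uuid.uuid4())     # a separate chain of calls
print(first, callback, unrelated)
```

A nested call flagged this way can be serviced immediately instead of waiting behind the call that caused it.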

Table 24.1 shows the DCOM data types and structures. Refer to the DCOM specification for more detailed explanations.

Table 24.1. DCOM data types and structures.

Data Type          Description
OBJREF             Marshaled Object Reference
OBJREF_STANDARD    Standard Marshaled Object Reference
OBJREF_HANDLER     Handler Marshaled Reference
OBJREF_CUSTOM      Custom Marshaled Reference
OBJREF_NULL        NULL Marshaled Reference
STDOBJREF          Marshaled COM Interface Pointer
SORFLAGS           Object Exporter Flags
ORPCINFOFLAGS      ORPC Flags
ORPCTHIS           Request Marshaled Arguments
ORPCTHAT           Response Marshaled Arguments
HRESULTS           ORPC Return Value

Summary

This chapter presents an overview of the concept of distributed computing and how DCOM can be implemented on today's Web infrastructure to facilitate faster, more modular, more powerful Internet activities. The implementation of the DCOM protocol may help with some inefficiencies of the current Internet by allowing applications to be distributed and shared across the abundant resources available.

So, where does DCOM take us next? It is evident from all the hype over the Internet that Microsoft wishes to be a key contributor to the future of the Internet. The announcement of the availability of COM brings Microsoft's product line closer to a fully portable body of code that will allow for the proliferation of Microsoft products, including Windows 95 and Windows NT. At this writing, Microsoft has announced the transition of ActiveX and DCOM to an industry standards body. With the addition of DCOM, users can now take the built-in portability and modularity of COM to the next level: the Internet as a single body. As DCOM becomes a predominant implementation of Internet software, applications will be able to access any resource anywhere on the Web just as they access the resources of the local PC/operating system. This will allow flexibility and seamless functionality like never before. This is the vision of Microsoft and the future of the World Wide Web.