----------------------------------------------------------------------------
The Florida SunFlash

The Isis approach to reliable distributed computing

SunFLASH Vol 36 #20                                          December 1991
----------------------------------------------------------------------------
This article describes Isis, a commercially available toolkit for building distributed applications. -johnj
----------------------------------------------------------------------------

The Isis system is a toolkit for developing distributed applications. Built in layers, the system supports a core set of mechanisms for fault-tolerant process-group communication, on top of which tools are provided for managing replicated data, synchronizing processes, detecting and reacting to system reconfigurations and failures, supporting distributed computation, and for many other operations.

Isis process groups are virtually synchronous: they provide users with the illusion of a greatly simplified programming environment (a synchronous one), but are actually extremely asynchronous. Using this approach, it is easy to build high performance, highly reliable software that maintains consistency and availability even as it reacts dynamically to failures, overloads and other conditions.

Virtual synchrony has permitted our company to solve a wide range of distributed problems, and to package these solutions in the form of a toolkit for building distributed applications. The basic philosophy of the toolkit is to provide the ``glue'' with which Isis users construct distributed software out of non-distributed components. The programmer of such a distributed system focuses primarily on solving problems ``local'' to individual components, i.e. implementing code by which a specific system component reacts to events of various kinds, failures or restarts of other components, and so forth. Distributed functionality is introduced by issuing subroutine calls to Isis routines that operate upon process groups. Isis Distributed Systems Inc.
was founded by the inventors of this approach and is the only company currently offering software embodying this important new concept.

Replicated data example.

For example, to manage replicated data, an application might be structured around a process group whose members all have current copies; updates would be done by a procedure call to the Isis replicated update routines, and locking by calls to the Isis token passing routines (reads would be local). The programmer would implement the application-specific code associated with actually representing the shared data, but would not be concerned with the details of implementing the update and synchronization mechanisms using fault-tolerant protocols. Moreover, the programmer can use large numbers of groups -- indeed, process groups can be used simply to track membership of key parts of a system, such as a database service or a name service, so that new copies can be started if the current copy fails.

More complex replication schemes are also supported. For example, in an aircraft tracking application it might be useful to replicate the data structure used to maintain radar tracks. In this case, one could imagine combining shared memory with a process group. Message-oriented communication would be used for event-driven aspects of the computation, synchronization, and notifications that affect the state of the process group. Direct updates on the shared memory would be used where this yields a simpler and more efficient algorithm. Many Isis programmers use the system in this manner, and tools are included to make such applications as easy to build as possible. In fact, the simplest implementation of a process group whose members also share a region of memory involves merely issuing a call to the memory map procedures within UNIX or Mach at the time that a process first joins the group.

A completely dynamic system model.

This leads us to an important point. Process groups are formed at runtime in Isis.
That is, unlike other systems in which the set of programs that will form a group (or the machines on which they will run) must be designated in advance, Isis groups are formed while programs run, namely when calls are issued to the group join primitive. Isis maintains a namespace within which groups are registered (the names look like file system names, although they don't correspond to files). To join a group, a program simply invokes the join primitive using this symbolic name.

Isis handles the entire mechanism of running the join protocol fault-tolerantly. Isis arranges to have the credentials of the new process checked, and adds it to the process group. It arranges for an existing member of the group to transfer any group state that the new member might need (for example, the values of replicated data maintained within the group). It synchronizes these events with ongoing communication so that the state transferred reflects all the events (for example, all the data updates) up to the instant of the join, and it arranges that the new process will see all subsequent events. And it does this in such a way that even if a failure were to occur right in the middle of the join, the protocol would complete without delay.

Consider the example we have just discussed: one process in a group transferring the group's state to a newly joining member (the problem we call state transfer in Isis). In an asynchronous system, all sorts of things can go wrong: the state could fail to reflect some update that was occurring just as the join took place, or it could reflect an update that the new member will later see and might erroneously re-apply. Or, the process sending the state could fail in mid-transfer. A virtual synchrony environment addresses every one of these issues, and many more subtle ones as well.
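
As a rough illustration of the join and state-transfer behavior described above, here is a minimal single-process sketch in Python. It is not the Isis interface: the names Member, ToyGroup, join, fail and update are hypothetical, invented for this example, and the group name is made up. Because the sketch runs in one thread, updates trivially reach every member in the same order and a join cannot overlap an update; a real system has to obtain the same effect across machines using fault-tolerant protocols.

```python
class Member:
    """One group member holding its own local copy of the replicated data."""

    def __init__(self, name):
        self.name = name
        self.data = {}      # local replica; reads never leave the process
        self.applied = 0    # count of updates reflected in this replica

    def apply(self, key, value):
        self.data[key] = value
        self.applied += 1


class ToyGroup:
    """Single-threaded stand-in for a process group (hypothetical, not Isis)."""

    def __init__(self, name):
        self.name = name
        self.members = []

    def join(self, member):
        # State transfer: copy an existing member's state to the joiner
        # before the joiner becomes eligible for any later update.
        if self.members:
            donor = self.members[0]
            member.data = dict(donor.data)
            member.applied = donor.applied
        self.members.append(member)

    def fail(self, member):
        # A failed member simply drops out of the membership list.
        self.members.remove(member)

    def update(self, key, value):
        # Every current member applies the update, all in the same order.
        for m in self.members:
            m.apply(key, value)


group = ToyGroup("/demo/radar-tracks")   # made-up, file-system-like name
a, b = Member("a"), Member("b")
group.join(a)
group.update("track-1", (10, 20))   # seen by a only
group.join(b)                       # b receives track-1 via state transfer
group.update("track-2", (30, 40))   # seen by a and b
group.fail(a)
group.update("track-1", (11, 21))   # seen by b only
print(b.data, b.applied)
```

In this sketch the joining member b starts from a copy that already reflects the first update, sees each later update exactly once, and continues to receive updates after a drops out. Those are the properties the text attributes to state transfer, shown here without any of the concurrency that makes them hard to guarantee.
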
Not surprisingly, such an environment is much easier to work with than an asynchronous one, where each and every action taken must be viewed against an uncertain background, because it is nearly impossible to guess what other system components are currently seeing or doing.

It should be noted that a conventional RPC environment isn't much better than a completely asynchronous system for solving these sorts of problems. RPC simplifies the communication aspect of a distributed system by hiding many of the details of setting up communication channels, packing data into messages, etc. However, the model does little to reduce the complexity of distributed programming within an application.

The reader may wonder whether the virtual synchrony model simply pushes all the complexity of a distributed problem into Isis. In a sense, it does. On the other hand, Isis itself is simplified by this approach. Moreover, virtual synchrony has been a major tool for us in permitting Isis to employ algorithms that run asynchronously but are easy to reason about and make fault-tolerant. A consequence is that performance seen by the Isis user is excellent -- for example, the join sequence completes within milliseconds and the rate of updates possible on a small group of, say, 4 processes using conventional workstations might exceed 1000/second even over UNIX.

Isis Distributed Systems, Inc.

Isis Distributed Systems, Inc. is dedicated to the development and support of high-technology distributed computing software. Founded by Kenneth Birman in Ithaca, New York in 1988, the company now includes eight full- and part-time employees, participating in both consulting and product development activities.
1991 marks a major shift for the company, with the launch of the Isis Toolkit, a commercial version of the highly popular Isis system, together with stand-alone Isis-based products that include a wide-area news and message-passing system, a network resource manager for batch and interactive load sharing, and a fault-tolerant version of the UNIX Network File System. Over the coming year, the company will introduce additional highly reliable, aggressively priced, distributed software solutions.

The Isis Toolkit
    An environment for distributed and fault-tolerant programming, based on the highly successful Isis system developed at Cornell University.

Isis Resource Manager
    This Isis-based product turns a network of workstations into a software supercomputer, scheduling work onto idle machines and restarting critical services when failures occur.

Isis/News
    Oriented towards banking and brokerage applications, Isis/News permits programs to ``publish'' information, which other programs can subscribe to on a per-subject basis.

Isis Reliable Network File System
    An implementation of SUN's NFS standard that provides transparent fault-tolerance for your critical files. Available mid-1992.

Isis Sensors
    Instruments a distributed application (or external devices) for high-level reactive control. (This software is actually a public-domain Isis application that runs over the Isis Toolkit; a commercially enhanced version is planned for late 1992).

Current users.

Current users of the Isis system include a number of the world's largest banking and brokerage firms, scientific computing groups with a need for software parallelism techniques, several major computer manufacturers with large simulation or graphics applications, and a number of military applications. References are available on request.

For more information.
To obtain more information about the full line of software solutions available from Isis Distributed Systems Inc., including our 1992 product descriptions and price list, just send electronic mail to sla@isis.com, or call Susan Allen in our Ithaca offices at 607-272-6327 (Fax: 607-255-4428 attn: Ken Birman), or write to us at 111 South Cayuga Street, Second Floor, Ithaca, New York 14850 USA. Isis technology may be exported; support and customer training are available.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
For information send mail to info-sunflash@sunvice.East.Sun.COM.
Subscription requests should be sent to sunflash-request@sunvice.East.Sun.COM.
Archives are on solar.nova.edu and paris.cs.miami.edu.

All prices, availability, and other statements relating to Sun or third party products are valid in the U.S. only. Please contact your local Sales Representative for details of pricing and product availability in your region. Descriptions of, or references to, products or publications within SunFlash do not imply an endorsement of that product or publication by Sun Microsystems.

John McLaughlin, SunFlash editor, flash@sunvice.East.Sun.COM. (305) 776-7770.