PC-NOW99 Workshop PC-NOW'99

Abstracts of Papers

Performance Results for a Reliable Low-latency Cluster Communication Protocol

by Sthephen R. Donaldson, Oxford University Computing Laboratory, UK;
Jonathan M.D. Hill, Sychron Ltd., Oxford, UK;
David B. Skillicorn, CISC, Queen's University, Kingston, Canada.


Existing low-latency protocols make unrealistically strong assumptions about reliability. This allows them to achieve impressive performance, but also prevents this performance being exploited by applications, which must then deal with reliability issues in the application code. We present results from a new protocol that provides error recovery, and whose performance is close to that of existing low-latency protocols. We achieve a CPU overhead of 1.5us for packet download and 3.6us for upload. Our results show that (a) executing a protocol in the kernel is not incompatible with high performance, and (b) complete control over the protocol stack enables (1) simple forms of flow control to be adopted, (2) proper bracketing of the unreliable portions of the interconnect thus minimising buffers held up for possible recovery, and (3) the sharing of buffer pools. The result is a protocol which performs well in the context of parallel computation and the loose coupling of processes in the workstations of a cluster.

Coscheduling through Synchronized Scheduling Servers - A Prototype and Experiments

by Holger Karl, Institut fuer Informatik, Humboldt Universitaet zu Berlin, Germany.


Predictable network computing still involves a number of open questions. One such question is providing a controlled amount of CPU time to distributed processes. Mechanisms to control the CPU share given to a single process have been proposed before. Directly applying this work to distributed programs leads to unacceptable performance, since the execution of processes on distributed machines is not coordinated in time. This paper discusses how coscheduling can be achieved with share-controlling scheduling servers. The performance impact of scheduling control is evaluated for BSP-style programs. These experiments show that synchronization mechanisms are indispensable and that coscheduling can be achieved for unmodified programs, but also that a performance overhead has to be paid for the control over CPU share.

High-Performance Knowledge Extraction from Data on PC-based Networks of Workstations

by Cosimo Anglano, DSTA, Università del Piemonte Orientale, Alessandria, Italy;
Attilio Giordana, DSTA, Università del Piemonte Orientale, Alessandria, Italy;
Giuseppe Lo Bello, Data Mining Laboratory, CSELT, Torino, Italy.


The automatic construction of classifiers (programs able to correctly classify data collected from the real world) is one of the major problems in pattern recognition and in a wide area related to Artificial Intelligence, including Data Mining. In this paper we present G-Net, a distributed algorithm able to infer classifiers from pre-collected data, and its implementation on PC-based networks of workstations (PC-NOWs). In order to effectively exploit the computing power provided by PC-NOWs, G-Net incorporates a set of dynamic load distribution techniques that allow it to adapt its behavior to variations in the computing power due to resource contention. Moreover, it is provided with a fault tolerance scheme that enables it to continue its computation even if the majority of the machines become unnavailable during its execution.

Addressing Communication Latency Issues on Clusters for Fine Grained Asynchronous Applications - A case study

by Umesh Kumar V. Rajasekaran, Malolan Chetlur, Girindra D. Sharma, Radharamanan Radhakrishnan, and Philip A. Wilsey
Computer Architecture Design Lab., Dept. of ECECS, University of Cincinnati, Ohio.


With the advent of cheap and powerful hardware for workstations and networks, a new cluster-based architecture for parallel processing applications has been envisioned. However, fine-grained asynchronous applications that communicate frequently are not the ideal candidates for such architectures because of their high latency communication costs. Hence, designers of fine-grained parallel applications on clusters are faced with the problem of reducing the high communication latency in such architectures. Depending on what kind of resources are available, the communication latency can be improved along the following dimensions: (a) reducing network latency by employing a higher performance network hardware (i.e., Fast Ethernet versus Myrinet); (b) reducing communication software overhead by developing more efficient communication libraries (MPICH versus TCPMPL (our TCP/IP based message passing layer) versus MPI-BIP); (c) rewriting/restructuring the application code for less frequent communication; and (d) exploiting application characteristics by deploying communication optimizations that exploit the application's inherent communication characteristics. This paper discusses our experiences with building a communication subsystem on a cluster of workstations for a fine-grained asynchronous application (a Time Warp synchronized discrete-event simulator). Specifically, our efforts in reducing the communication latency along three of the aforementioned dimensions are detailed and discussed. In addition, performance results from an in-depth empirical evaluation of the communication subsystem are reported in the paper.

Low Cost Databases for NOW

by G. Conte +, M. Mazzeo *, A. Poggi +, P. Rossi *, and M. Vignali +.

+ Dipartimento di Ingegneria dell'Informazione University of Parma, Italy.
* ENEA, HPCN Project, Bologna, Italy.


In this paper, we present a system called DONOW (Database on Network of Workstation), that is an example of an implementation and maintenance of a low cost distributed database without performances penalties. The system has a three tier architecture based on a client, a service and a data layer. The client layer allows local and remote users to interact with the system through a Java user friendly interface. The service layer is implemented in C++; it allows the service of user requests, in particular, the management of queries involving more than one DONOW node. The data layer is based on a well-known freeware relational database management system, that is, MySQL for each DONOW node. DONOW is under experimentation and will be used by Partena, an industry producing very advanced machinery for pharmaceutical products packaging, for the management of the information about their machines, that consists of close to 100000 drawings and related documentation.

Implementation and Evaluation of MPI on an SMP Cluster

by Toshiyuki Takahashi +, Francis O'Carrol *, Hiroshi Tezuka +, Atsushi Hori +, Shinji Sumimoto +, Hiroshi Harada +, Yutaka Ishikawa +, and Peter H. Beckman ^

+ Real World Computing Partnership, Japan
* MRI Systems, Inc.
^ Los Alamos National Laboratory, USA.


An MPI library, called MPICH-PM/CLUMP, has been implemented on a cluster of SMPs. MPICH-PM/CLUMP realizes zero copy message passing between nodes while using one copy message passing within a node to achieve high performance communication. To realize one copy message passing on an SMP, a kernel primitive has been designed which enables a process to read data of another process. The get protocol using this primitive was added to MPICH. The MPICH-PM/CLUMP has been run on an SMP cluster consisting of 64 Pentium II dual processors and Myrinet. It achieves 98 MByte/sec between nodes and 100MBytes/sec within a node.
chiola@disi.unige.it, Apr. 9, 1999