Design and implementation of a multidimensional multilink multicomputer hardware and software
Abstract
In this thesis, we propose and implement a multidimensional multilink system
(MMS) architecture that uses the message-passing paradigm between computing
elements (CEs). The merits of this architecture are its simplicity,
regularity, and rich connectivity among CEs. Many existing message-passing
architectures can be emulated effectively on the MMS architecture. A
software environment for this architecture is also designed and implemented.
The MMS architecture is built from multiple fully connected multicast
networks arranged in multiple dimensions. The CEs along each dimension are
connected through a fully connected network, and the number of CEs in such a
network is called the "drop" parameter of MMS. The network also allows a
selective broadcast (multicast), by which a CE can address a chosen subset of
the CEs connected to it through that network.
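The topology described above can be sketched in a few lines. This is only an
illustration under stated assumptions: each CE is addressed by a tuple of
coordinates, one per dimension, and a system with drop d and n dimensions is
assumed to contain d^n CEs (consistent with the 4-CE and 9-CE implementations
mentioned in this abstract, though the abstract does not state the formula).
The function names are hypothetical and are not part of the MMS software.

```python
from itertools import product


def all_ces(drop, dimension):
    """Every CE address: a tuple of `dimension` coordinates in range(drop)."""
    return list(product(range(drop), repeat=dimension))


def network_members(ce, axis, drop):
    """CEs on the fully connected network through `ce` along `axis`.

    Assumption: CEs on the same network differ only in coordinate `axis`.
    """
    return [ce[:axis] + (k,) + ce[axis + 1:] for k in range(drop)]


def neighbours(ce, drop):
    """CEs reachable from `ce` in one hop (exactly one coordinate differs)."""
    return [m
            for axis in range(len(ce))
            for m in network_members(ce, axis, drop)
            if m != ce]
```

Under these assumptions every CE has dimension * (drop - 1) direct neighbours,
which is one way to quantify the "rich connectivity" claimed for the
architecture.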
The thesis also presents two performance measurement tools for this
architecture. The first tool, an analytical model, is based on mathematical
equations and can be used to model very simple and regular problems. It
assumes a load-balanced task partitioning of a given problem. This model
is applied to several problems that were also written and run on the
MMS implementation. The second tool, a simulation model, can be used to
measure the performance of arbitrarily structured
problems. The simulation model is written in the SIMULA programming
language and measures the performance of concurrent programs
written in SIMULA. Both tools report performance parameters
such as speedup and processor utilization.
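For reference, the two reported metrics can be computed as follows. The
definitions used here are the conventional ones (speedup as sequential time
over parallel time, utilization as speedup divided by the number of CEs); the
abstract does not reproduce the thesis's own equations, so the
`balanced_parallel_time` model below, with a single additive communication
term, is an assumption made purely for illustration.

```python
def speedup(t_sequential, t_parallel):
    """Conventional speedup: sequential time divided by parallel time."""
    return t_sequential / t_parallel


def utilization(t_sequential, t_parallel, num_ces):
    """Average fraction of time each CE does useful work (speedup / CEs)."""
    return speedup(t_sequential, t_parallel) / num_ces


def balanced_parallel_time(total_work, num_ces, comm_overhead):
    """Hypothetical load-balanced model: equal work shares plus one
    additive communication cost (an assumption, not the thesis's model)."""
    return total_work / num_ces + comm_overhead
```

For example, 100 time units of work spread over 4 CEs with 5 units of
communication overhead gives a parallel time of 30 units, a speedup of about
3.33, and a utilization of about 0.83.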
The thesis also discusses various routing algorithms for the MMS architecture.
Building on the multicast networks that the architecture provides, algorithms
are given for point-to-point communication, in which a CE sends data to
another CE that may not be directly connected to it, and for broadcast
communication, in which a CE sends data to all CEs in the system.
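One plausible way to realize both communication patterns is dimension-order
routing, sketched below. This is not necessarily the algorithm given in the
thesis; it is a common technique for such topologies, shown under the
assumption that CEs are addressed by coordinate tuples and that a CE can
multicast to all other CEs on one of its networks in a single step. All names
are hypothetical.

```python
def route(src, dst):
    """Point-to-point path that corrects one differing coordinate per hop."""
    path = [src]
    cur = list(src)
    for axis in range(len(src)):
        if cur[axis] != dst[axis]:
            cur[axis] = dst[axis]      # one hop on the network along `axis`
            path.append(tuple(cur))
    return path


def broadcast_steps(src, drop):
    """Broadcast as one multicast round per dimension.

    Returns a list of rounds; each round is a list of (sender, receivers).
    In round `axis`, every CE that already holds the data multicasts to the
    other CEs on its network along `axis`.
    """
    informed = [src]
    steps = []
    for axis in range(len(src)):
        step, newly = [], []
        for ce in informed:
            targets = [ce[:axis] + (k,) + ce[axis + 1:]
                       for k in range(drop) if k != ce[axis]]
            step.append((ce, targets))
            newly.extend(targets)
        steps.append(step)
        informed.extend(newly)         # extend only after the round ends
    return steps
```

With this scheme a point-to-point message needs at most as many hops as there
are dimensions, and a broadcast completes in exactly that many multicast
rounds, with every CE receiving the data once.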
Two variations of this architecture, one with 4 CEs and another with
9 CEs, have been implemented. The design of the communication unit is
independent of the processor used in a CE, and a communication unit can
be used with any CE that provides a standard IBM PC bus. An advantage of
this scheme is that the processor in a CE can be replaced by a
higher-performance processor. Currently the implementations are based on
IBM PC motherboards with Intel's iAPX 88 processor at 5 MHz clock frequency,
but the communication unit has also been tried with 80386-based PC/AT
motherboards. This implementation is very cost-effective. The implementation
of the communication module (CM) can handle interconnection cables
up to 20 feet long. Using this interface, the computers in a laboratory
room can be connected to form a multicomputer configuration. This
structure then behaves as a small network of computers whose machines
can be used either as a set of sequential computers working independently
or as a single parallel multicomputer.
An integrated software environment has been designed and implemented for the
MMS architecture. This environment is fairly general-purpose and can
be interfaced to any existing language compiler, which makes the
implementation user-friendly: existing commercial compilers and commercial
computers can be brought into a parallel computing environment. The software
environment models a program as multiple cooperating tasks, each of which
runs on a different CE in parallel. It provides subroutine calls to query the
configuration of the MMS architecture, so that one can write programs that
are independent of the particular MMS configuration. Further, a smaller
subset of an MMS can be specified through the drop and dimension parameters.
For example, on a
3-drop, 2-dimension system, a 2-drop, 2-dimension system can be emulated
by specifying drop=2 and dimension=2. The environment supports software
calls for creating remote tasks, terminating tasks, and interprocessor
communication, along with support routines that return the topological
parameters to help in programming. Interfaces to the software
environment have been developed for various languages.
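The emulation example above can be illustrated with a small sketch. The
abstract does not say how the environment chooses which physical CEs take
part, so the selection rule below (use the lowest coordinates in each
dimension and fix any unused dimensions at 0) is an assumption, and
`emulated_ces` is a hypothetical name rather than a call from the MMS
software.

```python
from itertools import product


def emulated_ces(phys_drop, phys_dim, drop, dimension):
    """Physical CE addresses used to emulate a smaller MMS configuration.

    Assumed rule: take coordinates 0..drop-1 in the first `dimension`
    dimensions and pin the remaining physical dimensions to 0.
    """
    if drop > phys_drop or dimension > phys_dim:
        raise ValueError("requested configuration exceeds the physical system")
    pad = (0,) * (phys_dim - dimension)
    return [coords + pad for coords in product(range(drop), repeat=dimension)]
```

On a 3-drop, 2-dimension system, `emulated_ces(3, 2, 2, 2)` selects four of
the nine CEs. Because every network is fully connected, any two selected CEs
that differ in a single coordinate are still directly connected, so the
emulated system keeps the connectivity of a native 2-drop, 2-dimension MMS.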
We then discuss the following three benchmark programs, which are coded
in C/C++:
Numerical Integration
Bubble Sort
Matrix Multiplication
We discuss the task partitioning for these problems and the allocation of
tasks to CEs, and show how a program that is independent of the MMS
configuration can be written using the programming environment. We also
present analytical models for these programs and compare the results from
simulation and implementation.
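As an illustration of configuration-independent partitioning, the numerical
integration benchmark can be split into near-equal tasks whose number is a
parameter rather than a constant. The sketch below is written in Python for
brevity and runs the tasks sequentially; it is not the thesis's C/C++ code.
In a real run, `num_tasks` would presumably be derived from the drop and
dimension values returned by the environment, and each partial sum would be
computed on its own CE. The trapezoidal rule and all names are assumptions.

```python
def partition(n_intervals, num_tasks):
    """Split range(n_intervals) into contiguous, near-equal [lo, hi) slices."""
    base, extra = divmod(n_intervals, num_tasks)
    bounds, start = [], 0
    for t in range(num_tasks):
        size = base + (1 if t < extra else 0)   # first `extra` tasks get one more
        bounds.append((start, start + size))
        start += size
    return bounds


def partial_integral(f, a, h, first, last):
    """Trapezoidal sum over sub-intervals first..last-1 (one task's share)."""
    return sum(0.5 * h * (f(a + i * h) + f(a + (i + 1) * h))
               for i in range(first, last))


def integrate(f, a, b, n_intervals, num_tasks):
    """Combine the partial sums, as the coordinating task would."""
    h = (b - a) / n_intervals
    parts = [partial_integral(f, a, h, lo, hi)
             for lo, hi in partition(n_intervals, num_tasks)]
    return sum(parts)
```

The same call works unchanged for 4 tasks or 9 tasks, which mirrors the goal
of writing one program that runs on either implemented configuration.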