Parallel Algorithm Configurations
by D.W. Hyatt
Three Different Algorithm Approaches in PVM
There are a number of possible algorithm configurations that one could use
when designing a PVM program. We will present three such configurations
here: 1) Master-Slave Approach, 2) Relay Approach (or Loop), and 3)
Tree Approach. Take a look at the following example programs and at how the
different algorithms are used.
Master-Slave Approach
The master-slave approach is the most common method for a PVM computation.
It utilizes the power of parallel programming more easily than the other
methods: a computational task is divided into sub-tasks, these smaller
units are worked on independently by the slave tasks, and the results are
sent back to the master when complete.
The following code shows how a master program sends an array of data out
to a number of slave programs, each of which calculates the sum of a
specific row. Each slave returns the sum of an individual row of data,
which the master then records. The data is sent out in what is known as
a "multicast" approach, where one packet is sent, yet all slaves listen
and receive simultaneously.
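Since the full example is not reproduced here, the following is a minimal
sketch of what such a pair of programs might look like. The executable
name "rowsum_slave", the array dimensions, and the message tags are all
illustrative choices, not the original code.

    /* master.c -- sketch of the master side */
    #include <stdio.h>
    #include "pvm3.h"

    #define NROWS      4         /* one slave per row (illustrative size) */
    #define NCOLS      8
    #define TAG_ROW    1         /* tells a slave which row it owns */
    #define TAG_DATA   2         /* carries the multicast array */
    #define TAG_RESULT 3         /* carries one row sum back */

    int main(void)
    {
        int tids[NROWS], data[NROWS][NCOLS];
        int i, j, row, sum;

        for (i = 0; i < NROWS; i++)          /* fill in sample data */
            for (j = 0; j < NCOLS; j++)
                data[i][j] = i + j;

        if (pvm_spawn("rowsum_slave", (char **)0, PvmTaskDefault,
                      "", NROWS, tids) != NROWS) {
            fprintf(stderr, "failed to spawn %d slaves\n", NROWS);
            pvm_exit();
            return 1;
        }

        for (i = 0; i < NROWS; i++) {        /* assign each slave a row */
            pvm_initsend(PvmDataDefault);
            pvm_pkint(&i, 1, 1);
            pvm_send(tids[i], TAG_ROW);
        }

        pvm_initsend(PvmDataDefault);        /* one packet, all slaves */
        pvm_pkint(&data[0][0], NROWS * NCOLS, 1);
        pvm_mcast(tids, NROWS, TAG_DATA);

        for (i = 0; i < NROWS; i++) {        /* collect sums as they finish */
            pvm_recv(-1, TAG_RESULT);
            pvm_upkint(&row, 1, 1);
            pvm_upkint(&sum, 1, 1);
            printf("sum of row %d is %d\n", row, sum);
        }

        pvm_exit();
        return 0;
    }

    /* rowsum_slave.c -- sketch of the matching slave */
    #include "pvm3.h"

    #define NROWS      4
    #define NCOLS      8
    #define TAG_ROW    1
    #define TAG_DATA   2
    #define TAG_RESULT 3

    int main(void)
    {
        int data[NROWS][NCOLS];
        int row, j, sum = 0;
        int master = pvm_parent();

        pvm_recv(master, TAG_ROW);           /* which row is mine? */
        pvm_upkint(&row, 1, 1);

        pvm_recv(master, TAG_DATA);          /* the multicast array */
        pvm_upkint(&data[0][0], NROWS * NCOLS, 1);

        for (j = 0; j < NCOLS; j++)          /* sum my row only */
            sum += data[row][j];

        pvm_initsend(PvmDataDefault);        /* report back to the master */
        pvm_pkint(&row, 1, 1);
        pvm_pkint(&sum, 1, 1);
        pvm_send(master, TAG_RESULT);

        pvm_exit();
        return 0;
    }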
In a multicast situation, it is not inherently clear whether the program
would be more efficient if the master sent each slave only the specific
row of data it is to calculate: doing so requires a number of smaller,
distinct packets sent out in sequence rather than one packet sent to all.
Whether the overhead of the extra pvm_pack( ) and pvm_send( ) calls is
greater than the overhead of the surplus data carried by the multicast is
not clear, and the relationship between the two factors is an interesting
thing to investigate.
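For comparison, the point-to-point version of the distribution loop might
look like the fragment below, reusing the names from the sketch above:
the master performs NROWS separate pack-and-send operations, but each
slave receives only the NCOLS values it actually needs.

    for (i = 0; i < NROWS; i++) {
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&data[i][0], NCOLS, 1);    /* just this slave's row */
        pvm_send(tids[i], TAG_DATA);
    }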
Relay or Pipeline Approach
The rationale for the relay or pipeline approach parallels some common
approaches used in computer architecture for speeding up computations.
When there are a number of discrete steps that must nonetheless be
computed in a certain sequence, an algorithm similar to a
"vector pipeline" might be used. This
is an assembly-line approach where the first phase is done by the
first processor, after which it passes the results along to the next
processor in line. While the second processor is doing its task, the
first processor can begin working on another batch of data. When the
second processor is finished, it can pass the data on to a third, while
the second begins working on the next item from process one, and the
first process can begin on yet another piece of data.
Eventually, the pipeline gets filled, and throughput is significantly
increased because each stage of the assembly line is attacking the task in
parallel. If the amount of work to be done is small, the pipeline approach
is not efficient because of the overhead required to fill the pipeline. If
the different stages have some hardware advantage, however, then the
pipeline might be good for improving throughput.
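As an illustration, one stage of such a relay might be written roughly as
follows. The spawning program is assumed to have already sent each stage
the task id of its successor; the tags, the doubling "work" step, and the
-1 end-of-stream marker are all invented for this sketch.

    /* stage.c -- sketch of one pipeline stage */
    #include "pvm3.h"

    #define TAG_NEXT 10          /* carries the successor's task id */
    #define TAG_ITEM 11          /* carries one item of work */

    int main(void)
    {
        int next, item;

        pvm_recv(pvm_parent(), TAG_NEXT);   /* who is downstream of me? */
        pvm_upkint(&next, 1, 1);

        for (;;) {
            pvm_recv(-1, TAG_ITEM);         /* take the next item off the line */
            pvm_upkint(&item, 1, 1);

            if (item != -1)                 /* -1 marks end of the stream */
                item = item * 2;            /* this stage's share of the work */

            pvm_initsend(PvmDataDefault);   /* pass the result downstream */
            pvm_pkint(&item, 1, 1);
            pvm_send(next, TAG_ITEM);

            if (item == -1)                 /* forward the marker, then quit */
                break;
        }

        pvm_exit();
        return 0;
    }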
Tree Approach
The tree computation approach is more useful for a "divide and conquer"
strategy, where a task is sent out to multiple slave-type processes, which
in turn send off portions to one or more additional slaves. When the
calculations are eventually complete, the data is passed back up the
nodes of the tree and returned to the originator of the task.
One potential problem in this recursive approach is ensuring that there is
a terminating condition; otherwise, processes can continue to be spawned
by sub-processes, resulting in an endless propagation of child tasks that
eventually overwhelms the system resources.
Take a look at the program below that allows the generation of a tree
with a maximum depth of four levels.
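That program is not reproduced here, but a single node of such a tree
might be sketched as follows. The executable name "treenode", the
branching factor, the message tags, and the token "work" of summing task
ids are all illustrative choices.

    /* treenode.c -- sketch of a tree node with a depth limit */
    #include "pvm3.h"

    #define MAXDEPTH   4         /* stop spawning at four levels */
    #define BRANCH     2
    #define TAG_DEPTH  20        /* carries a node's level in the tree */
    #define TAG_RESULT 21        /* carries a subtree result back up */

    int main(void)
    {
        int kids[BRANCH];
        int depth, i, sub, total;
        int parent = pvm_parent();

        pvm_recv(parent, TAG_DEPTH);     /* my level in the tree */
        pvm_upkint(&depth, 1, 1);

        total = pvm_mytid();             /* this node's token contribution */

        if (depth < MAXDEPTH) {          /* the terminating condition */
            pvm_spawn("treenode", (char **)0, PvmTaskDefault,
                      "", BRANCH, kids);

            for (i = 0; i < BRANCH; i++) {   /* hand each child its depth */
                int d = depth + 1;
                pvm_initsend(PvmDataDefault);
                pvm_pkint(&d, 1, 1);
                pvm_send(kids[i], TAG_DEPTH);
            }

            for (i = 0; i < BRANCH; i++) {   /* gather the subtree results */
                pvm_recv(-1, TAG_RESULT);
                pvm_upkint(&sub, 1, 1);
                total += sub;
            }
        }

        pvm_initsend(PvmDataDefault);    /* pass the result back up */
        pvm_pkint(&total, 1, 1);
        pvm_send(parent, TAG_RESULT);

        pvm_exit();
        return 0;
    }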
Multiple Processes on the Linux Cluster
The Linux Cluster uses a different programming approach. Instead of
manually deciding how each independent machine will handle a separate
program, the general approach is that a single program makes multiple
copies of itself, often with different parameters, by "forking"
processes on the main node of the cluster.
The operating system then distributes those processes
to the various nodes of the cluster, and each node completes
the work it was given.
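A minimal sketch of that forking style, with an invented worker count and
a placeholder for the real computation, might look like this:

    /* forkdemo.c -- one driver forks several worker copies of itself */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define NWORKERS 4           /* illustrative number of copies */

    static void do_work(int id)
    {
        /* stand-in for the real computation; each copy gets its own id */
        printf("worker %d running as pid %d\n", id, (int)getpid());
    }

    int main(void)
    {
        int i;

        for (i = 0; i < NWORKERS; i++) {
            pid_t pid = fork();
            if (pid == 0) {              /* child: do one share of the work */
                do_work(i);
                exit(0);
            } else if (pid < 0) {
                perror("fork");
                exit(1);
            }
            /* parent: keep launching the remaining copies */
        }

        for (i = 0; i < NWORKERS; i++)   /* collect all the children */
            wait(NULL);

        return 0;
    }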
The following two pages give examples of programs on the Cluster:
A Simple Cluster Program
Working with The Linux Cluster: POV-Ray Animation