A Gentle Introduction on How to use Cluster Effectively

Cluster is a pretty powerful setup; even though the computers are old, Cluster as a whole is about three times as powerful as any single computer in the lab. However, using this power requires some tweaking of your program. In many cases, it may take fairly little; perhaps only twenty lines in the whole program. However, this requires that there be some parallelism in your program; that is what Cluster thrives on.

Parallelism

Most programs are designed to run straight through, one piece at a time. Indeed, most languages encourage this idea; you assign a variable a value, and perform operations on it. Or you have a loop that performs an operation on an array one element at a time. This code is very serial, not parallel. However, in many cases it's not strictly necessary to go through each element of the array in order; for example, the C code

for(i = 0; i < 100; i++)
   array[i] = array[i] * i;
could be written just as well as
for(i = 0; i < 100; i += 2)
   array[i] = array[i] * i;
for(i = 1; i < 100; i += 2)
   array[i] = array[i] * i;
While this makes less sense in most contexts, it shows that the order in which the array elements are assigned does not matter, because no iteration depends on the result of another. Thus, this snippet of code exhibits parallelism.

Processes

But how to take advantage of this? The key, at least on Cluster, is to use multiple processes. But what is a process?

Any time you run a program in Linux (or any other operating system), a new process is started for it. For example, if you open up an Xterm, you get a new process to handle that. If you open up Mozilla, you get a new process for that. But some programs, such as Galeon, will open up several processes. Why? I don't know why Galeon does this, but on Cluster, you do this to speed up your program. Since Cluster has 16 computers, or nodes, each of up to 16 processes can run on a different computer. And you don't have to worry about that. All you do is start 16 processes, and they're automagically distributed among the nodes.

But how do you have 16 different processes? You don't run the program 16 times; instead, you use the power of

The Fork Function (System Call)

Like a road that branches off at a fork, so your program can branch off at a fork(). What this function does is create a brand new copy of your program, with identical values for variables at the same point in the program (immediately after the fork). So how do you know which is which? After all, if both processes do the exact same work, you're just doing the same thing twice. But there is one key difference - in the parent process, fork returns the Process Identifier, or PID, of the new child. But in the child process, fork returns 0. (If the fork fails, no child is created and fork returns -1 in the parent.) So the simplest way to tell which is which is by testing the return value of fork:

if(fork() == 0)
   printf("Hello!  I'm the child process!\n");
else
   printf("Hello!  I'm the parent process!\n");
However, the PID of the child process can be important for the parent to know, so you probably want to save it:
int childPID;
if((childPID = fork()) == 0)
   printf("Hello!  I'm the child process!\n");
else
   printf("Hello!  I'm the parent process, and I know that the child process's PID is %d\n", childPID);
And for the child, childPID is set to 0, but that doesn't matter since the child will never refer to it.
Note: In the if statement, the expression (childPID = fork()) == 0 is used. While perhaps not the clearest way to say it, it works since the "=" operator returns the value given to it, i.e., the value fork() returned. And it seems to be a fairly standard way of doing this, since it's fairly clear what you mean. In fact, I think it's also common to use the even more shortened form if(childPID = fork()) { parentcode } else { childcode } .

Now it's time for an example.

#include <stdio.h>
#include <unistd.h>

int main() {
   int i, num = 0;
   for(i = 0; i < 4; i++)
      num = 2 * num + (fork() ? 1 : 0);
   printf("%2d ", num);
   return 0;
}
Look at it, and try to figure out what it does. Then you can copy and paste it directly into a file and compile and run it. Are you surprised by what happens? Run it several times, and see what happens.

Interprocess Communication

However, to do much that is useful, you need to have communication between the processes. There are several ways to do this. One method that's simple but rarely used is a file. Why is this rarely used? Because if multiple processes try to use a file at the same time, unpredictable things might happen. For example, if two try to write at the same time, one will probably just overwrite what the other does. And it gets very complex.

However, Linux has a similar feature meant for just this application - they're called pipes. In fact, you may already have used them without realizing it. For example, have you ever used a command such as ls -l | less or cat namelist | grep Evan? Those are the same devices you'll be using in this program. In the previous commands, all you had to do was say that you wanted the output of one program to go directly into the other. However, in programming, pipes (or more specifically, half-duplex pipes) are much more powerful. You can have a single process that has several pipes open to various other processes. Then it can send to and receive from all these processes whenever it wants.

But if you want to have several processes all talking back to one master process, there may be a more useful feature - named pipes. In these, any number of slave processes can write, and the master process gets all the input in the order it was written.

Half-duplex Pipes

Half-duplex pipes (henceforth simply called pipes) are a simple method of sending information from one process to another. The simplest way to create one of these is to take a two-element integer array (i.e., int pipes[2]), and use the pipe function, as follows:

int pipes[2];
pipe(pipes);
And now the pipes array is special - anything written to pipes[1] using the write function can be read out of pipes[0] using - you guessed it - the read function. Both of these take the pipe to read from (write to) as the first argument, followed by a pointer to the data to read to (write from), followed by the amount of data to read (write). For example, the following code will open a pipe, write to it, and then read from it:
int pipes[2];
int data = 100, data2;
pipe(pipes);
write(pipes[1], &data, sizeof(int));
read(pipes[0], &data2, sizeof(int));
printf("After %d was written into the pipe, %d came out!", data, data2);

Fun, eh? But having a program send data to itself probably isn't what you had planned. However, by creating the pipe and then forking, you can have two processes bound together by a common pipe. Since a pipe (at least the half-duplex variety) carries data in only one direction, you must close the input side in one process, and the output side in the other, using the aptly named close function. So, to fork, and then send data from the parent to the child, the following code will work (and compile, if you like):

#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main() {
   int pipes[2];
   int messagelen = 22;
   char data[22];
   pipe(pipes);
   if(fork()) {
// remember - if fork returns a non-zero value, this is the parent
      close(pipes[0]);  // the parent doesn't read - just writes.
      strcpy(data, "Go clean your room!");
      write(pipes[1], data, messagelen * sizeof(char));
      printf("Message Sent\n");
   }
   else {
      close(pipes[1]);  // and the child just reads, with no writing
      read(pipes[0], data, messagelen * sizeof(char));
      printf("Uh oh - I've been told: \"%s\"\n", data);
   }
   return 0;
}

Note that this example also sends a string rather than a number; in fact, any form of data can be sent, as long as the processes at both ends know what is being sent. Also note that, since a "string" in C is actually an array of characters, the second argument given to read and write is in fact a pointer to the string, not the string itself. Always be careful when dealing with pointers!

Note: Most of the information here is taken from http://db.ilug-bom.org.in/Documentation/lpg/node7.html and has been verified by direct testing. If you have any questions or comments, e-mail edanaher@tjhsst.edu