
 GetStartedDIY

A simple example of calling MPI yourself to run code in parallel.

This document explains the example program baseMPI.ox and shows how to run it on a particular cluster. The details of running programs will differ across clusters. This is the DIY version because it uses the low-level routines that interface between Ox and the MPI library. The objects in CFMPI make it possible to avoid some of this programming, or to avoid it entirely.

The program:
Source: examples/CFMPI/baseMPI.ox

is an example of a very simple client/server setup.

Include CFMPI
Note that #include is used, not #import. The reason is that the MPI interface routines have to be compiled on your machine; unlike the rest of niqlow, this Ox code cannot be distributed as compiled .oxo files.
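For example, the top of the file contains something along these lines (the file name and relative path here are illustrative, not copied from baseMPI.ox):
#include "CFMPI.ox"        // compiled from source on this machine, hence #include rather than #import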

Foil automatic (and problematic) processing by OxDoc
Explanation of the next 3 lines
The next three lines could just be the middle line, #include "UseMPI.ox";. The conditional compilation around it is really just a trick to keep the OxDoc program from trying to create documentation for that file. (OX_Parallel is defined for the versions of Ox required by niqlow, so the include always happens when Ox runs the program, but OxDoc does not know about this macro, so it skips the file.)
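The three lines therefore follow this pattern (a sketch of the trick described above, not necessarily a character-for-character copy of baseMPI.ox):
#ifdef OX_Parallel          // OxDoc does not know this macro, so it skips the include ...
#include "UseMPI.ox"        // ... but Ox defines it, so the file is always included when the program runs
#endif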

Declare shared (global) space.
In this code, the server() and client() routines are separate, but in a serial environment they may have to act as if they are passing messages back and forth. The global variables (declared outside curly brackets) are used as shared space for main() and the other routines. In the object-oriented solutions in CFMPI this would not be required: the classes for client and server tasks refer to inherited data members that get set properly on all nodes.
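For instance, the globals might be declared along these lines (the identifiers are illustrative guesses, not the actual names used in baseMPI.ox):
decl I,         // this node's ID (rank), set by MPI_Init()
     Nodes,     // how many nodes are in the MPI environment
     Buffer;    // shared space for messages passed between client() and server()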

main(){…}
First, call MPI_Init() in order to find out which node I am and how many other nodes there are. Next, if I am the client node, start the client task. Otherwise, become a server.
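In outline, main() amounts to the following (the argument list of MPI_Init() and the global names are assumptions; the real declarations are in UseMPI.ox):
main() {
    MPI_Init(&I, &Nodes);       // learn my ID and how many nodes there are
    if (I==0)                   // by convention node 0 is the client ...
        client();
    else                        // ... and every other node is a server
        server();
    }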

client(){…}
Send messages to all servers.
Act like a server to myself (do some work while the other servers are busy)
Wait for all return messages to arrive (in a random order)
Use the results (here just print out the results)
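A minimal sketch of these steps, assuming MPI_Send() takes (message, count, destination, tag), MPI_Recv() fills the shared Buffer, and ANY_SOURCE/ANY_TAG stand for whatever wildcard values the interface provides (all of these are assumptions; see UseMPI.ox for the real declarations):
client() {
    decl n;
    for (n=1; n<Nodes; ++n)                 // send a message to every server
        MPI_Send(Buffer, rows(Buffer), n, 0);
    server();                               // act as a server to myself while they work
    for (n=1; n<Nodes; ++n) {               // wait for all replies, in whatever order they arrive
        MPI_Recv(&Buffer, ANY_SOURCE, ANY_TAG);
        println("received ", Buffer);       // use the results (here, just print them)
        }
    }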

server(){…}
Wait to receive a message from the client
Process the message
Send the results back to the client
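Again a sketch with assumed signatures; the output below suggests the example computes polynomial roots, so that is used as the work step here:
server() {
    decl roots;
    MPI_Recv(&Buffer, 0, 0);                // wait for the client's message (coefficients)
    polyroots(Buffer', &roots);             // process it: compute the roots
    Buffer = vec(roots);                    // put the results in the shared buffer ...
    MPI_Send(Buffer, rows(Buffer), 0, 0);   // ... and send them back to the client
    }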

Finalize the MPI environment
CFMPI uses Ox's OxRunMainExitCall() routine (described in the Developer's manual) to add a call to MPI_Finalize() as Ox exits. This means that the user's code can simply exit and no MPI error will be thrown.
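The effect is roughly the following (illustrative names; a sketch of the idea, not the actual CFMPI source):
finalize()   { MPI_Finalize(); }               // shuts MPI down cleanly
cfmpi_init() { OxRunMainExitCall(finalize); }  // CFMPI registers finalize() during initialization;
                                               // Ox then calls it automatically as it exits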

Run the program without using MPI
oxl baseMPI
When the MPI environment is not requested, fake routines are inserted for MPI_Init(), MPI_Recv() and MPI_Send().
The first sets I=0 and Nodes=1 (a serial environment). The others then act as if a message were passed when really nothing happens; because both sides reference the same global location as the buffer, the "message" is already sitting there.
And since the client calls server() itself after starting the (non-existent) real servers, the whole job still gets done.
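The serial stand-ins amount to something like this (argument lists are assumptions; the real versions are in UseMPI.ox):
MPI_Init(aI, aNodes)            { aI[0] = 0; aNodes[0] = 1; }   // pretend to be node 0 of 1
MPI_Send(msg, count, dest, tag) { }    // no-op: the "message" already sits in the shared buffer
MPI_Recv(amsg, source, tag)     { }    // no-op, for the same reason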

Run the program using MPI
Follow Install and Use to compile CFMPI.c to a shared object file, and ensure that both it and the MPI library are on the LD_LIBRARY_PATH.
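On a typical Linux cluster this step looks roughly like the following (the exact compiler wrapper, flags, and paths are covered in Install and Use and will differ by system):
mpicc -shared -fPIC -o CFMPI.so CFMPI.c
export LD_LIBRARY_PATH=/path/to/niqlow/include:$LD_LIBRARY_PATH
With that in place, the run command is: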
oxl -DMPI baseMPI
For this to work the program (oxl) must be run within the MPI environment. How that is done depends on the cluster, but usually there is a command or script which will execute your job with MPI.
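On clusters where you launch MPI jobs directly, that usually means wrapping the command in the MPI launcher, for example (launcher name and process count are illustrative):
mpirun -np 5 oxl -DMPI baseMPI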

SHARCnet script. Source: examples/sharc_test_MPI

This says: run the command at the end of the line in the job queue; run it for no more than 18 seconds (0.3 minutes); send output to MPITest.out; send the job to the MPI execution queue (not serial or other options); and ask for 5 processors. The job itself is simply to run the program with MPI defined, so that niqlow will try to link in the MPI interface routines.
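In other words, the script contains a submission command along these lines (a reconstruction from the description above and SHARCnet's sqsub syntax; consult examples/sharc_test_MPI for the exact line):
sqsub -r 0.3m -o MPITest.out -q mpi -n 5 oxl -DMPI baseMPI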
Run the script
[ferrallc@saw-login2 examples]$ ./sharc_test_MPI
WARNING: no memory requirement defined; assuming 1GB per process.
WARNING: your timelimit (0.3m) is only 18 seconds; is this right?
submitted as jobid 4169545

Output Produced
Ox Console version 7.00 (Linux_64) (C) J.A. Doornik, 1994-2013
This version may be used for academic research and teaching only

Ox Console version 7.00 (Linux_64) (C) J.A. Doornik, 1994-2013
This version may be used for academic research and teaching only

Ox Console version 7.00 (Linux_64) (C) J.A. Doornik, 1994-2013
This version may be used for academic research and teaching only

Ox Console version 7.00 (Linux_64) (C) J.A. Doornik, 1994-2013
This version may be used for academic research and teaching only

Ox Console version 7.00 (Linux_64) (C) J.A. Doornik, 1994-2013
This version may be used for academic research and teaching only

Server 4 received 0.22489 1.7400 -0.20426 -0.91760 -0.67417 -0.34353 0.22335 -0.14139 -0.18338 0.68035
Node 4 is done
Server 1 received 1.2282 1.5784 -0.39334 0.45016 1.2814 -0.36170 1.0653 -1.9544 -0.10203 -0.21674
Node 1 is done
Server 2 received 2.0016 0.57912 -0.70797 0.59336 -0.58939 1.4674 -0.020230 0.73706 1.4795 -0.26881
Node 2 is done
Server 3 received 0.090558 -0.83328 0.81350 1.1174 0.31499 -0.50031 -1.6268 0.61943 -1.4574 -1.8035
Node 3 is done
Server 0 received 0.40654 0.13833 0.65715 -0.16683 0.47835 0.46105 0.11538 0.038075 1.1944 -1.4600
Server 0 reports roots equal to -0.57083 1.3755 -0.57083 -1.3755 -1.0186 0.49171 -1.0186 -0.49171 0.19962 1.1266 0.19962 -1.1266 0.82948 0.74242 0.82948 -0.74242 0.78031 0.0000
Server 4 reports roots equal to -7.7927 0.0000 -0.87074 0.30347 -0.87074 -0.30347 -0.35572 0.86423 -0.35572 -0.86423 0.91222 0.12612 0.91222 -0.12612 0.34186 0.70678 0.34186 -0.70678
Server 2 reports roots equal to 0.91394 0.46905 0.91394 -0.46905 -1.2016 0.0000 0.27711 0.84810 0.27711 -0.84810 -0.41358 0.90114 -0.41358 -0.90114 -0.80983 0.0000 0.16712 0.0000
Server 3 reports roots equal to 7.8523 0.0000 1.9209 0.0000 1.4604 0.0000 -0.53961 1.0129 -0.53961 -1.0129 0.42852 0.83050 0.42852 -0.83050 -1.0863 0.0000 -0.72346 0.0000
Server 1 reports roots equal to -1.4156 0.42365 -1.4156 -0.42365 0.65216 0.81208 0.65216 -0.81208 0.84066 0.0000 -0.24734 0.86936 -0.24734 -0.86936 -0.052098 0.32521 -0.052098 -0.32521
Node 0 is done
--- SharcNET Job Epilogue ---
job id: 4169663
exit status: 0
elapsed time: 3s / 18s (16 %)
virtual memory: 2.7M / 1.0G (0 %)

Job completed successfully
WARNING: Job only used 16 % of its requested walltime.
WARNING: Job only used 0% of its requested memory.