| 1 |
% $Header$ |
% $Header$ |
| 2 |
|
|
| 3 |
In this chapter we describe the software architecture and |
This chapter focuses on describing the {\bf WRAPPER} environment within which |
| 4 |
implementation strategy for the MITgcm code. The first part of this |
both the core numerics and the pluggable packages operate. The description |
| 5 |
chapter discusses the MITgcm architecture at an abstract level. In the second |
presented here is intended to be a detailed exposition and contains significant |
| 6 |
part of the chapter we described practical details of the MITgcm implementation |
background material, as well as advanced details on working with the WRAPPER. |
| 7 |
and of current tools and operating system features that are employed. |
The tutorial sections of this manual (see sections |
| 8 |
|
\ref{sect:tutorials} and \ref{sect:tutorialIII}) |
| 9 |
|
contain more succinct, step-by-step instructions on running basic numerical |
| 10 |
|
experiments, of various types, both sequentially and in parallel. For many |
| 11 |
|
projects simply starting from an example code and adapting it to suit a |
| 12 |
|
particular situation |
| 13 |
|
will be all that is required. |
| 14 |
|
The first part of this chapter discusses the MITgcm architecture at an |
| 15 |
|
abstract level. In the second part of the chapter we describe practical |
| 16 |
|
details of the MITgcm implementation and of current tools and operating system |
| 17 |
|
features that are employed. |
| 18 |
|
|
| 19 |
\section{Overall architectural goals} |
\section{Overall architectural goals} |
| 20 |
|
|
| 97 |
\resizebox{!}{4.5in}{\includegraphics{part4/fit_in_wrapper.eps}} |
\resizebox{!}{4.5in}{\includegraphics{part4/fit_in_wrapper.eps}} |
| 98 |
\end{center} |
\end{center} |
| 99 |
\caption{ |
\caption{ |
| 100 |
Numerical code is written too fit within a software support |
Numerical code is written to fit within a software support |
| 101 |
infrastructure called WRAPPER. The WRAPPER is portable and |
infrastructure called WRAPPER. The WRAPPER is portable and |
| 102 |
can be specialized for a wide range of specific target hardware and |
can be specialized for a wide range of specific target hardware and |
| 103 |
programming environments, without impacting numerical code that fits |
programming environments, without impacting numerical code that fits |
| 120 |
(UMA) and non-uniform memory access (NUMA) designs. Significant work has also |
(UMA) and non-uniform memory access (NUMA) designs. Significant work has also |
| 121 |
been undertaken on x86 cluster systems, Alpha processor based clustered SMP |
been undertaken on x86 cluster systems, Alpha processor based clustered SMP |
| 122 |
systems, and on cache-coherent NUMA (CC-NUMA) systems from Silicon Graphics. |
systems, and on cache-coherent NUMA (CC-NUMA) systems from Silicon Graphics. |
| 123 |
The MITgcm code, operating within the WRAPPER, is also used routinely used on |
The MITgcm code, operating within the WRAPPER, is also routinely used on |
| 124 |
large scale MPP systems (for example T3E systems and IBM SP systems). In all |
large scale MPP systems (for example T3E systems and IBM SP systems). In all |
| 125 |
cases numerical code, operating within the WRAPPER, performs and scales very |
cases numerical code, operating within the WRAPPER, performs and scales very |
| 126 |
competitively with equivalent numerical code that has been modified to contain |
competitively with equivalent numerical code that has been modified to contain |
| 661 |
computation is performed concurrently over as many processes and threads |
computation is performed concurrently over as many processes and threads |
| 662 |
as there are physical processors available to compute. |
as there are physical processors available to compute. |
| 663 |
|
|
| 664 |
|
An exception to the use of {\em bi} and {\em bj} in loops arises in the |
| 665 |
|
exchange routines used when the exch2 package is used with the cubed |
| 666 |
|
sphere. In this case {\em bj} is generally set to 1 and the loop runs from |
| 667 |
|
1,{\em bi}. Within the loop {\em bi} is used to retrieve the tile number, |
| 668 |
|
which is then used to reference exchange parameters. |
| 669 |
|
|
| 670 |
The amount of computation that can be embedded |
The amount of computation that can be embedded |
| 671 |
in a single loop over {\em bi} and {\em bj} varies for different parts of the |
in a single loop over {\em bi} and {\em bj} varies for different parts of the |
| 672 |
MITgcm algorithm. Figure \ref{fig:bibj_extract} shows a code extract |
MITgcm algorithm. Figure \ref{fig:bibj_extract} shows a code extract |
| 787 |
forty grid points in y. The two sub-domains in each process will be computed |
forty grid points in y. The two sub-domains in each process will be computed |
| 788 |
sequentially if they are given to a single thread within a single process. |
sequentially if they are given to a single thread within a single process. |
| 789 |
Alternatively if the code is invoked with multiple threads per process |
Alternatively if the code is invoked with multiple threads per process |
| 790 |
the two domains in y may be computed on concurrently. |
the two domains in y may be computed concurrently. |
| 791 |
\item |
\item |
| 792 |
\begin{verbatim} |
\begin{verbatim} |
| 793 |
PARAMETER ( |
PARAMETER ( |
| 823 |
WRAPPER is shown in figure \ref{fig:wrapper_startup}. |
WRAPPER is shown in figure \ref{fig:wrapper_startup}. |
| 824 |
|
|
| 825 |
\begin{figure} |
\begin{figure} |
| 826 |
|
{\footnotesize |
| 827 |
\begin{verbatim} |
\begin{verbatim} |
| 828 |
|
|
| 829 |
MAIN |
MAIN |
| 852 |
|
|
| 853 |
|
|
| 854 |
\end{verbatim} |
\end{verbatim} |
| 855 |
|
} |
| 856 |
\caption{Main stages of the WRAPPER startup procedure. |
\caption{Main stages of the WRAPPER startup procedure. |
| 857 |
This process precedes transfer of control to application code, which |
This process precedes transfer of control to application code, which |
| 858 |
occurs through the procedure {\em THE\_MODEL\_MAIN()}. |
occurs through the procedure {\em THE\_MODEL\_MAIN()}. |
| 935 |
File: {\em eesupp/inc/MAIN\_PDIRECTIVES2.h}\\ |
File: {\em eesupp/inc/MAIN\_PDIRECTIVES2.h}\\ |
| 936 |
File: {\em model/src/THE\_MODEL\_MAIN.F}\\ |
File: {\em model/src/THE\_MODEL\_MAIN.F}\\ |
| 937 |
File: {\em eesupp/src/MAIN.F}\\ |
File: {\em eesupp/src/MAIN.F}\\ |
| 938 |
File: {\em tools/genmake}\\ |
File: {\em tools/genmake2}\\ |
| 939 |
File: {\em eedata}\\ |
File: {\em eedata}\\ |
| 940 |
CPP: {\em TARGET\_SUN}\\ |
CPP: {\em TARGET\_SUN}\\ |
| 941 |
CPP: {\em TARGET\_DEC}\\ |
CPP: {\em TARGET\_DEC}\\ |
| 974 |
of controlling and coordinating the start up of a large number |
of controlling and coordinating the start up of a large number |
| 975 |
(hundreds and possibly even thousands) of copies of the same |
(hundreds and possibly even thousands) of copies of the same |
| 976 |
program, MPI is used. The calls to the MPI multi-process startup |
program, MPI is used. The calls to the MPI multi-process startup |
| 977 |
routines must be activated at compile time. This is done |
routines must be activated at compile time. Currently MPI libraries are |
| 978 |
by setting the {\em ALLOW\_USE\_MPI} and {\em ALWAYS\_USE\_MPI} |
invoked by |
| 979 |
flags in the {\em CPP\_EEOPTIONS.h} file.\\ |
specifying the appropriate options file with the |
| 980 |
|
{\tt-of} flag when running the {\em genmake2} |
| 981 |
\fbox{ |
script, which generates the Makefile for compiling and linking MITgcm. |
| 982 |
\begin{minipage}{4.75in} |
(Previously this was done by setting the {\em ALLOW\_USE\_MPI} and |
| 983 |
File: {\em eesupp/inc/CPP\_EEOPTIONS.h}\\ |
{\em ALWAYS\_USE\_MPI} flags in the {\em CPP\_EEOPTIONS.h} file.) More |
| 984 |
CPP: {\em ALLOW\_USE\_MPI}\\ |
detailed information about the use of {\em genmake2} for specifying |
| 985 |
CPP: {\em ALWAYS\_USE\_MPI}\\ |
local compiler flags is located in section \ref{sect:genmake}.\\ |
|
Parameter: {\em nPx}\\ |
|
|
Parameter: {\em nPy} |
|
|
\end{minipage} |
|
|
} \\ |
|
| 986 |
|
|
|
Additionally, compile time options are required to link in the |
|
|
MPI libraries and header files. Examples of these options |
|
|
can be found in the {\em genmake} script that creates makefiles |
|
|
for compilation. When this script is executed with the {bf -mpi} |
|
|
flag it will generate a makefile that includes |
|
|
paths for search for MPI head files and for linking in |
|
|
MPI libraries. For example the {\bf -mpi} flag on a |
|
|
Silicon Graphics IRIX system causes a |
|
|
Makefile with the compilation command |
|
|
Graphics IRIX system \begin{verbatim} |
|
|
mpif77 -I/usr/local/mpi/include -DALLOW_USE_MPI -DALWAYS_USE_MPI |
|
|
\end{verbatim} |
|
|
to be generated. |
|
|
This is the correct set of options for using the MPICH open-source |
|
|
version of MPI, when it has been installed under the subdirectory |
|
|
/usr/local/mpi. |
|
|
However, on many systems there may be several |
|
|
versions of MPI installed. For example many systems have both |
|
|
the open source MPICH set of libraries and a vendor specific native form |
|
|
of the MPI libraries. The correct setup to use will depend on the |
|
|
local configuration of your system.\\ |
|
| 987 |
|
|
| 988 |
\fbox{ |
\fbox{ |
| 989 |
\begin{minipage}{4.75in} |
\begin{minipage}{4.75in} |
| 990 |
File: {\em tools/genmake} |
Directory: {\em tools/build\_options}\\ |
| 991 |
|
File: {\em tools/genmake2} |
| 992 |
\end{minipage} |
\end{minipage} |
| 993 |
} \\ |
} \\ |
| 994 |
\paragraph{\bf Execution} The mechanics of starting a program in |
\paragraph{\bf Execution} The mechanics of starting a program in |
| 1006 |
in the file {\em SIZE.h}. The parameter {\em mf} specifies that a text file |
in the file {\em SIZE.h}. The parameter {\em mf} specifies that a text file |
| 1007 |
called ``mf'' will be read to get a list of processor names on |
called ``mf'' will be read to get a list of processor names on |
| 1008 |
which the sixty-four processes will execute. The syntax of this file |
which the sixty-four processes will execute. The syntax of this file |
| 1009 |
is specified by the MPI distribution |
is specified by the MPI distribution. |
| 1010 |
\\ |
\\ |
| 1011 |
|
|
| 1012 |
\fbox{ |
\fbox{ |
| 1057 |
Allocation of processes to tiles is controlled by the routine |
Allocation of processes to tiles is controlled by the routine |
| 1058 |
{\em INI\_PROCS()}. For each process this routine sets |
{\em INI\_PROCS()}. For each process this routine sets |
| 1059 |
the variables {\em myXGlobalLo} and {\em myYGlobalLo}. |
the variables {\em myXGlobalLo} and {\em myYGlobalLo}. |
| 1060 |
These variables specify (in index space) the coordinate |
These variables specify in index space the coordinates |
| 1061 |
of the southern most and western most corner of the |
of the southernmost and westernmost corner of the |
| 1062 |
southern most and western most tile owned by this process. |
southernmost and westernmost tile owned by this process. |
| 1063 |
The variables {\em pidW}, {\em pidE}, {\em pidS} and {\em pidN} |
The variables {\em pidW}, {\em pidE}, {\em pidS} and {\em pidN} |
| 1064 |
are also set in this routine. These are used to identify |
are also set in this routine. These are used to identify |
| 1065 |
processes holding tiles to the west, east, south and north |
processes holding tiles to the west, east, south and north |
| 1066 |
of this process. These values are stored in global storage |
of this process. These values are stored in global storage |
| 1067 |
in the header file {\em EESUPPORT.h} for use by |
in the header file {\em EESUPPORT.h} for use by |
| 1068 |
communication routines. |
communication routines. The above does not hold when the |
| 1069 |
|
exch2 package is used -- exch2 sets its own parameters to |
| 1070 |
|
specify the global indices of tiles and their relationships |
| 1071 |
|
to each other. See the documentation on the exch2 package |
| 1072 |
|
(\ref{sec:exch2}) for |
| 1073 |
|
details. |
| 1074 |
\\ |
\\ |
| 1075 |
|
|
| 1076 |
\fbox{ |
\fbox{ |
| 1096 |
describes the information that is held and used. |
describes the information that is held and used. |
| 1097 |
|
|
| 1098 |
\begin{enumerate} |
\begin{enumerate} |
| 1099 |
\item {\bf Tile-tile connectivity information} For each tile the WRAPPER |
\item {\bf Tile-tile connectivity information} |
| 1100 |
sets a flag that sets the tile number to the north, south, east and |
For each tile the WRAPPER |
| 1101 |
|
sets a flag that sets the tile number to the north, |
| 1102 |
|
south, east and |
| 1103 |
west of that tile. This number is unique over all tiles in a |
west of that tile. This number is unique over all tiles in a |
| 1104 |
configuration. The number is held in the variables {\em tileNo} |
configuration. Except when using the cubed sphere and the exch2 package, |
| 1105 |
|
the number is held in the variables {\em tileNo} |
| 1106 |
(this holds the tile's own number), {\em tileNoN}, {\em tileNoS}, |
(this holds the tile's own number), {\em tileNoN}, {\em tileNoS}, |
| 1107 |
{\em tileNoE} and {\em tileNoW}. A parameter is also stored with each tile |
{\em tileNoE} and {\em tileNoW}. A parameter is also stored with each tile |
| 1108 |
that specifies the type of communication that is used between tiles. |
that specifies the type of communication that is used between tiles. |
| 1125 |
(see figure \ref{fig:communication_primitives}). The routine |
(see figure \ref{fig:communication_primitives}). The routine |
| 1126 |
{\em ini\_communication\_patterns()} is responsible for setting the |
{\em ini\_communication\_patterns()} is responsible for setting the |
| 1127 |
communication mode values for each tile. |
communication mode values for each tile. |
| 1128 |
\\ |
|
| 1129 |
|
When using the cubed sphere configuration with the exch2 package, the |
| 1130 |
|
relationships between tiles and their communication methods are set |
| 1131 |
|
by the package in other variables. See the exch2 package documentation |
| 1132 |
|
(\ref{sec:exch2}) for details. |
| 1133 |
|
|
| 1134 |
|
|
| 1135 |
|
|
| 1136 |
\fbox{ |
\fbox{ |
| 1137 |
\begin{minipage}{4.75in} |
\begin{minipage}{4.75in} |
| 1422 |
|
|
| 1423 |
WRAPPER layer. |
WRAPPER layer. |
| 1424 |
|
|
| 1425 |
|
{\footnotesize |
| 1426 |
\begin{verbatim} |
\begin{verbatim} |
| 1427 |
|
|
| 1428 |
MAIN |
MAIN |
| 1450 |
|--THE_MODEL_MAIN :: Numerical code top-level driver routine |
|--THE_MODEL_MAIN :: Numerical code top-level driver routine |
| 1451 |
|
|
| 1452 |
\end{verbatim} |
\end{verbatim} |
| 1453 |
|
} |
| 1454 |
|
|
| 1455 |
Core equations plus packages. |
Core equations plus packages. |
| 1456 |
|
|
| 1457 |
|
{\footnotesize |
| 1458 |
\begin{verbatim} |
\begin{verbatim} |
| 1459 |
C |
C |
| 1460 |
C |
C |
| 1793 |
C :: events. |
C :: events. |
| 1794 |
C |
C |
| 1795 |
\end{verbatim} |
\end{verbatim} |
| 1796 |
|
} |
| 1797 |
|
|
| 1798 |
\subsection{Measuring and Characterizing Performance} |
\subsection{Measuring and Characterizing Performance} |
| 1799 |
|
|