--- manual/s_software/text/sarch.tex	2002/02/28 19:32:20	1.7
+++ manual/s_software/text/sarch.tex	2004/03/23 15:29:40	1.17
@@ -1,11 +1,11 @@
-% $Header: /home/ubuntu/mnt/e9_copy/manual/s_software/text/sarch.tex,v 1.7 2002/02/28 19:32:20 cnh Exp $
+% $Header: /home/ubuntu/mnt/e9_copy/manual/s_software/text/sarch.tex,v 1.17 2004/03/23 15:29:40 afe Exp $
 
 This chapter focuses on describing the {\bf WRAPPER} environment within which
 both the core numerics and the pluggable packages operate. The description
 presented here is intended to be a detailed exposition and contains significant
 background material, as well as advanced details on working with the WRAPPER. 
-The tutorial sections of this manual (see Chapters
-\ref{chap:tutorialI}, \ref{chap:tutorialII} and \ref{chap:tutorialIII}) 
+The tutorial sections of this manual (see sections
+\ref{sect:tutorials}  and \ref{sect:tutorialIII}) 
 contain more succinct, step-by-step instructions on running basic numerical
-experiments, of varous types, both sequentially and in parallel. For many 
+experiments, of various types, both sequentially and in parallel. For many 
 projects simply starting from an example code and adapting it to suit a 
@@ -76,6 +76,9 @@
 \end{figure}
 
 \section{WRAPPER}
+\begin{rawhtml}
+<!-- CMIREDIR:wrapper -->
+\end{rawhtml}
 
 A significant element of the software architecture utilized in
 MITgcm is a software superstructure and substructure collectively
@@ -97,7 +100,7 @@
 \resizebox{!}{4.5in}{\includegraphics{part4/fit_in_wrapper.eps}}
 \end{center}
 \caption{
-Numerical code is written too fit within a software support
+Numerical code is written to fit within a software support
 infrastructure called WRAPPER. The WRAPPER is portable and
 can be specialized for a wide range of specific target hardware and 
 programming environments, without impacting numerical code that fits
@@ -120,7 +123,7 @@
 (UMA) and non-uniform memory access (NUMA) designs. Significant work has also 
 been undertaken on x86 cluster systems, Alpha processor based clustered SMP 
 systems, and on cache-coherent NUMA (CC-NUMA) systems from Silicon Graphics. 
-The MITgcm code, operating within the WRAPPER, is also used routinely used on 
+The MITgcm code, operating within the WRAPPER, is also routinely used on 
 large scale MPP systems (for example T3E systems and IBM SP systems). In all 
 cases numerical code, operating within the WRAPPER, performs and scales very 
 competitively with equivalent numerical code that has been modified to contain 
@@ -150,6 +153,9 @@
 computer architecture currently available to the scientific computing community.
 
 \subsection{Machine model parallelism}
+\begin{rawhtml}
+<!-- CMIREDIR:domain_decomp -->
+\end{rawhtml}
 
  Codes operating under the WRAPPER target an abstract machine that is assumed to
 consist of one or more logical processors that can compute concurrently.  
@@ -661,6 +667,12 @@
 computation is performed concurrently over as many processes and threads
 as there are physical processors available to compute.
 
+An exception to the use of {\em bi} and {\em bj} in loops arises in the
+exchange routines that apply when the exch2 package is used with the cubed 
+sphere.  In this case {\em bj} is generally set to 1 and the loop runs from 
+1,{\em bi}.  Within the loop {\em bi} is used to retrieve the tile number,
+which is then used to reference exchange parameters.
+
 The amount of computation that can be embedded
-a single loop over {\em bi} and {\em bj} varies for different parts of the
+in a single loop over {\em bi} and {\em bj} varies for different parts of the
 MITgcm algorithm. Figure \ref{fig:bibj_extract} shows a code extract
@@ -781,7 +793,7 @@
 forty grid points in y. The two sub-domains in each process will be computed 
 sequentially if they are given to a single thread within a single process.
 Alternatively if the code is invoked with multiple threads per process
-the two domains in y may be computed on concurrently.
+the two domains in y may be computed concurrently.
 \item
 \begin{verbatim}
       PARAMETER (
@@ -817,6 +829,7 @@
 WRAPPER is shown in figure \ref{fig:wrapper_startup}.
 
 \begin{figure}
+{\footnotesize
 \begin{verbatim}
 
        MAIN  
@@ -845,6 +858,7 @@
 
 
 \end{verbatim}
+}
 \caption{Main stages of the WRAPPER startup procedure.
-This process proceeds transfer of control to application code, which
+This process precedes the transfer of control to application code, which
 occurs through the procedure {\em THE\_MODEL\_MAIN()}.
@@ -927,7 +941,7 @@
 File: {\em eesupp/inc/MAIN\_PDIRECTIVES2.h}\\
 File: {\em model/src/THE\_MODEL\_MAIN.F}\\
 File: {\em eesupp/src/MAIN.F}\\
-File: {\em tools/genmake}\\
+File: {\em tools/genmake2}\\
 File: {\em eedata}\\
 CPP:  {\em TARGET\_SUN}\\
 CPP:  {\em TARGET\_DEC}\\
@@ -966,45 +980,21 @@
 of controlling and coordinating the start up of a large number
 (hundreds and possibly even thousands) of copies of the same 
 program, MPI is used. The calls to the MPI multi-process startup
-routines must be activated at compile time. This is done
-by setting the {\em ALLOW\_USE\_MPI} and {\em ALWAYS\_USE\_MPI}
-flags in the {\em CPP\_EEOPTIONS.h} file.\\
-
-\fbox{ 
-\begin{minipage}{4.75in}
-File: {\em eesupp/inc/CPP\_EEOPTIONS.h}\\
-CPP:  {\em ALLOW\_USE\_MPI}\\
-CPP:  {\em ALWAYS\_USE\_MPI}\\
-Parameter:  {\em nPx}\\
-Parameter:  {\em nPy}
-\end{minipage}
-} \\
+routines must be activated at compile time.  Currently MPI libraries are 
+invoked by
+specifying the appropriate options file with the 
+{\tt-of} flag when running the {\em genmake2} 
+script, which generates the Makefile for compiling and linking MITgcm.
+(Previously this was done by setting the {\em ALLOW\_USE\_MPI} and 
+{\em ALWAYS\_USE\_MPI} flags in the {\em CPP\_EEOPTIONS.h} file.)  More
+detailed information about the use of {\em genmake2} for specifying 
+local compiler flags is located in section \ref{sect:genmake}.\\  
 
-Additionally, compile time options are required to link in the 
-MPI libraries and header files. Examples of these options 
-can be found in the {\em genmake} script that creates makefiles
-for compilation. When this script is executed with the {bf -mpi}
-flag it will generate a makefile that includes
-paths for search for MPI head files and for linking in 
-MPI libraries. For example the {\bf -mpi} flag on a
- Silicon Graphics IRIX system causes a
-Makefile with the compilation command
-Graphics IRIX system \begin{verbatim}
-mpif77 -I/usr/local/mpi/include -DALLOW_USE_MPI -DALWAYS_USE_MPI
-\end{verbatim}
-to be generated.
-This is the correct set of options for using the MPICH open-source
-version of MPI, when it has been installed under the subdirectory
-/usr/local/mpi.
-However, on many systems there may be several
-versions of MPI installed. For example many systems have both
-the open source MPICH set of libraries and a vendor specific native form
-of the MPI libraries. The correct setup to use will depend on the
-local configuration of your system.\\
 
 \fbox{ 
 \begin{minipage}{4.75in}
-File: {\em tools/genmake}
+Directory: {\em tools/build\_options}\\
+File: {\em tools/genmake2}
 \end{minipage}
 } \\
 \paragraph{\bf Execution} The mechanics of starting a program in 
@@ -1022,7 +1012,7 @@
 in the file {\em SIZE.h}. The parameter {\em mf} specifies that a text file
 called ``mf'' will be read to get a list of processor names on
 which the sixty-four processes will execute. The syntax of this file
-is specified by the MPI distribution
+is specified by the MPI distribution.
 \\ 
 
 \fbox{ 
@@ -1073,15 +1063,20 @@
-Allocation of processes to tiles in controlled by the routine
+Allocation of processes to tiles is controlled by the routine
 {\em INI\_PROCS()}. For each process this routine sets
 the variables {\em myXGlobalLo} and {\em myYGlobalLo}.
-These variables specify (in index space) the coordinate
-of the southern most and western most corner of the 
-southern most and western most tile owned by this process.
+These variables specify in index space the coordinates
+of the southernmost and westernmost corner of the 
+southernmost and westernmost tile owned by this process.
 The variables {\em pidW}, {\em pidE}, {\em pidS} and {\em pidN}
 are also set in this routine. These are used to identify
 processes holding tiles to the west, east, south and north 
 of this process. These values are stored in global storage
 in the header file {\em EESUPPORT.h} for use by
-communication routines.
+communication routines.  The above does not hold when the 
+exch2 package is used -- exch2 sets its own parameters to 
+specify the global indices of tiles and their relationships
+to each other.  See the documentation on the exch2 package
+(\ref{sec:exch2})  for
+details.
 \\
 
 \fbox{ 
@@ -1107,10 +1102,13 @@
 describes the information that is held and used.
 
 \begin{enumerate}
-\item {\bf Tile-tile connectivity information} For each tile the WRAPPER
-sets a flag that sets the tile number to the north, south, east and
+\item {\bf Tile-tile connectivity information} 
+For each tile the WRAPPER
+records the number of the neighboring tile to the north, 
+south, east and
 west of that tile. This number is unique over all tiles in a 
-configuration. The number is held in the variables {\em tileNo}
+configuration. Except when using the cubed sphere and the exch2 package,
+the number is held in the variables {\em tileNo}
-( this holds the tiles own number), {\em tileNoN}, {\em tileNoS},
+(this holds the tile's own number), {\em tileNoN}, {\em tileNoS},
 {\em tileNoE} and {\em tileNoW}. A parameter is also stored with each tile
 that specifies the type of communication that is used between tiles.
@@ -1133,7 +1131,13 @@
 (see figure \ref{fig:communication_primitives}). The routine 
 {\em ini\_communication\_patterns()} is responsible for setting the
 communication mode values for each tile.
-\\
+
+When using the cubed sphere configuration with the exch2 package, the 
+relationships between tiles and their communication methods are set 
+by the package in other variables.  See the exch2 package documentation 
+(\ref{sec:exch2}) for details.
+
+
 
 \fbox{ 
 \begin{minipage}{4.75in}
@@ -1269,7 +1273,9 @@
 the cube-sphere grid. In this class of grid a rotation may be required
 between tiles. Aligning the coordinate requiring rotation with the
 tile decomposition, allows the coordinate transformation to 
-be embedded within a custom form of the \_EXCH primitive.
+be embedded within a custom form of the \_EXCH primitive.  In these
+cases \_EXCH is mapped to exch2 routines, as detailed in the exch2
+package documentation (\ref{sec:exch2}).
 
 \item {\bf Reverse Mode}
 The communication primitives \_EXCH and \_GSUM both employ 
@@ -1286,6 +1292,7 @@
 is set to the value {\em REVERSE\_SIMULATION}. This signifies 
-ti the low-level routines that the adjoint forms of the
+to the low-level routines that the adjoint forms of the
 appropriate communication operation should be performed.
+
 \item {\bf MAX\_NO\_THREADS}
 The variable {\em MAX\_NO\_THREADS} is used to indicate the
 maximum number of OS threads that a code will use. This
@@ -1390,7 +1397,7 @@
 This is done to allow a large number of variations on the exchange 
 process to be maintained. One set of variations supports the
 cube sphere grid. Support for a cube sphere grid in MITgcm is based
-on having each face of the cube as a separate tile (or tiles).
+on having each face of the cube as a separate tile or tiles.
 The exchange routines are then able to absorb much of the
 detailed rotation and reorientation required when moving around the
 cube grid. The set of {\em \_EXCH} routines that contain the
@@ -1424,6 +1431,7 @@
 
 WRAPPER layer.
 
+{\footnotesize
 \begin{verbatim}
 
        MAIN  
@@ -1451,9 +1459,11 @@
        |--THE_MODEL_MAIN   :: Numerical code top-level driver routine
 
 \end{verbatim}
+}
 
 Core equations plus packages.
 
+{\footnotesize
 \begin{verbatim}
 C
 C
@@ -1792,6 +1802,7 @@
 C                     :: events.
 C 
 \end{verbatim}
+}
 
 \subsection{Measuring and Characterizing Performance}