--- manual/s_phys_pkgs/text/exch2.tex	2004/01/29 17:55:35	1.4
+++ manual/s_phys_pkgs/text/exch2.tex	2004/03/17 19:49:22	1.13
@@ -1,4 +1,4 @@
-% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/text/exch2.tex,v 1.4 2004/01/29 17:55:35 afe Exp $
+% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/text/exch2.tex,v 1.13 2004/03/17 19:49:22 afe Exp $
 % $Name:  $
 
 %%  * Introduction
@@ -10,98 +10,410 @@
 %%    o automatically inserted at \section{Reference} 
 
 
-\section{exch2: Extended Cubed Sphere Exchange}
+\section{exch2: Extended Cubed Sphere \mbox{Topology}}
 \label{sec:exch2}
 
 
 \subsection{Introduction}
 
-The exch2 package is an extension to the original cubed sphere exchanges
-to allow more flexible domain decomposition and parallelization.  Cube faces
-(subdomains) may be divided into whatever number of tiles that divide evenly
-into the grid point dimensions of the subdomain.  Furthermore, the individual
-tiles may be run on separate processors in different combinations,
-and whether exchanges between particular tiles occur between different
-processors is determined at runtime.
-
-The exchange parameters are declared in {\em W2\_EXCH2\_TOPOLOGY.h} and 
-assigned in {\em w2\_e2setup.F}, both in the 
-{\em pkg/exch2} directory.  The validity of the cube topology depends
-on the {\em SIZE.h} file as detailed below.  Both files are generated by 
-Matlab scripts and
-should not be edited.  The default files provided in the release set up
-a cube sphere arrangement of six tiles, one per subdomain, each with 32x32 grid
-points, running on a single processor.  
+The \texttt{exch2} package extends the original cubed
+sphere topology configuration to allow more flexible domain
+decomposition and parallelization.  Cube faces (also called
+subdomains) may be divided into any number of tiles that divide evenly
+into the grid point dimensions of the subdomain.  Furthermore, the
+individual tiles can run on separate processors in different
+combinations, and whether exchanges between particular tiles occur
+between different processors is determined at runtime.  This
+flexibility provides for manual compile-time load balancing across a
+relatively arbitrary number of processors. \\
+
+The exchange parameters are declared in
+\filelink{pkg/exch2/W2\_EXCH2\_TOPOLOGY.h}{pkg-exch2-W2_EXCH2_TOPOLOGY.h}
+and assigned in
+\filelink{pkg/exch2/w2\_e2setup.F}{pkg-exch2-w2_e2setup.F}. The
+validity of the cube topology depends on the \file{SIZE.h} file as
+detailed below.  The default files provided in the release configure a
+cubed sphere topology of six tiles, one per subdomain, each with
+32$\times$32 grid points, all running on a single processor.  Both
+files are generated by Matlab scripts in
+\file{utils/exch2/matlab-topology-generator}; see Section
+\ref{sec:topogen} \sectiontitle{Generating Topology Files for exch2}
+for details on creating alternate topologies.  Pregenerated examples
+of these files with alternate topologies are provided under
+\file{utils/exch2/code-mods} along with the appropriate \file{SIZE.h}
+file for single-processor execution.
+
+\subsection{Invoking exch2}
+
+To use exch2 with the cubed sphere, the following conditions must be
+met: \\
+
+$\bullet$ The exch2 package is included when \file{genmake2} is run.
+  The easiest way to do this is to add the line \code{exch2} to the
+  \file{profile.conf} file -- see Section
+  \ref{sect:buildingCode} \sectiontitle{Building the code} for general
+  details. \\
+
+$\bullet$ An example of \file{W2\_EXCH2\_TOPOLOGY.h} and
+  \file{w2\_e2setup.F} must reside in a directory containing code
+  linked when \file{genmake2} runs.  The safest place to put these
+  is the directory indicated in the \code{-mods=DIR} command line
+  modifier (typically \file{../code}), or the build directory.  The
+  default versions of these files reside in \file{pkg/exch2} and are
+  linked automatically if no other versions exist elsewhere in the
+  link path, but they should be left untouched to avoid breaking
+  configurations other than the one you intend to modify.\\
+
+$\bullet$ Files containing grid parameters, named
+  \file{tile00$n$.mitgrid} where $n$=\code{(1:6)} (one per subdomain),
+  must be in the working directory when the MITgcm executable is run.
+  These files are provided in the example experiments for cubed sphere
+  configurations with 32$\times$32 cube sides and are non-trivial to
+  generate -- please contact MITgcm support if you want to generate
+  files for other configurations. \\
+
+$\bullet$ As always when compiling MITgcm, the file \file{SIZE.h} must
+  be placed where \file{genmake2} will find it.  In particular for
+  exch2, the domain decomposition specified in \file{SIZE.h} must
+  correspond with the particular configuration's topology specified in
+  \file{W2\_EXCH2\_TOPOLOGY.h} and \file{w2\_e2setup.F}.  Domain
+  decomposition issues particular to exch2 are addressed in Section
+  \ref{sec:topogen} \sectiontitle{Generating Topology Files for exch2}
+  and \ref{sec:exch2mpi} \sectiontitle{exch2, SIZE.h, and MPI}; a more
+  general background on the subject relevant to MITgcm is presented in
+  Section \ref{sect:specifying_a_decomposition}
+  \sectiontitle{Specifying a decomposition}.\\
+
+As of the time of writing the following examples use exch2 and may be
+used for guidance:
+
+\begin{verbatim}
+verification/adjust_nlfs.cs-32x32x1
+verification/adjustment.cs-32x32x1 
+verification/aim.5l_cs
+verification/global_ocean.cs32x15
+verification/hs94.cs-32x32x5
+\end{verbatim}
+
+
+
+
+\subsection{Generating Topology Files for exch2}
+\label{sec:topogen}
+
+Alternate cubed sphere topologies may be created using the Matlab
+scripts in \file{utils/exch2/matlab-topology-generator}. Running the
+m-file
+\filelink{driver.m}{utils-exch2-matlab-topology-generator_driver.m}
+from the Matlab prompt (there are no parameters to pass) generates
+exch2 topology files \file{W2\_EXCH2\_TOPOLOGY.h} and
+\file{w2\_e2setup.F} in the working directory and displays a figure of
+the topology via Matlab.  The other m-files in the directory are
+subroutines of \file{driver.m} and should not be run ``bare'' except
+for development purposes. \\
+
+The parameters that determine the dimensions and topology of the
+generated configuration are \code{nr}, \code{nb}, \code{ng},
+\code{tnx} and \code{tny}, and all are assigned early in the script. \\
+
+The first three determine the size of the subdomains and
+hence the size of the overall domain.  Each one determines the number
+of grid points, and therefore the resolution, along the subdomain
+sides in a ``great circle'' around an axis of the cube.  At the time
+of this writing MITgcm requires these three parameters to be equal,
+but they provide for future releases  to accomodate different
+resolutions around the axes to allow (for example) greater resolution
+around the equator.\\
+
+The parameters \code{tnx} and \code{tny} determine the dimensions of
+the tiles into which the subdomains are decomposed, and must evenly
+divide the integer assigned to \code{nr}, \code{nb} and \code{ng}.
+The result is a rectangular tiling of the subdomain.  Figure
+\ref{fig:24tile} shows one possible topology for a twentyfour-tile
+cube, and figure \ref{fig:12tile} shows one for twelve tiles. \\
+
+\begin{figure}
+\begin{center}
+ \resizebox{4in}{!}{
+  \includegraphics{part6/s24t_16x16.ps}
+ }
+\end{center} 
+
+\caption{Plot of a cubed sphere topology with a 32$\times$192 domain
+divided into six 32$\times$32 subdomains, each of which is divided into four tiles 
+(\code{tnx=16, tny=16}) for a total of twentyfour tiles.
+} \label{fig:24tile}
+\end{figure}
+
+\begin{figure}
+\begin{center}
+ \resizebox{4in}{!}{
+  \includegraphics{part6/s12t_16x32.ps}
+ }
+\end{center} 
+\caption{Plot of a cubed sphere topology with a 32$\times$192 domain
+divided into six 32$\times$32 subdomains of two tiles each
+ (\code{tnx=16, tny=32}).
+} \label{fig:12tile}
+\end{figure}
+
+\begin{figure}
+\begin{center}
+ \resizebox{4in}{!}{
+  \includegraphics{part6/s6t_32x32.ps}
+ }
+\end{center} 
+\caption{Plot of a cubed sphere topology with a 32$\times$192 domain
+divided into six 32$\times$32 subdomains with one tile each
+(\code{tnx=32, tny=32}).  This is the default configuration.
+  }
+\label{fig:6tile}
+\end{figure}
+
+
+Tiles can be selected from the topology to be omitted from being
+allocated memory and processors.  This tuning is useful in ocean
+modeling for omitting tiles that fall entirely on land.  The tiles
+omitted are specified in the file
+\filelink{blanklist.txt}{utils-exch2-matlab-topology-generator_blanklist.txt}
+by their tile number in the topology, separated by a newline. \\
+
+
+
+
+\subsection{exch2, SIZE.h, and multiprocessing}
+\label{sec:exch2mpi}
+
+Once the topology configuration files are created, the Fortran
+\code{PARAMETER}s in \file{SIZE.h} must be configured to match.
+Section \ref{sect:specifying_a_decomposition} \sectiontitle{Specifying
+a decomposition} provides a general description of domain
+decomposition within MITgcm and its relation to \file{SIZE.h}. The
+current section specifies certain constraints the exch2 package
+imposes as well as describes how to enable parallel execution with
+MPI. \\
+
+As in the general case, the parameters \varlink{sNx}{sNx} and
+\varlink{sNy}{sNy} define the size of the individual tiles, and so
+must be assigned the same respective values as \code{tnx} and
+\code{tny} in \file{driver.m}.\\
+
+The halo width parameters \varlink{OLx}{OLx} and \varlink{OLy}{OLy}
+have no special bearing on exch2 and may be assigned as in the general
+case. The same holds for \varlink{Nr}{Nr}, the number of vertical 
+levels in the model.\\
+
+The parameters \varlink{nSx}{nSx}, \varlink{nSy}{nSy},
+\varlink{nPx}{nPx}, and \varlink{nPy}{nPy} relate to the number of
+tiles and how they are distributed on processors.  When using exch2,
+the tiles are stored in single dimension, and so
+\code{\varlink{nSy}{nSy}=1} in all cases.  Since the tiles as
+configured by exch2 cannot be split up accross processors without
+regenerating the topology, \code{\varlink{nPy}{nPy}=1} as well. \\
+
+The number of tiles MITgcm allocates and how they are distributed
+between processors depends on \varlink{nPx}{nPx} and
+\varlink{nSx}{nSx}.  \varlink{nSx}{nSx} is the number of tiles per
+processor and \varlink{nPx}{nPx} the number of processors.  The total
+number of tiles in the topology minus those listed in
+\file{blanklist.txt} must equal \code{nSx*nPx}. \\
+
+The following is an example of \file{SIZE.h} for the twelve-tile
+configuration illustrated in figure \ref{fig:12tile} running on 
+one processor: \\
+
+\begin{verbatim}
+      PARAMETER (
+     &           sNx =  16,
+     &           sNy =  32,
+     &           OLx =   2,
+     &           OLy =   2,
+     &           nSx =  12,
+     &           nSy =   1,
+     &           nPx =   1,
+     &           nPy =   1,
+     &           Nx  = sNx*nSx*nPx,
+     &           Ny  = sNy*nSy*nPy,
+     &           Nr  =   5)
+\end{verbatim}
+
+The following is an example for the twentyfour-tile topology in figure
+\ref{fig:24tile} running on six processors:
+
+\begin{verbatim}
+      PARAMETER (
+     &           sNx =  16,
+     &           sNy =  16,
+     &           OLx =   2,
+     &           OLy =   2,
+     &           nSx =   4,
+     &           nSy =   1,
+     &           nPx =   6,
+     &           nPy =   1,
+     &           Nx  = sNx*nSx*nPx,
+     &           Ny  = sNy*nSy*nPy,
+     &           Nr  =   5)
+\end{verbatim}
+
+
+
+
 
 \subsection{Key Variables}
 
 The descriptions of the variables are divided up into scalars,
-one-dimensional arrays indexed to the tile number, and two-dimensional
-arrays indexed to tile number and neighboring tile.  This division
-actually reflects  the functionality of these variables, not just the
-whim of some FORTRAN enthusiast.
+one-dimensional arrays indexed to the tile number, and two and three
+dimensional arrays indexed to tile number and neighboring tile.  This
+division reflects the functionality of these variables: The
+scalars are common to every part of the topology, the tile-indexed
+arrays to individual tiles, and the arrays indexed by tile and
+neighbor to relationships between tiles and their neighbors. \\
 
 \subsubsection{Scalars}
 
 The number of tiles in a particular topology is set with the parameter
-{\em NTILES}, and the maximum number of neighbors of any tiles by 
-{\em MAX\_NEIGHBOURS}.  These parameters are used for defining the size of
-the various one and two dimensional arrays that store tile parameters
-indexed to the tile number.
-
-The scalar parameters {\em exch2\_domain\_nxt} and 
-{\em exch2\_domain\_nyt} express the number of tiles in the x and y global
-indices.  For example, the default setup of six tiles has 
-{\em exch2\_domain\_nxt=6} and {\em exch2\_domain\_nyt=1}.  A topology of
-twenty-four square (in gridpoints) tiles, four (2x2) per subdomain, will
-have {\em exch2\_domain\_nxt=12} and {\em exch2\_domain\_nyt=2}.  Note 
-that these parameters express the tile layout to allow global data files that
-are tile-layout-neutral and have no bearing on the internal storage of the
-arrays.  The tiles are internally stored in a range from {\em 1,bi} (in the
-x axis) and y-axis variable {\em bj} is generally ignored within the package.
-
-\subsubsection{One-Dimensional Arrays}
-
-The following arrays are indexed to the tile number, and the indices are
-omitted in their descriptions.
-
-The arrays {\em exch2\_tnx} and {\em exch2\_tny} 
-express the x and y dimensions of each tile.  At present for each tile
-{\em exch2\_tnx = sNx} 
-and {\em exch2\_tny = sNy}, as assigned in {\em SIZE.h}.  Future releases of 
-MITgcm are to allow varying tile sizes.
+\code{NTILES}, and the maximum number of neighbors of any tiles by
+\code{MAX\_NEIGHBOURS}.  These parameters are used for defining the
+size of the various one and two dimensional arrays that store tile
+parameters indexed to the tile number and are assigned in the files
+generated by \file{driver.m}.\\
+
+The scalar parameters \varlink{exch2\_domain\_nxt}{exch2_domain_nxt}
+and \varlink{exch2\_domain\_nyt}{exch2_domain_nyt} express the number
+of tiles in the $x$ and $y$ global indices.  For example, the default
+setup of six tiles (Fig. \ref{fig:6tile}) has \code{exch2\_domain\_nxt=6} and
+\code{exch2\_domain\_nyt=1}.  A topology of twenty-four square tiles,
+four per subdomain (as in figure \ref{fig:24tile}), will have
+\code{exch2\_domain\_nxt=12} and \code{exch2\_domain\_nyt=2}.  Note
+that these parameters express the tile layout to allow global data
+files that are tile-layout-neutral and have no bearing on the internal
+storage of the arrays.  The tiles are internally stored in a range
+from \code{(1:\varlink{bi}{bi})} the $x$ axis, and $y$ axis variable
+\varlink{bj}{bj} is generally ignored within the package. \\
+
+\subsubsection{Arrays Indexed to Tile Number}
+
+The following arrays are of length \code{NTILES}, are indexed to the
+tile number, and the indices are omitted in their descriptions. \\
+
+The arrays \varlink{exch2\_tnx}{exch2_tnx} and
+\varlink{exch2\_tny}{exch2_tny} express the $x$ and $y$ dimensions of
+each tile.  At present for each tile \texttt{exch2\_tnx=sNx} and
+\texttt{exch2\_tny=sNy}, as assigned in \file{SIZE.h} and described in
+section \ref{sec:exch2mpi} \sectiontitle{exch2, SIZE.h, and
+multiprocessing}.  Future releases of MITgcm are to allow varying tile
+sizes. \\
+
+The location of the tiles' Cartesian origin within a subdomain are
+determined by the arrays \varlink{exch2\_tbasex}{exch2_tbasex} and
+\varlink{exch2\_tbasey}{exch2_tbasey}.  These variables are used to
+relate the location of the edges of different tiles to each other.  As
+an example, in the default six-tile topology (Fig. \ref{fig:6tile})
+each index in these arrays is set to \code{0} since a tile occupies
+its entire subdomain.  The twentyfour-tile case discussed above will
+have values of \code{0} or \code{16}, depending on the quadrant the
+tile falls within the subdomain.  The elements of the arrays
+\varlink{exch2\_txglobalo}{exch2_txglobalo} and
+\varlink{exch2\_txglobalo}{exch2_txglobalo} are similar to
+\varlink{exch2\_tbasex}{exch2_tbasex} and
+\varlink{exch2\_tbasey}{exch2_tbasey}, but locate the tiles within the
+global address space, similar to that used by global files. \\
+
+The array \varlink{exch2\_myFace}{exch2_myFace} contains the number of
+the subdomain of each tile, in a range \code{(1:6)} in the case of the
+standard cube topology and indicated by \textbf{\textsf{f}}$n$ in
+figures \ref{fig:12tile} and
+\ref{fig:24tile}. \varlink{exch2\_nNeighbours}{exch2_nNeighbours}
+contains a count of how many neighboring tiles each tile has, and is
+used for setting bounds for looping over neighboring tiles.
+\varlink{exch2\_tProc}{exch2_tProc} holds the process rank of each
+tile, and is used in interprocess communication.  \\
+
+
+The arrays \varlink{exch2\_isWedge}{exch2_isWedge},
+\varlink{exch2\_isEedge}{exch2_isEedge},
+\varlink{exch2\_isSedge}{exch2_isSedge}, and
+\varlink{exch2\_isNedge}{exch2_isNedge} are set to \code{1} if the
+indexed tile lies on the edge of a subdomain, \code{0} if not.  The
+values are used within the topology generator to determine the
+orientation of neighboring tiles, and to indicate whether a tile lies
+on the corner of a subdomain.  The latter case requires special
+exchange and numerical handling for the singularities at the eight
+corners of the cube. \\
+
+
+\subsubsection{Arrays Indexed to Tile Number and Neighbor}
+
+The following arrays are all of size
+\code{MAX\_NEIGHBOURS}$\times$\code{NTILES} and describe the
+orientations between the the tiles. \\
+
+The array \code{exch2\_neighbourId(a,T)} holds the tile number
+\code{Tn} for each of the tile number \code{T}'s neighboring tiles
+\code{a}.  The neighbor tiles are indexed \code{(1:MAX\_NEIGHBOURS)}
+in the order right to left on the north then south edges, and then top
+to bottom on the east and west edges.  Maybe throw in a fig here, eh?
+\\
+
+\sloppy
+The \code{exch2\_opposingSend\_record(a,T)} array holds the index
+\code{b} in \texttt{exch2\_neighbourId(b,Tn)} that holds the tile
+number \code{T}.  In other words,
+\begin{verbatim}
+   exch2_neighbourId( exch2_opposingSend_record(a,T),
+                      exch2_neighbourId(a,T) ) = T
+\end{verbatim}
+This provides a back-reference from the neighbor tiles. \\
 
-The location of the tiles' Cartesian origin within a subdomain are determined 
-by the arrays {\em exch2\_tbasex} and {\em exch2\_tbasey}.  These
+The arrays \varlink{exch2\_pi}{exch2_pi} and
+\varlink{exch2\_pj}{exch2_pj} specify the transformations of variables
+in exchanges between the neighboring tiles.  These transformations are
+necessary in exchanges between subdomains because a physical vector
+component in one direction may map to one in a different direction in
+an adjacent subdomain, and may be have its indexing reversed. This
+swapping arises from the ``folding'' of two-dimensional arrays into a
+three-dimensional cube.
+
+The dimensions of \code{exch2\_pi(t,N,T)} and \code{exch2\_pj(t,N,T)}
+are the neighbor ID \code{N} and the tile number \code{T} as explained
+above, plus a vector of length 2 containing transformation factors
+\code{t}.  The first element of the transformation vector indicates
+the factor \code{t} by which variables representing the same
+\emph{physical} vector component of a tile \code{T} will be multiplied
+in exchanges with neighbor \code{N}, and the second element indicates
+the transform to the physical vector in the other direction.  To
+clarify (hopefully), \code{exch2\_pi(1,N,T)} holds the transform of
+the $i$ component of a vector variable in tile \code{T} to the $i$
+component of tile \code{T}'s neighbor \code{N}, and
+\code{exch2\_pi(2,N,T)} holds the transform of \code{T}'s $i$
+components to the neighbor \code{N}'s $j$ component. \\
+ 
+Under the current cube topology, one of the two elements of
+\code{exch2\_pi} or \code{exch2\_pj} for a given tile \code{T} and
+neighbor \code{N} will be \code{0}, reflecting the fact that the two
+vector components are orthogonal.  The other element will be \code{1}
+or \code{-1}, depending on whether the components are indexed in the
+same or opposite directions.  For example, the transform vector of the
+arrays for all tile neighbors on the same subdomain will be
+\code{(1,0)}, since all tiles on the same subdomain are oriented
+identically.  A vector direction that corresponds to the orthogonal
+dimension with the same index direction in a particular tile-neighbor
+orientation will have \code{(0,1)}, whereas those in the opposite
+index direction will have \code{(0,-1)}. \\
+
+
+\varlink{exch2\_oi}{exch2_oi},
+\varlink{exch2\_oj}{exch2_oj}, \varlink{exch2\_oi\_f}{exch2_oi_f}, and
+\varlink{exch2\_oj\_f}{exch2_oj_f}
 
-\subsubsection{Two-Dimensional Arrays}
 
 
-//
 
+This needs some diagrams. \\
+
+
+{\footnotesize
 \begin{verbatim}
-C      NTILES            :: Number of tiles in this topology 
-C      MAX_NEIGHBOURS    :: Maximum number of neighbours any tile has.
-C      exch2_domain_nxt  :: Total domain length in tiles. 
-C      exch2_domain_nyt  :: Maximum domain height in tiles. 
-C      exch2_tnx         :: Size in X for each tile.                  
-C      exch2_tny         :: Size in Y for each tile.                  
-C      exch2_tbasex      :: Tile offset in X within its sub-domain (cube face)
-C      exch2_tbasey      :: Tile offset in Y within its sub-domain (cube face)
-C      exch2_tglobalxlo  :: Tile base X index within global index space.
-C      exch2_tglobalylo  :: Tile base Y index within global index space.
-C      exch2_isWedge     :: 0 if West not at domain edge, 1 if it is.   
-C      exch2_isNedge     :: 0 if North not at domain edge, 1 if it is.   
-C      exch2_isEedge     :: 0 if East not at domain edge, 1 if it is.   
-C      exch2_isSedge     :: 0 if South not at domain edge, 1 if it is.   
-C      exch2_myFace      :: Cube face number used for I/O.               
-C      exch2_nNeighbours :: Tile neighbour entries count.               
-C      exch2_tProc       :: Rank of process owning tile                 
-C                        :: (filled at run time).                       
-C      exch2_neighbourId :: Tile number for each neighbour entry.        
-C      exch2_opposingSend_record :: Record for entry in target tile send 
-C                                :: list that has this tile and face     
-C                                :: as its target.                       
 C      exch2_pi          :: X index row of target to source permutation 
 C                        :: matrix for each neighbour entry.            
 C      exch2_pj          :: Y index row of target to source permutation 
@@ -119,7 +431,7 @@
 C                        :: offset vector for face quantities           
 C                        :: of each neighbor entry.                     
 \end{verbatim}
-
+}