--- manual/s_phys_pkgs/text/exch2.tex	2004/03/17 19:49:22	1.13
+++ manual/s_phys_pkgs/text/exch2.tex	2004/03/19 21:25:45	1.17
@@ -1,4 +1,4 @@
-% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/text/exch2.tex,v 1.13 2004/03/17 19:49:22 afe Exp $
+% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/text/exch2.tex,v 1.17 2004/03/19 21:25:45 afe Exp $
 % $Name:  $
 
 %%  * Introduction
@@ -16,16 +16,14 @@
 
 \subsection{Introduction}
 
-The \texttt{exch2} package extends the original cubed
-sphere topology configuration to allow more flexible domain
-decomposition and parallelization.  Cube faces (also called
-subdomains) may be divided into any number of tiles that divide evenly
-into the grid point dimensions of the subdomain.  Furthermore, the
-individual tiles can run on separate processors in different
-combinations, and whether exchanges between particular tiles occur
-between different processors is determined at runtime.  This
-flexibility provides for manual compile-time load balancing across a
-relatively arbitrary number of processors. \\
+The \texttt{exch2} package extends the original cubed sphere topology
+configuration to allow more flexible domain decomposition and
+parallelization.  Cube faces (also called subdomains) may be divided
+into any number of tiles that divide evenly into the grid point
+dimensions of the subdomain.  Furthermore, the tiles can run on
+separate processors individually or in groups, which provides for
+manual compile-time load balancing across a relatively arbitrary
+number of processors. \\
 
 The exchange parameters are declared in
 \filelink{pkg/exch2/W2\_EXCH2\_TOPOLOGY.h}{pkg-exch2-W2_EXCH2_TOPOLOGY.h}
@@ -55,13 +53,13 @@
   details. \\
 
 $\bullet$ An example of \file{W2\_EXCH2\_TOPOLOGY.h} and
-  \file{w2\_e2setup.F} must reside in a directory containing code
-  linked when \file{genmake2} runs.  The safest place to put these
-  is the directory indicated in the \code{-mods=DIR} command line
-  modifier (typically \file{../code}), or the build directory.  The
-  default versions of these files reside in \file{pkg/exch2} and are
-  linked automatically if no other versions exist elsewhere in the
-  link path, but they should be left untouched to avoid breaking
+  \file{w2\_e2setup.F} must reside in a directory containing files
+  symbolically linked when \file{genmake2} runs.  The safest place to
+  put these is the directory indicated in the \code{-mods=DIR} command
+  line modifier (typically \file{../code}), or the build directory.
+  The default versions of these files reside in \file{pkg/exch2} and
+  are linked automatically if no other versions exist elsewhere in the
+  build path, but they should be left untouched to avoid breaking
   configurations other than the one you intend to modify.\\
 
 $\bullet$ Files containing grid parameters, named
@@ -84,8 +82,8 @@
   Section \ref{sect:specifying_a_decomposition}
   \sectiontitle{Specifying a decomposition}.\\
 
-As of the time of writing the following examples use exch2 and may be
-used for guidance:
+At the time of this writing the following examples use exch2 and may
+be used for guidance:
 
 \begin{verbatim}
 verification/adjust_nlfs.cs-32x32x1
@@ -129,7 +127,7 @@
 the tiles into which the subdomains are decomposed, and must evenly
 divide the integer assigned to \code{nr}, \code{nb} and \code{ng}.
 The result is a rectangular tiling of the subdomain.  Figure
-\ref{fig:24tile} shows one possible topology for a twentyfour-tile
+\ref{fig:24tile} shows one possible topology for a twenty-four-tile
 cube, and figure \ref{fig:12tile} shows one for twelve tiles. \\
 
 \begin{figure}
@@ -140,9 +138,9 @@
 \end{center} 
 
 \caption{Plot of a cubed sphere topology with a 32$\times$192 domain
-divided into six 32$\times$32 subdomains, each of which is divided into four tiles 
-(\code{tnx=16, tny=16}) for a total of twentyfour tiles.
-} \label{fig:24tile}
+divided into six 32$\times$32 subdomains, each of which is divided
+into four tiles (\code{tnx=16, tny=16}) for a total of twenty-four
+tiles.  } \label{fig:24tile}
 \end{figure}
 
 \begin{figure}
@@ -206,7 +204,7 @@
 The parameters \varlink{nSx}{nSx}, \varlink{nSy}{nSy},
 \varlink{nPx}{nPx}, and \varlink{nPy}{nPy} relate to the number of
 tiles and how they are distributed on processors.  When using exch2,
-the tiles are stored in single dimension, and so
+the tiles are stored in a single dimension, and so
 \code{\varlink{nSy}{nSy}=1} in all cases.  Since the tiles as
 configured by exch2 cannot be split up accross processors without
 regenerating the topology, \code{\varlink{nPy}{nPy}=1} as well. \\
@@ -237,8 +235,8 @@
      &           Nr  =   5)
 \end{verbatim}
 
-The following is an example for the twentyfour-tile topology in figure
-\ref{fig:24tile} running on six processors:
+The following is an example for the twenty-four-tile topology in
+figure \ref{fig:24tile} running on six processors:
 
 \begin{verbatim}
       PARAMETER (
@@ -262,9 +260,9 @@
 \subsection{Key Variables}
 
 The descriptions of the variables are divided up into scalars,
-one-dimensional arrays indexed to the tile number, and two and three
-dimensional arrays indexed to tile number and neighboring tile.  This
-division reflects the functionality of these variables: The
+one-dimensional arrays indexed to the tile number, and two and
+three-dimensional arrays indexed to tile number and neighboring tile.
+This division reflects the functionality of these variables: The
 scalars are common to every part of the topology, the tile-indexed
 arrays to individual tiles, and the arrays indexed by tile and
 neighbor to relationships between tiles and their neighbors. \\
@@ -281,20 +279,22 @@
 The scalar parameters \varlink{exch2\_domain\_nxt}{exch2_domain_nxt}
 and \varlink{exch2\_domain\_nyt}{exch2_domain_nyt} express the number
 of tiles in the $x$ and $y$ global indices.  For example, the default
-setup of six tiles (Fig. \ref{fig:6tile}) has \code{exch2\_domain\_nxt=6} and
-\code{exch2\_domain\_nyt=1}.  A topology of twenty-four square tiles,
-four per subdomain (as in figure \ref{fig:24tile}), will have
-\code{exch2\_domain\_nxt=12} and \code{exch2\_domain\_nyt=2}.  Note
-that these parameters express the tile layout to allow global data
-files that are tile-layout-neutral and have no bearing on the internal
-storage of the arrays.  The tiles are internally stored in a range
-from \code{(1:\varlink{bi}{bi})} the $x$ axis, and $y$ axis variable
-\varlink{bj}{bj} is generally ignored within the package. \\
+setup of six tiles (Fig. \ref{fig:6tile}) has
+\code{exch2\_domain\_nxt=6} and \code{exch2\_domain\_nyt=1}.  A
+topology of twenty-four square tiles, four per subdomain (as in figure
+\ref{fig:24tile}), will have \code{exch2\_domain\_nxt=12} and
+\code{exch2\_domain\_nyt=2}.  Note that these parameters express the
+tile layout to allow global data files that are tile-layout-neutral
+and have no bearing on the internal storage of the arrays.  The tiles
+are stored internally in the range \code{(1:\varlink{bi}{bi})} along
+the $x$ axis, and the $y$ axis variable \varlink{bj}{bj} is generally
+ignored within the package. \\
 
 \subsubsection{Arrays Indexed to Tile Number}
 
-The following arrays are of length \code{NTILES}, are indexed to the
-tile number, and the indices are omitted in their descriptions. \\
+The following arrays are of length \code{NTILES} and are indexed to
+the tile number, which is indicated in the diagrams with the notation
+\textsf{t}$n$.  The indices are omitted in the descriptions. \\
 
 The arrays \varlink{exch2\_tnx}{exch2_tnx} and
 \varlink{exch2\_tny}{exch2_tny} express the $x$ and $y$ dimensions of
@@ -310,22 +310,23 @@
 relate the location of the edges of different tiles to each other.  As
 an example, in the default six-tile topology (Fig. \ref{fig:6tile})
 each index in these arrays is set to \code{0} since a tile occupies
-its entire subdomain.  The twentyfour-tile case discussed above will
+its entire subdomain.  The twenty-four-tile case discussed above will
 have values of \code{0} or \code{16}, depending on the quadrant the
 tile falls within the subdomain.  The elements of the arrays
 \varlink{exch2\_txglobalo}{exch2_txglobalo} and
 \varlink{exch2\_txglobalo}{exch2_txglobalo} are similar to
 \varlink{exch2\_tbasex}{exch2_tbasex} and
 \varlink{exch2\_tbasey}{exch2_tbasey}, but locate the tiles within the
-global address space, similar to that used by global files. \\
+global address space, similar to that used by global output and input
+files. \\
 
 The array \varlink{exch2\_myFace}{exch2_myFace} contains the number of
 the subdomain of each tile, in a range \code{(1:6)} in the case of the
 standard cube topology and indicated by \textbf{\textsf{f}}$n$ in
 figures \ref{fig:12tile} and
 \ref{fig:24tile}. \varlink{exch2\_nNeighbours}{exch2_nNeighbours}
-contains a count of how many neighboring tiles each tile has, and is
-used for setting bounds for looping over neighboring tiles.
+contains a count of the neighboring tiles each tile has, and is used
+for setting bounds for looping over neighboring tiles.
 \varlink{exch2\_tProc}{exch2_tProc} holds the process rank of each
 tile, and is used in interprocess communication.  \\
 
@@ -334,31 +335,30 @@
 \varlink{exch2\_isEedge}{exch2_isEedge},
 \varlink{exch2\_isSedge}{exch2_isSedge}, and
 \varlink{exch2\_isNedge}{exch2_isNedge} are set to \code{1} if the
-indexed tile lies on the edge of a subdomain, \code{0} if not.  The
-values are used within the topology generator to determine the
-orientation of neighboring tiles, and to indicate whether a tile lies
-on the corner of a subdomain.  The latter case requires special
+indexed tile lies on the respective edge of a subdomain, \code{0} if
+not.  The values are used within the topology generator to determine
+the orientation of neighboring tiles, and to indicate whether a tile
+lies on the corner of a subdomain.  The latter case requires special
 exchange and numerical handling for the singularities at the eight
 corners of the cube. \\
 
 
 \subsubsection{Arrays Indexed to Tile Number and Neighbor}
 
-The following arrays are all of size
-\code{MAX\_NEIGHBOURS}$\times$\code{NTILES} and describe the
-orientations between the the tiles. \\
+The following arrays have dimensions of length \code{MAX\_NEIGHBOURS}
+and \code{NTILES} and describe the orientations between the tiles. \\
 
 The array \code{exch2\_neighbourId(a,T)} holds the tile number
 \code{Tn} for each of the tile number \code{T}'s neighboring tiles
-\code{a}.  The neighbor tiles are indexed \code{(1:MAX\_NEIGHBOURS)}
-in the order right to left on the north then south edges, and then top
-to bottom on the east and west edges.  Maybe throw in a fig here, eh?
-\\
-
-\sloppy
-The \code{exch2\_opposingSend\_record(a,T)} array holds the index
-\code{b} in \texttt{exch2\_neighbourId(b,Tn)} that holds the tile
-number \code{T}.  In other words,
+\code{a}.  The neighbor tiles are indexed
+\code{(1:exch2\_nNeighbours(T))} in the order right to left on the
+north then south edges, and then top to bottom on the east then west
+edges.  \\
+
+ The \code{exch2\_opposingSend\_record(a,T)} array holds the
+index \code{b} of the element in \texttt{exch2\_neighbourId(b,Tn)}
+that holds the tile number \code{T}, given
+\code{Tn=exch2\_neighbourId(a,T)}.  In other words,
 \begin{verbatim}
    exch2_neighbourId( exch2_opposingSend_record(a,T),
                       exch2_neighbourId(a,T) ) = T
@@ -366,77 +366,152 @@
 This provides a back-reference from the neighbor tiles. \\
 
 The arrays \varlink{exch2\_pi}{exch2_pi} and
-\varlink{exch2\_pj}{exch2_pj} specify the transformations of variables
+\varlink{exch2\_pj}{exch2_pj} specify the transformations of indices
 in exchanges between the neighboring tiles.  These transformations are
-necessary in exchanges between subdomains because a physical vector
-component in one direction may map to one in a different direction in
-an adjacent subdomain, and may be have its indexing reversed. This
-swapping arises from the ``folding'' of two-dimensional arrays into a
-three-dimensional cube.
+necessary in exchanges between subdomains because the array index in
+one dimension may map to the other index in an adjacent subdomain, and
+may have its indexing reversed.  This swapping arises from the
+``folding'' of two-dimensional arrays into a three-dimensional
+cube. \\
 
 The dimensions of \code{exch2\_pi(t,N,T)} and \code{exch2\_pj(t,N,T)}
 are the neighbor ID \code{N} and the tile number \code{T} as explained
-above, plus a vector of length 2 containing transformation factors
-\code{t}.  The first element of the transformation vector indicates
-the factor \code{t} by which variables representing the same
-\emph{physical} vector component of a tile \code{T} will be multiplied
-in exchanges with neighbor \code{N}, and the second element indicates
-the transform to the physical vector in the other direction.  To
-clarify (hopefully), \code{exch2\_pi(1,N,T)} holds the transform of
-the $i$ component of a vector variable in tile \code{T} to the $i$
-component of tile \code{T}'s neighbor \code{N}, and
-\code{exch2\_pi(2,N,T)} holds the transform of \code{T}'s $i$
-components to the neighbor \code{N}'s $j$ component. \\
+above, plus a vector of length \code{2} containing transformation
+factors \code{t}.  The first element of the transformation vector
+holds the factor to multiply the index in the same axis, and the
+second element holds the same for the orthogonal index.  To
+clarify, \code{exch2\_pi(1,N,T)} holds the mapping of the $x$ axis
+index of tile \code{T} to the $x$ axis of tile \code{T}'s neighbor
+\code{N}, and \code{exch2\_pi(2,N,T)} holds the mapping of \code{T}'s
+$x$ index to the neighbor \code{N}'s $y$ index. \\
  
-Under the current cube topology, one of the two elements of
-\code{exch2\_pi} or \code{exch2\_pj} for a given tile \code{T} and
-neighbor \code{N} will be \code{0}, reflecting the fact that the two
-vector components are orthogonal.  The other element will be \code{1}
-or \code{-1}, depending on whether the components are indexed in the
-same or opposite directions.  For example, the transform vector of the
-arrays for all tile neighbors on the same subdomain will be
+One of the two elements of \code{exch2\_pi} or \code{exch2\_pj} for a
+given tile \code{T} and neighbor \code{N} will be \code{0}, reflecting
+the fact that the two axes are orthogonal.  The other element will be
+\code{1} or \code{-1}, depending on whether the axes are indexed in
+the same or opposite directions.  For example, the transform vector of
+the arrays for all tile neighbors on the same subdomain will be
 \code{(1,0)}, since all tiles on the same subdomain are oriented
-identically.  A vector direction that corresponds to the orthogonal
-dimension with the same index direction in a particular tile-neighbor
-orientation will have \code{(0,1)}, whereas those in the opposite
-index direction will have \code{(0,-1)}. \\
+identically.  An axis that corresponds to the orthogonal dimension
+with the same index direction in a particular tile-neighbor
+orientation will have \code{(0,1)}.  Those in the opposite index
+direction will have \code{(0,-1)} in order to reverse the ordering. \\
 
-
-\varlink{exch2\_oi}{exch2_oi},
+The arrays \varlink{exch2\_oi}{exch2_oi},
 \varlink{exch2\_oj}{exch2_oj}, \varlink{exch2\_oi\_f}{exch2_oi_f}, and
-\varlink{exch2\_oj\_f}{exch2_oj_f}
+\varlink{exch2\_oj\_f}{exch2_oj_f} are indexed to tile number and
+neighbor and specify the relative offset within the subdomain of the
+array index of a variable going from a neighboring tile \code{N} to a
+local tile \code{T}.  Consider \code{T=1} in the six-tile topology
+(Fig. \ref{fig:6tile}), where
 
+\begin{verbatim}
+       exch2_oi(1,1)=33
+       exch2_oi(2,1)=0
+       exch2_oi(3,1)=32
+       exch2_oi(4,1)=-32
+\end{verbatim}
 
+The simplest case is \code{exch2\_oi(2,1)}, the southern neighbor,
+which is \code{Tn=6}.  The axes of \code{T} and \code{Tn} have the
+same orientation and their $x$ axes have the same origin, and so an
+exchange between the two requires no changes to the $x$ index.  For
+the western neighbor (\code{Tn=5}), \code{exch2\_oi(3,1)=32} since the
+\code{x=0} vector on \code{T} corresponds to the \code{y=32} vector on
+\code{Tn}.  The eastern edge of \code{T} shows the reverse case
+(\code{exch2\_oi(4,1)=-32}), where \code{x=32} on \code{T} exchanges
+with \code{x=0} on \code{Tn=2}. \\
+
+ The most interesting case, where \code{exch2\_oi(1,1)=33} and
+\code{Tn=3}, involves a reversal of indices.  As in every case, the
+offset \code{exch2\_oi} is added to the original $x$ index of \code{T}
+multiplied by the transformation factor \code{exch2\_pi(t,N,T)}.  Here
+\code{exch2\_pi(1,1,1)=0} since the $x$ axis of \code{T} is orthogonal
+to the $x$ axis of \code{Tn}.  \code{exch2\_pi(2,1,1)=-1} since the
+$x$ axis of \code{T} corresponds to the $y$ axis of \code{Tn}, but the
+index is reversed.  The result is that the index of the northern edge
+of \code{T}, which runs \code{(1:32)}, is transformed to
+\code{(-1:-32)}. \code{exch2\_oi(1,1)} is then added to this range to
+get back \code{(32:1)} -- the index of the $y$ axis of \code{Tn}
+relative to \code{T}.  This transformation may seem overly convoluted
+for the six-tile case, but it is necessary to provide a general
+solution for various topologies. \\
 
 
-This needs some diagrams. \\
 
+Finally, \varlink{exch2\_itlo\_c}{exch2_itlo_c},
+\varlink{exch2\_ithi\_c}{exch2_ithi_c},
+\varlink{exch2\_jtlo\_c}{exch2_jtlo_c} and
+\varlink{exch2\_jthi\_c}{exch2_jthi_c} hold the location and index
+bounds of the edge segment of the neighbor tile \code{N}'s subdomain
+that gets exchanged with the local tile \code{T}.  To take the example
+of tile \code{T=2} in the twelve-tile topology
+(Fig. \ref{fig:12tile}): \\
 
-{\footnotesize
 \begin{verbatim}
-C      exch2_pi          :: X index row of target to source permutation 
-C                        :: matrix for each neighbour entry.            
-C      exch2_pj          :: Y index row of target to source permutation 
-C                        :: matrix for each neighbour entry.            
-C      exch2_oi          :: X index element of target to source 
-C                        :: offset vector for cell-centered quantities  
-C                        :: of each neighbor entry.                     
-C      exch2_oj          :: Y index element of target to source 
-C                        :: offset vector for cell-centered quantities  
-C                        :: of each neighbor entry.                     
-C      exch2_oi_f        :: X index element of target to source 
-C                        :: offset vector for face quantities           
-C                        :: of each neighbor entry.                     
-C      exch2_oj_f        :: Y index element of target to source 
-C                        :: offset vector for face quantities           
-C                        :: of each neighbor entry.                     
+       exch2_itlo_c(4,2)=17
+       exch2_ithi_c(4,2)=17
+       exch2_jtlo_c(4,2)=0
+       exch2_jthi_c(4,2)=33
 \end{verbatim}
-}
+ 
+Here \code{N=4}, indicating the western neighbor, which is
+\code{Tn=1}.  \code{Tn} resides on the same subdomain as \code{T}, so
+the tiles have the same orientation and the same $x$ and $y$ axes.
+The $x$ axis is orthogonal to the western edge and the tile is 16
+points wide, so \code{exch2\_itlo\_c} and \code{exch2\_ithi\_c}
+indicate the column beyond \code{Tn}'s eastern edge, in that tile's
+halo region. Since the border of the tiles extends through the entire
+height of the subdomain, the $y$ axis bounds \code{exch2\_jtlo\_c} to
+\code{exch2\_jthi\_c} cover the height of \code{(1:32)}, plus 1 in
+either direction to cover part of the halo. \\
 
+For the north edge of the same tile \code{T=2} where \code{N=1} and 
+the neighbor tile is \code{Tn=5}:
 
+\begin{verbatim}
+       exch2_itlo_c(1,2)=0
+       exch2_ithi_c(1,2)=0
+       exch2_jtlo_c(1,2)=0
+       exch2_jthi_c(1,2)=17
+\end{verbatim}
+ 
+\code{T}'s northern edge is parallel to the $x$ axis, but since
+\code{Tn}'s $y$ axis corresponds to \code{T}'s $x$ axis, \code{T}'s
+northern edge exchanges with \code{Tn}'s western edge.  The western
+edge of the tiles corresponds to the lower bound of the $x$ axis, so
+\code{exch2\_itlo\_c} and \code{exch2\_ithi\_c} are \code{0}.  The range
+of \code{exch2\_jtlo\_c} and \code{exch2\_jthi\_c} corresponds to the
+width of \code{T}'s northern edge, plus the halo. \\
 
-\subsection{Key Routines}
 
+\subsection{Key Routines}
 
+Most of the subroutines particular to exch2 handle the exchanges
+themselves and are of the same format as those described in
+Section \ref{sect:cube_sphere_communication} \sectiontitle{Cube sphere
+communication}.  Like the original routines, they are written as
+templates which the local Makefile converts from RX into RL and RS
+forms. \\
+
+The interfaces with the core model subroutines are
+\code{EXCH\_UV\_XY\_RX}, \code{EXCH\_UV\_XYZ\_RX} and
+\code{EXCH\_XY\_RX}.  They override the standard exchange routines
+when \code{genmake2} is run with the \code{exch2} option.  They in turn
+call the local exch2 subroutines \code{EXCH2\_UV\_XY\_RX} and
+\code{EXCH2\_UV\_XYZ\_RX} for two and three-dimensional vector
+quantities, and \code{EXCH2\_XY\_RX} and \code{EXCH2\_XYZ\_RX} for two
+and three-dimensional scalar quantities.  These subroutines set the
+dimensions of the area to be exchanged, call \code{EXCH2\_RX1\_CUBE}
+for scalars and \code{EXCH2\_RX2\_CUBE} for vectors, and then handle
+the singularities at the cube corners. \\
+
+The separate scalar and vector forms of \code{EXCH2\_RX1\_CUBE} and
+\code{EXCH2\_RX2\_CUBE} reflect that the vector-handling subroutine
+needs to pass both the $u$ and $v$ components of the physical vectors.
+This arises from the topological folding discussed above, where the
+$x$ and $y$ axes get swapped in some cases.  This swapping is not an
+issue with the scalar version. These subroutines call
+\code{EXCH2\_SEND\_RX1} and \code{EXCH2\_SEND\_RX2}, which do most of
+the work using the variables discussed above. \\
 
-\subsection{References}