--- manual/s_phys_pkgs/text/exch2.tex 2004/03/18 22:20:38 1.16 +++ manual/s_phys_pkgs/text/exch2.tex 2004/03/19 21:25:45 1.17 @@ -1,4 +1,4 @@ -% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/text/exch2.tex,v 1.16 2004/03/18 22:20:38 afe Exp $ +% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/text/exch2.tex,v 1.17 2004/03/19 21:25:45 afe Exp $ % $Name: $ %% * Introduction @@ -16,16 +16,14 @@ \subsection{Introduction} -The \texttt{exch2} package extends the original cubed -sphere topology configuration to allow more flexible domain -decomposition and parallelization. Cube faces (also called -subdomains) may be divided into any number of tiles that divide evenly -into the grid point dimensions of the subdomain. Furthermore, the -individual tiles can run on separate processors in different -combinations, and whether exchanges between particular tiles occur -between different processors is determined at runtime. This -flexibility provides for manual compile-time load balancing across a -relatively arbitrary number of processors. \\ +The \texttt{exch2} package extends the original cubed sphere topology +configuration to allow more flexible domain decomposition and +parallelization. Cube faces (also called subdomains) may be divided +into any number of tiles that divide evenly into the grid point +dimensions of the subdomain. Furthermore, the tiles can run on +separate processors individually or in groups, which provides for +manual compile-time load balancing across a relatively arbitrary +number of processors. \\ The exchange parameters are declared in \filelink{pkg/exch2/W2\_EXCH2\_TOPOLOGY.h}{pkg-exch2-W2_EXCH2_TOPOLOGY.h} @@ -55,13 +53,13 @@ details. \\ $\bullet$ An example of \file{W2\_EXCH2\_TOPOLOGY.h} and - \file{w2\_e2setup.F} must reside in a directory containing code - linked when \file{genmake2} runs. 
The safest place to put these - is the directory indicated in the \code{-mods=DIR} command line - modifier (typically \file{../code}), or the build directory. The - default versions of these files reside in \file{pkg/exch2} and are - linked automatically if no other versions exist elsewhere in the - link path, but they should be left untouched to avoid breaking + \file{w2\_e2setup.F} must reside in a directory containing files + symbolically linked when \file{genmake2} runs. The safest place to + put these is the directory indicated in the \code{-mods=DIR} command + line modifier (typically \file{../code}), or the build directory. + The default versions of these files reside in \file{pkg/exch2} and + are linked automatically if no other versions exist elsewhere in the + build path, but they should be left untouched to avoid breaking configurations other than the one you intend to modify.\\ $\bullet$ Files containing grid parameters, named @@ -84,8 +82,8 @@ Section \ref{sect:specifying_a_decomposition} \sectiontitle{Specifying a decomposition}.\\ -As of the time of writing the following examples use exch2 and may be -used for guidance: +At the time of this writing the following examples use exch2 and may +be used for guidance: \begin{verbatim} verification/adjust_nlfs.cs-32x32x1 @@ -129,7 +127,7 @@ the tiles into which the subdomains are decomposed, and must evenly divide the integer assigned to \code{nr}, \code{nb} and \code{ng}. The result is a rectangular tiling of the subdomain. Figure -\ref{fig:24tile} shows one possible topology for a twentyfour-tile +\ref{fig:24tile} shows one possible topology for a twenty-four-tile cube, and figure \ref{fig:12tile} shows one for twelve tiles. \\ \begin{figure} @@ -140,9 +138,9 @@ \end{center} \caption{Plot of a cubed sphere topology with a 32$\times$192 domain -divided into six 32$\times$32 subdomains, each of which is divided into four tiles -(\code{tnx=16, tny=16}) for a total of twentyfour tiles. 
-} \label{fig:24tile} +divided into six 32$\times$32 subdomains, each of which is divided +into four tiles (\code{tnx=16, tny=16}) for a total of twenty-four +tiles. } \label{fig:24tile} \end{figure} \begin{figure} @@ -206,7 +204,7 @@ The parameters \varlink{nSx}{nSx}, \varlink{nSy}{nSy}, \varlink{nPx}{nPx}, and \varlink{nPy}{nPy} relate to the number of tiles and how they are distributed on processors. When using exch2, -the tiles are stored in single dimension, and so +the tiles are stored in a single dimension, and so \code{\varlink{nSy}{nSy}=1} in all cases. Since the tiles as configured by exch2 cannot be split up accross processors without regenerating the topology, \code{\varlink{nPy}{nPy}=1} as well. \\ @@ -237,8 +235,8 @@ & Nr = 5) \end{verbatim} -The following is an example for the twentyfour-tile topology in figure -\ref{fig:24tile} running on six processors: +The following is an example for the twenty-four-tile topology in +figure \ref{fig:24tile} running on six processors: \begin{verbatim} PARAMETER ( @@ -262,9 +260,9 @@ \subsection{Key Variables} The descriptions of the variables are divided up into scalars, -one-dimensional arrays indexed to the tile number, and two and three -dimensional arrays indexed to tile number and neighboring tile. This -division reflects the functionality of these variables: The +one-dimensional arrays indexed to the tile number, and two and +three-dimensional arrays indexed to tile number and neighboring tile. +This division reflects the functionality of these variables: The scalars are common to every part of the topology, the tile-indexed arrays to individual tiles, and the arrays indexed by tile and neighbor to relationships between tiles and their neighbors. \\ @@ -288,14 +286,14 @@ \code{exch2\_domain\_nyt=2}. Note that these parameters express the tile layout to allow global data files that are tile-layout-neutral and have no bearing on the internal storage of the arrays. 
The tiles -are internally stored in a range from \code{(1:\varlink{bi}{bi})} the -$x$ axis, and the $y$ axis variable \varlink{bj}{bj} is generally +are stored internally in a range from \code{(1:\varlink{bi}{bi})} the +$x$ axis, and the $y$ axis variable \varlink{bj}{bj} generally is ignored within the package. \\ \subsubsection{Arrays Indexed to Tile Number} -The following arrays are of length \code{NTILES}and are indexed to the -tile number, which is indicated in the diagrams with the notation +The following arrays are of length \code{NTILES} and are indexed to +the tile number, which is indicated in the diagrams with the notation \textsf{t}$n$. The indices are omitted in the descriptions. \\ The arrays \varlink{exch2\_tnx}{exch2_tnx} and @@ -312,22 +310,23 @@ relate the location of the edges of different tiles to each other. As an example, in the default six-tile topology (Fig. \ref{fig:6tile}) each index in these arrays is set to \code{0} since a tile occupies -its entire subdomain. The twentyfour-tile case discussed above will +its entire subdomain. The twenty-four-tile case discussed above will have values of \code{0} or \code{16}, depending on the quadrant the tile falls within the subdomain. The elements of the arrays \varlink{exch2\_txglobalo}{exch2_txglobalo} and \varlink{exch2\_txglobalo}{exch2_txglobalo} are similar to \varlink{exch2\_tbasex}{exch2_tbasex} and \varlink{exch2\_tbasey}{exch2_tbasey}, but locate the tiles within the -global address space, similar to that used by global files. \\ +global address space, similar to that used by global output and input +files. \\ The array \varlink{exch2\_myFace}{exch2_myFace} contains the number of the subdomain of each tile, in a range \code{(1:6)} in the case of the standard cube topology and indicated by \textbf{\textsf{f}}$n$ in figures \ref{fig:12tile} and \ref{fig:24tile}. 
\varlink{exch2\_nNeighbours}{exch2_nNeighbours} -contains a count the neighboring tiles each tile has, and is -used for setting bounds for looping over neighboring tiles. +contains a count of the neighboring tiles each tile has, and is used +for setting bounds for looping over neighboring tiles. \varlink{exch2\_tProc}{exch2_tProc} holds the process rank of each tile, and is used in interprocess communication. \\ @@ -346,18 +345,17 @@ \subsubsection{Arrays Indexed to Tile Number and Neighbor} -The following arrays are all of size -\code{MAX\_NEIGHBOURS}$\times$\code{NTILES} and describe the -orientations between the the tiles. \\ +The following arrays have vectors of length \code{MAX\_NEIGHBOURS} and +\code{NTILES} and describe the orientations between the tiles. \\ The array \code{exch2\_neighbourId(a,T)} holds the tile number \code{Tn} for each of the tile number \code{T}'s neighboring tiles \code{a}. The neighbor tiles are indexed -\code{(1:exch2\_NNeighbours(T))} in the order right to left on the -north then south edges, and then top to bottom on the east and west -edges. Maybe throw in a fig here, eh? \\ +\code{(1:exch2\_nNeighbours(T))} in the order right to left on the +north then south edges, and then top to bottom on the east then west +edges. \\ -\sloppy The \code{exch2\_opposingSend\_record(a,T)} array holds the + The \code{exch2\_opposingSend\_record(a,T)} array holds the index \code{b} of the element in \texttt{exch2\_neighbourId(b,Tn)} that holds the tile number \code{T}, given \code{Tn=exch2\_neighborId(a,T)}. In other words, @@ -373,7 +371,8 @@ necessary in exchanges between subdomains because the array index in one dimension may map to the other index in an adjacent subdomain, and may be have its indexing reversed. This swapping arises from the -``folding'' of two-dimensional arrays into a three-dimensional cube. +``folding'' of two-dimensional arrays into a three-dimensional +cube. 
\\ The dimensions of \code{exch2\_pi(t,N,T)} and \code{exch2\_pj(t,N,T)} are the neighbor ID \code{N} and the tile number \code{T} as explained @@ -402,8 +401,8 @@ \varlink{exch2\_oj}{exch2_oj}, \varlink{exch2\_oi\_f}{exch2_oi_f}, and \varlink{exch2\_oj\_f}{exch2_oj_f} are indexed to tile number and neighbor and specify the relative offset within the subdomain of the -array index of a variable going from a neighboring tile $N$ to a local -tile $T$. Consider \code{T=1} in the six-tile topology +array index of a variable going from a neighboring tile \code{N} to a +local tile \code{T}. Consider \code{T=1} in the six-tile topology (Fig. \ref{fig:6tile}), where \begin{verbatim} @@ -420,22 +419,23 @@ the western neighbor (\code{Tn=5}), \code{code\_oi(3,1)=32} since the \code{x=0} vector on \code{T} corresponds to the \code{y=32} vector on \code{Tn}. The eastern edge of \code{T} shows the reverse case -(\code{exch2\_oi(4,1)=-32)}, where \code{x=32} on \code{T} exchanges -with \code{x=0} on \code{Tn=2}. The most interesting case, where -\code{exch2\_oi(1,1)=33} and \code{Tn=3}, involves a reversal of -indices. As in every case, the offset \code{exch2\_oi} is added to -the original $x$ index of \code{T} multiplied by the transformation -factor \code{exch2\_pi(t,N,T)}. Here \code{exch2\_pi(1,1,1)=0} since -the $x$ axis of \code{T} is orthogonal to the $x$ axis of \code{Tn}. -\code{exch2\_pi(2,1,1)=-1} since the $x$ axis of \code{T} corresponds -to the $y$ axis of \code{Tn}, but the axes are reversed. The result -is that the index of the northern edge of \code{T}, which runs -\code{(1:32)}, is transformed to +(\code{exch2\_oi(4,1)=-32}), where \code{x=32} on \code{T} exchanges +with \code{x=0} on \code{Tn=2}. \\ + + The most interesting case, where \code{exch2\_oi(1,1)=33} and +\code{Tn=3}, involves a reversal of indices. 
As in every case, the +offset \code{exch2\_oi} is added to the original $x$ index of \code{T} +multiplied by the transformation factor \code{exch2\_pi(t,N,T)}. Here +\code{exch2\_pi(1,1,1)=0} since the $x$ axis of \code{T} is orthogonal +to the $x$ axis of \code{Tn}. \code{exch2\_pi(2,1,1)=-1} since the +$x$ axis of \code{T} corresponds to the $y$ axis of \code{Tn}, but the +index is reversed. The result is that the index of the northern edge +of \code{T}, which runs \code{(1:32)}, is transformed to \code{(-1:-32)}. \code{exch2\_oi(1,1)} is then added to this range to -get back \code{(1:32)} -- the index of the $y$ axis of \code{Tn}. -This transformation may seem overly convoluted for the six-tile case, -but it is necessary to provide a general solution for various -topologies. \\ +get back \code{(32:1)} -- the index of the $y$ axis of \code{Tn} +relative to \code{T}. This transformation may seem overly convoluted +for the six-tile case, but it is necessary to provide a general +solution for various topologies. \\ @@ -455,16 +455,16 @@ exch2_jthi_c(4,2)=33 \end{verbatim} -Here \code{N=4}, indicating the western neighbor, which is \code{Tn=1}. -\code{Tn=1} resides on the same subdomain as \code{T=2}, so the tiles -have the same orientation and the same $x$ and $y$ axes. The $i$ -component is orthogonal to the western edge and the tile is 16 points -wide, so \code{exch2\_itlo\_c} and \code{exch2\_ithi\_c} indicate the -column beyond \code{Tn=1}'s eastern edge, in that tile's halo -region. Since the border of the tiles extends through the entire +Here \code{N=4}, indicating the western neighbor, which is +\code{Tn=1}. \code{Tn} resides on the same subdomain as \code{T}, so +the tiles have the same orientation and the same $x$ and $y$ axes. +The $x$ axis is orthogonal to the western edge and the tile is 16 +points wide, so \code{exch2\_itlo\_c} and \code{exch2\_ithi\_c} +indicate the column beyond \code{Tn}'s eastern edge, in that tile's +halo region. 
Since the border of the tiles extends through the entire height of the subdomain, the $y$ axis bounds \code{exch2\_jtlo\_c} to -\code{exch2\_jthi\_c} cover the height, plus 1 in either direction to -cover part of the halo. \\ +\code{exch2\_jthi\_c} cover the height of \code{(1:32)}, plus 1 in +either direction to cover part of the halo. \\ For the north edge of the same tile \code{T=2} where \code{N=1} and the neighbor tile is \code{Tn=5}: @@ -477,27 +477,14 @@ \end{verbatim} \code{T}'s northern edge is parallel to the $x$ axis, but since -\code{Tn}'s $y$ axis corresponds to \code{T}'s $x$ axis, -\code{T}'s northern edge exchanges with \code{Tn}'s western edge. -The western edge of the tiles corresponds to the lower bound of the -$x$ axis, so \code{exch2\_itlo\_c} \code{exch2\_ithi\_c} are \code{0}. The -range of \code{exch2\_jtlo\_c} and \code{exch2\_jthi\_c} correspond to the +\code{Tn}'s $y$ axis corresponds to \code{T}'s $x$ axis, \code{T}'s +northern edge exchanges with \code{Tn}'s western edge. The western +edge of the tiles corresponds to the lower bound of the $x$ axis, so +\code{exch2\_itlo\_c} and \code{exch2\_ithi\_c} are \code{0}. The range of +\code{exch2\_jtlo\_c} and \code{exch2\_jthi\_c} corresponds to the width of \code{T}'s northern edge, plus the halo. \\ - - - - - - - - - -This needs some diagrams. \\ - - - \subsection{Key Routines} Most of the subroutines particular to exch2 handle the exchanges @@ -508,22 +495,23 @@ forms. \\ The interfaces with the core model subroutines are -\code{EXCH\_UV\_XY\_RX}, \code{EXCH\_UV\_XYZ\_RX} and \code{EXCH\_XY\_RX}. -They override the standard exchange routines when \code{genmake2} is -run with \code{exch2} option. They in turn call the local exch2 -subroutines \code{EXCH2\_UV\_XY\_RX} and \code{EXCH2\_UV\_XYZ\_RX} for two -and three dimensional vector quantities, and \code{EXCH2\_XY\_RX} and -\code{EXCH2\_XYZ\_RX} for two and three dimensional scalar quantities. 
-These subroutines set the dimensions of the area to be exchanged, call -\code{EXCH2\_RX1\_CUBE} for scalars and \code{EXCH2\_RX2\_CUBE} for -vectors, and then handle the singularities at the cube corners. \\ +\code{EXCH\_UV\_XY\_RX}, \code{EXCH\_UV\_XYZ\_RX} and +\code{EXCH\_XY\_RX}. They override the standard exchange routines +when \code{genmake2} is run with the \code{exch2} option. They in turn +call the local exch2 subroutines \code{EXCH2\_UV\_XY\_RX} and +\code{EXCH2\_UV\_XYZ\_RX} for two- and three-dimensional vector +quantities, and \code{EXCH2\_XY\_RX} and \code{EXCH2\_XYZ\_RX} for two- +and three-dimensional scalar quantities. These subroutines set the +dimensions of the area to be exchanged, call \code{EXCH2\_RX1\_CUBE} +for scalars and \code{EXCH2\_RX2\_CUBE} for vectors, and then handle +the singularities at the cube corners. \\ The separate scalar and vector forms of \code{EXCH2\_RX1\_CUBE} and -\code{EXCH2\_RX2\_CUBE} reflect that the vector-handling subrouine needs -to pass both the $x$ and $y$ components of the vectors. This arises -from the topological folding discussed above, where the $x$ and $y$ -axes get swapped in some cases. This swapping is not an issue with -the scalar version. These subroutines call \code{EXCH2\_SEND\_RX1} and -\code{EXCH2\_SEND\_RX2}, which do most of the work using the variables -discussed above. \\ +\code{EXCH2\_RX2\_CUBE} reflect that the vector-handling subroutine +needs to pass both the $u$ and $v$ components of the physical vectors. +This arises from the topological folding discussed above, where the +$x$ and $y$ axes get swapped in some cases. This swapping is not an +issue with the scalar version. These subroutines call +\code{EXCH2\_SEND\_RX1} and \code{EXCH2\_SEND\_RX2}, which do most of +the work using the variables discussed above. \\