--- manual/s_phys_pkgs/text/exch2.tex 2004/03/17 19:49:22 1.13 +++ manual/s_phys_pkgs/text/exch2.tex 2004/03/19 21:25:45 1.17 @@ -1,4 +1,4 @@ -% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/text/exch2.tex,v 1.13 2004/03/17 19:49:22 afe Exp $ +% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/text/exch2.tex,v 1.17 2004/03/19 21:25:45 afe Exp $ % $Name: $ %% * Introduction @@ -16,16 +16,14 @@ \subsection{Introduction} -The \texttt{exch2} package extends the original cubed -sphere topology configuration to allow more flexible domain -decomposition and parallelization. Cube faces (also called -subdomains) may be divided into any number of tiles that divide evenly -into the grid point dimensions of the subdomain. Furthermore, the -individual tiles can run on separate processors in different -combinations, and whether exchanges between particular tiles occur -between different processors is determined at runtime. This -flexibility provides for manual compile-time load balancing across a -relatively arbitrary number of processors. \\ +The \texttt{exch2} package extends the original cubed sphere topology +configuration to allow more flexible domain decomposition and +parallelization. Cube faces (also called subdomains) may be divided +into any number of tiles that divide evenly into the grid point +dimensions of the subdomain. Furthermore, the tiles can run on +separate processors individually or in groups, which provides for +manual compile-time load balancing across a relatively arbitrary +number of processors. \\ The exchange parameters are declared in \filelink{pkg/exch2/W2\_EXCH2\_TOPOLOGY.h}{pkg-exch2-W2_EXCH2_TOPOLOGY.h} @@ -55,13 +53,13 @@ details. \\ $\bullet$ An example of \file{W2\_EXCH2\_TOPOLOGY.h} and - \file{w2\_e2setup.F} must reside in a directory containing code - linked when \file{genmake2} runs. 
The safest place to put these - is the directory indicated in the \code{-mods=DIR} command line - modifier (typically \file{../code}), or the build directory. The - default versions of these files reside in \file{pkg/exch2} and are - linked automatically if no other versions exist elsewhere in the - link path, but they should be left untouched to avoid breaking + \file{w2\_e2setup.F} must reside in a directory containing files + symbolically linked when \file{genmake2} runs. The safest place to + put these is the directory indicated in the \code{-mods=DIR} command + line modifier (typically \file{../code}), or the build directory. + The default versions of these files reside in \file{pkg/exch2} and + are linked automatically if no other versions exist elsewhere in the + build path, but they should be left untouched to avoid breaking configurations other than the one you intend to modify.\\ $\bullet$ Files containing grid parameters, named @@ -84,8 +82,8 @@ Section \ref{sect:specifying_a_decomposition} \sectiontitle{Specifying a decomposition}.\\ -As of the time of writing the following examples use exch2 and may be -used for guidance: +At the time of this writing the following examples use exch2 and may +be used for guidance: \begin{verbatim} verification/adjust_nlfs.cs-32x32x1 @@ -129,7 +127,7 @@ the tiles into which the subdomains are decomposed, and must evenly divide the integer assigned to \code{nr}, \code{nb} and \code{ng}. The result is a rectangular tiling of the subdomain. Figure -\ref{fig:24tile} shows one possible topology for a twentyfour-tile +\ref{fig:24tile} shows one possible topology for a twenty-four-tile cube, and figure \ref{fig:12tile} shows one for twelve tiles. \\ \begin{figure} @@ -140,9 +138,9 @@ \end{center} \caption{Plot of a cubed sphere topology with a 32$\times$192 domain -divided into six 32$\times$32 subdomains, each of which is divided into four tiles -(\code{tnx=16, tny=16}) for a total of twentyfour tiles. 
-} \label{fig:24tile} +divided into six 32$\times$32 subdomains, each of which is divided +into four tiles (\code{tnx=16, tny=16}) for a total of twenty-four +tiles. } \label{fig:24tile} \end{figure} \begin{figure} @@ -206,7 +204,7 @@ The parameters \varlink{nSx}{nSx}, \varlink{nSy}{nSy}, \varlink{nPx}{nPx}, and \varlink{nPy}{nPy} relate to the number of tiles and how they are distributed on processors. When using exch2, -the tiles are stored in single dimension, and so +the tiles are stored in a single dimension, and so \code{\varlink{nSy}{nSy}=1} in all cases. Since the tiles as configured by exch2 cannot be split up accross processors without regenerating the topology, \code{\varlink{nPy}{nPy}=1} as well. \\ @@ -237,8 +235,8 @@ & Nr = 5) \end{verbatim} -The following is an example for the twentyfour-tile topology in figure -\ref{fig:24tile} running on six processors: +The following is an example for the twenty-four-tile topology in +figure \ref{fig:24tile} running on six processors: \begin{verbatim} PARAMETER ( @@ -262,9 +260,9 @@ \subsection{Key Variables} The descriptions of the variables are divided up into scalars, -one-dimensional arrays indexed to the tile number, and two and three -dimensional arrays indexed to tile number and neighboring tile. This -division reflects the functionality of these variables: The +one-dimensional arrays indexed to the tile number, and two and +three-dimensional arrays indexed to tile number and neighboring tile. +This division reflects the functionality of these variables: The scalars are common to every part of the topology, the tile-indexed arrays to individual tiles, and the arrays indexed by tile and neighbor to relationships between tiles and their neighbors. \\ @@ -281,20 +279,22 @@ The scalar parameters \varlink{exch2\_domain\_nxt}{exch2_domain_nxt} and \varlink{exch2\_domain\_nyt}{exch2_domain_nyt} express the number of tiles in the $x$ and $y$ global indices. For example, the default -setup of six tiles (Fig. 
\ref{fig:6tile}) has \code{exch2\_domain\_nxt=6} and -\code{exch2\_domain\_nyt=1}. A topology of twenty-four square tiles, -four per subdomain (as in figure \ref{fig:24tile}), will have -\code{exch2\_domain\_nxt=12} and \code{exch2\_domain\_nyt=2}. Note -that these parameters express the tile layout to allow global data -files that are tile-layout-neutral and have no bearing on the internal -storage of the arrays. The tiles are internally stored in a range -from \code{(1:\varlink{bi}{bi})} the $x$ axis, and $y$ axis variable -\varlink{bj}{bj} is generally ignored within the package. \\ +setup of six tiles (Fig. \ref{fig:6tile}) has +\code{exch2\_domain\_nxt=6} and \code{exch2\_domain\_nyt=1}. A +topology of twenty-four square tiles, four per subdomain (as in figure +\ref{fig:24tile}), will have \code{exch2\_domain\_nxt=12} and +\code{exch2\_domain\_nyt=2}. Note that these parameters express the +tile layout to allow global data files that are tile-layout-neutral +and have no bearing on the internal storage of the arrays. The tiles +are stored internally in the range \code{(1:\varlink{bi}{bi})} along the +$x$ axis, and the $y$ axis variable \varlink{bj}{bj} generally is +ignored within the package. \\ \subsubsection{Arrays Indexed to Tile Number} -The following arrays are of length \code{NTILES}, are indexed to the -tile number, and the indices are omitted in their descriptions. \\ +The following arrays are of length \code{NTILES} and are indexed to +the tile number, which is indicated in the diagrams with the notation +\textsf{t}$n$. The indices are omitted in the descriptions. \\ The arrays \varlink{exch2\_tnx}{exch2_tnx} and \varlink{exch2\_tny}{exch2_tny} express the $x$ and $y$ dimensions of @@ -310,22 +310,23 @@ relate the location of the edges of different tiles to each other. As an example, in the default six-tile topology (Fig. \ref{fig:6tile}) each index in these arrays is set to \code{0} since a tile occupies -its entire subdomain. 
The twentyfour-tile case discussed above will +its entire subdomain. The twenty-four-tile case discussed above will have values of \code{0} or \code{16}, depending on the quadrant the tile falls within the subdomain. The elements of the arrays \varlink{exch2\_txglobalo}{exch2_txglobalo} and \varlink{exch2\_txglobalo}{exch2_txglobalo} are similar to \varlink{exch2\_tbasex}{exch2_tbasex} and \varlink{exch2\_tbasey}{exch2_tbasey}, but locate the tiles within the -global address space, similar to that used by global files. \\ +global address space, similar to that used by global output and input +files. \\ The array \varlink{exch2\_myFace}{exch2_myFace} contains the number of the subdomain of each tile, in a range \code{(1:6)} in the case of the standard cube topology and indicated by \textbf{\textsf{f}}$n$ in figures \ref{fig:12tile} and \ref{fig:24tile}. \varlink{exch2\_nNeighbours}{exch2_nNeighbours} -contains a count of how many neighboring tiles each tile has, and is -used for setting bounds for looping over neighboring tiles. +contains a count of the neighboring tiles each tile has, and is used +for setting bounds for looping over neighboring tiles. \varlink{exch2\_tProc}{exch2_tProc} holds the process rank of each tile, and is used in interprocess communication. \\ @@ -334,31 +335,30 @@ \varlink{exch2\_isEedge}{exch2_isEedge}, \varlink{exch2\_isSedge}{exch2_isSedge}, and \varlink{exch2\_isNedge}{exch2_isNedge} are set to \code{1} if the -indexed tile lies on the edge of a subdomain, \code{0} if not. The -values are used within the topology generator to determine the -orientation of neighboring tiles, and to indicate whether a tile lies -on the corner of a subdomain. The latter case requires special +indexed tile lies on the respective edge of a subdomain, \code{0} if +not. The values are used within the topology generator to determine +the orientation of neighboring tiles, and to indicate whether a tile +lies on the corner of a subdomain. 
The latter case requires special exchange and numerical handling for the singularities at the eight corners of the cube. \\ \subsubsection{Arrays Indexed to Tile Number and Neighbor} -The following arrays are all of size -\code{MAX\_NEIGHBOURS}$\times$\code{NTILES} and describe the -orientations between the the tiles. \\ +The following arrays have vectors of length \code{MAX\_NEIGHBOURS} and +\code{NTILES} and describe the orientations between the tiles. \\ The array \code{exch2\_neighbourId(a,T)} holds the tile number \code{Tn} for each of the tile number \code{T}'s neighboring tiles -\code{a}. The neighbor tiles are indexed \code{(1:MAX\_NEIGHBOURS)} -in the order right to left on the north then south edges, and then top -to bottom on the east and west edges. Maybe throw in a fig here, eh? -\\ - -\sloppy -The \code{exch2\_opposingSend\_record(a,T)} array holds the index -\code{b} in \texttt{exch2\_neighbourId(b,Tn)} that holds the tile -number \code{T}. In other words, +\code{a}. The neighbor tiles are indexed +\code{(1:exch2\_nNeighbours(T))} in the order right to left on the +north then south edges, and then top to bottom on the east then west +edges. \\ + + The \code{exch2\_opposingSend\_record(a,T)} array holds the +index \code{b} of the element in \texttt{exch2\_neighbourId(b,Tn)} +that holds the tile number \code{T}, given +\code{Tn=exch2\_neighbourId(a,T)}. In other words, \begin{verbatim} exch2_neighbourId( exch2_opposingSend_record(a,T), exch2_neighbourId(a,T) ) = T @@ -366,77 +366,152 @@ This provides a back-reference from the neighbor tiles. \\ The arrays \varlink{exch2\_pi}{exch2_pi} and -\varlink{exch2\_pj}{exch2_pj} specify the transformations of variables +\varlink{exch2\_pj}{exch2_pj} specify the transformations of indices in exchanges between the neighboring tiles. 
These transformations are -necessary in exchanges between subdomains because a physical vector -component in one direction may map to one in a different direction in -an adjacent subdomain, and may be have its indexing reversed. This -swapping arises from the ``folding'' of two-dimensional arrays into a -three-dimensional cube. +necessary in exchanges between subdomains because the array index in +one dimension may map to the other index in an adjacent subdomain, and +may have its indexing reversed. This swapping arises from the +``folding'' of two-dimensional arrays into a three-dimensional +cube. \\ The dimensions of \code{exch2\_pi(t,N,T)} and \code{exch2\_pj(t,N,T)} are the neighbor ID \code{N} and the tile number \code{T} as explained -above, plus a vector of length 2 containing transformation factors -\code{t}. The first element of the transformation vector indicates -the factor \code{t} by which variables representing the same -\emph{physical} vector component of a tile \code{T} will be multiplied -in exchanges with neighbor \code{N}, and the second element indicates -the transform to the physical vector in the other direction. To -clarify (hopefully), \code{exch2\_pi(1,N,T)} holds the transform of -the $i$ component of a vector variable in tile \code{T} to the $i$ -component of tile \code{T}'s neighbor \code{N}, and -\code{exch2\_pi(2,N,T)} holds the transform of \code{T}'s $i$ -components to the neighbor \code{N}'s $j$ component. \\ +above, plus a vector of length \code{2} containing transformation +factors \code{t}. The first element of the transformation vector +holds the factor to multiply the index in the same axis, and the +second element holds the same for the orthogonal index. To +clarify, \code{exch2\_pi(1,N,T)} holds the mapping of the $x$ axis +index of tile \code{T} to the $x$ axis of tile \code{T}'s neighbor +\code{N}, and \code{exch2\_pi(2,N,T)} holds the mapping of \code{T}'s +$x$ index to the neighbor \code{N}'s $y$ index. 
\\ -Under the current cube topology, one of the two elements of -\code{exch2\_pi} or \code{exch2\_pj} for a given tile \code{T} and -neighbor \code{N} will be \code{0}, reflecting the fact that the two -vector components are orthogonal. The other element will be \code{1} -or \code{-1}, depending on whether the components are indexed in the -same or opposite directions. For example, the transform vector of the -arrays for all tile neighbors on the same subdomain will be +One of the two elements of \code{exch2\_pi} or \code{exch2\_pj} for a +given tile \code{T} and neighbor \code{N} will be \code{0}, reflecting +the fact that the two axes are orthogonal. The other element will be +\code{1} or \code{-1}, depending on whether the axes are indexed in +the same or opposite directions. For example, the transform vector of +the arrays for all tile neighbors on the same subdomain will be \code{(1,0)}, since all tiles on the same subdomain are oriented -identically. A vector direction that corresponds to the orthogonal -dimension with the same index direction in a particular tile-neighbor -orientation will have \code{(0,1)}, whereas those in the opposite -index direction will have \code{(0,-1)}. \\ +identically. An axis that corresponds to the orthogonal dimension +with the same index direction in a particular tile-neighbor +orientation will have \code{(0,1)}. Those in the opposite index +direction will have \code{(0,-1)} in order to reverse the ordering. \\ - -\varlink{exch2\_oi}{exch2_oi}, +The arrays \varlink{exch2\_oi}{exch2_oi}, \varlink{exch2\_oj}{exch2_oj}, \varlink{exch2\_oi\_f}{exch2_oi_f}, and -\varlink{exch2\_oj\_f}{exch2_oj_f} +\varlink{exch2\_oj\_f}{exch2_oj_f} are indexed to tile number and +neighbor and specify the relative offset within the subdomain of the +array index of a variable going from a neighboring tile \code{N} to a +local tile \code{T}. Consider \code{T=1} in the six-tile topology +(Fig. 
\ref{fig:6tile}), where +\begin{verbatim} + exch2_oi(1,1)=33 + exch2_oi(2,1)=0 + exch2_oi(3,1)=32 + exch2_oi(4,1)=-32 +\end{verbatim} +The simplest case is \code{exch2\_oi(2,1)}, the southern neighbor, +which is \code{Tn=6}. The axes of \code{T} and \code{Tn} have the +same orientation and their $x$ axes have the same origin, and so an +exchange between the two requires no changes to the $x$ index. For +the western neighbor (\code{Tn=5}), \code{exch2\_oi(3,1)=32} since the +\code{x=0} vector on \code{T} corresponds to the \code{y=32} vector on +\code{Tn}. The eastern edge of \code{T} shows the reverse case +(\code{exch2\_oi(4,1)=-32}), where \code{x=32} on \code{T} exchanges +with \code{x=0} on \code{Tn=2}. \\ + + The most interesting case, where \code{exch2\_oi(1,1)=33} and +\code{Tn=3}, involves a reversal of indices. As in every case, the +offset \code{exch2\_oi} is added to the original $x$ index of \code{T} +multiplied by the transformation factor \code{exch2\_pi(t,N,T)}. Here +\code{exch2\_pi(1,1,1)=0} since the $x$ axis of \code{T} is orthogonal +to the $x$ axis of \code{Tn}. \code{exch2\_pi(2,1,1)=-1} since the +$x$ axis of \code{T} corresponds to the $y$ axis of \code{Tn}, but the +index is reversed. The result is that the index of the northern edge +of \code{T}, which runs \code{(1:32)}, is transformed to +\code{(-1:-32)}. \code{exch2\_oi(1,1)} is then added to this range to +get back \code{(32:1)} -- the index of the $y$ axis of \code{Tn} +relative to \code{T}. This transformation may seem overly convoluted +for the six-tile case, but it is necessary to provide a general +solution for various topologies. \\ -This needs some diagrams. \\ +Finally, \varlink{exch2\_itlo\_c}{exch2_itlo_c}, +\varlink{exch2\_ithi\_c}{exch2_ithi_c}, +\varlink{exch2\_jtlo\_c}{exch2_jtlo_c} and +\varlink{exch2\_jthi\_c}{exch2_jthi_c} hold the location and index +bounds of the edge segment of the neighbor tile \code{N}'s subdomain +that gets exchanged with the local tile \code{T}. 
To take the example +of tile \code{T=2} in the twelve-tile topology +(Fig. \ref{fig:12tile}): \\ -{\footnotesize \begin{verbatim} -C exch2_pi :: X index row of target to source permutation -C :: matrix for each neighbour entry. -C exch2_pj :: Y index row of target to source permutation -C :: matrix for each neighbour entry. -C exch2_oi :: X index element of target to source -C :: offset vector for cell-centered quantities -C :: of each neighbor entry. -C exch2_oj :: Y index element of target to source -C :: offset vector for cell-centered quantities -C :: of each neighbor entry. -C exch2_oi_f :: X index element of target to source -C :: offset vector for face quantities -C :: of each neighbor entry. -C exch2_oj_f :: Y index element of target to source -C :: offset vector for face quantities -C :: of each neighbor entry. + exch2_itlo_c(4,2)=17 + exch2_ithi_c(4,2)=17 + exch2_jtlo_c(4,2)=0 + exch2_jthi_c(4,2)=33 \end{verbatim} -} + +Here \code{N=4}, indicating the western neighbor, which is +\code{Tn=1}. \code{Tn} resides on the same subdomain as \code{T}, so +the tiles have the same orientation and the same $x$ and $y$ axes. +The $x$ axis is orthogonal to the western edge and the tile is 16 +points wide, so \code{exch2\_itlo\_c} and \code{exch2\_ithi\_c} +indicate the column beyond \code{Tn}'s eastern edge, in that tile's +halo region. Since the border of the tiles extends through the entire +height of the subdomain, the $y$ axis bounds \code{exch2\_jtlo\_c} to +\code{exch2\_jthi\_c} cover the height of \code{(1:32)}, plus 1 in +either direction to cover part of the halo. 
\\ +For the north edge of the same tile \code{T=2} where \code{N=1} and +the neighbor tile is \code{Tn=5}: +\begin{verbatim} + exch2_itlo_c(1,2)=0 + exch2_ithi_c(1,2)=0 + exch2_jtlo_c(1,2)=0 + exch2_jthi_c(1,2)=17 +\end{verbatim} + +\code{T}'s northern edge is parallel to the $x$ axis, but since +\code{Tn}'s $y$ axis corresponds to \code{T}'s $x$ axis, \code{T}'s +northern edge exchanges with \code{Tn}'s western edge. The western +edge of the tiles corresponds to the lower bound of the $x$ axis, so +\code{exch2\_itlo\_c} and \code{exch2\_ithi\_c} are \code{0}. The range of +\code{exch2\_jtlo\_c} and \code{exch2\_jthi\_c} corresponds to the +width of \code{T}'s northern edge, plus the halo. \\ -\subsection{Key Routines} +\subsection{Key Routines} +Most of the subroutines particular to exch2 handle the exchanges +themselves and are of the same format as those described in +\ref{sect:cube_sphere_communication} \sectiontitle{Cube sphere +communication}. Like the original routines, they are written as +templates which the local Makefile converts from RX into RL and RS +forms. \\ + +The interfaces with the core model subroutines are +\code{EXCH\_UV\_XY\_RX}, \code{EXCH\_UV\_XYZ\_RX} and +\code{EXCH\_XY\_RX}. They override the standard exchange routines +when \code{genmake2} is run with the \code{exch2} option. They in turn +call the local exch2 subroutines \code{EXCH2\_UV\_XY\_RX} and +\code{EXCH2\_UV\_XYZ\_RX} for two and three-dimensional vector +quantities, and \code{EXCH2\_XY\_RX} and \code{EXCH2\_XYZ\_RX} for two +and three-dimensional scalar quantities. These subroutines set the +dimensions of the area to be exchanged, call \code{EXCH2\_RX1\_CUBE} +for scalars and \code{EXCH2\_RX2\_CUBE} for vectors, and then handle +the singularities at the cube corners. \\ + +The separate scalar and vector forms of \code{EXCH2\_RX1\_CUBE} and +\code{EXCH2\_RX2\_CUBE} reflect that the vector-handling subroutine +needs to pass both the $u$ and $v$ components of the physical vectors. 
+This arises from the topological folding discussed above, where the +$x$ and $y$ axes get swapped in some cases. This swapping is not an +issue with the scalar version. These subroutines call +\code{EXCH2\_SEND\_RX1} and \code{EXCH2\_SEND\_RX2}, which do most of +the work using the variables discussed above. \\ -\subsection{References}