--- manual/s_phys_pkgs/text/exch2.tex 2004/03/16 21:52:15 1.12 +++ manual/s_phys_pkgs/text/exch2.tex 2004/03/18 22:20:38 1.16 @@ -1,4 +1,4 @@ -% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/text/exch2.tex,v 1.12 2004/03/16 21:52:15 afe Exp $ +% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/text/exch2.tex,v 1.16 2004/03/18 22:20:38 afe Exp $ % $Name: $ %% * Introduction @@ -21,7 +21,7 @@ decomposition and parallelization. Cube faces (also called subdomains) may be divided into any number of tiles that divide evenly into the grid point dimensions of the subdomain. Furthermore, the -individual tiles may be run on separate processors in different +individual tiles can run on separate processors in different combinations, and whether exchanges between particular tiles occur between different processors is determined at runtime. This flexibility provides for manual compile-time load balancing across a @@ -65,15 +65,15 @@ configurations other than the one you intend to modify.\\ $\bullet$ Files containing grid parameters, named - \file{tile00$n$.mitgrid} where $n$=[1,6] (one per subdomain), must - be in the working directory when the MITgcm executable is run. + \file{tile00$n$.mitgrid} where $n$=\code{(1:6)} (one per subdomain), + must be in the working directory when the MITgcm executable is run. These files are provided in the example experiments for cubed sphere configurations with 32$\times$32 cube sides and are non-trivial to generate -- please contact MITgcm support if you want to generate files for other configurations. \\ $\bullet$ As always when compiling MITgcm, the file \file{SIZE.h} must - be placed where \file{genmake2} will find it. In particular for the + be placed where \file{genmake2} will find it. In particular for exch2, the domain decomposition specified in \file{SIZE.h} must correspond with the particular configuration's topology specified in \file{W2\_EXCH2\_TOPOLOGY.h} and \file{w2\_e2setup.F}. 
Domain @@ -119,7 +119,7 @@ The first three determine the size of the subdomains and hence the size of the overall domain. Each one determines the number of grid points, and therefore the resolution, along the subdomain -sides in a ``great circle'' around each axis of the cube. At the time +sides in a ``great circle'' around an axis of the cube. At the time of this writing MITgcm requires these three parameters to be equal, but they provide for future releases to accommodate different resolutions around the axes to allow (for example) greater resolution @@ -129,7 +129,7 @@ the tiles into which the subdomains are decomposed, and must evenly divide the integer assigned to \code{nr}, \code{nb} and \code{ng}. The result is a rectangular tiling of the subdomain. Figure -\ref{fig:24tile} shows one possible topology for a twenty-four tile +\ref{fig:24tile} shows one possible topology for a twentyfour-tile cube, and figure \ref{fig:12tile} shows one for twelve tiles. \\ \begin{figure} @@ -139,9 +139,9 @@ } \end{center} -\caption{Plot of cubed sphere topology with a 32$\times$192 domain +\caption{Plot of a cubed sphere topology with a 32$\times$192 domain divided into six 32$\times$32 subdomains, each of which is divided into four tiles -(\code{tnx=16, tny=16}) for a total of twenty-four tiles. +(\code{tnx=16, tny=16}) for a total of twentyfour tiles. } \label{fig:24tile} \end{figure} @@ -151,12 +151,26 @@ \includegraphics{part6/s12t_16x32.ps} } \end{center} -\caption{Plot of cubed sphere topology with a 32$\times$192 domain +\caption{Plot of a cubed sphere topology with a 32$\times$192 domain divided into six 32$\times$32 subdomains of two tiles each (\code{tnx=16, tny=32}). } \label{fig:12tile} \end{figure} +\begin{figure} +\begin{center} + \resizebox{4in}{!}{ + \includegraphics{part6/s6t_32x32.ps} + } +\end{center} +\caption{Plot of a cubed sphere topology with a 32$\times$192 domain +divided into six 32$\times$32 subdomains with one tile each +(\code{tnx=32, tny=32}). 
This is the default configuration. + } +\label{fig:6tile} +\end{figure} + + Tiles can be selected from the topology to be omitted from being allocated memory and processors. This tuning is useful in ocean modeling for omitting tiles that fall entirely on land. The tiles @@ -171,12 +185,13 @@ \label{sec:exch2mpi} Once the topology configuration files are created, the Fortran -parameters in \file{SIZE.h} must be configured to match. Section -\ref{sect:specifying_a_decomposition} \sectiontitle{Specifying a -decomposition} provides a general description of domain decomposition -within MITgcm and its relation to \file{SIZE.h}. The current section -specifies certain constraints the exch2 package imposes as well as -describes how to enable parallel execution with MPI. \\ +\code{PARAMETER}s in \file{SIZE.h} must be configured to match. +Section \ref{sect:specifying_a_decomposition} \sectiontitle{Specifying +a decomposition} provides a general description of domain +decomposition within MITgcm and its relation to \file{SIZE.h}. The +current section specifies certain constraints the exch2 package +imposes as well as describes how to enable parallel execution with +MPI. \\ As in the general case, the parameters \varlink{sNx}{sNx} and \varlink{sNy}{sNy} define the size of the individual tiles, and so @@ -266,20 +281,22 @@ The scalar parameters \varlink{exch2\_domain\_nxt}{exch2_domain_nxt} and \varlink{exch2\_domain\_nyt}{exch2_domain_nyt} express the number of tiles in the $x$ and $y$ global indices. For example, the default -setup of six tiles has \code{exch2\_domain\_nxt=6} and -\code{exch2\_domain\_nyt=1}. A topology of twenty-four square tiles, -four per subdomain (as in figure \ref{fig:24tile}), will have -\code{exch2\_domain\_nxt=12} and \code{exch2\_domain\_nyt=2}. Note -that these parameters express the tile layout to allow global data -files that are tile-layout-neutral and have no bearing on the internal -storage of the arrays. 
The tiles are internally stored in a range -from [1,\varlink{bi}{bi}] the $x$ axis and $y$ axis variable -\varlink{bj}{bj} is generally ignored within the package. \\ +setup of six tiles (Fig. \ref{fig:6tile}) has +\code{exch2\_domain\_nxt=6} and \code{exch2\_domain\_nyt=1}. A +topology of twenty-four square tiles, four per subdomain (as in figure +\ref{fig:24tile}), will have \code{exch2\_domain\_nxt=12} and +\code{exch2\_domain\_nyt=2}. Note that these parameters express the +tile layout to allow global data files that are tile-layout-neutral +and have no bearing on the internal storage of the arrays. The tiles +are internally stored in a range from \code{(1:\varlink{bi}{bi})} in the +$x$ axis, and the $y$ axis variable \varlink{bj}{bj} is generally +ignored within the package. \\ \subsubsection{Arrays Indexed to Tile Number} -The following arrays are of size \code{NTILES}, are indexed to the -tile number, and the indices are omitted in their descriptions. \\ +The following arrays are of length \code{NTILES} and are indexed to the +tile number, which is indicated in the diagrams with the notation +\textsf{t}$n$. The indices are omitted in the descriptions. \\ The arrays \varlink{exch2\_tnx}{exch2_tnx} and \varlink{exch2\_tny}{exch2_tny} express the $x$ and $y$ dimensions of @@ -293,35 +310,39 @@ determined by the arrays \varlink{exch2\_tbasex}{exch2_tbasex} and \varlink{exch2\_tbasey}{exch2_tbasey}. These variables are used to relate the location of the edges of different tiles to each other. As -an example, in the default six-tile topology ?? each index in these -arrays are set to \code{0}. The twentyfour-tile case discussed above -will have values of \code{0} or \code{16}, depending on the quadrant -the tile falls within the subdomain. 
The array -\varlink{exch2\_myFace}{exch2_myFace} contains the number of the -subdomain of each tile, numbered \code{(1:6)} in the case of the -standard cube topology and indicated by \textbf{\textsf{f}}$n$ in -figures \ref{fig:12tile}) and \ref{fig:24tile}). \\ - -The elements of the arrays \varlink{exch2\_txglobalo}{exch2_txglobalo} -and \varlink{exch2\_txglobalo}{exch2_txglobalo} are similar to +an example, in the default six-tile topology (Fig. \ref{fig:6tile}) +each index in these arrays is set to \code{0} since a tile occupies +its entire subdomain. The twentyfour-tile case discussed above will +have values of \code{0} or \code{16}, depending on the quadrant the +tile falls within the subdomain. The elements of the arrays +\varlink{exch2\_txglobalo}{exch2_txglobalo} and +\varlink{exch2\_tyglobalo}{exch2_tyglobalo} are similar to \varlink{exch2\_tbasex}{exch2_tbasex} and \varlink{exch2\_tbasey}{exch2_tbasey}, but locate the tiles within the global address space, similar to that used by global files. \\ +The array \varlink{exch2\_myFace}{exch2_myFace} contains the number of +the subdomain of each tile, in a range \code{(1:6)} in the case of the +standard cube topology and indicated by \textbf{\textsf{f}}$n$ in +figures \ref{fig:12tile} and +\ref{fig:24tile}. \varlink{exch2\_nNeighbours}{exch2_nNeighbours} +contains a count of the neighboring tiles each tile has, and is +used for setting bounds for looping over neighboring tiles. +\varlink{exch2\_tProc}{exch2_tProc} holds the process rank of each +tile, and is used in interprocess communication. \\ + + The arrays \varlink{exch2\_isWedge}{exch2_isWedge}, \varlink{exch2\_isEedge}{exch2_isEedge}, \varlink{exch2\_isSedge}{exch2_isSedge}, and \varlink{exch2\_isNedge}{exch2_isNedge} are set to \code{1} if the -indexed tile lies on the edge of a subdomain, \code{0} if not. 
The -values are used within the topology generator to determine the -orientation of neighboring tiles, and to indicate whether a tile lies -on the corner of a subdomain. The latter case requires special +indexed tile lies on the respective edge of a subdomain, \code{0} if +not. The values are used within the topology generator to determine +the orientation of neighboring tiles, and to indicate whether a tile +lies on the corner of a subdomain. The latter case requires special exchange and numerical handling for the singularities at the eight -corners of the cube. \varlink{exch2\_nNeighbours}{exch2_nNeighbours} -contains a count of how many neighboring tiles each tile has, and is -used for setting bounds for looping over neighboring tiles. -\varlink{exch2\_tProc}{exch2_tProc} holds the process rank of each -tile, and is used in interprocess communication. \\ +corners of the cube. \\ + \subsubsection{Arrays Indexed to Tile Number and Neighbor} @@ -331,77 +352,178 @@ The array \code{exch2\_neighbourId(a,T)} holds the tile number \code{Tn} for each of the tile number \code{T}'s neighboring tiles -\code{a}. The neighbor tiles are indexed \code{(1:MAX\_NEIGHBOURS)} -in the order right to left on the north then south edges, and then top -to bottom on the east and west edges. Maybe throw in a fig here, eh? -\\ - -The \code{exch2\_opposingSend\_record(a,T)} array holds the index -\code{b} in \texttt{exch2\_neighbourId(b,Tn)} that holds the tile -number \code{T}. In other words, +\code{a}. The neighbor tiles are indexed +\code{(1:exch2\_nNeighbours(T))} in the order right to left on the +north then south edges, and then top to bottom on the east and west +edges. Maybe throw in a fig here, eh? \\ + +\sloppy The \code{exch2\_opposingSend\_record(a,T)} array holds the +index \code{b} of the element in \texttt{exch2\_neighbourId(b,Tn)} +that holds the tile number \code{T}, given +\code{Tn=exch2\_neighbourId(a,T)}. 
In other words, \begin{verbatim} exch2_neighbourId( exch2_opposingSend_record(a,T), exch2_neighbourId(a,T) ) = T \end{verbatim} This provides a back-reference from the neighbor tiles. \\ -The arrays \varlink{exch2\_pi}{exch2_pi}, -\varlink{exch2\_pj}{exch2_pj}, \varlink{exch2\_oi}{exch2_oi}, +The arrays \varlink{exch2\_pi}{exch2_pi} and +\varlink{exch2\_pj}{exch2_pj} specify the transformations of indices +in exchanges between the neighboring tiles. These transformations are +necessary in exchanges between subdomains because the array index in +one dimension may map to the other index in an adjacent subdomain, and +may have its indexing reversed. This swapping arises from the +``folding'' of two-dimensional arrays into a three-dimensional cube. + +The dimensions of \code{exch2\_pi(t,N,T)} and \code{exch2\_pj(t,N,T)} +are the neighbor ID \code{N} and the tile number \code{T} as explained +above, plus a vector of length \code{2} containing transformation +factors \code{t}. The first element of the transformation vector +holds the factor to multiply the index in the same axis, and the +second element holds the same for the orthogonal index. To +clarify, \code{exch2\_pi(1,N,T)} holds the mapping of the $x$ axis +index of tile \code{T} to the $x$ axis of tile \code{T}'s neighbor +\code{N}, and \code{exch2\_pi(2,N,T)} holds the mapping of \code{T}'s +$x$ index to the neighbor \code{N}'s $y$ index. \\ + +One of the two elements of \code{exch2\_pi} or \code{exch2\_pj} for a +given tile \code{T} and neighbor \code{N} will be \code{0}, reflecting +the fact that the two axes are orthogonal. The other element will be +\code{1} or \code{-1}, depending on whether the axes are indexed in +the same or opposite directions. For example, the transform vector of +the arrays for all tile neighbors on the same subdomain will be +\code{(1,0)}, since all tiles on the same subdomain are oriented +identically. 
An axis that corresponds to the orthogonal dimension +with the same index direction in a particular tile-neighbor +orientation will have \code{(0,1)}. Those in the opposite index +direction will have \code{(0,-1)} in order to reverse the ordering. \\ + +The arrays \varlink{exch2\_oi}{exch2_oi}, \varlink{exch2\_oj}{exch2_oj}, \varlink{exch2\_oi\_f}{exch2_oi_f}, and -\varlink{exch2\_oj\_f}{exch2_oj_f} specify the transformations in -exchanges between the neighboring tiles. The dimensions of -\code{exch2\_pi(t,N,T)} and \code{exch2\_pj(t,N,T)} are the neighbor -ID \code{N} and the tile number \code{T} as explained above, plus a -vector of length 2 containing transformation factors \code{t}. The -first element of the transformation vector indicates the factor -\code{t} by which variables representing the same vector component of -a tile \code{T} will be multiplied in exchanges with neighbor -\code{N}, and the second element indicates the transform to the -variable in the other direction. As an example, -\code{exch2\_pi(1,N,T)} holds the transform of the $i$ component of a -vector variable in tile \code{T} to the $i$ component of tile -\code{T}'s neighbor \code{N}, and \code{exch2\_pi(2,N,T)} hold the -component of neighbor \code{N}'s $j$ component. \\ +\varlink{exch2\_oj\_f}{exch2_oj_f} are indexed to tile number and +neighbor and specify the relative offset within the subdomain of the +array index of a variable going from a neighboring tile $N$ to a local +tile $T$. Consider \code{T=1} in the six-tile topology +(Fig. \ref{fig:6tile}), where + +\begin{verbatim} + exch2_oi(1,1)=33 + exch2_oi(2,1)=0 + exch2_oi(3,1)=32 + exch2_oi(4,1)=-32 +\end{verbatim} + +The simplest case is \code{exch2\_oi(2,1)}, the southern neighbor, +which is \code{Tn=6}. The axes of \code{T} and \code{Tn} have the +same orientation and their $x$ axes have the same origin, and so an +exchange between the two requires no changes to the $x$ index. 
For +the western neighbor (\code{Tn=5}), \code{exch2\_oi(3,1)=32} since the +\code{x=0} vector on \code{T} corresponds to the \code{y=32} vector on +\code{Tn}. The eastern edge of \code{T} shows the reverse case +(\code{exch2\_oi(4,1)=-32}), where \code{x=32} on \code{T} exchanges +with \code{x=0} on \code{Tn=2}. The most interesting case, where +\code{exch2\_oi(1,1)=33} and \code{Tn=3}, involves a reversal of +indices. As in every case, the offset \code{exch2\_oi} is added to +the original $x$ index of \code{T} multiplied by the transformation +factor \code{exch2\_pi(t,N,T)}. Here \code{exch2\_pi(1,1,1)=0} since +the $x$ axis of \code{T} is orthogonal to the $x$ axis of \code{Tn}. +\code{exch2\_pi(2,1,1)=-1} since the $x$ axis of \code{T} corresponds +to the $y$ axis of \code{Tn}, but the axes are reversed. The result +is that the index of the northern edge of \code{T}, which runs +\code{(1:32)}, is transformed to +\code{(-1:-32)}. \code{exch2\_oi(1,1)} is then added to this range to +get back the range \code{(1:32)}, now in reverse order -- the index of the $y$ axis of \code{Tn}. +This transformation may seem overly convoluted for the six-tile case, +but it is necessary to provide a general solution for various +topologies. \\ + + + +Finally, \varlink{exch2\_itlo\_c}{exch2_itlo_c}, +\varlink{exch2\_ithi\_c}{exch2_ithi_c}, +\varlink{exch2\_jtlo\_c}{exch2_jtlo_c} and +\varlink{exch2\_jthi\_c}{exch2_jthi_c} hold the location and index +bounds of the edge segment of the neighbor tile \code{N}'s subdomain +that gets exchanged with the local tile \code{T}. To take the example +of tile \code{T=2} in the twelve-tile topology +(Fig. 
\ref{fig:12tile}): \\ + +\begin{verbatim} + exch2_itlo_c(4,2)=17 + exch2_ithi_c(4,2)=17 + exch2_jtlo_c(4,2)=0 + exch2_jthi_c(4,2)=33 +\end{verbatim} -Under the current cube topology, one of the two elements of -\code{exch2\_pi} or \code{exch2\_pj} for a given tile \code{T} and -neighbor \code{N} will be \code{0}, reflecting the fact that the two -vector components are orthogonal. The other element will be 1 or -1, -depending on whether the components are indexed in the same or -opposite directions. For example, the transform vector of the arrays -for all tile neighbors on the same subdomain will be \code{(1,0)}, -since all tiles on the same subdomain are oriented identically. A -vector direction that corresponds to the orthogonal dimension with the -same index direction in a particular tile-neighbor orientation will -have \code{(0,1)}, whereas those in the opposite index direction will -have \code{(0,-1)}. This needs some diagrams. +Here \code{N=4}, indicating the western neighbor, which is \code{Tn=1}. +\code{Tn=1} resides on the same subdomain as \code{T=2}, so the tiles +have the same orientation and the same $x$ and $y$ axes. The $i$ +component is orthogonal to the western edge and the tile is 16 points +wide, so \code{exch2\_itlo\_c} and \code{exch2\_ithi\_c} indicate the +column beyond \code{Tn=1}'s eastern edge, in that tile's halo +region. Since the border of the tiles extends through the entire +height of the subdomain, the $y$ axis bounds \code{exch2\_jtlo\_c} to +\code{exch2\_jthi\_c} cover the height, plus 1 in either direction to +cover part of the halo. \\ +For the north edge of the same tile \code{T=2} where \code{N=1} and +the neighbor tile is \code{Tn=5}: -{\footnotesize \begin{verbatim} -C exch2_pi :: X index row of target to source permutation -C :: matrix for each neighbour entry. -C exch2_pj :: Y index row of target to source permutation -C :: matrix for each neighbour entry. 
-C exch2_oi :: X index element of target to source -C :: offset vector for cell-centered quantities -C :: of each neighbor entry. -C exch2_oj :: Y index element of target to source -C :: offset vector for cell-centered quantities -C :: of each neighbor entry. -C exch2_oi_f :: X index element of target to source -C :: offset vector for face quantities -C :: of each neighbor entry. -C exch2_oj_f :: Y index element of target to source -C :: offset vector for face quantities -C :: of each neighbor entry. + exch2_itlo_c(1,2)=0 + exch2_ithi_c(1,2)=0 + exch2_jtlo_c(1,2)=0 + exch2_jthi_c(1,2)=17 \end{verbatim} -} + +\code{T}'s northern edge is parallel to the $x$ axis, but since +\code{Tn}'s $y$ axis corresponds to \code{T}'s $x$ axis, +\code{T}'s northern edge exchanges with \code{Tn}'s western edge. +The western edge of the tiles corresponds to the lower bound of the +$x$ axis, so \code{exch2\_itlo\_c} and \code{exch2\_ithi\_c} are \code{0}. The +range of \code{exch2\_jtlo\_c} and \code{exch2\_jthi\_c} corresponds to the +width of \code{T}'s northern edge, plus the halo. \\ -\subsection{Key Routines} -\subsection{References} + + + + + +This needs some diagrams. \\ + + + +\subsection{Key Routines} + +Most of the subroutines particular to exch2 handle the exchanges +themselves and are of the same format as those described in +\ref{sect:cube_sphere_communication} \sectiontitle{Cube sphere +communication}. Like the original routines, they are written as +templates which the local Makefile converts from RX into RL and RS +forms. \\ + +The interfaces with the core model subroutines are +\code{EXCH\_UV\_XY\_RX}, \code{EXCH\_UV\_XYZ\_RX} and \code{EXCH\_XY\_RX}. +They override the standard exchange routines when \code{genmake2} is +run with the \code{exch2} option. 
They in turn call the local exch2 +subroutines \code{EXCH2\_UV\_XY\_RX} and \code{EXCH2\_UV\_XYZ\_RX} for two +and three dimensional vector quantities, and \code{EXCH2\_XY\_RX} and +\code{EXCH2\_XYZ\_RX} for two and three dimensional scalar quantities. +These subroutines set the dimensions of the area to be exchanged, call +\code{EXCH2\_RX1\_CUBE} for scalars and \code{EXCH2\_RX2\_CUBE} for +vectors, and then handle the singularities at the cube corners. \\ + +The separate scalar and vector forms of \code{EXCH2\_RX1\_CUBE} and +\code{EXCH2\_RX2\_CUBE} reflect that the vector-handling subroutine needs +to pass both the $x$ and $y$ components of the vectors. This arises +from the topological folding discussed above, where the $x$ and $y$ +axes get swapped in some cases. This swapping is not an issue with +the scalar version. These subroutines call \code{EXCH2\_SEND\_RX1} and +\code{EXCH2\_SEND\_RX2}, which do most of the work using the variables +discussed above. \\ +
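
To make the use of those variables concrete, the northern-neighbor example for \code{T=1} in the six-tile topology can be written out as arithmetic. This is only a sketch that follows the description given earlier, using the quoted values \code{exch2\_pi(2,1,1)=-1} and \code{exch2\_oi(1,1)=33}; the symbols $i_T$ and $j_{Tn}$ are introduced here for illustration and are not names used in the package. \\

\[
j_{Tn} = \mathtt{exch2\_pi(2,1,1)} \times i_T + \mathtt{exch2\_oi(1,1)} = -i_T + 33
\]

With $i_T$ running over \code{(1:32)}, $i_T=1$ gives $j_{Tn}=32$ and $i_T=32$ gives $j_{Tn}=1$, so the result covers the same 32 points of the $y$ axis of \code{Tn=3} in the opposite order. \\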