% $Header: /u/gcmpack/manual/part6/exch2.tex,v 1.19 2004/05/10 21:39:11 afe Exp $
% $Name: $

%% * Introduction
%% o what it does, citations (refs go into mitgcm_manual.bib,
%% preferably in alphabetic order)
%% o Equations
%% * Key subroutines and parameters
%% * Reference material (auto generated from Protex and structured comments)
%% o automatically inserted at \section{Reference}

\section{exch2: Extended Cubed Sphere \mbox{Topology}}
\label{sec:exch2}


\subsection{Introduction}

The \texttt{exch2} package extends the original cubed sphere topology
configuration to allow more flexible domain decomposition and
parallelization.  Cube faces (also called subdomains) may be divided
into any number of tiles that divide evenly into the grid point
dimensions of the subdomain.  Furthermore, the tiles can run on
separate processors individually or in groups, which provides for
manual compile-time load balancing across a relatively arbitrary
number of processors. \\

The exchange parameters are declared in
\filelink{pkg/exch2/W2\_EXCH2\_TOPOLOGY.h}{pkg-exch2-W2_EXCH2_TOPOLOGY.h}
and assigned in
\filelink{pkg/exch2/w2\_e2setup.F}{pkg-exch2-w2_e2setup.F}.  The
validity of the cube topology depends on the \file{SIZE.h} file as
detailed below.  The default files provided in the release configure a
cubed sphere topology of six tiles, one per subdomain, each with
32$\times$32 grid points, with all tiles running on a single processor.
Both topology files (\file{W2\_EXCH2\_TOPOLOGY.h} and
\file{w2\_e2setup.F}) are generated by Matlab scripts in
\file{utils/exch2/matlab-topology-generator}; see Section
\ref{sec:topogen} \sectiontitle{Generating Topology Files for exch2}
for details on creating alternate topologies.  Pregenerated examples
of these files with alternate topologies are provided under
\file{utils/exch2/code-mods} along with the appropriate \file{SIZE.h}
file for single-processor execution.

\subsection{Invoking exch2}

To use exch2 with the cubed sphere, the following conditions must be
met: \\

$\bullet$ The exch2 package is included when \file{genmake2} is run.
The easiest way to do this is to add the line \code{exch2} to the
\file{profile.conf} file -- see Section
\ref{sect:buildingCode} \sectiontitle{Building the code} for general
details. \\

$\bullet$ Versions of \file{W2\_EXCH2\_TOPOLOGY.h} and
\file{w2\_e2setup.F} must reside in a directory whose files the
\file{genmake2} script links symbolically into the build.  The safest
place to put these is the directory indicated in the \code{-mods=DIR}
command line modifier (typically \file{../code}), or the build directory.
The default versions of these files reside in \file{pkg/exch2} and
are linked automatically if no other versions exist elsewhere in the
build path, but they should be left untouched to avoid breaking
configurations other than the one you intend to modify.\\

$\bullet$ Files containing grid parameters, named
\file{tile00$n$.mitgrid} where $n$=\code{(1:6)} (one per subdomain),
must be in the working directory when the MITgcm executable is run.
These files are provided in the example experiments for cubed sphere
configurations with 32$\times$32 cube sides
-- please contact MITgcm support if you want to generate
files for other configurations. \\

$\bullet$ As always when compiling MITgcm, the file \file{SIZE.h} must
be placed where \file{genmake2} will find it.  In particular for
exch2, the domain decomposition specified in \file{SIZE.h} must
correspond with the particular configuration's topology specified in
\file{W2\_EXCH2\_TOPOLOGY.h} and \file{w2\_e2setup.F}.  Domain
decomposition issues particular to exch2 are addressed in Sections
\ref{sec:topogen} \sectiontitle{Generating Topology Files for exch2}
and \ref{sec:exch2mpi} \sectiontitle{exch2, SIZE.h, and Multiprocessing}; a more
general background on the subject relevant to MITgcm is presented in
Section \ref{sect:specifying_a_decomposition}
\sectiontitle{Specifying a decomposition}.\\

At the time of this writing the following examples use exch2 and may
be used for guidance:

\begin{verbatim}
verification/adjust_nlfs.cs-32x32x1
verification/adjustment.cs-32x32x1
verification/aim.5l_cs
verification/global_ocean.cs32x15
verification/hs94.cs-32x32x5
\end{verbatim}



\subsection{Generating Topology Files for exch2}
\label{sec:topogen}

Alternate cubed sphere topologies may be created using the Matlab
scripts in \file{utils/exch2/matlab-topology-generator}.  Running the
m-file
\filelink{driver.m}{utils-exch2-matlab-topology-generator_driver.m}
from the Matlab prompt (there are no parameters to pass) generates
exch2 topology files \file{W2\_EXCH2\_TOPOLOGY.h} and
\file{w2\_e2setup.F} in the working directory and displays a figure of
the topology via Matlab -- figures \ref{fig:6tile}, \ref{fig:12tile},
and \ref{fig:24tile} are examples of the generated diagrams.  The other
m-files in the directory are
subroutines called from \file{driver.m} and should not be run ``bare'' except
for development purposes. \\

The parameters that determine the dimensions and topology of the
generated configuration are \code{nr}, \code{nb}, \code{ng},
\code{tnx} and \code{tny}, and all are assigned early in the script. \\

The first three determine the height and width of the subdomains and
hence the size of the overall domain.  Each one determines the number
of grid points, and therefore the resolution, along the subdomain
sides in a ``great circle'' around each of the three spatial axes of
the cube.  At the time of this writing MITgcm requires these three
parameters to be equal, but keeping them separate allows future
releases to accommodate subdomains with differing resolutions around
the axes.\\

The parameters \code{tnx} and \code{tny} determine the width and height of
the tiles into which the subdomains are decomposed, and must evenly
divide the integer assigned to \code{nr}, \code{nb} and \code{ng}.
The result is a rectangular tiling of the subdomain.  Figure
\ref{fig:24tile} shows one possible topology for a twenty-four-tile
cube, and figure \ref{fig:12tile} shows one for twelve tiles. \\
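
The divisibility constraint and the resulting tile count can be
checked before running the generator.  The following Python sketch is
only an illustration and is not part of the Matlab scripts; the
function names are hypothetical, and it assumes the current
restriction that \code{nr}, \code{nb} and \code{ng} share a single
value \code{n}.

\begin{verbatim}
def tiles_per_subdomain(n, tnx, tny):
    """Return (tiles across, tiles up) for an n-by-n subdomain."""
    if n % tnx != 0 or n % tny != 0:
        raise ValueError("tnx and tny must divide n evenly")
    return n // tnx, n // tny


def total_tiles(n, tnx, tny, faces=6):
    """Total number of tiles on a cube with `faces` subdomains."""
    across, up = tiles_per_subdomain(n, tnx, tny)
    return faces * across * up


print(total_tiles(32, 16, 16))   # topology of figure fig:24tile
print(total_tiles(32, 16, 32))   # topology of figure fig:12tile
\end{verbatim}

A tile width such as \code{tnx=12} is rejected for a 32-point
subdomain because 12 does not divide 32.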

\begin{figure}
\begin{center}
 \resizebox{4in}{!}{
  \includegraphics{part6/s24t_16x16.ps}
 }
\end{center}

\caption{Plot of a cubed sphere topology with a 32$\times$192 domain
divided into six 32$\times$32 subdomains, each of which is divided
into four tiles of width \code{tnx=16} and height \code{tny=16} for a
total of twenty-four tiles.  The colored borders of the subdomains
represent the parameters \code{nr} (red), \code{nb} (blue), and
\code{ng} (green).  } \label{fig:24tile}
\end{figure}

\begin{figure}
\begin{center}
 \resizebox{4in}{!}{
  \includegraphics{part6/s12t_16x32.ps}
 }
\end{center}
\caption{Plot of a cubed sphere topology with a 32$\times$192 domain
divided into six 32$\times$32 subdomains of two tiles each
 (\code{tnx=16, tny=32}).
} \label{fig:12tile}
\end{figure}

\begin{figure}
\begin{center}
 \resizebox{4in}{!}{
  \includegraphics{part6/s6t_32x32.ps}
 }
\end{center}
\caption{Plot of a cubed sphere topology with a 32$\times$192 domain
divided into six 32$\times$32 subdomains with one tile each
(\code{tnx=32, tny=32}).  This is the default configuration.
 }
\label{fig:6tile}
\end{figure}


Selected tiles can be omitted from the topology so that no memory or
processors are allocated to them.  This tuning is useful in ocean
modeling for omitting tiles that fall entirely on land.  The omitted
tiles are specified in the file
\filelink{blanklist.txt}{utils-exch2-matlab-topology-generator_blanklist.txt}
by their tile number in the topology, one per line. \\




\subsection{exch2, SIZE.h, and Multiprocessing}
\label{sec:exch2mpi}

Once the topology configuration files are created, the Fortran
\code{PARAMETER}s in \file{SIZE.h} must be configured to match.
Section \ref{sect:specifying_a_decomposition} \sectiontitle{Specifying
a decomposition} provides a general description of domain
decomposition within MITgcm and its relation to \file{SIZE.h}. The
current section specifies constraints that the exch2 package
imposes and describes how to enable parallel execution with
MPI. \\

As in the general case, the parameters \varlink{sNx}{sNx} and
\varlink{sNy}{sNy} define the size of the individual tiles, and so
must be assigned the same respective values as \code{tnx} and
\code{tny} in \file{driver.m}.\\

The halo width parameters \varlink{OLx}{OLx} and \varlink{OLy}{OLy}
have no special bearing on exch2 and may be assigned as in the general
case.  The same holds for \varlink{Nr}{Nr}, the number of vertical
levels in the model.\\

The parameters \varlink{nSx}{nSx}, \varlink{nSy}{nSy},
\varlink{nPx}{nPx}, and \varlink{nPy}{nPy} relate to the number of
tiles and how they are distributed on processors.  When using exch2,
the tiles are stored in the $x$ dimension, and so
\code{\varlink{nSy}{nSy}=1} in all cases.  Since the tiles as
configured by exch2 cannot be split up across processors without
regenerating the topology, \code{\varlink{nPy}{nPy}=1} as well. \\

The number of tiles MITgcm allocates and how they are distributed
between processors depend on \varlink{nPx}{nPx} and
\varlink{nSx}{nSx}: \varlink{nSx}{nSx} is the number of tiles per
processor and \varlink{nPx}{nPx} is the number of processors.  The
total number of tiles in the topology minus those listed in
\file{blanklist.txt} must equal \code{nSx*nPx}.  Note that, to make
full use of a given number of processors, this restriction may require
running a tile that would otherwise be excluded because it lies
topographically outside of the domain and would therefore be listed in
\file{blanklist.txt}.  For example,
suppose you have five processors and a domain decomposition of
thirty-six tiles that allows you to exclude seven tiles.  To evenly
distribute the remaining twenty-nine tiles among five processors, you
would have to run one ``dummy'' tile to make an even six tiles per
processor.  Such dummy tiles are \emph{not} listed in
\file{blanklist.txt}.\\
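
The arithmetic of this example can be sketched in a few lines of
Python.  The helper below is hypothetical and is not part of MITgcm or
the topology generator; it only restates the constraint that the
tiles actually run must equal \code{nSx*nPx}.

\begin{verbatim}
def tiles_per_process(n_tiles, n_excludable, n_procs):
    """Return (nSx, n_dummy, n_listed).

    n_tiles      -- tiles in the topology
    n_excludable -- tiles that could be left out (e.g. all land)
    n_procs      -- number of processes, i.e. nPx
    n_dummy      -- excludable tiles that must be run anyway
    n_listed     -- entries that remain in blanklist.txt
    """
    active = n_tiles - n_excludable
    nSx = -(-active // n_procs)            # ceiling division
    n_dummy = nSx * n_procs - active
    if n_dummy > n_excludable:
        raise ValueError("tiles cannot be spread evenly")
    return nSx, n_dummy, n_excludable - n_dummy


nSx, n_dummy, n_listed = tiles_per_process(36, 7, 5)
print(nSx, n_dummy, n_listed)
\end{verbatim}

For the case in the text this gives six tiles per processor and one
dummy tile, leaving six tile numbers in \file{blanklist.txt}, so that
$36-6=30$ equals \code{nSx*nPx}.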


The following is an example of \file{SIZE.h} for the twelve-tile
configuration illustrated in figure \ref{fig:12tile} running on
one processor: \\

\begin{verbatim}
      PARAMETER (
     &           sNx =  16,
     &           sNy =  32,
     &           OLx =   2,
     &           OLy =   2,
     &           nSx =  12,
     &           nSy =   1,
     &           nPx =   1,
     &           nPy =   1,
     &           Nx  = sNx*nSx*nPx,
     &           Ny  = sNy*nSy*nPy,
     &           Nr  =   5)
\end{verbatim}

The following is an example for the twenty-four-tile topology in
figure \ref{fig:24tile} running on six processors:

\begin{verbatim}
      PARAMETER (
     &           sNx =  16,
     &           sNy =  16,
     &           OLx =   2,
     &           OLy =   2,
     &           nSx =   4,
     &           nSy =   1,
     &           nPx =   6,
     &           nPy =   1,
     &           Nx  = sNx*nSx*nPx,
     &           Ny  = sNy*nSy*nPy,
     &           Nr  =   5)
\end{verbatim}
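
As a quick consistency check on the two examples above, the derived
values of \code{Nx} and \code{Ny} can be evaluated by hand or with a
short script.  The Python sketch below is illustrative only (the
function name is hypothetical); it simply applies the two product
formulas from \file{SIZE.h}.

\begin{verbatim}
def global_extent(sNx, sNy, nSx, nSy, nPx, nPy):
    """Evaluate Nx = sNx*nSx*nPx and Ny = sNy*nSy*nPy."""
    return sNx * nSx * nPx, sNy * nSy * nPy


twelve = global_extent(16, 32, 12, 1, 1, 1)
twenty_four = global_extent(16, 16, 4, 1, 6, 1)
print(twelve, twenty_four)

# Both layouts hold the same number of horizontal grid points
# as six 32x32 subdomains.
for nx, ny in (twelve, twenty_four):
    assert nx * ny == 6 * 32 * 32
\end{verbatim}

Because exch2 stores the tiles along $x$, the resulting \code{Nx} and
\code{Ny} describe the storage layout and differ between the two
decompositions, although the total number of grid points is the same.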





\subsection{Key Variables}

The descriptions of the variables are divided up into scalars,
one-dimensional arrays indexed to the tile number, and two- and
three-dimensional arrays indexed to tile number and neighboring tile.
This division reflects the functionality of these variables: the
scalars are common to every part of the topology, the tile-indexed
arrays describe individual tiles, and the arrays indexed by tile and
neighbor describe relationships between tiles and their neighbors. \\

\subsubsection{Scalars}

The number of tiles in a particular topology is set with the parameter
\code{NTILES}, and the maximum number of neighbors of any tile by
\code{MAX\_NEIGHBOURS}.  These parameters are used for defining the
size of the various one- and two-dimensional arrays that store tile
parameters indexed to the tile number and are assigned in the files
generated by \file{driver.m}.\\

The scalar parameters \varlink{exch2\_domain\_nxt}{exch2_domain_nxt}
and \varlink{exch2\_domain\_nyt}{exch2_domain_nyt} express the number
of tiles along the $x$ and $y$ global indices.  For example, the default
setup of six tiles (Fig. \ref{fig:6tile}) has
\code{exch2\_domain\_nxt=6} and \code{exch2\_domain\_nyt=1}.  A
topology of twenty-four square tiles, four per subdomain (as in figure
\ref{fig:24tile}), will have \code{exch2\_domain\_nxt=12} and
\code{exch2\_domain\_nyt=2}.  Note that these parameters express the
tile layout in order to allow global data files that are tile-layout-neutral.
They have no bearing on the internal storage of the arrays.  The tiles
are stored internally in a range from \code{\varlink{bi}{bi}=(1:NTILES)} in the
$x$ axis, and the $y$ axis variable \varlink{bj}{bj} is assumed to
equal \code{1} throughout the package. \\
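
The two quoted pairs of values are consistent with the six subdomains
being laid side by side along the global $x$ index.  Under that
assumption, which is inferred from the examples rather than taken from
the package source, the counts can be sketched in Python as follows
(the function name is hypothetical):

\begin{verbatim}
def domain_tile_counts(n, tnx, tny, faces=6):
    """Return (exch2_domain_nxt, exch2_domain_nyt) for n-by-n
    subdomains placed side by side along the global x index."""
    return faces * (n // tnx), n // tny


print(domain_tile_counts(32, 32, 32))   # six-tile default
print(domain_tile_counts(32, 16, 16))   # twenty-four tiles
\end{verbatim}

The product of the two counts equals the number of tiles before any
are removed through \file{blanklist.txt}.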

\subsubsection{Arrays Indexed to Tile Number}

The following arrays are of length \code{NTILES} and are indexed to
the tile number, which is indicated in the diagrams with the notation
\textsf{t}$n$.  The indices are omitted in the descriptions. \\

The arrays \varlink{exch2\_tnx}{exch2_tnx} and
\varlink{exch2\_tny}{exch2_tny} express the $x$ and $y$ dimensions of
each tile.  At present for each tile \texttt{exch2\_tnx=sNx} and
\texttt{exch2\_tny=sNy}, as assigned in \file{SIZE.h} and described in
Section \ref{sec:exch2mpi} \sectiontitle{exch2, SIZE.h, and
Multiprocessing}.  Future releases of MITgcm may allow varying tile
sizes. \\

The arrays \varlink{exch2\_tbasex}{exch2_tbasex} and
\varlink{exch2\_tbasey}{exch2_tbasey} determine the tiles'
Cartesian origin within a subdomain
and locate the edges of different tiles relative to each other.  As
an example, in the default six-tile topology (Fig. \ref{fig:6tile})
each index in these arrays is set to \code{0} since a tile occupies
its entire subdomain.  The twenty-four-tile case discussed above will
have values of \code{0} or \code{16}, depending on the quadrant of the
tile within the subdomain.  The elements of the arrays
\varlink{exch2\_txglobalo}{exch2_txglobalo} and
\varlink{exch2\_tyglobalo}{exch2_tyglobalo} are similar to
\varlink{exch2\_tbasex}{exch2_tbasex} and
\varlink{exch2\_tbasey}{exch2_tbasey}, but locate the tile edges within the
global address space, similar to that used by global output and input
files. \\
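
The values quoted for \code{exch2\_tbasex} and \code{exch2\_tbasey}
follow from the tile sizes alone.  The Python sketch below lists the
origins of the tiles within one subdomain; the function name is
hypothetical, and the ordering of tiles within the subdomain ($x$
varying fastest) is an assumption made for illustration, not a
statement about the generator's numbering.

\begin{verbatim}
def tile_bases(n, tnx, tny):
    """List (tbasex, tbasey) for each tile of an n-by-n subdomain."""
    return [(ix * tnx, iy * tny)
            for iy in range(n // tny)
            for ix in range(n // tnx)]


print(tile_bases(32, 32, 32))   # one tile per subdomain
print(tile_bases(32, 16, 16))   # four tiles per subdomain
\end{verbatim}

With one tile per subdomain the only origin is \code{(0, 0)}; with
four 16$\times$16 tiles every component is either \code{0} or
\code{16}, as described above.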

The array \varlink{exch2\_myFace}{exch2_myFace} contains the number of
the subdomain of each tile, in a range \code{(1:6)} in the case of the
standard cube topology and indicated by \textbf{\textsf{f}}$n$ in
figures \ref{fig:12tile} and
\ref{fig:24tile}. \varlink{exch2\_nNeighbours}{exch2_nNeighbours}
contains a count of the neighboring tiles each tile has, and sets
the bounds for looping over neighboring tiles.
\varlink{exch2\_tProc}{exch2_tProc} holds the process rank of each
tile, and is used in interprocess communication.  \\


The arrays \varlink{exch2\_isWedge}{exch2_isWedge},
\varlink{exch2\_isEedge}{exch2_isEedge},
\varlink{exch2\_isSedge}{exch2_isSedge}, and
\varlink{exch2\_isNedge}{exch2_isNedge} are set to \code{1} if the
indexed tile lies on the edge of its subdomain, \code{0} if
not.  The values are used within the topology generator to determine
the orientation of neighboring tiles, and to indicate whether a tile
lies on the corner of a subdomain.  The latter case requires special
exchange and numerical handling for the singularities at the eight
corners of the cube. \\


\subsubsection{Arrays Indexed to Tile Number and Neighbor}

The following arrays have dimensions of length \code{MAX\_NEIGHBOURS} and
\code{NTILES} and describe the orientations between the tiles. \\

The array \code{exch2\_neighbourId(a,T)} holds the tile number
\code{Tn} of each neighboring tile \code{a} of tile \code{T}.
The neighbor tiles are indexed
\code{(1:exch2\_nNeighbours(T))} in the order right to left on the
north then south edges, and then top to bottom on the east then west
edges.  \\

The \code{exch2\_opposingSend\_record(a,T)} array holds the
index \code{b} of the element in \texttt{exch2\_neighbourId(b,Tn)}
that holds the tile number \code{T}, given
\code{Tn=exch2\_neighbourId(a,T)}.  In other words,
\begin{verbatim}
   exch2_neighbourId( exch2_opposingSend_record(a,T),
                      exch2_neighbourId(a,T) ) = T
\end{verbatim}
This provides a back-reference from the neighbor tiles. \\
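
The back-reference relation can be exercised on a small made-up
topology.  The Python sketch below uses a hypothetical three-tile
neighbor table (not a cubed sphere) with 1-based indices as in the
Fortran arrays; the dictionaries and function names are illustrative
and do not reproduce the generator's code.  It assumes that no tile
borders the same neighbor along two different edges.

\begin{verbatim}
# neighbour_id[T][a] is the tile number of neighbour a of tile T.
neighbour_id = {
    1: {1: 2, 2: 3},
    2: {1: 1, 2: 3},
    3: {1: 2, 2: 1},
}


def opposing_send_record(neighbour_id):
    """Return record[T][a] = b with neighbour_id[Tn][b] == T."""
    record = {}
    for T, neighbours in neighbour_id.items():
        record[T] = {}
        for a, Tn in neighbours.items():
            # first slot of Tn that points back at T
            record[T][a] = next(
                b for b, back in neighbour_id[Tn].items()
                if back == T)
    return record


def back_reference_holds(neighbour_id, record):
    """Check neighbour_id(record(a,T), neighbour_id(a,T)) == T."""
    return all(
        neighbour_id[Tn][record[T][a]] == T
        for T, neighbours in neighbour_id.items()
        for a, Tn in neighbours.items())


record = opposing_send_record(neighbour_id)
print(back_reference_holds(neighbour_id, record))
\end{verbatim}

The check mirrors the identity given above term by term, with the
inner lookup supplying \code{Tn} and the record supplying the slot
\code{b}.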

The arrays \varlink{exch2\_pi}{exch2_pi} and
\varlink{exch2\_pj}{exch2_pj} specify the transformations of indices
in exchanges between the neighboring tiles.  These transformations are
necessary in exchanges between subdomains because a horizontal dimension
in one subdomain
may map to the other horizontal dimension in an adjacent subdomain, and
may also have its indexing reversed.  This swapping arises from the
``folding'' of two-dimensional arrays into a three-dimensional
cube. \\

The dimensions of \code{exch2\_pi(t,N,T)} and \code{exch2\_pj(t,N,T)}
are the neighbor ID \code{N} and the tile number \code{T} as explained
above, plus a vector of length \code{2} containing transformation
factors \code{t}.  The first element of the transformation vector
holds the factor to multiply the index in the same dimension, and the
second element holds the same for the orthogonal dimension.  To
clarify, \code{exch2\_pi(1,N,T)} holds the mapping of the $x$ axis
index of tile \code{T} to the $x$ axis of tile \code{T}'s neighbor
\code{N}, and \code{exch2\_pi(2,N,T)} holds the mapping of \code{T}'s
$x$ index to the neighbor \code{N}'s $y$ index. \\

One of the two elements of \code{exch2\_pi} or \code{exch2\_pj} for a
given tile \code{T} and neighbor \code{N} will be \code{0}, reflecting
the fact that the two axes are orthogonal.  The other element will be
\code{1} or \code{-1}, depending on whether the axes are indexed in
the same or opposite directions.  For example, the transform vector of
the arrays for all tile neighbors on the same subdomain will be
\code{(1,0)}, since all tiles on the same subdomain are oriented
identically.  An axis that corresponds to the orthogonal dimension
with the same index direction in a particular tile-neighbor
orientation will have \code{(0,1)}.  Those with the opposite index
direction will have \code{(0,-1)} in order to reverse the ordering. \\

The arrays \varlink{exch2\_oi}{exch2_oi},
\varlink{exch2\_oj}{exch2_oj}, \varlink{exch2\_oi\_f}{exch2_oi_f}, and
\varlink{exch2\_oj\_f}{exch2_oj_f} are indexed to tile number and
neighbor and specify the relative offset within the subdomain of the
array index of a variable going from a neighboring tile \code{N} to a
local tile \code{T}.  Consider \code{T=1} in the six-tile topology
(Fig. \ref{fig:6tile}), where

\begin{verbatim}
       exch2_oi(1,1)=33
       exch2_oi(2,1)=0
       exch2_oi(3,1)=32
       exch2_oi(4,1)=-32
\end{verbatim}

The simplest case is \code{exch2\_oi(2,1)}, the southern neighbor,
which is \code{Tn=6}.  The axes of \code{T} and \code{Tn} have the
same orientation and their $x$ axes have the same origin, and so an
exchange between the two requires no changes to the $x$ index.  For
the western neighbor (\code{Tn=5}), \code{exch2\_oi(3,1)=32} since the
\code{x=0} vector on \code{T} corresponds to the \code{y=32} vector on
\code{Tn}.  The eastern edge of \code{T} shows the reverse case
(\code{exch2\_oi(4,1)=-32}), where \code{x=32} on \code{T} exchanges
with \code{x=0} on \code{Tn=2}. \\

The most interesting case, where \code{exch2\_oi(1,1)=33} and
\code{Tn=3}, involves a reversal of indices.  As in every case, the
offset \code{exch2\_oi} is added to the original $x$ index of \code{T}
multiplied by the transformation factor \code{exch2\_pi(t,N,T)}.  Here
\code{exch2\_pi(1,1,1)=0} since the $x$ axis of \code{T} is orthogonal
to the $x$ axis of \code{Tn}.  \code{exch2\_pi(2,1,1)=-1} since the
$x$ axis of \code{T} corresponds to the $y$ axis of \code{Tn}, but the
index is reversed.  The result is that the index of the northern edge
of \code{T}, which runs \code{(1:32)}, is transformed to
\code{(-1:-32)}. \code{exch2\_oi(1,1)} is then added to this range to
get back \code{(32:1)} -- the index of the $y$ axis of \code{Tn}
relative to \code{T}.  This transformation may seem overly convoluted
for the six-tile case, but it is necessary to provide a general
solution for various topologies. \\
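
The arithmetic of this last case, a multiplication by the
transformation factor followed by the addition of the offset, can be
reproduced with a few lines of Python.  The sketch only restates the
worked example; the function name is hypothetical and the actual
exchange routines are not reproduced here.

\begin{verbatim}
def transform_index(index, factor, offset):
    """Apply a transformation factor, then add the offset."""
    return factor * index + offset


exch2_pi_2 = -1   # exch2_pi(2,1,1): x of T maps to reversed y of Tn
exch2_oi_1 = 33   # exch2_oi(1,1)

north_edge = range(1, 33)   # x indices 1..32 along T's northern edge
mapped = [transform_index(i, exch2_pi_2, exch2_oi_1)
          for i in north_edge]
print(mapped[0], mapped[-1])
\end{verbatim}

The first point of the edge maps to index 32 and the last to index 1,
which is the reversed range \code{(32:1)} described above.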



Finally, \varlink{exch2\_itlo\_c}{exch2_itlo_c},
\varlink{exch2\_ithi\_c}{exch2_ithi_c},
\varlink{exch2\_jtlo\_c}{exch2_jtlo_c} and
\varlink{exch2\_jthi\_c}{exch2_jthi_c} hold the location and index
bounds of the edge segment of the neighbor tile \code{N}'s subdomain
that gets exchanged with the local tile \code{T}.  To take the example
of tile \code{T=2} in the twelve-tile topology
(Fig. \ref{fig:12tile}): \\

\begin{verbatim}
       exch2_itlo_c(4,2)=17
       exch2_ithi_c(4,2)=17
       exch2_jtlo_c(4,2)=0
       exch2_jthi_c(4,2)=33
\end{verbatim}

Here \code{N=4}, indicating the western neighbor, which is
\code{Tn=1}.  \code{Tn} resides on the same subdomain as \code{T}, so
the tiles have the same orientation and the same $x$ and $y$ axes.
The $x$ axis is orthogonal to the western edge and the tile is 16
points wide, so \code{exch2\_itlo\_c} and \code{exch2\_ithi\_c}
indicate the column beyond \code{Tn}'s eastern edge, in that tile's
halo region.  Since the border of the tiles extends through the entire
height of the subdomain, the $y$ axis bounds \code{exch2\_jtlo\_c} to
\code{exch2\_jthi\_c} cover the height of \code{(1:32)}, plus 1 in
either direction to cover part of the halo. \\

For the north edge of the same tile \code{T=2} where \code{N=1} and
the neighbor tile is \code{Tn=5}:

\begin{verbatim}
       exch2_itlo_c(1,2)=0
       exch2_ithi_c(1,2)=0
       exch2_jtlo_c(1,2)=0
       exch2_jthi_c(1,2)=17
\end{verbatim}

\code{T}'s northern edge is parallel to the $x$ axis, but since
\code{Tn}'s $y$ axis corresponds to \code{T}'s $x$ axis, \code{T}'s
northern edge exchanges with \code{Tn}'s western edge.  The western
edge of the tiles corresponds to the lower bound of the $x$ axis, so
\code{exch2\_itlo\_c} and \code{exch2\_ithi\_c} are \code{0}, in the
western halo region of \code{Tn}.  The range from
\code{exch2\_jtlo\_c} to \code{exch2\_jthi\_c} corresponds to the
width of \code{T}'s northern edge, expanded by one into the halo. \\
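
To make the two sets of bounds concrete, the Python sketch below
enumerates the index pairs they enclose.  It is an illustration only,
with a hypothetical function name, and treats the bounds as inclusive,
as the ranges in the text suggest.

\begin{verbatim}
def exchange_points(itlo, ithi, jtlo, jthi):
    """List the (i, j) pairs inside inclusive index bounds."""
    return [(i, j)
            for j in range(jtlo, jthi + 1)
            for i in range(itlo, ithi + 1)]


west = exchange_points(17, 17, 0, 33)    # N=4 bounds for T=2
north = exchange_points(0, 0, 0, 17)     # N=1 bounds for T=2
print(len(west), len(north))
\end{verbatim}

The western exchange covers a single column of $32+2$ points and the
northern exchange a single column of $16+2$ points; the two extra
points in each case are the one-point extension into the halo at
either end.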


\subsection{Key Routines}

Most of the subroutines particular to exch2 handle the exchanges
themselves and are of the same format as those described in Section
\ref{sect:cube_sphere_communication} \sectiontitle{Cube sphere
communication}.  Like the original routines, they are written as
templates which the local Makefile converts from \code{RX} into
\code{RL} and \code{RS} forms. \\
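
The naming convention can be illustrated with a short Python sketch.
It shows only how the \code{RX} placeholder in a routine name yields
the two concrete names; the function is hypothetical, and the actual
substitution is carried out by the Makefile, whose rules are not
reproduced here.

\begin{verbatim}
def expand_template_name(name):
    """Return the RL and RS names derived from an RX template name."""
    if "RX" not in name:
        raise ValueError("not a template name: " + name)
    return name.replace("RX", "RL"), name.replace("RX", "RS")


print(expand_template_name("EXCH2_UV_XY_RX"))
print(expand_template_name("EXCH2_RX1_CUBE"))
\end{verbatim}

Each template therefore stands for a pair of routines that differ
only in the placeholder.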

The interfaces with the core model subroutines are
\code{EXCH\_UV\_XY\_RX}, \code{EXCH\_UV\_XYZ\_RX} and
\code{EXCH\_XY\_RX}.  They override the standard exchange routines
when \code{genmake2} is run with the \code{exch2} option.  They in turn
call the local exch2 subroutines \code{EXCH2\_UV\_XY\_RX} and
\code{EXCH2\_UV\_XYZ\_RX} for two- and three-dimensional vector
quantities, and \code{EXCH2\_XY\_RX} and \code{EXCH2\_XYZ\_RX} for two-
and three-dimensional scalar quantities.  These subroutines set the
dimensions of the area to be exchanged, call \code{EXCH2\_RX1\_CUBE}
for scalars and \code{EXCH2\_RX2\_CUBE} for vectors, and then handle
the singularities at the cube corners. \\

The separate scalar and vector forms, \code{EXCH2\_RX1\_CUBE} and
\code{EXCH2\_RX2\_CUBE}, reflect that the vector-handling subroutine
needs to pass both the $u$ and $v$ components of the physical vectors.
Both components are needed because of the topological folding
discussed above, where the
$x$ and $y$ axes get swapped in some cases; this is not an
issue with the scalar case. These subroutines call
\code{EXCH2\_SEND\_RX1} and \code{EXCH2\_SEND\_RX2}, which do most of
the work using the variables discussed above. \\
