1 |
% $Header$ |
% $Header$ |
2 |
|
|
3 |
This chapter focuses on describing the {\bf WRAPPER} environment within which |
This chapter focuses on describing the {\bf WRAPPER} environment |
4 |
both the core numerics and the pluggable packages operate. The description |
within which both the core numerics and the pluggable packages |
5 |
presented here is intended to be a detailed exposition and contains significant |
operate. The description presented here is intended to be a detailed |
6 |
background material, as well as advanced details on working with the WRAPPER. |
exposition and contains significant background material, as well as |
7 |
The tutorial sections of this manual (see sections |
advanced details on working with the WRAPPER. The tutorial sections |
8 |
\ref{sect:tutorials} and \ref{sect:tutorialIII}) |
of this manual (see sections \ref{sec:modelExamples} and |
9 |
contain more succinct, step-by-step instructions on running basic numerical |
\ref{sec:tutorialIII}) contain more succinct, step-by-step |
10 |
experiments, of various types, both sequentially and in parallel. For many |
instructions on running basic numerical experiments, of various types, |
11 |
projects simply starting from an example code and adapting it to suit a |
both sequentially and in parallel. For many projects simply starting |
12 |
particular situation |
from an example code and adapting it to suit a particular situation |
13 |
will be all that is required. |
will be all that is required. The first part of this chapter |
14 |
The first part of this chapter discusses the MITgcm architecture at an |
discusses the MITgcm architecture at an abstract level. In the second |
15 |
abstract level. In the second part of the chapter we describe practical |
part of the chapter we describe practical details of the MITgcm |
16 |
details of the MITgcm implementation and of current tools and operating system |
implementation and of current tools and operating system features that |
17 |
features that are employed. |
are employed. |
18 |
|
|
19 |
\section{Overall architectural goals} |
\section{Overall architectural goals} |
20 |
\begin{rawhtml} |
\begin{rawhtml} |
25 |
three-fold |
three-fold |
26 |
|
|
27 |
\begin{itemize} |
\begin{itemize} |
28 |
\item We wish to be able to study a very broad range |
\item We wish to be able to study a very broad range of interesting |
29 |
of interesting and challenging rotating fluids problems. |
and challenging rotating fluids problems. |
30 |
\item We wish the model code to be readily targeted to |
\item We wish the model code to be readily targeted to a wide range of |
31 |
a wide range of platforms |
platforms |
32 |
\item On any given platform we would like to be |
\item On any given platform we would like to be able to achieve |
33 |
able to achieve performance comparable to an implementation |
performance comparable to an implementation developed and |
34 |
developed and specialized specifically for that platform. |
specialized specifically for that platform. |
35 |
\end{itemize} |
\end{itemize} |
36 |
|
|
37 |
These points are summarized in figure \ref{fig:mitgcm_architecture_goals} |
These points are summarized in figure |
38 |
which conveys the goals of the MITgcm design. The goals lead to |
\ref{fig:mitgcm_architecture_goals} which conveys the goals of the |
39 |
a software architecture which at a high level can be viewed as consisting |
MITgcm design. The goals lead to a software architecture which at a |
40 |
of |
high level can be viewed as consisting of |
41 |
|
|
42 |
\begin{enumerate} |
\begin{enumerate} |
43 |
\item A core set of numerical and support code. This is discussed in |
\item A core set of numerical and support code. This is discussed in |
69 |
|
|
70 |
\begin{figure} |
\begin{figure} |
71 |
\begin{center} |
\begin{center} |
72 |
\resizebox{!}{2.5in}{\includegraphics{part4/mitgcm_goals.eps}} |
\resizebox{!}{2.5in}{\includegraphics{s_software/figs/mitgcm_goals.eps}} |
73 |
\end{center} |
\end{center} |
74 |
\caption{ |
\caption{ The MITgcm architecture is designed to allow simulation of a |
75 |
The MITgcm architecture is designed to allow simulation of a wide |
wide range of physical problems on a wide range of hardware. The |
76 |
range of physical problems on a wide range of hardware. The computational |
computational resource requirements of the applications targeted |
77 |
resource requirements of the applications targeted range from around |
range from around $10^7$ bytes ($\approx 10$ megabytes) of memory to |
78 |
$10^7$ bytes ( $\approx 10$ megabytes ) of memory to $10^{11}$ bytes |
$10^{11}$ bytes ($\approx 100$ gigabytes). Arithmetic operation |
79 |
( $\approx 100$ gigabytes). Arithmetic operation counts for the applications of |
counts for the applications of interest range from $10^{9}$ floating |
80 |
interest range from $10^{9}$ floating point operations to more than $10^{17}$ |
point operations to more than $10^{17}$ floating point operations.} |
|
floating point operations.} |
|
81 |
\label{fig:mitgcm_architecture_goals} |
\label{fig:mitgcm_architecture_goals} |
82 |
\end{figure} |
\end{figure} |
83 |
|
|
86 |
<!-- CMIREDIR:wrapper: --> |
<!-- CMIREDIR:wrapper: --> |
87 |
\end{rawhtml} |
\end{rawhtml} |
88 |
|
|
89 |
A significant element of the software architecture utilized in |
A significant element of the software architecture utilized in MITgcm |
90 |
MITgcm is a software superstructure and substructure collectively |
is a software superstructure and substructure collectively called the |
91 |
called the WRAPPER (Wrappable Application Parallel Programming |
WRAPPER (Wrappable Application Parallel Programming Environment |
92 |
Environment Resource). All numerical and support code in MITgcm is written |
Resource). All numerical and support code in MITgcm is written to |
93 |
to ``fit'' within the WRAPPER infrastructure. Writing code to ``fit'' within |
``fit'' within the WRAPPER infrastructure. Writing code to ``fit'' |
94 |
the WRAPPER means that coding has to follow certain, relatively |
within the WRAPPER means that coding has to follow certain, relatively |
95 |
straightforward, rules and conventions (these are discussed further in |
straightforward, rules and conventions (these are discussed further in |
96 |
section \ref{sect:specifying_a_decomposition}). |
section \ref{sec:specifying_a_decomposition}). |
97 |
|
|
98 |
The approach taken by the WRAPPER is illustrated in figure |
The approach taken by the WRAPPER is illustrated in figure |
99 |
\ref{fig:fit_in_wrapper} which shows how the WRAPPER serves to insulate code |
\ref{fig:fit_in_wrapper} which shows how the WRAPPER serves to |
100 |
that fits within it from architectural differences between hardware platforms |
insulate code that fits within it from architectural differences |
101 |
and operating systems. This allows numerical code to be easily retargeted. |
between hardware platforms and operating systems. This allows |
102 |
|
numerical code to be easily retargeted. |
103 |
|
|
104 |
|
|
105 |
\begin{figure} |
\begin{figure} |
106 |
\begin{center} |
\begin{center} |
107 |
\resizebox{!}{4.5in}{\includegraphics{part4/fit_in_wrapper.eps}} |
\resizebox{!}{4.5in}{\includegraphics{s_software/figs/fit_in_wrapper.eps}} |
108 |
\end{center} |
\end{center} |
109 |
\caption{ |
\caption{ |
110 |
Numerical code is written to fit within a software support |
Numerical code is written to fit within a software support |
118 |
\end{figure} |
\end{figure} |
119 |
|
|
120 |
\subsection{Target hardware} |
\subsection{Target hardware} |
121 |
\label{sect:target_hardware} |
\label{sec:target_hardware} |
122 |
|
|
123 |
The WRAPPER is designed to target as broad as possible a range of |
The WRAPPER is designed to target as broad as possible a range of |
124 |
computer systems. The original development of the WRAPPER took place |
computer systems. The original development of the WRAPPER took place |
136 |
IBM SP systems). In all cases numerical code, operating within the |
IBM SP systems). In all cases numerical code, operating within the |
137 |
WRAPPER, performs and scales very competitively with equivalent |
WRAPPER, performs and scales very competitively with equivalent |
138 |
numerical code that has been modified to contain native optimizations |
numerical code that has been modified to contain native optimizations |
139 |
for a particular system \ref{ref hoe and hill, ecmwf}. |
for a particular system \cite{hoe-hill:99}. |
140 |
|
|
141 |
\subsection{Supporting hardware neutrality} |
\subsection{Supporting hardware neutrality} |
142 |
|
|
143 |
The different systems listed in section \ref{sect:target_hardware} can |
The different systems listed in section \ref{sec:target_hardware} can |
144 |
be categorized in many different ways. For example, one common |
be categorized in many different ways. For example, one common |
145 |
distinction is between shared-memory parallel systems (SMP and PVP) |
distinction is between shared-memory parallel systems (SMP and PVP) |
146 |
and distributed memory parallel systems (for example x86 clusters and |
and distributed memory parallel systems (for example x86 clusters and |
165 |
scientific computing community. |
scientific computing community. |
166 |
|
|
167 |
\subsection{Machine model parallelism} |
\subsection{Machine model parallelism} |
168 |
|
\label{sec:domain_decomposition} |
169 |
\begin{rawhtml} |
\begin{rawhtml} |
170 |
<!-- CMIREDIR:domain_decomp: --> |
<!-- CMIREDIR:domain_decomp: --> |
171 |
\end{rawhtml} |
\end{rawhtml} |
211 |
\begin{figure} |
\begin{figure} |
212 |
\begin{center} |
\begin{center} |
213 |
\resizebox{5in}{!}{ |
\resizebox{5in}{!}{ |
214 |
\includegraphics{part4/domain_decomp.eps} |
\includegraphics{s_software/figs/domain_decomp.eps} |
215 |
} |
} |
216 |
\end{center} |
\end{center} |
217 |
\caption{ The WRAPPER provides support for one and two dimensional |
\caption{ The WRAPPER provides support for one and two dimensional |
240 |
domain it owns. Periodically processors will make calls to WRAPPER |
domain it owns. Periodically processors will make calls to WRAPPER |
241 |
functions to communicate data between tiles, in order to keep the |
functions to communicate data between tiles, in order to keep the |
242 |
overlap regions up to date (see section |
overlap regions up to date (see section |
243 |
\ref{sect:communication_primitives}). The WRAPPER functions can use a |
\ref{sec:communication_primitives}). The WRAPPER functions can use a |
244 |
variety of different mechanisms to communicate data between tiles. |
variety of different mechanisms to communicate data between tiles. |
245 |
|
|
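The overlap update described above can be illustrated with a minimal
sketch. It is not WRAPPER code: it assumes a one dimensional periodic
domain, and the names \texttt{make\_tiles}, \texttt{exchange\_halos},
\texttt{snx} and \texttt{olx} are hypothetical, chosen only for this
illustration. Each tile stores its interior points plus \texttt{olx}
overlap points on each side, and the exchange copies neighbouring
interior values into those overlap points.

```python
def make_tiles(global_field, snx, olx):
    """Split a 1-D field into tiles of snx interior points, padded with
    olx unfilled overlap points on each side (hypothetical layout)."""
    assert 0 < olx <= snx and len(global_field) % snx == 0
    pad = [None] * olx
    return [pad + global_field[start:start + snx] + pad
            for start in range(0, len(global_field), snx)]


def exchange_halos(tiles, olx):
    """Fill every tile's overlap points from its periodic neighbours."""
    n = len(tiles)
    # Snapshot the interiors first so the result does not depend on
    # the order in which tiles are updated.
    interiors = [tile[olx:len(tile) - olx] for tile in tiles]
    for i, tile in enumerate(tiles):
        west = interiors[(i - 1) % n]
        east = interiors[(i + 1) % n]
        tile[:olx] = west[-olx:]             # western overlap region
        tile[len(tile) - olx:] = east[:olx]  # eastern overlap region
    return tiles


field = list(range(8))
tiles = exchange_halos(make_tiles(field, snx=4, olx=1), olx=1)
print(tiles)
```

In this sketch every tile is held in one address space, so the copy
is a plain assignment. When the neighbouring tile belongs to another
process the same update would have to be carried by a communication
library instead.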
246 |
\begin{figure} |
\begin{figure} |
247 |
\begin{center} |
\begin{center} |
248 |
\resizebox{5in}{!}{ |
\resizebox{5in}{!}{ |
249 |
\includegraphics{part4/tiled-world.eps} |
\includegraphics{s_software/figs/tiled-world.eps} |
250 |
} |
} |
251 |
\end{center} |
\end{center} |
252 |
\caption{ A global grid subdivided into tiles. |
\caption{ A global grid subdivided into tiles. |
280 |
call a function in the API of the communication library to |
call a function in the API of the communication library to |
281 |
communicate data from a tile that it owns to a tile that another CPU |
communicate data from a tile that it owns to a tile that another CPU |
282 |
owns. By default the WRAPPER binds to the MPI communication library |
owns. By default the WRAPPER binds to the MPI communication library |
283 |
\ref{MPI} for this style of communication. |
\cite{MPI-std-20} for this style of communication. |
284 |
\end{itemize} |
\end{itemize} |
285 |
|
|
286 |
The WRAPPER assumes that communication will use one of these two styles |
The WRAPPER assumes that communication will use one of these two styles |
329 |
\end{figure} |
\end{figure} |
330 |
|
|
331 |
\subsection{Shared memory communication} |
\subsection{Shared memory communication} |
332 |
\label{sect:shared_memory_communication} |
\label{sec:shared_memory_communication} |
333 |
|
|
334 |
Under shared memory communication independent CPUs are operating on the |
Under shared memory communication independent CPUs are operating on the |
335 |
exact same global address space at the application level. This means |
exact same global address space at the application level. This means |
356 |
appropriately. |
appropriately. |
357 |
|
|
358 |
\subsubsection{Memory consistency} |
\subsubsection{Memory consistency} |
359 |
\label{sect:memory_consistency} |
\label{sec:memory_consistency} |
360 |
|
|
361 |
When using shared memory communication between multiple processors the |
When using shared memory communication between multiple processors the |
362 |
WRAPPER level shields user applications from certain counter-intuitive |
WRAPPER level shields user applications from certain counter-intuitive |
382 |
particular platform. |
particular platform. |
383 |
|
|
384 |
\subsubsection{Cache effects and false sharing} |
\subsubsection{Cache effects and false sharing} |
385 |
\label{sect:cache_effects_and_false_sharing} |
\label{sec:cache_effects_and_false_sharing} |
386 |
|
|
387 |
Shared-memory machines often have processor-local memory caches |
Shared-memory machines often have processor-local memory caches |
388 |
which contain mirrored copies of main memory. Automatic cache-coherence |
which contain mirrored copies of main memory. Automatic cache-coherence |
402 |
the standard mechanism for supporting shared memory that the WRAPPER |
the standard mechanism for supporting shared memory that the WRAPPER |
403 |
utilizes. Configuring and launching code to run in multi-threaded mode |
utilizes. Configuring and launching code to run in multi-threaded mode |
404 |
on specific platforms is discussed in section |
on specific platforms is discussed in section |
405 |
\ref{sect:multi-threaded-execution}. However, on many systems, |
\ref{sec:multi_threaded_execution}. However, on many systems, |
406 |
potentially very efficient mechanisms for using shared memory |
potentially very efficient mechanisms for using shared memory |
407 |
communication between multiple processes (in contrast to multiple |
communication between multiple processes (in contrast to multiple |
408 |
threads within a single process) also exist. In most cases this works |
threads within a single process) also exist. In most cases this works |
409 |
by making a limited region of memory shared between processes. The |
by making a limited region of memory shared between processes. The |
410 |
MMAP \ref{magicgarden} and IPC \ref{magicgarden} facilities in UNIX |
MMAP %\ref{magicgarden} |
411 |
|
and IPC %\ref{magicgarden} |
412 |
|
facilities in UNIX |
413 |
systems provide this capability as do vendor specific tools like LAPI |
systems provide this capability as do vendor specific tools like LAPI |
414 |
\ref{IBMLAPI} and IMC \ref{Memorychannel}. Extensions exist for the |
%\ref{IBMLAPI} |
415 |
|
and IMC. %\ref{Memorychannel}. |
416 |
|
Extensions exist for the |
417 |
WRAPPER that allow these mechanisms to be used for shared memory |
WRAPPER that allow these mechanisms to be used for shared memory |
418 |
communication. However, these mechanisms are not distributed with the |
communication. However, these mechanisms are not distributed with the |
419 |
default WRAPPER sources, because of their proprietary nature. |
default WRAPPER sources, because of their proprietary nature. |
420 |
|
|
421 |
\subsection{Distributed memory communication} |
\subsection{Distributed memory communication} |
422 |
\label{sect:distributed_memory_communication} |
\label{sec:distributed_memory_communication} |
423 |
Many parallel systems are not constructed in a way where it is |
Many parallel systems are not constructed in a way where it is |
424 |
possible or practical for an application to use shared memory |
possible or practical for an application to use shared memory for |
425 |
for communication. For example cluster systems consist of individual computers |
communication. For example cluster systems consist of individual |
426 |
connected by a fast network. On such systems there is no notion of shared memory |
computers connected by a fast network. On such systems there is no |
427 |
at the system level. For this sort of system the WRAPPER provides support |
notion of shared memory at the system level. For this sort of system |
428 |
for communication based on a bespoke communication library |
the WRAPPER provides support for communication based on a bespoke |
429 |
(see figure \ref{fig:comm_msg}). The default communication library used is MPI |
communication library (see figure \ref{fig:comm_msg}). The default |
430 |
\ref{mpi}. However, it is relatively straightforward to implement bindings to |
communication library used is MPI \cite{MPI-std-20}. However, it is |
431 |
optimized platform specific communication libraries. For example the work |
relatively straightforward to implement bindings to optimized platform |
432 |
described in \ref{hoe-hill:99} substituted standard MPI communication for a |
specific communication libraries. For example the work described in |
433 |
highly optimized library. |
\cite{hoe-hill:99} substituted standard MPI communication for a highly |
434 |
|
optimized library. |
435 |
|
|
436 |
\subsection{Communication primitives} |
\subsection{Communication primitives} |
437 |
\label{sect:communication_primitives} |
\label{sec:communication_primitives} |
438 |
|
|
439 |
\begin{figure} |
\begin{figure} |
440 |
\begin{center} |
\begin{center} |
441 |
\resizebox{5in}{!}{ |
\resizebox{5in}{!}{ |
442 |
\includegraphics{part4/comm-primm.eps} |
\includegraphics{s_software/figs/comm-primm.eps} |
443 |
} |
} |
444 |
\end{center} |
\end{center} |
445 |
\caption{Three performance critical parallel primitives are provided |
\caption{Three performance critical parallel primitives are provided |
522 |
\begin{figure} |
\begin{figure} |
523 |
\begin{center} |
\begin{center} |
524 |
\resizebox{5in}{!}{ |
\resizebox{5in}{!}{ |
525 |
\includegraphics{part4/tiling_detail.eps} |
\includegraphics{s_software/figs/tiling_detail.eps} |
526 |
} |
} |
527 |
\end{center} |
\end{center} |
528 |
\caption{The tiling strategy that the WRAPPER supports allows tiles |
\caption{The tiling strategy that the WRAPPER supports allows tiles |
582 |
computing CPUs. |
computing CPUs. |
583 |
\end{enumerate} |
\end{enumerate} |
584 |
This section describes the details of each of these operations. |
This section describes the details of each of these operations. |
585 |
Section \ref{sect:specifying_a_decomposition} explains how the way in |
Section \ref{sec:specifying_a_decomposition} explains how the way in |
586 |
which a domain is decomposed (or composed) is expressed. Section |
which a domain is decomposed (or composed) is expressed. Section |
587 |
\ref{sect:starting_a_code} describes practical details of running |
\ref{sec:starting_the_code} describes practical details of running |
588 |
codes in various different parallel modes on contemporary computer |
codes in various different parallel modes on contemporary computer |
589 |
systems. Section \ref{sect:controlling_communication} explains the |
systems. Section \ref{sec:controlling_communication} explains the |
590 |
internal information that the WRAPPER uses to control how information |
internal information that the WRAPPER uses to control how information |
591 |
is communicated between tiles. |
is communicated between tiles. |
592 |
|
|
593 |
\subsection{Specifying a domain decomposition} |
\subsection{Specifying a domain decomposition} |
594 |
\label{sect:specifying_a_decomposition} |
\label{sec:specifying_a_decomposition} |
595 |
|
|
596 |
At its heart much of the WRAPPER works only in terms of a collection of tiles |
At its heart much of the WRAPPER works only in terms of a collection of tiles |
597 |
which are interconnected to each other. This is also true of application |
which are interconnected to each other. This is also true of application |
628 |
\begin{figure} |
\begin{figure} |
629 |
\begin{center} |
\begin{center} |
630 |
\resizebox{5in}{!}{ |
\resizebox{5in}{!}{ |
631 |
\includegraphics{part4/size_h.eps} |
\includegraphics{s_software/figs/size_h.eps} |
632 |
} |
} |
633 |
\end{center} |
\end{center} |
634 |
\caption{ The three level domain decomposition hierarchy employed by the |
\caption{ The three level domain decomposition hierarchy employed by the |
643 |
dimensions of {\em sNx} and {\em sNy}. If, when the code is executed, these tiles are |
dimensions of {\em sNx} and {\em sNy}. If, when the code is executed, these tiles are |
644 |
allocated to different threads of a process that are then bound to |
allocated to different threads of a process that are then bound to |
645 |
different physical processors ( see the multi-threaded |
different physical processors ( see the multi-threaded |
646 |
execution discussion in section \ref{sect:starting_the_code} ) then |
execution discussion in section \ref{sec:starting_the_code} ) then |
647 |
computation will be performed concurrently on each tile. However, it is also |
computation will be performed concurrently on each tile. However, it is also |
648 |
possible to run the same decomposition within a process running a single thread on |
possible to run the same decomposition within a process running a single thread on |
649 |
a single processor. In this case the tiles will be computed over sequentially. |
a single processor. In this case the tiles will be computed over sequentially. |
840 |
This set of values can be used for a cube sphere calculation. |
This set of values can be used for a cube sphere calculation. |
841 |
Each tile of size $32 \times 32$ represents a face of the |
Each tile of size $32 \times 32$ represents a face of the |
842 |
cube. Initializing the tile connectivity correctly (see section |
cube. Initializing the tile connectivity correctly (see section |
843 |
\ref{sect:cube_sphere_communication}) allows the rotations associated with |
\ref{sec:cube_sphere_communication}) allows the rotations associated with |
844 |
moving between the six cube faces to be embedded within the |
moving between the six cube faces to be embedded within the |
845 |
tile-tile communication code. |
tile-tile communication code. |
846 |
\end{enumerate} |
\end{enumerate} |
847 |
|
|
848 |
|
|
849 |
\subsection{Starting the code} |
\subsection{Starting the code} |
850 |
\label{sect:starting_the_code} |
\label{sec:starting_the_code} |
851 |
When code is started under the WRAPPER, execution begins in a main routine {\em |
When code is started under the WRAPPER, execution begins in a main routine {\em |
852 |
eesupp/src/main.F} that is owned by the WRAPPER. Control is transferred |
eesupp/src/main.F} that is owned by the WRAPPER. Control is transferred |
853 |
to the application through a routine called {\em THE\_MODEL\_MAIN()} |
to the application through a routine called {\em THE\_MODEL\_MAIN()} |
894 |
\end{figure} |
\end{figure} |
895 |
|
|
896 |
\subsubsection{Multi-threaded execution} |
\subsubsection{Multi-threaded execution} |
897 |
\label{sect:multi-threaded-execution} |
\label{sec:multi_threaded_execution} |
898 |
Prior to transferring control to the procedure {\em THE\_MODEL\_MAIN()} the |
Prior to transferring control to the procedure {\em THE\_MODEL\_MAIN()} the |
899 |
WRAPPER may cause several coarse grain threads to be initialized. The routine |
WRAPPER may cause several coarse grain threads to be initialized. The routine |
900 |
{\em THE\_MODEL\_MAIN()} is called once for each thread and is passed a single |
{\em THE\_MODEL\_MAIN()} is called once for each thread and is passed a single |
901 |
stack argument which is the thread number, stored in the |
stack argument which is the thread number, stored in the |
902 |
variable {\em myThid}. In addition to specifying a decomposition with |
variable {\em myThid}. In addition to specifying a decomposition with |
903 |
multiple tiles per process ( see section \ref{sect:specifying_a_decomposition}) |
multiple tiles per process ( see section \ref{sec:specifying_a_decomposition}) |
904 |
configuring and starting a code to run using multiple threads requires the following |
configuring and starting a code to run using multiple threads requires the following |
905 |
steps.\\ |
steps.\\ |
906 |
|
|
982 |
} \\ |
} \\ |
983 |
|
|
984 |
\subsubsection{Multi-process execution} |
\subsubsection{Multi-process execution} |
985 |
\label{sect:multi-process-execution} |
\label{sec:multi_process_execution} |
986 |
|
|
987 |
Despite its appealing programming model, multi-threaded execution |
Despite its appealing programming model, multi-threaded execution |
988 |
remains less common than multi-process execution. One major reason for |
remains less common than multi-process execution. One major reason for |
994 |
|
|
995 |
Multi-process execution is more ubiquitous. In order to run code in a |
Multi-process execution is more ubiquitous. In order to run code in a |
996 |
multi-process configuration a decomposition specification (see section |
multi-process configuration a decomposition specification (see section |
997 |
\ref{sect:specifying_a_decomposition}) is given (in which at least |
\ref{sec:specifying_a_decomposition}) is given (in which at least |
998 |
one of the parameters {\em nPx} or {\em nPy} will be greater than one) |
one of the parameters {\em nPx} or {\em nPy} will be greater than one) |
999 |
and then, as for multi-threaded operation, appropriate compile time |
and then, as for multi-threaded operation, appropriate compile time |
1000 |
and run time steps must be taken. |
and run time steps must be taken. |
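As a hedged illustration of what such a decomposition specification
implies, the sketch below does the bookkeeping in Python. It assumes
that the global grid extent in each direction is the product of the
tile size, the number of tiles per process and the number of
processes. The class \texttt{Decomposition} and its methods are
hypothetical helpers, not part of the WRAPPER, and the numerical
values are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decomposition:
    """Tile size, tiles per process and process counts in x and y."""
    sNx: int
    sNy: int
    nSx: int
    nSy: int
    nPx: int
    nPy: int

    def global_size(self):
        # Assumed relation: extent = tile size * tiles/process * processes.
        return (self.sNx * self.nSx * self.nPx,
                self.sNy * self.nSy * self.nPy)

    def process_count(self):
        return self.nPx * self.nPy

    def is_multi_process(self):
        # Multi-process runs have nPx or nPy greater than one.
        return self.nPx > 1 or self.nPy > 1

    def split_over(self, n_px, n_py):
        """Same grid and tile size, spread over n_px by n_py processes."""
        tiles_x = self.nSx * self.nPx
        tiles_y = self.nSy * self.nPy
        if tiles_x % n_px or tiles_y % n_py:
            raise ValueError("tiles do not divide evenly among processes")
        return Decomposition(self.sNx, self.sNy,
                             tiles_x // n_px, tiles_y // n_py, n_px, n_py)


single = Decomposition(sNx=32, sNy=32, nSx=6, nSy=1, nPx=1, nPy=1)
multi = single.split_over(6, 1)
print(single.global_size(), multi.global_size(), multi.process_count())
```

Under these assumptions, moving from the single-process to the
multi-process layout changes only how tiles are assigned to
processes. The tile size and the global grid stay the same.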
1014 |
ALLOW\_USE\_MPI} and {\em ALWAYS\_USE\_MPI} flags in the {\em |
ALLOW\_USE\_MPI} and {\em ALWAYS\_USE\_MPI} flags in the {\em |
1015 |
CPP\_EEOPTIONS.h} file.) More detailed information about the use of |
CPP\_EEOPTIONS.h} file.) More detailed information about the use of |
1016 |
{\em genmake2} for specifying |
{\em genmake2} for specifying |
1017 |
local compiler flags is located in section \ref{sect:genmake}.\\ |
local compiler flags is located in section \ref{sec:genmake}.\\ |
1018 |
|
|
1019 |
|
|
1020 |
\fbox{ |
\fbox{ |
1118 |
|
|
1119 |
|
|
1120 |
\subsection{Controlling communication} |
\subsection{Controlling communication} |
1121 |
|
\label{sec:controlling_communication} |
1122 |
The WRAPPER maintains internal information that is used for communication |
The WRAPPER maintains internal information that is used for communication |
1123 |
operations and that can be customized for different platforms. This section |
operations and that can be customized for different platforms. This section |
1124 |
describes the information that is held and used. |
describes the information that is held and used. |
1141 |
a particular face. A value of {\em COMM\_MSG} is used to indicate |
a particular face. A value of {\em COMM\_MSG} is used to indicate |
1142 |
that some form of distributed memory communication is required to |
that some form of distributed memory communication is required to |
1143 |
communicate between these tile faces (see section |
communicate between these tile faces (see section |
1144 |
\ref{sect:distributed_memory_communication}). A value of {\em |
\ref{sec:distributed_memory_communication}). A value of {\em |
1145 |
COMM\_PUT} or {\em COMM\_GET} is used to indicate forms of shared |
COMM\_PUT} or {\em COMM\_GET} is used to indicate forms of shared |
1146 |
memory communication (see section |
memory communication (see section |
1147 |
\ref{sect:shared_memory_communication}). The {\em COMM\_PUT} value |
\ref{sec:shared_memory_communication}). The {\em COMM\_PUT} value |
1148 |
indicates that a CPU should communicate by writing to data |
indicates that a CPU should communicate by writing to data |
1149 |
structures owned by another CPU. A {\em COMM\_GET} value indicates |
structures owned by another CPU. A {\em COMM\_GET} value indicates |
1150 |
that a CPU should communicate by reading from data structures owned |
that a CPU should communicate by reading from data structures owned |
1201 |
the file {\em eedata}. If the value of {\em nThreads} is |
the file {\em eedata}. If the value of {\em nThreads} is |
1202 |
inconsistent with the number of threads requested from the operating |
inconsistent with the number of threads requested from the operating |
1203 |
system (for example by using an environment variable as described in |
system (for example by using an environment variable as described in |
1204 |
section \ref{sect:multi_threaded_execution}) then usually an error |
section \ref{sec:multi_threaded_execution}) then usually an error |
1205 |
will be reported by the routine {\em CHECK\_THREADS}. |
will be reported by the routine {\em CHECK\_THREADS}. |
1206 |
|
|
1207 |
\fbox{ |
\fbox{ |
1218 |
} |
} |
1219 |
|
|
1220 |
\item {\bf memsync flags} |
\item {\bf memsync flags} |
1221 |
As discussed in section \ref{sect:memory_consistency}, a low-level |
As discussed in section \ref{sec:memory_consistency}, a low-level |
1222 |
system function may be needed to force memory consistency on some |
system function may be needed to force memory consistency on some |
1223 |
shared memory systems. The routine {\em MEMSYNC()} is used for this |
shared memory systems. The routine {\em MEMSYNC()} is used for this |
1224 |
purpose. This routine should not need modifying and the information |
purpose. This routine should not need modifying and the information |
1244 |
\end{verbatim} |
\end{verbatim} |
1245 |
|
|
1246 |
\item {\bf Cache line size} |
\item {\bf Cache line size} |
1247 |
As discussed in section \ref{sect:cache_effects_and_false_sharing}, |
As discussed in section \ref{sec:cache_effects_and_false_sharing}, |
1248 |
multi-threaded codes explicitly avoid penalties associated with |
multi-threaded codes explicitly avoid penalties associated with |
1249 |
excessive coherence traffic on an SMP system. To do this the shared |
excessive coherence traffic on an SMP system. To do this the shared |
1250 |
memory data structures used by the {\em GLOBAL\_SUM}, {\em |
memory data structures used by the {\em GLOBAL\_SUM}, {\em |
1274 |
CPP\_EEMACROS.h}. The \_GSUM macro is a performance critical |
CPP\_EEMACROS.h}. The \_GSUM macro is a performance critical |
1275 |
operation, especially for large processor count, small tile size |
operation, especially for large processor count, small tile size |
1276 |
configurations. The custom communication example discussed in |
configurations. The custom communication example discussed in |
1277 |
section \ref{sect:jam_example} shows how the macro is used to invoke |
section \ref{sec:jam_example} shows how the macro is used to invoke |
1278 |
a custom global sum routine for a specific set of hardware. |
a custom global sum routine for a specific set of hardware. |
1279 |
|
|
1280 |
\item {\bf \_EXCH} |
\item {\bf \_EXCH} |
1287 |
the header file {\em CPP\_EEMACROS.h}. As with \_GSUM, the \_EXCH |
the header file {\em CPP\_EEMACROS.h}. As with \_GSUM, the \_EXCH |
1288 |
operation plays a crucial role in scaling to small tile, large |
operation plays a crucial role in scaling to small tile, large |
1289 |
logical and physical processor count configurations. The example in |
logical and physical processor count configurations. The example in |
1290 |
section \ref{sect:jam_example} discusses defining an optimized and |
section \ref{sec:jam_example} discusses defining an optimized and |
1291 |
specialized form of the \_EXCH operation. |
specialized form of the \_EXCH operation. |
1292 |
|
|
1293 |
The \_EXCH operation is also central to supporting grids such as the |
The \_EXCH operation is also central to supporting grids such as the |
1328 |
if this mechanism is unavailable then the work arrays can be extended |
if this mechanism is unavailable then the work arrays can be extended |
1329 |
with dimensions using the tile dimensioning scheme of {\em nSx} and |
with dimensions using the tile dimensioning scheme of {\em nSx} and |
1330 |
{\em nSy} (as described in section |
{\em nSy} (as described in section |
1331 |
\ref{sect:specifying_a_decomposition}). However, if the |
\ref{sec:specifying_a_decomposition}). However, if the |
1332 |
configuration being specified involves many more tiles than OS |
configuration being specified involves many more tiles than OS |
1333 |
threads then it can save memory resources to reduce the variable |
threads then it can save memory resources to reduce the variable |
1334 |
{\em MAX\_NO\_THREADS} to be equal to the actual number of threads |
{\em MAX\_NO\_THREADS} to be equal to the actual number of threads |
1386 |
how it can be used to adapt to new gridding approaches. |
how it can be used to adapt to new gridding approaches. |
1387 |
|
|
1388 |
\subsubsection{JAM example} |
\subsubsection{JAM example} |
1389 |
\label{sect:jam_example} |
\label{sec:jam_example} |
1390 |
On some platforms a big performance boost can be obtained by binding |
On some platforms a big performance boost can be obtained by binding |
1391 |
the communication routines {\em \_EXCH} and {\em \_GSUM} to |
the communication routines {\em \_EXCH} and {\em \_GSUM} to |
1392 |
specialized native libraries (for example, the shmem library on CRAY |
specialized native libraries (for example, the shmem library on CRAY |
1410 |
pattern. |
pattern. |
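
As a rough illustration of this kind of binding, the sketch below shows
one way a {\em \_GSUM} style macro could be switched between a portable
routine and a platform specific one at compile time. It is not taken
from the MITgcm source: the flag {\em USE\_CUSTOM\_GSUM} and the routines
{\em GSUM\_PORTABLE} and {\em GSUM\_CUSTOM} are hypothetical names, both
routines are empty stand-ins, and the file is assumed to be passed
through the C preprocessor before compilation.

{\footnotesize
\begin{verbatim}
C     Hypothetical sketch (not MITgcm source): bind _GSUM to either a
C     portable or a platform specific routine at compile time.
C     USE_CUSTOM_GSUM, GSUM_PORTABLE and GSUM_CUSTOM are assumed names.
#ifdef USE_CUSTOM_GSUM
#define _GSUM(a,b) CALL GSUM_CUSTOM( a, b )
#else
#define _GSUM(a,b) CALL GSUM_PORTABLE( a, b )
#endif

      PROGRAM GSUM_BINDING_SKETCH
      IMPLICIT NONE
      REAL*8  sumPhi
      INTEGER myThid
      myThid = 1
      sumPhi = 1.5D0
C     Numerical code only ever refers to the macro.
      _GSUM( sumPhi, myThid )
      WRITE(*,*) 'global sum = ', sumPhi
      END

      SUBROUTINE GSUM_PORTABLE( sumPhi, myThid )
C     Stand-in for the portable reduction. A real routine would
C     combine contributions from all threads and processes.
      IMPLICIT NONE
      REAL*8  sumPhi
      INTEGER myThid
      RETURN
      END

      SUBROUTINE GSUM_CUSTOM( sumPhi, myThid )
C     Stand-in for a reduction built on a native library.
      IMPLICIT NONE
      REAL*8  sumPhi
      INTEGER myThid
      RETURN
      END
\end{verbatim}
}

Because the calling code refers only to the macro, switching between
the two routines requires no change to the numerical code itself. Only
the preprocessor flag differs between builds.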
1411 |
|
|
1412 |
\subsubsection{Cube sphere communication} |
\subsubsection{Cube sphere communication} |
1413 |
\label{sect:cube_sphere_communication} |
\label{sec:cube_sphere_communication} |
1414 |
Actual {\em \_EXCH} routine code is generated automatically from a |
Actual {\em \_EXCH} routine code is generated automatically from a |
1415 |
series of template files, for example {\em exch\_rx.template}. This |
series of template files, for example {\em exch\_rx.template}. This |
1416 |
is done to allow a large number of variations on the exchange process |
is done to allow a large number of variations on the exchange process |
1445 |
|
|
1446 |
Fitting together the WRAPPER elements, package elements and |
Fitting together the WRAPPER elements, package elements and |
1447 |
MITgcm core equation elements of the source code produces the calling |
MITgcm core equation elements of the source code produces the calling |
1448 |
sequence shown in section \ref{sect:calling_sequence}. |
sequence shown in section \ref{sec:calling_sequence}. |
1449 |
|
|
1450 |
\subsection{Annotated call tree for MITgcm and WRAPPER} |
\subsection{Annotated call tree for MITgcm and WRAPPER} |
1451 |
\label{sect:calling_sequence} |
\label{sec:calling_sequence} |
1452 |
|
|
1453 |
WRAPPER layer. |
WRAPPER layer. |
1454 |
|
|
1487 |
{\footnotesize |
{\footnotesize |
1488 |
\begin{verbatim} |
\begin{verbatim} |
1489 |
C |
C |
|
C |
|
1490 |
C Invocation from WRAPPER level... |
C Invocation from WRAPPER level... |
1491 |
C : |
C : |
1492 |
C : |
C : |
1550 |
C | | |-OPTIM_READPARMS :: Optimisation support package. see pkg/ctrl |
C | | |-OPTIM_READPARMS :: Optimisation support package. see pkg/ctrl |
1551 |
C | | |-GRDCHK_READPARMS :: Gradient check package. see pkg/grdchk |
C | | |-GRDCHK_READPARMS :: Gradient check package. see pkg/grdchk |
1552 |
C | | |-ECCO_READPARMS :: ECCO Support Package. see pkg/ecco |
C | | |-ECCO_READPARMS :: ECCO Support Package. see pkg/ecco |
1553 |
|
C | | |-PTRACERS_READPARMS :: multiple tracer package, see pkg/ptracers |
1554 |
|
C | | |-GCHEM_READPARMS :: tracer interface package, see pkg/gchem |
1555 |
C | | |
C | | |
1556 |
C | |-PACKAGES_CHECK |
C | |-PACKAGES_CHECK |
1557 |
C | | | |
C | | | |
1558 |
C | | |-KPP_CHECK :: KPP Package. pkg/kpp |
C | | |-KPP_CHECK :: KPP Package. pkg/kpp |
1559 |
C | | |-OBCS_CHECK :: Open bndy Package. pkg/obcs |
C | | |-OBCS_CHECK :: Open bndy Package. pkg/obcs |
1560 |
C | | |-GMREDI_CHECK :: GM Package. pkg/gmredi |
C | | |-GMREDI_CHECK :: GM Package. pkg/gmredi |
1561 |
C | | |
C | | |
1562 |
C | |-PACKAGES_INIT_FIXED |
C | |-PACKAGES_INIT_FIXED |
1563 |
C | | |-OBCS_INIT_FIXED :: Open bndy Package. see pkg/obcs |
C | | |-OBCS_INIT_FIXED :: Open bndy Package. see pkg/obcs |
1564 |
C | | |-FLT_INIT :: Floats Package. see pkg/flt |
C | | |-FLT_INIT :: Floats Package. see pkg/flt |
1565 |
|
C | | |-GCHEM_INIT_FIXED :: tracer interface package, see pkg/gchem |
1566 |
C | | |
C | | |
1567 |
C | |-ZONAL_FILT_INIT :: FFT filter Package. see pkg/zonal_filt |
C | |-ZONAL_FILT_INIT :: FFT filter Package. see pkg/zonal_filt |
1568 |
C | | |
C | | |
1569 |
C | |-INI_CG2D :: 2d con. grad solver initialisation. |
C | |-INI_CG2D :: 2d con. grad solver initialization. |
1570 |
C | | |
C | | |
1571 |
C | |-INI_CG3D :: 3d con. grad solver initialisation. |
C | |-INI_CG3D :: 3d con. grad solver initialization. |
1572 |
C | | |
C | | |
1573 |
C | |-CONFIG_SUMMARY :: Provide synopsis of kernel setup. |
C | |-CONFIG_SUMMARY :: Provide synopsis of kernel setup. |
1574 |
C | :: Includes annotated table of kernel |
C | :: Includes annotated table of kernel |
1593 |
C | | |-INI_CORI :: Set coriolis term. zero, f-plane, beta-plane, |
C | | |-INI_CORI :: Set coriolis term. zero, f-plane, beta-plane, |
1594 |
C | | | :: sphere options are coded. |
C | | | :: sphere options are coded. |
1595 |
C | | | |
C | | | |
1596 |
C | | |-INI_CG2D :: 2d con. grad solver initialisation. |
C | | |-INI_CG2D :: 2d con. grad solver initialization. |
1597 |
C | | |-INI_CG3D :: 3d con. grad solver initialisation. |
C | | |-INI_CG3D :: 3d con. grad solver initialization. |
1598 |
C | | |-INI_MIXING :: Initialise diapycnal diffusivity. |
C | | |-INI_MIXING :: Initialize diapycnal diffusivity. |
1599 |
C | | |-INI_DYNVARS :: Initialise to zero all DYNVARS.h arrays (dynamical |
C | | |-INI_DYNVARS :: Initialize to zero all DYNVARS.h arrays (dynamical |
1600 |
C | | | :: fields). |
C | | | :: fields). |
1601 |
C | | | |
C | | | |
1602 |
C | | |-INI_FIELDS :: Control initializing model fields to non-zero |
C | | |-INI_FIELDS :: Control initializing model fields to non-zero |
1604 |
C | | | |-INI_THETA :: Set model initial temperature field. |
C | | | |-INI_THETA :: Set model initial temperature field. |
1605 |
C | | | |-INI_SALT :: Set model initial salinity field. |
C | | | |-INI_SALT :: Set model initial salinity field. |
1606 |
C | | | |-INI_PSURF :: Set model initial free-surface height/pressure. |
C | | | |-INI_PSURF :: Set model initial free-surface height/pressure. |
1607 |
C | | | |
C | | | |-INI_PRESSURE :: Compute model initial hydrostatic pressure |
1608 |
C | | |-INI_TR1 :: Set initial tracer 1 distribution. |
C | | | |-READ_CHECKPOINT :: Read the checkpoint |
1609 |
C | | | |
C | | | |
1610 |
C | | |-THE_CORRECTION_STEP :: Step forward to next time step. |
C | | |-THE_CORRECTION_STEP :: Step forward to next time step. |
1611 |
C | | | | :: Here applied to move restart conditions |
C | | | | :: Here applied to move restart conditions |
1632 |
C | | | |-CONVECT :: Mix static instability. |
C | | | |-CONVECT :: Mix static instability. |
1633 |
C | | | |-TIMEAVE_CUMULATE :: Update convection statistics. |
C | | | |-TIMEAVE_CUMULATE :: Update convection statistics. |
1634 |
C | | | |
C | | | |
1635 |
C | | |-PACKAGES_INIT_VARIABLES :: Does initialisation of time evolving |
C | | |-PACKAGES_INIT_VARIABLES :: Does initialization of time evolving |
1636 |
C | | | | :: package data. |
C | | | | :: package data. |
1637 |
C | | | | |
C | | | | |
1638 |
C | | | |-GMREDI_INIT :: GM package. ( see pkg/gmredi ) |
C | | | |-GMREDI_INIT :: GM package. ( see pkg/gmredi ) |
1639 |
C | | | |-KPP_INIT :: KPP package. ( see pkg/kpp ) |
C | | | |-KPP_INIT :: KPP package. ( see pkg/kpp ) |
1640 |
C | | | |-KPP_OPEN_DIAGS |
C | | | |-KPP_OPEN_DIAGS |
1641 |
C | | | |-OBCS_INIT_VARIABLES :: Open bndy. package. ( see pkg/obcs ) |
C | | | |-OBCS_INIT_VARIABLES :: Open bndy. package. ( see pkg/obcs ) |
1642 |
|
C | | | |-PTRACERS_INIT :: multi. tracer package,(see pkg/ptracers) |
1643 |
|
C | | | |-GCHEM_INIT :: tracer interface pkg (see pkg/gchem) |
1644 |
C | | | |-AIM_INIT :: Interm. atmos package. ( see pkg/aim ) |
C | | | |-AIM_INIT :: Interm. atmos package. ( see pkg/aim ) |
1645 |
C | | | |-CTRL_MAP_INI :: Control vector package.( see pkg/ctrl ) |
C | | | |-CTRL_MAP_INI :: Control vector package.( see pkg/ctrl ) |
1646 |
C | | | |-COST_INIT :: Cost function package. ( see pkg/cost ) |
C | | | |-COST_INIT :: Cost function package. ( see pkg/cost ) |
1683 |
C/\ | | | | :: for forcing datasets. |
C/\ | | | | :: for forcing datasets. |
1684 |
C/\ | | | | |
C/\ | | | | |
1685 |
C/\ | | | |-EXCH :: Sync forcing. in overlap regions. |
C/\ | | | |-EXCH :: Sync forcing. in overlap regions. |
1686 |
|
C/\ | | |-SEAICE_MODEL :: Compute sea-ice terms. ( pkg/seaice ) |
1687 |
|
C/\ | | |-FREEZE :: Limit surface temperature. |
1688 |
|
C/\ | | |-GCHEM_FIELD_LOAD :: load tracer forcing fields (pkg/gchem) |
1689 |
C/\ | | | |
C/\ | | | |
1690 |
C/\ | | |-THERMODYNAMICS :: theta, salt + tracer equations driver. |
C/\ | | |-THERMODYNAMICS :: theta, salt + tracer equations driver. |
1691 |
C/\ | | | | |
C/\ | | | | |
1692 |
C/\ | | | |-INTEGRATE_FOR_W :: Integrate for vertical velocity. |
C/\ | | | |-INTEGRATE_FOR_W :: Integrate for vertical velocity. |
1693 |
C/\ | | | |-OBCS_APPLY_W :: Open bndy. package ( see pkg/obcs ). |
C/\ | | | |-OBCS_APPLY_W :: Open bndy. package ( see pkg/obcs ). |
1694 |
C/\ | | | |-FIND_RHO :: Calculates [rho(S,T,z)-Rhonil] of a slice |
C/\ | | | |-FIND_RHO :: Calculates [rho(S,T,z)-RhoConst] of a slice |
1695 |
C/\ | | | |-GRAD_SIGMA :: Calculate isoneutral gradients |
C/\ | | | |-GRAD_SIGMA :: Calculate isoneutral gradients |
1696 |
C/\ | | | |-CALC_IVDC :: Set Implicit Vertical Diffusivity for Convection |
C/\ | | | |-CALC_IVDC :: Set Implicit Vertical Diffusivity for Convection |
1697 |
C/\ | | | | |
C/\ | | | | |
1698 |
C/\ | | | |-OBCS_CALC :: Open bndy. package ( see pkg/obcs ). |
C/\ | | | |-OBCS_CALC :: Open bndy. package ( see pkg/obcs ). |
1699 |
C/\ | | | |-EXTERNAL_FORCING_SURF:: Accumulates appropriately dimensioned |
C/\ | | | |-EXTERNAL_FORCING_SURF:: Accumulates appropriately dimensioned |
1700 |
C/\ | | | | :: forcing terms. |
C/\ | | | | | :: forcing terms. |
1701 |
|
C/\ | | | | |-PTRACERS_FORCING_SURF :: Tracer package ( see pkg/ptracers ). |
1702 |
C/\ | | | | |
C/\ | | | | |
1703 |
C/\ | | | |-GMREDI_CALC_TENSOR :: GM package ( see pkg/gmredi ). |
C/\ | | | |-GMREDI_CALC_TENSOR :: GM package ( see pkg/gmredi ). |
1704 |
C/\ | | | |-GMREDI_CALC_TENSOR_DUMMY :: GM package ( see pkg/gmredi ). |
C/\ | | | |-GMREDI_CALC_TENSOR_DUMMY :: GM package ( see pkg/gmredi ). |
1716 |
C/\ | | | |-CALC_GT :: Calculate the temperature tendency terms |
C/\ | | | |-CALC_GT :: Calculate the temperature tendency terms |
1717 |
C/\ | | | | | |
C/\ | | | | | |
1718 |
C/\ | | | | |-GAD_CALC_RHS :: Generalised advection package |
C/\ | | | | |-GAD_CALC_RHS :: Generalised advection package |
1719 |
C/\ | | | | | :: ( see pkg/gad ) |
C/\ | | | | | | :: ( see pkg/gad ) |
1720 |
|
C/\ | | | | | |-KPP_TRANSPORT_T :: KPP non-local transport ( see pkg/kpp ). |
1721 |
|
C/\ | | | | | |
1722 |
C/\ | | | | |-EXTERNAL_FORCING_T :: Problem specific forcing for temperature. |
C/\ | | | | |-EXTERNAL_FORCING_T :: Problem specific forcing for temperature. |
1723 |
C/\ | | | | |-ADAMS_BASHFORTH2 :: Extrapolate tendencies forward in time. |
C/\ | | | | |-ADAMS_BASHFORTH2 :: Extrapolate tendencies forward in time. |
1724 |
C/\ | | | | |-FREESURF_RESCALE_G :: Re-scale Gt for free-surface height. |
C/\ | | | | |-FREESURF_RESCALE_G :: Re-scale Gt for free-surface height. |
1728 |
C/\ | | | |-CALC_GS :: Calculate the salinity tendency terms |
C/\ | | | |-CALC_GS :: Calculate the salinity tendency terms |
1729 |
C/\ | | | | | |
C/\ | | | | | |
1730 |
C/\ | | | | |-GAD_CALC_RHS :: Generalised advection package |
C/\ | | | | |-GAD_CALC_RHS :: Generalised advection package |
1731 |
C/\ | | | | | :: ( see pkg/gad ) |
C/\ | | | | | | :: ( see pkg/gad ) |
1732 |
|
C/\ | | | | | |-KPP_TRANSPORT_S :: KPP non-local transport ( see pkg/kpp ). |
1733 |
|
C/\ | | | | | |
1734 |
C/\ | | | | |-EXTERNAL_FORCING_S :: Problem specific forcing for salt. |
C/\ | | | | |-EXTERNAL_FORCING_S :: Problem specific forcing for salt. |
1735 |
C/\ | | | | |-ADAMS_BASHFORTH2 :: Extrapolate tendencies forward in time. |
C/\ | | | | |-ADAMS_BASHFORTH2 :: Extrapolate tendencies forward in time. |
1736 |
C/\ | | | | |-FREESURF_RESCALE_G :: Re-scale Gs for free-surface height. |
C/\ | | | | |-FREESURF_RESCALE_G :: Re-scale Gs for free-surface height. |
1737 |
C/\ | | | | |
C/\ | | | | |
1738 |
C/\ | | | |-TIMESTEP_TRACER :: Step tracer field forward in time |
C/\ | | | |-TIMESTEP_TRACER :: Step tracer field forward in time |
1739 |
C/\ | | | | |
C/\ | | | | |
1740 |
C/\ | | | |-CALC_GTR1 :: Calculate other tracer(s) tendency terms |
C/\ | | | |-TIMESTEP_TRACER :: Step tracer field forward in time |
1741 |
|
C/\ | | | | |
1742 |
|
C/\ | | | |-PTRACERS_INTEGRATE :: Integrate other tracer(s) (see pkg/ptracers). |
1743 |
C/\ | | | | | |
C/\ | | | | | |
1744 |
C/\ | | | | |-GAD_CALC_RHS :: Generalised advection package |
C/\ | | | | |-GAD_CALC_RHS :: Generalised advection package |
1745 |
C/\ | | | | | :: ( see pkg/gad ) |
C/\ | | | | | | :: ( see pkg/gad ) |
1746 |
C/\ | | | | |-EXTERNAL_FORCING_TR:: Problem specific forcing for tracer. |
C/\ | | | | | |-KPP_TRANSPORT_PTR:: KPP non-local transport ( see pkg/kpp ). |
1747 |
|
C/\ | | | | | |
1748 |
|
C/\ | | | | |-PTRACERS_FORCING :: Problem specific forcing for tracer. |
1749 |
|
C/\ | | | | |-GCHEM_FORCING_INT :: tracer forcing for gchem pkg (if all |
1750 |
|
C/\ | | | | | tendency terms calculated together) |
1751 |
C/\ | | | | |-ADAMS_BASHFORTH2 :: Extrapolate tendencies forward in time. |
C/\ | | | | |-ADAMS_BASHFORTH2 :: Extrapolate tendencies forward in time. |
1752 |
C/\ | | | | |-FREESURF_RESCALE_G :: Re-scale Gs for free-surface height. |
C/\ | | | | |-FREESURF_RESCALE_G :: Re-scale Gs for free-surface height. |
1753 |
|
C/\ | | | | |-TIMESTEP_TRACER :: Step tracer field forward in time |
1754 |
C/\ | | | | |
C/\ | | | | |
|
C/\ | | | |-TIMESTEP_TRACER :: Step tracer field forward in time |
|
1755 |
C/\ | | | |-OBCS_APPLY_TS :: Open bndy. package (see pkg/obcs ). |
C/\ | | | |-OBCS_APPLY_TS :: Open bndy. package (see pkg/obcs ). |
|
C/\ | | | |-FREEZE :: Limit range of temperature. |
|
1756 |
C/\ | | | | |
C/\ | | | | |
1757 |
C/\ | | | |-IMPLDIFF :: Solve vertical implicit diffusion equation. |
C/\ | | | |-IMPLDIFF :: Solve vertical implicit diffusion equation. |
1758 |
C/\ | | | |-OBCS_APPLY_TS :: Open bndy. package (see pkg/obcs ). |
C/\ | | | |-OBCS_APPLY_TS :: Open bndy. package (see pkg/obcs ). |
1811 |
C/\ | | |-DO_FIELDS_BLOCKING_EXCHANGES :: Sync up overlap regions. |
C/\ | | |-DO_FIELDS_BLOCKING_EXCHANGES :: Sync up overlap regions. |
1812 |
C/\ | | | |-EXCH |
C/\ | | | |-EXCH |
1813 |
C/\ | | | |
C/\ | | | |
1814 |
|
C/\ | | |-GCHEM_FORCING_SEP :: tracer forcing for gchem pkg (if |
1815 |
|
C/\ | | | tracer dependent tendencies calculated |
1816 |
|
C/\ | | | separately) |
1817 |
|
C/\ | | | |
1818 |
C/\ | | |-FLT_MAIN :: Float package ( pkg/flt ). |
C/\ | | |-FLT_MAIN :: Float package ( pkg/flt ). |
1819 |
C/\ | | | |
C/\ | | | |
1820 |
C/\ | | |-MONITOR :: Monitor package ( pkg/monitor ). |
C/\ | | |-MONITOR :: Monitor package ( pkg/monitor ). |
1825 |
C/\ | | | |-AIM_WRITE_DIAGS :: Intermed. atmos diags. see pkg/aim |
C/\ | | | |-AIM_WRITE_DIAGS :: Intermed. atmos diags. see pkg/aim |
1826 |
C/\ | | | |-GMREDI_DIAGS :: GM diags. see pkg/gmredi |
C/\ | | | |-GMREDI_DIAGS :: GM diags. see pkg/gmredi |
1827 |
C/\ | | | |-KPP_DO_DIAGS :: KPP diags. see pkg/kpp |
C/\ | | | |-KPP_DO_DIAGS :: KPP diags. see pkg/kpp |
1828 |
|
C/\ | | | |-SBO_CALC :: SBO diags. see pkg/sbo |
1829 |
|
C/\ | | | |-SBO_DIAGS :: SBO diags. see pkg/sbo |
1830 |
|
C/\ | | | |-SEAICE_DO_DIAGS :: SEAICE diags. see pkg/seaice |
1831 |
|
C/\ | | | |-GCHEM_DIAGS :: gchem diags. see pkg/gchem |
1832 |
C/\ | | | |
C/\ | | | |
1833 |
C/\ | | |-WRITE_CHECKPOINT :: Do I/O for restart files. |
C/\ | | |-WRITE_CHECKPOINT :: Do I/O for restart files. |
1834 |
C/\ | | |
C/\ | | |
1846 |
C | |
C | |
1847 |
C |-COMM_STATS :: Summarise inter-proc and inter-thread communication |
C |-COMM_STATS :: Summarise inter-proc and inter-thread communication |
1848 |
C :: events. |
C :: events. |
1849 |
C |
C |
1850 |
\end{verbatim} |
\end{verbatim} |
1851 |
} |
} |
1852 |
|
|