% $Header$

This chapter focuses on describing the {\bf WRAPPER} environment within which
both the core numerics and the pluggable packages operate. The description
presented here is intended to be a detailed exposition and contains significant
background material, as well as advanced details on working with the WRAPPER.
The tutorial sections of this manual (see sections
\ref{sect:tutorials} and \ref{sect:tutorialIII})
contain more succinct, step-by-step instructions on running basic numerical
experiments of various types, both sequentially and in parallel. For many
projects, simply starting from an example code and adapting it to suit a
particular situation will be all that is required.

The first part of this chapter discusses the MITgcm architecture at an
abstract level. In the second part of the chapter we describe practical
details of the MITgcm implementation and of current tools and operating system
features that are employed.

\section{Overall architectural goals}

\begin{enumerate}
\item A core set of numerical and support code. This is discussed in detail in
section \ref{sect:partII}.
\item A scheme for supporting optional ``pluggable'' {\bf packages} (containing
for example mixed-layer schemes, biogeochemical schemes, atmospheric physics).
These packages are used both to overlay alternate dynamics and to introduce
to ``fit'' within the WRAPPER infrastructure. Writing code to ``fit'' within
the WRAPPER means that coding has to follow certain, relatively
straightforward, rules and conventions (these are discussed further in
section \ref{sect:specifying_a_decomposition}).

The approach taken by the WRAPPER is illustrated in figure
\ref{fig:fit_in_wrapper}, which shows how the WRAPPER serves to insulate code
\end{figure}

\subsection{Target hardware}
\label{sect:target_hardware}

The WRAPPER is designed to target as broad a range of computer
systems as possible. The original development of the WRAPPER took place on a

\subsection{Supporting hardware neutrality}

The different systems listed in section \ref{sect:target_hardware} can be
categorized in many different ways. For example, one common distinction is
between shared-memory parallel systems (SMPs, PVPs) and distributed-memory
parallel systems (for example x86 clusters and large MPP systems). This is one
whenever it requires values that lie outside the domain it owns. Periodically
processors will make calls to WRAPPER functions to communicate data between
tiles, in order to keep the overlap regions up to date (see section
\ref{sect:communication_primitives}). The WRAPPER functions can use a
variety of different mechanisms to communicate data between tiles.

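From the point of view of application code this step is very simple: once the
interior of a tile has been updated, a single call asks the WRAPPER to bring
the overlap regions up to date. The short sketch below illustrates the idea;
the routine name and declarations are representative assumptions rather than a
definitive interface.

{\footnotesize
\begin{verbatim}
C     Sketch only: refresh the overlap (halo) regions of a tiled field.
C     The exchange routine name and argument list are representative of
C     the WRAPPER exchange routines, not a definitive interface.
      INTEGER sNx, sNy, OLx, OLy, nSx, nSy
      PARAMETER ( sNx=32, sNy=32, OLx=2, OLy=2, nSx=2, nSy=2 )
      REAL*8 phi( 1-OLx:sNx+OLx, 1-OLy:sNy+OLy, nSx, nSy )
      INTEGER myThid

C     ... each thread updates the interior of the tiles it owns ...

C     Ask the WRAPPER to bring the overlap regions up to date, using
C     whatever mechanism (threads, message passing, ...) is configured.
      CALL EXCH_XY_RL( phi, myThid )
\end{verbatim}
}
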
\begin{figure}
\end{figure}

\subsection{Shared memory communication}
\label{sect:shared_memory_communication}

Under shared memory communication, independent CPUs operate
on the exact same global address space at the application level.
communication very efficient provided it is used appropriately.

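The flavor of this model is sketched below: threads exchange information simply
by writing to and reading from storage they all share, with a synchronization
point separating the writes from the reads. The variable and routine names in
the sketch are illustrative assumptions.

{\footnotesize
\begin{verbatim}
C     Sketch only: threads communicating through a shared address space.
C     sharedBuf sits in a COMMON block, so every thread sees the same
C     storage; the barrier call separates the writes from the reads.
      INTEGER MAX_THREADS
      PARAMETER ( MAX_THREADS = 4 )
      REAL*8 sharedBuf( MAX_THREADS )
      COMMON / SKETCH_SHARED / sharedBuf
      INTEGER myThid
      REAL*8 neighborVal

C     Each thread publishes a value in its own slot.
      sharedBuf( myThid ) = DBLE( myThid )

C     Wait until every thread has written before anyone reads.
      CALL BARRIER( myThid )

C     Any thread can now read another thread's slot directly.
      neighborVal = sharedBuf( 1 )
\end{verbatim}
}
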
\subsubsection{Memory consistency}
\label{sect:memory_consistency}

When using shared memory communication between
multiple processors, the WRAPPER level shields user applications from
ensure memory consistency for a particular platform.

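The sort of problem being guarded against is sketched below. One thread fills a
buffer and then raises a flag; unless memory is explicitly synchronized between
the two writes, another CPU may observe the flag before the data. The variable
names are assumptions; the synchronization call shown corresponds to the
{\em MEMSYNC()} routine described later in this chapter.

{\footnotesize
\begin{verbatim}
C     Sketch only: producer side of a shared memory hand-off.
C     On a weakly ordered machine the flag could become visible to
C     another CPU before the data unless memory is synchronized.
      REAL*8 sharedData( 100 )
      INTEGER dataReady
      COMMON / SKETCH_HANDOFF / sharedData, dataReady

      sharedData( 1 ) = 42.0D0
C     Make the data writes visible to other CPUs first ...
      CALL MEMSYNC
C     ... and only then advertise that the data may be read.
      dataReady = 1
\end{verbatim}
}
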
\subsubsection{Cache effects and false sharing}
\label{sect:cache_effects_and_false_sharing}

Shared-memory machines often have processor-local memory caches
which contain mirrored copies of main memory. Automatic cache-coherence
threads operating within a single process is the standard mechanism for
supporting shared memory that the WRAPPER utilizes. Configuring and launching
code to run in multi-threaded mode on specific platforms is discussed in
section \ref{sect:running_with_threads}. However, on many systems, potentially
very efficient mechanisms for using shared memory communication between
multiple processes (in contrast to multiple threads within a single
process) also exist. In most cases this works by making a limited region of
nature.

\subsection{Distributed memory communication}
\label{sect:distributed_memory_communication}
Many parallel systems are not constructed in a way that makes it
possible or practical for an application to use shared memory
for communication. For example, cluster systems consist of individual computers
highly optimized library.

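In this mode every transfer of overlap data is an explicit message. The
fragment below sketches the idea using plain MPI calls between neighboring
processes; it is a minimal illustration of distributed memory communication in
general, not the WRAPPER's own communication layer.

{\footnotesize
\begin{verbatim}
      PROGRAM HALO_SKETCH
C     Sketch only: each process owns its data and overlap values
C     travel as explicit messages between neighboring processes.
      IMPLICIT NONE
      INCLUDE 'mpif.h'
      INTEGER ierr, myRank, nProcs, rightRank, leftRank
      INTEGER status( MPI_STATUS_SIZE )
      REAL*8 sendEdge( 32 ), recvEdge( 32 )

      CALL MPI_INIT( ierr )
      CALL MPI_COMM_RANK( MPI_COMM_WORLD, myRank, ierr )
      CALL MPI_COMM_SIZE( MPI_COMM_WORLD, nProcs, ierr )
      rightRank = MOD( myRank+1, nProcs )
      leftRank  = MOD( myRank-1+nProcs, nProcs )
      sendEdge( 1 ) = DBLE( myRank )

C     Send the owned edge to the right neighbor while receiving the
C     matching overlap data from the left neighbor.
      CALL MPI_SENDRECV(
     &     sendEdge, 32, MPI_DOUBLE_PRECISION, rightRank, 0,
     &     recvEdge, 32, MPI_DOUBLE_PRECISION, leftRank,  0,
     &     MPI_COMM_WORLD, status, ierr )

      CALL MPI_FINALIZE( ierr )
      END
\end{verbatim}
}
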
\subsection{Communication primitives}
\label{sect:communication_primitives}

\begin{figure}
\begin{center}
computing CPUs.
\end{enumerate}
This section describes the details of each of these operations.
Section \ref{sect:specifying_a_decomposition} explains how the way in which
a domain is decomposed (or composed) is expressed. Section
\ref{sect:starting_a_code} describes practical details of running codes
in various parallel modes on contemporary computer systems.
Section \ref{sect:controlling_communication} explains the internal information
that the WRAPPER uses to control how information is communicated between
tiles.

\subsection{Specifying a domain decomposition}
\label{sect:specifying_a_decomposition}

At its heart much of the WRAPPER works only in terms of a collection of tiles
which are interconnected to each other. This is also true of application
dimensions of {\em sNx} and {\em sNy}. If, when the code is executed, these tiles are
allocated to different threads of a process that are then bound to
different physical processors (see the multi-threaded
execution discussion in section \ref{sect:starting_the_code}), then
computation will be performed concurrently on each tile. However, it is also
possible to run the same decomposition within a process running a single thread on
a single processor. In this case the tiles will be computed sequentially.
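
Such a decomposition is specified through a small set of integer parameters, in
the style of the model's {\em SIZE.h} parameter block. The fragment below is an
illustrative sketch (not a verbatim copy of any particular configuration) in
which a single process holds six tiles of $32 \times 32$ points each:

{\footnotesize
\begin{verbatim}
C     Sketch only: six 32 x 32 tiles in one process, overlap width 2.
      INTEGER sNx, sNy, OLx, OLy, nSx, nSy, nPx, nPy, Nx, Ny
      PARAMETER (
     &           sNx =  32,
     &           sNy =  32,
     &           OLx =   2,
     &           OLy =   2,
     &           nSx =   6,
     &           nSy =   1,
     &           nPx =   1,
     &           nPy =   1,
     &           Nx  = sNx*nSx*nPx,
     &           Ny  = sNy*nSy*nPy )
\end{verbatim}
}
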
This set of values can be used for a cube sphere calculation.
Each tile of size $32 \times 32$ represents a face of the
cube. Initializing the tile connectivity correctly (see section
\ref{sect:cube_sphere_communication}) allows the rotations associated with
moving between the six cube faces to be embedded within the
tile-tile communication code.
\end{enumerate}


\subsection{Starting the code}
\label{sect:starting_the_code}
When code is started under the WRAPPER, execution begins in a main routine {\em
eesupp/src/main.F} that is owned by the WRAPPER. Control is transferred
to the application through a routine called {\em THE\_MODEL\_MAIN()}
WRAPPER is shown in figure \ref{fig:wrapper_startup}.

\begin{figure}
{\footnotesize
\begin{verbatim}

MAIN


\end{verbatim}
}
\caption{Main stages of the WRAPPER startup procedure.
This process precedes transfer of control to application code, which
occurs through the procedure {\em THE\_MODEL\_MAIN()}.
\end{figure}
|
|
\subsubsection{Multi-threaded execution}
\label{sect:multi-threaded-execution}
Prior to transferring control to the procedure {\em THE\_MODEL\_MAIN()} the
WRAPPER may cause several coarse-grain threads to be initialized. The routine
{\em THE\_MODEL\_MAIN()} is called once for each thread and is passed a single
stack argument which is the thread number, stored in the
variable {\em myThid}. In addition to specifying a decomposition with
multiple tiles per process (see section \ref{sect:specifying_a_decomposition}),
configuring and starting a code to run using multiple threads requires the following
steps.\\

} \\

\subsubsection{Multi-process execution}
\label{sect:multi-process-execution}

Despite its appealing programming model, multi-threaded execution remains
less common than multi-process execution. One major reason for this

Multi-process execution is more widely used.
In order to run code in a multi-process configuration a decomposition
specification (see section \ref{sect:specifying_a_decomposition})
is given (in which at least one of the
parameters {\em nPx} or {\em nPy} will be greater than one)
and then, as for multi-threaded operation,
neighbor to communicate with on a particular face. A value
of {\em COMM\_MSG} is used to indicate that some form of distributed
memory communication is required to communicate between
these tile faces (see section \ref{sect:distributed_memory_communication}).
A value of {\em COMM\_PUT} or {\em COMM\_GET} is used to indicate
forms of shared memory communication (see section
\ref{sect:shared_memory_communication}). The {\em COMM\_PUT} value indicates
that a CPU should communicate by writing to data structures owned by another
CPU. A {\em COMM\_GET} value indicates that a CPU should communicate by reading
from data structures owned by another CPU. These flags affect the behavior
are read from the file {\em eedata}. If the value of {\em nThreads}
is inconsistent with the number of threads requested from the
operating system (for example by using an environment
variable as described in section \ref{sect:multi-threaded-execution}),
then usually an error will be reported by the routine
{\em CHECK\_THREADS}.\\

}

\item {\bf memsync flags}
As discussed in section \ref{sect:memory_consistency}, when using shared memory,
a low-level system function may be needed to force memory consistency.
The routine {\em MEMSYNC()} is used for this purpose. This routine should
not need modifying and the information below is only provided for
\end{verbatim}

\item {\bf Cache line size}
As discussed in section \ref{sect:cache_effects_and_false_sharing},
multi-threaded codes explicitly avoid penalties associated with excessive
coherence traffic on an SMP system. To do this the shared memory data structures
used by the {\em GLOBAL\_SUM}, {\em GLOBAL\_MAX} and {\em BARRIER} routines
setting for the \_GSUM macro is given in the file {\em CPP\_EEMACROS.h}.
The \_GSUM macro is a performance-critical operation, especially for
large processor count, small tile size configurations.
The custom communication example discussed in section \ref{sect:jam_example}
shows how the macro is used to invoke a custom global sum routine
for a specific set of hardware.

in the header file {\em CPP\_EEMACROS.h}. As with \_GSUM, the
\_EXCH operation plays a crucial role in scaling to small tile,
large logical and physical processor count configurations.
The example in section \ref{sect:jam_example} discusses defining an
optimized and specialized form of the \_EXCH operation.

The \_EXCH operation is also central to supporting grids such as
if this might be unavailable then the work arrays can be extended
with dimensions using the tile dimensioning scheme of {\em nSx}
and {\em nSy} (as described in section
\ref{sect:specifying_a_decomposition}). However, if the configuration
being specified involves many more tiles than OS threads then
it can save memory resources to reduce the variable
{\em MAX\_NO\_THREADS} to be equal to the actual number of threads that
how it can be used to adapt to new gridding approaches.

\subsubsection{JAM example}
\label{sect:jam_example}
On some platforms a big performance boost can be obtained by
binding the communication routines {\em \_EXCH} and
{\em \_GSUM} to specialized native libraries (for example the
pattern.

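The binding is done through the preprocessor macro layer: the macro is simply
pointed at a platform-specific routine when a suitable CPP flag is defined. The
fragment below sketches the idea; the flag and routine names used here are
illustrative assumptions, not the definitive contents of {\em CPP\_EEMACROS.h}.

{\footnotesize
\begin{verbatim}
C     Sketch only: redirecting the global-sum macro to a routine
C     backed by a specialized communication library.
#ifdef USE_CUSTOM_COMMS
#define _GSUM(a,b) CALL GLOBAL_SUM_R8_CUSTOM ( a, b )
#else
#define _GSUM(a,b) CALL GLOBAL_SUM_R8 ( a, b )
#endif
\end{verbatim}
}
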
\subsubsection{Cube sphere communication}
\label{sect:cube_sphere_communication}
Actual {\em \_EXCH} routine code is generated automatically from
a series of template files, for example {\em exch\_rx.template}.
This is done to allow a large number of variations on the exchange

Fitting together the WRAPPER elements, package elements and
MITgcm core equation elements of the source code produces the calling
sequence shown in section \ref{sect:calling_sequence}.

\subsection{Annotated call tree for MITgcm and WRAPPER}
\label{sect:calling_sequence}

WRAPPER layer.

{\footnotesize
\begin{verbatim}

MAIN
|--THE_MODEL_MAIN :: Numerical code top-level driver routine

\end{verbatim}
}

Core equations plus packages.

{\footnotesize
\begin{verbatim}
C
C
C :: events.
C
\end{verbatim}
}

\subsection{Measuring and Characterizing Performance}
