Diff of /manual/s_autodiff/text/doc_ad_2.tex

-revision 1.1.1.1 by adcroft,
Wed Aug  8 16:16:26 2001 UTC
+revision 1.7 by cnh,
Thu Oct 25 18:36:55 2001 UTC
 Line 18 
 In principle, a variety of derived algor
  can be generated automatically in this way.
  The MITGCM has been adapted for use with the
- Tangent linear and Adjoint Model Compiler (TAMC) and its succssor TAF
+ Tangent linear and Adjoint Model Compiler (TAMC) and its successor TAF
  (Transformation of Algorithms in Fortran), developed
  by Ralf Giering (\cite{gie-kam:98}, \cite{gie:99,gie:00}).
- The first application of the adjoint of the MITGCM for senistivity
+ The first application of the adjoint of the MITGCM for sensitivity
  studies has been published by \cite{maro-eta:99}.
  \cite{sta-eta:97,sta-eta:01} use the MITGCM and its adjoint
  for ocean state estimation studies.
+ In the following we shall refer to TAMC and TAF synonymously,
+ except where explicitly stated otherwise.
  TAMC exploits the chain rule for computing the first
  derivative of a function with
  respect to a set of input variables.
  Treating a given forward code as a composition of operations --
- each line representing a compositional element -- the chain rule is
+ each line representing a compositional element, the chain rule is
  rigorously applied to the code, line by line. The resulting
  tangent linear or adjoint code,
  then, may be thought of as the composition in
  forward or reverse order, respectively, of the
- Jacobian matrices of the forward code compositional elements.
+ Jacobian matrices of the forward code's compositional elements.
  %**********************************************************************
  \section{Some basic algebra}
-Line 50 
 $\vec{u}=(u_1,\ldots,u_m)$
+Line 52 
 $\vec{u}=(u_1,\ldots,u_m)$
  such as forcing functions) to the $n$-dimensional space
  $V \subset I\!\!R^n$ of
  model output variable $\vec{v}=(v_1,\ldots,v_n)$
- (model state, model diagnostcs, objective function, ...)
+ (model state, model diagnostics, objective function, ...)
  under consideration,
  %
  \begin{equation}
-Line 105 
 In contrast to the full nonlinear model
+Line 107 
 In contrast to the full nonlinear model
  $ M $ is just a matrix
  which can readily be used to find the forward sensitivity of $\vec{v}$ to
  perturbations in  $u$,
- but if there are very many input variables $(>>O(10^{6})$ for
+ but if there are very many input variables $(\gg O(10^{6})$ for
  large-scale oceanographic application), it quickly becomes
  prohibitive to proceed directly as in (\ref{tangent_linear}),
  if the impact of each component $ {\bf e_{i}} $ is to be assessed.
-Line 130 
 or a measure of some model-to-data misfi
+Line 132 
 or a measure of some model-to-data misfi
  \label{compo}
  \end{eqnarray}
  %
- The linear approximation of $ {\cal J} $,
+ The perturbation of $ {\cal J} $ around a fixed point $ {\cal J}_0 $,
  \[
- {\cal J} \, \approx \, {\cal J}_0 \, + \, \delta {\cal J}
+ {\cal J} \, = \, {\cal J}_0 \, + \, \delta {\cal J}
  \]
  can be expressed in both bases of $ \vec{u} $ and $ \vec{v} $
  w.r.t. their corresponding inner product
-Line 152 
 $\left\langle \,\, , \,\, \right\rangle
+Line 154 
 $\left\langle \,\, , \,\, \right\rangle
  \label{deljidentity}
  \end{equation}
  %
- (note, that the gradient $ \nabla f $ is a pseudo-vector, therefore
+ (note, that the gradient $ \nabla f $ is a co-vector, therefore
  its transpose is required in the above inner product).
  Then, using the representation of
  $ \delta {\cal J} =
-Line 168 
 transpose of $ A $,
+Line 170 
 transpose of $ A $,
  \[
  A^{\ast} \, = \, A^T
  \]
- and from eq. (\ref{tangent_linear}), we note that
+ and from eq. (\ref{tangent_linear}), (\ref{deljidentity}),
+ we note that
  (omitting $|$'s):
  %
  \begin{equation}
-Line 204 
 the adjoint variable of the model state
+Line 207 
 the adjoint variable of the model state
  $ \delta \vec{u}^{\ast} $ the adjoint variable of the control variable $ \vec{u} $.
  The {\sf reverse} nature of the adjoint calculation can be readily
- seen as follows. Let us decompose ${\cal J}(u)$, thus:
+ seen as follows.
+ Consider a model integration which consists of $ \Lambda $
+ consecutive operations
+ $ {\cal M}_{\Lambda} (  {\cal M}_{\Lambda-1} (
+ ...... ( {\cal M}_{\lambda} (
+ ......
+ ( {\cal M}_{1} ( {\cal M}_{0}(\vec{u}) )))) $,
+ where the ${\cal M}$'s could be the elementary steps, i.e. single lines
+ in the code of the model, or successive time steps of the
+ model integration,
+ starting at step 0 and moving up to step $\Lambda$, with intermediate
+ ${\cal M}_{\lambda} (\vec{u}) = \vec{v}^{(\lambda+1)}$ and final
+ ${\cal M}_{\Lambda} (\vec{u}) = \vec{v}^{(\Lambda+1)} = \vec{v}$.
+ Let ${\cal J}$ be a cost function which explicitly depends on the
+ final state $\vec{v}$ only
+ (this restriction is for clarity reasons only).
+ %
+ ${\cal J}(u)$ may be decomposed according to:
  %
  \begin{equation}
  {\cal J}({\cal M}(\vec{u})) \, = \,
-Line 215 
 seen as follows. Let us decompose ${\cal
+Line 235 
 seen as follows. Let us decompose ${\cal
  \label{compos}
  \end{equation}
  %
- where the ${\cal M}$'s could be the elementary steps, i.e. single lines
+ Then, according to the chain rule, the forward calculation reads,
- in the code of the model,
+ in terms of the Jacobi matrices
- starting at step 0 and moving up to step $\Lambda$, with intermediate
- ${\cal M}_{\lambda} (\vec{u}) = \vec{v}^{(\lambda+1)}$ and final
- ${\cal M}_{\Lambda} (\vec{u}) = \vec{v}^{(\Lambda+1)} = \vec{v}$
- Then, according to the chain rule the forward calculation reads in
- terms of the Jacobi matrices
  (we've omitted the $ | $'s which, nevertheless, are important
  to the aspect of {\it tangent} linearity;
- note also that per definition
+ note also that by definition
  $ \langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \rangle
  = \nabla_v {\cal J} \cdot \delta \vec{v} $ )
  %
-Line 259 
 M_{\Lambda}^T \cdot \nabla_v {\cal J}^T
+Line 274 
 M_{\Lambda}^T \cdot \nabla_v {\cal J}^T
  %
  clearly expressing the reverse nature of the calculation.
  Eq. (\ref{reverse}) is at the heart of automatic adjoint compilers.
- The intermediate steps $\lambda$ in
+ If the intermediate steps $\lambda$ in
  eqn. (\ref{compos}) -- (\ref{reverse})
- could represent the model state (forward or adjoint) at each
+ represent the model state (forward or adjoint) at each
- intermediate time step in which case
+ intermediate time step as noted above, then correspondingly,
- $ {\cal M}(\vec{v}^{(\lambda)}) = \vec{v}^{(\lambda+1)} $, and correspondingly,
+ $ M^T (\delta \vec{v}^{(\lambda) \, \ast}) =
- $ M^T (\delta \vec{v}^{(\lambda) \, \ast}) = \delta \vec{v}^{(\lambda-1) \, \ast} $,
+ \delta \vec{v}^{(\lambda-1) \, \ast} $ for the adjoint variables.
- but they can also be viewed more generally as
+ It thus becomes evident that the adjoint calculation also
- single lines of code in the numerical algorithm.
+ yields the adjoint of each model state component
- In both cases it becomes evident that the adjoint calculation
+ $ \vec{v}^{(\lambda)} $ at each intermediate step $ \lambda $, namely
- yields at the same time the adjoint of each model state component
- $ \vec{v}^{(\lambda)} $ at each intermediate step $ l $, namely
  %
  \begin{equation}
  \boxed{
-Line 285 
 M_{\Lambda}^T |_{\vec{v}^{(\lambda)}} \c
+Line 298 
 M_{\Lambda}^T |_{\vec{v}^{(\lambda)}} \c
  %
  in close analogy to eq. (\ref{adjoint}).
  We note in passing that the $\delta \vec{v}^{(\lambda) \, \ast}$
- are the Lagrange multipliers of the model state $ \vec{v}^{(\lambda)}$.
+ are the Lagrange multipliers of the model equations which determine
+ $ \vec{v}^{(\lambda)}$.
- In coponents, eq. (\ref{adjoint}) reads as follows.
+ In components, eq. (\ref{adjoint}) reads as follows.
  Let
  \[
  \begin{array}{rclcrcl}
-Line 308 
 Let
+Line 322 
 Let
  \end{array}
  \]
  denote the perturbations in $\vec{u}$ and $\vec{v}$, respectively,
- and their adjoint varaiables;
+ and their adjoint variables;
  further
  \[
  M \, = \, \left(
-Line 395 
 and the shorthand notation for the adjoi
+Line 409 
 and the shorthand notation for the adjoi
  $ \delta v^{(\lambda) \, \ast}_{j} = \frac{\partial}{\partial v^{(\lambda)}_{j}}
  {\cal J}^T $, $ j = 1, \ldots , n_{\lambda} $,
  for intermediate components, yielding
- \[
+ \begin{equation}
- \footnotesize
+ \small
+ \begin{split}
  \left(
  \begin{array}{c}
  \delta v^{(\lambda) \, \ast}_1 \\
-Line 404 
 for intermediate components, yielding
+Line 419 
 for intermediate components, yielding
  \delta v^{(\lambda) \, \ast}_{n_{\lambda}} \\
  \end{array}
  \right)
- \, = \,
+ \, = &
  \left(
  \begin{array}{ccc}
  \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1}
- & \ldots &
+ & \ldots \,\, \ldots &
  \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1} \\
  \vdots & ~ & \vdots \\
  \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}}
- & \ldots  &
+ & \ldots \,\, \ldots  &
  \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
  \end{array}
  \right)
- %
  \cdot
  %
+ \\ ~ & ~
+ \\ ~ &
+ %
  \left(
  \begin{array}{ccc}
  \frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_1}
-Line 431 
 for intermediate components, yielding
+Line 448 
 for intermediate components, yielding
  \frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}} \\
  \end{array}
  \right)
- \cdot \ldots \ldots \cdot
+ \cdot \, \ldots \, \cdot
  \left(
  \begin{array}{c}
  \delta v^{\ast}_1 \\
-Line 439 
 for intermediate components, yielding
+Line 456 
 for intermediate components, yielding
  \delta v^{\ast}_{n} \\
  \end{array}
  \right)
- \]
+ \end{split}
+ \end{equation}
  Eq. (\ref{forward}) and (\ref{reverse}) are perhaps clearest in
  showing the advantage of the reverse over the forward mode
-Line 450 
 variables $u$
+Line 468 
 variables $u$
  {\it all} intermediate states $ \vec{v}^{(\lambda)} $) are sought.
  In order to be able to solve for each component of the gradient
  $ \partial {\cal J} / \partial u_{i} $ in (\ref{forward})
- a forward calulation has to be performed for each component seperately,
+ a forward calculation has to be performed for each component separately,
  i.e. $ \delta \vec{u} = \delta u_{i} {\vec{e}_{i}} $
  for  the $i$-th forward calculation.
  Then, (\ref{forward}) represents the
-Line 460 
 In contrast, eq. (\ref{reverse}) yields
+Line 478 
 In contrast, eq. (\ref{reverse}) yields
  gradient $\nabla _{u}{\cal J}$ (and all intermediate gradients
  $\nabla _{v^{(\lambda)}}{\cal J}$) within a single reverse calculation.
- Note, that in case $ {\cal J} $ is a vector-valued function
+ Note that if $ {\cal J} $ is a vector-valued function
  of dimension $ l > 1 $,
  eq. (\ref{reverse}) has to be modified according to
  \[
-Line 468 
 M^T \left( \nabla_v {\cal J}^T \left(\de
+Line 486 
 M^T \left( \nabla_v {\cal J}^T \left(\de
  \, = \,
  \nabla_u {\cal J}^T \cdot \delta \vec{J}
  \]
- where now $ \delta \vec{J} \in I\!\!R $ is a vector of dimenison $ l $.
+ where now $ \delta \vec{J} \in I\!\!R^l $ is a vector of
+ dimension $ l $.
  In this case $ l $ reverse simulations have to be performed
  for each $ \delta J_{k}, \,\, k = 1, \ldots, l $.
  Then, the reverse mode is more efficient as long as
  $ l < m $, otherwise the forward mode is preferable.
- Stricly, the reverse mode is called adjoint mode only for
+ Strictly, the reverse mode is called adjoint mode only for
  $ l = 1 $.
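The contrast between the two modes can be sketched on a toy problem. The following Python fragment is only an illustration under stated assumptions: the two Jacobians M0 and M1, the cost gradient g, and all function names are made up for this example and are not part of the MITGCM or of TAMC/TAF. For a scalar cost, the forward mode needs one sweep per control component, whereas the reverse mode obtains the full gradient in a single sweep through the transposed Jacobians.

```python
# Toy illustration (not MITgcm code): a "model" of two steps with fixed
# Jacobians M0 (3x2) and M1 (2x3), and a scalar cost with gradient g.
# All names and numbers are hypothetical.

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

M0 = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]  # Jacobian of step 0 (w.r.t. u)
M1 = [[2.0, 0.0, 1.0], [1.0, -1.0, 0.5]]    # Jacobian of step 1
g = [1.0, 4.0]                              # gradient of J w.r.t. final state v

def forward_mode_gradient():
    """One tangent linear sweep per control component e_i."""
    m = len(M0[0])
    grad = []
    for i in range(m):
        e_i = [1.0 if k == i else 0.0 for k in range(m)]
        dv = matvec(M1, matvec(M0, e_i))          # M1 . M0 . e_i
        grad.append(sum(gk * dvk for gk, dvk in zip(g, dv)))
    return grad

def reverse_mode_gradient():
    """A single adjoint sweep: M0^T . M1^T . g."""
    return matvec(transpose(M0), matvec(transpose(M1), g))
```

Both functions return the same gradient; only the number of sweeps differs (here two forward sweeps versus one reverse sweep).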
  A detailed analysis of the underlying numerical operations
-Line 503 
 operator onto the $j$-th component ${\bf
+Line 522 
 operator onto the $j$-th component ${\bf
  \paragraph{Example 2:
  $ {\cal J} = \langle \, {\cal H}(\vec{v}) - \vec{d} \, ,
   \, {\cal H}(\vec{v}) - \vec{d} \, \rangle $} ~ \\
- The cost function represents the quadratic model vs.data misfit.
+ The cost function represents the quadratic model vs. data misfit.
  Here, $ \vec{d} $ is the data vector and $ {\cal H} $ represents the
  operator which maps the model state space onto the data space.
  Then, $ \nabla_v {\cal J} $ takes the form
-Line 534 
 H \cdot \left( {\cal H}(\vec{v}) - \vec{
+Line 553 
 H \cdot \left( {\cal H}(\vec{v}) - \vec{
  We note an important aspect of the forward vs. reverse
  mode calculation.
- Because of the locality of the derivative,
+ Because of the local character of the derivative
+ (a derivative is defined w.r.t. a point along the trajectory),
  the intermediate results of the model trajectory
  $\vec{v}^{(\lambda+1)}={\cal M}_{\lambda}(v^{(\lambda)})$
  are needed to evaluate the intermediate Jacobian
  $M_{\lambda}|_{\vec{v}^{(\lambda)}} \, \delta \vec{v}^{(\lambda)} $.
  In the forward mode, the intermediate results are required
  in the same order as computed by the full forward model ${\cal M}$,
- in the reverse mode they are required in the reverse order.
+ but in the reverse mode they are required in the reverse order.
  Thus, in the reverse mode the trajectory of the forward model
  integration ${\cal M}$ has to be stored to be available in the reverse
- calculation. Alternatively, the model state would have to be
+ calculation. Alternatively, the complete model state up to the
- recomputed whenever its value is required.
+ point of evaluation has to be recomputed whenever its value is required.
  A method to balance the amount of recomputations vs.
  storage requirements is called {\sf checkpointing}
  (e.g. \cite{res-eta:98}).
- It is depicted in Fig. ... for a 3-level checkpointing
+ It is depicted in Fig. \ref{fig:3levelcheck} for a 3-level checkpointing
- [as concrete example, we give explicit numbers for a 3-day
+ [as an example, we give explicit numbers for a 3-day
  integration with a 1-hourly timestep in square brackets].
  \begin{itemize}
  %
-Line 559 
 integration with a 1-hourly timestep in
+Line 579 
 integration with a 1-hourly timestep in
  In a first step, the model trajectory is subdivided into
  $ {n}^{lev3} $ subsections [$ {n}^{lev3} $=3 1-day intervals],
  with the label $lev3$ for this outermost loop.
- The model is then integrated over the full trajectory,
+ The model is then integrated along the full trajectory,
  and the model state stored only at every $ k_{i}^{lev3} $-th timestep
  [i.e. 3 times, at
  $ i = 0,1,2 $ corresponding to $ k_{i}^{lev3} = 0, 24, 48 $].
  %
  \item [$lev2$]
- In a second step each subsection is itself divided into
+ In a second step each subsection itself is divided into
- $ {n}^{lev2} $ subsubsections
+ $ {n}^{lev2} $ sub-subsections
  [$ {n}^{lev2} $=4 6-hour intervals per subsection].
  The model picks up at the last outermost dumped state
- $ v_{k_{n}^{lev3}} $ and is integrated forward in time over
+ $ v_{k_{n}^{lev3}} $ and is integrated forward in time along
  the last subsection, with the label $lev2$ for this
  intermediate loop.
- The model state is now stored only at every $ k_{i}^{lev2} $-th
+ The model state is now stored at every $ k_{i}^{lev2} $-th
  timestep
  [i.e. 4 times, at
  $ i = 0,1,2,3 $ corresponding to $ k_{i}^{lev2} = 48, 54, 60, 66 $].
  %
  \item [$lev1$]
- Finally, the mode picks up at the last intermediate dump state
+ Finally, the model picks up at the last intermediate dump state
- $ v_{k_{n}^{lev2}} $ and is integrated forward in time over
+ $ v_{k_{n}^{lev2}} $ and is integrated forward in time along
- the last subsubsection, with the label $lev1$ for this
+ the last sub-subsection, with the label $lev1$ for this
  intermediate loop.
- Within this subsubsection only, the model state is stored
+ Within this sub-subsection only, the model state is stored
  at every timestep
  [i.e. every hour $ i=0,...,5$ corresponding to
  $ k_{i}^{lev1} = 66, 67, \ldots, 71 $].
  Thus, the  final state $ v_n = v_{k_{n}^{lev1}} $ is reached
- and the model state of all peceeding timesteps over the last
+ and the model states of all preceding timesteps along the last
- subsubsections are available, enabling integration backwards
+ sub-subsection are available, enabling integration backwards
- in time over the last subsubsection.
+ in time along the last sub-subsection.
- Thus, the adjoint can be computed over this last
+ Thus, the adjoint can be computed along this last
- subsubsection $k_{n}^{lev2}$.
+ sub-subsection $k_{n}^{lev2}$.
  %
  \end{itemize}
  %
  This procedure is repeated consecutively for each previous
- subsubsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $
+ sub-subsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $
  carrying the adjoint computation to the initial time
  of the subsection $k_{n}^{lev3}$.
  Then, the procedure is repeated for the previous subsection
-Line 617 
 The balance of storage vs. recomputation
+Line 637 
 The balance of storage vs. recomputation
  on the computing resources available.
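The 3-level scheme described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the MITGCM implementation: the scalar toy model in step(), its hand-written adjoint step_adjoint(), and all function names are hypothetical. The cost function is taken to be the final state, so the adjoint is seeded with 1.

```python
# Toy 3-level checkpointing (hypothetical names, not MITgcm code).

def step(v):
    """One nonlinear forward time step of a scalar toy model."""
    return v - 0.01 * v * v

def step_adjoint(v, adv):
    """Adjoint of step(): transposed Jacobian at state v times adv."""
    return (1.0 - 0.02 * v) * adv

def adjoint_store_all(v0, nsteps):
    """Reference: store the whole trajectory, then sweep backwards."""
    traj = [v0]
    for _ in range(nsteps):
        traj.append(step(traj[-1]))
    adv = 1.0
    for k in reversed(range(nsteps)):
        adv = step_adjoint(traj[k], adv)
    return adv

def adjoint_checkpointed(v0, n3, n2, n1):
    """Same gradient, holding only n3 + n2 + n1 states at any time."""
    # lev3: integrate the full trajectory, keep the state at the
    # start of each outermost subsection.
    lev3, v = [], v0
    for _ in range(n3):
        lev3.append(v)
        for _ in range(n2 * n1):
            v = step(v)
    adv = 1.0
    for i3 in reversed(range(n3)):
        # lev2: rerun one subsection, keep the state at the start
        # of each sub-subsection.
        lev2, v = [], lev3[i3]
        for _ in range(n2):
            lev2.append(v)
            for _ in range(n1):
                v = step(v)
        for i2 in reversed(range(n2)):
            # lev1: rerun one sub-subsection, keep every state,
            # then integrate the adjoint backwards over it.
            lev1, v = [], lev2[i2]
            for _ in range(n1):
                lev1.append(v)
                v = step(v)
            for vk in reversed(lev1):
                adv = step_adjoint(vk, adv)
    return adv
```

For the 3-day example with hourly steps (n3=3, n2=4, n1=6), the sketch holds at most 3+4+6=13 states instead of 72, at the price of roughly two additional forward integrations.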
  \begin{figure}[t!]
- \centering
+ \begin{center}
  %\psdraft
- \psfrag{v_k1^lev3}{\mathinfigure{v_{k_{1}^{lev3}}}}
+ %\psfrag{v_k1^lev3}{\mathinfigure{v_{k_{1}^{lev3}}}}
- \psfrag{v_kn-1^lev3}{\mathinfigure{v_{k_{n-1}^{lev3}}}}
+ %\psfrag{v_kn-1^lev3}{\mathinfigure{v_{k_{n-1}^{lev3}}}}
- \psfrag{v_kn^lev3}{\mathinfigure{v_{k_{n}^{lev3}}}}
+ %\psfrag{v_kn^lev3}{\mathinfigure{v_{k_{n}^{lev3}}}}
- \psfrag{v_k1^lev2}{\mathinfigure{v_{k_{1}^{lev2}}}}
+ %\psfrag{v_k1^lev2}{\mathinfigure{v_{k_{1}^{lev2}}}}
- \psfrag{v_kn-1^lev2}{\mathinfigure{v_{k_{n-1}^{lev2}}}}
+ %\psfrag{v_kn-1^lev2}{\mathinfigure{v_{k_{n-1}^{lev2}}}}
- \psfrag{v_kn^lev2}{\mathinfigure{v_{k_{n}^{lev2}}}}
+ %\psfrag{v_kn^lev2}{\mathinfigure{v_{k_{n}^{lev2}}}}
- \psfrag{v_k1^lev1}{\mathinfigure{v_{k_{1}^{lev1}}}}
+ %\psfrag{v_k1^lev1}{\mathinfigure{v_{k_{1}^{lev1}}}}
- \psfrag{v_kn^lev1}{\mathinfigure{v_{k_{n}^{lev1}}}}
+ %\psfrag{v_kn^lev1}{\mathinfigure{v_{k_{n}^{lev1}}}}
- \mbox{\epsfig{file=part5/checkpointing.eps, width=0.8\textwidth}}
+ %\mbox{\epsfig{file=part5/checkpointing.eps, width=0.8\textwidth}}
+ \resizebox{5.5in}{!}{\includegraphics{part5/checkpointing.eps}}
  %\psfull
- \caption
+ \end{center}
- {Schematic view of intermediate dump and restart for
+ \caption{
+ Schematic view of intermediate dump and restart for
  3-level checkpointing.}
- \label{fig:erswns}
+ \label{fig:3levelcheck}
  \end{figure}
- \subsection{Optimal perturbations}
+ % \subsection{Optimal perturbations}
- \label{optpert}
+ % \label{sec_optpert}
- \subsection{Error covariance estimate and Hessian matrix}
+ % \subsection{Error covariance estimate and Hessian matrix}
- \label{sec_hessian}
+ % \label{sec_hessian}
  \newpage
-Line 649 
 on the computing resources available.
+Line 671 
 on the computing resources available.
  \label{sec_ad_setup_ex}
  %**********************************************************************
- The MITGCM has been adapted to enable AD using TAMC or TAF
+ The MITGCM has been adapted to enable AD using TAMC or TAF.
- (we'll refer to TAMC and TAF interchangeably, except where
- distinctions are explicitly mentioned).
  The present description, therefore, is specific to the
- use of TAMC as AD tool.
+ use of TAMC or TAF as AD tool.
  The following sections describe the steps which are necessary to
  generate a tangent linear or adjoint model of the MITGCM.
  We take as an example the sensitivity of carbon sequestration
  in the ocean.
  The AD-relevant hooks in the code are sketched in
- \reffig{adthemodel}, \reffig{adthemain}.
+ Figs. \ref{fig:adthemodel} and \ref{fig:adthemain}.
  \subsection{Overview of the experiment}
- We describe an adjoint sensitivity analysis of outgassing from
+ We describe an adjoint sensitivity analysis of out-gassing from
- the ocean into the atmosphere of a carbon like tracer injected
+ the ocean into the atmosphere of a carbon-like tracer injected
  into the ocean interior (see \cite{hil-eta:01}).
  \subsubsection{Passive tracer equation}
  For this work the MITGCM was augmented with a thermodynamically
  inactive tracer, $C$. Tracer residing in the ocean
- model surface layer is outgassed according to a relaxation time scale,
+ model surface layer is out-gassed according to a relaxation time scale,
  $\mu$. Within the ocean interior, the tracer is passively advected
  by the ocean model currents. The full equation for the time evolution
  %
-Line 686 
 represents interior sources of $C$ such
+Line 706 
 represents interior sources of $C$ such
  direct injection.
  The velocity term, $U$, is the sum of the
  model Eulerian circulation and an eddy-induced velocity, the latter
- parameterized according to Gent/McWilliams (\cite{gen:90, dan:95}).
+ parameterized according to Gent/McWilliams
+ (\cite{gen-mcw:90, gen-eta:95}).
  The convection function, $\Gamma$, mixes $C$ vertically wherever the
  fluid is locally statically unstable.
- The outgassing time scale, $\mu$, in eqn. (\ref{carbon_ddt})
+ The out-gassing time scale, $\mu$, in eqn. (\ref{carbon_ddt})
  is set so that \( 1/\mu \sim 1 \ \mathrm{year} \) for the surface
  ocean and $\mu=0$ elsewhere. With this value, eqn. (\ref{carbon_ddt})
  is valid as a prognostic equation for small perturbations in oceanic
  carbon concentrations. This configuration provides a
  powerful tool for examining the impact of large-scale ocean circulation
- on $ CO_2 $ outgassing due to interior injections.
+ on $ CO_2 $ out-gassing due to interior injections.
  As source we choose a constant in time injection of
  $ S = 1 \,\, {\rm mol / s}$.
-Line 707 
 $4^\circ \times 4^\circ$ resolution hori
+Line 728 
 $4^\circ \times 4^\circ$ resolution hori
  geography and bathymetry. Twenty vertical layers are used with
  vertical spacing ranging
  from 50 m near the surface to 815 m at depth.
- Driven to steady-state by climatalogical wind-stress, heat and
+ Driven to steady-state by climatological wind-stress, heat and
  fresh-water forcing, the model reproduces well-known large-scale
  features of the ocean general circulation.
- \subsubsection{Outgassing cost function}
+ \subsubsection{Out-gassing cost function}
- To quantify and understand outgassing due to injections of $C$
+ To quantify and understand out-gassing due to injections of $C$
  in eqn. (\ref{carbon_ddt}),
  we define a cost function $ {\cal J} $ that measures the total amount of
- tracer outgassed at each timestep:
+ tracer out-gassed at each timestep:
  %
  \begin{equation}
  \label{cost_tracer}
  {\cal J}(t=T)=\int_{t=0}^{t=T}\int_{A} \mu C \, dA \, dt
  \end{equation}
  %
- Equation(\ref{cost_tracer}) integrates the outgassing term, $\mu C$,
+ Equation (\ref{cost_tracer}) integrates the out-gassing term, $\mu C$,
  from (\ref{carbon_ddt})
  over the entire ocean surface area, $A$, and accumulates it
  up to time $T$.
  Physically, ${\cal J}$ can be thought of as representing the amount of
- $CO_2$ that our model predicts would be outgassed following an
+ $CO_2$ that our model predicts would be out-gassed following an
  injection at rate $S$.
  The sensitivity of ${\cal J}$ to the spatial location of $S$,
  $\frac{\partial {\cal J}}{\partial S}$,
  can be used to identify regions from which circulation
- would cause $CO_2$ to rapidly outgas following injection
+ would cause $CO_2$ to rapidly out-gas following injection
  and regions in which $CO_2$ injections would remain effectively
- sequesterd within the ocean.
+ sequestered within the ocean.
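How the cost function of eqn. (\ref{cost_tracer}) accumulates over the timesteps may be illustrated with a deliberately reduced sketch. The Python fragment below assumes a single surface box with no advection, no convection and no source, so that the tracer simply relaxes with rate $\mu$; the function name and all parameters are hypothetical and do not correspond to MITGCM routines.

```python
# Single-box sketch of the out-gassing cost function (hypothetical
# names; advection, convection and the source S are left out).

def accumulate_outgassing(c0, mu, area, dt, nsteps):
    """Return (cost, final concentration) after nsteps Euler steps."""
    c = c0
    cost = 0.0
    for _ in range(nsteps):
        # Discrete form of the double integral: mu * C over the
        # surface area A, accumulated over one timestep dt.
        cost += mu * c * area * dt
        # Forward Euler step of the surface relaxation dC/dt = -mu * C.
        c -= mu * c * dt
    return cost, c
```

In this reduced setting the accumulated cost plus the tracer remaining in the box equals the initial tracer content, which provides a simple consistency check of the accumulation.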
  \subsection{Code configuration}
  The model configuration for this experiment resides under the
  directory {\it verification/carbon/}.
- The code customisation routines are in {\it verification/carbon/code/}:
+ The code customization routines are in {\it verification/carbon/code/}:
  %
  \begin{itemize}
  %
-Line 777 
 together with the forcing fields and and
+Line 798 
 together with the forcing fields and and
  %
  \item {\it data.ctrl}
  %
+ \item {\it data.gmredi}
+ %
+ \item {\it data.grdchk}
+ %
+ \item {\it data.optim}
+ %
  \item {\it data.pkg}
  %
  \item {\it eedata}
-Line 803 
 $ adjoint/ $:
+Line 830 
 $ adjoint/ $:
  \end{itemize}
  %
- Below we describe the customisations of this files which are
+ Below we describe the customizations of these files which are
  specific to this experiment.
  \subsubsection{File {\it .genmakerc}}
- This file overwites default settings of {\it genmake}.
+ This file overwrites default settings of {\it genmake}.
  In the present example it is used to switch on the following
  packages which are related to automatic differentiation
  and are disabled by default: \\
- \hspace*{4ex} {\tt set ENABLE=( autodiff cost ctrl ecco )}  \\
+ \hspace*{4ex} {\tt set ENABLE=( autodiff cost ctrl ecco gmredi grdchk kpp )}  \\
  Other packages which are not needed are switched off: \\
  \hspace*{4ex} {\tt set DISABLE=( aim obcs zonal\_filt shap\_filt cal exf )}
-Line 828 
 the standard include of the {\it CPP\_OP
+Line 855 
 the standard include of the {\it CPP\_OP
  This file contains 'wrapper'-specific CPP options.
  It only needs to be changed if the code is to be run
- in  parallel environment (see Section \ref{???}).
+ in a parallel environment (see Section \ref{???}).
  \subsubsection{File {\it CPP\_OPTIONS.h}}
-Line 837 
 This file contains model-specific CPP op
+Line 864 
 This file contains model-specific CPP op
  Most options are related to the forward model setup.
  They are identical to the global steady circulation setup of
  {\it verification/exp2/}.
- The option specific to this experiment is \\
+ The three options specific to this experiment are \\
+ \hspace*{4ex} {\tt \#define ALLOW\_PASSIVE\_TRACER} \\
+ This flag enables the code to carry through the
+ advection/diffusion of a passive tracer along the
+ model integration. \\
  \hspace*{4ex} {\tt \#define ALLOW\_MIT\_ADJOINT\_RUN} \\
  This flag enables the inclusion of some AD-related fields
- concerning initialisation, link between control variables
+ concerning initialization, link between control variables
  and forward model variables, and the call to the top-level
  forward/adjoint subroutine {\it adthe\_main\_loop}
- instead of {\it the\_main\_loop}.
+ instead of {\it the\_main\_loop}. \\
+ \hspace*{4ex} {\tt \#define ALLOW\_GRADIENT\_CHECK} \\
+ This flag enables the gradient check package.
+ After computing the unperturbed cost function and its gradient,
+ a series of computations are performed for which \\
+ $\bullet$ an element of the control vector is perturbed \\
+ $\bullet$ the cost function w.r.t. the perturbed element is
+ computed \\
+ $\bullet$ the difference between the perturbed and unperturbed
+ cost function is used to form the finite difference gradient \\
+ $\bullet$ the finite difference gradient is compared with the
+ adjoint-generated gradient.
+ The gradient check package is further described in Section ???.
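The steps listed above can be sketched as follows. This is a minimal Python illustration, not the gradient check package itself: the toy cost function, its hand-derived gradient (standing in for the adjoint-generated gradient), the perturbation size eps and all names are assumptions made for this example.

```python
# Finite-difference gradient check on a toy cost function
# (hypothetical names; the "adjoint" gradient is derived by hand).

def cost(u):
    """Toy cost function of a 3-element control vector."""
    return sum(x ** 3 for x in u) + u[0] * u[1]

def adjoint_gradient(u):
    """Stand-in for the adjoint-generated gradient of cost()."""
    return [3.0 * u[0] ** 2 + u[1],
            3.0 * u[1] ** 2 + u[0],
            3.0 * u[2] ** 2]

def gradient_check(u, eps=1.0e-6):
    """Return (finite difference, adjoint) pairs, one per control element."""
    j0 = cost(u)                       # unperturbed cost function
    grad = adjoint_gradient(u)         # unperturbed gradient
    pairs = []
    for i in range(len(u)):
        u_pert = list(u)
        u_pert[i] += eps               # perturb one control element
        fd = (cost(u_pert) - j0) / eps # finite difference gradient
        pairs.append((fd, grad[i]))    # compare with adjoint gradient
    return pairs
```

Each pair should agree to within the truncation error of the one-sided difference, which scales with eps.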
  \subsubsection{File {\it ECCO\_OPTIONS.h}}
-Line 864 
 enables the checkpointing feature of TAM
+Line 907 
 enables the checkpointing feature of TAM
  (see Section \ref{???}).
  In the present example a 3-level checkpointing is implemented.
  The code contains the relevant store directives, common block
- and tape initialisations, storing key computation,
+ and tape initializations, storing key computation,
  and loop index handling.
  The checkpointing length at each level is defined in
  file {\it tamc.h}, cf. below.
  %
  \item Cost function package: {\it pkg/cost/} \\
  This package contains all relevant routines for
- initialising, accumulating and finalizing the cost function
+ initializing, accumulating and finalizing the cost function
  (see Section \ref{???}). \\
  \hspace*{4ex} {\tt \#define ALLOW\_COST} \\
  enables all general aspects of the cost function handling,
- in particular the hooks in the foorward code for
+ in particular the hooks in the forward code for
- initialising, accumulating and finalizing the cost function. \\
+ initializing, accumulating and finalizing the cost function. \\
  \hspace*{4ex} {\tt \#define ALLOW\_COST\_TRACER} \\
- includes the subroutine with the cost function for this
+ includes the call to the cost function for this
  particular experiment, eqn. (\ref{cost_tracer}).
  %
  \item Control variable package: {\it pkg/ctrl/} \\
-Line 900 
 meridional wind stress \\
+Line 943 
 meridional wind stress \\
  freshwater flux \\
  \hspace*{2ex} {\tt \#define ALLOW\_HFLUX0\_CONTROL} &
  heat flux \\
- \hspace*{2ex} {\tt \#undef ALLOW\_DIFFKR\_CONTROL} &
+ \hspace*{2ex} {\tt \#define ALLOW\_DIFFKR\_CONTROL} &
  diapycnal diffusivity \\
  \hspace*{2ex} {\tt \#undef ALLOW\_KAPPAGM\_CONTROL} &
  isopycnal diffusivity \\
-Line 915 
 model. It is identical to the {\it verif
+Line 958 
 model. It is identical to the {\it verif
  \hspace*{4ex} {\tt sNx = 90} \\
  \hspace*{4ex} {\tt sNy = 40} \\
  \hspace*{4ex} {\tt Nr = 20} \\
- It correpsponds to a single-tile/single-processor setup:
+ It corresponds to a single-tile/single-processor setup:
  {\tt nSx = nSy = 1, nPx = nPy = 1},
  with standard overlap dimensioning
  {\tt OLx = OLy = 3}.
-Line 932 
 The common blocks are used by the adjoin
+Line 975 
 The common blocks are used by the adjoin
  \hspace*{4ex} is related to {\it DYNVARS.h} \\
  \hspace*{4ex} {\tt common /addynvars\_cd/} &
  \hspace*{4ex} is related to {\it DYNVARS.h} \\
+ \hspace*{4ex} {\tt common /addynvars\_diffkr/} &
+ \hspace*{4ex} is related to {\it DYNVARS.h} \\
+ \hspace*{4ex} {\tt common /addynvars\_kapgm/} &
+ \hspace*{4ex} is related to {\it DYNVARS.h} \\
  \hspace*{4ex} {\tt common /adtr1\_r/} &
  \hspace*{4ex} is related to {\it TR1.h} \\
  \hspace*{4ex} {\tt common /adffields/} &
-Line 956 
 This routine contains the dimensions for
+Line 1003 
 This routine contains the dimensions for
  3-level checkpointing is enabled, i.e. the timestepping
  is divided into three different levels (see Section \ref{???}).
  The model state of the outermost ({\tt nchklev\_3}) and the
- itermediate ({\tt nchklev\_2}) timestepping loop are stored to file
+ intermediate ({\tt nchklev\_2}) timestepping loop are stored to file
  (handled in {\it the\_main\_loop}).
  The innermost loop ({\tt nchklev\_1})
  avoids I/O by storing all required variables
-Line 968 
 In the present example the dimensions ar
+Line 1015 
 In the present example the dimensions ar
  \hspace*{4ex} {\tt nchklev\_2      =  30 } \\
  \hspace*{4ex} {\tt nchklev\_3      =  60 } \\
  To guarantee that the checkpointing intervals span the entire
- integration period the relation \\
+ integration period the following relation must be satisfied: \\
  \hspace*{4ex} {\tt nchklev\_1*nchklev\_2*nchklev\_3 $ \ge $ nTimeSteps} \\
  where {\tt nTimeSteps} is either specified in {\it data}
  or computed via \\
-Line 982 
 Similar to above, the following relation
+Line 1029 
 Similar to above, the following relation
  %
  \end{itemize}
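The checkpointing relation stated above, nchklev\_1*nchklev\_2*nchklev\_3 $\ge$ nTimeSteps, is easy to verify before a run. The helper below is a hypothetical Python sketch and not part of the MITGCM; the values nchklev\_2 = 30 and nchklev\_3 = 60 are those of the present example, while the value used for nchklev\_1 is an assumption for illustration only.

```python
# Hypothetical pre-run check of the checkpointing dimensions.

def checkpoints_cover_run(nchklev_1, nchklev_2, nchklev_3, n_time_steps):
    """True if the three checkpointing levels span the whole integration."""
    return nchklev_1 * nchklev_2 * nchklev_3 >= n_time_steps

# nchklev_2 and nchklev_3 as in the example; nchklev_1 = 1 is assumed.
ok = checkpoints_cover_run(1, 30, 60, 1800)
too_long = checkpoints_cover_run(1, 30, 60, 1801)
```

If the check fails, either the checkpointing dimensions have to be increased or the integration shortened.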
+ The following parameters may be worth describing: \\
+ %
+ \hspace*{4ex} {\tt isbyte} \\
+ \hspace*{4ex} {\tt maxpass} \\
+ ~
  \subsubsection{File {\it makefile}}
- This file contains all relevant paramter flags and
+ This file contains all relevant parameter flags and
- lists to run TAMC.
+ lists to run TAMC or TAF.
  It is assumed that TAMC is available to you, either locally,
  being installed on your network, or remotely through the 'TAMC Utility'.
  TAMC is called with the command {\tt tamc} followed by a
-Line 996 
 Here we briefly discuss the main flags u
+Line 1049 
 Here we briefly discuss the main flags u
  \begin{itemize}
  \item [{\tt tamc}] {\tt
  -input <variable names>
- -output <variable name> ... \\
+ -output <variable name> -r4 ... \\
  -toplevel <S/R name> -reverse <file names>
  }
  \end{itemize}
-Line 1017 
 Dependent variable $ J $  which is to be
+Line 1070 
 Dependent variable $ J $  which is to be
  \item {\tt -reverse <file names>} \\
  Adjoint code is generated to compute the sensitivity of a
  dependent variable w.r.t. many independent variables.
- The generated adjoint top-level routine computes the product
+ In the discussion of Section ???
+ the generated adjoint top-level routine computes the product
  of the transposed Jacobian matrix $ M^T $ times
  the gradient vector $ \nabla_v J $.
  \\
  {\tt <file names>} refers to the list of files {\it .f} which are to be
  analyzed by TAMC. This list is generally smaller than the full list
  of code to be compiled. The files not contained are either
- above the top-level routine (some initialisations), or are
+ above the top-level routine (some initializations), or are
  deliberately hidden from TAMC, either because hand-written
  adjoint routines exist, or the routines must not (or don't have to)
  be differentiated. For each routine which is part of the flow tree
- of the top-level routine, but deliberately hidden from TAMC,
+ of the top-level routine, but deliberately hidden from TAMC
+ (or for each package which contains such routines),
  a corresponding file {\it .flow} exists containing flow directives
  for TAMC.
  %
+ \item {\tt -r4} \\
+ ~
+ %
  \end{itemize}
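The product of the transposed Jacobian with the gradient vector, as computed by the adjoint top-level routine, can be illustrated with a small linear example. The sketch below is not TAMC output. It assumes a hypothetical $2 \times 3$ Jacobian {\tt M} and a cost function $J(v) = \frac{1}{2} \sum_i v_i^2$, so that $\nabla_v J = v$, and forms $M^T \nabla_v J$ by hand.

```python
# Hypothetical linear model v = M u with a 2x3 Jacobian M (illustrative values).
M = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]


def forward(u):
    """Forward model: v = M u."""
    return [sum(m_ij * u_j for m_ij, u_j in zip(row, u)) for row in M]


def cost(v):
    """Assumed cost function J(v) = 0.5 * sum(v_i**2)."""
    return 0.5 * sum(v_i * v_i for v_i in v)


def adjoint(u):
    """Gradient of J w.r.t. u, computed as M^T times grad_v J (here grad_v J = v)."""
    grad_v = forward(u)
    return [sum(M[i][j] * grad_v[i] for i in range(len(M)))
            for j in range(len(M[0]))]


u = [1.0, 1.0, 1.0]
print(cost(forward(u)))
print(adjoint(u))
```

One adjoint evaluation yields the sensitivity of the single output $J$ to all three inputs at once, which is why reverse mode is preferred when the inputs far outnumber the outputs.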
- \subsubsection{File {\it data}}
+ \subsubsection{The input parameter files}
+ \paragraph{File {\it data}}
+ \paragraph{File {\it data.cost}}
+ \paragraph{File {\it data.ctrl}}
- \subsubsection{File {\it data.cost}}
+ \paragraph{File {\it data.gmredi}}
- \subsubsection{File {\it data.ctrl}}
+ \paragraph{File {\it data.grdchk}}
- \subsubsection{File {\it data.pkg}}
+ \paragraph{File {\it data.optim}}
- \subsubsection{File {\it eedata}}
+ \paragraph{File {\it data.pkg}}
- \subsubsection{File {\it topog.bin}}
+ \paragraph{File {\it eedata}}
- \subsubsection{File {\it windx.bin, windy.bin}}
+ \paragraph{File {\it topog.bin}}
- \subsubsection{File {\it salt.bin, theta.bin}}
+ \paragraph{File {\it windx.bin, windy.bin}}
- \subsubsection{File {\it SSS.bin, SST.bin}}
+ \paragraph{File {\it salt.bin, theta.bin}}
- \subsubsection{File {\it pickup*}}
+ \paragraph{File {\it SSS.bin, SST.bin}}
- \subsection{Compiling the model and its adjoint}
+ \paragraph{File {\it pickup*}}
+ \subsection{Compiling the model and its adjoint}
+ The build process of the adjoint model is slightly more
+ complex than that of compiling the forward code.
+ The main reason is that the adjoint code generation requires
+ a specific list of routines that are to be differentiated
+ (as opposed to the automatic generation of a list of
+ files to be compiled by genmake).
+ This list excludes routines that don't have to be or must not be
+ differentiated. For some of the latter routines flow directives
+ may be necessary, a list of which has to be given as well.
+ For this reason, a separate {\it makefile} is currently
+ maintained in the directory {\tt adjoint/}. This
+ makefile is responsible for the adjoint code generation.
+ In the following we describe the build process step by step,
+ assuming you are in the directory {\tt bin/}.
+ A summary of steps to follow is given at the end.
+ \paragraph{Adjoint code generation and compilation -- step by step}
+ \begin{enumerate}
+ %
+ \item
+ {\tt ln -s ../verification/???/code/.genmakerc .} \\
+ {\tt ln -s ../verification/???/code/*.[Fh] .} \\
+ Link your customized genmake options, header files,
+ and modified code to the compile directory.
+ %
+ \item
+ {\tt ../tools/genmake -makefile} \\
+ Generate your Makefile (cf. Section ???).
+ %
+ \item
+ {\tt make depend} \\
+ Dependency analysis for the CPP pre-compiler (cf. Section ???).
+ %
+ \item
+ {\tt make small\_f} \\
+ This is the first difference between forward code compilation
+ and adjoint code generation and compilation.
+ Instead of going through the entire compilation process
+ (CPP precompiling -- {\tt .f}, object code generation -- {\tt .o},
+ linking of object files and libraries to generate executable),
+ only the CPP compiler is invoked at this stage to generate
+ the {\tt .f} files.
+ %
+ \item
+ {\tt cd ../adjoint} \\
+ {\tt make adtaf} or {\tt make adtamc} \\
+ Depending on whether you have TAF or TAMC at your disposal,
+ you'll choose {\tt adtaf} or {\tt adtamc} as your
+ make target for the {\it makefile} in the directory {\tt adjoint/}.
+ Several things happen at this stage.
+ %
+ \begin{enumerate}
+ %
+ \item
+ The initial template file {\it adjoint\_model.F} which is part
+ of the compiling list created by {\it genmake} is restored.
+ %
+ \item
+ All Fortran routines {\tt *.f} in {\tt bin/} are
+ concatenated into a single file (its current name is
+ {\it tamc\_code.f}).
+ %
+ \item
+ Adjoint code is generated by TAMC or TAF.
+ The adjoint code is written to the file {\it tamc\_code\_ad.f}.
+ It contains all adjoint routines of the forward routines
+ concatenated in {\it tamc\_code.f}.
+ For a given forward routine {\tt subroutine routinename}
+ the adjoint routine is named {\tt subroutine adroutinename}
+ by default (that default can be changed via the flag
+ {\tt -admark <markname>}).
+ Furthermore, it may contain modified code which
+ incorporates the translation of adjoint store directives
+ into specific Fortran code.
+ For a given forward routine {\tt subroutine routinename}
+ the modified routine is named {\tt subroutine mdroutinename}.
+ TAMC or TAF info is written to file
+ {\it tamc\_code.prot} or {\it taf.log}, respectively.
+ %
+ \end{enumerate}
+ %
+ \item
+ {\tt make adchange} \\
+ The multi-threading capability of the MITGCM requires a slight
+ change in the parameter list of some routines that are related
+ to active file handling.
+ This post-processing invokes the sed script {\it adjoint\_ecco\_sed.com}
+ to insert the threading counter {\bf myThId} into the parameter list
+ of those subroutines.
+ The resulting code is written to file {\it tamc\_code\_sed\_ad.f}
+ and appended to the file {\it adjoint\_model.F}.
+ This concludes the adjoint code generation.
+ %
+ \item
+ {\tt cd ../bin} \\
+ {\tt make} \\
+ The file {\it adjoint\_model.F} now contains the full adjoint code.
+ All routines are now compiled.
+ %
+ \end{enumerate}
+ \paragraph{Adjoint code generation and compilation -- summary}
+ ~ \\
+ \[
+ \boxed{
+ \begin{split}
+  ~ & \mbox{\tt cd bin} \\
+  ~ & \mbox{\tt ln -s ../verification/my\_experiment/code/.genmakerc .} \\
+  ~ & \mbox{\tt ln -s ../verification/my\_experiment/code/*.[Fh] .} \\
+  ~ & \mbox{\tt ../tools/genmake -makefile} \\
+  ~ & \mbox{\tt make depend} \\
+  ~ & \mbox{\tt make small\_f} \\
+  ~ & \mbox{\tt cd ../adjoint} \\
+  ~ & \mbox{\tt make adtaf <OR: make adtamc>} \\
+  ~ & \mbox{\tt make adchange} \\
+  ~ & \mbox{\tt cd ../bin} \\
+  ~ & \mbox{\tt make} \\
+ \end{split}
+ }
+ \]
  \newpage
  %**********************************************************************
- \section{TLM and ADM code generation in general}
+ \section{TLM and ADM generation in general}
  \label{sec_ad_setup_gen}
  %**********************************************************************
-Line 1068 
 In this section we describe in a general
+Line 1258 
 In this section we describe in a general
  the parts of the code that are relevant for automatic
  differentiation using the software tool TAMC.
- \subsection{The cost function (dependent variable)}
+ \begin{figure}[b!]
+ \input{part5/doc_ad_the_model}
+ \caption{~}
+ \label{fig:adthemodel}
+ \end{figure}
+ The basic flow is depicted in Fig. \ref{fig:adthemodel}.
+ If the option {\tt ALLOW\_AUTODIFF\_TAMC} is defined, the driver routine
+ {\it the\_model\_main}, instead of calling {\it the\_main\_loop},
+ invokes the adjoint of this routine, {\it adthe\_main\_loop},
+ which is the top-level routine in terms of reverse mode computation.
+ The routine {\it adthe\_main\_loop} has been generated using TAMC.
+ It contains the forward integration of the full model,
+ any additional storing that is required for efficient checkpointing,
+ and the reverse integration of the adjoint model.
+ The structure of {\it adthe\_main\_loop} has been strongly
+ simplified for clarification; in particular, no checkpointing
+ procedures are shown here.
+ Prior to the call of {\it adthe\_main\_loop}, the routine
+ {\it ctrl\_unpack} is invoked to unpack the control vector,
+ and following that call, the routine {\it ctrl\_pack}
+ is invoked to pack the control vector
+ (cf. Section \ref{section_ctrl}).
+ If gradient checks are to be performed, the option
+ {\tt ALLOW\_GRADIENT\_CHECK} is defined. In this case
+ the driver routine {\it grdchk\_main} is called after
+ the gradient has been computed via the adjoint
+ (cf. Section \ref{section_grdchk}).
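The idea behind a gradient check can be sketched in a few lines of Python: the gradient obtained from the adjoint is compared, element by element, against a finite-difference approximation of the cost function. This is not the {\it grdchk\_main} routine. The cost function, the analytic gradient standing in for the adjoint result, and the use of central differences with step {\tt eps} are all assumptions made for illustration.

```python
def cost(u):
    """Hypothetical cost function J(u) used only for this illustration."""
    return u[0] ** 2 * u[1] + 3.0 * u[2]


def adjoint_gradient(u):
    """Analytic gradient of J, standing in for the adjoint-model result."""
    return [2.0 * u[0] * u[1], u[0] ** 2, 3.0]


def finite_difference_gradient(u, eps=1.0e-6):
    """Central finite differences, perturbing one control element at a time."""
    grad = []
    for i in range(len(u)):
        u_plus = list(u)
        u_minus = list(u)
        u_plus[i] += eps
        u_minus[i] -= eps
        grad.append((cost(u_plus) - cost(u_minus)) / (2.0 * eps))
    return grad


u0 = [1.5, -2.0, 0.5]
for g_ad, g_fd in zip(adjoint_gradient(u0), finite_difference_gradient(u0)):
    # Print adjoint value, finite-difference value, and their absolute difference.
    print(g_ad, g_fd, abs(g_ad - g_fd))
```

Each finite-difference element costs additional forward evaluations, so in practice such a check would be limited to a small subset of the control vector.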
+ \subsection{The cost function (dependent variable)
+ \label{section_cost}}
  The cost function $ {\cal J} $ is referred to as the {\sf dependent variable}.
  It is a function of the input variables $ \vec{u} $ via the composition
-Line 1076 
 $ {\cal J}(\vec{u}) \, = \, {\cal J}(M(\
+Line 1296 
 $ {\cal J}(\vec{u}) \, = \, {\cal J}(M(\
  The input is referred to as the
  {\sf independent variables} or {\sf control variables}.
  All aspects relevant to the treatment of the cost function $ {\cal J} $
- (parameter setting, initialisation, incrementation,
+ (parameter setting, initialization, accumulation,
- final evaluation), are controled by the package {\it pkg/cost}.
+ final evaluation), are controlled by the package {\it pkg/cost}.
+ \begin{figure}[h!]
+ \input{part5/doc_cost_flow}
+ \caption{~}
+ \label{fig:costflow}
+ \end{figure}
  \subsubsection{genmake and CPP options}
  %
-Line 1097 
 compile list in 3 different ways (cf. Se
+Line 1323 
 compile list in 3 different ways (cf. Se
  \begin{enumerate}
  %
  \item {\it genmake}: \\
- Change the default settngs in the file {\it genmake} by adding
+ Change the default settings in the file {\it genmake} by adding
  {\bf cost} to the {\bf enable} list (not recommended).
  %
  \item {\it .genmakerc}: \\
-Line 1110 
 Call {\it genmake} with the option
+Line 1336 
 Call {\it genmake} with the option
  {\tt genmake -enable=cost}.
  %
  \end{enumerate}
- Since the cost function is usually used in conjunction with
- automatic differentiation, the CPP option
- {\bf ALLOW\_ADJOINT\_RUN} should be defined
- (file {\it CPP\_OPTIONS.h}).
  The basic CPP option to enable the cost function is {\bf ALLOW\_COST}.
  Each specific cost function contribution has its own option.
  For the present example the option is {\bf ALLOW\_COST\_TRACER}.
  All cost-specific options are set in {\it ECCO\_CPPOPTIONS.h}
+ Since the cost function is usually used in conjunction with
+ automatic differentiation, the CPP option
+ {\bf ALLOW\_ADJOINT\_RUN} should be defined
+ (file {\it CPP\_OPTIONS.h}).
- \subsubsection{Initialisation}
+ \subsubsection{Initialization}
  %
- The initialisation of the {\it cost} package is readily enabled
+ The initialization of the {\it cost} package is readily enabled
  as soon as the CPP option {\bf ALLOW\_ADJOINT\_RUN} is defined.
  %
  \begin{itemize}
-Line 1152 
 Variables: {\it cost\_init}
+Line 1378 
 Variables: {\it cost\_init}
  }
  \\
  This S/R
- initialises the different cost function contributions.
+ initializes the different cost function contributions.
- The contribtion for the present example is {\bf objf\_tracer}
+ The contribution for the present example is {\bf objf\_tracer}
  which is defined on each tile (bi,bj).
  %
  \end{itemize}
  %
- \subsubsection{Incrementation}
+ \subsubsection{Accumulation}
  %
  \begin{itemize}
  %
-Line 1206 
 The total cost function {\bf fc} will be
+Line 1432 
 The total cost function {\bf fc} will be
  tamc -output 'fc' ...
  \end{verbatim}
- \begin{figure}[t!]
+ %%%% \end{document}
- \input{part5/doc_ad_the_model}
- \label{fig:adthemodel}
- \caption{~}
- \end{figure}
  \begin{figure}
  \input{part5/doc_ad_the_main}
- \label{fig:adthemain}
  \caption{~}
+ \label{fig:adthemain}
  \end{figure}
- \subsection{The control variables (independent variables)}
+ \subsection{The control variables (independent variables)
+ \label{section_ctrl}}
  The control variables are a subset of the model input
  (initial conditions, boundary conditions, model parameters).
  Here we identify them with the variable $ \vec{u} $.
  All intermediate variables whose derivatives w.r.t. control
- variables don't vanish are called {\sf active variables}.
+ variables do not vanish are called {\sf active variables}.
  All subroutines whose derivatives w.r.t. the control variables
  do not vanish are called {\sf active routines}.
  Read and write operations from and to file can be viewed
-Line 1232 
 as variable assignments. Therefore, file
+Line 1455 
 as variable assignments. Therefore, file
  active variables are written and from which active variables
  are read are called {\sf active files}.
  All aspects relevant to the treatment of the control variables
- (parameter setting, initialisation, perturbation)
+ (parameter setting, initialization, perturbation)
- are controled by the package {\it pkg/ctrl}.
+ are controlled by the package {\it pkg/ctrl}.
+ \begin{figure}[h!]
+ \input{part5/doc_ctrl_flow}
+ \caption{~}
+ \label{fig:ctrlflow}
+ \end{figure}
  \subsubsection{genmake and CPP options}
  %
-Line 1253 
 To enable the directory to be included t
+Line 1482 
 To enable the directory to be included t
  Each control variable is enabled via its own CPP option
  in {\it ECCO\_CPPOPTIONS.h}.
- \subsubsection{Initialisation}
+ \subsubsection{Initialization}
  %
  \begin{itemize}
  %
-Line 1293 
 Two important issues related to the hand
+Line 1522 
 Two important issues related to the hand
  variables in the MITGCM need to be addressed.
  First, in order to save memory, the control variable arrays
  are not kept in memory, but rather read from file and added
- to the initial (or first guess) fields.
+ to the initial fields during the model initialization phase.
  Similarly, the corresponding adjoint fields which represent
  the gradient of the cost function w.r.t. the control variables
- are written to to file.
+ are written to file at the end of the adjoint integration.
  Second, in addition to the files holding the 2-dim. and 3-dim.
- control variables and the gradient, a 1-dim. {\sf control vector}
+ control variables and the corresponding cost gradients,
+ a 1-dim. {\sf control vector}
  and {\sf gradient vector} are written to file. They contain
  only the wet points of the control variables and the corresponding
  gradient.
  This leads to a significant data compression.
- Furthermore, the control and the gradient vector can be passed to a
+ Furthermore, an option is available
+ ({\tt ALLOW\_NONDIMENSIONAL\_CONTROL\_IO}) to
+ non-dimensionalise the control and gradient vector,
+ whose components would otherwise differ in
+ magnitude and units.
+ Finally, the control and gradient vector can be passed to a
  minimization routine if an update of the control variables
  is sought as part of a minimization exercise.
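The compression obtained by keeping only the wet points can be sketched as follows. This is an illustration of the idea only, not the file format or the logic of {\it ctrl\_pack} and {\it ctrl\_unpack}. The function names, the 2-dim. field, and the wet mask are hypothetical.

```python
def pack_wet_points(field, wet_mask):
    """Collect the values at wet points into a 1-dim. vector."""
    return [value
            for row, mask_row in zip(field, wet_mask)
            for value, is_wet in zip(row, mask_row) if is_wet]


def unpack_wet_points(vector, wet_mask, fill=0.0):
    """Distribute a 1-dim. vector back onto the grid; dry points get `fill`."""
    values = iter(vector)
    return [[next(values) if is_wet else fill for is_wet in mask_row]
            for mask_row in wet_mask]


# Hypothetical 3x4 field with 5 wet points (True) out of 12 grid points.
wet_mask = [[True, True, False, False],
            [True, False, False, False],
            [True, True, False, False]]
field = [[1.0, 2.0, 0.0, 0.0],
         [3.0, 0.0, 0.0, 0.0],
         [4.0, 5.0, 0.0, 0.0]]

vector = pack_wet_points(field, wet_mask)
print(len(vector), "of", sum(len(row) for row in field), "points stored")
print(unpack_wet_points(vector, wet_mask) == field)
```

The same mask is needed for packing and unpacking, since the vector itself carries no information about where its elements belong on the grid.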
-Line 1314 
 and gradient are generated and initialis
+Line 1549 
 and gradient are generated and initialis
  \subsubsection{Perturbation of the independent variables}
  %
- The dependency chain for differentiation starts
+ The dependency flow for differentiation w.r.t. the controls
- with adding a perturbation onto the the input variable,
+ starts with adding a perturbation onto the input variable,
  thus defining the independent or control variables for TAMC.
- Three classes of controls may be considered:
+ Three types of controls may be considered:
  %
  \begin{itemize}
  %
-Line 1332 
 Three classes of controls may be conside
+Line 1567 
 Three classes of controls may be conside
  Consider as an example the initial tracer distribution
  {\bf tr1} as control variable.
  After {\bf tr1} has been initialised in
- {\it ini\_tr1} (dynamical variables including
+ {\it ini\_tr1} (dynamical variables such as
  temperature and salinity are initialised in {\it ini\_fields}),
  a perturbation anomaly is added to the field in S/R
  {\it ctrl\_map\_ini}
-Line 1345 
 u         & = \, u_{[0]} \, + \, \Delta
+Line 1580 
 u         & = \, u_{[0]} \, + \, \Delta
  \end{split}
  \end{equation}
  %
- In principle {\bf xx\_tr1} is a 3-dim. global array
+ {\bf xx\_tr1} is a 3-dim. global array
  holding the perturbation. In the case of a simple
  sensitivity study this array is identical to zero.
- However, it's specification is essential since TAMC
+ However, its specification is essential in the context
+ of automatic differentiation since TAMC
  treats the corresponding line in the code symbolically
  when determining the differentiation chain and its origin.
  Thus, the variable names are part of the argument list
-Line 1366 
 dummy variable {\bf xx\_tr1\_dummy} is i
+Line 1602 
 dummy variable {\bf xx\_tr1\_dummy} is i
  and an 'active read' routine of the adjoint support
  package {\it pkg/autodiff} is invoked.
  The read-procedure is tagged with the variable
- {\bf xx\_tr1\_dummy} enabbling TAMC to recognize the
+ {\bf xx\_tr1\_dummy} enabling TAMC to recognize the
- initialisation of the perturbation.
+ initialization of the perturbation.
  The modified call of TAMC thus reads
  %
  \begin{verbatim}
-Line 1388 
 Note, that reading an active variable co
+Line 1624 
 Note, that reading an active variable co
  to a variable assignment. Its derivative corresponds
  to a write statement of the adjoint variable.
  The 'active file' routines have been designed
- to support active read and corresponding active write
+ to support active read and corresponding adjoint active write
- operations.
+ operations (and vice versa).
  %
  \item
  \fbox{
-Line 1406 
 with the symbolic perturbation taking pl
+Line 1642 
 with the symbolic perturbation taking pl
  Note however an important difference:
  Since the boundary values are time dependent with a new
  forcing field applied at each time step,
- the general problem may be be thought of as
+ the general problem may be thought of as
- a new control variable at each time step, i.e.
+ a new control variable at each time step
+ (or, if the perturbation is averaged over a certain period,
+ at each $ N $ timesteps), i.e.
  \[
  u_{\rm forcing} \, = \,
  \{ \, u_{\rm forcing} ( t_n ) \, \}_{
-Line 1432 
 calendar ({\it cal}~) and external forci
+Line 1670 
 calendar ({\it cal}~) and external forci
  %
  This routine is not yet implemented, but would
  proceed along the same lines as the initial value sensitivity.
+ The mixing parameters {\bf diffkr} and {\bf kapgm}
+ are currently added as controls in {\it ctrl\_map\_ini.F}.
  %
  \end{itemize}
  %
  \subsubsection{Output of adjoint variables and gradient}
  %
- Two ways exist to generate output of adjoint fields.
+ Several ways exist to generate output of adjoint fields.
  %
  \begin{itemize}
  %
  \item
  \fbox{
  \begin{minipage}{12cm}
- {\it ctrl\_pack}:
+ {\it ctrl\_map\_ini, ctrl\_map\_forcing}:
  \end{minipage}
  }
  \\
- At the end of the forward/adjoint integration, the S/R
- {\it ctrl\_pack} is called which mirrors S/R {\it ctrl\_unpack}.
- It writes the following files:
- %
  \begin{itemize}
  %
- \item {\bf xx\_...}: the control variable fields
+ \item {\bf xx\_...}: the control variable fields \\
+ Before the forward integration, the control
+ variables are read from file {\bf xx\_ ...} and added to
+ the model field.
  %
  \item {\bf adxx\_...}: the adjoint variable fields, i.e. the gradient
- $ \nabla _{u}{\cal J} $ for each control variable,
+ $ \nabla _{u}{\cal J} $ for each control variable \\
+ After the adjoint integration the corresponding adjoint
+ variables are written to {\bf adxx\_ ...}.
  %
- \item {\bf vector\_ctrl}: the control vector
+ \end{itemize}
  %
- \item {\bf vector\_grad}: the gradient vector
+ \item
+ \fbox{
+ \begin{minipage}{12cm}
+ {\it ctrl\_unpack, ctrl\_pack}:
+ \end{minipage}
+ }
+ \\
+ %
+ \begin{itemize}
+ %
+ \item {\bf vector\_ctrl}: the control vector \\
+ At the very beginning of the model initialization,
+ the updated compressed control vector is read (or initialised)
+ and distributed to 2-dim. and 3-dim. control variable fields.
+ %
+ \item {\bf vector\_grad}: the gradient vector \\
+ At the very end of the adjoint integration,
+ the 2-dim. and 3-dim. adjoint variables are read,
+ compressed to a single vector and written to file.
  %
  \end{itemize}
  %
-Line 1474 
 $ \nabla _{u}{\cal J} $ for each control
+Line 1733 
 $ \nabla _{u}{\cal J} $ for each control
  }
  \\
  In addition to writing the gradient at the end of the
- forward/adjoint integration, many more adjoint variables,
+ forward/adjoint integration, many more adjoint variables
- representing the Lagrange multipliers of the model state
+ of the model state
- w.r.t. the model state
+ at intermediate times can be written using S/R
- at different times can be written using S/R
  {\it addummy\_in\_stepping}.
  This routine is part of the adjoint support package
  {\it pkg/autodiff} (cf. below).
-Line 1491 
 than generated automatically.
+Line 1749 
 than generated automatically.
  Appropriate flow directives ({\it dummy\_in\_stepping.flow})
  ensure that TAMC does not automatically
  generate {\it addummy\_in\_stepping} by trying to differentiate
- {\it dummy\_in\_stepping}, but rather takes the hand-written routine.
+ {\it dummy\_in\_stepping}, but instead refers to
+ the hand-written routine.
  {\it dummy\_in\_stepping} is called in the forward code
  at the beginning of each
-Line 1501 
 each timestep in the adjoint calculation
+Line 1760 
 each timestep in the adjoint calculation
  {\it addynamics}.
  {\it addummy\_in\_stepping} includes the header files
- {\it adffields.h, addynamics.h, adtr1.h}.
+ {\it adcommon.h}.
- These header files are also hand-written. They contain
+ This header file is also hand-written. It contains
- the common blocks {\bf /addynvars\_r/}, {\bf /addynvars\_cd/},
+ the common blocks
+ {\bf /addynvars\_r/}, {\bf /addynvars\_cd/},
+ {\bf /addynvars\_diffkr/}, {\bf /addynvars\_kapgm/},
  {\bf /adtr1\_r/}, {\bf /adffields/},
  which have been extracted from the adjoint code to enable
  access to the adjoint variables.
-Line 1521 
 The gradient $ \nabla _{u}{\cal J} |_{u_
+Line 1782 
 The gradient $ \nabla _{u}{\cal J} |_{u_
  with the value of the cost function itself $ {\cal J}(u_{[k]}) $
  at iteration step $ k $ serve
  as input to a minimization routine (e.g. quasi-Newton method,
- conjugate gradient, ...) to compute an update in the
+ conjugate gradient, ... \cite{gil_lem:89})
+ to compute an update in the
  control variable for iteration step $k+1$
  \[
  u_{[k+1]} \, = \,  u_{[0]} \, + \, \Delta u_{[k+1]}
-Line 1535 
 Tab. \ref{???} sketches the flow between
+Line 1797 
 Tab. \ref{???} sketches the flow between
  and the minimization routine.
  \begin{eqnarray*}
- \footnotesize
+ \scriptsize
  \begin{array}{ccccc}
  u_{[0]} \,\, ,  \,\, \Delta u_{[k]}    & ~ & ~ & ~ & ~ \\
  {\Big\downarrow}
-Line 1552 
 v_{[k]} = M \left( u_{[k]} \right) &
+Line 1814 
 v_{[k]} = M \left( u_{[k]} \right) &
  {\cal J}_{[k]} = {\cal J} \left( M \left( u_{[k]} \right) \right)} \\
  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
  \hline
+ \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~}  \\
+ \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{{\Big\downarrow}} \\
+ \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~}  \\
  \hline
  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
  \multicolumn{1}{|c}{
  \nabla_u {\cal J}_{[k]} (\delta {\cal J}) =
- T\!\!^{\ast} \cdot \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J})} &
+ T^{\ast} \cdot \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J})} &
  \stackrel{\bf adjoint}{\mathbf \longleftarrow} &
  ad \, v_{[k]} (\delta {\cal J}) =
  \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J}) &
-Line 1565 
 ad \, v_{[k]} (\delta {\cal J}) =
+Line 1830 
 ad \, v_{[k]} (\delta {\cal J}) =
  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
  \hline
   ~ & ~ & ~ & ~ & ~ \\
- ~ & ~ &
+ \hspace*{15ex}{\Bigg\downarrow}
- {\cal J}_{[k]} \qquad {\Bigg\downarrow}  \qquad \nabla_u {\cal J}_{[k]}
+ \quad {\cal J}_{[k]}, \quad \nabla_u {\cal J}_{[k]}
-  & ~ & ~ \\
+  & ~ & ~ & ~ & ~ \\
   ~ & ~ & ~ & ~ & ~ \\
  \hline
  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
-Line 1595 
 The corresponding I/O flow looks as foll
+Line 1860 
 The corresponding I/O flow looks as foll
  \vspace*{0.5cm}
+ {\scriptsize
  \begin{tabular}{ccccc}
  {\bf vector\_ctrl\_$<$k$>$ } & ~ & ~ & ~ & ~ \\
  {\big\downarrow}  & ~ & ~ & ~ & ~ \\
-Line 1605 
 The corresponding I/O flow looks as foll
+Line 1871 
 The corresponding I/O flow looks as foll
  \cline{3-3}
  \multicolumn{1}{l}{\bf xx\_theta0...$<$k$>$} & ~ &
  \multicolumn{1}{|c|}{~} & ~ & ~ \\
- \multicolumn{1}{l}{\bf xx\_salt0...$<$k$>$} & $\longrightarrow$ &
+ \multicolumn{1}{l}{\bf xx\_salt0...$<$k$>$} &
+ $\stackrel{\mbox{read}}{\longrightarrow}$ &
  \multicolumn{1}{|c|}{forward integration} & ~ & ~ \\
  \multicolumn{1}{l}{\bf \vdots} & ~ & \multicolumn{1}{|c|}{~}
  & ~ & ~ \\
  \cline{3-3}
- ~ & ~ & ~ & ~ & ~ \\
+ ~ & ~ & $\downarrow$ & ~ & ~ \\
  \cline{3-3}
  ~ & ~ &
  \multicolumn{1}{|c|}{~} & ~ &
  \multicolumn{1}{l}{\bf adxx\_theta0...$<$k$>$}  \\
  ~ & ~ & \multicolumn{1}{|c|}{adjoint integration} &
- $\longrightarrow$ &
+ $\stackrel{\mbox{write}}{\longrightarrow}$ &
  \multicolumn{1}{l}{\bf adxx\_salt0...$<$k$>$} \\
  ~ & ~ & \multicolumn{1}{|c|}{~}
  & ~ & \multicolumn{1}{l}{\bf \vdots} \\
-Line 1628 
 $\longrightarrow$ &
+Line 1895 
 $\longrightarrow$ &
  ~ & ~ & ~ & ~ &  {\big\downarrow} \\
  ~ & ~ & ~ & ~ &  {\bf vector\_grad\_$<$k$>$ } \\
  \end{tabular}
+ }
  \vspace*{0.5cm}
- {\it ctrl\_unpack} reads in the updated control vector
+ {\it ctrl\_unpack} reads the updated control vector
  {\bf vector\_ctrl\_$<$k$>$}.
  It distributes the different control variables to
  2-dim. and 3-dim. files {\it xx\_...$<$k$>$}.
- During the forward integration the control variables
+ At the start of the forward integration the control variables
- are read from {\it xx\_...$<$k$>$}.
+ are read from {\it xx\_...$<$k$>$} and added to the
- Correspondingly, the adjoint fields are written
+ corresponding model field.
+ Correspondingly, at the end of the adjoint integration
+ the adjoint fields are written
  to {\it adxx\_...$<$k$>$}, again via the active file routines.
- Finally, {\it ctrl\_pack} collects all adjoint field files
+ Finally, {\it ctrl\_pack} collects all adjoint files
  and writes them to the compressed vector file
  {\bf vector\_grad\_$<$k$>$}.
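The interplay between the forward/adjoint run and the minimization routine can be sketched as a simple loop. The example below replaces the quasi-Newton or conjugate gradient methods mentioned above with fixed-step steepest descent, purely to keep the sketch short. The quadratic cost function, the target values, the step size, and all function names are assumptions made for illustration.

```python
def cost_and_gradient(u):
    """Stand-in for one forward/adjoint run: returns J(u) and grad_u J."""
    target = [1.0, -2.0, 0.5]  # hypothetical values the controls should match
    misfit = [u_i - t_i for u_i, t_i in zip(u, target)]
    return 0.5 * sum(m * m for m in misfit), misfit


def minimize(u_first_guess, step=0.5, n_iter=20):
    """Fixed-step steepest descent acting on the perturbation delta_u."""
    delta_u = [0.0] * len(u_first_guess)  # the perturbation starts at zero
    for _ in range(n_iter):
        # u_[k] = u_[0] + delta_u_[k], as in the update formula above
        u_k = [u0 + d for u0, d in zip(u_first_guess, delta_u)]
        _, grad = cost_and_gradient(u_k)  # forward and adjoint integration
        delta_u = [d - step * g for d, g in zip(delta_u, grad)]
    return [u0 + d for u0, d in zip(u_first_guess, delta_u)]


u_opt = minimize([0.0, 0.0, 0.0])
print(u_opt, cost_and_gradient(u_opt)[0])
```

In the file-based flow described above, the call to {\tt cost\_and\_gradient} corresponds to reading the control vector for iteration $k$, running the model and its adjoint, and writing the gradient vector for iteration $k$.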
-Line 1648 
 and writes them to the compressed vector
+Line 1918 
 and writes them to the compressed vector
- \subsection{Flow directives and adjoint support routines}
+ \subsection{Flow directives and adjoint support routines \label{section_flowdir}}
- \subsection{Store directives and checkpointing}
+ \subsection{Store directives and checkpointing \label{section_checkpointing}}
- \subsection{Gradient checks}
+ \subsection{Gradient checks \label{section_grdchk}}
  \subsection{Second derivative generation via TAMC}
