Diff of /manual/s_autodiff/text/doc_ad_2.tex

-revision 1.1.1.1 by adcroft,
Wed Aug  8 16:16:26 2001 UTC
+revision 1.24 by jmc,
Tue Aug 31 20:56:21 2010 UTC
 Line 1
  % $Header$
  % $Name$
+ Author: Patrick Heimbach
  {\sf Automatic differentiation} (AD), also referred to as algorithmic
  (or, more loosely, computational) differentiation, involves
- automatically deriving code to calculate
+ automatically deriving code to calculate partial derivatives from an
- partial derivatives from an existing fully non-linear prognostic code.
+ existing fully non-linear prognostic code (see \cite{gri:00}).  A
- (see \cite{gri:00}).
+ software tool is used that parses and transforms source files
- A software tool is used that parses and transforms source files
+ according to a set of linguistic and mathematical rules.  AD tools are
- according to a set of linguistic and mathematical rules.
+ like source-to-source translators in that they parse a program code as
- AD tools are like source-to-source translators in that
+ input and produce a new program code as output
- they parse a program code as input and produce a new program code
+ (we restrict our discussion to source-to-source tools, ignoring
- as output.
+ operator-overloading tools).  However, unlike a
- However, unlike a pure source-to-source translation, the output program
+ pure source-to-source translation, the output program represents a new
- represents a new algorithm, such as the evaluation of the
+ algorithm, such as the evaluation of the Jacobian, the Hessian, or
- Jacobian, the Hessian, or higher derivative operators.
+ higher derivative operators.  In principle, a variety of derived
- In principle, a variety of derived algorithms
+ algorithms can be generated automatically in this way.
- can be generated automatically in this way.
+ MITgcm has been adapted for use with the Tangent linear and Adjoint
- The MITGCM has been adapted for use with the
+ Model Compiler (TAMC) and its successor TAF (Transformation of
- Tangent linear and Adjoint Model Compiler (TAMC) and its succssor TAF
+ Algorithms in Fortran), developed by Ralf Giering (\cite{gie-kam:98},
- (Transformation of Algorithms in Fortran), developed
+ \cite{gie:99,gie:00}).  The first application of the adjoint of MITgcm
- by Ralf Giering (\cite{gie-kam:98}, \cite{gie:99,gie:00}).
+ for sensitivity studies has been published by \cite{maro-eta:99}.
- The first application of the adjoint of the MITGCM for senistivity
+ \cite{stam-etal:97,stam-etal:02} use MITgcm and its adjoint for ocean
- studies has been published by \cite{maro-eta:99}.
+ state estimation studies.  In the following we shall refer to TAMC and
- \cite{sta-eta:97,sta-eta:01} use the MITGCM and its adjoint
+ TAF synonymously, except where explicitly stated otherwise.
- for ocean state estimation studies.
+ As of mid-2007 we are also able to generate fairly efficient
- TAMC exploits the chain rule for computing the first
+ adjoint code of the MITgcm using a new, open-source AD tool,
- derivative of a function with
+ called OpenAD (see \cite{naum-etal:06,utke-etal:08}).
- respect to a set of input variables.
+ This enables us for the first time to compare adjoint models
- Treating a given forward code as a composition of operations --
+ generated from different AD tools, providing an additional
- each line representing a compositional element -- the chain rule is
+ accuracy check, complementary to finite-difference gradient checks.
- rigorously applied to the code, line by line. The resulting
+ OpenAD and its application to MITgcm are described in detail
- tangent linear or adjoint code,
+ in section \ref{sec_ad_openad}.
- then, may be thought of as the composition in
- forward or reverse order, respectively, of the
+ The AD tool exploits the chain rule for computing the first derivative of a
- Jacobian matrices of the forward code compositional elements.
+ function with respect to a set of input variables.  Treating a given
+ forward code as a composition of operations -- each line representing
+ a compositional element -- the chain rule is rigorously applied to the
+ code, line by line. The resulting tangent linear or adjoint code,
+ then, may be thought of as the composition in forward or reverse
+ order, respectively, of the Jacobian matrices of the forward code's
+ compositional elements.
  %**********************************************************************
  \section{Some basic algebra}
  \label{sec_ad_algebra}
+ \begin{rawhtml}
+ <!-- CMIREDIR:sec_ad_algebra: -->
+ \end{rawhtml}
  %**********************************************************************
  Let $ {\cal M} $ be a general nonlinear model, i.e. a
-Line 50 
 $\vec{u}=(u_1,\ldots,u_m)$
+Line 61 
 $\vec{u}=(u_1,\ldots,u_m)$
  such as forcing functions) to the $n$-dimensional space
  $V \subset I\!\!R^n$ of
  model output variable $\vec{v}=(v_1,\ldots,v_n)$
- (model state, model diagnostcs, objective function, ...)
+ (model state, model diagnostics, objective function, ...)
  under consideration,
  %
  \begin{equation}
- \begin{split}
+ \begin{aligned}
  {\cal M} \, : & \, U \,\, \longrightarrow \, V \\
  ~      & \, \vec{u} \,\, \longmapsto \, \vec{v} \, = \,
  {\cal M}(\vec{u})
  \label{fulloperator}
- \end{split}
+ \end{aligned}
  \end{equation}
  %
  The vectors $ \vec{u} \in U $ and $ \vec{v} \in V $ may be represented w.r.t.
-Line 105 
 In contrast to the full nonlinear model
+Line 116 
 In contrast to the full nonlinear model
  $ M $ is just a matrix
  which can readily be used to find the forward sensitivity of $\vec{v}$ to
  perturbations in  $u$,
- but if there are very many input variables $(>>O(10^{6})$ for
+ but if there are very many input variables $(\gg O(10^{6})$ for
  large-scale oceanographic application), it quickly becomes
  prohibitive to proceed directly as in (\ref{tangent_linear}),
  if the impact of each component $ {\bf e_{i}} $ is to be assessed.
-Line 130 
 or a measure of some model-to-data misfi
+Line 141 
 or a measure of some model-to-data misfi
  \label{compo}
  \end{eqnarray}
  %
- The linear approximation of $ {\cal J} $,
+ The perturbation of $ {\cal J} $ around a fixed point $ {\cal J}_0 $,
  \[
- {\cal J} \, \approx \, {\cal J}_0 \, + \, \delta {\cal J}
+ {\cal J} \, = \, {\cal J}_0 \, + \, \delta {\cal J}
  \]
  can be expressed in both bases of $ \vec{u} $ and $ \vec{v} $
  w.r.t. their corresponding inner product
  $\left\langle \,\, , \,\, \right\rangle $
  %
  \begin{equation}
- \begin{split}
+ \begin{aligned}
  {\cal J} & = \,
  {\cal J} |_{\vec{u}^{(0)}} \, + \,
  \left\langle \, \nabla _{u}{\cal J}^T |_{\vec{u}^{(0)}} \, , \, \delta \vec{u} \, \right\rangle
-Line 148 
 $\left\langle \,\, , \,\, \right\rangle
+Line 159 
 $\left\langle \,\, , \,\, \right\rangle
  {\cal J} |_{\vec{v}^{(0)}} \, + \,
  \left\langle \, \nabla _{v}{\cal J}^T |_{\vec{v}^{(0)}} \, , \, \delta \vec{v} \, \right\rangle
  \, + \, O(\delta \vec{v}^2)
- \end{split}
+ \end{aligned}
  \label{deljidentity}
  \end{equation}
  %
- (note, that the gradient $ \nabla f $ is a pseudo-vector, therefore
+ (note that the gradient $ \nabla f $ is a co-vector; therefore
  its transpose is required in the above inner product).
  Then, using the representation of
  $ \delta {\cal J} =
-Line 168 
 transpose of $ A $,
+Line 179 
 transpose of $ A $,
  \[
  A^{\ast} \, = \, A^T
  \]
- and from eq. (\ref{tangent_linear}), we note that
+ and from eqs. (\ref{tangent_linear}) and (\ref{deljidentity}),
+ we note that
  (omitting $|$'s):
  %
  \begin{equation}
-Line 188 
 the gradient $ \nabla _{u}{\cal J} $ can
+Line 200 
 the gradient $ \nabla _{u}{\cal J} $ can
  invoking the adjoint $ M^{\ast } $ of the tangent linear model $ M $
  %
  \begin{equation}
- \begin{split}
+ \begin{aligned}
  \nabla _{u}{\cal J}^T |_{\vec{u}} &
  = \, M^T |_{\vec{u}} \cdot \nabla _{v}{\cal J}^T |_{\vec{v}}  \\
  ~ & = \, M^T |_{\vec{u}} \cdot \delta \vec{v}^{\ast} \\
  ~ & = \, \delta \vec{u}^{\ast}
- \end{split}
+ \end{aligned}
  \label{adjoint}
  \end{equation}
  %
-Line 204 
 the adjoint variable of the model state
+Line 216 
 the adjoint variable of the model state
  $ \delta \vec{u}^{\ast} $ the adjoint variable of the control variable $ \vec{u} $.
  The {\sf reverse} nature of the adjoint calculation can be readily
- seen as follows. Let us decompose ${\cal J}(u)$, thus:
+ seen as follows.
+ Consider a model integration which consists of $ \Lambda $
+ consecutive operations
+ $ {\cal M}_{\Lambda} (  {\cal M}_{\Lambda-1} (
+ ...... ( {\cal M}_{\lambda} (
+ ......
+ ( {\cal M}_{1} ( {\cal M}_{0}(\vec{u}) )))) $,
+ where the ${\cal M}$'s could be the elementary steps, i.e. single lines
+ in the code of the model, or successive time steps of the
+ model integration,
+ starting at step 0 and moving up to step $\Lambda$, with intermediate
+ ${\cal M}_{\lambda} (\vec{u}) = \vec{v}^{(\lambda+1)}$ and final
+ ${\cal M}_{\Lambda} (\vec{u}) = \vec{v}^{(\Lambda+1)} = \vec{v}$.
+ Let ${\cal J}$ be a cost function which explicitly depends on the
+ final state $\vec{v}$ only
+ (this restriction is for clarity only).
+ %
+ ${\cal J}(u)$ may be decomposed according to:
  %
  \begin{equation}
  {\cal J}({\cal M}(\vec{u})) \, = \,
-Line 215 
 seen as follows. Let us decompose ${\cal
+Line 244 
 seen as follows. Let us decompose ${\cal
  \label{compos}
  \end{equation}
  %
- where the ${\cal M}$'s could be the elementary steps, i.e. single lines
+ Then, according to the chain rule, the forward calculation reads,
- in the code of the model,
+ in terms of the Jacobi matrices
- starting at step 0 and moving up to step $\Lambda$, with intermediate
- ${\cal M}_{\lambda} (\vec{u}) = \vec{v}^{(\lambda+1)}$ and final
- ${\cal M}_{\Lambda} (\vec{u}) = \vec{v}^{(\Lambda+1)} = \vec{v}$
- Then, according to the chain rule the forward calculation reads in
- terms of the Jacobi matrices
  (we have omitted the $ | $'s which, nevertheless, are important
  to the aspect of {\it tangent} linearity;
- note also that per definition
+ note also that by definition
  $ \langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \rangle
  = \nabla_v {\cal J} \cdot \delta \vec{v} $ )
  %
  \begin{equation}
- \begin{split}
+ \begin{aligned}
  \nabla_v {\cal J} (M(\delta \vec{u})) & = \,
  \nabla_v {\cal J} \cdot M_{\Lambda}
  \cdot ...... \cdot M_{\lambda} \cdot ...... \cdot
  M_{1} \cdot M_{0} \cdot \delta \vec{u} \\
  ~ & = \, \nabla_v {\cal J} \cdot \delta \vec{v} \\
- \end{split}
+ \end{aligned}
  \label{forward}
  \end{equation}
  %
-Line 243 
 whereas in reverse mode we have
+Line 267 
 whereas in reverse mode we have
  %
  \begin{equation}
  \boxed{
- \begin{split}
+ \begin{aligned}
  M^T ( \nabla_v {\cal J}^T) & = \,
  M_{0}^T \cdot M_{1}^T
  \cdot ...... \cdot M_{\lambda}^T \cdot ...... \cdot
-Line 252 
 M_{\Lambda}^T \cdot \nabla_v {\cal J}^T
+Line 276 
 M_{\Lambda}^T \cdot \nabla_v {\cal J}^T
  \cdot ...... \cdot
  \nabla_{v^{(\lambda)}} {\cal J}^T \\
  ~ & = \, \nabla_u {\cal J}^T
- \end{split}
+ \end{aligned}
  }
  \label{reverse}
  \end{equation}
  %
  clearly expressing the reverse nature of the calculation.
  Eq. (\ref{reverse}) is at the heart of automatic adjoint compilers.
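  %
  As a minimal illustration of eqs. (\ref{forward}) and (\ref{reverse})
  (a toy example chosen here for clarity, not taken from the model code),
  consider $ \Lambda = 1 $ with two inputs $ \vec{u} = (u_1, u_2) $ and
  \[
  \vec{v}^{(1)} = {\cal M}_0(\vec{u}) =
  \left( u_1 u_2 \, , \, u_1 + u_2 \right)^T , \qquad
  v = {\cal M}_1(\vec{v}^{(1)}) =
  \left( v^{(1)}_1 \right)^2 + v^{(1)}_2 , \qquad
  {\cal J}(v) = v .
  \]
  The corresponding Jacobians are
  \[
  M_0 = \left( \begin{array}{cc} u_2 & u_1 \\ 1 & 1 \end{array} \right) ,
  \qquad
  M_1 = \left( \begin{array}{cc} 2 v^{(1)}_1 & 1 \end{array} \right) ,
  \qquad
  \nabla_v {\cal J} = 1 .
  \]
  In forward mode one calculation per input component is needed,
  \[
  \nabla_v {\cal J} \cdot M_1 \cdot M_0 \cdot \vec{e}_1 =
  2 v^{(1)}_1 u_2 + 1 , \qquad
  \nabla_v {\cal J} \cdot M_1 \cdot M_0 \cdot \vec{e}_2 =
  2 v^{(1)}_1 u_1 + 1 ,
  \]
  whereas in reverse mode a single sweep yields both components,
  \[
  \delta \vec{v}^{(1) \, \ast} = M_1^T \cdot \nabla_v {\cal J}^T =
  \left( 2 v^{(1)}_1 \, , \, 1 \right)^T , \qquad
  \delta \vec{u}^{\ast} = M_0^T \cdot \delta \vec{v}^{(1) \, \ast} =
  \left( 2 u_2 v^{(1)}_1 + 1 \, , \, 2 u_1 v^{(1)}_1 + 1 \right)^T .
  \]
  With $ v^{(1)}_1 = u_1 u_2 $ this agrees with differentiating
  $ {\cal J} = (u_1 u_2)^2 + u_1 + u_2 $ directly.
  Note that $ M_1 $ depends on the intermediate result $ v^{(1)}_1 $
  of the forward calculation, which must therefore be available
  during the reverse sweep.
  %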
- The intermediate steps $\lambda$ in
+ If the intermediate steps $\lambda$ in
  eqn. (\ref{compos}) -- (\ref{reverse})
- could represent the model state (forward or adjoint) at each
+ represent the model state (forward or adjoint) at each
- intermediate time step in which case
+ intermediate time step as noted above, then correspondingly,
- $ {\cal M}(\vec{v}^{(\lambda)}) = \vec{v}^{(\lambda+1)} $, and correspondingly,
+ $ M^T (\delta \vec{v}^{(\lambda) \, \ast}) =
- $ M^T (\delta \vec{v}^{(\lambda) \, \ast}) = \delta \vec{v}^{(\lambda-1) \, \ast} $,
+ \delta \vec{v}^{(\lambda-1) \, \ast} $ for the adjoint variables.
- but they can also be viewed more generally as
+ It thus becomes evident that the adjoint calculation also
- single lines of code in the numerical algorithm.
+ yields the adjoint of each model state component
- In both cases it becomes evident that the adjoint calculation
+ $ \vec{v}^{(\lambda)} $ at each intermediate step $ \lambda $, namely
- yields at the same time the adjoint of each model state component
- $ \vec{v}^{(\lambda)} $ at each intermediate step $ l $, namely
  %
  \begin{equation}
  \boxed{
- \begin{split}
+ \begin{aligned}
  \nabla_{v^{(\lambda)}} {\cal J}^T |_{\vec{v}^{(\lambda)}}
  & = \,
  M_{\lambda}^T |_{\vec{v}^{(\lambda)}} \cdot ...... \cdot
  M_{\Lambda}^T |_{\vec{v}^{(\lambda)}} \cdot \delta \vec{v}^{\ast} \\
  ~ & = \, \delta \vec{v}^{(\lambda) \, \ast}
- \end{split}
+ \end{aligned}
  }
  \end{equation}
  %
  in close analogy to eq. (\ref{adjoint}).
  We note in passing that the $\delta \vec{v}^{(\lambda) \, \ast}$
- are the Lagrange multipliers of the model state $ \vec{v}^{(\lambda)}$.
+ are the Lagrange multipliers of the model equations which determine
+ $ \vec{v}^{(\lambda)}$.
- In coponents, eq. (\ref{adjoint}) reads as follows.
+ In components, eq. (\ref{adjoint}) reads as follows.
  Let
  \[
  \begin{array}{rclcrcl}
-Line 308 
 Let
+Line 331 
 Let
  \end{array}
  \]
  denote the perturbations in $\vec{u}$ and $\vec{v}$, respectively,
- and their adjoint varaiables;
+ and their adjoint variables;
  further
  \[
  M \, = \, \left(
-Line 395 
 and the shorthand notation for the adjoi
+Line 418 
 and the shorthand notation for the adjoi
  $ \delta v^{(\lambda) \, \ast}_{j} = \frac{\partial}{\partial v^{(\lambda)}_{j}}
  {\cal J}^T $, $ j = 1, \ldots , n_{\lambda} $,
  for intermediate components, yielding
- \[
+ {\small
- \footnotesize
+ \begin{equation}
+ \begin{aligned}
  \left(
  \begin{array}{c}
  \delta v^{(\lambda) \, \ast}_1 \\
-Line 404 
 for intermediate components, yielding
+Line 428 
 for intermediate components, yielding
  \delta v^{(\lambda) \, \ast}_{n_{\lambda}} \\
  \end{array}
  \right)
- \, = \,
+ \, = &
  \left(
  \begin{array}{ccc}
  \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1}
- & \ldots &
+ & \ldots \,\, \ldots &
  \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1} \\
  \vdots & ~ & \vdots \\
  \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}}
- & \ldots  &
+ & \ldots \,\, \ldots  &
  \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
  \end{array}
  \right)
- %
  \cdot
  %
+ \\ ~ & ~
+ \\ ~ &
+ %
  \left(
  \begin{array}{ccc}
  \frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_1}
-Line 431 
 for intermediate components, yielding
+Line 457 
 for intermediate components, yielding
  \frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}} \\
  \end{array}
  \right)
- \cdot \ldots \ldots \cdot
+ \cdot \, \ldots \, \cdot
  \left(
  \begin{array}{c}
  \delta v^{\ast}_1 \\
-Line 439 
 for intermediate components, yielding
+Line 465 
 for intermediate components, yielding
  \delta v^{\ast}_{n} \\
  \end{array}
  \right)
- \]
+ \end{aligned}
+ \end{equation}
+ }
  Eq. (\ref{forward}) and (\ref{reverse}) are perhaps clearest in
  showing the advantage of the reverse over the forward mode
-Line 450 
 variables $u$
+Line 478 
 variables $u$
  {\it all} intermediate states $ \vec{v}^{(\lambda)} $) are sought.
  In order to be able to solve for each component of the gradient
  $ \partial {\cal J} / \partial u_{i} $ in (\ref{forward})
- a forward calulation has to be performed for each component seperately,
+ a forward calculation has to be performed for each component separately,
  i.e. $ \delta \vec{u} = \delta u_{i} {\vec{e}_{i}} $
  for  the $i$-th forward calculation.
  Then, (\ref{forward}) represents the
-Line 460 
 In contrast, eq. (\ref{reverse}) yields
+Line 488 
 In contrast, eq. (\ref{reverse}) yields
  gradient $\nabla _{u}{\cal J}$ (and all intermediate gradients
  $\nabla _{v^{(\lambda)}}{\cal J}$) within a single reverse calculation.
- Note, that in case $ {\cal J} $ is a vector-valued function
+ Note that if $ {\cal J} $ is a vector-valued function
  of dimension $ l > 1 $,
  eq. (\ref{reverse}) has to be modified according to
  \[
-Line 468 
 M^T \left( \nabla_v {\cal J}^T \left(\de
+Line 496 
 M^T \left( \nabla_v {\cal J}^T \left(\de
  \, = \,
  \nabla_u {\cal J}^T \cdot \delta \vec{J}
  \]
- where now $ \delta \vec{J} \in I\!\!R $ is a vector of dimenison $ l $.
+ where now $ \delta \vec{J} \in I\!\!R^l $ is a vector of
+ dimension $ l $.
  In this case $ l $ reverse simulations have to be performed
  for each $ \delta J_{k}, \,\, k = 1, \ldots, l $.
  Then, the reverse mode is more efficient as long as
  $ l < n $, otherwise the forward mode is preferable.
- Stricly, the reverse mode is called adjoint mode only for
+ Strictly, the reverse mode is called adjoint mode only for
  $ l = 1 $.
  A detailed analysis of the underlying numerical operations
-Line 503 
 operator onto the $j$-th component ${\bf
+Line 532 
 operator onto the $j$-th component ${\bf
  \paragraph{Example 2:
  $ {\cal J} = \langle \, {\cal H}(\vec{v}) - \vec{d} \, ,
   \, {\cal H}(\vec{v}) - \vec{d} \, \rangle $} ~ \\
- The cost function represents the quadratic model vs.data misfit.
+ The cost function represents the quadratic model vs. data misfit.
  Here, $ \vec{d} $ is the data vector and $ {\cal H} $ represents the
  operator which maps the model state space onto the data space.
  Then, $ \nabla_v {\cal J} $ takes the form
  %
  \begin{equation*}
- \begin{split}
+ \begin{aligned}
  \nabla_v {\cal J}^T & = \, 2 \, \, H \cdot
  \left( \, {\cal H}(\vec{v}) - \vec{d} \, \right) \\
  ~          & = \, 2 \sum_{j} \left\{ \sum_k
  \frac{\partial {\cal H}_k}{\partial v_{j}}
  \left( {\cal H}_k (\vec{v}) - d_k \right)
  \right\} \, {\vec{f}_{j}} \\
- \end{split}
+ \end{aligned}
  \end{equation*}
  %
  where $H_{kj} = \partial {\cal H}_k / \partial v_{j} $ is the
-Line 534 
 H \cdot \left( {\cal H}(\vec{v}) - \vec{
+Line 563 
 H \cdot \left( {\cal H}(\vec{v}) - \vec{
  We note an important aspect of the forward vs. reverse
  mode calculation.
- Because of the locality of the derivative,
+ Because of the local character of the derivative
+ (a derivative is defined w.r.t. a point along the trajectory),
  the intermediate results of the model trajectory
  $\vec{v}^{(\lambda+1)}={\cal M}_{\lambda}(v^{(\lambda)})$
- are needed to evaluate the intermediate Jacobian
+ may be required to evaluate the intermediate Jacobian
  $M_{\lambda}|_{\vec{v}^{(\lambda)}} \, \delta \vec{v}^{(\lambda)} $.
+ This is the case e.g. for nonlinear expressions
+ (momentum advection, nonlinear equation of state) and for state-dependent
+ conditional statements (parameterization schemes).
  In the forward mode, the intermediate results are required
  in the same order as computed by the full forward model ${\cal M}$,
- in the reverse mode they are required in the reverse order.
+ but in the reverse mode they are required in the reverse order.
  Thus, in the reverse mode the trajectory of the forward model
  integration ${\cal M}$ has to be stored to be available in the reverse
- calculation. Alternatively, the model state would have to be
+ calculation. Alternatively, the complete model state up to the
- recomputed whenever its value is required.
+ point of evaluation has to be recomputed whenever its value is required.
  A method to balance the amount of recomputations vs.
  storage requirements is called {\sf checkpointing}
- (e.g. \cite{res-eta:98}).
+ (e.g. \cite{gri:92}, \cite{res-eta:98}).
- It is depicted in Fig. ... for a 3-level checkpointing
+ It is depicted in Fig. \ref{fig:3levelcheck} for a 3-level checkpointing
- [as concrete example, we give explicit numbers for a 3-day
+ [as an example, we give explicit numbers for a 3-day
  integration with a 1-hourly timestep in square brackets].
  \begin{itemize}
  %
-Line 559 
 integration with a 1-hourly timestep in
+Line 592 
 integration with a 1-hourly timestep in
  In a first step, the model trajectory is subdivided into
  $ {n}^{lev3} $ subsections [$ {n}^{lev3} $=3 1-day intervals],
  with the label $lev3$ for this outermost loop.
- The model is then integrated over the full trajectory,
+ The model is then integrated along the full trajectory,
- and the model state stored only at every $ k_{i}^{lev3} $-th timestep
+ and the model state stored to disk only at every $ k_{i}^{lev3} $-th timestep
  [i.e. 3 times, at
  $ i = 0,1,2 $ corresponding to $ k_{i}^{lev3} = 0, 24, 48 $].
+ In addition, the cost function is computed, if needed.
  %
  \item [$lev2$]
- In a second step each subsection is itself divided into
+ In a second step each subsection itself is divided into
- $ {n}^{lev2} $ subsubsections
+ $ {n}^{lev2} $ subsections
  [$ {n}^{lev2} $=4 6-hour intervals per subsection].
  The model picks up at the last outermost dumped state
- $ v_{k_{n}^{lev3}} $ and is integrated forward in time over
+ $ v_{k_{n}^{lev3}} $ and is integrated forward in time along
  the last subsection, with the label $lev2$ for this
  intermediate loop.
- The model state is now stored only at every $ k_{i}^{lev2} $-th
+ The model state is now stored to disk at every $ k_{i}^{lev2} $-th
  timestep
  [i.e. 4 times, at
  $ i = 0,1,2,3 $ corresponding to $ k_{i}^{lev2} = 48, 54, 60, 66 $].
  %
  \item [$lev1$]
- Finally, the mode picks up at the last intermediate dump state
+ Finally, the model picks up at the last intermediate dump state
- $ v_{k_{n}^{lev2}} $ and is integrated forward in time over
+ $ v_{k_{n}^{lev2}} $ and is integrated forward in time along
- the last subsubsection, with the label $lev1$ for this
+ the last subsection, with the label $lev1$ for this
  intermediate loop.
- Within this subsubsection only, the model state is stored
+ Within this sub-subsection only, parts of the model state are stored
- at every timestep
+ to memory at every timestep
  [i.e. every hour $ i=0,...,5$ corresponding to
  $ k_{i}^{lev1} = 66, 67, \ldots, 71 $].
- Thus, the  final state $ v_n = v_{k_{n}^{lev1}} $ is reached
+ The  final state $ v_n = v_{k_{n}^{lev1}} $ is reached
- and the model state of all peceeding timesteps over the last
+ and the model states of all preceding timesteps along the last
- subsubsections are available, enabling integration backwards
+ innermost subsection are available, enabling integration backwards
- in time over the last subsubsection.
+ in time along the last subsection.
- Thus, the adjoint can be computed over this last
+ The adjoint can thus be computed along this last
- subsubsection $k_{n}^{lev2}$.
+ subsection $k_{n}^{lev2}$.
  %
  \end{itemize}
  %
  This procedure is repeated consecutively for each previous
- subsubsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $
+ subsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $
  carrying the adjoint computation to the initial time
  of the subsection $k_{n}^{lev3}$.
  Then, the procedure is repeated for the previous subsection
-Line 607 
 $k_{1}^{lev3}$.
+Line 641 
 $k_{1}^{lev3}$.
  For the full model trajectory of
  $ n^{lev3} \cdot n^{lev2} \cdot n^{lev1} $ timesteps
  the required storing of the model state was significantly reduced to
- $ n^{lev1} + n^{lev2} + n^{lev3} $
+ $ n^{lev2} + n^{lev3} $ to disk and roughly $ n^{lev1} $ to memory
  [i.e. for the 3-day integration with a total of 72 timesteps
- the model state was stored 13 times].
+ the model state was stored 7 times to disk and roughly 6 times
+ to memory].
  This saving in memory comes at the cost of 3 required
  full forward integrations of the model (one for each
  checkpointing level).
- The balance of storage vs. recomputation certainly depends
+ The optimal balance of storage vs. recomputation certainly depends
- on the computing resources available.
+ on the computing resources available and may be tuned by
+ changing the partitioning among the
+ $ n^{lev3}, \,\, n^{lev2}, \,\, n^{lev1} $.
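  %
  As a rough illustration of this trade-off, the counts quoted above
  can be tabulated for the 3-day example and for two alternative
  partitionings of the same 72 timesteps
  (the alternatives are hypothetical and serve only to show how the
  numbers shift; they are not recommended settings):
  \[
  \begin{array}{ccc|c|cc}
  n^{lev3} & n^{lev2} & n^{lev1} &
  n^{lev3} \cdot n^{lev2} \cdot n^{lev1} &
  \mbox{to disk: } n^{lev2} + n^{lev3} &
  \mbox{to memory: } \sim n^{lev1} \\
  \hline
  3 & 4 & 6 & 72 & 7  & 6 \\
  2 & 6 & 6 & 72 & 8  & 6 \\
  6 & 4 & 3 & 72 & 10 & 3 \\
  \end{array}
  \]
  In each case this is to be compared with 72 stored states
  if the full trajectory were kept without checkpointing.
  %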
  \begin{figure}[t!]
- \centering
+ \begin{center}
  %\psdraft
- \psfrag{v_k1^lev3}{\mathinfigure{v_{k_{1}^{lev3}}}}
+ %\psfrag{v_k1^lev3}{\mathinfigure{v_{k_{1}^{lev3}}}}
- \psfrag{v_kn-1^lev3}{\mathinfigure{v_{k_{n-1}^{lev3}}}}
+ %\psfrag{v_kn-1^lev3}{\mathinfigure{v_{k_{n-1}^{lev3}}}}
- \psfrag{v_kn^lev3}{\mathinfigure{v_{k_{n}^{lev3}}}}
+ %\psfrag{v_kn^lev3}{\mathinfigure{v_{k_{n}^{lev3}}}}
- \psfrag{v_k1^lev2}{\mathinfigure{v_{k_{1}^{lev2}}}}
+ %\psfrag{v_k1^lev2}{\mathinfigure{v_{k_{1}^{lev2}}}}
- \psfrag{v_kn-1^lev2}{\mathinfigure{v_{k_{n-1}^{lev2}}}}
+ %\psfrag{v_kn-1^lev2}{\mathinfigure{v_{k_{n-1}^{lev2}}}}
- \psfrag{v_kn^lev2}{\mathinfigure{v_{k_{n}^{lev2}}}}
+ %\psfrag{v_kn^lev2}{\mathinfigure{v_{k_{n}^{lev2}}}}
- \psfrag{v_k1^lev1}{\mathinfigure{v_{k_{1}^{lev1}}}}
+ %\psfrag{v_k1^lev1}{\mathinfigure{v_{k_{1}^{lev1}}}}
- \psfrag{v_kn^lev1}{\mathinfigure{v_{k_{n}^{lev1}}}}
+ %\psfrag{v_kn^lev1}{\mathinfigure{v_{k_{n}^{lev1}}}}
- \mbox{\epsfig{file=part5/checkpointing.eps, width=0.8\textwidth}}
+ %\mbox{\epsfig{file=s_autodiff/figs/checkpointing.eps, width=0.8\textwidth}}
+ \resizebox{5.5in}{!}{\includegraphics{s_autodiff/figs/checkpointing.eps}}
  %\psfull
- \caption
+ \end{center}
- {Schematic view of intermediate dump and restart for
+ \caption{
+ Schematic view of intermediate dump and restart for
  3-level checkpointing.}
- \label{fig:erswns}
+ \label{fig:3levelcheck}
  \end{figure}
- \subsection{Optimal perturbations}
+ % \subsection{Optimal perturbations}
- \label{optpert}
+ % \label{sec_optpert}
- \subsection{Error covariance estimate and Hessian matrix}
+ % \subsection{Error covariance estimate and Hessian matrix}
- \label{sec_hessian}
+ % \label{sec_hessian}
  \newpage
  %**********************************************************************
- \section{AD-specific setup by example: sensitivity of carbon sequestration}
+ \section{TLM and ADM generation in general}
- \label{sec_ad_setup_ex}
+ \label{sec_ad_setup_gen}
+ \begin{rawhtml}
+ <!-- CMIREDIR:sec_ad_setup_gen: -->
+ \end{rawhtml}
  %**********************************************************************
- The MITGCM has been adapted to enable AD using TAMC or TAF
+ In this section we describe in a general fashion
- (we'll refer to TAMC and TAF interchangeably, except where
+ the parts of the code that are relevant for automatic
- distinctions are explicitly mentioned).
+ differentiation using the software tool TAF.
- The present description, therefore, is specific to the
+ Modifications to use OpenAD are described in Section \ref{sec_ad_openad}.
- use of TAMC as AD tool.
- The following sections describe the steps which are necessary to
+ \input{s_autodiff/text/doc_ad_the_model}
- generate a tangent linear or adjoint model of the MITGCM.
- We take as an example the sensitivity of carbon sequestration
+ The basic flow is depicted in Fig. \ref{fig:adthemodel}.
- in the ocean.
+ If CPP option \texttt{ALLOW\_AUTODIFF\_TAMC} is defined,
- The AD-relevant hooks in the code are sketched in
+ the driver routine
- \reffig{adthemodel}, \reffig{adthemain}.
+ {\it the\_model\_main}, instead of calling {\it the\_main\_loop},
+ invokes the adjoint of this routine, {\it adthe\_main\_loop}
- \subsection{Overview of the experiment}
+ (case \texttt{\#define ALLOW\_ADJOINT\_RUN}), or
+ the tangent linear of this routine {\it g\_the\_main\_loop}
- We describe an adjoint sensitivity analysis of outgassing from
+ (case \texttt{\#define ALLOW\_TANGENTLINEAR\_RUN}),
- the ocean into the atmosphere of a carbon like tracer injected
+ which are the top-level routines in terms of automatic differentiation.
- into the ocean interior (see \cite{hil-eta:01}).
+ The routines {\it adthe\_main\_loop} or {\it g\_the\_main\_loop}
+ are generated by TAF.
- \subsubsection{Passive tracer equation}
+ The routine {\it adthe\_main\_loop} contains the forward integration of the full model, the
+ cost function calculation,
- For this work the MITGCM was augmented with a thermodynamically
+ any additional storing that is required for efficient checkpointing,
- inactive tracer, $C$. Tracer residing in the ocean
+ and the reverse integration of the adjoint model.
- model surface layer is outgassed according to a relaxation time scale,
- $\mu$. Within the ocean interior, the tracer is passively advected
+ [DESCRIBE IN A SEPARATE SECTION THE WORKING OF THE TLM]
- by the ocean model currents. The full equation for the time evolution
- %
+ In Fig. \ref{fig:adthemodel}
- \begin{equation}
+ the structure of {\it adthe\_main\_loop} has been strongly
- \label{carbon_ddt}
+ simplified to focus on the essentials; in particular, no checkpointing
- \frac{\partial C}{\partial t} \, = \,
+ procedures are shown here.
- -U\cdot \nabla C \, - \, \mu C \, + \, \Gamma(C) \,+ \, S
+ Prior to the call of {\it adthe\_main\_loop}, the routine
- \end{equation}
+ {\it ctrl\_unpack} is invoked to unpack the control vector
- %
+ or initialise the control variables.
- also includes a source term $S$. This term
+ Following the call of {\it adthe\_main\_loop},
- represents interior sources of $C$ such as would arise due to
+ the routine {\it ctrl\_pack}
- direct injection.
+ is invoked to pack the control vector
- The velocity term, $U$, is the sum of the
+ (cf. Section \ref{section_ctrl}).
- model Eulerian circulation and an eddy-induced velocity, the latter
+ If gradient checks are to be performed, the option
- parameterized according to Gent/McWilliams (\cite{gen:90, dan:95}).
+ {\tt ALLOW\_GRADIENT\_CHECK} is defined. In this case
- The convection function, $\Gamma$, mixes $C$ vertically wherever the
+ the driver routine {\it grdchk\_main} is called after
- fluid is locally statically unstable.
+ the gradient has been computed via the adjoint
+ (cf. Section \ref{sec:ad_gradient_check}).
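  %
  To indicate what such a check compares, a generic finite-difference
  test may be sketched as follows (assuming, for illustration only,
  a one-sided difference with a small perturbation $ \epsilon $;
  the scheme actually implemented in {\it grdchk\_main} may differ).
  For a selected component $ u_i $ of the control vector,
  \[
  g_i^{fd} \, = \,
  \frac{ {\cal J}(\vec{u} + \epsilon \, \vec{e}_i) - {\cal J}(\vec{u}) }
  {\epsilon} ,
  \qquad
  g_i^{ad} \, = \, \delta u^{\ast}_i \, = \,
  \frac{\partial {\cal J}}{\partial u_i} ,
  \]
  and the adjoint-derived gradient is considered consistent if the
  relative deviation $ 1 - g_i^{fd} / g_i^{ad} $ is small.
  Each $ g_i^{fd} $ requires one additional forward integration,
  so that in practice only a subset of components would be tested.
  %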
- The outgassing time scale, $\mu$, in eqn. (\ref{carbon_ddt})
- is set so that \( 1/\mu \sim 1 \ \mathrm{year} \) for the surface
+ %------------------------------------------------------------------
- ocean and $\mu=0$ elsewhere. With this value, eqn. (\ref{carbon_ddt})
- is valid as a prognostic equation for small perturbations in oceanic
+ \subsection{General setup
- carbon concentrations. This configuration provides a
+ \label{section_ad_setup}}
- powerful tool for examining the impact of large-scale ocean circulation
- on $ CO_2 $ outgassing due to interior injections.
+ In order to configure AD-related setups the following packages need
- As source we choose a constant in time injection of
+ to be enabled:
- $ S = 1 \,\, {\rm mol / s}$.
+ {\it
+ \begin{table}[!ht]
- \subsubsection{Model configuration}
+ \begin{tabular}{l}
+ autodiff \\
- The model configuration employed has a constant
+ ctrl \\
- $4^\circ \times 4^\circ$ resolution horizontal grid and realistic
+ cost \\
- geography and bathymetry. Twenty vertical layers are used with
+ grdchk \\
- vertical spacing ranging
+ \end{tabular}
- from 50 m near the surface to 815 m at depth.
+ \end{table}
- Driven to steady-state by climatalogical wind-stress, heat and
+ }
- fresh-water forcing the model reproduces well known large-scale
+ The packages are enabled by adding them to your experiment-specific
- features of the ocean general circulation.
+ configuration file
+ {\it packages.conf} (see Section ???).
- \subsubsection{Outgassing cost function}
+ The following AD-specific CPP option files need to be customized:
- To quantify and understand outgassing due to injections of $C$
- in eqn. (\ref{carbon_ddt}),
- we define a cost function $ {\cal J} $ that measures the total amount of
- tracer outgassed at each timestep:
- %
- \begin{equation}
- \label{cost_tracer}
- {\cal J}(t=T)=\int_{t=0}^{t=T}\int_{A} \mu C \, dA \, dt
- \end{equation}
- %
- Equation(\ref{cost_tracer}) integrates the outgassing term, $\mu C$,
- from (\ref{carbon_ddt})
- over the entire ocean surface area, $A$, and accumulates it
- up to time $T$.
- Physically, ${\cal J}$ can be thought of as representing the amount of
- $CO_2$ that our model predicts would be outgassed following an
- injection at rate $S$.
- The sensitivity of ${\cal J}$ to the spatial location of $S$,
- $\frac{\partial {\cal J}}{\partial S}$,
- can be used to identify regions from which circulation
- would cause $CO_2$ to rapidly outgas following injection
- and regions in which $CO_2$ injections would remain effectively
- sequesterd within the ocean.
- \subsection{Code configuration}
- The model configuration for this experiment resides under the
- directory {\it verification/carbon/}.
- The code customisation routines are in {\it verification/carbon/code/}:
  %
  \begin{itemize}
  %
- \item {\it .genmakerc}
+ \item {\it ECCO\_CPPOPTIONS.h} \\
- %
+ This header file collects CPP options for the packages
- \item {\it COST\_CPPOPTIONS.h}
+ {\it autodiff, cost, ctrl} as well as AD-unrelated options for
- %
+ the external forcing package {\it exf}.
- \item {\it CPP\_EEOPTIONS.h}
+ \footnote{NOTE: These options are not set in their package-specific
- %
+ headers such as {\it COST\_CPPOPTIONS.h}, but are instead collected
- \item {\it CPP\_OPTIONS.h}
+ in the single header file {\it ECCO\_CPPOPTIONS.h}.
- %
+ The package-specific header files serve as simple
- \item {\it CTRL\_OPTIONS.h}
+ placeholders at this point.}
  %
- \item {\it ECCO\_OPTIONS.h}
+ \item {\it tamc.h} \\
- %
+ This header configures the splitting of the time stepping loop
- \item {\it SIZE.h}
+ w.r.t. the 3-level checkpointing (see section ???).
- %
- \item {\it adcommon.h}
- %
- \item {\it tamc.h}
  %
  \end{itemize}
+ %------------------------------------------------------------------
+ \subsection{Building the AD code using TAF
+ \label{section_ad_build}}
+ The build process of an AD code is very similar to building
+ the forward model. However, depending on which AD code one wishes
+ to generate, and on which AD tool is available (TAF or TAMC),
+ the following {\tt make} targets are available:
+ \begin{table}[!ht]
+ {\footnotesize
+ \begin{tabular}{|ccll|}
+ \hline
+ ~ & {\it AD-target} & {\it output} & {\it description} \\
+ \hline
+ \hline
+ (1) & {\tt <MODE><TOOL>only} & {\tt <MODE>\_<TOOL>\_output.f}  &
+ generates code for $<$MODE$>$ using $<$TOOL$>$ \\
+ ~ & ~ & ~ & no {\tt make} dependencies on {\tt .F .h} \\
+ ~ & ~ & ~ & useful for compiling on remote platforms \\
+ \hline
+ (2) & {\tt <MODE><TOOL>} & {\tt <MODE>\_<TOOL>\_output.f}  &
+ generates code for $<$MODE$>$ using $<$TOOL$>$ \\
+ ~ & ~ & ~ & includes {\tt make} dependencies on {\tt .F .h} \\
+ ~ & ~ & ~ & i.e. input for $<$TOOL$>$ may be re-generated \\
+ \hline
+ (3) & {\tt <MODE>all} & {\tt mitgcmuv\_<MODE>}  &
+ generates code for $<$MODE$>$ using $<$TOOL$>$ \\
+ ~ & ~ & ~ & and compiles all code \\
+ ~ & ~ & ~ & (use of TAF is set as default) \\
+ \hline
+ \end{tabular}
+ }
+ \end{table}
  %
- The runtime flag and parameters settings are contained in
+ Here, the following placeholders are used
- {\it verification/carbon/input/},
- together with the forcing fields and and restart files:
  %
  \begin{itemize}
  %
- \item {\it data}
+ \item $<$TOOL$>$
  %
- \item {\it data.cost}
+ \begin{itemize}
- %
- \item {\it data.ctrl}
- %
- \item {\it data.pkg}
- %
- \item {\it eedata}
- %
- \item {\it topog.bin}
- %
- \item {\it windx.bin, windy.bin}
- %
- \item {\it salt.bin, theta.bin}
- %
- \item {\it SSS.bin, SST.bin}
  %
- \item {\it pickup*}
+ \item {\tt TAF}
+ \item {\tt TAMC}
  %
  \end{itemize}
  %
- Finally, the file to generate the adjoint code resides in
+ \item $<$MODE$>$
- $ adjoint/ $:
  %
  \begin{itemize}
  %
- \item {\it makefile}
+ \item {\tt ad} generates the adjoint model (ADM)
+ \item {\tt ftl} generates the tangent linear model (TLM)
+ \item {\tt svd} generates both ADM and TLM for \\
+ singular value decomposition (SVD) type calculations
  %
  \end{itemize}
  %
+ \end{itemize}
- Below we describe the customisations of this files which are
+ For example, to generate the adjoint model using TAF after routines ({\tt .F})
- specific to this experiment.
+ or headers ({\tt .h}) have been modified, but without compilation,
+ type {\tt make adtaf};
- \subsubsection{File {\it .genmakerc}}
+ or, to generate the tangent linear model using TAMC without
- This file overwites default settings of {\it genmake}.
+ re-generating the input code, type {\tt make ftltamconly}.
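+ %
+ The naming scheme of the table above can be sketched in a few lines of
+ Python. This is an illustration only: the helper names {\tt ad\_target}
+ and {\tt all\_target} are hypothetical and not part of {\tt genmake2},
+ and the lower-case spelling of $<$TOOL$>$ is inferred from the examples
+ {\tt make adtaf} and {\tt make ftltamconly}.

```python
# Hypothetical helpers that mirror the target naming scheme of the table.
# They are not part of genmake2; they only compose the names.
MODES = ("ad", "ftl", "svd")   # adjoint, tangent linear, both (SVD)
TOOLS = ("taf", "tamc")        # lower case, as in "make adtaf"


def ad_target(mode, tool, regenerate_input=True):
    """Return (make target, generated file) for one MODE/TOOL pair.

    regenerate_input=False selects the "<MODE><TOOL>only" variant,
    which has no make dependencies on the .F and .h files.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    suffix = "" if regenerate_input else "only"
    return f"{mode}{tool}{suffix}", f"{mode}_{tool}_output.f"


def all_target(mode):
    """Return (make target, executable) for the "<MODE>all" variant."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    return f"{mode}all", f"mitgcmuv_{mode}"
```

+ Under these assumptions, {\tt ad\_target("ftl", "tamc", regenerate\_input=False)}
+ returns the target name {\tt ftltamconly} and the file name
+ {\tt ftl\_tamc\_output.f}.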
- In the present example it is used to switch on the following
- packages which are related to automatic differentiation
- and are disabled by default: \\
- \hspace*{4ex} {\tt set ENABLE=( autodiff cost ctrl ecco )}  \\
- Other packages which are not needed are switched off: \\
- \hspace*{4ex} {\tt set DISABLE=( aim obcs zonal\_filt shap\_filt cal exf )}
- \subsubsection{File {\it COST\_CPPOPTIONS.h,  CTRL\_OPTIONS.h}}
- These files used to contain package-specific CPP-options
- (see Section \ref{???}).
- For technical reasons those options have been grouped together
- in the file {\it ECCO\_OPTIONS.h}.
- To retain the modularity, the files have been kept and contain
- the standard include of the {\it CPP\_OPTIONS.h} file.
- \subsubsection{File {\it CPP\_EEOPTIONS.h}}
- This file contains 'wrapper'-specific CPP options.
- It only needs to be changed if the code is to be run
- in  parallel environment (see Section \ref{???}).
- \subsubsection{File {\it CPP\_OPTIONS.h}}
- This file contains model-specific CPP options
- (see Section \ref{???}).
- Most options are related to the forward model setup.
- They are identical to the global steady circulation setup of
- {\it verification/exp2/}.
- The option specific to this experiment is \\
- \hspace*{4ex} {\tt \#define ALLOW\_MIT\_ADJOINT\_RUN} \\
- This flag enables the inclusion of some AD-related fields
- concerning initialisation, link between control variables
- and forward model variables, and the call to the top-level
- forward/adjoint subroutine {\it adthe\_main\_loop}
- instead of {\it the\_main\_loop}.
- \subsubsection{File {\it ECCO\_OPTIONS.h}}
- The CPP options of several AD-related packages are grouped
- in this file:
- %
- \begin{itemize}
- %
- \item
- Adjoint support package: {\it pkg/autodiff/} \\
- This package contains hand-written adjoint code such as
- active file handling, flow directives for files which must not
- be differentiated, and TAMC-specific header files. \\
- \hspace*{4ex} {\tt \#define ALLOW\_AUTODIFF\_TAMC} \\
- defines TAMC-related features in the code. \\
- \hspace*{4ex} {\tt \#define ALLOW\_TAMC\_CHECKPOINTING} \\
- enables the checkpointing feature of TAMC
- (see Section \ref{???}).
- In the present example a 3-level checkpointing is implemented.
- The code contains the relevant store directives, common block
- and tape initialisations, storing key computation,
- and loop index handling.
- The checkpointing length at each level is defined in
- file {\it tamc.h}, cf. below.
- %
- \item Cost function package: {\it pkg/cost/} \\
- This package contains all relevant routines for
- initialising, accumulating and finalizing the cost function
- (see Section \ref{???}). \\
- \hspace*{4ex} {\tt \#define ALLOW\_COST} \\
- enables all general aspects of the cost function handling,
- in particular the hooks in the foorward code for
- initialising, accumulating and finalizing the cost function. \\
- \hspace*{4ex} {\tt \#define ALLOW\_COST\_TRACER} \\
- includes the subroutine with the cost function for this
- particular experiment, eqn. (\ref{cost_tracer}).
- %
- \item Control variable package: {\it pkg/ctrl/} \\
- This package contains all relevant routines for
- the handling of the control vector.
- Each control variable can be enabled/disabled with its own flag: \\
- \begin{tabular}{ll}
- \hspace*{2ex} {\tt \#define ALLOW\_THETA0\_CONTROL} &
- initial temperature \\
- \hspace*{2ex} {\tt \#define ALLOW\_SALT0\_CONTROL} &
- initial salinity \\
- \hspace*{2ex} {\tt \#define ALLOW\_TR0\_CONTROL} &
- initial passive tracer concentration \\
- \hspace*{2ex} {\tt \#define ALLOW\_TAUU0\_CONTROL} &
- zonal wind stress \\
- \hspace*{2ex} {\tt \#define ALLOW\_TAUV0\_CONTROL} &
- meridional wind stress \\
- \hspace*{2ex} {\tt \#define ALLOW\_SFLUX0\_CONTROL} &
- freshwater flux \\
- \hspace*{2ex} {\tt \#define ALLOW\_HFLUX0\_CONTROL} &
- heat flux \\
- \hspace*{2ex} {\tt \#undef ALLOW\_DIFFKR\_CONTROL} &
- diapycnal diffusivity \\
- \hspace*{2ex} {\tt \#undef ALLOW\_KAPPAGM\_CONTROL} &
- isopycnal diffusivity \\
- \end{tabular}
- %
- \end{itemize}
- \subsubsection{File {\it SIZE.h}}
+ A typical full build process to generate the ADM via TAF would
+ look as follows:
+ \begin{verbatim}
+ % mkdir build
+ % cd build
+ % ../../../tools/genmake2 -mods=../code_ad
+ % make depend
+ % make adall
+ \end{verbatim}
- The file contains the grid point dimensions of the forward
+ %------------------------------------------------------------------
- model. It is identical to the {\it verification/exp2/}: \\
- \hspace*{4ex} {\tt sNx = 90} \\
- \hspace*{4ex} {\tt sNy = 40} \\
- \hspace*{4ex} {\tt Nr = 20} \\
- It correpsponds to a single-tile/single-processor setup:
- {\tt nSx = nSy = 1, nPx = nPy = 1},
- with standard overlap dimensioning
- {\tt OLx = OLy = 3}.
- \subsubsection{File {\it adcommon.h}}
- This file contains common blocks of some adjoint variables
- that are generated by TAMC.
- The common blocks are used by the adjoint support routine
- {\it addummy\_in\_stepping} which needs to access those variables:
- \begin{tabular}{ll}
- \hspace*{4ex} {\tt common /addynvars\_r/} &
- \hspace*{4ex} is related to {\it DYNVARS.h} \\
- \hspace*{4ex} {\tt common /addynvars\_cd/} &
- \hspace*{4ex} is related to {\it DYNVARS.h} \\
- \hspace*{4ex} {\tt common /adtr1\_r/} &
- \hspace*{4ex} is related to {\it TR1.h} \\
- \hspace*{4ex} {\tt common /adffields/} &
- \hspace*{4ex} is related to {\it FFIELDS.h}\\
- \end{tabular}
- Note that if the structure of the common block changes in the
+ \subsection{The AD build process in detail
- above header files of the forward code, the structure
+ \label{section_ad_build_detail}}
- of the adjoint common blocks will change accordingly.
- Thus, it has to be made sure that the structure of the
- adjoint common block in the hand-written file {\it adcommon.h}
- complies with the automatically generated adjoint common blocks
- in {\it adjoint\_model.F}.
- \subsubsection{File {\it tamc.h}}
+ The {\tt make <MODE>all} target consists of the following procedures:
- This routine contains the dimensions for TAMC checkpointing.
+ \begin{enumerate}
  %
+ \item
+ A header file {\tt AD\_CONFIG.h} is generated which contains a CPP option
+ selecting which code is to be generated. Depending on the {\tt make} target,
+ its content is one of the following:
  \begin{itemize}
+ \item
+ {\tt \#define ALLOW\_ADJOINT\_RUN}
+ \item
+ {\tt \#define ALLOW\_TANGENTLINEAR\_RUN}
+ \item
+ {\tt \#define ALLOW\_ECCO\_OPTIMIZATION}
+ \end{itemize}
  %
- \item {\tt \#ifdef ALLOW\_TAMC\_CHECKPOINTING} \\
+ \item
- 3-level checkpointing is enabled, i.e. the timestepping
+ A single file {\tt <MODE>\_input\_code.f} is concatenated
- is divided into three different levels (see Section \ref{???}).
+ consisting of all {\tt .f} files that are part of the list {\bf AD\_FILES}
- The model state of the outermost ({\tt nchklev\_3}) and the
+ and all {\tt .flow} files that are part of the list {\bf AD\_FLOW\_FILES}.
- itermediate ({\tt nchklev\_2}) timestepping loop are stored to file
+ %
- (handled in {\it the\_main\_loop}).
+ \item
- The innermost loop ({\tt nchklev\_1})
+ The AD tool is invoked with the {\tt <MODE>\_<TOOL>\_FLAGS}.
- avoids I/O by storing all required variables
+ The default AD tool flags in {\tt genmake2} can be overwritten by
- to common blocks. This storing may also be necessary if
+ an {\tt adjoint\_options} file (similar to the platform-specific
- no checkpointing is chosen
+ {\tt build\_options}, see Section ???).
- (nonlinear functions, if-statements, iterative loops, ...).
+ The AD tool writes the resulting AD code into the file
- In the present example the dimensions are chosen as follows: \\
+ {\tt <MODE>\_input\_code\_ad.f}
- \hspace*{4ex} {\tt nchklev\_1      =  36 } \\
+ %
- \hspace*{4ex} {\tt nchklev\_2      =  30 } \\
+ \item
- \hspace*{4ex} {\tt nchklev\_3      =  60 } \\
+ A short sed script {\tt adjoint\_sed} is applied to
- To guarantee that the checkpointing intervals span the entire
+ {\tt <MODE>\_input\_code\_ad.f}
- integration period the relation \\
+ to reinstate {\bf myThid} into the CALL argument list of active file I/O.
- \hspace*{4ex} {\tt nchklev\_1*nchklev\_2*nchklev\_3 $ \ge $ nTimeSteps} \\
+ The result is written to file {\tt <MODE>\_<TOOL>\_output.f}.
- where {\tt nTimeSteps} is either specified in {\it data}
+ %
- or computed via \\
+ \item
- \hspace*{4ex} {\tt nTimeSteps = (endTime-startTime)/deltaTClock }.
+ All routines are compiled and an executable is generated
- %
+ (see Table ???).
- \item {\tt \#undef ALLOW\_TAMC\_CHECKPOINTING} \\
- No checkpointing is enabled.
- In this case the relevant counter is {\tt nchklev\_0}.
- Similar to above, the following relation has to be satisfied \\
- \hspace*{4ex} {\tt nchklev\_0 $ \ge $ nTimeSteps}.
  %
- \end{itemize}
+ \end{enumerate}
- \subsubsection{File {\it makefile}}
+ \subsubsection{The list AD\_FILES and {\tt .list} files}
- This file contains all relevant paramter flags and
+ Not all routines are presented to the AD tool.
- lists to run TAMC.
+ Routines typically hidden are diagnostics routines which
- It is assumed that TAMC is available to you, either locally,
+ do not influence the cost function, but may create
- being installed on your network, or remotely through the 'TAMC Utility'.
+ artificial flow dependencies such as I/O of active variables.
- TAMC is called with the command {\tt tamc} followed by a
- number of options. They are described in detail in the
+ {\tt genmake2} generates a list (or variable) {\bf AD\_FILES}
- TAMC manual \cite{gie:99}.
+ which contains all routines that are shown to the AD tool.
- Here we briefly discuss the main flags used in the {\it makefile}
+ This list is put together from all files with suffix {\tt .list}
+ that {\tt genmake2} finds in its search directories.
+ The list file for the core MITgcm routines resides in {\tt model/src/}
+ and is called {\tt model\_ad\_diff.list}.
+ Note that no wrapper routine is shown to TAF. These are either
+ not visible at all to the AD code, or hand-written AD code
+ is available (see next section).
+ Each package directory contains its package-specific
+ list file {\tt <PKG>\_ad\_diff.list}. For example,
+ {\tt pkg/ptracers/} contains the file {\tt ptracers\_ad\_diff.list}.
+ Thus, enabling a package will automatically extend the
+ {\bf AD\_FILES} list of {\tt genmake2} to incorporate the
+ package-specific routines.
+ Note that you will need to regenerate the {\tt Makefile} if
+ you enable a package (e.g. by adding it to {\tt packages.conf})
+ and a {\tt Makefile} already exists.
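+ %
+ How the {\bf AD\_FILES} list grows when a package is enabled can be
+ illustrated with a short Python sketch. It is not the {\tt genmake2}
+ implementation: the function name {\tt collect\_ad\_files} is hypothetical,
+ the assumption that a {\tt .list} file holds one file name per line is ours,
+ and the routine names written by {\tt demo} are invented placeholders.

```python
import os
import tempfile


def collect_ad_files(search_dirs, suffix=".list"):
    """Gather entries from every file ending in `suffix`.

    Assumption: each list file holds one routine file name per line.
    Directories are scanned in the order given; duplicates are dropped.
    """
    routines = []
    for directory in search_dirs:
        for name in sorted(os.listdir(directory)):
            if not name.endswith(suffix):
                continue
            with open(os.path.join(directory, name)) as handle:
                for line in handle:
                    entry = line.strip()
                    if entry and entry not in routines:
                        routines.append(entry)
    return routines


def demo():
    """Build a throw-away directory tree and compare two search paths."""
    with tempfile.TemporaryDirectory() as root:
        model = os.path.join(root, "model", "src")
        ptracers = os.path.join(root, "pkg", "ptracers")
        os.makedirs(model)
        os.makedirs(ptracers)
        # Placeholder contents; real list files name actual routines.
        with open(os.path.join(model, "model_ad_diff.list"), "w") as handle:
            handle.write("core_routine_a.f\ncore_routine_b.f\n")
        with open(os.path.join(ptracers, "ptracers_ad_diff.list"), "w") as handle:
            handle.write("ptracers_routine.f\n")
        core_only = collect_ad_files([model])
        with_package = collect_ad_files([model, ptracers])
    return core_only, with_package
```

+ In this sketch, adding the package directory to the search path is the
+ only change needed for the package routines to appear in the list.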
+ \subsubsection{The list AD\_FLOW\_FILES and {\tt .flow} files}
+ TAMC and TAF can evaluate user-specified directives
+ that start with a specific syntax ({\tt CADJ}, {\tt C\$TAF}, {\tt !\$TAF}).
+ The main categories of directives are STORE directives and
+ FLOW directives. Here, we are concerned with flow directives;
+ store directives are treated elsewhere.
+ Flow directives enable the AD tool to evaluate how it should treat
+ routines that are 'hidden' by the user, i.e. routines which are
+ not contained in the {\bf AD\_FILES} list (see previous section),
+ but which are called in part of the code that the AD tool does see.
+ The flow directives tell the AD tool
  %
  \begin{itemize}
- \item [{\tt tamc}] {\tt
- -input <variable names>
- -output <variable name> ... \\
- -toplevel <S/R name> -reverse <file names>
- }
- \end{itemize}
  %
- \begin{itemize}
+ \item which subroutine arguments are input/output
- %
+ \item which subroutine arguments are active
- \item {\tt -toplevel <S/R name>} \\
+ \item which subroutine arguments are required to compute the cost
- Name of the toplevel routine, with respect to which the
+ \item which subroutine arguments are dependent
- control flow analysis is performed.
- %
- \item {\tt -input <variable names>} \\
- List of independent variables $ u $ with respect to which the
- dependent variable $ J $ is differentiated.
- %
- \item {\tt -output <variable name>} \\
- Dependent variable $ J $  which is to be differentiated.
- %
- \item {\tt -reverse <file names>} \\
- Adjoint code is generated to compute the sensitivity of an
- independent variable w.r.t.  many dependent variables.
- The generated adjoint top-level routine computes the product
- of the transposed Jacobian matrix $ M^T $ times
- the gradient vector $ \nabla_v J $.
- \\
- {\tt <file names>} refers to the list of files {\it .f} which are to be
- analyzed by TAMC. This list is generally smaller than the full list
- of code to be compiled. The files not contained are either
- above the top-level routine (some initialisations), or are
- deliberately hidden from TAMC, either because hand-written
- adjoint routines exist, or the routines must not (or don't have to)
- be differentiated. For each routine which is part of the flow tree
- of the top-level routine, but deliberately hidden from TAMC,
- a corresponding file {\it .flow} exists containing flow directives
- for TAMC.
  %
  \end{itemize}
+ %
+ The syntax for the flow directives can be found in the
+ AD tool manuals.
+ {\tt genmake2} generates a list (or variable) {\bf AD\_FLOW\_FILES}
+ which contains all files with suffix {\tt .flow} that it finds
+ in its search directories.
+ The flow directives for the core MITgcm routines of
+ {\tt eesupp/src/} and {\tt model/src/}
+ reside in {\tt pkg/autodiff/}.
+ This directory also contains hand-written adjoint code
+ for the MITgcm WRAPPER (section \ref{chap:sarch}).
+ Flow directives for package-specific routines are contained in
+ the corresponding package directories in the file
+ {\tt <PKG>\_ad.flow}, e.g. ptracers-specific directives are in
+ {\tt ptracers\_ad.flow}.
+ \subsubsection{Store directives for 3-level checkpointing}
+ The storing that is required at each period of the
+ 3-level checkpointing is controlled by three
+ top-level headers.
- \subsubsection{File {\it data}}
+ \begin{verbatim}
+ do ilev_3 = 1, nchklev_3
- \subsubsection{File {\it data.cost}}
+ #  include "checkpoint_lev3.h"
+    do ilev_2 = 1, nchklev_2
- \subsubsection{File {\it data.ctrl}}
+ #     include "checkpoint_lev2.h"
+       do ilev_1 = 1, nchklev_1
- \subsubsection{File {\it data.pkg}}
+ #        include "checkpoint_lev1.h"
- \subsubsection{File {\it eedata}}
+ ...
- \subsubsection{File {\it topog.bin}}
+       end do
+    end do
- \subsubsection{File {\it windx.bin, windy.bin}}
+ end do
+ \end{verbatim}
- \subsubsection{File {\it salt.bin, theta.bin}}
- \subsubsection{File {\it SSS.bin, SST.bin}}
+ All files {\tt checkpoint\_lev?.h} are contained in directory
+ {\tt pkg/autodiff/}.
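+ %
+ The nesting of the three checkpointing loops can be made concrete with a
+ small Python sketch. It assumes that the product of the three checkpoint
+ lengths must cover the number of time steps, and that the innermost loop
+ advances one model time step per iteration. The function name
+ {\tt checkpoint\_schedule} and the flattened index {\tt iloop} are
+ illustrative choices, not names taken from the model code.

```python
def checkpoint_schedule(nchklev_1, nchklev_2, nchklev_3, n_time_steps):
    """List (ilev_3, ilev_2, ilev_1, iloop) for every executed time step.

    iloop is the flattened time step counter (1-based) reached inside
    the innermost loop; iterations beyond n_time_steps are skipped.
    """
    if nchklev_1 * nchklev_2 * nchklev_3 < n_time_steps:
        raise ValueError("checkpoint levels do not span the integration")
    steps = []
    for ilev_3 in range(1, nchklev_3 + 1):          # outermost level
        for ilev_2 in range(1, nchklev_2 + 1):      # intermediate level
            for ilev_1 in range(1, nchklev_1 + 1):  # innermost level
                iloop = ((ilev_3 - 1) * nchklev_2
                         + (ilev_2 - 1)) * nchklev_1 + ilev_1
                if iloop <= n_time_steps:
                    steps.append((ilev_3, ilev_2, ilev_1, iloop))
    return steps
```

+ For example, checkpoint lengths of 2, 3 and 2 provide room for 12 time
+ steps, so a 10-step integration fits, whereas lengths of 2, 2 and 2
+ would be rejected for 9 steps.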
- \subsubsection{File {\it pickup*}}
- \subsection{Compiling the model and its adjoint}
+ \subsubsection{Changing the default AD tool flags: ad\_options files}
- \newpage
- %**********************************************************************
+ \subsubsection{Hand-written adjoint code}
- \section{TLM and ADM code generation in general}
- \label{sec_ad_setup_gen}
- %**********************************************************************
- In this section we describe in a general fashion
+ %------------------------------------------------------------------
- the parts of the code that are relevant for automatic
- differentiation using the software tool TAMC.
- \subsection{The cost function (dependent variable)}
+ \subsection{The cost function (dependent variable)
+ \label{section_cost}}
  The cost function $ {\cal J} $ is referred to as the {\sf dependent variable}.
  It is a function of the input variables $ \vec{u} $ via the composition
  $ {\cal J}(\vec{u}) \, = \, {\cal J}(M(\vec{u})) $.
- The input is referred to as the
+ The inputs are referred to as the
  {\sf independent variables} or {\sf control variables}.
  All aspects relevant to the treatment of the cost function $ {\cal J} $
- (parameter setting, initialisation, incrementation,
+ (parameter setting, initialization, accumulation,
- final evaluation), are controled by the package {\it pkg/cost}.
+ final evaluation), are controlled by the package {\it pkg/cost}.
+ The aspects relevant to the treatment of the independent variables
+ are controlled by the package {\it pkg/ctrl} and will be treated
+ in the next section.
+ \input{s_autodiff/text/doc_cost_flow}
+ \subsubsection{Enabling the package}
- \subsubsection{genmake and CPP options}
- %
- \begin{itemize}
- %
- \item
  \fbox{
  \begin{minipage}{12cm}
- {\it genmake}, {\it CPP\_OPTIONS.h}, {\it ECCO\_CPPOPTIONS.h}
+ {\it packages.conf}, {\it ECCO\_CPPOPTIONS.h}
  \end{minipage}
  }
- \end{itemize}
+ \begin{itemize}
  %
- The directory {\it pkg/cost} can be included to the
+ \item
- compile list in 3 different ways (cf. Section \ref{???}):
+ The package is enabled by adding {\it cost} to your file {\it packages.conf}
+ (see Section ???)
  %
- \begin{enumerate}
+ \item
- %
- \item {\it genmake}: \\
- Change the default settngs in the file {\it genmake} by adding
+ \end{itemize}
- {\bf cost} to the {\bf enable} list (not recommended).
- %
- \item {\it .genmakerc}: \\
- Customize the settings of {\bf enable}, {\bf disable} which are
- appropriate for your experiment in the file {\it .genmakerc}
- and add the file to your compile directory.
- %
- \item genmake-options: \\
- Call {\it genmake} with the option
- {\tt genmake -enable=cost}.
  %
- \end{enumerate}
- Since the cost function is usually used in conjunction with
+ N.B.: In general the following packages ought to be enabled
- automatic differentiation, the CPP option
+ simultaneously: {\it autodiff, cost, ctrl}.
- {\bf ALLOW\_ADJOINT\_RUN} should be defined
- (file {\it CPP\_OPTIONS.h}).
  The basic CPP option to enable the cost function is {\bf ALLOW\_COST}.
  Each specific cost function contribution has its own option.
  For the present example the option is {\bf ALLOW\_COST\_TRACER}.
  All cost-specific options are set in {\it ECCO\_CPPOPTIONS.h}.
+ Since the cost function is usually used in conjunction with
+ automatic differentiation, the CPP options
+ {\bf ALLOW\_ADJOINT\_RUN} (file {\it CPP\_OPTIONS.h}) and
+ {\bf ALLOW\_AUTODIFF\_TAMC} (file {\it ECCO\_CPPOPTIONS.h})
+ should be defined.
- \subsubsection{Initialisation}
+ \subsubsection{Initialization}
  %
- The initialisation of the {\it cost} package is readily enabled
+ The initialization of the {\it cost} package is readily enabled
- as soon as the CPP option {\bf ALLOW\_ADJOINT\_RUN} is defined.
+ as soon as the CPP option {\bf ALLOW\_COST} is defined.
  %
  \begin{itemize}
  %
-Line 1152 
 Variables: {\it cost\_init}
+Line 1081 
 Variables: {\it cost\_init}
  }
  \\
  This S/R
- initialises the different cost function contributions.
+ initializes the different cost function contributions.
- The contribtion for the present example is {\bf objf\_tracer}
+ The contribution for the present example is {\bf objf\_tracer}
  which is defined on each tile (bi,bj).
  %
  \end{itemize}
  %
- \subsubsection{Incrementation}
+ \subsubsection{Accumulation}
  %
  \begin{itemize}
  %
-Line 1176 
 Within this 'driver' routine, S/R are ca
+Line 1105 
 Within this 'driver' routine, S/R are ca
  the chosen cost function contributions.
  In the present example ({\bf ALLOW\_COST\_TRACER}),
  S/R {\it cost\_tracer} is called.
- It accumulates {\bf objf\_tracer} according to eqn. (\ref{???}).
+ It accumulates {\bf objf\_tracer} according to eqn. (ref:ask-the-author).
  %
  \subsubsection{Finalize all contributions}
  %
-Line 1196 
 from each contribution and sums over all
+Line 1125 
 from each contribution and sums over all
  \begin{equation}
  {\cal J} \, = \,
  {\rm fc} \, = \,
- {\rm mult\_tracer} \sum_{bi,\,bj}^{nSx,\,nSy}
+ {\rm mult\_tracer} \sum_{\text{global sum}} \sum_{bi,\,bj}^{nSx,\,nSy}
  {\rm objf\_tracer}(bi,bj) \, + \, ...
  \end{equation}
  %
  The total cost function {\bf fc} will be the
- 'dependent' variable in the argument list for TAMC, i.e.
+ 'dependent' variable in the argument list for TAF, i.e.
  \begin{verbatim}
- tamc -output 'fc' ...
+ taf -output 'fc' ...
  \end{verbatim}
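+ %
+ The summation over tiles in the equation for {\bf fc} can be written
+ out as a minimal Python sketch. It covers a single process only, so the
+ global sum across processes is not modelled, and it shows only the
+ tracer contribution. The function name {\tt total\_cost} and the use of
+ a dictionary keyed by (bi,bj) are assumptions made for illustration.

```python
def total_cost(objf_tracer, mult_tracer=1.0):
    """Sum the per-tile contributions and apply the multiplier.

    objf_tracer maps a tile index (bi, bj) to that tile's contribution.
    Only the tracer term of fc is represented here.
    """
    return mult_tracer * sum(objf_tracer.values())


# Two tiles in x, one in y (nSx = 2, nSy = 1), with made-up values.
tiles = {(1, 1): 0.5, (2, 1): 1.5}
fc = total_cost(tiles, mult_tracer=2.0)
```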
- \begin{figure}[t!]
+ %%%% \end{document}
- \input{part5/doc_ad_the_model}
- \label{fig:adthemodel}
- \caption{~}
- \end{figure}
- \begin{figure}
+ \input{s_autodiff/text/doc_ad_the_main}
- \input{part5/doc_ad_the_main}
- \label{fig:adthemain}
- \caption{~}
- \end{figure}
- \subsection{The control variables (independent variables)}
+ \subsection{The control variables (independent variables)
+ \label{section_ctrl}}
  The control variables are a subset of the model input
  (initial conditions, boundary conditions, model parameters).
  Here we identify them with the variable $ \vec{u} $.
  All intermediate variables whose derivative w.r.t. control
- variables don't vanish are called {\sf active variables}.
+ variables do not vanish are called {\sf active variables}.
  All subroutines whose derivative w.r.t. the control variables
  do not vanish are called {\sf active routines}.
  Read and write operations from and to file can be viewed
-Line 1232 
 as variable assignments. Therefore, file
+Line 1154 
 as variable assignments. Therefore, file
  active variables are written and from which active variables
  are read are called {\sf active files}.
  All aspects relevant to the treatment of the control variables
- (parameter setting, initialisation, perturbation)
+ (parameter setting, initialization, perturbation)
- are controled by the package {\it pkg/ctrl}.
+ are controlled by the package {\it pkg/ctrl}.
+ \input{s_autodiff/text/doc_ctrl_flow}
  \subsubsection{genmake and CPP options}
  %
-Line 1249 
 are controled by the package {\it pkg/ct
+Line 1173 
 are controled by the package {\it pkg/ct
  %
  To include the package directory in the compile list,
  {\bf ctrl} has to be added to the {\bf enable} list in
- {\it .genmakerc} (or {\it genmake} itself).
+ {\it .genmakerc} or in {\it genmake} itself (analogous to {\it cost}
+ package, cf. previous section).
  Each control variable is enabled via its own CPP option
  in {\it ECCO\_CPPOPTIONS.h}.
- \subsubsection{Initialisation}
+ \subsubsection{Initialization}
  %
  \begin{itemize}
  %
-Line 1290 
 and their gradients: {\it ctrl\_unpack}
+Line 1215 
 and their gradients: {\it ctrl\_unpack}
  \\
  %
  Two important issues related to the handling of the control
- variables in the MITGCM need to be addressed.
+ variables in MITgcm need to be addressed.
  First, in order to save memory, the control variable arrays
  are not kept in memory, but rather read from file and added
- to the initial (or first guess) fields.
+ to the initial fields during the model initialization phase.
  Similarly, the corresponding adjoint fields which represent
  the gradient of the cost function w.r.t. the control variables
- are written to to file.
+ are written to file at the end of the adjoint integration.
  Second, in addition to the files holding the 2-dim. and 3-dim.
- control variables and the gradient, a 1-dim. {\sf control vector}
+ control variables and the corresponding cost gradients,
+ a 1-dim. {\sf control vector}
  and {\sf gradient vector} are written to file. They contain
  only the wet points of the control variables and the corresponding
  gradient.
  This leads to a significant data compression.
- Furthermore, the control and the gradient vector can be passed to a
+ Furthermore, an option is available
+ ({\tt ALLOW\_NONDIMENSIONAL\_CONTROL\_IO}) to
+ non-dimensionalise the control and gradient vector,
+ whose components would otherwise differ in
+ magnitude and units.
+ Finally, the control and gradient vector can be passed to a
  minimization routine if an update of the control variables
  is sought as part of a minimization exercise.
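+ %
+ The compression obtained by keeping only wet points can be illustrated
+ with a short Python sketch. It assumes a field that has already been
+ flattened to a 1-dim. list and a boolean wet mask of the same length.
+ The names {\tt pack\_wet} and {\tt unpack\_wet} are hypothetical and do
+ not reproduce the interfaces of {\it ctrl\_pack} or {\it ctrl\_unpack}.

```python
def pack_wet(field, wet_mask):
    """Keep only the values at wet points, preserving their order."""
    return [value for value, wet in zip(field, wet_mask) if wet]


def unpack_wet(vector, wet_mask, fill=0.0):
    """Scatter a packed vector back onto the full grid.

    Dry points receive `fill`; wet points are filled in order.
    """
    values = iter(vector)
    return [next(values) if wet else fill for wet in wet_mask]


# Made-up example: four grid points, two of them wet.
field = [1.0, 2.0, 3.0, 4.0]
wet_mask = [True, False, True, False]
packed = pack_wet(field, wet_mask)
restored = unpack_wet(packed, wet_mask)
```

+ In this example the packed vector holds two of the four values, and
+ unpacking restores the wet points while dry points are set to the fill
+ value.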
-Line 1314 
 and gradient are generated and initialis
+Line 1245 
 and gradient are generated and initialis
  \subsubsection{Perturbation of the independent variables}
  %
- The dependency chain for differentiation starts
+ The dependency flow for differentiation w.r.t. the controls
- with adding a perturbation onto the the input variable,
+ starts with adding a perturbation onto the input variable,
- thus defining the independent or control variables for TAMC.
+ thus defining the independent or control variables for TAF.
- Three classes of controls may be considered:
+ Three types of controls may be considered:
  %
  \begin{itemize}
  %
-Line 1332 
 Three classes of controls may be conside
+Line 1263 
 Three classes of controls may be conside
  Consider as an example the initial tracer distribution
  {\bf tr1} as control variable.
  After {\bf tr1} has been initialised in
- {\it ini\_tr1} (dynamical variables including
+ {\it ini\_tr1} (dynamical variables such as
  temperature and salinity are initialised in {\it ini\_fields}),
  a perturbation anomaly is added to the field in S/R
  {\it ctrl\_map\_ini}
  %
+ %\begin{eqnarray}
  \begin{equation}
- \begin{split}
+ \begin{aligned}
  u         & = \, u_{[0]} \, + \, \Delta u \\
  {\bf tr1}(...) & = \, {\bf tr1_{ini}}(...) \, + \, {\bf xx\_tr1}(...)
  \label{perturb}
- \end{split}
+ \end{aligned}
  \end{equation}
+ %\end{eqnarray}
  %
- In principle {\bf xx\_tr1} is a 3-dim. global array
+ {\bf xx\_tr1} is a 3-dim. global array
  holding the perturbation. In the case of a simple
  sensitivity study this array is identical to zero.
- However, it's specification is essential since TAMC
+ However, its specification is essential in the context
+ of automatic differentiation since TAF
  treats the corresponding line in the code symbolically
  when determining the differentiation chain and its origin.
  Thus, the variable names are part of the argument list
- when calling TAMC:
+ when calling TAF:
  %
  \begin{verbatim}
- tamc -input 'xx_tr1 ...' ...
+ taf -input 'xx_tr1 ...' ...
  \end{verbatim}
  %
- Now, as mentioned above, the MITGCM avoids maintaining
+ Now, as mentioned above, MITgcm avoids maintaining
  an array for each control variable by reading the
  perturbation to a temporary array from file.
- To ensure the symbolic link to be recognized by TAMC, a scalar
+ To ensure that the symbolic link is recognized by TAF, a scalar
  dummy variable {\bf xx\_tr1\_dummy} is introduced
  and an 'active read' routine of the adjoint support
  package {\it pkg/autodiff} is invoked.
  The read-procedure is tagged with the variable
- {\bf xx\_tr1\_dummy} enabbling TAMC to recognize the
+ {\bf xx\_tr1\_dummy} enabling TAF to recognize the
- initialisation of the perturbation.
+ initialization of the perturbation.
- The modified call of TAMC thus reads
+ The modified call of TAF thus reads
  %
  \begin{verbatim}
- tamc -input 'xx_tr1_dummy ...' ...
+ taf -input 'xx_tr1_dummy ...' ...
  \end{verbatim}
  %
  and the modified operation to (\ref{perturb})
-Line 1386 
 in the code takes on the form
+Line 1320 
 in the code takes on the form
  %
  Note that reading an active variable corresponds
  to a variable assignment. Its derivative corresponds
- to a write statement of the adjoint variable.
+ to a write statement of the adjoint variable, followed by
+ a reset.
  The 'active file' routines have been designed
- to support active read and corresponding active write
+ to support active read and corresponding adjoint active write
- operations.
+ operations (and vice versa).
  %
  \item
  \fbox{
-Line 1406 
 with the symbolic perturbation taking pl
+Line 1341 
 with the symbolic perturbation taking pl
  Note however an important difference:
  Since the boundary values are time dependent with a new
  forcing field applied at each time step,
- the general problem may be be thought of as
+ the general problem may be thought of as
- a new control variable at each time step, i.e.
+ a new control variable at each time step
+ (or, if the perturbation is averaged over a certain period,
+ every $ N $ time steps), i.e.
  \[
  u_{\rm forcing} \, = \,
  \{ \, u_{\rm forcing} ( t_n ) \, \}_{
-Line 1432 
 calendar ({\it cal}~) and external forci
+Line 1369 
 calendar ({\it cal}~) and external forci
  %
  This routine is not yet implemented, but would
  proceed along the same lines as the initial value sensitivity.
+ The mixing parameters {\bf diffkr} and {\bf kapgm}
+ are currently added as controls in {\it ctrl\_map\_ini.F}.
  %
  \end{itemize}
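  %
  As a schematic illustration of the active read described above
  for the initial tracer perturbation (a sketch only, not the code of
  {\it pkg/autodiff}; the symbol $ \delta u $ for the temporary array
  and the file name {\bf adxx\_tr1}, which merely follows the
  {\bf adxx\_...} naming pattern, are assumptions made here),
  the forward statements
  \[
  \begin{array}{l}
  \delta u \, = \, {\rm read} \left( \mbox{\bf xx\_tr1} \right) \\
  tr1 \, = \, tr1 \, + \, \delta u
  \end{array}
  \]
  have, in reverse order, the adjoint counterparts
  \[
  \begin{array}{l}
  ad \, \delta u \, = \, ad \, \delta u \, + \, ad \, tr1 \\
  {\rm write} \left( \mbox{\bf adxx\_tr1} , \, ad \, \delta u \right) \\
  ad \, \delta u \, = \, 0
  \end{array}
  \]
  The read acts as an assignment to $ \delta u $, so its adjoint
  writes $ ad \, \delta u $ to file and then resets it to zero.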
  %
  \subsubsection{Output of adjoint variables and gradient}
  %
- Two ways exist to generate output of adjoint fields.
+ Several ways exist to generate output of adjoint fields.
  %
  \begin{itemize}
  %
  \item
  \fbox{
  \begin{minipage}{12cm}
- {\it ctrl\_pack}:
+ {\it ctrl\_map\_ini, ctrl\_map\_forcing}:
  \end{minipage}
  }
  \\
- At the end of the forward/adjoint integration, the S/R
- {\it ctrl\_pack} is called which mirrors S/R {\it ctrl\_unpack}.
- It writes the following files:
- %
  \begin{itemize}
  %
- \item {\bf xx\_...}: the control variable fields
+ \item {\bf xx\_...}: the control variable fields \\
+ Before the forward integration, the control
+ variables are read from file {\bf xx\_ ...} and added to
+ the model field.
  %
  \item {\bf adxx\_...}: the adjoint variable fields, i.e. the gradient
- $ \nabla _{u}{\cal J} $ for each control variable,
+ $ \nabla _{u}{\cal J} $ for each control variable \\
+ After the adjoint integration the corresponding adjoint
+ variables are written to {\bf adxx\_ ...}.
  %
- \item {\bf vector\_ctrl}: the control vector
+ \end{itemize}
  %
- \item {\bf vector\_grad}: the gradient vector
+ \item
+ \fbox{
+ \begin{minipage}{12cm}
+ {\it ctrl\_unpack, ctrl\_pack}:
+ \end{minipage}
+ }
+ \\
+ %
+ \begin{itemize}
+ %
+ \item {\bf vector\_ctrl}: the control vector \\
+ At the very beginning of the model initialization,
+ the updated compressed control vector is read (or initialized)
+ and distributed to 2-dim. and 3-dim. control variable fields.
+ %
+ \item {\bf vector\_grad}: the gradient vector \\
+ At the very end of the adjoint integration,
+ the 2-dim. and 3-dim. adjoint variables are read,
+ compressed to a single vector and written to file.
  %
  \end{itemize}
  %
-Line 1474 
 $ \nabla _{u}{\cal J} $ for each control
+Line 1432 
 $ \nabla _{u}{\cal J} $ for each control
  }
  \\
  In addition to writing the gradient at the end of the
- forward/adjoint integration, many more adjoint variables,
+ forward/adjoint integration, many more adjoint variables
- representing the Lagrange multipliers of the model state
+ of the model state
- w.r.t. the model state
+ at intermediate times can be written using S/R
- at different times can be written using S/R
  {\it addummy\_in\_stepping}.
  This routine is part of the adjoint support package
  {\it pkg/autodiff} (cf. below).
+ The procedure is enabled via the CPP-option
+ {\bf ALLOW\_AUTODIFF\_MONITOR} (file {\it ECCO\_CPPOPTIONS.h}).
  To be part of the adjoint code, the corresponding S/R
  {\it dummy\_in\_stepping} has to be called in the forward
  model (S/R {\it the\_main\_loop}) at the appropriate place.
+ The adjoint common blocks are extracted from the adjoint code
+ via the header file {\it adcommon.h}.
  {\it dummy\_in\_stepping} is essentially empty,
  the corresponding adjoint routine is hand-written rather
-Line 1491 
 than generated automatically.
+Line 1452 
 than generated automatically.
  Appropriate flow directives ({\it dummy\_in\_stepping.flow})
  ensure that TAMC does not automatically
  generate {\it addummy\_in\_stepping} by trying to differentiate
- {\it dummy\_in\_stepping}, but rather takes the hand-written routine.
+ {\it dummy\_in\_stepping}, but instead refers to
+ the hand-written routine.
  {\it dummy\_in\_stepping} is called in the forward code
  at the beginning of each
-Line 1501 
 each timestep in the adjoint calculation
+Line 1463 
 each timestep in the adjoint calculation
  {\it addynamics}.
  {\it addummy\_in\_stepping} includes the header files
- {\it adffields.h, addynamics.h, adtr1.h}.
+ {\it adcommon.h}.
- These header files are also hand-written. They contain
+ This header file is also hand-written. It contains
- the common blocks {\bf /addynvars\_r/}, {\bf /addynvars\_cd/},
+ the common blocks
+ {\bf /addynvars\_r/}, {\bf /addynvars\_cd/},
+ {\bf /addynvars\_diffkr/}, {\bf /addynvars\_kapgm/},
  {\bf /adtr1\_r/}, {\bf /adffields/},
  which have been extracted from the adjoint code to enable
  access to the adjoint variables.
+ {\bf WARNING:} If the structure of the common blocks
+ {\bf /dynvars\_r/}, {\bf /dynvars\_cd/}, etc., changes,
+ similar changes will occur in the adjoint common blocks.
+ Therefore, consistency between the TAMC-generated common blocks
+ and those in {\it adcommon.h} has to be checked.
  %
  \end{itemize}
-Line 1521 
 The gradient $ \nabla _{u}{\cal J} |_{u_
+Line 1491 
 The gradient $ \nabla _{u}{\cal J} |_{u_
  with the value of the cost function itself $ {\cal J}(u_{[k]}) $
  at iteration step $ k $ serve
  as input to a minimization routine (e.g. quasi-Newton method,
- conjugate gradient, ...) to compute an update in the
+ conjugate gradient, ... \cite{gil-lem:89})
+ to compute an update in the
  control variable for iteration step $k+1$
  \[
  u_{[k+1]} \, = \,  u_{[0]} \, + \, \Delta u_{[k+1]}
-Line 1531 
 u_{[k+1]} \, = \,  u_{[0]} \, + \, \Delt
+Line 1502 
 u_{[k+1]} \, = \,  u_{[0]} \, + \, \Delt
  $ u_{[k+1]} $ then serves as input for a forward/adjoint run
  to determine $ {\cal J} $ and $ \nabla _{u}{\cal J} $ at iteration step
  $ k+1 $.
- Tab. \ref{???} sketches the flow between forward/adjoint model
+ Tab. ref:ask-the-author sketches the flow between forward/adjoint model
  and the minimization routine.
+ {\scriptsize
  \begin{eqnarray*}
- \footnotesize
  \begin{array}{ccccc}
  u_{[0]} \,\, ,  \,\, \Delta u_{[k]}    & ~ & ~ & ~ & ~ \\
  {\Big\downarrow}
-Line 1552 
 v_{[k]} = M \left( u_{[k]} \right) &
+Line 1523 
 v_{[k]} = M \left( u_{[k]} \right) &
  {\cal J}_{[k]} = {\cal J} \left( M \left( u_{[k]} \right) \right)} \\
  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
  \hline
+ \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~}  \\
+ \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{{\Big\downarrow}} \\
+ \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~}  \\
  \hline
  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
  \multicolumn{1}{|c}{
  \nabla_u {\cal J}_{[k]} (\delta {\cal J}) =
- T\!\!^{\ast} \cdot \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J})} &
+ T^{\ast} \cdot \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J})} &
  \stackrel{\bf adjoint}{\mathbf \longleftarrow} &
  ad \, v_{[k]} (\delta {\cal J}) =
  \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J}) &
-Line 1565 
 ad \, v_{[k]} (\delta {\cal J}) =
+Line 1539 
 ad \, v_{[k]} (\delta {\cal J}) =
  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
  \hline
   ~ & ~ & ~ & ~ & ~ \\
- ~ & ~ &
+ \hspace*{15ex}{\Bigg\downarrow}
- {\cal J}_{[k]} \qquad {\Bigg\downarrow}  \qquad \nabla_u {\cal J}_{[k]}
+ \quad {\cal J}_{[k]}, \quad \nabla_u {\cal J}_{[k]}
-  & ~ & ~ \\
+  & ~ & ~ & ~ & ~ \\
   ~ & ~ & ~ & ~ & ~ \\
  \hline
  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
-Line 1583 
 ad \, v_{[k]} (\delta {\cal J}) =
+Line 1557 
 ad \, v_{[k]} (\delta {\cal J}) =
   ~ & ~ & ~ & ~ & \Delta u_{[k+1]} \\
  \end{array}
  \end{eqnarray*}
+ }
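  As one illustrative instance of the update sketched above
  (a plain steepest descent step, chosen here only for simplicity;
  the step length $ \alpha_{[k]} > 0 $ is a hypothetical parameter,
  and the minimization routine actually used may differ),
  write $ u_{[k]} = u_{[0]} + \Delta u_{[k]} $ with
  $ \Delta u_{[0]} = 0 $ and set
  \[
  \Delta u_{[k+1]} \, = \, \Delta u_{[k]} \, - \,
  \alpha_{[k]} \, \nabla _{u}{\cal J} |_{u_{[k]}}
  \]
  so that
  \[
  u_{[k+1]} \, = \, u_{[0]} \, + \, \Delta u_{[k+1]}
  \, = \, u_{[k]} \, - \, \alpha_{[k]} \, \nabla _{u}{\cal J} |_{u_{[k]}}
  \]
  For the scalar toy problem $ {\cal J}(u) = \frac{1}{2} (u-1)^2 $
  with $ u_{[0]} = 0 $ and $ \alpha_{[k]} = \frac{1}{2} $
  (again purely illustrative), the gradient is $ u - 1 $, which gives
  $ \Delta u_{[1]} = \frac{1}{2} $, $ u_{[1]} = \frac{1}{2} $ and
  $ \Delta u_{[2]} = \frac{3}{4} $, $ u_{[2]} = \frac{3}{4} $.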
  The routines {\it ctrl\_unpack} and {\it ctrl\_pack} provide
  the link between the model and the minimization routine.
- As described in Section \ref{???}
+ As described in Section ref:ask-the-author
  the {\it unpack} and {\it pack} routines read and write
  control and gradient {\it vectors} which are compressed
  to contain only wet points, in addition to the full
-Line 1595 
 The corresponding I/O flow looks as foll
+Line 1570 
 The corresponding I/O flow looks as foll
  \vspace*{0.5cm}
+ {\scriptsize
  \begin{tabular}{ccccc}
  {\bf vector\_ctrl\_$<$k$>$ } & ~ & ~ & ~ & ~ \\
  {\big\downarrow}  & ~ & ~ & ~ & ~ \\
-Line 1605 
 The corresponding I/O flow looks as foll
+Line 1581 
 The corresponding I/O flow looks as foll
  \cline{3-3}
  \multicolumn{1}{l}{\bf xx\_theta0...$<$k$>$} & ~ &
  \multicolumn{1}{|c|}{~} & ~ & ~ \\
- \multicolumn{1}{l}{\bf xx\_salt0...$<$k$>$} & $\longrightarrow$ &
+ \multicolumn{1}{l}{\bf xx\_salt0...$<$k$>$} &
+ $\stackrel{\mbox{read}}{\longrightarrow}$ &
  \multicolumn{1}{|c|}{forward integration} & ~ & ~ \\
  \multicolumn{1}{l}{\bf \vdots} & ~ & \multicolumn{1}{|c|}{~}
  & ~ & ~ \\
  \cline{3-3}
- ~ & ~ & ~ & ~ & ~ \\
+ ~ & ~ & $\downarrow$ & ~ & ~ \\
  \cline{3-3}
  ~ & ~ &
  \multicolumn{1}{|c|}{~} & ~ &
  \multicolumn{1}{l}{\bf adxx\_theta0...$<$k$>$}  \\
  ~ & ~ & \multicolumn{1}{|c|}{adjoint integration} &
- $\longrightarrow$ &
+ $\stackrel{\mbox{write}}{\longrightarrow}$ &
  \multicolumn{1}{l}{\bf adxx\_salt0...$<$k$>$} \\
  ~ & ~ & \multicolumn{1}{|c|}{~}
  & ~ & \multicolumn{1}{l}{\bf \vdots} \\
-Line 1628 
 $\longrightarrow$ &
+Line 1605 
 $\longrightarrow$ &
  ~ & ~ & ~ & ~ &  {\big\downarrow} \\
  ~ & ~ & ~ & ~ &  {\bf vector\_grad\_$<$k$>$ } \\
  \end{tabular}
+ }
  \vspace*{0.5cm}
- {\it ctrl\_unpack} reads in the updated control vector
+ {\it ctrl\_unpack} reads the updated control vector
  {\bf vector\_ctrl\_$<$k$>$}.
  It distributes the different control variables to
  2-dim. and 3-dim. files {\it xx\_...$<$k$>$}.
- During the forward integration the control variables
+ At the start of the forward integration the control variables
- are read from {\it xx\_...$<$k$>$}.
+ are read from {\it xx\_...$<$k$>$} and added to the
- Correspondingly, the adjoint fields are written
+ model field.
+ Correspondingly, at the end of the adjoint integration
+ the adjoint fields are written
  to {\it adxx\_...$<$k$>$}, again via the active file routines.
- Finally, {\it ctrl\_pack} collects all adjoint field files
+ Finally, {\it ctrl\_pack} collects all adjoint files
  and writes them to the compressed vector file
  {\bf vector\_grad\_$<$k$>$}.
- \subsection{TLM and ADM generation via TAMC}
- \subsection{Flow directives and adjoint support routines}
- \subsection{Store directives and checkpointing}
- \subsection{Gradient checks}
- \subsection{Second derivative generation via TAMC}
- \section{Example of adjoint code}
