Diff of /manual/s_autodiff/text/doc_ad_2.tex

-revision 1.20 by edhill,
Wed Apr  5 02:27:33 2006 UTC
+revision 1.23 by jmc,
Mon Aug 30 23:09:19 2010 UTC
 Line 2
  % $Name$
  Author: Patrick Heimbach
+ \label{ask_the_author:doc_ad_2}
  {\sf Automatic differentiation} (AD), also referred to as algorithmic
  (or, more loosely, computational) differentiation, involves
-Line 10 
 existing fully non-linear prognostic cod
+Line 11 
 existing fully non-linear prognostic cod
  software tool is used that parses and transforms source files
  according to a set of linguistic and mathematical rules.  AD tools are
  like source-to-source translators in that they parse a program code as
- input and produce a new program code as output.  However, unlike a
+ input and produce a new program code as output
+ (we restrict our discussion to source-to-source tools, ignoring
+ operator-overloading tools).  However, unlike a
  pure source-to-source translation, the output program represents a new
  algorithm, such as the evaluation of the Jacobian, the Hessian, or
  higher derivative operators.  In principle, a variety of derived
-Line 21 
 Model Compiler (TAMC) and its successor
+Line 24 
 Model Compiler (TAMC) and its successor
  Algorithms in Fortran), developed by Ralf Giering (\cite{gie-kam:98},
  \cite{gie:99,gie:00}).  The first application of the adjoint of MITgcm
  for sensitivity studies has been published by \cite{maro-eta:99}.
- \cite{sta-eta:97,sta-eta:01} use MITgcm and its adjoint for ocean
+ \cite{stam-etal:97,stam-etal:02} use MITgcm and its adjoint for ocean
  state estimation studies.  In the following we shall refer to TAMC and
  TAF synonymously, except where explicitly stated otherwise.
- TAMC exploits the chain rule for computing the first derivative of a
+ As of mid-2007 we are also able to generate fairly efficient
+ adjoint code of the MITgcm using a new, open-source AD tool,
+ called OpenAD (see \cite{naum-etal:06,utke-etal:08}).
+ This enables us for the first time to compare adjoint models
+ generated from different AD tools, providing an additional
+ accuracy check, complementary to finite-difference gradient checks.
+ OpenAD and its application to MITgcm are described in detail
+ in section \ref{sec_ad_openad}.
+ The AD tool exploits the chain rule for computing the first derivative of a
  function with respect to a set of input variables.  Treating a given
  forward code as a composition of operations (each line representing
  a compositional element), the chain rule is rigorously applied to the
-Line 54 
 model output variable $\vec{v}=(v_1,\ldo
+Line 66 
 model output variable $\vec{v}=(v_1,\ldo
  under consideration,
  %
  \begin{equation}
- \begin{split}
+ \begin{aligned}
  {\cal M} \, : & \, U \,\, \longrightarrow \, V \\
  ~      & \, \vec{u} \,\, \longmapsto \, \vec{v} \, = \,
  {\cal M}(\vec{u})
  \label{fulloperator}
- \end{split}
+ \end{aligned}
  \end{equation}
  %
  The vectors $ \vec{u} \in U $ and $ \vec{v} \in V $ may be represented w.r.t.
-Line 139 
 w.r.t. their corresponding inner product
+Line 151 
 w.r.t. their corresponding inner product
  $\left\langle \,\, , \,\, \right\rangle $
  %
  \begin{equation}
- \begin{split}
+ \begin{aligned}
  {\cal J} & = \,
  {\cal J} |_{\vec{u}^{(0)}} \, + \,
  \left\langle \, \nabla _{u}{\cal J}^T |_{\vec{u}^{(0)}} \, , \, \delta \vec{u} \, \right\rangle
-Line 148 
 $\left\langle \,\, , \,\, \right\rangle
+Line 160 
 $\left\langle \,\, , \,\, \right\rangle
  {\cal J} |_{\vec{v}^{(0)}} \, + \,
  \left\langle \, \nabla _{v}{\cal J}^T |_{\vec{v}^{(0)}} \, , \, \delta \vec{v} \, \right\rangle
  \, + \, O(\delta \vec{v}^2)
- \end{split}
+ \end{aligned}
  \label{deljidentity}
  \end{equation}
  %
-Line 189 
 the gradient $ \nabla _{u}{\cal J} $ can
+Line 201 
 the gradient $ \nabla _{u}{\cal J} $ can
  invoking the adjoint $ M^{\ast } $ of the tangent linear model $ M $
  %
  \begin{equation}
- \begin{split}
+ \begin{aligned}
  \nabla _{u}{\cal J}^T |_{\vec{u}} &
  = \, M^T |_{\vec{u}} \cdot \nabla _{v}{\cal J}^T |_{\vec{v}}  \\
  ~ & = \, M^T |_{\vec{u}} \cdot \delta \vec{v}^{\ast} \\
  ~ & = \, \delta \vec{u}^{\ast}
- \end{split}
+ \end{aligned}
  \label{adjoint}
  \end{equation}
  %
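  %
  % Illustrative example (not part of the original text); the matrix and
  % vectors below are hypothetical and chosen only for easy arithmetic.
  As a small sanity check of eq. (\ref{adjoint}), consider a hypothetical
  tangent linear model with two inputs and two outputs (the numbers are
  assumptions made purely for illustration),
  %
  \begin{equation*}
  M \, = \, \left( \begin{array}{cc} 1 & 2 \\ 0 & 3 \end{array} \right) ,
  \qquad
  \delta \vec{v}^{\ast} \, = \, \left( \begin{array}{c} 1 \\ 1 \end{array} \right)
  \quad \Longrightarrow \quad
  \delta \vec{u}^{\ast} \, = \, M^T \cdot \delta \vec{v}^{\ast}
  \, = \, \left( \begin{array}{c} 1 \\ 5 \end{array} \right) .
  \end{equation*}
  %
  The same two numbers are recovered from the tangent linear model, one
  perturbation direction at a time:
  $ \left\langle \, \delta \vec{v}^{\ast} \, , \, M \cdot (1,0)^T \, \right\rangle = 1 $
  and
  $ \left\langle \, \delta \vec{v}^{\ast} \, , \, M \cdot (0,1)^T \, \right\rangle = 5 $.
  A single application of $ M^T $ thus yields what would otherwise
  require one application of $ M $ per input component.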
-Line 242 
 $ \langle \, \nabla _{v}{\cal J}^T \, ,
+Line 254 
 $ \langle \, \nabla _{v}{\cal J}^T \, ,
  = \nabla_v {\cal J} \cdot \delta \vec{v} $ )
  %
  \begin{equation}
- \begin{split}
+ \begin{aligned}
  \nabla_v {\cal J} (M(\delta \vec{u})) & = \,
  \nabla_v {\cal J} \cdot M_{\Lambda}
  \cdot ...... \cdot M_{\lambda} \cdot ...... \cdot
  M_{1} \cdot M_{0} \cdot \delta \vec{u} \\
  ~ & = \, \nabla_v {\cal J} \cdot \delta \vec{v} \\
- \end{split}
+ \end{aligned}
  \label{forward}
  \end{equation}
  %
-Line 256 
 whereas in reverse mode we have
+Line 268 
 whereas in reverse mode we have
  %
  \begin{equation}
  \boxed{
- \begin{split}
+ \begin{aligned}
  M^T ( \nabla_v {\cal J}^T) & = \,
  M_{0}^T \cdot M_{1}^T
  \cdot ...... \cdot M_{\lambda}^T \cdot ...... \cdot
-Line 265 
 M_{\Lambda}^T \cdot \nabla_v {\cal J}^T
+Line 277 
 M_{\Lambda}^T \cdot \nabla_v {\cal J}^T
  \cdot ...... \cdot
  \nabla_{v^{(\lambda)}} {\cal J}^T \\
  ~ & = \, \nabla_u {\cal J}^T
- \end{split}
+ \end{aligned}
  }
  \label{reverse}
  \end{equation}
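  %
  % Illustrative sketch (not part of the original text); the two-step
  % model, the input dimension m and the unit vectors e_i are assumptions.
  To make the contrast between eq. (\ref{forward}) and (\ref{reverse})
  concrete, consider as a minimal sketch a hypothetical two-step model
  with $ m $ input components, a scalar cost function, and unit
  perturbation vectors $ \vec{e}_i $ (all introduced here only for
  illustration).  The full gradient can then be assembled in either of
  two ways:
  %
  \begin{equation*}
  \begin{aligned}
  \left( \nabla_u {\cal J} \right)_i & = \,
  \nabla_v {\cal J} \cdot M_{1} \cdot M_{0} \cdot \vec{e}_i \, ,
  \quad i = 1, \ldots , m
  & \mbox{(forward: $m$ evaluations)} \\
  \nabla_u {\cal J}^T & = \,
  M_{0}^T \cdot M_{1}^T \cdot \nabla_v {\cal J}^T
  & \mbox{(reverse: one evaluation)}
  \end{aligned}
  \end{equation*}
  %
  In forward mode each evaluation propagates a single perturbation
  direction and returns one gradient component, whereas in reverse
  mode a single evaluation returns all $ m $ components at once.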
-Line 284 
 $ \vec{v}^{(\lambda)} $ at each intermed
+Line 296 
 $ \vec{v}^{(\lambda)} $ at each intermed
  %
  \begin{equation}
  \boxed{
- \begin{split}
+ \begin{aligned}
  \nabla_{v^{(\lambda)}} {\cal J}^T |_{\vec{v}^{(\lambda)}}
  & = \,
  M_{\lambda}^T |_{\vec{v}^{(\lambda)}} \cdot ...... \cdot
  M_{\Lambda}^T |_{\vec{v}^{(\lambda)}} \cdot \delta \vec{v}^{\ast} \\
  ~ & = \, \delta \vec{v}^{(\lambda) \, \ast}
- \end{split}
+ \end{aligned}
  }
  \end{equation}
  %
-Line 407 
 and the shorthand notation for the adjoi
+Line 419 
 and the shorthand notation for the adjoi
  $ \delta v^{(\lambda) \, \ast}_{j} = \frac{\partial}{\partial v^{(\lambda)}_{j}}
  {\cal J}^T $, $ j = 1, \ldots , n_{\lambda} $,
  for intermediate components, yielding
+ {\small
  \begin{equation}
- \small
+ \begin{aligned}
- \begin{split}
  \left(
  \begin{array}{c}
  \delta v^{(\lambda) \, \ast}_1 \\
-Line 454 
 for intermediate components, yielding
+Line 466 
 for intermediate components, yielding
  \delta v^{\ast}_{n} \\
  \end{array}
  \right)
- \end{split}
+ \end{aligned}
  \end{equation}
+ }
  Eq. (\ref{forward}) and (\ref{reverse}) are perhaps clearest in
  showing the advantage of the reverse over the forward mode
-Line 526 
 operator which maps the model state spac
+Line 539 
 operator which maps the model state spac
  Then, $ \nabla_v {\cal J} $ takes the form
  %
  \begin{equation*}
- \begin{split}
+ \begin{aligned}
  \nabla_v {\cal J}^T & = \, 2 \, \, H \cdot
  \left( \, {\cal H}(\vec{v}) - \vec{d} \, \right) \\
  ~          & = \, 2 \sum_{j} \left\{ \sum_k
  \frac{\partial {\cal H}_k}{\partial v_{j}}
  \left( {\cal H}_k (\vec{v}) - d_k \right)
  \right\} \, {\vec{f}_{j}} \\
- \end{split}
+ \end{aligned}
  \end{equation*}
  %
  where $H_{kj} = \partial {\cal H}_k / \partial v_{j} $ is the
-Line 652 
 $ n^{lev3}, \,\, n^{lev2}, \,\, n^{lev1}
+Line 665 
 $ n^{lev3}, \,\, n^{lev2}, \,\, n^{lev1}
  %\psfrag{v_kn^lev2}{\mathinfigure{v_{k_{n}^{lev2}}}}
  %\psfrag{v_k1^lev1}{\mathinfigure{v_{k_{1}^{lev1}}}}
  %\psfrag{v_kn^lev1}{\mathinfigure{v_{k_{n}^{lev1}}}}
- %\mbox{\epsfig{file=part5/checkpointing.eps, width=0.8\textwidth}}
+ %\mbox{\epsfig{file=s_autodiff/figs/checkpointing.eps, width=0.8\textwidth}}
- \resizebox{5.5in}{!}{\includegraphics{part5/checkpointing.eps}}
+ \resizebox{5.5in}{!}{\includegraphics{s_autodiff/figs/checkpointing.eps}}
  %\psfull
  \end{center}
  \caption{
-Line 681 
 Schematic view of intermediate dump and
+Line 694 
 Schematic view of intermediate dump and
  In this section we describe in a general fashion
  the parts of the code that are relevant for automatic
  differentiation using the software tool TAF.
+ Modifications to use OpenAD are described in section \ref{sec_ad_openad}.
- \input{part5/doc_ad_the_model}
+ \input{s_autodiff/text/doc_ad_the_model}
  The basic flow is depicted in Fig. \ref{fig:adthemodel}.
  If CPP option \texttt{ALLOW\_AUTODIFF\_TAMC} is defined,
-Line 718 
 If gradient checks are to be performed,
+Line 732 
 If gradient checks are to be performed,
  {\tt ALLOW\_GRADIENT\_CHECK} is defined. In this case
  the driver routine {\it grdchk\_main} is called after
  the gradient has been computed via the adjoint
- (cf. Section \ref{section_grdchk}).
+ (cf. Section \ref{sec:ad_gradient_check}).
  %------------------------------------------------------------------
-Line 728 
 the gradient has been computed via the a
+Line 742 
 the gradient has been computed via the a
  In order to configure AD-related setups the following packages need
  to be enabled:
  {\it
- \begin{table}[h!]
+ \begin{table}[!ht]
  \begin{tabular}{l}
  autodiff \\
  ctrl \\
-Line 764 
 w.r.t. the 3-level checkpointing (see se
+Line 778 
 w.r.t. the 3-level checkpointing (see se
  %------------------------------------------------------------------
- \subsection{Building the AD code
+ \subsection{Building the AD code using TAF
  \label{section_ad_build}}
  The build process of an AD code is very similar to building
-Line 772 
 the forward model. However, depending on
+Line 786 
 the forward model. However, depending on
  to generate, and on which AD tool is available (TAF or TAMC),
  the following {\tt make} targets are available:
- \begin{table}[h!]
+ \begin{table}[!ht]
  {\footnotesize
- \begin{tabular}{ccll}
+ \begin{tabular}{|ccll|}
+ \hline
  ~ & {\it AD-target} & {\it output} & {\it description} \\
  \hline
  \hline
-Line 793 
 generates code for $<$MODE$>$ using $<$T
+Line 808 
 generates code for $<$MODE$>$ using $<$T
  ~ & ~ & ~ & and compiles all code \\
  ~ & ~ & ~ & (use of TAF is set as default) \\
  \hline
- \hline
  \end{tabular}
  }
  \end{table}
-Line 802 
 Here, the following placeholders are use
+Line 816 
 Here, the following placeholders are use
  %
  \begin{itemize}
  %
- \item [$<$TOOL$>$]
+ \item $<$TOOL$>$
  %
  \begin{itemize}
  %
-Line 811 
 Here, the following placeholders are use
+Line 825 
 Here, the following placeholders are use
  %
  \end{itemize}
  %
- \item [$<$MODE$>$]
+ \item $<$MODE$>$
  %
  \begin{itemize}
  %
-Line 853 
 The {\tt make <MODE>all} target consists
+Line 867 
 The {\tt make <MODE>all} target consists
  \item
  A header file {\tt AD\_CONFIG.h} is generated which contains a CPP option
  on which code ought to be generated. Depending on the {\tt make} target,
- the contents is
+ the content is one of the following:
  \begin{itemize}
  \item
  {\tt \#define ALLOW\_ADJOINT\_RUN}
-Line 869 
 consisting of all {\tt .f} files that ar
+Line 883 
 consisting of all {\tt .f} files that ar
  and all {\tt .flow} files that are part of the list {\bf AD\_FLOW\_FILES}.
  %
  \item
- The AD tool is invoked with the {\bf <MODE>\_<TOOL>\_FLAGS}.
+ The AD tool is invoked with the {\tt <MODE>\_<TOOL>\_FLAGS}.
  The default AD tool flags in {\tt genmake2} can be overwritten by
  an {\tt adjoint\_options} file (similar to the platform-specific
  {\tt build\_options}, see Section ???).
-Line 1002 
 The aspects relevant to the treatment of
+Line 1016 
 The aspects relevant to the treatment of
  are controlled by the package {\it pkg/ctrl} and will be treated
  in the next section.
- \input{part5/doc_cost_flow}
+ \input{s_autodiff/text/doc_cost_flow}
  \subsubsection{Enabling the package}
-Line 1092 
 Within this 'driver' routine, S/R are ca
+Line 1106 
 Within this 'driver' routine, S/R are ca
  the chosen cost function contributions.
  In the present example ({\bf ALLOW\_COST\_TRACER}),
  S/R {\it cost\_tracer} is called.
- It accumulates {\bf objf\_tracer} according to eqn. (\ref{???}).
+ It accumulates {\bf objf\_tracer} according to eqn. (\ref{ask_the_author:doc_ad_2}).
  %
  \subsubsection{Finalize all contributions}
  %
-Line 1117 
 from each contribution and sums over all
+Line 1131 
 from each contribution and sums over all
  \end{equation}
  %
  The total cost function {\bf fc} will be the
- 'dependent' variable in the argument list for TAMC, i.e.
+ 'dependent' variable in the argument list for TAF, i.e.
  \begin{verbatim}
- tamc -output 'fc' ...
+ taf -output 'fc' ...
  \end{verbatim}
  %%%% \end{document}
- \input{part5/doc_ad_the_main}
+ \input{s_autodiff/text/doc_ad_the_main}
  \subsection{The control variables (independent variables)
  \label{section_ctrl}}
-Line 1144 
 All aspects relevant to the treatment of
+Line 1158 
 All aspects relevant to the treatment of
  (parameter setting, initialization, perturbation)
  are controlled by the package {\it pkg/ctrl}.
- \input{part5/doc_ctrl_flow}
+ \input{s_autodiff/text/doc_ctrl_flow}
  \subsubsection{genmake and CPP options}
  %
-Line 1234 
 and gradient are generated and initialis
+Line 1248 
 and gradient are generated and initialis
  %
  The dependency flow for differentiation w.r.t. the controls
  starts with adding a perturbation onto the input variable,
- thus defining the independent or control variables for TAMC.
+ thus defining the independent or control variables for TAF.
  Three types of controls may be considered:
  %
  \begin{itemize}
-Line 1255 
 temperature and salinity are initialised
+Line 1269 
 temperature and salinity are initialised
  a perturbation anomaly is added to the field in S/R
  {\it ctrl\_map\_ini}
  %
+ %\begin{eqnarray}
  \begin{equation}
- \begin{split}
+ \begin{aligned}
  u         & = \, u_{[0]} \, + \, \Delta u \\
  {\bf tr1}(...) & = \, {\bf tr1_{ini}}(...) \, + \, {\bf xx\_tr1}(...)
  \label{perturb}
- \end{split}
+ \end{aligned}
  \end{equation}
+ %\end{eqnarray}
  %
  {\bf xx\_tr1} is a 3-dim. global array
  holding the perturbation. In the case of a simple
  sensitivity study this array is identical to zero.
  However, its specification is essential in the context
- of automatic differentiation since TAMC
+ of automatic differentiation since TAF
  treats the corresponding line in the code symbolically
  when determining the differentiation chain and its origin.
  Thus, the variable names are part of the argument list
- when calling TAMC:
+ when calling TAF:
  %
  \begin{verbatim}
- tamc -input 'xx_tr1 ...' ...
+ taf -input 'xx_tr1 ...' ...
  \end{verbatim}
  %
  Now, as mentioned above, MITgcm avoids maintaining
  an array for each control variable by reading the
  perturbation to a temporary array from file.
- To ensure the symbolic link to be recognized by TAMC, a scalar
+ To ensure that the symbolic link is recognized by TAF, a scalar
  dummy variable {\bf xx\_tr1\_dummy} is introduced
  and an 'active read' routine of the adjoint support
  package {\it pkg/autodiff} is invoked.
  The read-procedure is tagged with the variable
- {\bf xx\_tr1\_dummy} enabling TAMC to recognize the
+ {\bf xx\_tr1\_dummy} enabling TAF to recognize the
  initialization of the perturbation.
- The modified call of TAMC thus reads
+ The modified call of TAF thus reads
  %
  \begin{verbatim}
- tamc -input 'xx_tr1_dummy ...' ...
+ taf -input 'xx_tr1_dummy ...' ...
  \end{verbatim}
  %
  and the modified operation to (\ref{perturb})
-Line 1487 
 u_{[k+1]} \, = \,  u_{[0]} \, + \, \Delt
+Line 1503 
 u_{[k+1]} \, = \,  u_{[0]} \, + \, \Delt
  $ u_{[k+1]} $ then serves as input for a forward/adjoint run
  to determine $ {\cal J} $ and $ \nabla _{u}{\cal J} $ at iteration step
  $ k+1 $.
- Tab. \ref{???} sketches the flow between forward/adjoint model
+ Tab. \ref{ask_the_author:doc_ad_2} sketches the flow between forward/adjoint model
  and the minimization routine.
+ {\scriptsize
  \begin{eqnarray*}
- \scriptsize
  \begin{array}{ccccc}
  u_{[0]} \,\, ,  \,\, \Delta u_{[k]}    & ~ & ~ & ~ & ~ \\
  {\Big\downarrow}
-Line 1542 
 ad \, v_{[k]} (\delta {\cal J}) =
+Line 1558 
 ad \, v_{[k]} (\delta {\cal J}) =
   ~ & ~ & ~ & ~ & \Delta u_{[k+1]} \\
  \end{array}
  \end{eqnarray*}
+ }
  The routines {\it ctrl\_unpack} and {\it ctrl\_pack} provide
  the link between the model and the minimization routine.
- As described in Section \ref{???}
+ As described in Section \ref{ask_the_author:doc_ad_2}
  the {\it unpack} and {\it pack} routines read and write
  control and gradient {\it vectors} which are compressed
  to contain only wet points, in addition to the full

 Legend:
-Removed from v.1.20
+Added in v.1.23