Diff of /manual/s_autodiff/text/doc_ad_2.tex

revision 1.3 by cnh, Thu Sep 27 02:00:24 2001 UTC revision 1.4 by heimbach, Fri Oct 5 22:22:20 2001 UTC
# Line 18  In principle, a variety of derived algor Line 18  In principle, a variety of derived algor
18  can be generated automatically in this way.  can be generated automatically in this way.
19    
20  The MITGCM has been adapted for use with the  The MITGCM has been adapted for use with the
21  Tangent linear and Adjoint Model Compiler (TAMC) and its succssor TAF  Tangent linear and Adjoint Model Compiler (TAMC) and its successor TAF
22  (Transformation of Algorithms in Fortran), developed  (Transformation of Algorithms in Fortran), developed
23  by Ralf Giering (\cite{gie-kam:98}, \cite{gie:99,gie:00}).  by Ralf Giering (\cite{gie-kam:98}, \cite{gie:99,gie:00}).
24  The first application of the adjoint of the MITGCM for sensitivity  The first application of the adjoint of the MITGCM for sensitivity
25  studies has been published by \cite{maro-eta:99}.  studies has been published by \cite{maro-eta:99}.
26  \cite{sta-eta:97,sta-eta:01} use the MITGCM and its adjoint  \cite{sta-eta:97,sta-eta:01} use the MITGCM and its adjoint
27  for ocean state estimation studies.  for ocean state estimation studies.
28    In the following we shall refer to TAMC and TAF synonymously,
29    except where explicitly stated otherwise.
30    
31  TAMC exploits the chain rule for computing the first  TAMC exploits the chain rule for computing the first
32  derivative of a function with  derivative of a function with
33  respect to a set of input variables.  respect to a set of input variables.
34  Treating a given forward code as a composition of operations --  Treating a given forward code as a composition of operations --
35  each line representing a compositional element -- the chain rule is  each line representing a compositional element, the chain rule is
36  rigorously applied to the code, line by line. The resulting  rigorously applied to the code, line by line. The resulting
37  tangent linear or adjoint code,  tangent linear or adjoint code,
38  then, may be thought of as the composition in  then, may be thought of as the composition in
39  forward or reverse order, respectively, of the  forward or reverse order, respectively, of the
40  Jacobian matrices of the forward code compositional elements.  Jacobian matrices of the forward code's compositional elements.
41    
42  %**********************************************************************  %**********************************************************************
43  \section{Some basic algebra}  \section{Some basic algebra}
# Line 105  In contrast to the full nonlinear model Line 107  In contrast to the full nonlinear model
107  $ M $ is just a matrix  $ M $ is just a matrix
108  which can readily be used to find the forward sensitivity of $\vec{v}$ to  which can readily be used to find the forward sensitivity of $\vec{v}$ to
109  perturbations in  $u$,  perturbations in  $u$,
110  but if there are very many input variables $(>>O(10^{6})$ for  but if there are very many input variables $(\gg O(10^{6})$ for
111  large-scale oceanographic application), it quickly becomes  large-scale oceanographic application), it quickly becomes
112  prohibitive to proceed directly as in (\ref{tangent_linear}),  prohibitive to proceed directly as in (\ref{tangent_linear}),
113  if the impact of each component $ {\bf e_{i}} $ is to be assessed.  if the impact of each component $ {\bf e_{i}} $ is to be assessed.
# Line 130  or a measure of some model-to-data misfi Line 132  or a measure of some model-to-data misfi
132  \label{compo}  \label{compo}
133  \end{eqnarray}  \end{eqnarray}
134  %  %
135  The linear approximation of $ {\cal J} $,  The perturbation of $ {\cal J} $ around a fixed point $ {\cal J}_0 $,
136  \[  \[
137  {\cal J} \, \approx \, {\cal J}_0 \, + \, \delta {\cal J}  {\cal J} \, = \, {\cal J}_0 \, + \, \delta {\cal J}
138  \]  \]
139  can be expressed in both bases of $ \vec{u} $ and $ \vec{v} $  can be expressed in both bases of $ \vec{u} $ and $ \vec{v} $
140  w.r.t. their corresponding inner product  w.r.t. their corresponding inner product
# Line 168  transpose of $ A $, Line 170  transpose of $ A $,
170  \[  \[
171  A^{\ast} \, = \, A^T  A^{\ast} \, = \, A^T
172  \]  \]
173  and from eq. (\ref{tangent_linear}), we note that  and from eq. (\ref{tangent_linear}), (\ref{deljidentity}),
174    we note that
175  (omitting $|$'s):  (omitting $|$'s):
176  %  %
177  \begin{equation}  \begin{equation}
# Line 204  the adjoint variable of the model state Line 207  the adjoint variable of the model state
207  $ \delta \vec{u}^{\ast} $ the adjoint variable of the control variable $ \vec{u} $.  $ \delta \vec{u}^{\ast} $ the adjoint variable of the control variable $ \vec{u} $.
208    
209  The {\sf reverse} nature of the adjoint calculation can be readily  The {\sf reverse} nature of the adjoint calculation can be readily
210  seen as follows. Let us decompose ${\cal J}(u)$, thus:  seen as follows.
211    Consider a model integration which consists of $ \Lambda $
212    consecutive operations
213    $ {\cal M}_{\Lambda} (  {\cal M}_{\Lambda-1} (
214    \ldots ( {\cal M}_{\lambda} (
215    \ldots
216    ( {\cal M}_{1} ( {\cal M}_{0}(\vec{u}) )))))) $,
217    where the ${\cal M}$'s could be the elementary steps, i.e. single lines
218    in the code of the model, or successive time steps of the
219    model integration,
220    starting at step 0 and moving up to step $\Lambda$, with intermediate
221    ${\cal M}_{\lambda} (\vec{u}) = \vec{v}^{(\lambda+1)}$ and final
222    ${\cal M}_{\Lambda} (\vec{u}) = \vec{v}^{(\Lambda+1)} = \vec{v}$.
223    Let ${\cal J}$ be a cost function which explicitly depends on the
224    final state $\vec{v}$ only
225    (this restriction is made for clarity only).
226    %
227    ${\cal J}(u)$ may be decomposed according to:
228  %  %
229  \begin{equation}  \begin{equation}
230  {\cal J}({\cal M}(\vec{u})) \, = \,  {\cal J}({\cal M}(\vec{u})) \, = \,
# Line 215  seen as follows. Let us decompose ${\cal Line 235  seen as follows. Let us decompose ${\cal
235  \label{compos}  \label{compos}
236  \end{equation}  \end{equation}
237  %  %
238  where the ${\cal M}$'s could be the elementary steps, i.e. single lines  Then, according to the chain rule, the forward calculation reads,
239  in the code of the model,  in terms of the Jacobi matrices
 starting at step 0 and moving up to step $\Lambda$, with intermediate  
 ${\cal M}_{\lambda} (\vec{u}) = \vec{v}^{(\lambda+1)}$ and final  
 ${\cal M}_{\Lambda} (\vec{u}) = \vec{v}^{(\Lambda+1)} = \vec{v}$  
 Then, according to the chain rule the forward calculation reads in  
 terms of the Jacobi matrices  
240  (we've omitted the $ | $'s which, nevertheless, are important  (we've omitted the $ | $'s which, nevertheless, are important
241  to the aspect of {\it tangent} linearity;  to the aspect of {\it tangent} linearity;
242  note also that per definition  note also that by definition
243  $ \langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \rangle  $ \langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \rangle
244  = \nabla_v {\cal J} \cdot \delta \vec{v} $ )  = \nabla_v {\cal J} \cdot \delta \vec{v} $ )
245  %  %
# Line 259  M_{\Lambda}^T \cdot \nabla_v {\cal J}^T Line 274  M_{\Lambda}^T \cdot \nabla_v {\cal J}^T
274  %  %
275  clearly expressing the reverse nature of the calculation.  clearly expressing the reverse nature of the calculation.
276  Eq. (\ref{reverse}) is at the heart of automatic adjoint compilers.  Eq. (\ref{reverse}) is at the heart of automatic adjoint compilers.
277  The intermediate steps $\lambda$ in  If the intermediate steps $\lambda$ in
278  eqn. (\ref{compos}) -- (\ref{reverse})  eqn. (\ref{compos}) -- (\ref{reverse})
279  could represent the model state (forward or adjoint) at each  represent the model state (forward or adjoint) at each
280  intermediate time step in which case  intermediate time step as noted above, then correspondingly,
281  $ {\cal M}(\vec{v}^{(\lambda)}) = \vec{v}^{(\lambda+1)} $, and correspondingly,  $ M^T (\delta \vec{v}^{(\lambda) \, \ast}) =
282  $ M^T (\delta \vec{v}^{(\lambda) \, \ast}) = \delta \vec{v}^{(\lambda-1) \, \ast} $,  \delta \vec{v}^{(\lambda-1) \, \ast} $ for the adjoint variables.
283  but they can also be viewed more generally as  It thus becomes evident that the adjoint calculation also
284  single lines of code in the numerical algorithm.  yields the adjoint of each model state component
285  In both cases it becomes evident that the adjoint calculation  $ \vec{v}^{(\lambda)} $ at each intermediate step $ \lambda $, namely
 yields at the same time the adjoint of each model state component  
 $ \vec{v}^{(\lambda)} $ at each intermediate step $ l $, namely  
286  %  %
287  \begin{equation}  \begin{equation}
288  \boxed{  \boxed{
# Line 285  M_{\Lambda}^T |_{\vec{v}^{(\lambda)}} \c Line 298  M_{\Lambda}^T |_{\vec{v}^{(\lambda)}} \c
298  %  %
299  in close analogy to eq. (\ref{adjoint}).  in close analogy to eq. (\ref{adjoint}).
300  We note in passing that the $\delta \vec{v}^{(\lambda) \, \ast}$  We note in passing that the $\delta \vec{v}^{(\lambda) \, \ast}$
301  are the Lagrange multipliers of the model state $ \vec{v}^{(\lambda)}$.  are the Lagrange multipliers of the model equations which determine
302    $ \vec{v}^{(\lambda)}$.
303    
304  In components, eq. (\ref{adjoint}) reads as follows.  In components, eq. (\ref{adjoint}) reads as follows.
305  Let  Let
# Line 395  and the shorthand notation for the adjoi Line 409  and the shorthand notation for the adjoi
409  $ \delta v^{(\lambda) \, \ast}_{j} = \frac{\partial}{\partial v^{(\lambda)}_{j}}  $ \delta v^{(\lambda) \, \ast}_{j} = \frac{\partial}{\partial v^{(\lambda)}_{j}}
410  {\cal J}^T $, $ j = 1, \ldots , n_{\lambda} $,  {\cal J}^T $, $ j = 1, \ldots , n_{\lambda} $,
411  for intermediate components, yielding  for intermediate components, yielding
412  \[  \begin{equation}
413  \footnotesize  \small
414    \begin{split}
415  \left(  \left(
416  \begin{array}{c}  \begin{array}{c}
417  \delta v^{(\lambda) \, \ast}_1 \\  \delta v^{(\lambda) \, \ast}_1 \\
# Line 404  for intermediate components, yielding Line 419  for intermediate components, yielding
419  \delta v^{(\lambda) \, \ast}_{n_{\lambda}} \\  \delta v^{(\lambda) \, \ast}_{n_{\lambda}} \\
420  \end{array}  \end{array}
421  \right)  \right)
422  \, = \,  \, = &
423  \left(  \left(
424  \begin{array}{ccc}  \begin{array}{ccc}
425  \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1}  \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1}
426  & \ldots &  & \ldots \,\, \ldots &
427  \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1} \\  \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1} \\
428  \vdots & ~ & \vdots \\  \vdots & ~ & \vdots \\
429  \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}}  \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}}
430  & \ldots  &  & \ldots \,\, \ldots  &
431  \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\  \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
432  \end{array}  \end{array}
433  \right)  \right)
 %  
434  \cdot  \cdot
435  %  %
436    \\ ~ & ~
437    \\ ~ &
438    %
439  \left(  \left(
440  \begin{array}{ccc}  \begin{array}{ccc}
441  \frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_1}  \frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_1}
# Line 431  for intermediate components, yielding Line 448  for intermediate components, yielding
448  \frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}} \\  \frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}} \\
449  \end{array}  \end{array}
450  \right)  \right)
451  \cdot \ldots \ldots \cdot  \cdot \, \ldots \, \cdot
452  \left(  \left(
453  \begin{array}{c}  \begin{array}{c}
454  \delta v^{\ast}_1 \\  \delta v^{\ast}_1 \\
# Line 439  for intermediate components, yielding Line 456  for intermediate components, yielding
456  \delta v^{\ast}_{n} \\  \delta v^{\ast}_{n} \\
457  \end{array}  \end{array}
458  \right)  \right)
459  \]  \end{split}
460    \end{equation}
461    
462  Eq. (\ref{forward}) and (\ref{reverse}) are perhaps clearest in  Eq. (\ref{forward}) and (\ref{reverse}) are perhaps clearest in
463  showing the advantage of the reverse over the forward mode  showing the advantage of the reverse over the forward mode
# Line 460  In contrast, eq. (\ref{reverse}) yields Line 478  In contrast, eq. (\ref{reverse}) yields
478  gradient $\nabla _{u}{\cal J}$ (and all intermediate gradients  gradient $\nabla _{u}{\cal J}$ (and all intermediate gradients
479  $\nabla _{v^{(\lambda)}}{\cal J}$) within a single reverse calculation.  $\nabla _{v^{(\lambda)}}{\cal J}$) within a single reverse calculation.
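
The contrast between eq. (\ref{forward}) and eq. (\ref{reverse}) can be sketched in a few lines of Python. This is only an illustration, not TAMC or TAF output: the toy model `step`, its hand-written tangent linear `step_tl` and adjoint `step_ad`, and the cost function (the sum of the final state) are all assumptions made for this example. The forward mode needs one tangent linear run per input component, whereas the reverse mode obtains the full gradient in one sweep.

```python
def step(v):
    """One step of a hypothetical nonlinear model M_lambda (assumed for illustration)."""
    n = len(v)
    return [v[i] * v[(i + 1) % n] + 0.1 for i in range(n)]


def step_tl(v, dv):
    """Tangent linear of `step`: applies the Jacobian M_lambda at state v to dv."""
    n = len(v)
    return [v[(i + 1) % n] * dv[i] + v[i] * dv[(i + 1) % n] for i in range(n)]


def step_ad(v, adv_new):
    """Adjoint of `step`: applies the transposed Jacobian at state v to adv_new."""
    n = len(v)
    adv = [0.0] * n
    for i in range(n):
        adv[i] += v[(i + 1) % n] * adv_new[i]
        adv[(i + 1) % n] += v[i] * adv_new[i]
    return adv


def forward_trajectory(u, nsteps):
    """Store every intermediate state; the Jacobians are evaluated at these points."""
    traj = [list(u)]
    for _ in range(nsteps):
        traj.append(step(traj[-1]))
    return traj


def gradient_forward_mode(u, nsteps):
    """One tangent linear integration per input component e_k."""
    traj = forward_trajectory(u, nsteps)
    grad = []
    for k in range(len(u)):
        dv = [1.0 if i == k else 0.0 for i in range(len(u))]
        for lam in range(nsteps):
            dv = step_tl(traj[lam], dv)
        grad.append(sum(dv))  # J = sum of final state, so dJ = sum of dv
    return grad


def gradient_reverse_mode(u, nsteps):
    """A single adjoint integration, stepping through the trajectory backwards."""
    traj = forward_trajectory(u, nsteps)
    adv = [1.0] * len(u)  # gradient of J = sum(v) w.r.t. the final state
    for lam in reversed(range(nsteps)):
        adv = step_ad(traj[lam], adv)
    return adv
```

Calling both functions with the same input, e.g. `gradient_forward_mode([0.5, -0.3, 0.8], 4)` and `gradient_reverse_mode([0.5, -0.3, 0.8], 4)`, should give the same gradient up to rounding; the forward variant performs three tangent linear integrations, the reverse variant one adjoint integration.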
480    
481  Note, that in case $ {\cal J} $ is a vector-valued function  Note that if $ {\cal J} $ is a vector-valued function
482  of dimension $ l > 1 $,  of dimension $ l > 1 $,
483  eq. (\ref{reverse}) has to be modified according to  eq. (\ref{reverse}) has to be modified according to
484  \[  \[
# Line 468  M^T \left( \nabla_v {\cal J}^T \left(\de Line 486  M^T \left( \nabla_v {\cal J}^T \left(\de
486  \, = \,  \, = \,
487  \nabla_u {\cal J}^T \cdot \delta \vec{J}  \nabla_u {\cal J}^T \cdot \delta \vec{J}
488  \]  \]
489  where now $ \delta \vec{J} \in I\!\!R $ is a vector of dimenison $ l $.  where now $ \delta \vec{J} \in I\!\!R^l $ is a vector of
490    dimension $ l $.
491  In this case $ l $ reverse simulations have to be performed  In this case $ l $ reverse simulations have to be performed
492  for each $ \delta J_{k}, \,\, k = 1, \ldots, l $.  for each $ \delta J_{k}, \,\, k = 1, \ldots, l $.
493  Then, the reverse mode is more efficient as long as  Then, the reverse mode is more efficient as long as
# Line 503  operator onto the $j$-th component ${\bf Line 522  operator onto the $j$-th component ${\bf
522  \paragraph{Example 2:  \paragraph{Example 2:
523  $ {\cal J} = \langle \, {\cal H}(\vec{v}) - \vec{d} \, ,  $ {\cal J} = \langle \, {\cal H}(\vec{v}) - \vec{d} \, ,
524   \, {\cal H}(\vec{v}) - \vec{d} \, \rangle $} ~ \\   \, {\cal H}(\vec{v}) - \vec{d} \, \rangle $} ~ \\
525  The cost function represents the quadratic model vs.data misfit.  The cost function represents the quadratic model vs. data misfit.
526  Here, $ \vec{d} $ is the data vector and $ {\cal H} $ represents the  Here, $ \vec{d} $ is the data vector and $ {\cal H} $ represents the
527  operator which maps the model state space onto the data space.  operator which maps the model state space onto the data space.
528  Then, $ \nabla_v {\cal J} $ takes the form  Then, $ \nabla_v {\cal J} $ takes the form
# Line 534  H \cdot \left( {\cal H}(\vec{v}) - \vec{ Line 553  H \cdot \left( {\cal H}(\vec{v}) - \vec{
553    
554  We note an important aspect of the forward vs. reverse  We note an important aspect of the forward vs. reverse
555  mode calculation.  mode calculation.
556  Because of the locality of the derivative,  Because of the local character of the derivative
557    (a derivative is defined w.r.t. a point along the trajectory),
558  the intermediate results of the model trajectory  the intermediate results of the model trajectory
559  $\vec{v}^{(\lambda+1)}={\cal M}_{\lambda}(v^{(\lambda)})$  $\vec{v}^{(\lambda+1)}={\cal M}_{\lambda}(v^{(\lambda)})$
560  are needed to evaluate the intermediate Jacobian  are needed to evaluate the intermediate Jacobian
561  $M_{\lambda}|_{\vec{v}^{(\lambda)}} \, \delta \vec{v}^{(\lambda)} $.  $M_{\lambda}|_{\vec{v}^{(\lambda)}} \, \delta \vec{v}^{(\lambda)} $.
562  In the forward mode, the intermediate results are required  In the forward mode, the intermediate results are required
563  in the same order as computed by the full forward model ${\cal M}$,  in the same order as computed by the full forward model ${\cal M}$,
564  in the reverse mode they are required in the reverse order.  but in the reverse mode they are required in the reverse order.
565  Thus, in the reverse mode the trajectory of the forward model  Thus, in the reverse mode the trajectory of the forward model
566  integration ${\cal M}$ has to be stored to be available in the reverse  integration ${\cal M}$ has to be stored to be available in the reverse
567  calculation. Alternatively, the model state would have to be  calculation. Alternatively, the complete model state up to the
568  recomputed whenever its value is required.  point of evaluation has to be recomputed whenever its value is required.
569    
570  A method to balance the amount of recomputations vs.  A method to balance the amount of recomputations vs.
571  storage requirements is called {\sf checkpointing}  storage requirements is called {\sf checkpointing}
572  (e.g. \cite{res-eta:98}).  (e.g. \cite{res-eta:98}).
573  It is depicted in Fig. ... for a 3-level checkpointing  It is depicted in \reffig{3levelcheck} for a 3-level checkpointing
574  [as concrete example, we give explicit numbers for a 3-day  [as an example, we give explicit numbers for a 3-day
575  integration with a 1-hourly timestep in square brackets].  integration with a 1-hourly timestep in square brackets].
576  \begin{itemize}  \begin{itemize}
577  %  %
# Line 559  integration with a 1-hourly timestep in Line 579  integration with a 1-hourly timestep in
579  In a first step, the model trajectory is subdivided into  In a first step, the model trajectory is subdivided into
580  $ {n}^{lev3} $ subsections [$ {n}^{lev3} $=3 1-day intervals],  $ {n}^{lev3} $ subsections [$ {n}^{lev3} $=3 1-day intervals],
581  with the label $lev3$ for this outermost loop.  with the label $lev3$ for this outermost loop.
582  The model is then integrated over the full trajectory,  The model is then integrated along the full trajectory,
583  and the model state stored only at every $ k_{i}^{lev3} $-th timestep  and the model state stored only at every $ k_{i}^{lev3} $-th timestep
584  [i.e. 3 times, at  [i.e. 3 times, at
585  $ i = 0,1,2 $ corresponding to $ k_{i}^{lev3} = 0, 24, 48 $].  $ i = 0,1,2 $ corresponding to $ k_{i}^{lev3} = 0, 24, 48 $].
586  %  %
587  \item [$lev2$]  \item [$lev2$]
588  In a second step each subsection is itself divided into  In a second step each subsection itself is divided into
589  $ {n}^{lev2} $ subsubsections  $ {n}^{lev2} $ sub-subsections
590  [$ {n}^{lev2} $=4 6-hour intervals per subsection].  [$ {n}^{lev2} $=4 6-hour intervals per subsection].
591  The model picks up at the last outermost dumped state  The model picks up at the last outermost dumped state
592  $ v_{k_{n}^{lev3}} $ and is integrated forward in time over  $ v_{k_{n}^{lev3}} $ and is integrated forward in time along
593  the last subsection, with the label $lev2$ for this    the last subsection, with the label $lev2$ for this  
594  intermediate loop.  intermediate loop.
595  The model state is now stored only at every $ k_{i}^{lev2} $-th  The model state is now stored at every $ k_{i}^{lev2} $-th
596  timestep  timestep
597  [i.e. 4 times, at  [i.e. 4 times, at
598  $ i = 0,1,2,3 $ corresponding to $ k_{i}^{lev2} = 48, 54, 60, 66 $].  $ i = 0,1,2,3 $ corresponding to $ k_{i}^{lev2} = 48, 54, 60, 66 $].
599  %  %
600  \item [$lev1$]  \item [$lev1$]
601  Finally, the mode picks up at the last intermediate dump state  Finally, the model picks up at the last intermediate dump state
602  $ v_{k_{n}^{lev2}} $ and is integrated forward in time over  $ v_{k_{n}^{lev2}} $ and is integrated forward in time along
603  the last subsubsection, with the label $lev1$ for this    the last sub-subsection, with the label $lev1$ for this  
604  innermost loop.  innermost loop.
605  Within this subsubsection only, the model state is stored  Within this sub-subsection only, the model state is stored
606  at every timestep  at every timestep
607  [i.e. every hour $ i=0,...,5$ corresponding to  [i.e. every hour $ i=0,...,5$ corresponding to
608  $ k_{i}^{lev1} = 66, 67, \ldots, 71 $].  $ k_{i}^{lev1} = 66, 67, \ldots, 71 $].
609  Thus, the  final state $ v_n = v_{k_{n}^{lev1}} $ is reached  Thus, the  final state $ v_n = v_{k_{n}^{lev1}} $ is reached
610  and the model state of all preceding timesteps over the last  and the model state of all preceding timesteps along the last
611  subsubsections are available, enabling integration backwards  sub-subsections are available, enabling integration backwards
612  in time over the last subsubsection.  in time along the last sub-subsection.
613  Thus, the adjoint can be computed over this last  Thus, the adjoint can be computed along this last
614  subsubsection $k_{n}^{lev2}$.  sub-subsection $k_{n}^{lev2}$.
615  %  %
616  \end{itemize}  \end{itemize}
617  %  %
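
The three levels described above can be replayed in a small Python sketch for the 3-day example (3 one-day sections, 4 six-hour sub-subsections each, 6 hourly steps each). It is a simplified illustration under stated assumptions, not the MITgcm implementation: the scalar model `step`, its adjoint `step_ad`, and the bookkeeping variables `forward_steps` and `peak_stored` are hypothetical, and every section is recomputed in full from its stored starting state.

```python
def step(v):
    """One time step of a hypothetical scalar model (assumed for illustration)."""
    return v + 0.01 * v * (1.0 - v)


def step_ad(v, adv):
    """Adjoint of `step`: multiply by the local Jacobian evaluated at state v."""
    return (1.0 + 0.01 * (1.0 - 2.0 * v)) * adv


def adjoint_store_all(u, nsteps):
    """Reference: keep the whole trajectory in memory, then sweep backwards."""
    traj = [u]
    for _ in range(nsteps):
        traj.append(step(traj[-1]))
    adv = 1.0  # cost function J = final state
    for k in reversed(range(nsteps)):
        adv = step_ad(traj[k], adv)
    return adv


def adjoint_3level(u, n3, n2, n1):
    """Adjoint with 3-level checkpointing over n3 * n2 * n1 time steps."""
    forward_steps = 0
    peak_stored = 0

    # lev3: integrate the full trajectory, store only the start of each section
    store3 = []
    v = u
    for _i3 in range(n3):
        store3.append(v)
        for _k in range(n2 * n1):
            v = step(v)
            forward_steps += 1

    adv = 1.0
    for i3 in reversed(range(n3)):
        # lev2: restart from the lev3 dump, store the start of each sub-subsection
        store2 = []
        v = store3[i3]
        for _i2 in range(n2):
            store2.append(v)
            for _k in range(n1):
                v = step(v)
                forward_steps += 1
        for i2 in reversed(range(n2)):
            # lev1: restart from the lev2 dump, store the state at every time step
            store1 = []
            v = store2[i2]
            for _k in range(n1):
                store1.append(v)
                v = step(v)
                forward_steps += 1
            peak_stored = max(peak_stored,
                              len(store3) + len(store2) + len(store1))
            # adjoint sweep over this sub-subsection, in reverse order
            for k in reversed(range(n1)):
                adv = step_ad(store1[k], adv)
    return adv, forward_steps, peak_stored
```

With `adjoint_3level(0.2, 3, 4, 6)` the sketch holds at most 3 + 4 + 6 = 13 model states at a time instead of 72, at the price of three forward sweeps (216 forward steps for a 72-step integration), and returns the same adjoint value as `adjoint_store_all(0.2, 72)`.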
618  This procedure is repeated consecutively for each previous  This procedure is repeated consecutively for each previous
619  subsubsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $  sub-subsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $
620  carrying the adjoint computation to the initial time  carrying the adjoint computation to the initial time
621  of the subsection $k_{n}^{lev3}$.  of the subsection $k_{n}^{lev3}$.
622  Then, the procedure is repeated for the previous subsection  Then, the procedure is repeated for the previous subsection
# Line 632  on the computing resources available. Line 652  on the computing resources available.
652  \caption  \caption
653  {Schematic view of intermediate dump and restart for  {Schematic view of intermediate dump and restart for
654  3-level checkpointing.}  3-level checkpointing.}
655  \label{fig:erswns}  \label{fig:3levelcheck}
656  \end{figure}  \end{figure}
657    
658  \subsection{Optimal perturbations}  % \subsection{Optimal perturbations}
659  \label{optpert}  % \label{sec_optpert}
660    
661    
662  \subsection{Error covariance estimate and Hessian matrix}  % \subsection{Error covariance estimate and Hessian matrix}
663  \label{sec_hessian}  % \label{sec_hessian}
664    
665  \newpage  \newpage
666    
# Line 649  on the computing resources available. Line 669  on the computing resources available.
669  \label{sec_ad_setup_ex}  \label{sec_ad_setup_ex}
670  %**********************************************************************  %**********************************************************************
671    
672  The MITGCM has been adapted to enable AD using TAMC or TAF  The MITGCM has been adapted to enable AD using TAMC or TAF.
 (we'll refer to TAMC and TAF interchangeably, except where  
 distinctions are explicitly mentioned).  
673  The present description, therefore, is specific to the  The present description, therefore, is specific to the
674  use of TAMC as AD tool.  use of TAMC or TAF as AD tool.
675  The following sections describe the steps which are necessary to  The following sections describe the steps which are necessary to
676  generate a tangent linear or adjoint model of the MITGCM.  generate a tangent linear or adjoint model of the MITGCM.
677  We take as an example the sensitivity of carbon sequestration  We take as an example the sensitivity of carbon sequestration
# Line 664  The AD-relevant hooks in the code are sk Line 682  The AD-relevant hooks in the code are sk
682  \subsection{Overview of the experiment}  \subsection{Overview of the experiment}
683    
684  We describe an adjoint sensitivity analysis of outgassing from  We describe an adjoint sensitivity analysis of outgassing from
685  the ocean into the atmosphere of a carbon like tracer injected  the ocean into the atmosphere of a carbon-like tracer injected
686  into the ocean interior (see \cite{hil-eta:01}).  into the ocean interior (see \cite{hil-eta:01}).
687    
688  \subsubsection{Passive tracer equation}  \subsubsection{Passive tracer equation}
# Line 686  represents interior sources of $C$ such Line 704  represents interior sources of $C$ such
704  direct injection.  direct injection.
705  The velocity term, $U$, is the sum of the  The velocity term, $U$, is the sum of the
706  model Eulerian circulation and an eddy-induced velocity, the latter  model Eulerian circulation and an eddy-induced velocity, the latter
707  parameterized according to Gent/McWilliams (\cite{gen:90, dan:95}).  parameterized according to Gent/McWilliams
708    (\cite{gen-mcw:90, gen-eta:95}).
709  The convection function, $\Gamma$, mixes $C$ vertically wherever the  The convection function, $\Gamma$, mixes $C$ vertically wherever the
710  fluid is locally statically unstable.  fluid is locally statically unstable.
711    
# Line 777  together with the forcing fields and and Line 796  together with the forcing fields and and
796  %  %
797  \item {\it data.ctrl}  \item {\it data.ctrl}
798  %  %
799    \item {\it data.gmredi}
800    %
801    \item {\it data.grdchk}
802    %
803    \item {\it data.optim}
804    %
805  \item {\it data.pkg}  \item {\it data.pkg}
806  %  %
807  \item {\it eedata}  \item {\it eedata}
# Line 807  Below we describe the customisations of Line 832  Below we describe the customisations of
832  specific to this experiment.  specific to this experiment.
833    
834  \subsubsection{File {\it .genmakerc}}  \subsubsection{File {\it .genmakerc}}
835  This file overwites default settings of {\it genmake}.  This file overwrites default settings of {\it genmake}.
836  In the present example it is used to switch on the following  In the present example it is used to switch on the following
837  packages which are related to automatic differentiation  packages which are related to automatic differentiation
838  and are disabled by default: \\  and are disabled by default: \\
839  \hspace*{4ex} {\tt set ENABLE=( autodiff cost ctrl ecco )}  \\  \hspace*{4ex} {\tt set ENABLE=( autodiff cost ctrl ecco gmredi grdchk kpp )}  \\
840  Other packages which are not needed are switched off: \\  Other packages which are not needed are switched off: \\
841  \hspace*{4ex} {\tt set DISABLE=( aim obcs zonal\_filt shap\_filt cal exf )}  \hspace*{4ex} {\tt set DISABLE=( aim obcs zonal\_filt shap\_filt cal exf )}
842    
# Line 828  the standard include of the {\it CPP\_OP Line 853  the standard include of the {\it CPP\_OP
853    
854  This file contains 'wrapper'-specific CPP options.  This file contains 'wrapper'-specific CPP options.
855  It only needs to be changed if the code is to be run  It only needs to be changed if the code is to be run
856  in  parallel environment (see Section \ref{???}).  in a parallel environment (see Section \ref{???}).
857    
858  \subsubsection{File {\it CPP\_OPTIONS.h}}  \subsubsection{File {\it CPP\_OPTIONS.h}}
859    
# Line 837  This file contains model-specific CPP op Line 862  This file contains model-specific CPP op
862  Most options are related to the forward model setup.  Most options are related to the forward model setup.
863  They are identical to the global steady circulation setup of  They are identical to the global steady circulation setup of
864  {\it verification/exp2/}.  {\it verification/exp2/}.
865  The option specific to this experiment is \\  The three options specific to this experiment are \\
866    \hspace*{4ex} {\tt \#define ALLOW\_PASSIVE\_TRACER} \\
867    This flag enables the code to carry through the
868    advection/diffusion of a passive tracer along the
869    model integration. \\
870  \hspace*{4ex} {\tt \#define ALLOW\_MIT\_ADJOINT\_RUN} \\  \hspace*{4ex} {\tt \#define ALLOW\_MIT\_ADJOINT\_RUN} \\
871  This flag enables the inclusion of some AD-related fields  This flag enables the inclusion of some AD-related fields
872  concerning initialisation, link between control variables  concerning initialisation, link between control variables
873  and forward model variables, and the call to the top-level  and forward model variables, and the call to the top-level
874  forward/adjoint subroutine {\it adthe\_main\_loop}  forward/adjoint subroutine {\it adthe\_main\_loop}
875  instead of {\it the\_main\_loop}.  instead of {\it the\_main\_loop}. \\
876    \hspace*{4ex} {\tt \#define ALLOW\_GRADIENT\_CHECK} \\
877    This flag enables the gradient check package.
878    After computing the unperturbed cost function and its gradient,
879    a series of computations is performed in which \\
880    $\bullet$ an element of the control vector is perturbed \\
881    $\bullet$ the cost function is recomputed for the perturbed
882    control vector \\
883    $\bullet$ the finite difference gradient is computed from the difference
884    between the perturbed and unperturbed cost function \\
885    $\bullet$ the finite difference gradient is compared with the
886    adjoint-generated gradient.
887    The gradient check package is further described in Section ???.
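
The steps of such a gradient check can be sketched as follows. This is a generic Python illustration, not the code of the gradient check package: the quadratic `cost`, the analytic `cost_gradient` standing in for the adjoint-generated gradient, and the perturbation size `eps` are assumptions made for this example.

```python
def cost(u, d):
    """Hypothetical cost function: quadratic misfit between controls u and data d."""
    return sum((ui - di) ** 2 for ui, di in zip(u, d))


def cost_gradient(u, d):
    """Stand-in for the adjoint-generated gradient (analytic for this toy cost)."""
    return [2.0 * (ui - di) for ui, di in zip(u, d)]


def gradient_check(u, d, eps=1.0e-6):
    """Compare finite difference and 'adjoint' gradients element by element."""
    j0 = cost(u, d)                 # unperturbed cost function
    grad_ad = cost_gradient(u, d)   # gradient to be verified
    report = []
    for i in range(len(u)):
        u_pert = list(u)
        u_pert[i] += eps            # perturb one element of the control vector
        grad_fd = (cost(u_pert, d) - j0) / eps
        # a relative deviation close to zero indicates agreement
        report.append((i, grad_ad[i], grad_fd, 1.0 - grad_fd / grad_ad[i]))
    return report
```

Each entry of the returned list holds the element index, both gradient values and their relative deviation. One extra forward run is needed per perturbed element, which is why such a check is usually restricted to a small subset of the control vector.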
888    
889  \subsubsection{File {\it ECCO\_OPTIONS.h}}  \subsubsection{File {\it ECCO\_OPTIONS.h}}
890    
# Line 878  enables all general aspects of the cost Line 919  enables all general aspects of the cost
919  in particular the hooks in the forward code for  in particular the hooks in the forward code for
920  initialising, accumulating and finalizing the cost function. \\  initialising, accumulating and finalizing the cost function. \\
921  \hspace*{4ex} {\tt \#define ALLOW\_COST\_TRACER} \\  \hspace*{4ex} {\tt \#define ALLOW\_COST\_TRACER} \\
922  includes the subroutine with the cost function for this  includes the call to the cost function for this
923  particular experiment, eqn. (\ref{cost_tracer}).  particular experiment, eqn. (\ref{cost_tracer}).
924  %  %
925  \item Control variable package: {\it pkg/ctrl/} \\  \item Control variable package: {\it pkg/ctrl/} \\
# Line 900  meridional wind stress \\ Line 941  meridional wind stress \\
941  freshwater flux \\  freshwater flux \\
942  \hspace*{2ex} {\tt \#define ALLOW\_HFLUX0\_CONTROL} &  \hspace*{2ex} {\tt \#define ALLOW\_HFLUX0\_CONTROL} &
943  heat flux \\  heat flux \\
944  \hspace*{2ex} {\tt \#undef ALLOW\_DIFFKR\_CONTROL} &  \hspace*{2ex} {\tt \#define ALLOW\_DIFFKR\_CONTROL} &
945  diapycnal diffusivity \\  diapycnal diffusivity \\
946  \hspace*{2ex} {\tt \#undef ALLOW\_KAPPAGM\_CONTROL} &  \hspace*{2ex} {\tt \#undef ALLOW\_KAPPAGM\_CONTROL} &
947  isopycnal diffusivity \\  isopycnal diffusivity \\
# Line 932  The common blocks are used by the adjoin Line 973  The common blocks are used by the adjoin
973  \hspace*{4ex} is related to {\it DYNVARS.h} \\  \hspace*{4ex} is related to {\it DYNVARS.h} \\
974  \hspace*{4ex} {\tt common /addynvars\_cd/} &  \hspace*{4ex} {\tt common /addynvars\_cd/} &
975  \hspace*{4ex} is related to {\it DYNVARS.h} \\  \hspace*{4ex} is related to {\it DYNVARS.h} \\
976    \hspace*{4ex} {\tt common /addynvars\_diffkr/} &
977    \hspace*{4ex} is related to {\it DYNVARS.h} \\
978    \hspace*{4ex} {\tt common /addynvars\_kapgm/} &
979    \hspace*{4ex} is related to {\it DYNVARS.h} \\
980  \hspace*{4ex} {\tt common /adtr1\_r/} &  \hspace*{4ex} {\tt common /adtr1\_r/} &
981  \hspace*{4ex} is related to {\it TR1.h} \\  \hspace*{4ex} is related to {\it TR1.h} \\
982  \hspace*{4ex} {\tt common /adffields/} &  \hspace*{4ex} {\tt common /adffields/} &
# Line 956  This routine contains the dimensions for Line 1001  This routine contains the dimensions for
1001  3-level checkpointing is enabled, i.e. the timestepping  3-level checkpointing is enabled, i.e. the timestepping
1002  is divided into three different levels (see Section \ref{???}).  is divided into three different levels (see Section \ref{???}).
1003  The model state of the outermost ({\tt nchklev\_3}) and the  The model state of the outermost ({\tt nchklev\_3}) and the
1004  itermediate ({\tt nchklev\_2}) timestepping loop are stored to file  intermediate ({\tt nchklev\_2}) timestepping loop are stored to file
1005  (handled in {\it the\_main\_loop}).  (handled in {\it the\_main\_loop}).
1006  The innermost loop ({\tt nchklev\_1})  The innermost loop ({\tt nchklev\_1})
1007  avoids I/O by storing all required variables  avoids I/O by storing all required variables
# Line 968  In the present example the dimensions ar Line 1013  In the present example the dimensions ar
1013  \hspace*{4ex} {\tt nchklev\_2      =  30 } \\  \hspace*{4ex} {\tt nchklev\_2      =  30 } \\
1014  \hspace*{4ex} {\tt nchklev\_3      =  60 } \\  \hspace*{4ex} {\tt nchklev\_3      =  60 } \\
1015  To guarantee that the checkpointing intervals span the entire  To guarantee that the checkpointing intervals span the entire
1016  integration period the relation \\  integration period the following relation must be satisfied: \\
1017  \hspace*{4ex} {\tt nchklev\_1*nchklev\_2*nchklev\_3 $ \ge $ nTimeSteps} \\  \hspace*{4ex} {\tt nchklev\_1*nchklev\_2*nchklev\_3 $ \ge $ nTimeSteps} \\
1018  where {\tt nTimeSteps} is either specified in {\it data}  where {\tt nTimeSteps} is either specified in {\it data}
1019  or computed via \\  or computed via \\
# Line 982  Similar to above, the following relation Line 1027  Similar to above, the following relation
1027  %  %
1028  \end{itemize}  \end{itemize}
1029    
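The checkpointing relation {\tt nchklev\_1*nchklev\_2*nchklev\_3 $ \ge $ nTimeSteps} stated above can be verified before a run. The following is a minimal Python sketch, not part of the MITGCM code: the function names are hypothetical, {\tt nchklev\_2} and {\tt nchklev\_3} take the values quoted in the text, and the values of {\tt nchklev\_1} and {\tt nTimeSteps} are assumed for illustration only.

```python
def checkpoints_cover_run(nchklev_1, nchklev_2, nchklev_3, n_time_steps):
    """Return True if the three checkpoint levels span the whole integration."""
    return nchklev_1 * nchklev_2 * nchklev_3 >= n_time_steps


def min_innermost_level(nchklev_2, nchklev_3, n_time_steps):
    """Smallest nchklev_1 satisfying the relation for given outer levels."""
    outer = nchklev_2 * nchklev_3
    # Integer ceiling division avoids floating-point rounding.
    return (n_time_steps + outer - 1) // outer


# nchklev_2 and nchklev_3 are the values quoted in the text;
# nchklev_1 = 36 and n_time_steps = 64800 are hypothetical.
NCHKLEV_2, NCHKLEV_3 = 30, 60
print(checkpoints_cover_run(36, NCHKLEV_2, NCHKLEV_3, 64800))
print(min_innermost_level(NCHKLEV_2, NCHKLEV_3, 64800))
```

A check of this kind is cheap compared with discovering, after a long integration, that the checkpoint dimensions were too small.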
1030    The following parameters may be worth describing: \\
1031    %
1032    \hspace*{4ex} {\tt isbyte} \\
1033    \hspace*{4ex} {\tt maxpass} \\
1034    ~
1035    
1036  \subsubsection{File {\it makefile}}  \subsubsection{File {\it makefile}}
1037    
1038  This file contains all relevant parameter flags and  This file contains all relevant parameter flags and
1039  lists to run TAMC.  lists to run TAMC or TAF.
1040  It is assumed that TAMC is available to you, either locally,  It is assumed that TAMC is available to you, either locally,
1041  being installed on your network, or remotely through the 'TAMC Utility'.  being installed on your network, or remotely through the 'TAMC Utility'.
1042  TAMC is called with the command {\tt tamc} followed by a  TAMC is called with the command {\tt tamc} followed by a
# Line 996  Here we briefly discuss the main flags u Line 1047  Here we briefly discuss the main flags u
1047  \begin{itemize}  \begin{itemize}
1048  \item [{\tt tamc}] {\tt  \item [{\tt tamc}] {\tt
1049  -input <variable names>  -input <variable names>
1050  -output <variable name> ... \\  -output <variable name> -r4 ... \\
1051  -toplevel <S/R name> -reverse <file names>  -toplevel <S/R name> -reverse <file names>
1052  }  }
1053  \end{itemize}  \end{itemize}
# Line 1017  Dependent variable $ J $  which is to be Line 1068  Dependent variable $ J $  which is to be
1068  \item {\tt -reverse <file names>} \\  \item {\tt -reverse <file names>} \\
1069  Adjoint code is generated to compute the sensitivity of a  Adjoint code is generated to compute the sensitivity of a
1070  dependent variable w.r.t.  many independent variables.  dependent variable w.r.t.  many independent variables.
1071  The generated adjoint top-level routine computes the product  In the discussion of Section ???
1072    the generated adjoint top-level routine computes the product
1073  of the transposed Jacobian matrix $ M^T $ times  of the transposed Jacobian matrix $ M^T $ times
1074  the gradient vector $ \nabla_v J $.  the gradient vector $ \nabla_v J $.
1075  \\  \\
# Line 1028  above the top-level routine (some initia Line 1080  above the top-level routine (some initia
1080  deliberately hidden from TAMC, either because hand-written  deliberately hidden from TAMC, either because hand-written
1081  adjoint routines exist, or the routines must not (or don't have to)  adjoint routines exist, or the routines must not (or don't have to)
1082  be differentiated. For each routine which is part of the flow tree  be differentiated. For each routine which is part of the flow tree
1083  of the top-level routine, but deliberately hidden from TAMC,  of the top-level routine, but deliberately hidden from TAMC
1084    (or for each package which contains such routines),
1085  a corresponding file {\it .flow} exists containing flow directives  a corresponding file {\it .flow} exists containing flow directives
1086  for TAMC.  for TAMC.
1087  %  %
1088    \item {\tt -r4} \\
1089    ~
1090    %
1091  \end{itemize}  \end{itemize}
1092    
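To illustrate what reverse mode delivers, namely the product of the transposed Jacobian $ M^T $ with the gradient vector $ \nabla_v J $, the following minimal Python sketch uses a small linear model. It is not TAMC output: the matrix {\tt M}, the quadratic cost and all function names are assumptions chosen for illustration. The finite-difference routine is included only to show that one reverse sweep reproduces what would otherwise need one pair of forward runs per control.

```python
def model(M, u):
    """Linear 'model' v = M u, with M given as a list of rows."""
    return [sum(m_ij * u_j for m_ij, u_j in zip(row, u)) for row in M]


def cost(v):
    """Scalar cost J(v) = 0.5 * sum(v_i^2), so that grad_v J = v."""
    return 0.5 * sum(v_i * v_i for v_i in v)


def adjoint_gradient(M, u):
    """grad_u J = M^T grad_v J, obtained in a single reverse sweep."""
    grad_v = model(M, u)  # for this particular cost, grad_v J equals v
    return [sum(M[i][j] * grad_v[i] for i in range(len(M)))
            for j in range(len(u))]


def finite_difference_gradient(M, u, eps=1e-6):
    """Central differences: two forward runs for every control variable."""
    grad = []
    for j in range(len(u)):
        up = list(u)
        dn = list(u)
        up[j] += eps
        dn[j] -= eps
        grad.append((cost(model(M, up)) - cost(model(M, dn))) / (2 * eps))
    return grad


M = [[1, 2], [3, 4], [5, 6]]   # hypothetical Jacobian
u = [1, -1]                    # hypothetical control values
print(adjoint_gradient(M, u))
print(finite_difference_gradient(M, u))
```

The cost of the adjoint evaluation does not grow with the number of controls, which is why reverse mode is preferred when one scalar cost depends on many inputs.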
1093    
# Line 1041  for TAMC. Line 1097  for TAMC.
1097    
1098  \subsubsection{File {\it data.ctrl}}  \subsubsection{File {\it data.ctrl}}
1099    
1100    \subsubsection{File {\it data.gmredi}}
1101    
1102    \subsubsection{File {\it data.grdchk}}
1103    
1104    \subsubsection{File {\it data.optim}}
1105    
1106  \subsubsection{File {\it data.pkg}}  \subsubsection{File {\it data.pkg}}
1107    
1108  \subsubsection{File {\it eedata}}  \subsubsection{File {\it eedata}}
# Line 1060  for TAMC. Line 1122  for TAMC.
1122  \newpage  \newpage
1123    
1124  %**********************************************************************  %**********************************************************************
1125  \section{TLM and ADM code generation in general}  \section{TLM and ADM generation in general}
1126  \label{sec_ad_setup_gen}  \label{sec_ad_setup_gen}
1127  %**********************************************************************  %**********************************************************************
1128    
# Line 1068  In this section we describe in a general Line 1130  In this section we describe in a general
1130  the parts of the code that are relevant for automatic  the parts of the code that are relevant for automatic
1131  differentiation using the software tool TAMC.  differentiation using the software tool TAMC.
1132    
1133  \subsection{The cost function (dependent variable)}  \begin{figure}[b!]
1134    \input{part5/doc_ad_the_model}
1135    \caption{~}
1136    \label{fig:adthemodel}
1137    \end{figure}
1138    
1139    The basic flow is depicted in \reffig{adthemodel}.
1140    If the option {\tt ALLOW\_AUTODIFF\_TAMC} is defined, the driver routine
1141    {\it the\_model\_main}, instead of calling {\it the\_main\_loop},
1142    invokes the adjoint of this routine, {\it adthe\_main\_loop},
1143    which is the toplevel routine in terms of reverse mode computation.
1144    The routine {\it adthe\_main\_loop} has been generated using TAMC.
1145    It contains both the forward integration of the full model,
1146    any additional storing that is required for efficient checkpointing,
1147    and the reverse integration of the adjoint model.
1148    The structure of {\it adthe\_main\_loop} has been strongly
1149    simplified for clarification; in particular, no checkpointing
1150    procedures are shown here.
1151    Prior to the call of {\it adthe\_main\_loop}, the routine
1152    {\it ctrl\_unpack} is invoked to unpack the control vector,
1153    and following that call, the routine {\it ctrl\_pack}
1154    is invoked to pack the control vector
1155    (cf. Section \ref{section_ctrl}).
1156    If gradient checks are to be performed, the option
1157    {\tt ALLOW\_GRADIENT\_CHECK} is defined. In this case
1158    the driver routine {\it grdchk\_main} is called after
1159    the gradient has been computed via the adjoint
1160    (cf. Section \ref{section_grdchk}).
1161    
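The calling sequence described above can be summarised in a short Python sketch. It only records the order of the calls and is not the MITGCM driver: the two boolean arguments stand in for the CPP options {\tt ALLOW\_AUTODIFF\_TAMC} and {\tt ALLOW\_GRADIENT\_CHECK}, and we assume (the text does not say so explicitly) that {\it grdchk\_main} is reached only in an adjoint run and after {\it ctrl\_pack}.

```python
def the_model_main(allow_autodiff_tamc, allow_gradient_check):
    """Return the order in which the driver would call its main routines."""
    calls = []
    if allow_autodiff_tamc:
        calls.append("ctrl_unpack")      # unpack the control vector
        calls.append("adthe_main_loop")  # forward and adjoint integration
        calls.append("ctrl_pack")        # pack the control vector
        if allow_gradient_check:
            # Assumed placement: after the adjoint gradient is available.
            calls.append("grdchk_main")
    else:
        calls.append("the_main_loop")    # plain forward integration
    return calls


print(the_model_main(True, True))
print(the_model_main(False, False))
```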
1162    \subsection{The cost function (dependent variable)
1163    \label{section_cost}}
1164    
1165  The cost function $ {\cal J} $ is referred to as the {\sf dependent variable}.  The cost function $ {\cal J} $ is referred to as the {\sf dependent variable}.
1166  It is a function of the input variables $ \vec{u} $ via the composition  It is a function of the input variables $ \vec{u} $ via the composition
# Line 1076  $ {\cal J}(\vec{u}) \, = \, {\cal J}(M(\ Line 1168  $ {\cal J}(\vec{u}) \, = \, {\cal J}(M(\
1168  The input is referred to as the  The input is referred to as the
1169  {\sf independent variables} or {\sf control variables}.  {\sf independent variables} or {\sf control variables}.
1170  All aspects relevant to the treatment of the cost function $ {\cal J} $  All aspects relevant to the treatment of the cost function $ {\cal J} $
1171  (parameter setting, initialisation, incrementation,  (parameter setting, initialisation, accumulation,
1172  final evaluation), are controled by the package {\it pkg/cost}.  final evaluation), are controlled by the package {\it pkg/cost}.
1173    
1174    \begin{figure}[h!]
1175    \input{part5/doc_cost_flow}
1176    \caption{~}
1177    \label{fig:costflow}
1178    \end{figure}
1179    
1180  \subsubsection{genmake and CPP options}  \subsubsection{genmake and CPP options}
1181  %  %
# Line 1097  compile list in 3 different ways (cf. Se Line 1195  compile list in 3 different ways (cf. Se
1195  \begin{enumerate}  \begin{enumerate}
1196  %  %
1197  \item {\it genmake}: \\  \item {\it genmake}: \\
1198  Change the default settngs in the file {\it genmake} by adding  Change the default settings in the file {\it genmake} by adding
1199  {\bf cost} to the {\bf enable} list (not recommended).  {\bf cost} to the {\bf enable} list (not recommended).
1200  %  %
1201  \item {\it .genmakerc}: \\  \item {\it .genmakerc}: \\
# Line 1110  Call {\it genmake} with the option Line 1208  Call {\it genmake} with the option
1208  {\tt genmake -enable=cost}.  {\tt genmake -enable=cost}.
1209  %  %
1210  \end{enumerate}  \end{enumerate}
 Since the cost function is usually used in conjunction with  
 automatic differentiation, the CPP option  
 {\bf ALLOW\_ADJOINT\_RUN} should be defined  
 (file {\it CPP\_OPTIONS.h}).  
1211  The basic CPP option to enable the cost function is {\bf ALLOW\_COST}.  The basic CPP option to enable the cost function is {\bf ALLOW\_COST}.
1212  Each specific cost function contribution has its own option.  Each specific cost function contribution has its own option.
1213  For the present example the option is {\bf ALLOW\_COST\_TRACER}.  For the present example the option is {\bf ALLOW\_COST\_TRACER}.
1214  All cost-specific options are set in {\it ECCO\_CPPOPTIONS.h}.  All cost-specific options are set in {\it ECCO\_CPPOPTIONS.h}.
1215    Since the cost function is usually used in conjunction with
1216    automatic differentiation, the CPP option
1217    {\bf ALLOW\_ADJOINT\_RUN} should be defined
1218    (file {\it CPP\_OPTIONS.h}).
1219    
1220  \subsubsection{Initialisation}  \subsubsection{Initialisation}
1221  %  %
# Line 1158  which is defined on each tile (bi,bj). Line 1256  which is defined on each tile (bi,bj).
1256  %  %
1257  \end{itemize}  \end{itemize}
1258  %  %
1259  \subsubsection{Incrementation}  \subsubsection{Accumulation}
1260  %  %
1261  \begin{itemize}  \begin{itemize}
1262  %  %
# Line 1206  The total cost function {\bf fc} will be Line 1304  The total cost function {\bf fc} will be
1304  tamc -output 'fc' ...  tamc -output 'fc' ...
1305  \end{verbatim}  \end{verbatim}
1306    
 \begin{figure}[t!]  
 \input{part5/doc_ad_the_model}  
 \label{fig:adthemodel}  
 \caption{~}  
 \end{figure}  
   
1307  %%%% \end{document}  %%%% \end{document}
1308    
1309  \begin{figure}  \begin{figure}
1310  \input{part5/doc_ad_the_main}  \input{part5/doc_ad_the_main}
 \label{fig:adthemain}  
1311  \caption{~}  \caption{~}
1312    \label{fig:adthemain}
1313  \end{figure}  \end{figure}
1314    
1315  \subsection{The control variables (independent variables)}  \subsection{The control variables (independent variables)
1316    \label{section_ctrl}}
1317    
1318  The control variables are a subset of the model input  The control variables are a subset of the model input
1319  (initial conditions, boundary conditions, model parameters).  (initial conditions, boundary conditions, model parameters).
1320  Here we identify them with the variable $ \vec{u} $.  Here we identify them with the variable $ \vec{u} $.
1321  All intermediate variables whose derivative w.r.t. control  All intermediate variables whose derivative w.r.t. control
1322  variables don't vanish are called {\sf active variables}.  variables do not vanish are called {\sf active variables}.
1323  All subroutines whose derivative w.r.t. the control variables  All subroutines whose derivative w.r.t. the control variables
1324  don't vanish are called {\sf active routines}.  don't vanish are called {\sf active routines}.
1325  Read and write operations from and to file can be viewed  Read and write operations from and to file can be viewed
# Line 1237  All aspects relevant to the treatment of Line 1330  All aspects relevant to the treatment of
1330  (parameter setting, initialisation, perturbation)  (parameter setting, initialisation, perturbation)
1331  are controlled by the package {\it pkg/ctrl}.  are controlled by the package {\it pkg/ctrl}.
1332    
1333    \begin{figure}[h!]
1334    \input{part5/doc_ctrl_flow}
1335    \caption{~}
1336    \label{fig:ctrlflow}
1337    \end{figure}
1338    
1339  \subsubsection{genmake and CPP options}  \subsubsection{genmake and CPP options}
1340  %  %
1341  \begin{itemize}  \begin{itemize}
# Line 1295  Two important issues related to the hand Line 1394  Two important issues related to the hand
1394  variables in the MITGCM need to be addressed.  variables in the MITGCM need to be addressed.
1395  First, in order to save memory, the control variable arrays  First, in order to save memory, the control variable arrays
1396  are not kept in memory, but rather read from file and added  are not kept in memory, but rather read from file and added
1397  to the initial (or first guess) fields.  to the initial fields during the model initialisation phase.
1398  Similarly, the corresponding adjoint fields which represent  Similarly, the corresponding adjoint fields which represent
1399  the gradient of the cost function w.r.t. the control variables  the gradient of the cost function w.r.t. the control variables
1400  are written to to file.  are written to file at the end of the adjoint integration.
1401  Second, in addition to the files holding the 2-dim. and 3-dim.  Second, in addition to the files holding the 2-dim. and 3-dim.
1402  control variables and the gradient, a 1-dim. {\sf control vector}  control variables and the corresponding cost gradients,
1403    a 1-dim. {\sf control vector}
1404  and {\sf gradient vector} are written to file. They contain  and {\sf gradient vector} are written to file. They contain
1405  only the wet points of the control variables and the corresponding  only the wet points of the control variables and the corresponding
1406  gradient.  gradient.
1407  This leads to a significant data compression.  This leads to a significant data compression.
1408  Furthermore, the control and the gradient vector can be passed to a  Furthermore, an option is available
1409    ({\tt ALLOW\_NONDIMENSIONAL\_CONTROL\_IO}) to
1410    non-dimensionalise the control and gradient vector,
1411    which otherwise would contain different pieces of different
1412    magnitudes and units.
1413    Finally, the control and gradient vector can be passed to a
1414  minimization routine if an update of the control variables  minimization routine if an update of the control variables
1415  is sought as part of a minimization exercise.  is sought as part of a minimization exercise.
1416    
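The compression into a 1-dim. vector holding only wet points can be sketched as follows. This is a minimal Python illustration and not the {\it pkg/ctrl} implementation: fields are represented as flat lists, the wet-point masks are hypothetical, and non-dimensionalisation is assumed here to be a division by one scale per field, which the text does not specify.

```python
def pack_wet_points(fields, masks, scales=None):
    """Flatten several fields into one vector, keeping wet points only."""
    vector = []
    for k, (field, mask) in enumerate(zip(fields, masks)):
        scale = 1.0 if scales is None else scales[k]
        for value, wet in zip(field, mask):
            if wet:
                vector.append(value / scale)  # optional non-dimensionalisation
    return vector


def unpack_wet_points(vector, masks, scales=None, fill=0.0):
    """Inverse of pack_wet_points; dry points receive the value `fill`."""
    fields, pos = [], 0
    for k, mask in enumerate(masks):
        scale = 1.0 if scales is None else scales[k]
        field = []
        for wet in mask:
            if wet:
                field.append(vector[pos] * scale)
                pos += 1
            else:
                field.append(fill)
        fields.append(field)
    return fields


# Two hypothetical control fields with their wet-point masks (1 = wet).
theta0 = [10.0, 0.0, 12.0, 0.0]
salt0 = [35.0, 34.0, 0.0]
masks = [[1, 0, 1, 0], [1, 1, 0]]
vector_ctrl = pack_wet_points([theta0, salt0], masks)
print(vector_ctrl)
print(unpack_wet_points(vector_ctrl, masks))
```

The packed vector has as many entries as there are wet points, so the saving grows with the fraction of land in the domain.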
# Line 1316  and gradient are generated and initialis Line 1421  and gradient are generated and initialis
1421    
1422  \subsubsection{Perturbation of the independent variables}  \subsubsection{Perturbation of the independent variables}
1423  %  %
1424  The dependency chain for differentiation starts  The dependency flow for differentiation w.r.t. the controls
1425  with adding a perturbation onto the the input variable,  starts with adding a perturbation onto the input variable,
1426  thus defining the independent or control variables for TAMC.  thus defining the independent or control variables for TAMC.
1427  Three classes of controls may be considered:  Three types of controls may be considered:
1428  %  %
1429  \begin{itemize}  \begin{itemize}
1430  %  %
# Line 1334  Three classes of controls may be conside Line 1439  Three classes of controls may be conside
1439  Consider as an example the initial tracer distribution  Consider as an example the initial tracer distribution
1440  {\bf tr1} as control variable.  {\bf tr1} as control variable.
1441  After {\bf tr1} has been initialised in  After {\bf tr1} has been initialised in
1442  {\it ini\_tr1} (dynamical variables including  {\it ini\_tr1} (dynamical variables such as
1443  temperature and salinity are initialised in {\it ini\_fields}),  temperature and salinity are initialised in {\it ini\_fields}),
1444  a perturbation anomaly is added to the field in S/R  a perturbation anomaly is added to the field in S/R
1445  {\it ctrl\_map\_ini}  {\it ctrl\_map\_ini}
# Line 1347  u         & = \, u_{[0]} \, + \, \Delta Line 1452  u         & = \, u_{[0]} \, + \, \Delta
1452  \end{split}  \end{split}
1453  \end{equation}  \end{equation}
1454  %  %
1455  In principle {\bf xx\_tr1} is a 3-dim. global array  {\bf xx\_tr1} is a 3-dim. global array
1456  holding the perturbation. In the case of a simple  holding the perturbation. In the case of a simple
1457  sensitivity study this array is identical to zero.  sensitivity study this array is identical to zero.
1458  However, its specification is essential since TAMC  However, its specification is essential in the context
1459    of automatic differentiation since TAMC
1460  treats the corresponding line in the code symbolically  treats the corresponding line in the code symbolically
1461  when determining the differentiation chain and its origin.  when determining the differentiation chain and its origin.
1462  Thus, the variable names are part of the argument list  Thus, the variable names are part of the argument list
# Line 1390  Note, that reading an active variable co Line 1496  Note, that reading an active variable co
1496  to a variable assignment. Its derivative corresponds  to a variable assignment. Its derivative corresponds
1497  to a write statement of the adjoint variable.  to a write statement of the adjoint variable.
1498  The 'active file' routines have been designed  The 'active file' routines have been designed
1499  to support active read and corresponding active write  to support active read and corresponding adjoint active write
1500  operations.  operations (and vice versa).
1501  %  %
1502  \item  \item
1503  \fbox{  \fbox{
# Line 1408  with the symbolic perturbation taking pl Line 1514  with the symbolic perturbation taking pl
1514  Note however an important difference:  Note however an important difference:
1515  Since the boundary values are time dependent with a new  Since the boundary values are time dependent with a new
1516  forcing field applied at each time step,  forcing field applied at each time step,
1517  the general problem may be be thought of as  the general problem may be thought of as
1518  a new control variable at each time step, i.e.  a new control variable at each time step
1519    (or, if the perturbation is averaged over a certain period,
1520    at each $ N $ timesteps), i.e.
1521  \[  \[
1522  u_{\rm forcing} \, = \,  u_{\rm forcing} \, = \,
1523  \{ \, u_{\rm forcing} ( t_n ) \, \}_{  \{ \, u_{\rm forcing} ( t_n ) \, \}_{
# Line 1434  calendar ({\it cal}~) and external forci Line 1542  calendar ({\it cal}~) and external forci
1542  %  %
1543  This routine is not yet implemented, but would proceed  This routine is not yet implemented, but would proceed
1544  along the same lines as the initial value sensitivity.  along the same lines as the initial value sensitivity.
1545    The mixing parameters {\bf diffkr} and {\bf kapgm}
1546    are currently added as controls in {\it ctrl\_map\_ini.F}.
1547  %  %
1548  \end{itemize}  \end{itemize}
1549  %  %
1550    
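The perturbation step for the first two types of controls can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of {\it ctrl\_map\_ini} or {\it ctrl\_map\_forcing}: fields are flat lists, the forcing is a single scalar per time step, and a control averaged over $ N $ timesteps is assumed to be held constant within each period.

```python
def perturb_initial_field(field0, xx_field):
    """u = u_[0] + Delta u, applied point by point."""
    return [u0 + du for u0, du in zip(field0, xx_field)]


def perturb_forcing(forcing0, xx_forcing, n_avg=1):
    """Add a control to the forcing at every time step.

    forcing0[n]   : unperturbed forcing at time step n
    xx_forcing[m] : control for the m-th period of n_avg time steps
    """
    return [f0 + xx_forcing[n // n_avg] for n, f0 in enumerate(forcing0)]


# A pure sensitivity study uses a perturbation that is identical to zero.
tr1 = perturb_initial_field([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
print(tr1)

# Hypothetical forcing with one control per two time steps.
print(perturb_forcing([1.0, 1.0, 1.0, 1.0], [0.5, -0.5], n_avg=2))
```

Even when the perturbation is zero, the addition has to appear in the code, because the differentiation tool identifies the independent variables from that statement.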
1551  \subsubsection{Output of adjoint variables and gradient}  \subsubsection{Output of adjoint variables and gradient}
1552  %  %
1553  Two ways exist to generate output of adjoint fields.  Several ways exist to generate output of adjoint fields.
1554  %  %
1555  \begin{itemize}  \begin{itemize}
1556  %  %
1557  \item  \item
1558  \fbox{  \fbox{
1559  \begin{minipage}{12cm}  \begin{minipage}{12cm}
1560  {\it ctrl\_pack}:  {\it ctrl\_map\_ini, ctrl\_map\_forcing}:
1561  \end{minipage}  \end{minipage}
1562  }  }
1563  \\  \\
 At the end of the forward/adjoint integration, the S/R  
 {\it ctrl\_pack} is called which mirrors S/R {\it ctrl\_unpack}.  
 It writes the following files:  
 %  
1564  \begin{itemize}  \begin{itemize}
1565  %  %
1566  \item {\bf xx\_...}: the control variable fields  \item {\bf xx\_...}: the control variable fields \\
1567    Before the forward integration, the control
1568    variables are read from file {\bf xx\_ ...} and added to
1569    the model field.
1570  %  %
1571  \item {\bf adxx\_...}: the adjoint variable fields, i.e. the gradient  \item {\bf adxx\_...}: the adjoint variable fields, i.e. the gradient
1572  $ \nabla _{u}{\cal J} $ for each control variable,  $ \nabla _{u}{\cal J} $ for each control variable \\
1573    After the adjoint integration the corresponding adjoint
1574    variables are written to {\bf adxx\_ ...}.
1575  %  %
1576  \item {\bf vector\_ctrl}: the control vector  \end{itemize}
1577  %  %
1578  \item {\bf vector\_grad}: the gradient vector  \item
1579    \fbox{
1580    \begin{minipage}{12cm}
1581    {\it ctrl\_unpack, ctrl\_pack}:
1582    \end{minipage}
1583    }
1584    \\
1585    %
1586    \begin{itemize}
1587    %
1588    \item {\bf vector\_ctrl}: the control vector \\
1589    At the very beginning of the model initialisation,
1590    the updated compressed control vector is read (or initialised)
1591    and distributed to 2-dim. and 3-dim. control variable fields.
1592    %
1593    \item {\bf vector\_grad}: the gradient vector \\
1594    At the very end of the adjoint integration,
1595    the 2-dim. and 3-dim. adjoint variables are read,
1596    compressed to a single vector and written to file.
1597  %  %
1598  \end{itemize}  \end{itemize}
1599  %  %
# Line 1476  $ \nabla _{u}{\cal J} $ for each control Line 1605  $ \nabla _{u}{\cal J} $ for each control
1605  }  }
1606  \\  \\
1607  In addition to writing the gradient at the end of the  In addition to writing the gradient at the end of the
1608  forward/adjoint integration, many more adjoint variables,  forward/adjoint integration, many more adjoint variables
1609  representing the Lagrange multipliers of the model state  of the model state
1610  w.r.t. the model state  at intermediate times can be written using S/R
 at different times can be written using S/R  
1611  {\it addummy\_in\_stepping}.  {\it addummy\_in\_stepping}.
1612  This routine is part of the adjoint support package  This routine is part of the adjoint support package
1613  {\it pkg/autodiff} (cf. below).  {\it pkg/autodiff} (cf. below).
# Line 1493  than generated automatically. Line 1621  than generated automatically.
1621  Appropriate flow directives ({\it dummy\_in\_stepping.flow})  Appropriate flow directives ({\it dummy\_in\_stepping.flow})
1622  ensure that TAMC does not automatically  ensure that TAMC does not automatically
1623  generate {\it addummy\_in\_stepping} by trying to differentiate  generate {\it addummy\_in\_stepping} by trying to differentiate
1624  {\it dummy\_in\_stepping}, but rather takes the hand-written routine.  {\it dummy\_in\_stepping}, but instead refers to
1625    the hand-written routine.
1626    
1627  {\it dummy\_in\_stepping} is called in the forward code  {\it dummy\_in\_stepping} is called in the forward code
1628  at the beginning of each  at the beginning of each
# Line 1503  each timestep in the adjoint calculation Line 1632  each timestep in the adjoint calculation
1632  {\it addynamics}.  {\it addynamics}.
1633    
1634  {\it addummy\_in\_stepping} includes the header files  {\it addummy\_in\_stepping} includes the header files
1635  {\it adffields.h, addynamics.h, adtr1.h}.  {\it adcommon.h}.
1636  These header files are also hand-written. They contain  This header file is also hand-written. It contains
1637  the common blocks {\bf /addynvars\_r/}, {\bf /addynvars\_cd/},  the common blocks
1638    {\bf /addynvars\_r/}, {\bf /addynvars\_cd/},
1639    {\bf /addynvars\_diffkr/}, {\bf /addynvars\_kapgm/},
1640  {\bf /adtr1\_r/}, {\bf /adffields/},  {\bf /adtr1\_r/}, {\bf /adffields/},
1641  which have been extracted from the adjoint code to enable  which have been extracted from the adjoint code to enable
1642  access to the adjoint variables.  access to the adjoint variables.
# Line 1523  The gradient $ \nabla _{u}{\cal J} |_{u_ Line 1654  The gradient $ \nabla _{u}{\cal J} |_{u_
1654  with the value of the cost function itself $ {\cal J}(u_{[k]}) $  with the value of the cost function itself $ {\cal J}(u_{[k]}) $
1655  at iteration step $ k $ serve  at iteration step $ k $ serve
1656  as input to a minimization routine (e.g. quasi-Newton method,  as input to a minimization routine (e.g. quasi-Newton method,
1657  conjugate gradient, ...) to compute an update in the  conjugate gradient, ... \cite{gil_lem:89})
1658    to compute an update in the
1659  control variable for iteration step $k+1$  control variable for iteration step $k+1$
1660  \[  \[
1661  u_{[k+1]} \, = \,  u_{[0]} \, + \, \Delta u_{[k+1]}  u_{[k+1]} \, = \,  u_{[0]} \, + \, \Delta u_{[k+1]}
# Line 1537  Tab. \ref{???} sketches the flow between Line 1669  Tab. \ref{???} sketches the flow between
1669  and the minimization routine.  and the minimization routine.
1670    
1671  \begin{eqnarray*}  \begin{eqnarray*}
1672  \footnotesize  \scriptsize
1673  \begin{array}{ccccc}  \begin{array}{ccccc}
1674  u_{[0]} \,\, ,  \,\, \Delta u_{[k]}    & ~ & ~ & ~ & ~ \\  u_{[0]} \,\, ,  \,\, \Delta u_{[k]}    & ~ & ~ & ~ & ~ \\
1675  {\Big\downarrow}  {\Big\downarrow}
# Line 1554  v_{[k]} = M \left( u_{[k]} \right) & Line 1686  v_{[k]} = M \left( u_{[k]} \right) &
1686  {\cal J}_{[k]} = {\cal J} \left( M \left( u_{[k]} \right) \right)} \\  {\cal J}_{[k]} = {\cal J} \left( M \left( u_{[k]} \right) \right)} \\
1687  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1688  \hline  \hline
1689    \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~}  \\
1690    \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{{\Big\downarrow}} \\
1691    \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~}  \\
1692  \hline  \hline
1693  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1694  \multicolumn{1}{|c}{  \multicolumn{1}{|c}{
1695  \nabla_u {\cal J}_{[k]} (\delta {\cal J}) =  \nabla_u {\cal J}_{[k]} (\delta {\cal J}) =
1696  T\!\!^{\ast} \cdot \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J})} &  T^{\ast} \cdot \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J})} &
1697  \stackrel{\bf adjoint}{\mathbf \longleftarrow} &  \stackrel{\bf adjoint}{\mathbf \longleftarrow} &
1698  ad \, v_{[k]} (\delta {\cal J}) =  ad \, v_{[k]} (\delta {\cal J}) =
1699  \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J}) &  \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J}) &
# Line 1567  ad \, v_{[k]} (\delta {\cal J}) = Line 1702  ad \, v_{[k]} (\delta {\cal J}) =
1702  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1703  \hline  \hline
1704   ~ & ~ & ~ & ~ & ~ \\   ~ & ~ & ~ & ~ & ~ \\
1705  ~ & ~ &  \hspace*{15ex}{\Bigg\downarrow}  
1706  {\cal J}_{[k]} \qquad {\Bigg\downarrow}  \qquad \nabla_u {\cal J}_{[k]}  \quad {\cal J}_{[k]}, \quad \nabla_u {\cal J}_{[k]}
1707   & ~ & ~ \\   & ~ & ~ & ~ & ~ \\
1708   ~ & ~ & ~ & ~ & ~ \\   ~ & ~ & ~ & ~ & ~ \\
1709  \hline  \hline
1710  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\  \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
# Line 1597  The corresponding I/O flow looks as foll Line 1732  The corresponding I/O flow looks as foll
1732    
1733  \vspace*{0.5cm}  \vspace*{0.5cm}
1734    
1735    {\scriptsize
1736  \begin{tabular}{ccccc}  \begin{tabular}{ccccc}
1737  {\bf vector\_ctrl\_$<$k$>$ } & ~ & ~ & ~ & ~ \\  {\bf vector\_ctrl\_$<$k$>$ } & ~ & ~ & ~ & ~ \\
1738  {\big\downarrow}  & ~ & ~ & ~ & ~ \\  {\big\downarrow}  & ~ & ~ & ~ & ~ \\
# Line 1607  The corresponding I/O flow looks as foll Line 1743  The corresponding I/O flow looks as foll
1743  \cline{3-3}  \cline{3-3}
1744  \multicolumn{1}{l}{\bf xx\_theta0...$<$k$>$} & ~ &  \multicolumn{1}{l}{\bf xx\_theta0...$<$k$>$} & ~ &
1745  \multicolumn{1}{|c|}{~} & ~ & ~ \\  \multicolumn{1}{|c|}{~} & ~ & ~ \\
1746  \multicolumn{1}{l}{\bf xx\_salt0...$<$k$>$} & $\longrightarrow$ &  \multicolumn{1}{l}{\bf xx\_salt0...$<$k$>$} &
1747    $\stackrel{\mbox{read}}{\longrightarrow}$ &
1748  \multicolumn{1}{|c|}{forward integration} & ~ & ~ \\  \multicolumn{1}{|c|}{forward integration} & ~ & ~ \\
1749  \multicolumn{1}{l}{\bf \vdots} & ~ & \multicolumn{1}{|c|}{~}    \multicolumn{1}{l}{\bf \vdots} & ~ & \multicolumn{1}{|c|}{~}  
1750  & ~ & ~ \\  & ~ & ~ \\
1751  \cline{3-3}  \cline{3-3}
1752  ~ & ~ & ~ & ~ & ~ \\  ~ & ~ & $\downarrow$ & ~ & ~ \\
1753  \cline{3-3}  \cline{3-3}
1754  ~ & ~ &  ~ & ~ &
1755  \multicolumn{1}{|c|}{~} & ~ &  \multicolumn{1}{|c|}{~} & ~ &
1756  \multicolumn{1}{l}{\bf adxx\_theta0...$<$k$>$}  \\  \multicolumn{1}{l}{\bf adxx\_theta0...$<$k$>$}  \\
1757  ~ & ~ & \multicolumn{1}{|c|}{adjoint integration} &  ~ & ~ & \multicolumn{1}{|c|}{adjoint integration} &
1758  $\longrightarrow$ &  $\stackrel{\mbox{write}}{\longrightarrow}$ &
1759  \multicolumn{1}{l}{\bf adxx\_salt0...$<$k$>$} \\  \multicolumn{1}{l}{\bf adxx\_salt0...$<$k$>$} \\
1760  ~ & ~ & \multicolumn{1}{|c|}{~}    ~ & ~ & \multicolumn{1}{|c|}{~}  
1761  & ~ & \multicolumn{1}{l}{\bf \vdots} \\  & ~ & \multicolumn{1}{l}{\bf \vdots} \\
# Line 1630  $\longrightarrow$ & Line 1767  $\longrightarrow$ &
1767  ~ & ~ & ~ & ~ &  {\big\downarrow} \\  ~ & ~ & ~ & ~ &  {\big\downarrow} \\
1768  ~ & ~ & ~ & ~ &  {\bf vector\_grad\_$<$k$>$ } \\  ~ & ~ & ~ & ~ &  {\bf vector\_grad\_$<$k$>$ } \\
1769  \end{tabular}  \end{tabular}
1770    }
1771    
1772  \vspace*{0.5cm}  \vspace*{0.5cm}
1773    
1774    
1775  {\it ctrl\_unpack} reads in the updated control vector  {\it ctrl\_unpack} reads the updated control vector
1776  {\bf vector\_ctrl\_$<$k$>$}.  {\bf vector\_ctrl\_$<$k$>$}.
1777  It distributes the different control variables to  It distributes the different control variables to
1778  2-dim. and 3-dim. files {\it xx\_...$<$k$>$}.  2-dim. and 3-dim. files {\it xx\_...$<$k$>$}.
1779  During the forward integration the control variables  At the start of the forward integration the control variables
1780  are read from {\it xx\_...$<$k$>$}.  are read from {\it xx\_...$<$k$>$} and added to the
1781  Correspondingly, the adjoint fields are written  field.
1782    Correspondingly, at the end of the adjoint integration
1783    the adjoint fields are written
1784  to {\it adxx\_...$<$k$>$}, again via the active file routines.  to {\it adxx\_...$<$k$>$}, again via the active file routines.
1785  Finally, {\it ctrl\_pack} collects all adjoint field files  Finally, {\it ctrl\_pack} collects all adjoint files
1786  and writes them to the compressed vector file  and writes them to the compressed vector file
1787  {\bf vector\_grad\_$<$k$>$}.  {\bf vector\_grad\_$<$k$>$}.
1788    
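The iteration between the forward/adjoint model and the minimization routine can be sketched as follows. This is a minimal Python illustration only: {\tt run\_forward\_adjoint} is a hypothetical stand-in for one model run that returns $ {\cal J} $ and $ \nabla_u {\cal J} $ for a simple quadratic cost, and a fixed-step descent update is used in place of the quasi-Newton or conjugate gradient methods mentioned in the text. In the actual setup the exchange would go through the files {\bf vector\_ctrl\_$<$k$>$} and {\bf vector\_grad\_$<$k$>$} rather than through function arguments.

```python
def run_forward_adjoint(u):
    """Stand-in for one forward/adjoint run: returns (J, grad_u J).

    Hypothetical quadratic cost with its minimum at (1, -2).
    """
    target = [1.0, -2.0]
    residual = [u_i - t_i for u_i, t_i in zip(u, target)]
    cost = 0.5 * sum(r * r for r in residual)
    return cost, residual  # for this cost the gradient equals the residual


def minimise(u0, n_iter=30, step=0.5):
    """Iterate u_[k+1] = u_[0] + Delta u_[k+1] with a fixed-step update."""
    delta_u = [0.0] * len(u0)
    history = []
    for _ in range(n_iter):
        u_k = [a + b for a, b in zip(u0, delta_u)]       # current controls
        cost, grad = run_forward_adjoint(u_k)            # model and adjoint
        history.append(cost)
        delta_u = [d - step * g for d, g in zip(delta_u, grad)]
    u_final = [a + b for a, b in zip(u0, delta_u)]
    return u_final, history


u_opt, costs = minimise([0.0, 0.0])
print(u_opt)
print(costs[0], costs[-1])
```

Note that the first guess $ u_{[0]} $ is kept fixed throughout and only the perturbation $ \Delta u $ is updated, mirroring the way the control variables are added to the first-guess fields.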
# Line 1650  and writes them to the compressed vector Line 1790  and writes them to the compressed vector
1790    
1791    
1792    
1793  \subsection{Flow directives and adjoint support routines}  \subsection{Flow directives and adjoint support routines \label{section_flowdir}}
1794    
1795  \subsection{Store directives and checkpointing}  \subsection{Store directives and checkpointing \label{section_checkpointing}}
1796    
1797  \subsection{Gradient checks}  \subsection{Gradient checks \label{section_grdchk}}
1798    
1799  \subsection{Second derivative generation via TAMC}  \subsection{Second derivative generation via TAMC}
1800    
