Diff of /manual/s_autodiff/text/doc_ad_2.tex

revision 1.19 by heimbach, Tue Aug 2 22:26:58 2005 UTC revision 1.21 by heimbach, Thu Jan 17 22:32:06 2008 UTC
# Line 4  Line 4 
4  Author: Patrick Heimbach  Author: Patrick Heimbach
5    
6  {\sf Automatic differentiation} (AD), also referred to as algorithmic  {\sf Automatic differentiation} (AD), also referred to as algorithmic
7  (or, more loosely, computational) differentiation, involves  (or, more loosely, computational) differentiation, involves
8  automatically deriving code to calculate  automatically deriving code to calculate partial derivatives from an
9  partial derivatives from an existing fully non-linear prognostic code.  existing fully non-linear prognostic code.  (see \cite{gri:00}).  A
10  (see \cite{gri:00}).  software tool is used that parses and transforms source files
11  A software tool is used that parses and transforms source files  according to a set of linguistic and mathematical rules.  AD tools are
12  according to a set of linguistic and mathematical rules.  like source-to-source translators in that they parse a program code as
13  AD tools are like source-to-source translators in that  input and produce a new program code as output
14  they parse a program code as input and produce a new program code  (we restrict our discussion to source-to-source tools, ignoring
15  as output.  operator-overloading tools).  However, unlike a
16  However, unlike a pure source-to-source translation, the output program  pure source-to-source translation, the output program represents a new
17  represents a new algorithm, such as the evaluation of the  algorithm, such as the evaluation of the Jacobian, the Hessian, or
18  Jacobian, the Hessian, or higher derivative operators.  higher derivative operators.  In principle, a variety of derived
19  In principle, a variety of derived algorithms  algorithms can be generated automatically in this way.
20  can be generated automatically in this way.  
21    MITgcm has been adapted for use with the Tangent linear and Adjoint
22  The MITGCM has been adapted for use with the  Model Compiler (TAMC) and its successor TAF (Transformation of
23  Tangent linear and Adjoint Model Compiler (TAMC) and its successor TAF  Algorithms in Fortran), developed by Ralf Giering (\cite{gie-kam:98},
24  (Transformation of Algorithms in Fortran), developed  \cite{gie:99,gie:00}).  The first application of the adjoint of MITgcm
25  by Ralf Giering (\cite{gie-kam:98}, \cite{gie:99,gie:00}).  for sensitivity studies has been published by \cite{maro-eta:99}.
26  The first application of the adjoint of the MITGCM for sensitivity  \cite{stam-etal:97,stam-etal:02} use MITgcm and its adjoint for ocean
27  studies has been published by \cite{maro-eta:99}.  state estimation studies.  In the following we shall refer to TAMC and
28  \cite{sta-eta:97,sta-eta:01} use the MITGCM and its adjoint  TAF synonymously, except where explicitly stated otherwise.
29  for ocean state estimation studies.  
30  In the following we shall refer to TAMC and TAF synonymously,  As of mid-2007 we are also able to generate fairly efficient
31  except where explicitly stated otherwise.  adjoint code of the MITgcm using a new, open-source AD tool,
32    called OpenAD (see \cite{naum-etal:06,utke-etal:08}).
33  TAMC exploits the chain rule for computing the first  This enables us for the first time to compare adjoint models
34  derivative of a function with  generated from different AD tools, providing an additional
35  respect to a set of input variables.  accuracy check, complementary to finite-difference gradient checks.
36  Treating a given forward code as a composition of operations --  OpenAD and its application to MITgcm are described in detail
37  each line representing a compositional element, the chain rule is  in section \ref{sec_ad_openad}.
38  rigorously applied to the code, line by line. The resulting  
39  tangent linear or adjoint code,  The AD tool exploits the chain rule for computing the first derivative of a
40  then, may be thought of as the composition in  function with respect to a set of input variables.  Treating a given
41  forward or reverse order, respectively, of the  forward code as a composition of operations -- each line representing
42  Jacobian matrices of the forward code's compositional elements.  a compositional element, the chain rule is rigorously applied to the
43    code, line by line. The resulting tangent linear or adjoint code,
44    then, may be thought of as the composition in forward or reverse
45    order, respectively, of the Jacobian matrices of the forward code's
46    compositional elements.
47    
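% Illustrative note, not part of either revision shown in this diff.
As a minimal sketch of the composition described above, suppose the
forward code can be written as $y = f_n \circ \cdots \circ f_1 (x)$,
with intermediate results $v_k = f_k(v_{k-1})$ and $v_0 = x$, and let
$J_k$ denote the Jacobian of the $k$-th compositional element evaluated
at $v_{k-1}$. The symbols $f_k$, $v_k$, $J_k$, $\delta x$ and
$\delta^{\ast} y$ are notation assumed here for illustration only.
Under these assumptions the two derived codes evaluate
%
\begin{eqnarray*}
\delta y & = & J_n \, J_{n-1} \cdots J_1 \, \delta x
\qquad \mbox{(tangent linear, forward order)} \\
\delta^{\ast} x & = & J_1^{T} \, J_2^{T} \cdots J_n^{T} \, \delta^{\ast} y
\qquad \mbox{(adjoint, reverse order)}
\end{eqnarray*}
%
The tangent linear code applies $J_1$ first, following the order of the
forward code, whereas the adjoint code applies $J_n^{T}$ first and works
backwards through the elements. When $y$ is a scalar, a single adjoint
sweep yields the gradient with respect to all components of $x$, while
the tangent linear code needs one sweep per perturbed input direction.
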
48  %**********************************************************************  %**********************************************************************
49  \section{Some basic algebra}  \section{Some basic algebra}
# Line 688  Schematic view of intermediate dump and Line 692  Schematic view of intermediate dump and
692    
693  In this section we describe in a general fashion  In this section we describe in a general fashion
694  the parts of the code that are relevant for automatic  the parts of the code that are relevant for automatic
695  differentiation using the software tool TAF.  differentiation using the software tool TAF.
696    Modifications to use OpenAD are described in \ref{sec_ad_openad}.
697    
698  \input{part5/doc_ad_the_model}  \input{part5/doc_ad_the_model}
699    
# Line 771  w.r.t. the 3-level checkpointing (see se Line 776  w.r.t. the 3-level checkpointing (see se
776    
777  %------------------------------------------------------------------  %------------------------------------------------------------------
778    
779  \subsection{Building the AD code  \subsection{Building the AD code using TAF
780  \label{section_ad_build}}  \label{section_ad_build}}
781    
782  The build process of an AD code is very similar to building  The build process of an AD code is very similar to building
# Line 781  the following {\tt make} targets are ava Line 786  the following {\tt make} targets are ava
786    
787  \begin{table}[h!]  \begin{table}[h!]
788  {\footnotesize  {\footnotesize
789  \begin{tabular}{ccll}  \begin{tabular}{|ccll|}
790    \hline
791  ~ & {\it AD-target} & {\it output} & {\it description} \\  ~ & {\it AD-target} & {\it output} & {\it description} \\
792  \hline  \hline
793  \hline  \hline
# Line 800  generates code for $<$MODE$>$ using $<$T Line 806  generates code for $<$MODE$>$ using $<$T
806  ~ & ~ & ~ & and compiles all code \\  ~ & ~ & ~ & and compiles all code \\
807  ~ & ~ & ~ & (use of TAF is set as default) \\  ~ & ~ & ~ & (use of TAF is set as default) \\
808  \hline  \hline
 \hline  
809  \end{tabular}  \end{tabular}
810  }  }
811  \end{table}  \end{table}
# Line 809  Here, the following placeholders are use Line 814  Here, the following placeholders are use
814  %  %
815  \begin{itemize}  \begin{itemize}
816  %  %
817  \item [$<$TOOL$>$]  \item $<$TOOL$>$
818  %  %
819  \begin{itemize}  \begin{itemize}
820  %  %
# Line 818  Here, the following placeholders are use Line 823  Here, the following placeholders are use
823  %  %
824  \end{itemize}  \end{itemize}
825  %  %
826  \item [$<$MODE$>$]  \item $<$MODE$>$
827  %  %
828  \begin{itemize}  \begin{itemize}
829  %  %
# Line 860  The {\tt make <MODE>all} target consists Line 865  The {\tt make <MODE>all} target consists
865  \item  \item
866  A header file {\tt AD\_CONFIG.h} is generated which contains a CPP option  A header file {\tt AD\_CONFIG.h} is generated which contains a CPP option
867  on which code ought to be generated. Depending on the {\tt make} target,  on which code ought to be generated. Depending on the {\tt make} target,
868  the contents is  the contents is one of the following:
869  \begin{itemize}  \begin{itemize}
870  \item  \item
871  {\tt \#define ALLOW\_ADJOINT\_RUN}  {\tt \#define ALLOW\_ADJOINT\_RUN}
# Line 876  consisting of all {\tt .f} files that ar Line 881  consisting of all {\tt .f} files that ar
881  and all {\tt .flow} files that are part of the list {\bf AD\_FLOW\_FILES}.  and all {\tt .flow} files that are part of the list {\bf AD\_FLOW\_FILES}.
882  %  %
883  \item  \item
884  The AD tool is invoked with the {\bf <MODE>\_<TOOL>\_FLAGS}.  The AD tool is invoked with the {\tt <MODE>\_<TOOL>\_FLAGS}.
885  The default AD tool flags in {\tt genmake2} can be overwritten by  The default AD tool flags in {\tt genmake2} can be overwritten by
886  an {\tt adjoint\_options} file (similar to the platform-specific  an {\tt adjoint\_options} file (similar to the platform-specific
887  {\tt build\_options}, see Section ???).  {\tt build\_options}, see Section ???).
# Line 955  The flow directives for the core MITgcm Line 960  The flow directives for the core MITgcm
960  {\tt eesupp/src/} and {\tt model/src/}  {\tt eesupp/src/} and {\tt model/src/}
961  reside in {\tt pkg/autodiff/}.  reside in {\tt pkg/autodiff/}.
962  This directory also contains hand-written adjoint code  This directory also contains hand-written adjoint code
963  for the MITgcm WRAPPER (see Section ???).  for the MITgcm WRAPPER (section \ref{chap:sarch}).
964    
965  Flow directives for package-specific routines are contained in  Flow directives for package-specific routines are contained in
966  the corresponding package directories in the file  the corresponding package directories in the file
# Line 1124  from each contribution and sums over all Line 1129  from each contribution and sums over all
1129  \end{equation}  \end{equation}
1130  %  %
1131  The total cost function {\bf fc} will be the  The total cost function {\bf fc} will be the
1132  'dependent' variable in the argument list for TAMC, i.e.  'dependent' variable in the argument list for TAF, i.e.
1133  \begin{verbatim}  \begin{verbatim}
1134  tamc -output 'fc' ...  taf -output 'fc' ...
1135  \end{verbatim}  \end{verbatim}
1136    
1137  %%%% \end{document}  %%%% \end{document}
# Line 1209  and their gradients: {\it ctrl\_unpack} Line 1214  and their gradients: {\it ctrl\_unpack}
1214  \\  \\
1215  %  %
1216  Two important issues related to the handling of the control  Two important issues related to the handling of the control
1217  variables in the MITGCM need to be addressed.  variables in MITgcm need to be addressed.
1218  First, in order to save memory, the control variable arrays  First, in order to save memory, the control variable arrays
1219  are not kept in memory, but rather read from file and added  are not kept in memory, but rather read from file and added
1220  to the initial fields during the model initialization phase.  to the initial fields during the model initialization phase.
# Line 1241  and gradient are generated and initialis Line 1246  and gradient are generated and initialis
1246  %  %
1247  The dependency flow for differentiation w.r.t. the controls  The dependency flow for differentiation w.r.t. the controls
1248  starts with adding a perturbation onto the input variable,  starts with adding a perturbation onto the input variable,
1249  thus defining the independent or control variables for TAMC.  thus defining the independent or control variables for TAF.
1250  Three types of controls may be considered:  Three types of controls may be considered:
1251  %  %
1252  \begin{itemize}  \begin{itemize}
# Line 1274  u         & = \, u_{[0]} \, + \, \Delta Line 1279  u         & = \, u_{[0]} \, + \, \Delta
1279  holding the perturbation. In the case of a simple  holding the perturbation. In the case of a simple
1280  sensitivity study this array is identical to zero.  sensitivity study this array is identical to zero.
1281  However, its specification is essential in the context  However, its specification is essential in the context
1282  of automatic differentiation since TAMC  of automatic differentiation since TAF
1283  treats the corresponding line in the code symbolically  treats the corresponding line in the code symbolically
1284  when determining the differentiation chain and its origin.  when determining the differentiation chain and its origin.
1285  Thus, the variable names are part of the argument list  Thus, the variable names are part of the argument list
1286  when calling TAMC:  when calling TAF:
1287  %  %
1288  \begin{verbatim}  \begin{verbatim}
1289  tamc -input 'xx_tr1 ...' ...  taf -input 'xx_tr1 ...' ...
1290  \end{verbatim}  \end{verbatim}
1291  %  %
1292  Now, as mentioned above, the MITGCM avoids maintaining  Now, as mentioned above, MITgcm avoids maintaining
1293  an array for each control variable by reading the  an array for each control variable by reading the
1294  perturbation to a temporary array from file.  perturbation to a temporary array from file.
1295  To ensure the symbolic link to be recognized by TAMC, a scalar  To ensure the symbolic link to be recognized by TAF, a scalar
1296  dummy variable {\bf xx\_tr1\_dummy} is introduced  dummy variable {\bf xx\_tr1\_dummy} is introduced
1297  and an 'active read' routine of the adjoint support  and an 'active read' routine of the adjoint support
1298  package {\it pkg/autodiff} is invoked.  package {\it pkg/autodiff} is invoked.
1299  The read-procedure is tagged with the variable  The read-procedure is tagged with the variable
1300  {\bf xx\_tr1\_dummy} enabling TAMC to recognize the  {\bf xx\_tr1\_dummy} enabling TAF to recognize the
1301  initialization of the perturbation.  initialization of the perturbation.
1302  The modified call of TAMC thus reads  The modified call of TAF thus reads
1303  %  %
1304  \begin{verbatim}  \begin{verbatim}
1305  tamc -input 'xx_tr1_dummy ...' ...  taf -input 'xx_tr1_dummy ...' ...
1306  \end{verbatim}  \end{verbatim}
1307  %  %
1308  and the modified operation to (\ref{perturb})  and the modified operation to (\ref{perturb})
