4 |
Author: Patrick Heimbach |
Author: Patrick Heimbach |
5 |
|
|
6 |
{\sf Automatic differentiation} (AD), also referred to as algorithmic |
{\sf Automatic differentiation} (AD), also referred to as algorithmic |
7 |
(or, more loosely, computational) differentiation, involves |
(or, more loosely, computational) differentiation, involves |
8 |
automatically deriving code to calculate |
automatically deriving code to calculate partial derivatives from an |
9 |
partial derivatives from an existing fully non-linear prognostic code. |
existing fully non-linear prognostic code (see \cite{gri:00}). A |
10 |
(see \cite{gri:00}). |
software tool is used that parses and transforms source files |
11 |
A software tool is used that parses and transforms source files |
according to a set of linguistic and mathematical rules. AD tools are |
12 |
according to a set of linguistic and mathematical rules. |
like source-to-source translators in that they parse a program code as |
13 |
AD tools are like source-to-source translators in that |
input and produce a new program code as output |
14 |
they parse a program code as input and produce a new program code |
(we restrict our discussion to source-to-source tools, ignoring |
15 |
as output. |
operator-overloading tools). However, unlike a |
16 |
However, unlike a pure source-to-source translation, the output program |
pure source-to-source translation, the output program represents a new |
17 |
represents a new algorithm, such as the evaluation of the |
algorithm, such as the evaluation of the Jacobian, the Hessian, or |
18 |
Jacobian, the Hessian, or higher derivative operators. |
higher derivative operators. In principle, a variety of derived |
19 |
In principle, a variety of derived algorithms |
algorithms can be generated automatically in this way. |
20 |
can be generated automatically in this way. |
|
21 |
|
MITgcm has been adapted for use with the Tangent linear and Adjoint |
22 |
The MITGCM has been adapted for use with the |
Model Compiler (TAMC) and its successor TAF (Transformation of |
23 |
Tangent linear and Adjoint Model Compiler (TAMC) and its successor TAF |
Algorithms in Fortran), developed by Ralf Giering (\cite{gie-kam:98}, |
24 |
(Transformation of Algorithms in Fortran), developed |
\cite{gie:99,gie:00}). The first application of the adjoint of MITgcm |
25 |
by Ralf Giering (\cite{gie-kam:98}, \cite{gie:99,gie:00}). |
for sensitivity studies has been published by \cite{maro-eta:99}. |
26 |
The first application of the adjoint of the MITGCM for sensitivity |
\cite{stam-etal:97,stam-etal:02} use MITgcm and its adjoint for ocean |
27 |
studies has been published by \cite{maro-eta:99}. |
state estimation studies. In the following we shall refer to TAMC and |
28 |
\cite{sta-eta:97,sta-eta:01} use the MITGCM and its adjoint |
TAF synonymously, except where explicitly stated otherwise. |
29 |
for ocean state estimation studies. |
|
30 |
In the following we shall refer to TAMC and TAF synonymously, |
As of mid-2007 we are also able to generate fairly efficient |
31 |
except where explicitly stated otherwise. |
adjoint code of MITgcm using a new, open-source AD tool, |
32 |
|
called OpenAD (see \cite{naum-etal:06,utke-etal:08}). |
33 |
TAMC exploits the chain rule for computing the first |
This enables us for the first time to compare adjoint models |
34 |
derivative of a function with |
generated from different AD tools, providing an additional |
35 |
respect to a set of input variables. |
accuracy check, complementary to finite-difference gradient checks. |
36 |
Treating a given forward code as a composition of operations -- |
OpenAD and its application to MITgcm are described in detail |
37 |
each line representing a compositional element, the chain rule is |
in section \ref{sec_ad_openad}. |
38 |
rigorously applied to the code, line by line. The resulting |
|
39 |
tangent linear or adjoint code, |
The AD tool exploits the chain rule for computing the first derivative of a |
40 |
then, may be thought of as the composition in |
function with respect to a set of input variables. Treating a given |
41 |
forward or reverse order, respectively, of the |
forward code as a composition of operations -- each line representing |
42 |
Jacobian matrices of the forward code's compositional elements. |
a compositional element, the chain rule is rigorously applied to the |
43 |
|
code, line by line. The resulting tangent linear or adjoint code, |
44 |
|
then, may be thought of as the composition in forward or reverse |
45 |
|
order, respectively, of the Jacobian matrices of the forward code's |
46 |
|
compositional elements. |
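
To make the ordering concrete, the following is a minimal sketch in
illustrative notation; the symbols ${\cal M}_{\lambda}$, $M_{\lambda}$,
$\vec{u}$ and $\vec{v}$ are assumptions made for this example and need not
match those used elsewhere in this chapter. Suppose the forward code maps
an input $\vec{u}$ to an output $\vec{v}$ through $\Lambda$ elementary steps,
\[
\vec{v} \, = \, {\cal M}_{\Lambda} \left( {\cal M}_{\Lambda-1} \left(
\cdots {\cal M}_{1} ( \vec{u} ) \cdots \right) \right) ,
\]
and let $M_{\lambda}$ denote the Jacobian of step ${\cal M}_{\lambda}$,
evaluated at that step's input. A tangent linear code would then propagate
a perturbation $\delta \vec{u}$ through the Jacobians in forward order,
whereas an adjoint code would propagate a sensitivity $\delta \vec{v}^{\ast}$
through their transposes in reverse order:
\begin{eqnarray*}
\delta \vec{v} & = & M_{\Lambda} \, M_{\Lambda-1} \cdots M_{1} \, \delta \vec{u} \\
\delta \vec{u}^{\ast} & = & M_{1}^{T} \, M_{2}^{T} \cdots M_{\Lambda}^{T} \, \delta \vec{v}^{\ast}
\end{eqnarray*}
In both products the rightmost factor is applied first, so the adjoint
begins with the last step of the forward code.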
47 |
|
|
48 |
%********************************************************************** |
%********************************************************************** |
49 |
\section{Some basic algebra} |
\section{Some basic algebra} |
692 |
|
|
693 |
In this section we describe in a general fashion |
In this section we describe in a general fashion |
694 |
the parts of the code that are relevant for automatic |
the parts of the code that are relevant for automatic |
695 |
differentiation using the software tool TAF. |
differentiation using the software tool TAF. |
696 |
|
Modifications to use OpenAD are described in section \ref{sec_ad_openad}. |
697 |
|
|
698 |
\input{part5/doc_ad_the_model} |
\input{part5/doc_ad_the_model} |
699 |
|
|
776 |
|
|
777 |
%------------------------------------------------------------------ |
%------------------------------------------------------------------ |
778 |
|
|
779 |
\subsection{Building the AD code |
\subsection{Building the AD code using TAF |
780 |
\label{section_ad_build}} |
\label{section_ad_build}} |
781 |
|
|
782 |
The build process of an AD code is very similar to building |
The build process of an AD code is very similar to building |
786 |
|
|
787 |
\begin{table}[h!] |
\begin{table}[h!] |
788 |
{\footnotesize |
{\footnotesize |
789 |
\begin{tabular}{ccll} |
\begin{tabular}{|ccll|} |
790 |
|
\hline |
791 |
~ & {\it AD-target} & {\it output} & {\it description} \\ |
~ & {\it AD-target} & {\it output} & {\it description} \\ |
792 |
\hline |
\hline |
793 |
\hline |
\hline |
806 |
~ & ~ & ~ & and compiles all code \\ |
~ & ~ & ~ & and compiles all code \\ |
807 |
~ & ~ & ~ & (use of TAF is set as default) \\ |
~ & ~ & ~ & (use of TAF is set as default) \\ |
808 |
\hline |
\hline |
|
\hline |
|
809 |
\end{tabular} |
\end{tabular} |
810 |
} |
} |
811 |
\end{table} |
\end{table} |
814 |
% |
% |
815 |
\begin{itemize} |
\begin{itemize} |
816 |
% |
% |
817 |
\item [$<$TOOL$>$] |
\item $<$TOOL$>$ |
818 |
% |
% |
819 |
\begin{itemize} |
\begin{itemize} |
820 |
% |
% |
823 |
% |
% |
824 |
\end{itemize} |
\end{itemize} |
825 |
% |
% |
826 |
\item [$<$MODE$>$] |
\item $<$MODE$>$ |
827 |
% |
% |
828 |
\begin{itemize} |
\begin{itemize} |
829 |
% |
% |
865 |
\item |
\item |
866 |
A header file {\tt AD\_CONFIG.h} is generated which contains a CPP option |
A header file {\tt AD\_CONFIG.h} is generated which contains a CPP option |
867 |
on which code ought to be generated. Depending on the {\tt make} target, |
on which code ought to be generated. Depending on the {\tt make} target, |
868 |
its content is |
its content is one of the following: |
869 |
\begin{itemize} |
\begin{itemize} |
870 |
\item |
\item |
871 |
{\tt \#define ALLOW\_ADJOINT\_RUN} |
{\tt \#define ALLOW\_ADJOINT\_RUN} |
881 |
and all {\tt .flow} files that are part of the list {\bf AD\_FLOW\_FILES}. |
and all {\tt .flow} files that are part of the list {\bf AD\_FLOW\_FILES}. |
882 |
% |
% |
883 |
\item |
\item |
884 |
The AD tool is invoked with the {\bf <MODE>\_<TOOL>\_FLAGS}. |
The AD tool is invoked with the {\tt <MODE>\_<TOOL>\_FLAGS}. |
885 |
The default AD tool flags in {\tt genmake2} can be overwritten by |
The default AD tool flags in {\tt genmake2} can be overwritten by |
886 |
an {\tt adjoint\_options} file (similar to the platform-specific |
an {\tt adjoint\_options} file (similar to the platform-specific |
887 |
{\tt build\_options}, see Section ???). |
{\tt build\_options}, see Section ???). |
960 |
{\tt eesupp/src/} and {\tt model/src/} |
{\tt eesupp/src/} and {\tt model/src/} |
961 |
reside in {\tt pkg/autodiff/}. |
reside in {\tt pkg/autodiff/}. |
962 |
This directory also contains hand-written adjoint code |
This directory also contains hand-written adjoint code |
963 |
for the MITgcm WRAPPER (see Section ???). |
for the MITgcm WRAPPER (section \ref{chap:sarch}). |
964 |
|
|
965 |
Flow directives for package-specific routines are contained in |
Flow directives for package-specific routines are contained in |
966 |
the corresponding package directories in the file |
the corresponding package directories in the file |
1129 |
\end{equation} |
\end{equation} |
1130 |
% |
% |
1131 |
The total cost function {\bf fc} will be the |
The total cost function {\bf fc} will be the |
1132 |
`dependent' variable in the argument list for TAMC, i.e. |
`dependent' variable in the argument list for TAF, i.e. |
1133 |
\begin{verbatim} |
\begin{verbatim} |
1134 |
tamc -output 'fc' ... |
taf -output 'fc' ... |
1135 |
\end{verbatim} |
\end{verbatim} |
1136 |
|
|
1137 |
%%%% \end{document} |
%%%% \end{document} |
1214 |
\\ |
\\ |
1215 |
% |
% |
1216 |
Two important issues related to the handling of the control |
Two important issues related to the handling of the control |
1217 |
variables in the MITGCM need to be addressed. |
variables in MITgcm need to be addressed. |
1218 |
First, in order to save memory, the control variable arrays |
First, in order to save memory, the control variable arrays |
1219 |
are not kept in memory, but rather read from file and added |
are not kept in memory, but rather read from file and added |
1220 |
to the initial fields during the model initialization phase. |
to the initial fields during the model initialization phase. |
1246 |
% |
% |
1247 |
The dependency flow for differentiation w.r.t. the controls |
The dependency flow for differentiation w.r.t. the controls |
1248 |
starts with adding a perturbation onto the input variable, |
starts with adding a perturbation onto the input variable, |
1249 |
thus defining the independent or control variables for TAMC. |
thus defining the independent or control variables for TAF. |
1250 |
Three types of controls may be considered: |
Three types of controls may be considered: |
1251 |
% |
% |
1252 |
\begin{itemize} |
\begin{itemize} |
1279 |
holding the perturbation. In the case of a simple |
holding the perturbation. In the case of a simple |
1280 |
sensitivity study this array is identical to zero. |
sensitivity study this array is identical to zero. |
1281 |
However, its specification is essential in the context |
However, its specification is essential in the context |
1282 |
of automatic differentiation since TAMC |
of automatic differentiation since TAF |
1283 |
treats the corresponding line in the code symbolically |
treats the corresponding line in the code symbolically |
1284 |
when determining the differentiation chain and its origin. |
when determining the differentiation chain and its origin. |
1285 |
Thus, the variable names are part of the argument list |
Thus, the variable names are part of the argument list |
1286 |
when calling TAMC: |
when calling TAF: |
1287 |
% |
% |
1288 |
\begin{verbatim} |
\begin{verbatim} |
1289 |
tamc -input 'xx_tr1 ...' ... |
taf -input 'xx_tr1 ...' ... |
1290 |
\end{verbatim} |
\end{verbatim} |
1291 |
% |
% |
1292 |
Now, as mentioned above, the MITGCM avoids maintaining |
Now, as mentioned above, MITgcm avoids maintaining |
1293 |
an array for each control variable by reading the |
an array for each control variable by reading the |
1294 |
perturbation to a temporary array from file. |
perturbation to a temporary array from file. |
1295 |
To ensure that the symbolic link is recognized by TAMC, a scalar |
To ensure that the symbolic link is recognized by TAF, a scalar |
1296 |
dummy variable {\bf xx\_tr1\_dummy} is introduced |
dummy variable {\bf xx\_tr1\_dummy} is introduced |
1297 |
and an `active read' routine of the adjoint support |
and an `active read' routine of the adjoint support |
1298 |
package {\it pkg/autodiff} is invoked. |
package {\it pkg/autodiff} is invoked. |
1299 |
The read-procedure is tagged with the variable |
The read-procedure is tagged with the variable |
1300 |
{\bf xx\_tr1\_dummy} enabling TAMC to recognize the |
{\bf xx\_tr1\_dummy} enabling TAF to recognize the |
1301 |
initialization of the perturbation. |
initialization of the perturbation. |
1302 |
The modified call of TAMC thus reads |
The modified call of TAF thus reads |
1303 |
% |
% |
1304 |
\begin{verbatim} |
\begin{verbatim} |
1305 |
tamc -input 'xx_tr1_dummy ...' ... |
taf -input 'xx_tr1_dummy ...' ... |
1306 |
\end{verbatim} |
\end{verbatim} |
1307 |
% |
% |
1308 |
and the modified operation to (\ref{perturb}) |
and the modified operation to (\ref{perturb}) |