1 |
% $Header$ |
% $Header$ |
2 |
% $Name$ |
% $Name$ |
3 |
|
|
4 |
|
Author: Patrick Heimbach |
5 |
|
|
{\sf Automatic differentiation} (AD), also referred to as algorithmic
(or, more loosely, computational) differentiation, involves
automatically deriving code to calculate partial derivatives from an
existing fully non-linear prognostic code (see \cite{gri:00}). A
software tool is used that parses and transforms source files
according to a set of linguistic and mathematical rules. AD tools are
like source-to-source translators in that they parse a program code as
input and produce a new program code as output. However, unlike a
pure source-to-source translation, the output program represents a new
algorithm, such as the evaluation of the Jacobian, the Hessian, or
higher derivative operators. In principle, a variety of derived
algorithms can be generated automatically in this way.

MITgcm has been adapted for use with the Tangent linear and Adjoint
Model Compiler (TAMC) and its successor TAF (Transformation of
Algorithms in Fortran), developed by Ralf Giering (\cite{gie-kam:98},
\cite{gie:99,gie:00}). The first application of the adjoint of MITgcm
for sensitivity studies was published by \cite{maro-eta:99}.
\cite{sta-eta:97,sta-eta:01} use MITgcm and its adjoint for ocean
state estimation studies. In the following we shall refer to TAMC and
TAF synonymously, except where explicitly stated otherwise.

TAMC exploits the chain rule for computing the first derivative of a
function with respect to a set of input variables. When a given
forward code is treated as a composition of operations, with each line
representing a compositional element, the chain rule is applied
rigorously to the code, line by line. The resulting tangent linear or
adjoint code may then be thought of as the composition in forward or
reverse order, respectively, of the Jacobian matrices of the forward
code's compositional elements.
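
As a schematic illustration of this statement (the symbols $f_k$, $J_k$,
$\delta x$, $\delta y$ and the starred adjoint variables are notation
assumed here for illustration only; they do not appear in the model code),
suppose the forward code maps an input $x$ to an output $y$ through $n$
elementary steps, $y = f_n ( f_{n-1} ( \ldots f_1 ( x ) \ldots ) )$,
and let $J_k$ denote the Jacobian matrix of step $f_k$, evaluated at the
intermediate state it acts on. The chain rule then gives
%
\[
\delta y \, = \, J_n \, J_{n-1} \cdots J_1 \, \delta x ,
\qquad
\delta^{\ast} x \, = \, J_1^{T} \, J_2^{T} \cdots J_n^{T} \, \delta^{\ast} y .
\]
%
The tangent linear code evaluates the first product from right to left,
i.e. it applies $J_1$ first, following the order of the forward code,
whereas the adjoint code applies $J_n^{T}$ first and therefore runs
through the steps in reverse order.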
|
36 |
|
|
37 |
%********************************************************************** |
%********************************************************************** |
38 |
\section{Some basic algebra} |
\section{Some basic algebra} |
39 |
\label{sec_ad_algebra} |
\label{sec_ad_algebra} |
40 |
|
\begin{rawhtml} |
41 |
|
<!-- CMIREDIR:sec_ad_algebra: --> |
42 |
|
\end{rawhtml} |
43 |
%********************************************************************** |
%********************************************************************** |
44 |
|
|
45 |
Let $ \cal{M} $ be a general nonlinear model, i.e. a |
Let $ \cal{M} $ be a general nonlinear model, i.e. a |
674 |
%********************************************************************** |
%********************************************************************** |
675 |
\section{TLM and ADM generation in general} |
\section{TLM and ADM generation in general} |
676 |
\label{sec_ad_setup_gen} |
\label{sec_ad_setup_gen} |
677 |
|
\begin{rawhtml} |
678 |
|
<!-- CMIREDIR:sec_ad_setup_gen: --> |
679 |
|
\end{rawhtml} |
680 |
%********************************************************************** |
%********************************************************************** |
681 |
|
|
In this section we describe in a general fashion
the parts of the code that are relevant for automatic
differentiation using the software tool TAF.

\input{part5/doc_ad_the_model}

The basic flow is depicted in Fig.~\ref{fig:adthemodel}.
If the CPP option \texttt{ALLOW\_AUTODIFF\_TAMC} is defined,
the driver routine
{\it the\_model\_main}, instead of calling {\it the\_main\_loop},
invokes either the adjoint of this routine, {\it adthe\_main\_loop}
(case \texttt{\#define ALLOW\_ADJOINT\_RUN}), or
the tangent linear of this routine, {\it g\_the\_main\_loop}
(case \texttt{\#define ALLOW\_TANGENTLINEAR\_RUN}).
These are the top-level routines in terms of automatic differentiation.
The routines {\it adthe\_main\_loop} and {\it g\_the\_main\_loop}
are generated by TAF.
The adjoint routine {\it adthe\_main\_loop} contains the forward
integration of the full model, the
cost function calculation,
any additional storing that is required for efficient checkpointing,
and the reverse integration of the adjoint model.

[DESCRIBE IN A SEPARATE SECTION THE WORKING OF THE TLM]

In Fig.~\ref{fig:adthemodel}
the structure of {\it adthe\_main\_loop} has been strongly
simplified to focus on the essentials; in particular, no checkpointing
procedures are shown here.
710 |
Prior to the call of {\it adthe\_main\_loop}, the routine |
Prior to the call of {\it adthe\_main\_loop}, the routine |
711 |
{\it ctrl\_unpack} is invoked to unpack the control vector |
{\it ctrl\_unpack} is invoked to unpack the control vector |
720 |
the gradient has been computed via the adjoint |
the gradient has been computed via the adjoint |
721 |
(cf. Section \ref{section_grdchk}). |
(cf. Section \ref{section_grdchk}). |
722 |
|
|
723 |
|
%------------------------------------------------------------------ |
724 |
|
|
725 |
|
\subsection{General setup |
726 |
|
\label{section_ad_setup}} |
727 |
|
|
728 |
|
In order to configure AD-related setups the following packages need |
729 |
|
to be enabled: |
730 |
|
{\it |
731 |
|
\begin{table}[h!] |
732 |
|
\begin{tabular}{l} |
733 |
|
autodiff \\ |
734 |
|
ctrl \\ |
735 |
|
cost \\ |
736 |
|
grdchk \\ |
737 |
|
\end{tabular} |
738 |
|
\end{table} |
739 |
|
} |
740 |
|
The packages are enabled by adding them to your experiment-specific |
741 |
|
configuration file |
742 |
|
{\it packages.conf} (see Section ???). |
743 |
|
|
744 |
|
The following AD-specific CPP option files need to be customized: |
745 |
|
% |
746 |
|
\begin{itemize} |
747 |
|
% |
748 |
|
\item {\it ECCO\_CPPOPTIONS.h} \\ |
749 |
|
This header file collects CPP options for the packages |
750 |
|
{\it autodiff, cost, ctrl} as well as AD-unrelated options for |
751 |
|
the external forcing package {\it exf}. |
752 |
|
\footnote{NOTE: These options are not set in their package-specific |
753 |
|
headers such as {\it COST\_CPPOPTIONS.h}, but are instead collected |
754 |
|
in the single header file {\it ECCO\_CPPOPTIONS.h}. |
755 |
|
The package-specific header files serve as simple |
756 |
|
placeholders at this point.} |
757 |
|
% |
758 |
|
\item {\it tamc.h} \\ |
759 |
|
This header configures the splitting of the time stepping loop |
760 |
|
w.r.t. the 3-level checkpointing (see section ???). |
761 |
|
|
762 |
|
% |
763 |
|
\end{itemize} |
764 |
|
|
765 |
|
%------------------------------------------------------------------ |
766 |
|
|
767 |
|
\subsection{Building the AD code |
768 |
|
\label{section_ad_build}} |
769 |
|
|
770 |
|
The build process of an AD code is very similar to building |
771 |
|
the forward model. However, depending on which AD code one wishes |
772 |
|
to generate, and on which AD tool is available (TAF or TAMC), |
773 |
|
the following {\tt make} targets are available: |
774 |
|
|
775 |
|
\begin{table}[h!] |
776 |
|
{\footnotesize |
777 |
|
\begin{tabular}{ccll} |
778 |
|
~ & {\it AD-target} & {\it output} & {\it description} \\ |
779 |
|
\hline |
780 |
|
\hline |
781 |
|
(1) & {\tt <MODE><TOOL>only} & {\tt <MODE>\_<TOOL>\_output.f} & |
782 |
|
generates code for $<$MODE$>$ using $<$TOOL$>$ \\ |
783 |
|
~ & ~ & ~ & no {\tt make} dependencies on {\tt .F .h} \\ |
784 |
|
~ & ~ & ~ & useful for compiling on remote platforms \\ |
785 |
|
\hline |
786 |
|
(2) & {\tt <MODE><TOOL>} & {\tt <MODE>\_<TOOL>\_output.f} & |
787 |
|
generates code for $<$MODE$>$ using $<$TOOL$>$ \\ |
788 |
|
~ & ~ & ~ & includes {\tt make} dependencies on {\tt .F .h} \\ |
789 |
|
~ & ~ & ~ & i.e. input for $<$TOOL$>$ may be re-generated \\ |
790 |
|
\hline |
791 |
|
(3) & {\tt <MODE>all} & {\tt mitgcmuv\_<MODE>} & |
792 |
|
generates code for $<$MODE$>$ using $<$TOOL$>$ \\ |
793 |
|
~ & ~ & ~ & and compiles all code \\ |
794 |
|
~ & ~ & ~ & (use of TAF is set as default) \\ |
795 |
|
\hline |
796 |
|
\hline |
797 |
|
\end{tabular} |
798 |
|
} |
799 |
|
\end{table} |
800 |
|
% |
801 |
|
Here, the following placeholders are used: |
802 |
|
% |
803 |
|
\begin{itemize} |
804 |
|
% |
805 |
|
\item [$<$TOOL$>$] |
806 |
|
% |
807 |
|
\begin{itemize} |
808 |
|
% |
809 |
|
\item {\tt TAF} |
810 |
|
\item {\tt TAMC} |
811 |
|
% |
812 |
|
\end{itemize} |
813 |
|
% |
814 |
|
\item [$<$MODE$>$] |
815 |
|
% |
816 |
|
\begin{itemize} |
817 |
|
% |
818 |
|
\item {\tt ad} generates the adjoint model (ADM) |
819 |
|
\item {\tt ftl} generates the tangent linear model (TLM) |
820 |
|
\item {\tt svd} generates both ADM and TLM for \\ |
821 |
|
singular value decomposition (SVD) type calculations |
822 |
|
% |
823 |
|
\end{itemize} |
824 |
|
% |
825 |
|
\end{itemize} |
826 |
|
|
827 |
|
For example, to generate the adjoint model using TAF after routines ({\tt .F}) |
828 |
|
or headers ({\tt .h}) have been modified, but without compilation, |
829 |
|
type {\tt make adtaf}; |
830 |
|
or, to generate the tangent linear model using TAMC without |
831 |
|
re-generating the input code, type {\tt make ftltamconly}. |
832 |
|
|
833 |
|
|
834 |
|
A typical full build process to generate the ADM via TAF |
835 |
|
looks as follows: |
836 |
|
\begin{verbatim} |
837 |
|
% mkdir build |
838 |
|
% cd build |
839 |
|
% ../../../tools/genmake2 -mods=../code_ad |
840 |
|
% make depend |
841 |
|
% make adall |
842 |
|
\end{verbatim} |
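
By the same naming scheme, a tangent linear build might look as follows.
This is only a sketch: it assumes the same directory layout and
{\tt genmake2} invocation as in the adjoint example above, and simply
substitutes the {\tt ftl} mode into the {\tt <MODE>all} target listed
in the table of targets.
\begin{verbatim}
% mkdir build
% cd build
% ../../../tools/genmake2 -mods=../code_ad
% make depend
% make ftlall
\end{verbatim}
According to the table of targets, the resulting executable would be
named {\tt mitgcmuv\_ftl}.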
843 |
|
|
844 |
|
%------------------------------------------------------------------ |
845 |
|
|
846 |
|
\subsection{The AD build process in detail |
847 |
|
\label{section_ad_build_detail}} |
848 |
|
|
849 |
|
The {\tt make <MODE>all} target consists of the following procedures: |
850 |
|
|
851 |
|
\begin{enumerate} |
852 |
|
% |
853 |
|
\item |
854 |
|
A header file {\tt AD\_CONFIG.h} is generated which contains a CPP option |
855 |
|
on which code ought to be generated. Depending on the {\tt make} target, |
856 |
|
its content is one of: |
857 |
|
\begin{itemize} |
858 |
|
\item |
859 |
|
{\tt \#define ALLOW\_ADJOINT\_RUN} |
860 |
|
\item |
861 |
|
{\tt \#define ALLOW\_TANGENTLINEAR\_RUN} |
862 |
|
\item |
863 |
|
{\tt \#define ALLOW\_ECCO\_OPTIMIZATION} |
864 |
|
\end{itemize} |
865 |
|
% |
866 |
|
\item |
867 |
|
A single file {\tt <MODE>\_input\_code.f} is concatenated |
868 |
|
consisting of all {\tt .f} files that are part of the list {\bf AD\_FILES} |
869 |
|
and all {\tt .flow} files that are part of the list {\bf AD\_FLOW\_FILES}. |
870 |
|
% |
871 |
|
\item |
872 |
|
The AD tool is invoked with the {\bf <MODE>\_<TOOL>\_FLAGS}. |
873 |
|
The default AD tool flags in {\tt genmake2} can be overwritten by |
874 |
|
an {\tt adjoint\_options} file (similar to the platform-specific |
875 |
|
{\tt build\_options}, see Section ???). |
876 |
|
The AD tool writes the resulting AD code into the file |
877 |
|
{\tt <MODE>\_input\_code\_ad.f}. |
878 |
|
% |
879 |
|
\item |
880 |
|
A short sed script {\tt adjoint\_sed} is applied to |
881 |
|
{\tt <MODE>\_input\_code\_ad.f} |
882 |
|
to reinstate {\bf myThid} into the CALL argument list of active file I/O. |
883 |
|
The result is written to file {\tt <MODE>\_<TOOL>\_output.f}. |
884 |
|
% |
885 |
|
\item |
886 |
|
All routines are compiled and an executable is generated |
887 |
|
(see Table ???). |
888 |
|
% |
889 |
|
\end{enumerate} |
890 |
|
|
891 |
|
\subsubsection{The list AD\_FILES and {\tt .list} files} |
892 |
|
|
893 |
|
Not all routines are presented to the AD tool. |
894 |
|
Routines typically hidden are diagnostics routines which |
895 |
|
do not influence the cost function, but may create |
896 |
|
artificial flow dependencies such as I/O of active variables. |
897 |
|
|
898 |
|
{\tt genmake2} generates a list (or variable) {\bf AD\_FILES} |
899 |
|
which contains all routines that are shown to the AD tool. |
900 |
|
This list is put together from all files with suffix {\tt .list} |
901 |
|
that {\tt genmake2} finds in its search directories. |
902 |
|
The list file for the core MITgcm routines is in {\tt model/src/} |
903 |
|
and is called {\tt model\_ad\_diff.list}. |
904 |
|
Note that no wrapper routine is shown to TAF. These are either |
905 |
|
not visible at all to the AD code, or hand-written AD code |
906 |
|
is available (see next section). |
907 |
|
|
908 |
|
Each package directory contains its package-specific |
909 |
|
list file {\tt <PKG>\_ad\_diff.list}. For example, |
910 |
|
{\tt pkg/ptracers/} contains the file {\tt ptracers\_ad\_diff.list}. |
911 |
|
Thus, enabling a package will automatically extend the |
912 |
|
{\bf AD\_FILES} list of {\tt genmake2} to incorporate the |
913 |
|
package-specific routines. |
914 |
|
Note that you will need to regenerate the {\tt Makefile} if |
915 |
|
you enable a package (e.g. by adding it to {\tt packages.conf}) |
916 |
|
and a {\tt Makefile} already exists. |
917 |
|
|
918 |
|
\subsubsection{The list AD\_FLOW\_FILES and {\tt .flow} files} |
919 |
|
|
920 |
|
TAMC and TAF can evaluate user-specified directives |
921 |
|
that start with a specific syntax ({\tt CADJ}, {\tt C\$TAF}, {\tt !\$TAF}). |
922 |
|
The main categories of directives are STORE directives and |
923 |
|
FLOW directives. Here, we are concerned with flow directives; |
924 |
|
store directives are treated elsewhere. |
925 |
|
|
926 |
|
Flow directives enable the AD tool to evaluate how it should treat |
927 |
|
routines that are 'hidden' by the user, i.e. routines which are |
928 |
|
not contained in the {\bf AD\_FILES} list (see previous section), |
929 |
|
but which are called in part of the code that the AD tool does see. |
930 |
|
The flow directives tell the AD tool |
931 |
|
% |
932 |
|
\begin{itemize} |
933 |
|
% |
934 |
|
\item which subroutine arguments are input/output |
935 |
|
\item which subroutine arguments are active |
936 |
|
\item which subroutine arguments are required to compute the cost |
937 |
|
\item which subroutine arguments are dependent |
938 |
|
% |
939 |
|
\end{itemize} |
940 |
|
% |
941 |
|
The syntax for the flow directives can be found in the |
942 |
|
AD tool manuals. |
943 |
|
|
944 |
|
{\tt genmake2} generates a list (or variable) {\bf AD\_FLOW\_FILES} |
945 |
|
which contains all files with suffix {\tt .flow} that it finds |
946 |
|
in its search directories. |
947 |
|
The flow directives for the core MITgcm routines of |
948 |
|
{\tt eesupp/src/} and {\tt model/src/} |
949 |
|
reside in {\tt pkg/autodiff/}. |
950 |
|
This directory also contains hand-written adjoint code |
951 |
|
for the MITgcm WRAPPER (section \ref{chap:sarch}). |
952 |
|
|
953 |
|
Flow directives for package-specific routines are contained in |
954 |
|
the corresponding package directories in the file |
955 |
|
{\tt <PKG>\_ad.flow}, e.g. ptracers-specific directives are in |
956 |
|
{\tt ptracers\_ad.flow}. |
957 |
|
|
958 |
|
\subsubsection{Store directives for 3-level checkpointing} |
959 |
|
|
960 |
|
The storing that is required at each period of the |
961 |
|
3-level checkpointing is controlled by three |
962 |
|
top-level headers. |
963 |
|
|
964 |
|
\begin{verbatim} |
965 |
|
do ilev_3 = 1, nchklev_3 |
966 |
|
# include "checkpoint_lev3.h" |
967 |
|
do ilev_2 = 1, nchklev_2 |
968 |
|
# include "checkpoint_lev2.h" |
969 |
|
do ilev_1 = 1, nchklev_1 |
970 |
|
# include "checkpoint_lev1.h" |
971 |
|
|
972 |
|
... |
973 |
|
|
974 |
|
end do |
975 |
|
end do |
976 |
|
end do |
977 |
|
\end{verbatim} |
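
Counting the iterations of this loop nest gives a simple consistency
check. The body of the innermost loop is executed
%
\[
N_{\rm inner} \, = \, \mbox{\tt nchklev\_1} \times \mbox{\tt nchklev\_2}
\times \mbox{\tt nchklev\_3}
\]
%
times (the symbol $N_{\rm inner}$ is introduced here for illustration only).
Assuming that each innermost iteration advances the model by one time step
(the loop body is elided above, so this is an assumption), the three
parameters would have to be chosen such that $N_{\rm inner}$ is at least
the number of time steps of the integration. For example, the hypothetical
choice {\tt nchklev\_1 = 10}, {\tt nchklev\_2 = 6}, {\tt nchklev\_3 = 4}
gives $N_{\rm inner} = 10 \times 6 \times 4 = 240$.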
978 |
|
|
979 |
|
All files {\tt checkpoint\_lev?.h} are contained in directory |
980 |
|
{\tt pkg/autodiff/}. |
981 |
|
|
982 |
|
|
983 |
|
\subsubsection{Changing the default AD tool flags: ad\_options files} |
984 |
|
|
985 |
|
|
986 |
|
\subsubsection{Hand-written adjoint code} |
987 |
|
|
988 |
|
%------------------------------------------------------------------ |
989 |
|
|
990 |
\subsection{The cost function (dependent variable) |
\subsection{The cost function (dependent variable) |
991 |
\label{section_cost}} |
\label{section_cost}} |
992 |
|
|
1004 |
|
|
1005 |
\input{part5/doc_cost_flow} |
\input{part5/doc_cost_flow} |
1006 |
|
|
\subsubsection{Enabling the package}

\fbox{
\begin{minipage}{12cm}
{\it packages.conf}, {\it ECCO\_CPPOPTIONS.h}
\end{minipage}
}
\begin{itemize}
%
\item
The package is enabled by adding {\it cost} to your file {\it packages.conf}
(see Section ???).
%
\end{itemize}
%
1026 |
N.B.: In general the following packages ought to be enabled |
N.B.: In general the following packages ought to be enabled |
1027 |
simultaneously: {\it autodiff, cost, ctrl}. |
simultaneously: {\it autodiff, cost, ctrl}. |
1028 |
The basic CPP option to enable the cost function is {\bf ALLOW\_COST}. |
The basic CPP option to enable the cost function is {\bf ALLOW\_COST}. |
1202 |
\\ |
\\ |
1203 |
% |
% |
1204 |
Two important issues related to the handling of the control |
Two important issues related to the handling of the control |
1205 |
variables in the MITGCM need to be addressed. |
variables in MITgcm need to be addressed. |
1206 |
First, in order to save memory, the control variable arrays |
First, in order to save memory, the control variable arrays |
1207 |
are not kept in memory, but rather read from file and added |
are not kept in memory, but rather read from file and added |
1208 |
to the initial fields during the model initialization phase. |
to the initial fields during the model initialization phase. |
1277 |
tamc -input 'xx_tr1 ...' ... |
tamc -input 'xx_tr1 ...' ... |
1278 |
\end{verbatim} |
\end{verbatim} |
1279 |
% |
% |
1280 |
Now, as mentioned above, the MITGCM avoids maintaining |
Now, as mentioned above, MITgcm avoids maintaining |
1281 |
an array for each control variable by reading the |
an array for each control variable by reading the |
1282 |
perturbation to a temporary array from file. |
perturbation to a temporary array from file. |
1283 |
To ensure that the symbolic link is recognized by TAMC, a scalar |
To ensure that the symbolic link is recognized by TAMC, a scalar |