557 |
(a derivative is defined w.r.t. a point along the trajectory), |
(a derivative is defined w.r.t. a point along the trajectory), |
558 |
the intermediate results of the model trajectory |
the intermediate results of the model trajectory |
559 |
$\vec{v}^{(\lambda+1)}={\cal M}_{\lambda}(v^{(\lambda)})$ |
$\vec{v}^{(\lambda+1)}={\cal M}_{\lambda}(v^{(\lambda)})$ |
560 |
are needed to evaluate the intermediate Jacobian |
may be required to evaluate the intermediate Jacobian |
561 |
$M_{\lambda}|_{\vec{v}^{(\lambda)}} \, \delta \vec{v}^{(\lambda)} $. |
$M_{\lambda}|_{\vec{v}^{(\lambda)}} \, \delta \vec{v}^{(\lambda)} $. |
562 |
|
This is the case e.g. for nonlinear expressions |
563 |
|
(momentum advection, nonlinear equation of state), state-dependent |
564 |
|
conditional statements (parameterization schemes). |
565 |
In the forward mode, the intermediate results are required |
In the forward mode, the intermediate results are required |
566 |
in the same order as computed by the full forward model ${\cal M}$, |
in the same order as computed by the full forward model ${\cal M}$, |
567 |
but in the reverse mode they are required in the reverse order. |
but in the reverse mode they are required in the reverse order. |
572 |
|
|
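The ordering requirement follows from the chain rule.
As a sketch, assume the model consists of $ \Lambda $ steps
(the symbol $ \Lambda $ and the adjoint perturbation
$ \delta \vec{v}^{*\,(\lambda)} $ are notation introduced here for
illustration only):
\[
\delta \vec{v}^{(\Lambda)} \, = \,
M_{\Lambda-1}|_{\vec{v}^{(\Lambda-1)}} \cdots
M_{1}|_{\vec{v}^{(1)}} \, M_{0}|_{\vec{v}^{(0)}} \,
\delta \vec{v}^{(0)} ,
\qquad
\delta \vec{v}^{*\,(0)} \, = \,
M_{0}^{T}|_{\vec{v}^{(0)}} \, M_{1}^{T}|_{\vec{v}^{(1)}} \cdots
M_{\Lambda-1}^{T}|_{\vec{v}^{(\Lambda-1)}} \,
\delta \vec{v}^{*\,(\Lambda)} .
\]
In the forward product the rightmost factor $ M_{0} $ acts first, so the
states $ \vec{v}^{(0)}, \vec{v}^{(1)}, \ldots $ are consumed in the order
in which the model produces them.
In the transposed product the rightmost factor is $ M_{\Lambda-1}^{T} $,
so the state computed last, $ \vec{v}^{(\Lambda-1)} $, is needed first.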
573 |
A method to balance the amount of recomputations vs. |
A method to balance the amount of recomputations vs. |
574 |
storage requirements is called {\sf checkpointing} |
storage requirements is called {\sf checkpointing} |
575 |
(e.g. \cite{res-eta:98}). |
(e.g. \cite{gri:92}, \cite{res-eta:98}). |
576 |
It is depicted in Figure~\ref{fig:3levelcheck} for a 3-level checkpointing |
It is depicted in Figure~\ref{fig:3levelcheck} for a 3-level checkpointing |
577 |
[as an example, we give explicit numbers for a 3-day |
[as an example, we give explicit numbers for a 3-day |
578 |
integration with a 1-hourly timestep in square brackets]. |
integration with a 1-hourly timestep in square brackets]. |
583 |
$ {n}^{lev3} $ subsections [$ {n}^{lev3} $=3 1-day intervals], |
$ {n}^{lev3} $ subsections [$ {n}^{lev3} $=3 1-day intervals], |
584 |
with the label $lev3$ for this outermost loop. |
with the label $lev3$ for this outermost loop. |
585 |
The model is then integrated along the full trajectory, |
The model is then integrated along the full trajectory, |
586 |
and the model state stored only at every $ k_{i}^{lev3} $-th timestep |
and the model state stored to disk only at every $ k_{i}^{lev3} $-th timestep |
587 |
[i.e. 3 times, at |
[i.e. 3 times, at |
588 |
$ i = 0,1,2 $ corresponding to $ k_{i}^{lev3} = 0, 24, 48 $]. |
$ i = 0,1,2 $ corresponding to $ k_{i}^{lev3} = 0, 24, 48 $]. |
589 |
|
In addition, the cost function is computed, if needed. |
590 |
% |
% |
591 |
\item [$lev2$] |
\item [$lev2$] |
592 |
In a second step each subsection itself is divided into |
In a second step each subsection itself is divided into |
593 |
$ {n}^{lev2} $ sub-subsections |
$ {n}^{lev2} $ subsections |
594 |
[$ {n}^{lev2} $=4 6-hour intervals per subsection]. |
[$ {n}^{lev2} $=4 6-hour intervals per subsection]. |
595 |
The model picks up at the last outermost dumped state |
The model picks up at the last outermost dumped state |
596 |
$ v_{k_{n}^{lev3}} $ and is integrated forward in time along |
$ v_{k_{n}^{lev3}} $ and is integrated forward in time along |
597 |
the last subsection, with the label $lev2$ for this |
the last subsection, with the label $lev2$ for this |
598 |
intermediate loop. |
intermediate loop. |
599 |
The model state is now stored at every $ k_{i}^{lev2} $-th |
The model state is now stored to disk at every $ k_{i}^{lev2} $-th |
600 |
timestep |
timestep |
601 |
[i.e. 4 times, at |
[i.e. 4 times, at |
602 |
$ i = 0,1,2,3 $ corresponding to $ k_{i}^{lev2} = 48, 54, 60, 66 $]. |
$ i = 0,1,2,3 $ corresponding to $ k_{i}^{lev2} = 48, 54, 60, 66 $]. |
604 |
\item [$lev1$] |
\item [$lev1$] |
605 |
Finally, the model picks up at the last intermediate dump state |
Finally, the model picks up at the last intermediate dump state |
606 |
$ v_{k_{n}^{lev2}} $ and is integrated forward in time along |
$ v_{k_{n}^{lev2}} $ and is integrated forward in time along |
607 |
the last sub-subsection, with the label $lev1$ for this |
the last subsection, with the label $lev1$ for this |
608 |
intermediate loop. |
intermediate loop. |
609 |
Within this sub-subsection only, the model state is stored |
Within this sub-subsection only, parts of the model state are stored |
610 |
at every timestep |
to memory at every timestep |
611 |
[i.e. every hour $ i=0,...,5$ corresponding to |
[i.e. every hour $ i=0,...,5$ corresponding to |
612 |
$ k_{i}^{lev1} = 66, 67, \ldots, 71 $]. |
$ k_{i}^{lev1} = 66, 67, \ldots, 71 $]. |
613 |
Thus, the final state $ v_n = v_{k_{n}^{lev1}} $ is reached |
The final state $ v_n = v_{k_{n}^{lev1}} $ is reached |
614 |
and the model state of all proceeding timesteps along the last |
and the model states of all preceding timesteps along the last |
615 |
sub-subsections are available, enabling integration backwards |
innermost subsection are available, enabling integration backwards |
616 |
in time along the last sub-subsection. |
in time along the last subsection. |
617 |
Thus, the adjoint can be computed along this last |
The adjoint can thus be computed along this last |
618 |
sub-subsection $k_{n}^{lev2}$. |
subsection $k_{n}^{lev2}$. |
619 |
% |
% |
620 |
\end{itemize} |
\end{itemize} |
621 |
% |
% |
622 |
This procedure is repeated consecutively for each previous |
This procedure is repeated consecutively for each previous |
623 |
sub-subsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $ |
subsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $ |
624 |
carrying the adjoint computation to the initial time |
carrying the adjoint computation to the initial time |
625 |
of the subsection $k_{n}^{lev3}$. |
of the subsection $k_{n}^{lev3}$. |
626 |
Then, the procedure is repeated for the previous subsection |
Then, the procedure is repeated for the previous subsection |
631 |
For the full model trajectory of |
For the full model trajectory of |
632 |
$ n^{lev3} \cdot n^{lev2} \cdot n^{lev1} $ timesteps |
$ n^{lev3} \cdot n^{lev2} \cdot n^{lev1} $ timesteps |
633 |
the required storing of the model state was significantly reduced to |
the required storing of the model state was significantly reduced to |
634 |
$ n^{lev1} + n^{lev2} + n^{lev3} $ |
$ n^{lev2} + n^{lev3} $ to disk and roughly $ n^{lev1} $ to memory |
635 |
[i.e. for the 3-day integration with a total of 72 timesteps |
[i.e. for the 3-day integration with a total of 72 timesteps |
636 |
the model state was stored 13 times]. |
the model state was stored 7 times to disk and roughly 6 times |
637 |
|
to memory]. |
638 |
This saving in memory comes at the cost of |
This saving in memory comes at the cost of |
639 |
3 full forward integrations of the model (one for each |
3 full forward integrations of the model (one for each |
640 |
checkpointing level). |
checkpointing level). |
641 |
The balance of storage vs. recomputation certainly depends |
The optimal balance of storage vs. recomputation certainly depends |
642 |
on the computing resources available. |
on the computing resources available and may be tuned by |
643 |
|
adjusting the partitioning among the |
644 |
|
$ n^{lev3}, \,\, n^{lev2}, \,\, n^{lev1} $. |
645 |
|
|
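The three-level scheme described above can be sketched in a few lines of
Python. This is an illustrative toy under stated assumptions, not the
TAMC-generated code: the scalar model {\tt step}, its hand-written adjoint
{\tt step\_adjoint}, the cost function $ {\cal J} = v_n^2/2 $ and all
function names are hypothetical, and Python lists stand in for the disk
dumps and the in-memory tape.

```python
import math


def step(v):
    """One timestep of a toy nonlinear model (hypothetical stand-in)."""
    return v + 0.1 * math.sin(v)


def step_adjoint(v, adv):
    """Adjoint of `step`, linearized about the state v before the step."""
    return (1.0 + 0.1 * math.cos(v)) * adv


def adjoint_3level(v0, n3=3, n2=4, n1=6):
    """Gradient of J = 0.5*v_n**2 w.r.t. v0 using 3-level checkpointing."""
    peak_disk = peak_mem = forward_steps = 0

    # lev3: integrate the full trajectory, dumping the state at the
    # start of each outermost subsection.
    lev3 = []
    v = v0
    for _ in range(n3):
        lev3.append(v)
        for _ in range(n2 * n1):
            v = step(v)
            forward_steps += 1
    peak_disk = len(lev3)
    adv = v  # dJ/dv_n for J = 0.5*v_n**2

    for i3 in reversed(range(n3)):
        # lev2: recompute this outer subsection from its dump and
        # dump the state at the start of each inner subsection.
        lev2 = []
        v = lev3[i3]
        for _ in range(n2):
            lev2.append(v)
            for _ in range(n1):
                v = step(v)
                forward_steps += 1
        peak_disk = max(peak_disk, len(lev3) + len(lev2))

        for i2 in reversed(range(n2)):
            # lev1: recompute the innermost subsection, keeping
            # every state in memory.
            tape = []
            v = lev2[i2]
            for _ in range(n1):
                tape.append(v)
                v = step(v)
                forward_steps += 1
            peak_mem = max(peak_mem, len(tape))
            # Reverse sweep over the taped states.
            for vk in reversed(tape):
                adv = step_adjoint(vk, adv)

    return adv, peak_disk, peak_mem, forward_steps
```

With the default partitioning ($ n^{lev3}=3 $, $ n^{lev2}=4 $, $ n^{lev1}=6 $)
the sketch holds at most 7 checkpoints at the two outer levels and 6 states
on the innermost tape, and performs $ 3 \cdot 72 = 216 $ forward steps,
consistent with the numbers quoted in square brackets above.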
646 |
\begin{figure}[t!] |
\begin{figure}[t!] |
647 |
\begin{center} |
\begin{center} |
689 |
{\it the\_model\_main}, instead of calling {\it the\_main\_loop}, |
{\it the\_model\_main}, instead of calling {\it the\_main\_loop}, |
690 |
invokes the adjoint of this routine, {\it adthe\_main\_loop}, |
invokes the adjoint of this routine, {\it adthe\_main\_loop}, |
691 |
which is the toplevel routine in terms of reverse mode computation. |
which is the toplevel routine in terms of reverse mode computation. |
692 |
The routine {\it adthe\_main\_loop} has been generated using TAMC. |
The routine {\it adthe\_main\_loop} has been generated by TAMC. |
693 |
It contains both the forward integration of the full model, |
It contains both the forward integration of the full model, |
694 |
any additional storing that is required for efficient checkpointing, |
any additional storing that is required for efficient checkpointing, |
695 |
and the reverse integration of the adjoint model. |
and the reverse integration of the adjoint model. |
697 |
simplified for clarification; in particular, no checkpointing |
simplified for clarification; in particular, no checkpointing |
698 |
procedures are shown here. |
procedures are shown here. |
699 |
Prior to the call of {\it adthe\_main\_loop}, the routine |
Prior to the call of {\it adthe\_main\_loop}, the routine |
700 |
{\it ctrl\_unpack} is invoked to unpack the control vector, |
{\it ctrl\_unpack} is invoked to unpack the control vector |
701 |
and following that call, the routine {\it ctrl\_pack} |
or initialise the control variables. |
702 |
|
Following the call of {\it adthe\_main\_loop}, |
703 |
|
the routine {\it ctrl\_pack} |
704 |
is invoked to pack the control vector |
is invoked to pack the control vector |
705 |
(cf. Section \ref{section_ctrl}). |
(cf. Section \ref{section_ctrl}). |
706 |
If gradient checks are to be performed, the option |
If gradient checks are to be performed, the option |
715 |
The cost function $ {\cal J} $ is referred to as the {\sf dependent variable}. |
The cost function $ {\cal J} $ is referred to as the {\sf dependent variable}. |
716 |
It is a function of the input variables $ \vec{u} $ via the composition |
It is a function of the input variables $ \vec{u} $ via the composition |
717 |
$ {\cal J}(\vec{u}) \, = \, {\cal J}(M(\vec{u})) $. |
$ {\cal J}(\vec{u}) \, = \, {\cal J}(M(\vec{u})) $. |
718 |
The input is referred to as the |
The inputs are referred to as the |
719 |
{\sf independent variables} or {\sf control variables}. |
{\sf independent variables} or {\sf control variables}. |
720 |
All aspects relevant to the treatment of the cost function $ {\cal J} $ |
All aspects relevant to the treatment of the cost function $ {\cal J} $ |
721 |
(parameter setting, initialization, accumulation, |
(parameter setting, initialization, accumulation, |
722 |
final evaluation), are controlled by the package {\it pkg/cost}. |
final evaluation), are controlled by the package {\it pkg/cost}. |
723 |
|
The aspects relevant to the treatment of the independent variables |
724 |
|
are controlled by the package {\it pkg/ctrl} and will be treated |
725 |
|
in the next section. |
726 |
|
|
727 |
\input{part5/doc_cost_flow} |
\input{part5/doc_cost_flow} |
728 |
|
|
757 |
{\tt genmake -enable=cost}. |
{\tt genmake -enable=cost}. |
758 |
% |
% |
759 |
\end{enumerate} |
\end{enumerate} |
760 |
|
N.B.: In general the following packages ought to be enabled |
761 |
|
simultaneously: {\it autodiff, cost, ctrl}. |
762 |
The basic CPP option to enable the cost function is {\bf ALLOW\_COST}. |
The basic CPP option to enable the cost function is {\bf ALLOW\_COST}. |
763 |
Each specific cost function contribution has its own option. |
Each specific cost function contribution has its own option. |
764 |
For the present example the option is {\bf ALLOW\_COST\_TRACER}. |
For the present example the option is {\bf ALLOW\_COST\_TRACER}. |
765 |
All cost-specific options are set in {\it ECCO\_CPPOPTIONS.h}. |
All cost-specific options are set in {\it ECCO\_CPPOPTIONS.h}. |
766 |
Since the cost function is usually used in conjunction with |
Since the cost function is usually used in conjunction with |
767 |
automatic differentiation, the CPP option |
automatic differentiation, the CPP option |
768 |
{\bf ALLOW\_ADJOINT\_RUN} should be defined |
{\bf ALLOW\_ADJOINT\_RUN} (file {\it CPP\_OPTIONS.h}) and |
769 |
(file {\it CPP\_OPTIONS.h}). |
{\bf ALLOW\_AUTODIFF\_TAMC} (file {\it ECCO\_CPPOPTIONS.h}) |
770 |
|
should be defined. |
771 |
|
|
772 |
\subsubsection{Initialization} |
\subsubsection{Initialization} |
773 |
% |
% |
774 |
The initialization of the {\it cost} package is readily enabled |
The initialization of the {\it cost} package is readily enabled |
775 |
as soon as the CPP option {\bf ALLOW\_ADJOINT\_RUN} is defined. |
as soon as the CPP option {\bf ALLOW\_COST} is defined. |
776 |
% |
% |
777 |
\begin{itemize} |
\begin{itemize} |
778 |
% |
% |
846 |
\begin{equation} |
\begin{equation} |
847 |
{\cal J} \, = \, |
{\cal J} \, = \, |
848 |
{\rm fc} \, = \, |
{\rm fc} \, = \, |
849 |
{\rm mult\_tracer} \sum_{bi,\,bj}^{nSx,\,nSy} |
{\rm mult\_tracer} \sum_{\text{global sum}} \sum_{bi,\,bj}^{nSx,\,nSy} |
850 |
{\rm objf\_tracer}(bi,bj) \, + \, ... |
{\rm objf\_tracer}(bi,bj) \, + \, ... |
851 |
\end{equation} |
\end{equation} |
852 |
% |
% |
894 |
% |
% |
895 |
To enable the directory to be included in the compile list, |
To enable the directory to be included in the compile list, |
896 |
{\bf ctrl} has to be added to the {\bf enable} list in |
{\bf ctrl} has to be added to the {\bf enable} list in |
897 |
{\it .genmakerc} (or {\it genmake} itself). |
{\it .genmakerc} or in {\it genmake} itself (analogous to the {\it cost} |
898 |
|
package, cf. previous section). |
899 |
Each control variable is enabled via its own CPP option |
Each control variable is enabled via its own CPP option |
900 |
in {\it ECCO\_CPPOPTIONS.h}. |
in {\it ECCO\_CPPOPTIONS.h}. |
901 |
|
|
1039 |
% |
% |
1040 |
Note that reading an active variable corresponds |
Note that reading an active variable corresponds |
1041 |
to a variable assignment. Its derivative corresponds |
to a variable assignment. Its derivative corresponds |
1042 |
to a write statement of the adjoint variable. |
to a write statement of the adjoint variable, followed by |
1043 |
|
a reset. |
1044 |
The 'active file' routines have been designed |
The 'active file' routines have been designed |
1045 |
to support active read and corresponding adjoint active write |
to support active read and corresponding adjoint active write |
1046 |
operations (and vice versa). |
operations (and vice versa). |
1157 |
{\it addummy\_in\_stepping}. |
{\it addummy\_in\_stepping}. |
1158 |
This routine is part of the adjoint support package |
This routine is part of the adjoint support package |
1159 |
{\it pkg/autodiff} (cf. below). |
{\it pkg/autodiff} (cf. below). |
1160 |
|
The procedure is enabled via the CPP option |
1161 |
|
{\bf ALLOW\_AUTODIFF\_MONITOR} (file {\it ECCO\_CPPOPTIONS.h}). |
1162 |
To be part of the adjoint code, the corresponding S/R |
To be part of the adjoint code, the corresponding S/R |
1163 |
{\it dummy\_in\_stepping} has to be called in the forward |
{\it dummy\_in\_stepping} has to be called in the forward |
1164 |
model (S/R {\it the\_main\_loop}) at the appropriate place. |
model (S/R {\it the\_main\_loop}) at the appropriate place. |
1165 |
|
The adjoint common blocks are extracted from the adjoint code |
1166 |
|
via the header file {\it adcommon.h}. |
1167 |
|
|
1168 |
{\it dummy\_in\_stepping} is essentially empty, |
{\it dummy\_in\_stepping} is essentially empty, |
1169 |
the corresponding adjoint routine is hand-written rather |
the corresponding adjoint routine is hand-written rather |
1190 |
{\bf /adtr1\_r/}, {\bf /adffields/}, |
{\bf /adtr1\_r/}, {\bf /adffields/}, |
1191 |
which have been extracted from the adjoint code to enable |
which have been extracted from the adjoint code to enable |
1192 |
access to the adjoint variables. |
access to the adjoint variables. |
1193 |
|
|
1194 |
|
{\bf WARNING:} If the structure of the common blocks |
1195 |
|
{\bf /dynvars\_r/}, {\bf /dynvars\_cd/}, etc., changes, |
1196 |
|
similar changes will occur in the adjoint common blocks. |
1197 |
|
Therefore, consistency between the TAMC-generated common blocks |
1198 |
|
and those in {\it adcommon.h} has to be checked. |
1199 |
% |
% |
1200 |
\end{itemize} |
\end{itemize} |
1201 |
|
|