18 |
can be generated automatically in this way. |
can be generated automatically in this way. |
19 |
|
|
20 |
The MITGCM has been adapted for use with the |
The MITGCM has been adapted for use with the |
21 |
Tangent linear and Adjoint Model Compiler (TAMC) and its succssor TAF |
Tangent linear and Adjoint Model Compiler (TAMC) and its successor TAF |
22 |
(Transformation of Algorithms in Fortran), developed |
(Transformation of Algorithms in Fortran), developed |
23 |
by Ralf Giering (\cite{gie-kam:98}, \cite{gie:99,gie:00}). |
by Ralf Giering (\cite{gie-kam:98}, \cite{gie:99,gie:00}). |
24 |
The first application of the adjoint of the MITGCM for sensitivity |
The first application of the adjoint of the MITGCM for sensitivity |
25 |
studies has been published by \cite{maro-eta:99}. |
studies has been published by \cite{maro-eta:99}. |
26 |
\cite{sta-eta:97,sta-eta:01} use the MITGCM and its adjoint |
\cite{sta-eta:97,sta-eta:01} use the MITGCM and its adjoint |
27 |
for ocean state estimation studies. |
for ocean state estimation studies. |
28 |
|
In the following we shall refer to TAMC and TAF synonymously, |
29 |
|
except where explicitly stated otherwise. |
30 |
|
|
31 |
TAMC exploits the chain rule for computing the first |
TAMC exploits the chain rule for computing the first |
32 |
derivative of a function with |
derivative of a function with |
33 |
respect to a set of input variables. |
respect to a set of input variables. |
34 |
Treating a given forward code as a composition of operations -- |
Treating a given forward code as a composition of operations -- |
35 |
each line representing a compositional element -- the chain rule is |
each line representing a compositional element, the chain rule is |
36 |
rigorously applied to the code, line by line. The resulting |
rigorously applied to the code, line by line. The resulting |
37 |
tangent linear or adjoint code, |
tangent linear or adjoint code, |
38 |
then, may be thought of as the composition in |
then, may be thought of as the composition in |
39 |
forward or reverse order, respectively, of the |
forward or reverse order, respectively, of the |
40 |
Jacobian matrices of the forward code compositional elements. |
Jacobian matrices of the forward code's compositional elements. |
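
As a minimal illustration (a sketch for a hypothetical two-step code,
using the notation of the algebra section of this chapter), let the
forward code be
$ \vec{v} = {\cal M}_{1} ( {\cal M}_{0} (\vec{u}) ) $
with Jacobian matrices $ M_0 $ and $ M_1 $ of the two steps.
The tangent linear code propagates a perturbation $ \delta \vec{u} $
in forward order,
\[
\delta \vec{v} \, = \, M_1 \cdot M_0 \cdot \delta \vec{u}
\]
whereas the adjoint code propagates a sensitivity
$ \delta \vec{v}^{\ast} $ in reverse order,
\[
\delta \vec{u}^{\ast} \, = \, M_0^T \cdot M_1^T \cdot \delta \vec{v}^{\ast}
\]
since $ (M_1 \cdot M_0)^T = M_0^T \cdot M_1^T $.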
41 |
|
|
42 |
%********************************************************************** |
%********************************************************************** |
43 |
\section{Some basic algebra} |
\section{Some basic algebra} |
107 |
$ M $ is just a matrix |
$ M $ is just a matrix |
108 |
which can readily be used to find the forward sensitivity of $\vec{v}$ to |
which can readily be used to find the forward sensitivity of $\vec{v}$ to |
109 |
perturbations in $u$, |
perturbations in $u$, |
110 |
but if there are very many input variables $(>>O(10^{6})$ for |
but if there are very many input variables $(\gg O(10^{6})$ for |
111 |
large-scale oceanographic application), it quickly becomes |
large-scale oceanographic application), it quickly becomes |
112 |
prohibitive to proceed directly as in (\ref{tangent_linear}), |
prohibitive to proceed directly as in (\ref{tangent_linear}), |
113 |
if the impact of each component $ {\bf e_{i}} $ is to be assessed. |
if the impact of each component $ {\bf e_{i}} $ is to be assessed. |
132 |
\label{compo} |
\label{compo} |
133 |
\end{eqnarray} |
\end{eqnarray} |
134 |
% |
% |
135 |
The linear approximation of $ {\cal J} $, |
The perturbation of $ {\cal J} $ around a fixed point $ {\cal J}_0 $, |
136 |
\[ |
\[ |
137 |
{\cal J} \, \approx \, {\cal J}_0 \, + \, \delta {\cal J} |
{\cal J} \, = \, {\cal J}_0 \, + \, \delta {\cal J} |
138 |
\] |
\] |
139 |
can be expressed in both bases of $ \vec{u} $ and $ \vec{v} $ |
can be expressed in both bases of $ \vec{u} $ and $ \vec{v} $ |
140 |
w.r.t. their corresponding inner product |
w.r.t. their corresponding inner product |
154 |
\label{deljidentity} |
\label{deljidentity} |
155 |
\end{equation} |
\end{equation} |
156 |
% |
% |
157 |
(note, that the gradient $ \nabla f $ is a pseudo-vector, therefore |
(note that the gradient $ \nabla f $ is a co-vector, therefore |
158 |
its transpose is required in the above inner product). |
its transpose is required in the above inner product). |
159 |
Then, using the representation of |
Then, using the representation of |
160 |
$ \delta {\cal J} = |
$ \delta {\cal J} = |
170 |
\[ |
\[ |
171 |
A^{\ast} \, = \, A^T |
A^{\ast} \, = \, A^T |
172 |
\] |
\] |
173 |
and from eq. (\ref{tangent_linear}), we note that |
and from eq. (\ref{tangent_linear}), (\ref{deljidentity}), |
174 |
|
we note that |
175 |
(omitting $|$'s): |
(omitting $|$'s): |
176 |
% |
% |
177 |
\begin{equation} |
\begin{equation} |
207 |
$ \delta \vec{u}^{\ast} $ the adjoint variable of the control variable $ \vec{u} $. |
$ \delta \vec{u}^{\ast} $ the adjoint variable of the control variable $ \vec{u} $. |
208 |
|
|
209 |
The {\sf reverse} nature of the adjoint calculation can be readily |
The {\sf reverse} nature of the adjoint calculation can be readily |
210 |
seen as follows. Let us decompose ${\cal J}(u)$, thus: |
seen as follows. |
211 |
|
Consider a model integration which consists of $ \Lambda $ |
212 |
|
consecutive operations |
213 |
|
$ {\cal M}_{\Lambda} ( {\cal M}_{\Lambda-1} ( |
214 |
|
...... ( {\cal M}_{\lambda} ( |
215 |
|
...... |
216 |
|
( {\cal M}_{1} ( {\cal M}_{0}(\vec{u}) )))) $, |
217 |
|
where the ${\cal M}$'s could be the elementary steps, i.e. single lines |
218 |
|
in the code of the model, or successive time steps of the |
219 |
|
model integration, |
220 |
|
starting at step 0 and moving up to step $\Lambda$, with intermediate |
221 |
|
${\cal M}_{\lambda} (\vec{u}) = \vec{v}^{(\lambda+1)}$ and final |
222 |
|
${\cal M}_{\Lambda} (\vec{u}) = \vec{v}^{(\Lambda+1)} = \vec{v}$. |
223 |
|
Let ${\cal J}$ be a cost function which explicitly depends on the |
224 |
|
final state $\vec{v}$ only |
225 |
|
(this restriction is for clarity reasons only). |
226 |
|
% |
227 |
|
${\cal J}(u)$ may be decomposed according to: |
228 |
% |
% |
229 |
\begin{equation} |
\begin{equation} |
230 |
{\cal J}({\cal M}(\vec{u})) \, = \, |
{\cal J}({\cal M}(\vec{u})) \, = \, |
235 |
\label{compos} |
\label{compos} |
236 |
\end{equation} |
\end{equation} |
237 |
% |
% |
238 |
where the ${\cal M}$'s could be the elementary steps, i.e. single lines |
Then, according to the chain rule, the forward calculation reads, |
239 |
in the code of the model, |
in terms of the Jacobi matrices |
|
starting at step 0 and moving up to step $\Lambda$, with intermediate |
|
|
${\cal M}_{\lambda} (\vec{u}) = \vec{v}^{(\lambda+1)}$ and final |
|
|
${\cal M}_{\Lambda} (\vec{u}) = \vec{v}^{(\Lambda+1)} = \vec{v}$ |
|
|
Then, according to the chain rule the forward calculation reads in |
|
|
terms of the Jacobi matrices |
|
240 |
(we've omitted the $ | $'s which, nevertheless, are important |
(we've omitted the $ | $'s which, nevertheless, are important |
241 |
to the aspect of {\it tangent} linearity; |
to the aspect of {\it tangent} linearity; |
242 |
note also that per definition |
note also that by definition |
243 |
$ \langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \rangle |
$ \langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \rangle |
244 |
= \nabla_v {\cal J} \cdot \delta \vec{v} $ ) |
= \nabla_v {\cal J} \cdot \delta \vec{v} $ ) |
245 |
% |
% |
274 |
% |
% |
275 |
clearly expressing the reverse nature of the calculation. |
clearly expressing the reverse nature of the calculation. |
276 |
Eq. (\ref{reverse}) is at the heart of automatic adjoint compilers. |
Eq. (\ref{reverse}) is at the heart of automatic adjoint compilers. |
277 |
The intermediate steps $\lambda$ in |
If the intermediate steps $\lambda$ in |
278 |
eqn. (\ref{compos}) -- (\ref{reverse}) |
eqn. (\ref{compos}) -- (\ref{reverse}) |
279 |
could represent the model state (forward or adjoint) at each |
represent the model state (forward or adjoint) at each |
280 |
intermediate time step in which case |
intermediate time step as noted above, then correspondingly, |
281 |
$ {\cal M}(\vec{v}^{(\lambda)}) = \vec{v}^{(\lambda+1)} $, and correspondingly, |
$ M^T (\delta \vec{v}^{(\lambda) \, \ast}) = |
282 |
$ M^T (\delta \vec{v}^{(\lambda) \, \ast}) = \delta \vec{v}^{(\lambda-1) \, \ast} $, |
\delta \vec{v}^{(\lambda-1) \, \ast} $ for the adjoint variables. |
283 |
but they can also be viewed more generally as |
It thus becomes evident that the adjoint calculation also |
284 |
single lines of code in the numerical algorithm. |
yields the adjoint of each model state component |
285 |
In both cases it becomes evident that the adjoint calculation |
$ \vec{v}^{(\lambda)} $ at each intermediate step $ \lambda $, namely |
|
yields at the same time the adjoint of each model state component |
|
|
$ \vec{v}^{(\lambda)} $ at each intermediate step $ l $, namely |
|
286 |
% |
% |
287 |
\begin{equation} |
\begin{equation} |
288 |
\boxed{ |
\boxed{ |
298 |
% |
% |
299 |
in close analogy to eq. (\ref{adjoint}). |
in close analogy to eq. (\ref{adjoint}). |
300 |
We note in passing that the $\delta \vec{v}^{(\lambda) \, \ast}$ |
We note in passing that the $\delta \vec{v}^{(\lambda) \, \ast}$ |
301 |
are the Lagrange multipliers of the model state $ \vec{v}^{(\lambda)}$. |
are the Lagrange multipliers of the model equations which determine |
302 |
|
$ \vec{v}^{(\lambda)}$. |
303 |
|
|
304 |
In components, eq. (\ref{adjoint}) reads as follows. |
In components, eq. (\ref{adjoint}) reads as follows. |
305 |
Let |
Let |
409 |
$ \delta v^{(\lambda) \, \ast}_{j} = \frac{\partial}{\partial v^{(\lambda)}_{j}} |
$ \delta v^{(\lambda) \, \ast}_{j} = \frac{\partial}{\partial v^{(\lambda)}_{j}} |
410 |
{\cal J}^T $, $ j = 1, \ldots , n_{\lambda} $, |
{\cal J}^T $, $ j = 1, \ldots , n_{\lambda} $, |
411 |
for intermediate components, yielding |
for intermediate components, yielding |
412 |
\[ |
\begin{equation} |
413 |
\footnotesize |
\small |
414 |
|
\begin{split} |
415 |
\left( |
\left( |
416 |
\begin{array}{c} |
\begin{array}{c} |
417 |
\delta v^{(\lambda) \, \ast}_1 \\ |
\delta v^{(\lambda) \, \ast}_1 \\ |
419 |
\delta v^{(\lambda) \, \ast}_{n_{\lambda}} \\ |
\delta v^{(\lambda) \, \ast}_{n_{\lambda}} \\ |
420 |
\end{array} |
\end{array} |
421 |
\right) |
\right) |
422 |
\, = \, |
\, = & |
423 |
\left( |
\left( |
424 |
\begin{array}{ccc} |
\begin{array}{ccc} |
425 |
\frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1} |
\frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1} |
426 |
& \ldots & |
& \ldots \,\, \ldots & |
427 |
\frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1} \\ |
\frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1} \\ |
428 |
\vdots & ~ & \vdots \\ |
\vdots & ~ & \vdots \\ |
429 |
\frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}} |
\frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}} |
430 |
& \ldots & |
& \ldots \,\, \ldots & |
431 |
\frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\ |
\frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\ |
432 |
\end{array} |
\end{array} |
433 |
\right) |
\right) |
|
% |
|
434 |
\cdot |
\cdot |
435 |
% |
% |
436 |
|
\\ ~ & ~ |
437 |
|
\\ ~ & |
438 |
|
% |
439 |
\left( |
\left( |
440 |
\begin{array}{ccc} |
\begin{array}{ccc} |
441 |
\frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_1} |
\frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_1} |
448 |
\frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}} \\ |
\frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}} \\ |
449 |
\end{array} |
\end{array} |
450 |
\right) |
\right) |
451 |
\cdot \ldots \ldots \cdot |
\cdot \, \ldots \, \cdot |
452 |
\left( |
\left( |
453 |
\begin{array}{c} |
\begin{array}{c} |
454 |
\delta v^{\ast}_1 \\ |
\delta v^{\ast}_1 \\ |
456 |
\delta v^{\ast}_{n} \\ |
\delta v^{\ast}_{n} \\ |
457 |
\end{array} |
\end{array} |
458 |
\right) |
\right) |
459 |
\] |
\end{split} |
460 |
|
\end{equation} |
461 |
|
|
462 |
Eq. (\ref{forward}) and (\ref{reverse}) are perhaps clearest in |
Eq. (\ref{forward}) and (\ref{reverse}) are perhaps clearest in |
463 |
showing the advantage of the reverse over the forward mode |
showing the advantage of the reverse over the forward mode |
478 |
gradient $\nabla _{u}{\cal J}$ (and all intermediate gradients |
gradient $\nabla _{u}{\cal J}$ (and all intermediate gradients |
479 |
$\nabla _{v^{(\lambda)}}{\cal J}$) within a single reverse calculation. |
$\nabla _{v^{(\lambda)}}{\cal J}$) within a single reverse calculation. |
480 |
|
|
481 |
Note, that in case $ {\cal J} $ is a vector-valued function |
Note that if $ {\cal J} $ is a vector-valued function |
482 |
of dimension $ l > 1 $, |
of dimension $ l > 1 $, |
483 |
eq. (\ref{reverse}) has to be modified according to |
eq. (\ref{reverse}) has to be modified according to |
484 |
\[ |
\[ |
486 |
\, = \, |
\, = \, |
487 |
\nabla_u {\cal J}^T \cdot \delta \vec{J} |
\nabla_u {\cal J}^T \cdot \delta \vec{J} |
488 |
\] |
\] |
489 |
where now $ \delta \vec{J} \in I\!\!R $ is a vector of dimenison $ l $. |
where now $ \delta \vec{J} \in I\!\!R^l $ is a vector of |
490 |
|
dimension $ l $. |
491 |
In this case $ l $ reverse simulations have to be performed |
In this case $ l $ reverse simulations have to be performed |
492 |
for each $ \delta J_{k}, \,\, k = 1, \ldots, l $. |
for each $ \delta J_{k}, \,\, k = 1, \ldots, l $. |
493 |
Then, the reverse mode is more efficient as long as |
Then, the reverse mode is more efficient as long as |
522 |
\paragraph{Example 2: |
\paragraph{Example 2: |
523 |
$ {\cal J} = \langle \, {\cal H}(\vec{v}) - \vec{d} \, , |
$ {\cal J} = \langle \, {\cal H}(\vec{v}) - \vec{d} \, , |
524 |
\, {\cal H}(\vec{v}) - \vec{d} \, \rangle $} ~ \\ |
\, {\cal H}(\vec{v}) - \vec{d} \, \rangle $} ~ \\ |
525 |
The cost function represents the quadratic model vs.data misfit. |
The cost function represents the quadratic model vs. data misfit. |
526 |
Here, $ \vec{d} $ is the data vector and $ {\cal H} $ represents the |
Here, $ \vec{d} $ is the data vector and $ {\cal H} $ represents the |
527 |
operator which maps the model state space onto the data space. |
operator which maps the model state space onto the data space. |
528 |
Then, $ \nabla_v {\cal J} $ takes the form |
Then, $ \nabla_v {\cal J} $ takes the form |
553 |
|
|
554 |
We note an important aspect of the forward vs. reverse |
We note an important aspect of the forward vs. reverse |
555 |
mode calculation. |
mode calculation. |
556 |
Because of the locality of the derivative, |
Because of the local character of the derivative |
557 |
|
(a derivative is defined w.r.t. a point along the trajectory), |
558 |
the intermediate results of the model trajectory |
the intermediate results of the model trajectory |
559 |
$\vec{v}^{(\lambda+1)}={\cal M}_{\lambda}(v^{(\lambda)})$ |
$\vec{v}^{(\lambda+1)}={\cal M}_{\lambda}(v^{(\lambda)})$ |
560 |
are needed to evaluate the intermediate Jacobian |
are needed to evaluate the intermediate Jacobian |
561 |
$M_{\lambda}|_{\vec{v}^{(\lambda)}} \, \delta \vec{v}^{(\lambda)} $. |
$M_{\lambda}|_{\vec{v}^{(\lambda)}} \, \delta \vec{v}^{(\lambda)} $. |
562 |
In the forward mode, the intermediate results are required |
In the forward mode, the intermediate results are required |
563 |
in the same order as computed by the full forward model ${\cal M}$, |
in the same order as computed by the full forward model ${\cal M}$, |
564 |
in the reverse mode they are required in the reverse order. |
but in the reverse mode they are required in the reverse order. |
565 |
Thus, in the reverse mode the trajectory of the forward model |
Thus, in the reverse mode the trajectory of the forward model |
566 |
integration ${\cal M}$ has to be stored to be available in the reverse |
integration ${\cal M}$ has to be stored to be available in the reverse |
567 |
calculation. Alternatively, the model state would have to be |
calculation. Alternatively, the complete model state up to the |
568 |
recomputed whenever its value is required. |
point of evaluation has to be recomputed whenever its value is required. |
569 |
|
|
570 |
A method to balance the amount of recomputations vs. |
A method to balance the amount of recomputations vs. |
571 |
storage requirements is called {\sf checkpointing} |
storage requirements is called {\sf checkpointing} |
572 |
(e.g. \cite{res-eta:98}). |
(e.g. \cite{res-eta:98}). |
573 |
It is depicted in Fig. ... for a 3-level checkpointing |
It is depicted in \reffig{3levelcheck} for a 3-level checkpointing |
574 |
[as concrete example, we give explicit numbers for a 3-day |
[as an example, we give explicit numbers for a 3-day |
575 |
integration with a 1-hourly timestep in square brackets]. |
integration with a 1-hourly timestep in square brackets]. |
576 |
\begin{itemize} |
\begin{itemize} |
577 |
% |
% |
579 |
In a first step, the model trajectory is subdivided into |
In a first step, the model trajectory is subdivided into |
580 |
$ {n}^{lev3} $ subsections [$ {n}^{lev3} $=3 1-day intervals], |
$ {n}^{lev3} $ subsections [$ {n}^{lev3} $=3 1-day intervals], |
581 |
with the label $lev3$ for this outermost loop. |
with the label $lev3$ for this outermost loop. |
582 |
The model is then integrated over the full trajectory, |
The model is then integrated along the full trajectory, |
583 |
and the model state stored only at every $ k_{i}^{lev3} $-th timestep |
and the model state stored only at every $ k_{i}^{lev3} $-th timestep |
584 |
[i.e. 3 times, at |
[i.e. 3 times, at |
585 |
$ i = 0,1,2 $ corresponding to $ k_{i}^{lev3} = 0, 24, 48 $]. |
$ i = 0,1,2 $ corresponding to $ k_{i}^{lev3} = 0, 24, 48 $]. |
586 |
% |
% |
587 |
\item [$lev2$] |
\item [$lev2$] |
588 |
In a second step each subsection is itself divided into |
In a second step each subsection itself is divided into |
589 |
$ {n}^{lev2} $ subsubsections |
$ {n}^{lev2} $ sub-subsections |
590 |
[$ {n}^{lev2} $=4 6-hour intervals per subsection]. |
[$ {n}^{lev2} $=4 6-hour intervals per subsection]. |
591 |
The model picks up at the last outermost dumped state |
The model picks up at the last outermost dumped state |
592 |
$ v_{k_{n}^{lev3}} $ and is integrated forward in time over |
$ v_{k_{n}^{lev3}} $ and is integrated forward in time along |
593 |
the last subsection, with the label $lev2$ for this |
the last subsection, with the label $lev2$ for this |
594 |
intermediate loop. |
intermediate loop. |
595 |
The model state is now stored only at every $ k_{i}^{lev2} $-th |
The model state is now stored at every $ k_{i}^{lev2} $-th |
596 |
timestep |
timestep |
597 |
[i.e. 4 times, at |
[i.e. 4 times, at |
598 |
$ i = 0,1,2,3 $ corresponding to $ k_{i}^{lev2} = 48, 54, 60, 66 $]. |
$ i = 0,1,2,3 $ corresponding to $ k_{i}^{lev2} = 48, 54, 60, 66 $]. |
599 |
% |
% |
600 |
\item [$lev1$] |
\item [$lev1$] |
601 |
Finally, the mode picks up at the last intermediate dump state |
Finally, the model picks up at the last intermediate dump state |
602 |
$ v_{k_{n}^{lev2}} $ and is integrated forward in time over |
$ v_{k_{n}^{lev2}} $ and is integrated forward in time along |
603 |
the last subsubsection, with the label $lev1$ for this |
the last sub-subsection, with the label $lev1$ for this |
604 |
intermediate loop. |
intermediate loop. |
605 |
Within this subsubsection only, the model state is stored |
Within this sub-subsection only, the model state is stored |
606 |
at every timestep |
at every timestep |
607 |
[i.e. every hour $ i=0,...,5$ corresponding to |
[i.e. every hour $ i=0,...,5$ corresponding to |
608 |
$ k_{i}^{lev1} = 66, 67, \ldots, 71 $]. |
$ k_{i}^{lev1} = 66, 67, \ldots, 71 $]. |
609 |
Thus, the final state $ v_n = v_{k_{n}^{lev1}} $ is reached |
Thus, the final state $ v_n = v_{k_{n}^{lev1}} $ is reached |
610 |
and the model state of all preceding timesteps over the last |
and the model state of all preceding timesteps along the last |
611 |
subsubsections are available, enabling integration backwards |
sub-subsections are available, enabling integration backwards |
612 |
in time over the last subsubsection. |
in time along the last sub-subsection. |
613 |
Thus, the adjoint can be computed over this last |
Thus, the adjoint can be computed along this last |
614 |
subsubsection $k_{n}^{lev2}$. |
sub-subsection $k_{n}^{lev2}$. |
615 |
% |
% |
616 |
\end{itemize} |
\end{itemize} |
617 |
% |
% |
618 |
This procedure is repeated consecutively for each previous |
This procedure is repeated consecutively for each previous |
619 |
subsubsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $ |
sub-subsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $ |
620 |
carrying the adjoint computation to the initial time |
carrying the adjoint computation to the initial time |
621 |
of the subsection $k_{n}^{lev3}$. |
of the subsection $k_{n}^{lev3}$. |
622 |
Then, the procedure is repeated for the previous subsection |
Then, the procedure is repeated for the previous subsection |
652 |
\caption |
\caption |
653 |
{Schematic view of intermediate dump and restart for |
{Schematic view of intermediate dump and restart for |
654 |
3-level checkpointing.} |
3-level checkpointing.} |
655 |
\label{fig:erswns} |
\label{fig:3levelcheck} |
656 |
\end{figure} |
\end{figure} |
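
As a rough tally for the bracketed 3-day example above (a sketch; it
assumes that each level keeps only its own dumps and that every dump
holds one full model state), the number of model states held at any
one time is
\[
n^{lev3} + n^{lev2} + n^{lev1} \, = \, 3 + 4 + 6 \, = \, 13
\]
compared with
\[
n^{lev3} \cdot n^{lev2} \cdot n^{lev1} \, = \, 3 \cdot 4 \cdot 6 \, = \, 72
\]
states if the full trajectory were stored.
The price is recomputation: each timestep is integrated forward at most
once per checkpointing level, i.e. at most $ 3 \cdot 72 = 216 $ forward
steps in total.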
657 |
|
|
658 |
\subsection{Optimal perturbations} |
% \subsection{Optimal perturbations} |
659 |
\label{optpert} |
% \label{sec_optpert} |
660 |
|
|
661 |
|
|
662 |
\subsection{Error covariance estimate and Hessian matrix} |
% \subsection{Error covariance estimate and Hessian matrix} |
663 |
\label{sec_hessian} |
% \label{sec_hessian} |
664 |
|
|
665 |
\newpage |
\newpage |
666 |
|
|
669 |
\label{sec_ad_setup_ex} |
\label{sec_ad_setup_ex} |
670 |
%********************************************************************** |
%********************************************************************** |
671 |
|
|
672 |
The MITGCM has been adapted to enable AD using TAMC or TAF |
The MITGCM has been adapted to enable AD using TAMC or TAF. |
|
(we'll refer to TAMC and TAF interchangeably, except where |
|
|
distinctions are explicitly mentioned). |
|
673 |
The present description, therefore, is specific to the |
The present description, therefore, is specific to the |
674 |
use of TAMC as AD tool. |
use of TAMC or TAF as AD tool. |
675 |
The following sections describe the steps which are necessary to |
The following sections describe the steps which are necessary to |
676 |
generate a tangent linear or adjoint model of the MITGCM. |
generate a tangent linear or adjoint model of the MITGCM. |
677 |
We take as an example the sensitivity of carbon sequestration |
We take as an example the sensitivity of carbon sequestration |
682 |
\subsection{Overview of the experiment} |
\subsection{Overview of the experiment} |
683 |
|
|
684 |
We describe an adjoint sensitivity analysis of outgassing from |
We describe an adjoint sensitivity analysis of outgassing from |
685 |
the ocean into the atmosphere of a carbon like tracer injected |
the ocean into the atmosphere of a carbon-like tracer injected |
686 |
into the ocean interior (see \cite{hil-eta:01}). |
into the ocean interior (see \cite{hil-eta:01}). |
687 |
|
|
688 |
\subsubsection{Passive tracer equation} |
\subsubsection{Passive tracer equation} |
704 |
direct injection. |
direct injection. |
705 |
The velocity term, $U$, is the sum of the |
The velocity term, $U$, is the sum of the |
706 |
model Eulerian circulation and an eddy-induced velocity, the latter |
model Eulerian circulation and an eddy-induced velocity, the latter |
707 |
parameterized according to Gent/McWilliams (\cite{gen:90, dan:95}). |
parameterized according to Gent/McWilliams |
708 |
|
(\cite{gen-mcw:90, gen-eta:95}). |
709 |
The convection function, $\Gamma$, mixes $C$ vertically wherever the |
The convection function, $\Gamma$, mixes $C$ vertically wherever the |
710 |
fluid is locally statically unstable. |
fluid is locally statically unstable. |
711 |
|
|
796 |
% |
% |
797 |
\item {\it data.ctrl} |
\item {\it data.ctrl} |
798 |
% |
% |
799 |
|
\item {\it data.gmredi} |
800 |
|
% |
801 |
|
\item {\it data.grdchk} |
802 |
|
% |
803 |
|
\item {\it data.optim} |
804 |
|
% |
805 |
\item {\it data.pkg} |
\item {\it data.pkg} |
806 |
% |
% |
807 |
\item {\it eedata} |
\item {\it eedata} |
832 |
specific to this experiment. |
specific to this experiment. |
833 |
|
|
834 |
\subsubsection{File {\it .genmakerc}} |
\subsubsection{File {\it .genmakerc}} |
835 |
This file overwites default settings of {\it genmake}. |
This file overwrites default settings of {\it genmake}. |
836 |
In the present example it is used to switch on the following |
In the present example it is used to switch on the following |
837 |
packages which are related to automatic differentiation |
packages which are related to automatic differentiation |
838 |
and are disabled by default: \\ |
and are disabled by default: \\ |
839 |
\hspace*{4ex} {\tt set ENABLE=( autodiff cost ctrl ecco )} \\ |
\hspace*{4ex} {\tt set ENABLE=( autodiff cost ctrl ecco gmredi grdchk kpp )} \\ |
840 |
Other packages which are not needed are switched off: \\ |
Other packages which are not needed are switched off: \\ |
841 |
\hspace*{4ex} {\tt set DISABLE=( aim obcs zonal\_filt shap\_filt cal exf )} |
\hspace*{4ex} {\tt set DISABLE=( aim obcs zonal\_filt shap\_filt cal exf )} |
842 |
|
|
853 |
|
|
854 |
This file contains 'wrapper'-specific CPP options. |
This file contains 'wrapper'-specific CPP options. |
855 |
It only needs to be changed if the code is to be run |
It only needs to be changed if the code is to be run |
856 |
in parallel environment (see Section \ref{???}). |
in a parallel environment (see Section \ref{???}). |
857 |
|
|
858 |
\subsubsection{File {\it CPP\_OPTIONS.h}} |
\subsubsection{File {\it CPP\_OPTIONS.h}} |
859 |
|
|
862 |
Most options are related to the forward model setup. |
Most options are related to the forward model setup. |
863 |
They are identical to the global steady circulation setup of |
They are identical to the global steady circulation setup of |
864 |
{\it verification/exp2/}. |
{\it verification/exp2/}. |
865 |
The option specific to this experiment is \\ |
The three options specific to this experiment are \\ |
866 |
|
\hspace*{4ex} {\tt \#define ALLOW\_PASSIVE\_TRACER} \\ |
867 |
|
This flag enables the code to carry through the |
868 |
|
advection/diffusion of a passive tracer along the |
869 |
|
model integration. \\ |
870 |
\hspace*{4ex} {\tt \#define ALLOW\_MIT\_ADJOINT\_RUN} \\ |
\hspace*{4ex} {\tt \#define ALLOW\_MIT\_ADJOINT\_RUN} \\ |
871 |
This flag enables the inclusion of some AD-related fields |
This flag enables the inclusion of some AD-related fields |
872 |
concerning initialisation, link between control variables |
concerning initialisation, link between control variables |
873 |
and forward model variables, and the call to the top-level |
and forward model variables, and the call to the top-level |
874 |
forward/adjoint subroutine {\it adthe\_main\_loop} |
forward/adjoint subroutine {\it adthe\_main\_loop} |
875 |
instead of {\it the\_main\_loop}. |
instead of {\it the\_main\_loop}. \\ |
876 |
|
\hspace*{4ex} {\tt \#define ALLOW\_GRADIENT\_CHECK} \\ |
877 |
|
This flag enables the gradient check package. |
878 |
|
After computing the unperturbed cost function and its gradient, |
879 |
|
a series of computations are performed for which \\ |
880 |
|
$\bullet$ an element of the control vector is perturbed \\ |
881 |
|
$\bullet$ the cost function w.r.t. the perturbed element is |
882 |
|
computed \\ |
883 |
|
$\bullet$ the difference between the perturbed and unperturbed |
884 |
|
cost function is used to form the finite difference gradient \\ |
885 |
|
$\bullet$ the finite difference gradient is compared with the |
886 |
|
adjoint-generated gradient. |
887 |
|
The gradient check package is further described in Section ???. |
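
In formulas, the check described above can be sketched as follows
(the symbols $ \epsilon $ for the perturbation size, $ \vec{e}_{i} $
for the unit vector of the perturbed control vector element, and
$ g_{i}^{fd} $, $ g_{i}^{ad} $ for the two gradient estimates are
notation assumed here, not taken from the package):
\[
g_{i}^{fd} \, = \,
\frac{ {\cal J}(\vec{u} + \epsilon \, \vec{e}_{i}) - {\cal J}(\vec{u}) }{\epsilon} ,
\qquad
g_{i}^{ad} \, = \, \left( \nabla_u {\cal J} \right)_{i}
\]
The two estimates are consistent if $ g_{i}^{fd} / g_{i}^{ad} \approx 1 $,
up to truncation and round-off errors which depend on the choice of
$ \epsilon $.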
888 |
|
|
889 |
\subsubsection{File {\it ECCO\_OPTIONS.h}} |
\subsubsection{File {\it ECCO\_OPTIONS.h}} |
890 |
|
|
919 |
in particular the hooks in the forward code for |
in particular the hooks in the forward code for |
920 |
initialising, accumulating and finalizing the cost function. \\ |
initialising, accumulating and finalizing the cost function. \\ |
921 |
\hspace*{4ex} {\tt \#define ALLOW\_COST\_TRACER} \\ |
\hspace*{4ex} {\tt \#define ALLOW\_COST\_TRACER} \\ |
922 |
includes the subroutine with the cost function for this |
includes the call to the cost function for this |
923 |
particular experiment, eqn. (\ref{cost_tracer}). |
particular experiment, eqn. (\ref{cost_tracer}). |
924 |
% |
% |
925 |
\item Control variable package: {\it pkg/ctrl/} \\ |
\item Control variable package: {\it pkg/ctrl/} \\ |
941 |
freshwater flux \\ |
freshwater flux \\ |
942 |
\hspace*{2ex} {\tt \#define ALLOW\_HFLUX0\_CONTROL} & |
\hspace*{2ex} {\tt \#define ALLOW\_HFLUX0\_CONTROL} & |
943 |
heat flux \\ |
heat flux \\ |
944 |
\hspace*{2ex} {\tt \#undef ALLOW\_DIFFKR\_CONTROL} & |
\hspace*{2ex} {\tt \#define ALLOW\_DIFFKR\_CONTROL} & |
945 |
diapycnal diffusivity \\ |
diapycnal diffusivity \\ |
946 |
\hspace*{2ex} {\tt \#undef ALLOW\_KAPPAGM\_CONTROL} & |
\hspace*{2ex} {\tt \#undef ALLOW\_KAPPAGM\_CONTROL} & |
947 |
isopycnal diffusivity \\ |
isopycnal diffusivity \\ |
973 |
\hspace*{4ex} is related to {\it DYNVARS.h} \\ |
\hspace*{4ex} is related to {\it DYNVARS.h} \\ |
974 |
\hspace*{4ex} {\tt common /addynvars\_cd/} & |
\hspace*{4ex} {\tt common /addynvars\_cd/} & |
975 |
\hspace*{4ex} is related to {\it DYNVARS.h} \\ |
\hspace*{4ex} is related to {\it DYNVARS.h} \\ |
976 |
|
\hspace*{4ex} {\tt common /addynvars\_diffkr/} & |
977 |
|
\hspace*{4ex} is related to {\it DYNVARS.h} \\ |
978 |
|
\hspace*{4ex} {\tt common /addynvars\_kapgm/} & |
979 |
|
\hspace*{4ex} is related to {\it DYNVARS.h} \\ |
980 |
\hspace*{4ex} {\tt common /adtr1\_r/} & |
\hspace*{4ex} {\tt common /adtr1\_r/} & |
981 |
\hspace*{4ex} is related to {\it TR1.h} \\ |
\hspace*{4ex} is related to {\it TR1.h} \\ |
982 |
\hspace*{4ex} {\tt common /adffields/} & |
\hspace*{4ex} {\tt common /adffields/} & |
1001 |
3-level checkpointing is enabled, i.e. the timestepping |
3-level checkpointing is enabled, i.e. the timestepping |
1002 |
is divided into three different levels (see Section \ref{???}). |
is divided into three different levels (see Section \ref{???}). |
1003 |
The model state of the outermost ({\tt nchklev\_3}) and the |
The model state of the outermost ({\tt nchklev\_3}) and the |
1004 |
itermediate ({\tt nchklev\_2}) timestepping loop are stored to file |
intermediate ({\tt nchklev\_2}) timestepping loop are stored to file |
1005 |
(handled in {\it the\_main\_loop}). |
(handled in {\it the\_main\_loop}). |
1006 |
The innermost loop ({\tt nchklev\_1}) |
The innermost loop ({\tt nchklev\_1}) |
1007 |
avoids I/O by storing all required variables |
avoids I/O by storing all required variables |
1013 |
\hspace*{4ex} {\tt nchklev\_2 = 30 } \\ |
\hspace*{4ex} {\tt nchklev\_2 = 30 } \\ |
1014 |
\hspace*{4ex} {\tt nchklev\_3 = 60 } \\ |
\hspace*{4ex} {\tt nchklev\_3 = 60 } \\ |
1015 |
To guarantee that the checkpointing intervals span the entire |
To guarantee that the checkpointing intervals span the entire |
1016 |
integration period the relation \\ |
integration period the following relation must be satisfied: \\ |
1017 |
\hspace*{4ex} {\tt nchklev\_1*nchklev\_2*nchklev\_3 $ \ge $ nTimeSteps} \\ |
\hspace*{4ex} {\tt nchklev\_1*nchklev\_2*nchklev\_3 $ \ge $ nTimeSteps} \\ |
1018 |
where {\tt nTimeSteps} is either specified in {\it data} |
where {\tt nTimeSteps} is either specified in {\it data} |
1019 |
or computed via \\ |
or computed via \\ |
1027 |
% |
% |
1028 |
\end{itemize} |
\end{itemize} |
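
As a worked example of the relation between the checkpointing levels
and the length of the integration (a sketch; the value of
{\tt nTimeSteps} used below is hypothetical):
with {\tt nchklev\_2 = 30} and {\tt nchklev\_3 = 60} the innermost
level has to satisfy
\[
\mbox{\tt nchklev\_1} \, \ge \,
\frac{\mbox{\tt nTimeSteps}}{30 \cdot 60} \, = \,
\frac{\mbox{\tt nTimeSteps}}{1800}
\]
so that, for instance, an integration of {\tt nTimeSteps = 3600}
would require {\tt nchklev\_1} $ \ge 2 $.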
1029 |
|
|
1030 |
|
The following parameters may be worth describing: \\ |
1031 |
|
% |
1032 |
|
\hspace*{4ex} {\tt isbyte} \\ |
1033 |
|
\hspace*{4ex} {\tt maxpass} \\ |
1034 |
|
~ |
1035 |
|
|
1036 |
\subsubsection{File {\it makefile}} |
\subsubsection{File {\it makefile}} |
1037 |
|
|
1038 |
This file contains all relevant parameter flags and |
This file contains all relevant parameter flags and |
1039 |
lists to run TAMC. |
lists to run TAMC or TAF. |
1040 |
It is assumed that TAMC is available to you, either locally, |
It is assumed that TAMC is available to you, either locally, |
1041 |
being installed on your network, or remotely through the 'TAMC Utility'. |
being installed on your network, or remotely through the 'TAMC Utility'. |
1042 |
TAMC is called with the command {\tt tamc} followed by a |
TAMC is called with the command {\tt tamc} followed by a |
1047 |
\begin{itemize} |
\begin{itemize} |
1048 |
\item [{\tt tamc}] {\tt |
\item [{\tt tamc}] {\tt |
1049 |
-input <variable names> |
-input <variable names> |
1050 |
-output <variable name> ... \\ |
-output <variable name> -r4 ... \\ |
1051 |
-toplevel <S/R name> -reverse <file names> |
-toplevel <S/R name> -reverse <file names> |
1052 |
} |
} |
1053 |
\end{itemize} |
\end{itemize} |
1068 |
\item {\tt -reverse <file names>} \\ |
\item {\tt -reverse <file names>} \\ |
1069 |
Adjoint code is generated to compute the sensitivity of a |
Adjoint code is generated to compute the sensitivity of a |
1070 |
dependent variable w.r.t. many independent variables. |
dependent variable w.r.t. many independent variables. |
1071 |
The generated adjoint top-level routine computes the product |
In the discussion of Section ??? |
1072 |
|
the generated adjoint top-level routine computes the product |
1073 |
of the transposed Jacobian matrix $ M^T $ times |
of the transposed Jacobian matrix $ M^T $ times |
1074 |
the gradient vector $ \nabla_v J $. |
the gradient vector $ \nabla_v J $. |
1075 |
\\ |
\\ |
1080 |
deliberately hidden from TAMC, either because hand-written |
deliberately hidden from TAMC, either because hand-written |
1081 |
adjoint routines exist, or the routines must not (or don't have to) |
adjoint routines exist, or the routines must not (or don't have to) |
1082 |
be differentiated. For each routine which is part of the flow tree |
be differentiated. For each routine which is part of the flow tree |
1083 |
of the top-level routine, but deliberately hidden from TAMC, |
of the top-level routine, but deliberately hidden from TAMC |
1084 |
|
(or for each package which contains such routines), |
1085 |
a corresponding file {\it .flow} exists containing flow directives |
a corresponding file {\it .flow} exists containing flow directives |
1086 |
for TAMC. |
for TAMC. |
1087 |
% |
% |
1088 |
|
\item {\tt -r4} \\ |
1089 |
|
~ |
1090 |
|
% |
1091 |
\end{itemize} |
\end{itemize} |
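
The product computed in reverse mode (option {\tt -reverse}) can be made
explicit with a short sketch. Here $ M $ is taken to denote the Jacobian of
the mapping from the control variables $ u $ to the model state $ v $,
and $ \langle \cdot \, , \cdot \rangle $ a Euclidean inner product;
this notation is an assumption made for illustration.
A perturbation $ \delta u $ produces $ \delta v = M \, \delta u $, hence
\[
\delta J \, = \, \langle \, \nabla_v J \, , \, M \, \delta u \, \rangle
\, = \, \langle \, M^T \, \nabla_v J \, , \, \delta u \, \rangle ,
\qquad \mbox{i.e.} \qquad
\nabla_u J \, = \, M^T \, \nabla_v J .
\]
A single evaluation of the adjoint thus yields the gradient with respect
to all control variables at once.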
1092 |
|
|
1093 |
|
|
1097 |
|
|
1098 |
\subsubsection{File {\it data.ctrl}} |
\subsubsection{File {\it data.ctrl}} |
1099 |
|
|
1100 |
|
\subsubsection{File {\it data.gmredi}} |
1101 |
|
|
1102 |
|
\subsubsection{File {\it data.grdchk}} |
1103 |
|
|
1104 |
|
\subsubsection{File {\it data.optim}} |
1105 |
|
|
1106 |
\subsubsection{File {\it data.pkg}} |
\subsubsection{File {\it data.pkg}} |
1107 |
|
|
1108 |
\subsubsection{File {\it eedata}} |
\subsubsection{File {\it eedata}} |
1122 |
\newpage |
\newpage |
1123 |
|
|
1124 |
%********************************************************************** |
%********************************************************************** |
1125 |
\section{TLM and ADM code generation in general} |
\section{TLM and ADM generation in general} |
1126 |
\label{sec_ad_setup_gen} |
\label{sec_ad_setup_gen} |
1127 |
%********************************************************************** |
%********************************************************************** |
1128 |
|
|
1130 |
the parts of the code that are relevant for automatic |
the parts of the code that are relevant for automatic |
1131 |
differentiation using the software tool TAMC. |
differentiation using the software tool TAMC. |
1132 |
|
|
1133 |
\subsection{The cost function (dependent variable)} |
\begin{figure}[b!] |
1134 |
|
\input{part5/doc_ad_the_model} |
1135 |
|
\caption{~} |
1136 |
|
\label{fig:adthemodel} |
1137 |
|
\end{figure} |
1138 |
|
|
1139 |
|
The basic flow is depicted in \reffig{adthemodel}. |
1140 |
|
If the option {\tt ALLOW\_AUTODIFF\_TAMC} is defined, the driver routine |
1141 |
|
{\it the\_model\_main}, instead of calling {\it the\_main\_loop}, |
1142 |
|
invokes the adjoint of this routine, {\it adthe\_main\_loop}, |
1143 |
|
which is the toplevel routine in terms of reverse mode computation. |
1144 |
|
The routine {\it adthe\_main\_loop} has been generated using TAMC. |
1145 |
|
It contains both the forward integration of the full model, |
1146 |
|
any additional storing that is required for efficient checkpointing, |
1147 |
|
and the reverse integration of the adjoint model. |
1148 |
|
The structure of {\it adthe\_main\_loop} has been strongly |
1149 |
|
simplified for clarification; in particular, no checkpointing |
1150 |
|
procedures are shown here. |
1151 |
|
Prior to the call of {\it adthe\_main\_loop}, the routine |
1152 |
|
{\it ctrl\_unpack} is invoked to unpack the control vector, |
1153 |
|
and following that call, the routine {\it ctrl\_pack} |
1154 |
|
is invoked to pack the control vector |
1155 |
|
(cf. Section \ref{section_ctrl}). |
1156 |
|
If gradient checks are to be performed, the option |
1157 |
|
{\tt ALLOW\_GRADIENT\_CHECK} is defined. In this case |
1158 |
|
the driver routine {\it grdchk\_main} is called after |
1159 |
|
the gradient has been computed via the adjoint |
1160 |
|
(cf. Section \ref{section_grdchk}). |
1161 |
|
|
1162 |
|
\subsection{The cost function (dependent variable) |
1163 |
|
\label{section_cost}} |
1164 |
|
|
1165 |
The cost function $ {\cal J} $ is referred to as the {\sf dependent variable}. |
The cost function $ {\cal J} $ is referred to as the {\sf dependent variable}. |
1166 |
It is a function of the input variables $ \vec{u} $ via the composition |
It is a function of the input variables $ \vec{u} $ via the composition |
1168 |
The input is referred to as the |
The input is referred to as the |
1169 |
{\sf independent variables} or {\sf control variables}. |
{\sf independent variables} or {\sf control variables}. |
1170 |
All aspects relevant to the treatment of the cost function $ {\cal J} $ |
All aspects relevant to the treatment of the cost function $ {\cal J} $ |
1171 |
(parameter setting, initialisation, incrementation, |
(parameter setting, initialisation, accumulation, |
1172 |
final evaluation), are controled by the package {\it pkg/cost}. |
final evaluation), are controlled by the package {\it pkg/cost}. |
1173 |
|
|
1174 |
|
\begin{figure}[h!] |
1175 |
|
\input{part5/doc_cost_flow} |
1176 |
|
\caption{~} |
1177 |
|
\label{fig:costflow} |
1178 |
|
\end{figure} |
1179 |
|
|
1180 |
\subsubsection{genmake and CPP options} |
\subsubsection{genmake and CPP options} |
1181 |
% |
% |
1195 |
\begin{enumerate} |
\begin{enumerate} |
1196 |
% |
% |
1197 |
\item {\it genmake}: \\ |
\item {\it genmake}: \\ |
1198 |
Change the default settngs in the file {\it genmake} by adding |
Change the default settings in the file {\it genmake} by adding |
1199 |
{\bf cost} to the {\bf enable} list (not recommended). |
{\bf cost} to the {\bf enable} list (not recommended). |
1200 |
% |
% |
1201 |
\item {\it .genmakerc}: \\ |
\item {\it .genmakerc}: \\ |
1208 |
{\tt genmake -enable=cost}. |
{\tt genmake -enable=cost}. |
1209 |
% |
% |
1210 |
\end{enumerate} |
\end{enumerate} |
|
Since the cost function is usually used in conjunction with |
|
|
automatic differentiation, the CPP option |
|
|
{\bf ALLOW\_ADJOINT\_RUN} should be defined |
|
|
(file {\it CPP\_OPTIONS.h}). |
|
1211 |
The basic CPP option to enable the cost function is {\bf ALLOW\_COST}. |
The basic CPP option to enable the cost function is {\bf ALLOW\_COST}. |
1212 |
Each specific cost function contribution has its own option. |
Each specific cost function contribution has its own option. |
1213 |
For the present example the option is {\bf ALLOW\_COST\_TRACER}. |
For the present example the option is {\bf ALLOW\_COST\_TRACER}. |
1214 |
All cost-specific options are set in {\it ECCO\_CPPOPTIONS.h} |
All cost-specific options are set in {\it ECCO\_CPPOPTIONS.h} |
1215 |
|
Since the cost function is usually used in conjunction with |
1216 |
|
automatic differentiation, the CPP option |
1217 |
|
{\bf ALLOW\_ADJOINT\_RUN} should be defined |
1218 |
|
(file {\it CPP\_OPTIONS.h}). |
1219 |
|
|
1220 |
\subsubsection{Initialisation} |
\subsubsection{Initialisation} |
1221 |
% |
% |
1256 |
% |
% |
1257 |
\end{itemize} |
\end{itemize} |
1258 |
% |
% |
1259 |
\subsubsection{Incrementation} |
\subsubsection{Accumulation} |
1260 |
% |
% |
1261 |
\begin{itemize} |
\begin{itemize} |
1262 |
% |
% |
1304 |
tamc -output 'fc' ... |
tamc -output 'fc' ... |
1305 |
\end{verbatim} |
\end{verbatim} |
1306 |
|
|
1307 |
\begin{figure}[t!] |
%%%% \end{document} |
|
\input{part5/doc_ad_the_model} |
|
|
\label{fig:adthemodel} |
|
|
\caption{~} |
|
|
\end{figure} |
|
1308 |
|
|
1309 |
\begin{figure} |
\begin{figure} |
1310 |
\input{part5/doc_ad_the_main} |
\input{part5/doc_ad_the_main} |
|
\label{fig:adthemain} |
|
1311 |
\caption{~} |
\caption{~} |
1312 |
|
\label{fig:adthemain} |
1313 |
\end{figure} |
\end{figure} |
1314 |
|
|
1315 |
\subsection{The control variables (independent variables)} |
\subsection{The control variables (independent variables) |
1316 |
|
\label{section_ctrl}} |
1317 |
|
|
1318 |
The control variables are a subset of the model input |
The control variables are a subset of the model input |
1319 |
(initial conditions, boundary conditions, model parameters). |
(initial conditions, boundary conditions, model parameters). |
1320 |
Here we identify them with the variable $ \vec{u} $. |
Here we identify them with the variable $ \vec{u} $. |
1321 |
All intermediate variables whose derivative w.r.t. control |
All intermediate variables whose derivative w.r.t. control |
1322 |
variables don't vanish are called {\sf active variables}. |
variables do not vanish are called {\sf active variables}. |
1323 |
All subroutines whose derivative w.r.t. the control variables |
All subroutines whose derivative w.r.t. the control variables |
1324 |
do not vanish are called {\sf active routines}. |
do not vanish are called {\sf active routines}. |
1325 |
Read and write operations from and to file can be viewed |
Read and write operations from and to file can be viewed |
1330 |
(parameter setting, initialisation, perturbation) |
(parameter setting, initialisation, perturbation) |
1331 |
are controlled by the package {\it pkg/ctrl}. |
are controlled by the package {\it pkg/ctrl}. |
1332 |
|
|
1333 |
|
\begin{figure}[h!] |
1334 |
|
\input{part5/doc_ctrl_flow} |
1335 |
|
\caption{~} |
1336 |
|
\label{fig:ctrlflow} |
1337 |
|
\end{figure} |
1338 |
|
|
1339 |
\subsubsection{genmake and CPP options} |
\subsubsection{genmake and CPP options} |
1340 |
% |
% |
1341 |
\begin{itemize} |
\begin{itemize} |
1394 |
variables in the MITGCM need to be addressed. |
variables in the MITGCM need to be addressed. |
1395 |
First, in order to save memory, the control variable arrays |
First, in order to save memory, the control variable arrays |
1396 |
are not kept in memory, but rather read from file and added |
are not kept in memory, but rather read from file and added |
1397 |
to the initial (or first guess) fields. |
to the initial fields during the model initialisation phase. |
1398 |
Similarly, the corresponding adjoint fields which represent |
Similarly, the corresponding adjoint fields which represent |
1399 |
the gradient of the cost function w.r.t. the control variables |
the gradient of the cost function w.r.t. the control variables |
1400 |
are written to to file. |
are written to file at the end of the adjoint integration. |
1401 |
Second, in addition to the files holding the 2-dim. and 3-dim. |
Second, in addition to the files holding the 2-dim. and 3-dim. |
1402 |
control variables and the gradient, a 1-dim. {\sf control vector} |
control variables and the corresponding cost gradients, |
1403 |
|
a 1-dim. {\sf control vector} |
1404 |
and {\sf gradient vector} are written to file. They contain |
and {\sf gradient vector} are written to file. They contain |
1405 |
only the wet points of the control variables and the corresponding |
only the wet points of the control variables and the corresponding |
1406 |
gradient. |
gradient. |
1407 |
This leads to a significant data compression. |
This leads to a significant data compression. |
1408 |
Furthermore, the control and the gradient vector can be passed to a |
Furthermore, an option is available |
1409 |
|
({\tt ALLOW\_NONDIMENSIONAL\_CONTROL\_IO}) to |
1410 |
|
non-dimensionalise the control and gradient vector, |
1411 |
|
which otherwise would contain different pieces of different |
1412 |
|
magnitudes and units. |
1413 |
|
Finally, the control and gradient vector can be passed to a |
1414 |
minimization routine if an update of the control variables |
minimization routine if an update of the control variables |
1415 |
is sought as part of a minimization exercise. |
is sought as part of a minimization exercise. |
1416 |
|
|
1421 |
|
|
1422 |
\subsubsection{Perturbation of the independent variables} |
\subsubsection{Perturbation of the independent variables} |
1423 |
% |
% |
1424 |
The dependency chain for differentiation starts |
The dependency flow for differentiation w.r.t. the controls |
1425 |
with adding a perturbation onto the the input variable, |
starts with adding a perturbation onto the input variable, |
1426 |
thus defining the independent or control variables for TAMC. |
thus defining the independent or control variables for TAMC. |
1427 |
Three classes of controls may be considered: |
Three types of controls may be considered: |
1428 |
% |
% |
1429 |
\begin{itemize} |
\begin{itemize} |
1430 |
% |
% |
1439 |
Consider as an example the initial tracer distribution |
Consider as an example the initial tracer distribution |
1440 |
{\bf tr1} as control variable. |
{\bf tr1} as control variable. |
1441 |
After {\bf tr1} has been initialised in |
After {\bf tr1} has been initialised in |
1442 |
{\it ini\_tr1} (dynamical variables including |
{\it ini\_tr1} (dynamical variables such as |
1443 |
temperature and salinity are initialised in {\it ini\_fields}), |
temperature and salinity are initialised in {\it ini\_fields}), |
1444 |
a perturbation anomaly is added to the field in S/R |
a perturbation anomaly is added to the field in S/R |
1445 |
{\it ctrl\_map\_ini} |
{\it ctrl\_map\_ini} |
1452 |
\end{split} |
\end{split} |
1453 |
\end{equation} |
\end{equation} |
1454 |
% |
% |
1455 |
In principle {\bf xx\_tr1} is a 3-dim. global array |
{\bf xx\_tr1} is a 3-dim. global array |
1456 |
holding the perturbation. In the case of a simple |
holding the perturbation. In the case of a simple |
1457 |
sensitivity study this array is identical to zero. |
sensitivity study this array is identical to zero. |
1458 |
However, its specification is essential since TAMC |
However, its specification is essential in the context |
1459 |
|
of automatic differentiation since TAMC |
1460 |
treats the corresponding line in the code symbolically |
treats the corresponding line in the code symbolically |
1461 |
when determining the differentiation chain and its origin. |
when determining the differentiation chain and its origin. |
1462 |
Thus, the variable names are part of the argument list |
Thus, the variable names are part of the argument list |
1496 |
to a variable assignment. Its derivative corresponds |
to a variable assignment. Its derivative corresponds |
1497 |
to a write statement of the adjoint variable. |
to a write statement of the adjoint variable. |
1498 |
The 'active file' routines have been designed |
The 'active file' routines have been designed |
1499 |
to support active read and corresponding active write |
to support active read and corresponding adjoint active write |
1500 |
operations. |
operations (and vice versa). |
1501 |
% |
% |
1502 |
\item |
\item |
1503 |
\fbox{ |
\fbox{ |
1514 |
Note however an important difference: |
Note however an important difference: |
1515 |
Since the boundary values are time dependent with a new |
Since the boundary values are time dependent with a new |
1516 |
forcing field applied at each time step, |
forcing field applied at each time step, |
1517 |
the general problem may be be thought of as |
the general problem may be thought of as |
1518 |
a new control variable at each time step, i.e. |
a new control variable at each time step |
1519 |
|
(or, if the perturbation is averaged over a certain period, |
1520 |
|
at each $ N $ timesteps), i.e. |
1521 |
\[ |
\[ |
1522 |
u_{\rm forcing} \, = \, |
u_{\rm forcing} \, = \, |
1523 |
\{ \, u_{\rm forcing} ( t_n ) \, \}_{ |
\{ \, u_{\rm forcing} ( t_n ) \, \}_{ |
1542 |
% |
% |
1543 |
This routine is not yet implemented, but would proceed |
This routine is not yet implemented, but would proceed |
1544 |
along the same lines as the initial value sensitivity. |
along the same lines as the initial value sensitivity. |
1545 |
|
The mixing parameters {\bf diffkr} and {\bf kapgm} |
1546 |
|
are currently added as controls in {\it ctrl\_map\_ini.F}. |
1547 |
% |
% |
1548 |
\end{itemize} |
\end{itemize} |
1549 |
% |
% |
1550 |
|
|
1551 |
\subsubsection{Output of adjoint variables and gradient} |
\subsubsection{Output of adjoint variables and gradient} |
1552 |
% |
% |
1553 |
Two ways exist to generate output of adjoint fields. |
Several ways exist to generate output of adjoint fields. |
1554 |
% |
% |
1555 |
\begin{itemize} |
\begin{itemize} |
1556 |
% |
% |
1557 |
\item |
\item |
1558 |
\fbox{ |
\fbox{ |
1559 |
\begin{minipage}{12cm} |
\begin{minipage}{12cm} |
1560 |
{\it ctrl\_pack}: |
{\it ctrl\_map\_ini, ctrl\_map\_forcing}: |
1561 |
\end{minipage} |
\end{minipage} |
1562 |
} |
} |
1563 |
\\ |
\\ |
|
At the end of the forward/adjoint integration, the S/R |
|
|
{\it ctrl\_pack} is called which mirrors S/R {\it ctrl\_unpack}. |
|
|
It writes the following files: |
|
|
% |
|
1564 |
\begin{itemize} |
\begin{itemize} |
1565 |
% |
% |
1566 |
\item {\bf xx\_...}: the control variable fields |
\item {\bf xx\_...}: the control variable fields \\ |
1567 |
|
Before the forward integration, the control |
1568 |
|
variables are read from file {\bf xx\_ ...} and added to |
1569 |
|
the model field. |
1570 |
% |
% |
1571 |
\item {\bf adxx\_...}: the adjoint variable fields, i.e. the gradient |
\item {\bf adxx\_...}: the adjoint variable fields, i.e. the gradient |
1572 |
$ \nabla _{u}{\cal J} $ for each control variable, |
$ \nabla _{u}{\cal J} $ for each control variable \\ |
1573 |
|
After the adjoint integration the corresponding adjoint |
1574 |
|
variables are written to {\bf adxx\_ ...}. |
1575 |
% |
% |
1576 |
\item {\bf vector\_ctrl}: the control vector |
\end{itemize} |
1577 |
% |
% |
1578 |
\item {\bf vector\_grad}: the gradient vector |
\item |
1579 |
|
\fbox{ |
1580 |
|
\begin{minipage}{12cm} |
1581 |
|
{\it ctrl\_unpack, ctrl\_pack}: |
1582 |
|
\end{minipage} |
1583 |
|
} |
1584 |
|
\\ |
1585 |
|
% |
1586 |
|
\begin{itemize} |
1587 |
|
% |
1588 |
|
\item {\bf vector\_ctrl}: the control vector \\ |
1589 |
|
At the very beginning of the model initialisation, |
1590 |
|
the updated compressed control vector is read (or initialised) |
1591 |
|
and distributed to 2-dim. and 3-dim. control variable fields. |
1592 |
|
% |
1593 |
|
\item {\bf vector\_grad}: the gradient vector \\ |
1594 |
|
At the very end of the adjoint integration, |
1595 |
|
the 2-dim. and 3-dim. adjoint variables are read, |
1596 |
|
compressed to a single vector and written to file. |
1597 |
% |
% |
1598 |
\end{itemize} |
\end{itemize} |
1599 |
% |
% |
1605 |
} |
} |
1606 |
\\ |
\\ |
1607 |
In addition to writing the gradient at the end of the |
In addition to writing the gradient at the end of the |
1608 |
forward/adjoint integration, many more adjoint variables, |
forward/adjoint integration, many more adjoint variables |
1609 |
representing the Lagrange multipliers of the model state |
of the model state |
1610 |
w.r.t. the model state |
at intermediate times can be written using S/R |
|
at different times can be written using S/R |
|
1611 |
{\it addummy\_in\_stepping}. |
{\it addummy\_in\_stepping}. |
1612 |
This routine is part of the adjoint support package |
This routine is part of the adjoint support package |
1613 |
{\it pkg/autodiff} (cf. below). |
{\it pkg/autodiff} (cf. below). |
1621 |
Appropriate flow directives ({\it dummy\_in\_stepping.flow}) |
Appropriate flow directives ({\it dummy\_in\_stepping.flow}) |
1622 |
ensure that TAMC does not automatically |
ensure that TAMC does not automatically |
1623 |
generate {\it addummy\_in\_stepping} by trying to differentiate |
generate {\it addummy\_in\_stepping} by trying to differentiate |
1624 |
{\it dummy\_in\_stepping}, but rather takes the hand-written routine. |
{\it dummy\_in\_stepping}, but instead refers to |
1625 |
|
the hand-written routine. |
1626 |
|
|
1627 |
{\it dummy\_in\_stepping} is called in the forward code |
{\it dummy\_in\_stepping} is called in the forward code |
1628 |
at the beginning of each |
at the beginning of each |
1632 |
{\it addynamics}. |
{\it addynamics}. |
1633 |
|
|
1634 |
{\it addummy\_in\_stepping} includes the header files |
{\it addummy\_in\_stepping} includes the header files |
1635 |
{\it adffields.h, addynamics.h, adtr1.h}. |
{\it adcommon.h}. |
1636 |
These header files are also hand-written. They contain |
This header file is also hand-written. It contains |
1637 |
the common blocks {\bf /addynvars\_r/}, {\bf /addynvars\_cd/}, |
the common blocks |
1638 |
|
{\bf /addynvars\_r/}, {\bf /addynvars\_cd/}, |
1639 |
|
{\bf /addynvars\_diffkr/}, {\bf /addynvars\_kapgm/}, |
1640 |
{\bf /adtr1\_r/}, {\bf /adffields/}, |
{\bf /adtr1\_r/}, {\bf /adffields/}, |
1641 |
which have been extracted from the adjoint code to enable |
which have been extracted from the adjoint code to enable |
1642 |
access to the adjoint variables. |
access to the adjoint variables. |
1654 |
with the value of the cost function itself $ {\cal J}(u_{[k]}) $ |
with the value of the cost function itself $ {\cal J}(u_{[k]}) $ |
1655 |
at iteration step $ k $ serve |
at iteration step $ k $ serve |
1656 |
as input to a minimization routine (e.g. quasi-Newton method, |
as input to a minimization routine (e.g. quasi-Newton method, |
1657 |
conjugate gradient, ...) to compute an update in the |
conjugate gradient, ... \cite{gil_lem:89}) |
1658 |
|
to compute an update in the |
1659 |
control variable for iteration step $k+1$ |
control variable for iteration step $k+1$ |
1660 |
\[ |
\[ |
1661 |
u_{[k+1]} \, = \, u_{[0]} \, + \, \Delta u_{[k+1]} |
u_{[k+1]} \, = \, u_{[0]} \, + \, \Delta u_{[k+1]} |
1669 |
and the minimization routine. |
and the minimization routine. |
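
As a minimal illustration of such an update, consider steepest descent
with a hypothetical step length $ \alpha_{[k]} > 0 $. This is the simplest
possible choice, assumed here for the example only, and not necessarily
the method used in practice:
\[
\Delta u_{[k+1]} \, = \, \Delta u_{[k]} \, - \,
\alpha_{[k]} \, \nabla_u {\cal J}_{[k]} ,
\qquad \Delta u_{[0]} \, = \, 0 .
\]
Inserted into the update formula for $ u_{[k+1]} $, this gives
$ u_{[k+1]} = u_{[k]} - \alpha_{[k]} \, \nabla_u {\cal J}_{[k]} $,
i.e. each iteration requires one forward/adjoint integration to supply
$ {\cal J}_{[k]} $ and $ \nabla_u {\cal J}_{[k]} $.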
1670 |
|
|
1671 |
\begin{eqnarray*} |
\begin{eqnarray*} |
1672 |
\footnotesize |
\scriptsize |
1673 |
\begin{array}{ccccc} |
\begin{array}{ccccc} |
1674 |
u_{[0]} \,\, , \,\, \Delta u_{[k]} & ~ & ~ & ~ & ~ \\ |
u_{[0]} \,\, , \,\, \Delta u_{[k]} & ~ & ~ & ~ & ~ \\ |
1675 |
{\Big\downarrow} |
{\Big\downarrow} |
1686 |
{\cal J}_{[k]} = {\cal J} \left( M \left( u_{[k]} \right) \right)} \\ |
{\cal J}_{[k]} = {\cal J} \left( M \left( u_{[k]} \right) \right)} \\ |
1687 |
\multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\ |
\multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\ |
1688 |
\hline |
\hline |
1689 |
|
\multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\ |
1690 |
|
\multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{{\Big\downarrow}} \\ |
1691 |
|
\multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\ |
1692 |
\hline |
\hline |
1693 |
\multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\ |
\multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\ |
1694 |
\multicolumn{1}{|c}{ |
\multicolumn{1}{|c}{ |
1695 |
\nabla_u {\cal J}_{[k]} (\delta {\cal J}) = |
\nabla_u {\cal J}_{[k]} (\delta {\cal J}) = |
1696 |
T\!\!^{\ast} \cdot \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J})} & |
T^{\ast} \cdot \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J})} & |
1697 |
\stackrel{\bf adjoint}{\mathbf \longleftarrow} & |
\stackrel{\bf adjoint}{\mathbf \longleftarrow} & |
1698 |
ad \, v_{[k]} (\delta {\cal J}) = |
ad \, v_{[k]} (\delta {\cal J}) = |
1699 |
\nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J}) & |
\nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J}) & |
1702 |
\multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\ |
\multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\ |
1703 |
\hline |
\hline |
1704 |
~ & ~ & ~ & ~ & ~ \\ |
~ & ~ & ~ & ~ & ~ \\ |
1705 |
~ & ~ & |
\hspace*{15ex}{\Bigg\downarrow} |
1706 |
{\cal J}_{[k]} \qquad {\Bigg\downarrow} \qquad \nabla_u {\cal J}_{[k]} |
\quad {\cal J}_{[k]}, \quad \nabla_u {\cal J}_{[k]} |
1707 |
& ~ & ~ \\ |
& ~ & ~ & ~ & ~ \\ |
1708 |
~ & ~ & ~ & ~ & ~ \\ |
~ & ~ & ~ & ~ & ~ \\ |
1709 |
\hline |
\hline |
1710 |
\multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\ |
\multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\ |
1732 |
|
|
1733 |
\vspace*{0.5cm} |
\vspace*{0.5cm} |
1734 |
|
|
1735 |
|
{\scriptsize |
1736 |
\begin{tabular}{ccccc} |
\begin{tabular}{ccccc} |
1737 |
{\bf vector\_ctrl\_$<$k$>$ } & ~ & ~ & ~ & ~ \\ |
{\bf vector\_ctrl\_$<$k$>$ } & ~ & ~ & ~ & ~ \\ |
1738 |
{\big\downarrow} & ~ & ~ & ~ & ~ \\ |
{\big\downarrow} & ~ & ~ & ~ & ~ \\ |
1743 |
\cline{3-3} |
\cline{3-3} |
1744 |
\multicolumn{1}{l}{\bf xx\_theta0...$<$k$>$} & ~ & |
\multicolumn{1}{l}{\bf xx\_theta0...$<$k$>$} & ~ & |
1745 |
\multicolumn{1}{|c|}{~} & ~ & ~ \\ |
\multicolumn{1}{|c|}{~} & ~ & ~ \\ |
1746 |
\multicolumn{1}{l}{\bf xx\_salt0...$<$k$>$} & $\longrightarrow$ & |
\multicolumn{1}{l}{\bf xx\_salt0...$<$k$>$} & |
1747 |
|
$\stackrel{\mbox{read}}{\longrightarrow}$ & |
1748 |
\multicolumn{1}{|c|}{forward integration} & ~ & ~ \\ |
\multicolumn{1}{|c|}{forward integration} & ~ & ~ \\ |
1749 |
\multicolumn{1}{l}{\bf \vdots} & ~ & \multicolumn{1}{|c|}{~} |
\multicolumn{1}{l}{\bf \vdots} & ~ & \multicolumn{1}{|c|}{~} |
1750 |
& ~ & ~ \\ |
& ~ & ~ \\ |
1751 |
\cline{3-3} |
\cline{3-3} |
1752 |
~ & ~ & ~ & ~ & ~ \\ |
~ & ~ & $\downarrow$ & ~ & ~ \\ |
1753 |
\cline{3-3} |
\cline{3-3} |
1754 |
~ & ~ & |
~ & ~ & |
1755 |
\multicolumn{1}{|c|}{~} & ~ & |
\multicolumn{1}{|c|}{~} & ~ & |
1756 |
\multicolumn{1}{l}{\bf adxx\_theta0...$<$k$>$} \\ |
\multicolumn{1}{l}{\bf adxx\_theta0...$<$k$>$} \\ |
1757 |
~ & ~ & \multicolumn{1}{|c|}{adjoint integration} & |
~ & ~ & \multicolumn{1}{|c|}{adjoint integration} & |
1758 |
$\longrightarrow$ & |
$\stackrel{\mbox{write}}{\longrightarrow}$ & |
1759 |
\multicolumn{1}{l}{\bf adxx\_salt0...$<$k$>$} \\ |
\multicolumn{1}{l}{\bf adxx\_salt0...$<$k$>$} \\ |
1760 |
~ & ~ & \multicolumn{1}{|c|}{~} |
~ & ~ & \multicolumn{1}{|c|}{~} |
1761 |
& ~ & \multicolumn{1}{l}{\bf \vdots} \\ |
& ~ & \multicolumn{1}{l}{\bf \vdots} \\ |
1767 |
~ & ~ & ~ & ~ & {\big\downarrow} \\ |
~ & ~ & ~ & ~ & {\big\downarrow} \\ |
1768 |
~ & ~ & ~ & ~ & {\bf vector\_grad\_$<$k$>$ } \\ |
~ & ~ & ~ & ~ & {\bf vector\_grad\_$<$k$>$ } \\ |
1769 |
\end{tabular} |
\end{tabular} |
1770 |
|
} |
1771 |
|
|
1772 |
\vspace*{0.5cm} |
\vspace*{0.5cm} |
1773 |
|
|
1774 |
|
|
1775 |
{\it ctrl\_unpack} reads in the updated control vector |
{\it ctrl\_unpack} reads the updated control vector |
1776 |
{\bf vector\_ctrl\_$<$k$>$}. |
{\bf vector\_ctrl\_$<$k$>$}. |
1777 |
It distributes the different control variables to |
It distributes the different control variables to |
1778 |
2-dim. and 3-dim. files {\it xx\_...$<$k$>$}. |
2-dim. and 3-dim. files {\it xx\_...$<$k$>$}. |
1779 |
During the forward integration the control variables |
At the start of the forward integration the control variables |
1780 |
are read from {\it xx\_...$<$k$>$}. |
are read from {\it xx\_...$<$k$>$} and added to the |
1781 |
Correspondingly, the adjoint fields are written |
field. |
1782 |
|
Correspondingly, at the end of the adjoint integration |
1783 |
|
the adjoint fields are written |
1784 |
to {\it adxx\_...$<$k$>$}, again via the active file routines. |
to {\it adxx\_...$<$k$>$}, again via the active file routines. |
1785 |
Finally, {\it ctrl\_pack} collects all adjoint field files |
Finally, {\it ctrl\_pack} collects all adjoint files |
1786 |
and writes them to the compressed vector file |
and writes them to the compressed vector file |
1787 |
{\bf vector\_grad\_$<$k$>$}. |
{\bf vector\_grad\_$<$k$>$}. |
1788 |
|
|
1790 |
|
|
1791 |
|
|
1792 |
|
|
1793 |
\subsection{Flow directives and adjoint support routines} |
\subsection{Flow directives and adjoint support routines \label{section_flowdir}} |
1794 |
|
|
1795 |
\subsection{Store directives and checkpointing} |
\subsection{Store directives and checkpointing \label{section_checkpointing}} |
1796 |
|
|
1797 |
\subsection{Gradient checks} |
\subsection{Gradient checks \label{section_grdchk}} |
1798 |
|
|
1799 |
\subsection{Second derivative generation via TAMC} |
\subsection{Second derivative generation via TAMC} |
1800 |
|
|