/[MITgcm]/manual/s_autodiff/text/doc_ad_2.tex
ViewVC logotype

Annotation of /manual/s_autodiff/text/doc_ad_2.tex

Parent Directory Parent Directory | Revision Log Revision Log | View Revision Graph Revision Graph


Revision 1.22 - (hide annotations) (download) (as text)
Fri Aug 27 13:09:40 2010 UTC (14 years, 10 months ago) by jmc
Branch: MAIN
Changes since 1.21: +7 -7 lines
File MIME type: application/x-tex
changed for new directory path:
 s/input{part5\//input{s_autodiff\/text\//g
 /\\includegraphics.*{part5\//s/{part5\//{s_autodiff\/figs\//g
 /\\epsfig.*{.*part5\//s/part5\//s_autodiff\/figs\//g

1 jmc 1.22 % $Header: /u/gcmpack/manual/s_autodiff/text/doc_ad_2.tex,v 1.21 2008/01/17 22:32:06 heimbach Exp $
2 heimbach 1.2 % $Name: $
3 adcroft 1.1
4 heimbach 1.18 Author: Patrick Heimbach
5    
6 adcroft 1.1 {\sf Automatic differentiation} (AD), also referred to as algorithmic
7 edhill 1.20 (or, more loosely, computational) differentiation, involves
8     automatically deriving code to calculate partial derivatives from an
9     existing fully non-linear prognostic code. (see \cite{gri:00}). A
10     software tool is used that parses and transforms source files
11     according to a set of linguistic and mathematical rules. AD tools are
12     like source-to-source translators in that they parse a program code as
13 heimbach 1.21 input and produce a new program code as output
14     (we restrict our discussion to source-to-source tools, ignoring
15     operator-overloading tools). However, unlike a
16 edhill 1.20 pure source-to-source translation, the output program represents a new
17     algorithm, such as the evaluation of the Jacobian, the Hessian, or
18     higher derivative operators. In principle, a variety of derived
19     algorithms can be generated automatically in this way.
20    
21     MITgcm has been adapted for use with the Tangent linear and Adjoint
22     Model Compiler (TAMC) and its successor TAF (Transformation of
23     Algorithms in Fortran), developed by Ralf Giering (\cite{gie-kam:98},
24     \cite{gie:99,gie:00}). The first application of the adjoint of MITgcm
25     for sensitivity studies has been published by \cite{maro-eta:99}.
26 heimbach 1.21 \cite{stam-etal:97,stam-etal:02} use MITgcm and its adjoint for ocean
27 edhill 1.20 state estimation studies. In the following we shall refer to TAMC and
28     TAF synonymously, except were explicitly stated otherwise.
29    
30 heimbach 1.21 As of mid-2007 we are also able to generate fairly efficient
31     adjoint code of the MITgcm using a new, open-source AD tool,
32     called OpenAD (see \cite{naum-etal:06,utke-etal:08}.
33     This enables us for the first time to compare adjoint models
34     generated from different AD tools, providing an additional
35     accuracy check, complementary to finite-difference gradient checks.
36     OpenAD and its application to MITgcm is described in detail
37     in section \ref{sec_ad_openad}.
38    
39     The AD tool exploits the chain rule for computing the first derivative of a
40 edhill 1.20 function with respect to a set of input variables. Treating a given
41     forward code as a composition of operations -- each line representing
42     a compositional element, the chain rule is rigorously applied to the
43     code, line by line. The resulting tangent linear or adjoint code,
44     then, may be thought of as the composition in forward or reverse
45     order, respectively, of the Jacobian matrices of the forward code's
46     compositional elements.
47 adcroft 1.1
48     %**********************************************************************
49     \section{Some basic algebra}
50     \label{sec_ad_algebra}
51 edhill 1.17 \begin{rawhtml}
52     <!-- CMIREDIR:sec_ad_algebra: -->
53     \end{rawhtml}
54 adcroft 1.1 %**********************************************************************
55    
56     Let $ \cal{M} $ be a general nonlinear, model, i.e. a
57     mapping from the $m$-dimensional space
58     $U \subset I\!\!R^m$ of input variables
59     $\vec{u}=(u_1,\ldots,u_m)$
60     (model parameters, initial conditions, boundary conditions
61     such as forcing functions) to the $n$-dimensional space
62     $V \subset I\!\!R^n$ of
63     model output variable $\vec{v}=(v_1,\ldots,v_n)$
64 cnh 1.7 (model state, model diagnostics, objective function, ...)
65 adcroft 1.1 under consideration,
66     %
67     \begin{equation}
68     \begin{split}
69     {\cal M} \, : & \, U \,\, \longrightarrow \, V \\
70     ~ & \, \vec{u} \,\, \longmapsto \, \vec{v} \, = \,
71     {\cal M}(\vec{u})
72     \label{fulloperator}
73     \end{split}
74     \end{equation}
75     %
76     The vectors $ \vec{u} \in U $ and $ v \in V $ may be represented w.r.t.
77     some given basis vectors
78     $ {\rm span} (U) = \{ {\vec{e}_i} \}_{i = 1, \ldots , m} $ and
79     $ {\rm span} (V) = \{ {\vec{f}_j} \}_{j = 1, \ldots , n} $ as
80     \[
81     \vec{u} \, = \, \sum_{i=1}^{m} u_i \, {\vec{e}_i},
82     \qquad
83     \vec{v} \, = \, \sum_{j=1}^{n} v_j \, {\vec{f}_j}
84     \]
85    
86     Two routes may be followed to determine the sensitivity of the
87     output variable $\vec{v}$ to its input $\vec{u}$.
88    
89     \subsection{Forward or direct sensitivity}
90     %
91     Consider a perturbation to the input variables $\delta \vec{u}$
92     (typically a single component
93     $\delta \vec{u} = \delta u_{i} \, {\vec{e}_{i}}$).
94     Their effect on the output may be obtained via the linear
95     approximation of the model $ {\cal M}$ in terms of its Jacobian matrix
96     $ M $, evaluated in the point $u^{(0)}$ according to
97     %
98     \begin{equation}
99     \delta \vec{v} \, = \, M |_{\vec{u}^{(0)}} \, \delta \vec{u}
100     \label{tangent_linear}
101     \end{equation}
102     with resulting output perturbation $\delta \vec{v}$.
103     In components
104     $M_{j i} \, = \, \partial {\cal M}_{j} / \partial u_{i} $,
105     it reads
106     %
107     \begin{equation}
108     \delta v_{j} \, = \, \sum_{i}
109     \left. \frac{\partial {\cal M}_{j}}{\partial u_{i}} \right|_{u^{(0)}} \,
110     \delta u_{i}
111     \label{jacobi_matrix}
112     \end{equation}
113     %
114     Eq. (\ref{tangent_linear}) is the {\sf tangent linear model (TLM)}.
115     In contrast to the full nonlinear model $ {\cal M} $, the operator
116     $ M $ is just a matrix
117     which can readily be used to find the forward sensitivity of $\vec{v}$ to
118     perturbations in $u$,
119 heimbach 1.4 but if there are very many input variables $(\gg O(10^{6})$ for
120 adcroft 1.1 large-scale oceanographic application), it quickly becomes
121     prohibitive to proceed directly as in (\ref{tangent_linear}),
122     if the impact of each component $ {\bf e_{i}} $ is to be assessed.
123    
124     \subsection{Reverse or adjoint sensitivity}
125     %
126     Let us consider the special case of a
127     scalar objective function ${\cal J}(\vec{v})$ of the model output (e.g.
128     the total meridional heat transport,
129     the total uptake of $CO_{2}$ in the Southern
130     Ocean over a time interval,
131     or a measure of some model-to-data misfit)
132     %
133     \begin{eqnarray}
134     \begin{array}{cccccc}
135     {\cal J} \, : & U &
136     \longrightarrow & V &
137     \longrightarrow & I \!\! R \\
138     ~ & \vec{u} & \longmapsto & \vec{v}={\cal M}(\vec{u}) &
139     \longmapsto & {\cal J}(\vec{u}) = {\cal J}({\cal M}(\vec{u}))
140     \end{array}
141     \label{compo}
142     \end{eqnarray}
143     %
144 heimbach 1.4 The perturbation of $ {\cal J} $ around a fixed point $ {\cal J}_0 $,
145 adcroft 1.1 \[
146 heimbach 1.4 {\cal J} \, = \, {\cal J}_0 \, + \, \delta {\cal J}
147 adcroft 1.1 \]
148     can be expressed in both bases of $ \vec{u} $ and $ \vec{v} $
149     w.r.t. their corresponding inner product
150     $\left\langle \,\, , \,\, \right\rangle $
151     %
152     \begin{equation}
153     \begin{split}
154     {\cal J} & = \,
155     {\cal J} |_{\vec{u}^{(0)}} \, + \,
156     \left\langle \, \nabla _{u}{\cal J}^T |_{\vec{u}^{(0)}} \, , \, \delta \vec{u} \, \right\rangle
157     \, + \, O(\delta \vec{u}^2) \\
158     ~ & = \,
159     {\cal J} |_{\vec{v}^{(0)}} \, + \,
160     \left\langle \, \nabla _{v}{\cal J}^T |_{\vec{v}^{(0)}} \, , \, \delta \vec{v} \, \right\rangle
161     \, + \, O(\delta \vec{v}^2)
162     \end{split}
163     \label{deljidentity}
164     \end{equation}
165     %
166 heimbach 1.2 (note, that the gradient $ \nabla f $ is a co-vector, therefore
167 adcroft 1.1 its transpose is required in the above inner product).
168     Then, using the representation of
169     $ \delta {\cal J} =
170     \left\langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \right\rangle $,
171     the definition
172     of an adjoint operator $ A^{\ast} $ of a given operator $ A $,
173     \[
174     \left\langle \, A^{\ast} \vec{x} \, , \, \vec{y} \, \right\rangle =
175     \left\langle \, \vec{x} \, , \, A \vec{y} \, \right\rangle
176     \]
177     which for finite-dimensional vector spaces is just the
178     transpose of $ A $,
179     \[
180     A^{\ast} \, = \, A^T
181     \]
182 heimbach 1.4 and from eq. (\ref{tangent_linear}), (\ref{deljidentity}),
183     we note that
184 adcroft 1.1 (omitting $|$'s):
185     %
186     \begin{equation}
187     \delta {\cal J}
188     \, = \,
189     \left\langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \right\rangle
190     \, = \,
191     \left\langle \, \nabla _{v}{\cal J}^T \, , \, M \, \delta \vec{u} \, \right\rangle
192     \, = \,
193     \left\langle \, M^T \, \nabla _{v}{\cal J}^T \, , \,
194     \delta \vec{u} \, \right\rangle
195     \label{inner}
196     \end{equation}
197     %
198     With the identity (\ref{deljidentity}), we then find that
199     the gradient $ \nabla _{u}{\cal J} $ can be readily inferred by
200     invoking the adjoint $ M^{\ast } $ of the tangent linear model $ M $
201     %
202     \begin{equation}
203     \begin{split}
204     \nabla _{u}{\cal J}^T |_{\vec{u}} &
205     = \, M^T |_{\vec{u}} \cdot \nabla _{v}{\cal J}^T |_{\vec{v}} \\
206     ~ & = \, M^T |_{\vec{u}} \cdot \delta \vec{v}^{\ast} \\
207     ~ & = \, \delta \vec{u}^{\ast}
208     \end{split}
209     \label{adjoint}
210     \end{equation}
211     %
212     Eq. (\ref{adjoint}) is the {\sf adjoint model (ADM)},
213     in which $M^T$ is the adjoint (here, the transpose) of the
214     tangent linear operator $M$, $ \delta \vec{v}^{\ast} $
215     the adjoint variable of the model state $ \vec{v} $, and
216     $ \delta \vec{u}^{\ast} $ the adjoint variable of the control variable $ \vec{u} $.
217    
218     The {\sf reverse} nature of the adjoint calculation can be readily
219 heimbach 1.4 seen as follows.
220     Consider a model integration which consists of $ \Lambda $
221     consecutive operations
222     $ {\cal M}_{\Lambda} ( {\cal M}_{\Lambda-1} (
223     ...... ( {\cal M}_{\lambda} (
224     ......
225     ( {\cal M}_{1} ( {\cal M}_{0}(\vec{u}) )))) $,
226     where the ${\cal M}$'s could be the elementary steps, i.e. single lines
227     in the code of the model, or successive time steps of the
228     model integration,
229     starting at step 0 and moving up to step $\Lambda$, with intermediate
230     ${\cal M}_{\lambda} (\vec{u}) = \vec{v}^{(\lambda+1)}$ and final
231     ${\cal M}_{\Lambda} (\vec{u}) = \vec{v}^{(\Lambda+1)} = \vec{v}$.
232 cnh 1.7 Let ${\cal J}$ be a cost function which explicitly depends on the
233 heimbach 1.4 final state $\vec{v}$ only
234     (this restriction is for clarity reasons only).
235     %
236     ${\cal J}(u)$ may be decomposed according to:
237 adcroft 1.1 %
238     \begin{equation}
239     {\cal J}({\cal M}(\vec{u})) \, = \,
240     {\cal J} ( {\cal M}_{\Lambda} ( {\cal M}_{\Lambda-1} (
241     ...... ( {\cal M}_{\lambda} (
242     ......
243     ( {\cal M}_{1} ( {\cal M}_{0}(\vec{u}) )))))
244     \label{compos}
245     \end{equation}
246     %
247 heimbach 1.4 Then, according to the chain rule, the forward calculation reads,
248     in terms of the Jacobi matrices
249 adcroft 1.1 (we've omitted the $ | $'s which, nevertheless are important
250     to the aspect of {\it tangent} linearity;
251 heimbach 1.4 note also that by definition
252 adcroft 1.1 $ \langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \rangle
253     = \nabla_v {\cal J} \cdot \delta \vec{v} $ )
254     %
255     \begin{equation}
256     \begin{split}
257     \nabla_v {\cal J} (M(\delta \vec{u})) & = \,
258     \nabla_v {\cal J} \cdot M_{\Lambda}
259     \cdot ...... \cdot M_{\lambda} \cdot ...... \cdot
260     M_{1} \cdot M_{0} \cdot \delta \vec{u} \\
261     ~ & = \, \nabla_v {\cal J} \cdot \delta \vec{v} \\
262     \end{split}
263     \label{forward}
264     \end{equation}
265     %
266     whereas in reverse mode we have
267     %
268     \begin{equation}
269     \boxed{
270     \begin{split}
271     M^T ( \nabla_v {\cal J}^T) & = \,
272     M_{0}^T \cdot M_{1}^T
273     \cdot ...... \cdot M_{\lambda}^T \cdot ...... \cdot
274     M_{\Lambda}^T \cdot \nabla_v {\cal J}^T \\
275     ~ & = \, M_{0}^T \cdot M_{1}^T
276     \cdot ...... \cdot
277     \nabla_{v^{(\lambda)}} {\cal J}^T \\
278     ~ & = \, \nabla_u {\cal J}^T
279     \end{split}
280     }
281     \label{reverse}
282     \end{equation}
283     %
284     clearly expressing the reverse nature of the calculation.
285     Eq. (\ref{reverse}) is at the heart of automatic adjoint compilers.
286 heimbach 1.4 If the intermediate steps $\lambda$ in
287 adcroft 1.1 eqn. (\ref{compos}) -- (\ref{reverse})
288 heimbach 1.4 represent the model state (forward or adjoint) at each
289     intermediate time step as noted above, then correspondingly,
290     $ M^T (\delta \vec{v}^{(\lambda) \, \ast}) =
291     \delta \vec{v}^{(\lambda-1) \, \ast} $ for the adjoint variables.
292     It thus becomes evident that the adjoint calculation also
293     yields the adjoint of each model state component
294     $ \vec{v}^{(\lambda)} $ at each intermediate step $ \lambda $, namely
295 adcroft 1.1 %
296     \begin{equation}
297     \boxed{
298     \begin{split}
299     \nabla_{v^{(\lambda)}} {\cal J}^T |_{\vec{v}^{(\lambda)}}
300     & = \,
301     M_{\lambda}^T |_{\vec{v}^{(\lambda)}} \cdot ...... \cdot
302     M_{\Lambda}^T |_{\vec{v}^{(\lambda)}} \cdot \delta \vec{v}^{\ast} \\
303     ~ & = \, \delta \vec{v}^{(\lambda) \, \ast}
304     \end{split}
305     }
306     \end{equation}
307     %
308     in close analogy to eq. (\ref{adjoint})
309     We note in passing that that the $\delta \vec{v}^{(\lambda) \, \ast}$
310 heimbach 1.4 are the Lagrange multipliers of the model equations which determine
311     $ \vec{v}^{(\lambda)}$.
312 adcroft 1.1
313 cnh 1.7 In components, eq. (\ref{adjoint}) reads as follows.
314 adcroft 1.1 Let
315     \[
316     \begin{array}{rclcrcl}
317     \delta \vec{u} & = &
318     \left( \delta u_1,\ldots, \delta u_m \right)^T , & \qquad &
319     \delta \vec{u}^{\ast} \,\, = \,\, \nabla_u {\cal J}^T & = &
320     \left(
321     \frac{\partial {\cal J}}{\partial u_1},\ldots,
322     \frac{\partial {\cal J}}{\partial u_m}
323     \right)^T \\
324     \delta \vec{v} & = &
325     \left( \delta v_1,\ldots, \delta u_n \right)^T , & \qquad &
326     \delta \vec{v}^{\ast} \,\, = \,\, \nabla_v {\cal J}^T & = &
327     \left(
328     \frac{\partial {\cal J}}{\partial v_1},\ldots,
329     \frac{\partial {\cal J}}{\partial v_n}
330     \right)^T \\
331     \end{array}
332     \]
333     denote the perturbations in $\vec{u}$ and $\vec{v}$, respectively,
334 cnh 1.7 and their adjoint variables;
335 adcroft 1.1 further
336     \[
337     M \, = \, \left(
338     \begin{array}{ccc}
339     \frac{\partial {\cal M}_1}{\partial u_1} & \ldots &
340     \frac{\partial {\cal M}_1}{\partial u_m} \\
341     \vdots & ~ & \vdots \\
342     \frac{\partial {\cal M}_n}{\partial u_1} & \ldots &
343     \frac{\partial {\cal M}_n}{\partial u_m} \\
344     \end{array}
345     \right)
346     \]
347     is the Jacobi matrix of $ {\cal M} $
348     (an $ n \times m $ matrix)
349     such that $ \delta \vec{v} = M \cdot \delta \vec{u} $, or
350     \[
351     \delta v_{j}
352     \, = \, \sum_{i=1}^m M_{ji} \, \delta u_{i}
353     \, = \, \sum_{i=1}^m \, \frac{\partial {\cal M}_{j}}{\partial u_{i}}
354     \delta u_{i}
355     \]
356     %
357     Then eq. (\ref{adjoint}) takes the form
358     \[
359     \delta u_{i}^{\ast}
360     \, = \, \sum_{j=1}^n M_{ji} \, \delta v_{j}^{\ast}
361     \, = \, \sum_{j=1}^n \, \frac{\partial {\cal M}_{j}}{\partial u_{i}}
362     \delta v_{j}^{\ast}
363     \]
364     %
365     or
366     %
367     \[
368     \left(
369     \begin{array}{c}
370     \left. \frac{\partial}{\partial u_1} {\cal J} \right|_{\vec{u}^{(0)}} \\
371     \vdots \\
372     \left. \frac{\partial}{\partial u_m} {\cal J} \right|_{\vec{u}^{(0)}} \\
373     \end{array}
374     \right)
375     \, = \,
376     \left(
377     \begin{array}{ccc}
378     \left. \frac{\partial {\cal M}_1}{\partial u_1} \right|_{\vec{u}^{(0)}}
379     & \ldots &
380     \left. \frac{\partial {\cal M}_n}{\partial u_1} \right|_{\vec{u}^{(0)}} \\
381     \vdots & ~ & \vdots \\
382     \left. \frac{\partial {\cal M}_1}{\partial u_m} \right|_{\vec{u}^{(0)}}
383     & \ldots &
384     \left. \frac{\partial {\cal M}_n}{\partial u_m} \right|_{\vec{u}^{(0)}} \\
385     \end{array}
386     \right)
387     \cdot
388     \left(
389     \begin{array}{c}
390     \left. \frac{\partial}{\partial v_1} {\cal J} \right|_{\vec{v}} \\
391     \vdots \\
392     \left. \frac{\partial}{\partial v_n} {\cal J} \right|_{\vec{v}} \\
393     \end{array}
394     \right)
395     \]
396     %
397     Furthermore, the adjoint $ \delta v^{(\lambda) \, \ast} $
398     of any intermediate state $ v^{(\lambda)} $
399     may be obtained, using the intermediate Jacobian
400     (an $ n_{\lambda+1} \times n_{\lambda} $ matrix)
401     %
402     \[
403     M_{\lambda} \, = \,
404     \left(
405     \begin{array}{ccc}
406     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1}
407     & \ldots &
408     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
409     \vdots & ~ & \vdots \\
410     \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1}
411     & \ldots &
412     \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
413     \end{array}
414     \right)
415     \]
416     %
417     and the shorthand notation for the adjoint variables
418     $ \delta v^{(\lambda) \, \ast}_{j} = \frac{\partial}{\partial v^{(\lambda)}_{j}}
419     {\cal J}^T $, $ j = 1, \ldots , n_{\lambda} $,
420     for intermediate components, yielding
421 heimbach 1.4 \begin{equation}
422     \small
423     \begin{split}
424 adcroft 1.1 \left(
425     \begin{array}{c}
426     \delta v^{(\lambda) \, \ast}_1 \\
427     \vdots \\
428     \delta v^{(\lambda) \, \ast}_{n_{\lambda}} \\
429     \end{array}
430     \right)
431 heimbach 1.4 \, = &
432 adcroft 1.1 \left(
433     \begin{array}{ccc}
434     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1}
435 heimbach 1.4 & \ldots \,\, \ldots &
436 adcroft 1.1 \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1} \\
437     \vdots & ~ & \vdots \\
438     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}}
439 heimbach 1.4 & \ldots \,\, \ldots &
440 adcroft 1.1 \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
441     \end{array}
442     \right)
443 heimbach 1.4 \cdot
444 adcroft 1.1 %
445 heimbach 1.4 \\ ~ & ~
446     \\ ~ &
447 adcroft 1.1 %
448     \left(
449     \begin{array}{ccc}
450     \frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_1}
451     & \ldots &
452     \frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_1} \\
453     \vdots & ~ & \vdots \\
454     \vdots & ~ & \vdots \\
455     \frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}}
456     & \ldots &
457     \frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}} \\
458     \end{array}
459     \right)
460 heimbach 1.4 \cdot \, \ldots \, \cdot
461 adcroft 1.1 \left(
462     \begin{array}{c}
463     \delta v^{\ast}_1 \\
464     \vdots \\
465     \delta v^{\ast}_{n} \\
466     \end{array}
467     \right)
468 heimbach 1.4 \end{split}
469     \end{equation}
470 adcroft 1.1
471     Eq. (\ref{forward}) and (\ref{reverse}) are perhaps clearest in
472     showing the advantage of the reverse over the forward mode
473     if the gradient $\nabla _{u}{\cal J}$, i.e. the sensitivity of the
474     cost function $ {\cal J} $ with respect to {\it all} input
475     variables $u$
476     (or the sensitivity of the cost function with respect to
477     {\it all} intermediate states $ \vec{v}^{(\lambda)} $) are sought.
478     In order to be able to solve for each component of the gradient
479     $ \partial {\cal J} / \partial u_{i} $ in (\ref{forward})
480 cnh 1.7 a forward calculation has to be performed for each component separately,
481 adcroft 1.1 i.e. $ \delta \vec{u} = \delta u_{i} {\vec{e}_{i}} $
482     for the $i$-th forward calculation.
483     Then, (\ref{forward}) represents the
484     projection of $ \nabla_u {\cal J} $ onto the $i$-th component.
485     The full gradient is retrieved from the $ m $ forward calculations.
486     In contrast, eq. (\ref{reverse}) yields the full
487     gradient $\nabla _{u}{\cal J}$ (and all intermediate gradients
488     $\nabla _{v^{(\lambda)}}{\cal J}$) within a single reverse calculation.
489    
490 heimbach 1.4 Note, that if $ {\cal J} $ is a vector-valued function
491 adcroft 1.1 of dimension $ l > 1 $,
492     eq. (\ref{reverse}) has to be modified according to
493     \[
494     M^T \left( \nabla_v {\cal J}^T \left(\delta \vec{J}\right) \right)
495     \, = \,
496     \nabla_u {\cal J}^T \cdot \delta \vec{J}
497     \]
498 heimbach 1.4 where now $ \delta \vec{J} \in I\!\!R^l $ is a vector of
499 cnh 1.7 dimension $ l $.
500 adcroft 1.1 In this case $ l $ reverse simulations have to be performed
501     for each $ \delta J_{k}, \,\, k = 1, \ldots, l $.
502     Then, the reverse mode is more efficient as long as
503     $ l < n $, otherwise the forward mode is preferable.
504 cnh 1.7 Strictly, the reverse mode is called adjoint mode only for
505 adcroft 1.1 $ l = 1 $.
506    
507     A detailed analysis of the underlying numerical operations
508     shows that the computation of $\nabla _{u}{\cal J}$ in this way
509     requires about 2 to 5 times the computation of the cost function.
510     Alternatively, the gradient vector could be approximated
511     by finite differences, requiring $m$ computations
512     of the perturbed cost function.
513    
514     To conclude we give two examples of commonly used types
515     of cost functions:
516    
517     \paragraph{Example 1:
518     $ {\cal J} = v_{j} (T) $} ~ \\
519     The cost function consists of the $j$-th component of the model state
520     $ \vec{v} $ at time $T$.
521     Then $ \nabla_v {\cal J}^T = {\vec{f}_{j}} $ is just the $j$-th
522     unit vector. The $ \nabla_u {\cal J}^T $
523     is the projection of the adjoint
524     operator onto the $j$-th component ${\bf f_{j}}$,
525     \[
526     \nabla_u {\cal J}^T
527     \, = \, M^T \cdot \nabla_v {\cal J}^T
528     \, = \, \sum_{i} M^T_{ji} \, {\vec{e}_{i}}
529     \]
530    
531     \paragraph{Example 2:
532     $ {\cal J} = \langle \, {\cal H}(\vec{v}) - \vec{d} \, ,
533     \, {\cal H}(\vec{v}) - \vec{d} \, \rangle $} ~ \\
534 heimbach 1.4 The cost function represents the quadratic model vs. data misfit.
535 adcroft 1.1 Here, $ \vec{d} $ is the data vector and $ {\cal H} $ represents the
536     operator which maps the model state space onto the data space.
537     Then, $ \nabla_v {\cal J} $ takes the form
538     %
539     \begin{equation*}
540     \begin{split}
541     \nabla_v {\cal J}^T & = \, 2 \, \, H \cdot
542     \left( \, {\cal H}(\vec{v}) - \vec{d} \, \right) \\
543     ~ & = \, 2 \sum_{j} \left\{ \sum_k
544     \frac{\partial {\cal H}_k}{\partial v_{j}}
545     \left( {\cal H}_k (\vec{v}) - d_k \right)
546     \right\} \, {\vec{f}_{j}} \\
547     \end{split}
548     \end{equation*}
549     %
550     where $H_{kj} = \partial {\cal H}_k / \partial v_{j} $ is the
551     Jacobi matrix of the data projection operator.
552     Thus, the gradient $ \nabla_u {\cal J} $ is given by the
553     adjoint operator,
554     driven by the model vs. data misfit:
555     \[
556     \nabla_u {\cal J}^T \, = \, 2 \, M^T \cdot
557     H \cdot \left( {\cal H}(\vec{v}) - \vec{d} \, \right)
558     \]
559    
560     \subsection{Storing vs. recomputation in reverse mode}
561     \label{checkpointing}
562    
563     We note an important aspect of the forward vs. reverse
564     mode calculation.
565 heimbach 1.4 Because of the local character of the derivative
566     (a derivative is defined w.r.t. a point along the trajectory),
567 adcroft 1.1 the intermediate results of the model trajectory
568     $\vec{v}^{(\lambda+1)}={\cal M}_{\lambda}(v^{(\lambda)})$
569 heimbach 1.15 may be required to evaluate the intermediate Jacobian
570 adcroft 1.1 $M_{\lambda}|_{\vec{v}^{(\lambda)}} \, \delta \vec{v}^{(\lambda)} $.
571 heimbach 1.15 This is the case e.g. for nonlinear expressions
572     (momentum advection, nonlinear equation of state), state-dependent
573     conditional statements (parameterization schemes).
574 adcroft 1.1 In the forward mode, the intermediate results are required
575     in the same order as computed by the full forward model ${\cal M}$,
576 heimbach 1.4 but in the reverse mode they are required in the reverse order.
577 adcroft 1.1 Thus, in the reverse mode the trajectory of the forward model
578     integration ${\cal M}$ has to be stored to be available in the reverse
579 heimbach 1.4 calculation. Alternatively, the complete model state up to the
580     point of evaluation has to be recomputed whenever its value is required.
581 adcroft 1.1
582     A method to balance the amount of recomputations vs.
583     storage requirements is called {\sf checkpointing}
584 heimbach 1.15 (e.g. \cite{gri:92}, \cite{res-eta:98}).
585 adcroft 1.6 It is depicted in \ref{fig:3levelcheck} for a 3-level checkpointing
586 heimbach 1.4 [as an example, we give explicit numbers for a 3-day
587 adcroft 1.1 integration with a 1-hourly timestep in square brackets].
588     \begin{itemize}
589     %
590     \item [$lev3$]
591     In a first step, the model trajectory is subdivided into
592     $ {n}^{lev3} $ subsections [$ {n}^{lev3} $=3 1-day intervals],
593     with the label $lev3$ for this outermost loop.
594 heimbach 1.4 The model is then integrated along the full trajectory,
595 heimbach 1.15 and the model state stored to disk only at every $ k_{i}^{lev3} $-th timestep
596 adcroft 1.1 [i.e. 3 times, at
597     $ i = 0,1,2 $ corresponding to $ k_{i}^{lev3} = 0, 24, 48 $].
598 heimbach 1.15 In addition, the cost function is computed, if needed.
599 adcroft 1.1 %
600     \item [$lev2$]
601 heimbach 1.4 In a second step each subsection itself is divided into
602 heimbach 1.15 $ {n}^{lev2} $ subsections
603 adcroft 1.1 [$ {n}^{lev2} $=4 6-hour intervals per subsection].
604     The model picks up at the last outermost dumped state
605 heimbach 1.4 $ v_{k_{n}^{lev3}} $ and is integrated forward in time along
606 adcroft 1.1 the last subsection, with the label $lev2$ for this
607     intermediate loop.
608 heimbach 1.15 The model state is now stored to disk at every $ k_{i}^{lev2} $-th
609 adcroft 1.1 timestep
610     [i.e. 4 times, at
611     $ i = 0,1,2,3 $ corresponding to $ k_{i}^{lev2} = 48, 54, 60, 66 $].
612     %
613     \item [$lev1$]
614 heimbach 1.4 Finally, the model picks up at the last intermediate dump state
615     $ v_{k_{n}^{lev2}} $ and is integrated forward in time along
616 heimbach 1.15 the last subsection, with the label $lev1$ for this
617 adcroft 1.1 intermediate loop.
618 heimbach 1.15 Within this sub-subsection only, parts of the model state is stored
619     to memory at every timestep
620 adcroft 1.1 [i.e. every hour $ i=0,...,5$ corresponding to
621     $ k_{i}^{lev1} = 66, 67, \ldots, 71 $].
622 heimbach 1.15 The final state $ v_n = v_{k_{n}^{lev1}} $ is reached
623     and the model state of all preceding timesteps along the last
624     innermost subsection are available, enabling integration backwards
625     in time along the last subsection.
626     The adjoint can thus be computed along this last
627     subsection $k_{n}^{lev2}$.
628 adcroft 1.1 %
629     \end{itemize}
630     %
631     This procedure is repeated consecutively for each previous
632 heimbach 1.15 subsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $
633 adcroft 1.1 carrying the adjoint computation to the initial time
634     of the subsection $k_{n}^{lev3}$.
635     Then, the procedure is repeated for the previous subsection
636     $k_{n-1}^{lev3}$
637     carrying the adjoint computation to the initial time
638     $k_{1}^{lev3}$.
639    
640     For the full model trajectory of
641     $ n^{lev3} \cdot n^{lev2} \cdot n^{lev1} $ timesteps
642     the required storing of the model state was significantly reduced to
643 heimbach 1.15 $ n^{lev2} + n^{lev3} $ to disk and roughly $ n^{lev1} $ to memory
644 adcroft 1.1 [i.e. for the 3-day integration with a total oof 72 timesteps
645 heimbach 1.15 the model state was stored 7 times to disk and roughly 6 times
646     to memory].
647 adcroft 1.1 This saving in memory comes at a cost of a required
648     3 full forward integrations of the model (one for each
649     checkpointing level).
650 heimbach 1.15 The optimal balance of storage vs. recomputation certainly depends
651     on the computing resources available and may be adjusted by
652     adjusting the partitioning among the
653     $ n^{lev3}, \,\, n^{lev2}, \,\, n^{lev1} $.
654 adcroft 1.1
655     \begin{figure}[t!]
656 adcroft 1.6 \begin{center}
657 adcroft 1.1 %\psdraft
658 adcroft 1.6 %\psfrag{v_k1^lev3}{\mathinfigure{v_{k_{1}^{lev3}}}}
659     %\psfrag{v_kn-1^lev3}{\mathinfigure{v_{k_{n-1}^{lev3}}}}
660     %\psfrag{v_kn^lev3}{\mathinfigure{v_{k_{n}^{lev3}}}}
661     %\psfrag{v_k1^lev2}{\mathinfigure{v_{k_{1}^{lev2}}}}
662     %\psfrag{v_kn-1^lev2}{\mathinfigure{v_{k_{n-1}^{lev2}}}}
663     %\psfrag{v_kn^lev2}{\mathinfigure{v_{k_{n}^{lev2}}}}
664     %\psfrag{v_k1^lev1}{\mathinfigure{v_{k_{1}^{lev1}}}}
665     %\psfrag{v_kn^lev1}{\mathinfigure{v_{k_{n}^{lev1}}}}
666 jmc 1.22 %\mbox{\epsfig{file=s_autodiff/figs/checkpointing.eps, width=0.8\textwidth}}
667     \resizebox{5.5in}{!}{\includegraphics{s_autodiff/figs/checkpointing.eps}}
668 adcroft 1.1 %\psfull
669 adcroft 1.6 \end{center}
670     \caption{
671     Schematic view of intermediate dump and restart for
672 adcroft 1.1 3-level checkpointing.}
673 heimbach 1.4 \label{fig:3levelcheck}
674 adcroft 1.1 \end{figure}
675    
676 heimbach 1.4 % \subsection{Optimal perturbations}
677     % \label{sec_optpert}
678 adcroft 1.1
679    
680 heimbach 1.4 % \subsection{Error covariance estimate and Hessian matrix}
681     % \label{sec_hessian}
682 adcroft 1.1
683     \newpage
684    
685     %**********************************************************************
686 heimbach 1.4 \section{TLM and ADM generation in general}
687 adcroft 1.1 \label{sec_ad_setup_gen}
688 edhill 1.17 \begin{rawhtml}
689     <!-- CMIREDIR:sec_ad_setup_gen: -->
690     \end{rawhtml}
691 adcroft 1.1 %**********************************************************************
692    
693     In this section we describe in a general fashion
694     the parts of the code that are relevant for automatic
695 heimbach 1.21 differentiation using the software tool TAF.
696     Modifications to use OpenAD are described in \ref{sec_ad_openad}.
697 adcroft 1.1
698 jmc 1.22 \input{s_autodiff/text/doc_ad_the_model}
699 heimbach 1.4
700 adcroft 1.6 The basic flow is depicted in \ref{fig:adthemodel}.
701 heimbach 1.19 If CPP option \texttt{ALLOW\_AUTODIFF\_TAMC} is defined,
702     the driver routine
703 heimbach 1.4 {\it the\_model\_main}, instead of calling {\it the\_main\_loop},
704 heimbach 1.19 invokes the adjoint of this routine, {\it adthe\_main\_loop}
705     (case \texttt{\#define ALLOW\_ADJOINT\_RUN}), or
706     the tangent linear of this routine {\it g\_the\_main\_loop}
707     (case \texttt{\#define ALLOW\_TANGENTLINEAR\_RUN}),
708     which are the toplevel routines in terms of automatic differentiation.
709     The routines {\it adthe\_main\_loop} or {\it g\_the\_main\_loop}
710     are generated by TAF.
711 heimbach 1.16 It contains both the forward integration of the full model, the
712     cost function calculation,
713 heimbach 1.4 any additional storing that is required for efficient checkpointing,
714     and the reverse integration of the adjoint model.
715 heimbach 1.16
716     [DESCRIBE IN A SEPARATE SECTION THE WORKING OF THE TLM]
717    
718     In Fig. \ref{fig:adthemodel}
719     the structure of {\it adthe\_main\_loop} has been strongly
720     simplified to focus on the essentials; in particular, no checkpointing
721 heimbach 1.4 procedures are shown here.
722     Prior to the call of {\it adthe\_main\_loop}, the routine
723 heimbach 1.15 {\it ctrl\_unpack} is invoked to unpack the control vector
724     or initialise the control variables.
725     Following the call of {\it adthe\_main\_loop},
726     the routine {\it ctrl\_pack}
727 heimbach 1.4 is invoked to pack the control vector
728     (cf. Section \ref{section_ctrl}).
729     If gradient checks are to be performed, the option
730     {\tt ALLOW\_GRADIENT\_CHECK} is defined. In this case
731     the driver routine {\it grdchk\_main} is called after
732     the gradient has been computed via the adjoint
733     (cf. Section \ref{section_grdchk}).
734    
735 heimbach 1.16 %------------------------------------------------------------------
736    
737     \subsection{General setup
738     \label{section_ad_setup}}
739    
740     In order to configure AD-related setups the following packages need
741     to be enabled:
742     {\it
743     \begin{table}[h!]
744     \begin{tabular}{l}
745     autodiff \\
746     ctrl \\
747     cost \\
748     grdchk \\
749     \end{tabular}
750     \end{table}
751     }
752     The packages are enabled by adding them to your experiment-specific
753     configuration file
754     {\it packages.conf} (see Section ???).
755    
756     The following AD-specific CPP option files need to be customized:
757     %
758     \begin{itemize}
759     %
760     \item {\it ECCO\_CPPOPTIONS.h} \\
761     This header file collects CPP options for the packages
762     {\it autodiff, cost, ctrl} as well as AD-unrelated options for
763     the external forcing package {\it exf}.
764     \footnote{NOTE: These options are not set in their package-specific
765     headers such as {\it COST\_CPPOPTIONS.h}, but are instead collected
766     in the single header file {\it ECCO\_CPPOPTIONS.h}.
767     The package-specific header files serve as simple
768     placeholders at this point.}
769     %
770     \item {\it tamc.h} \\
771     This header configures the splitting of the time stepping loop
772     w.r.t. the 3-level checkpointing (see section ???).
773    
774     %
775     \end{itemize}
776    
777     %------------------------------------------------------------------
778    
779 heimbach 1.21 \subsection{Building the AD code using TAF
780 heimbach 1.16 \label{section_ad_build}}
781    
782     The build process of an AD code is very similar to building
783     the forward model. However, depending on which AD code one wishes
784     to generate, and on which AD tool is available (TAF or TAMC),
785     the following {\tt make} targets are available:
786    
787     \begin{table}[h!]
788     {\footnotesize
789 heimbach 1.21 \begin{tabular}{|ccll|}
790     \hline
791 heimbach 1.16 ~ & {\it AD-target} & {\it output} & {\it description} \\
792     \hline
793     \hline
794     (1) & {\tt <MODE><TOOL>only} & {\tt <MODE>\_<TOOL>\_output.f} &
795     generates code for $<$MODE$>$ using $<$TOOL$>$ \\
796     ~ & ~ & ~ & no {\tt make} dependencies on {\tt .F .h} \\
797     ~ & ~ & ~ & useful for compiling on remote platforms \\
798     \hline
799     (2) & {\tt <MODE><TOOL>} & {\tt <MODE>\_<TOOL>\_output.f} &
800     generates code for $<$MODE$>$ using $<$TOOL$>$ \\
801     ~ & ~ & ~ & includes {\tt make} dependencies on {\tt .F .h} \\
802     ~ & ~ & ~ & i.e. input for $<$TOOL$>$ may be re-generated \\
803     \hline
804     (3) & {\tt <MODE>all} & {\tt mitgcmuv\_<MODE>} &
805     generates code for $<$MODE$>$ using $<$TOOL$>$ \\
806     ~ & ~ & ~ & and compiles all code \\
807     ~ & ~ & ~ & (use of TAF is set as default) \\
808     \hline
809     \end{tabular}
810     }
811     \end{table}
812     %
813     Here, the following placeholders are used
814     %
815     \begin{itemize}
816     %
817 heimbach 1.21 \item $<$TOOL$>$
818 heimbach 1.16 %
819     \begin{itemize}
820     %
821     \item {\tt TAF}
822     \item {\tt TAMC}
823     %
824     \end{itemize}
825     %
826 heimbach 1.21 \item $<$MODE$>$
827 heimbach 1.16 %
828     \begin{itemize}
829     %
830     \item {\tt ad} generates the adjoint model (ADM)
831     \item {\tt ftl} generates the tangent linear model (TLM)
832     \item {\tt svd} generates both ADM and TLM for \\
833     singular value decomposition (SVD) type calculations
834     %
835     \end{itemize}
836     %
837     \end{itemize}
838    
839     For example, to generate the adjoint model using TAF after routines ({\tt .F})
840     or headers ({\tt .h}) have been modified, but without compilation,
841     type {\tt make adtaf};
842     or, to generate the tangent linear model using TAMC without
843     re-generating the input code, type {\tt make ftltamconly}.
844    
845    
846     A typical full build process to generate the ADM via TAF would
847     look like follows:
848     \begin{verbatim}
849     % mkdir build
850     % cd build
851     % ../../../tools/genmake2 -mods=../code_ad
852     % make depend
853     % make adall
854     \end{verbatim}
855    
856     %------------------------------------------------------------------
857    
858     \subsection{The AD build process in detail
859     \label{section_ad_build_detail}}
860    
861     The {\tt make <MODE>all} target consists of the following procedures:
862    
863     \begin{enumerate}
864     %
865     \item
866     A header file {\tt AD\_CONFIG.h} is generated which contains a CPP option
867     on which code ought to be generated. Depending on the {\tt make} target,
868 heimbach 1.21 the contents is one of the following:
869 heimbach 1.16 \begin{itemize}
870     \item
871     {\tt \#define ALLOW\_ADJOINT\_RUN}
872     \item
873     {\tt \#define ALLOW\_TANGENTLINEAR\_RUN}
874     \item
875     {\tt \#define ALLOW\_ECCO\_OPTIMIZATION}
876     \end{itemize}
877     %
878     \item
879     A single file {\tt <MODE>\_input\_code.f} is concatenated
880     consisting of all {\tt .f} files that are part of the list {\bf AD\_FILES}
881     and all {\tt .flow} files that are part of the list {\bf AD\_FLOW\_FILES}.
882     %
883     \item
884 heimbach 1.21 The AD tool is invoked with the {\tt <MODE>\_<TOOL>\_FLAGS}.
885 heimbach 1.16 The default AD tool flags in {\tt genmake2} can be overrwritten by
886     an {\tt adjoint\_options} file (similar to the platform-specific
887     {\tt build\_options}, see Section ???.
888     The AD tool writes the resulting AD code into the file
889     {\tt <MODE>\_input\_code\_ad.f}
890     %
891     \item
892     A short sed script {\tt adjoint\_sed} is applied to
893     {\tt <MODE>\_input\_code\_ad.f}
894     to reinstate {\bf myThid} into the CALL argument list of active file I/O.
895     The result is written to file {\tt <MODE>\_<TOOL>\_output.f}.
896     %
897     \item
898     All routines are compiled and an executable is generated
899     (see Table ???).
900     %
901     \end{enumerate}
902    
903     \subsubsection{The list AD\_FILES and {\tt .list} files}
904    
905     Not all routines are presented to the AD tool.
906     Routines typically hidden are diagnostics routines which
907     do not influence the cost function, but may create
908     artificial flow dependencies such as I/O of active variables.
909    
910     {\tt genmake2} generates a list (or variable) {\bf AD\_FILES}
911     which contains all routines that are shown to the AD tool.
912     This list is put together from all files with suffix {\tt .list}
913     that {\tt genmake2} finds in its search directories.
914     The list file for the core MITgcm routines is in {\tt model/src/}
915     is called {\tt model\_ad\_diff.list}.
916     Note that no wrapper routine is shown to TAF. These are either
917     not visible at all to the AD code, or hand-written AD code
918     is available (see next section).
919    
920     Each package directory contains its package-specific
921     list file {\tt <PKG>\_ad\_diff.list}. For example,
922     {\tt pkg/ptracers/} contains the file {\tt ptracers\_ad\_diff.list}.
923     Thus, enabling a package will automatically extend the
924     {\bf AD\_FILES} list of {\tt genmake2} to incorporate the
925     package-specific routines.
926     Note that you will need to regenerate the {\tt Makefile} if
927     you enable a package (e.g. by adding it to {\tt packages.conf})
928     and a {\tt Makefile} already exists.
929    
930     \subsubsection{The list AD\_FLOW\_FILES and {\tt .flow} files}
931    
932     TAMC and TAF can evaluate user-specified directives
933     that start with a specific syntax ({\tt CADJ}, {\tt C\$TAF}, {\tt !\$TAF}).
934     The main categories of directives are STORE directives and
935     FLOW directives. Here, we are concerned with flow directives,
936     store directives are treated elsewhere.
937    
938     Flow directives enable the AD tool to evaluate how it should treat
939     routines that are 'hidden' by the user, i.e. routines which are
940     not contained in the {\bf AD\_FILES} list (see previous section),
941     but which are called in part of the code that the AD tool does see.
942     The flow directive tell the AD tool
943     %
944     \begin{itemize}
945     %
946     \item which subroutine arguments are input/output
947     \item which subroutine arguments are active
948     \item which subroutine arguments are required to compute the cost
949     \item which subroutine arguments are dependent
950     %
951     \end{itemize}
952     %
953     The syntax for the flow directives can be found in the
954     AD tool manuals.
955    
956     {\tt genmake2} generates a list (or variable) {\bf AD\_FLOW\_FILES}
957     which contains all files with suffix{\tt .flow} that it finds
958     in its search directories.
959     The flow directives for the core MITgcm routines of
960     {\tt eesupp/src/} and {\tt model/src/}
961     reside in {\tt pkg/autodiff/}.
962     This directory also contains hand-written adjoint code
963 edhill 1.20 for the MITgcm WRAPPER (section \ref{chap:sarch}).
964 heimbach 1.16
965     Flow directives for package-specific routines are contained in
966     the corresponding package directories in the file
967     {\tt <PKG>\_ad.flow}, e.g. ptracers-specific directives are in
968     {\tt ptracers\_ad.flow}.
969    
970     \subsubsection{Store directives for 3-level checkpointing}
971    
972     The storing that is required at each period of the
973     3-level checkpointing is controled by three
974     top-level headers.
975    
976     \begin{verbatim}
977     do ilev_3 = 1, nchklev_3
978     # include ``checkpoint_lev3.h''
979     do ilev_2 = 1, nchklev_2
980     # include ``checkpoint_lev2.h''
981     do ilev_1 = 1, nchklev_1
982     # include ``checkpoint_lev1.h''
983    
984     ...
985    
986     end do
987     end do
988     end do
989     \end{verbatim}
990    
991     All files {\tt checkpoint\_lev?.h} are contained in directory
992     {\tt pkg/autodiff/}.
993    
994    
995     \subsubsection{Changing the default AD tool flags: ad\_options files}
996    
997    
998     \subsubsection{Hand-written adjoint code}
999    
1000     %------------------------------------------------------------------
1001    
1002 heimbach 1.4 \subsection{The cost function (dependent variable)
1003     \label{section_cost}}
1004 adcroft 1.1
1005     The cost function $ {\cal J} $ is referred to as the {\sf dependent variable}.
1006     It is a function of the input variables $ \vec{u} $ via the composition
1007     $ {\cal J}(\vec{u}) \, = \, {\cal J}(M(\vec{u})) $.
1008 heimbach 1.15 The input are referred to as the
1009 adcroft 1.1 {\sf independent variables} or {\sf control variables}.
1010     All aspects relevant to the treatment of the cost function $ {\cal J} $
1011 cnh 1.7 (parameter setting, initialization, accumulation,
1012 heimbach 1.4 final evaluation), are controlled by the package {\it pkg/cost}.
1013 heimbach 1.15 The aspects relevant to the treatment of the independent variables
1014     are controlled by the package {\it pkg/ctrl} and will be treated
1015     in the next section.
1016 heimbach 1.4
1017 jmc 1.22 \input{s_autodiff/text/doc_cost_flow}
1018 adcroft 1.1
1019 heimbach 1.16 \subsubsection{Enabling the package}
1020    
1021 adcroft 1.1 \fbox{
1022     \begin{minipage}{12cm}
1023 heimbach 1.16 {\it packages.conf}, {\it ECCO\_CPPOPTIONS.h}
1024 adcroft 1.1 \end{minipage}
1025     }
1026 heimbach 1.16 \begin{itemize}
1027 adcroft 1.1 %
1028 heimbach 1.16 \item
1029     The package is enabled by adding {\it cost} to your file {\it packages.conf}
1030     (see Section ???)
1031 adcroft 1.1 %
1032 heimbach 1.16 \item
1033    
1034    
1035     \end{itemize}
1036 adcroft 1.1 %
1037 heimbach 1.16
1038 heimbach 1.15 N.B.: In general the following packages ought to be enabled
1039     simultaneously: {\it autodiff, cost, ctrl}.
1040 heimbach 1.4 The basic CPP option to enable the cost function is {\bf ALLOW\_COST}.
1041     Each specific cost function contribution has its own option.
1042     For the present example the option is {\bf ALLOW\_COST\_TRACER}.
1043     All cost-specific options are set in {\it ECCO\_CPPOPTIONS.h}
1044 adcroft 1.1 Since the cost function is usually used in conjunction with
1045     automatic differentiation, the CPP option
1046 heimbach 1.15 {\bf ALLOW\_ADJOINT\_RUN} (file {\it CPP\_OPTIONS.h}) and
1047     {\bf ALLOW\_AUTODIFF\_TAMC} (file {\it ECCO\_CPPOPTIONS.h})
1048     should be defined.
1049 adcroft 1.1
1050 cnh 1.7 \subsubsection{Initialization}
1051 adcroft 1.1 %
1052 cnh 1.7 The initialization of the {\it cost} package is readily enabled
1053 heimbach 1.15 as soon as the CPP option {\bf ALLOW\_COST} is defined.
1054 adcroft 1.1 %
1055     \begin{itemize}
1056     %
1057     \item
1058     \fbox{
1059     \begin{minipage}{12cm}
1060     Parameters: {\it cost\_readparms}
1061     \end{minipage}
1062     }
1063     \\
1064     This S/R
1065     reads runtime flags and parameters from file {\it data.cost}.
1066     For the present example the only relevant parameter read
1067     is {\bf mult\_tracer}. This multiplier enables different
1068     cost function contributions to be switched on
1069     ( = 1.) or off ( = 0.) at runtime.
1070     For more complex cost functions which involve model vs. data
1071     misfits, the corresponding data filenames and data
1072     specifications (start date and time, period, ...) are read
1073     in this S/R.
1074     %
1075     \item
1076     \fbox{
1077     \begin{minipage}{12cm}
1078     Variables: {\it cost\_init}
1079     \end{minipage}
1080     }
1081     \\
1082     This S/R
1083 cnh 1.7 initializes the different cost function contributions.
1084     The contribution for the present example is {\bf objf\_tracer}
1085 adcroft 1.1 which is defined on each tile (bi,bj).
1086     %
1087     \end{itemize}
1088     %
1089 heimbach 1.4 \subsubsection{Accumulation}
1090 adcroft 1.1 %
1091     \begin{itemize}
1092     %
1093     \item
1094     \fbox{
1095     \begin{minipage}{12cm}
1096     {\it cost\_tile}, {\it cost\_tracer}
1097     \end{minipage}
1098     }
1099     \end{itemize}
1100     %
1101     The 'driver' routine
1102     {\it cost\_tile} is called at the end of each time step.
1103     Within this 'driver' routine, S/R are called for each of
1104     the chosen cost function contributions.
1105     In the present example ({\bf ALLOW\_COST\_TRACER}),
1106     S/R {\it cost\_tracer} is called.
1107     It accumulates {\bf objf\_tracer} according to eqn. (\ref{???}).
1108     %
1109     \subsubsection{Finalize all contributions}
1110     %
1111     \begin{itemize}
1112     %
1113     \item
1114     \fbox{
1115     \begin{minipage}{12cm}
1116     {\it cost\_final}
1117     \end{minipage}
1118     }
1119     \end{itemize}
1120     %
1121     At the end of the forward integration S/R {\it cost\_final}
1122     is called. It accumulates the total cost function {\bf fc}
1123     from each contribution and sums over all tiles:
1124     \begin{equation}
1125     {\cal J} \, = \,
1126     {\rm fc} \, = \,
1127 heimbach 1.15 {\rm mult\_tracer} \sum_{\text{global sum}} \sum_{bi,\,bj}^{nSx,\,nSy}
1128 adcroft 1.1 {\rm objf\_tracer}(bi,bj) \, + \, ...
1129     \end{equation}
1130     %
1131     The total cost function {\bf fc} will be the
1132 heimbach 1.21 'dependent' variable in the argument list for TAF, i.e.
1133 adcroft 1.1 \begin{verbatim}
1134 heimbach 1.21 taf -output 'fc' ...
1135 adcroft 1.1 \end{verbatim}
1136    
1137 cnh 1.3 %%%% \end{document}
1138 adcroft 1.1
1139 jmc 1.22 \input{s_autodiff/text/doc_ad_the_main}
1140 adcroft 1.1
1141 heimbach 1.4 \subsection{The control variables (independent variables)
1142     \label{section_ctrl}}
1143 adcroft 1.1
1144     The control variables are a subset of the model input
1145     (initial conditions, boundary conditions, model parameters).
1146     Here we identify them with the variable $ \vec{u} $.
1147     All intermediate variables whose derivative w.r.t. control
1148 heimbach 1.4 variables do not vanish are called {\sf active variables}.
1149 adcroft 1.1 All subroutines whose derivative w.r.t. the control variables
1150     don't vanish are called {\sf active routines}.
1151     Read and write operations from and to file can be viewed
1152     as variable assignments. Therefore, files to which
1153     active variables are written and from which active variables
1154     are read are called {\sf active files}.
1155     All aspects relevant to the treatment of the control variables
1156 cnh 1.7 (parameter setting, initialization, perturbation)
1157     are controlled by the package {\it pkg/ctrl}.
1158 adcroft 1.1
1159 jmc 1.22 \input{s_autodiff/text/doc_ctrl_flow}
1160 heimbach 1.4
1161 adcroft 1.1 \subsubsection{genmake and CPP options}
1162     %
1163     \begin{itemize}
1164     %
1165     \item
1166     \fbox{
1167     \begin{minipage}{12cm}
1168     {\it genmake}, {\it CPP\_OPTIONS.h}, {\it ECCO\_CPPOPTIONS.h}
1169     \end{minipage}
1170     }
1171     \end{itemize}
1172     %
1173     To enable the directory to be included to the compile list,
1174     {\bf ctrl} has to be added to the {\bf enable} list in
1175 heimbach 1.15 {\it .genmakerc} or in {\it genmake} itself (analogous to {\it cost}
1176     package, cf. previous section).
1177 adcroft 1.1 Each control variable is enabled via its own CPP option
1178     in {\it ECCO\_CPPOPTIONS.h}.
1179    
1180 cnh 1.7 \subsubsection{Initialization}
1181 adcroft 1.1 %
1182     \begin{itemize}
1183     %
1184     \item
1185     \fbox{
1186     \begin{minipage}{12cm}
1187     Parameters: {\it ctrl\_readparms}
1188     \end{minipage}
1189     }
1190     \\
1191     %
1192     This S/R
1193     reads runtime flags and parameters from file {\it data.ctrl}.
1194     For the present example the file contains the file names
1195     of each control variable that is used.
1196     In addition, the number of wet points for each control
1197     variable and the net dimension of the space of control
1198     variables (counting wet points only) {\bf nvarlength}
1199     is determined.
1200     Masks for wet points for each tile {\bf (bi,\,bj)}
1201     and vertical layer {\bf k} are generated for the three
1202     relevant categories on the C-grid:
1203     {\bf nWetCtile} for tracer fields,
1204     {\bf nWetWtile} for zonal velocity fields,
1205     {\bf nWetStile} for meridional velocity fields.
1206     %
1207     \item
1208     \fbox{
1209     \begin{minipage}{12cm}
1210     Control variables, control vector,
1211     and their gradients: {\it ctrl\_unpack}
1212     \end{minipage}
1213     }
1214     \\
1215     %
1216     Two important issues related to the handling of the control
1217 edhill 1.20 variables in MITgcm need to be addressed.
1218 adcroft 1.1 First, in order to save memory, the control variable arrays
1219     are not kept in memory, but rather read from file and added
1220 cnh 1.7 to the initial fields during the model initialization phase.
1221 adcroft 1.1 Similarly, the corresponding adjoint fields which represent
1222     the gradient of the cost function w.r.t. the control variables
1223 heimbach 1.4 are written to file at the end of the adjoint integration.
1224 adcroft 1.1 Second, in addition to the files holding the 2-dim. and 3-dim.
1225 heimbach 1.4 control variables and the corresponding cost gradients,
1226     a 1-dim. {\sf control vector}
1227 adcroft 1.1 and {\sf gradient vector} are written to file. They contain
1228     only the wet points of the control variables and the corresponding
1229     gradient.
1230     This leads to a significant data compression.
1231 heimbach 1.4 Furthermore, an option is available
1232     ({\tt ALLOW\_NONDIMENSIONAL\_CONTROL\_IO}) to
1233     non-dimensionalise the control and gradient vector,
1234     which otherwise would contain different pieces of different
1235     magnitudes and units.
1236     Finally, the control and gradient vector can be passed to a
1237 adcroft 1.1 minimization routine if an update of the control variables
1238     is sought as part of a minimization exercise.
1239    
1240     The files holding fields and vectors of the control variables
1241     and gradient are generated and initialised in S/R {\it ctrl\_unpack}.
1242     %
1243     \end{itemize}
1244    
1245     \subsubsection{Perturbation of the independent variables}
1246     %
1247 heimbach 1.4 The dependency flow for differentiation w.r.t. the controls
1248     starts with adding a perturbation onto the input variable,
1249 heimbach 1.21 thus defining the independent or control variables for TAF.
1250 heimbach 1.4 Three types of controls may be considered:
1251 adcroft 1.1 %
1252     \begin{itemize}
1253     %
1254     \item
1255     \fbox{
1256     \begin{minipage}{12cm}
1257     {\it ctrl\_map\_ini} (initial value sensitivity):
1258     \end{minipage}
1259     }
1260     \\
1261     %
1262     Consider as an example the initial tracer distribution
1263     {\bf tr1} as control variable.
1264     After {\bf tr1} has been initialised in
1265 heimbach 1.4 {\it ini\_tr1} (dynamical variables such as
1266 adcroft 1.1 temperature and salinity are initialised in {\it ini\_fields}),
1267     a perturbation anomaly is added to the field in S/R
1268     {\it ctrl\_map\_ini}
1269     %
1270     \begin{equation}
1271     \begin{split}
1272     u & = \, u_{[0]} \, + \, \Delta u \\
1273     {\bf tr1}(...) & = \, {\bf tr1_{ini}}(...) \, + \, {\bf xx\_tr1}(...)
1274     \label{perturb}
1275     \end{split}
1276     \end{equation}
1277     %
1278 heimbach 1.4 {\bf xx\_tr1} is a 3-dim. global array
1279 adcroft 1.1 holding the perturbation. In the case of a simple
1280     sensitivity study this array is identical to zero.
1281 heimbach 1.4 However, it's specification is essential in the context
1282 heimbach 1.21 of automatic differentiation since TAF
1283 adcroft 1.1 treats the corresponding line in the code symbolically
1284     when determining the differentiation chain and its origin.
1285     Thus, the variable names are part of the argument list
1286 heimbach 1.21 when calling TAF:
1287 adcroft 1.1 %
1288     \begin{verbatim}
1289 heimbach 1.21 taf -input 'xx_tr1 ...' ...
1290 adcroft 1.1 \end{verbatim}
1291     %
1292 edhill 1.20 Now, as mentioned above, MITgcm avoids maintaining
1293 adcroft 1.1 an array for each control variable by reading the
1294     perturbation to a temporary array from file.
1295 heimbach 1.21 To ensure the symbolic link to be recognized by TAF, a scalar
1296 adcroft 1.1 dummy variable {\bf xx\_tr1\_dummy} is introduced
1297     and an 'active read' routine of the adjoint support
1298     package {\it pkg/autodiff} is invoked.
1299     The read-procedure is tagged with the variable
1300 heimbach 1.21 {\bf xx\_tr1\_dummy} enabling TAF to recognize the
1301 cnh 1.7 initialization of the perturbation.
1302 heimbach 1.21 The modified call of TAF thus reads
1303 adcroft 1.1 %
1304     \begin{verbatim}
1305 heimbach 1.21 taf -input 'xx_tr1_dummy ...' ...
1306 adcroft 1.1 \end{verbatim}
1307     %
1308     and the modified operation to (\ref{perturb})
1309     in the code takes on the form
1310     %
1311     \begin{verbatim}
1312     call active_read_xyz(
1313     & ..., tmpfld3d, ..., xx_tr1_dummy, ... )
1314    
1315     tr1(...) = tr1(...) + tmpfld3d(...)
1316     \end{verbatim}
1317     %
1318     Note, that reading an active variable corresponds
1319     to a variable assignment. Its derivative corresponds
1320 heimbach 1.15 to a write statement of the adjoint variable, followed by
1321     a reset.
1322 adcroft 1.1 The 'active file' routines have been designed
1323 heimbach 1.4 to support active read and corresponding adjoint active write
1324     operations (and vice versa).
1325 adcroft 1.1 %
1326     \item
1327     \fbox{
1328     \begin{minipage}{12cm}
1329     {\it ctrl\_map\_forcing} (boundary value sensitivity):
1330     \end{minipage}
1331     }
1332     \\
1333     %
1334     The handling of boundary values as control variables
1335     proceeds exactly analogous to the initial values
1336     with the symbolic perturbation taking place in S/R
1337     {\it ctrl\_map\_forcing}.
1338     Note however an important difference:
1339     Since the boundary values are time dependent with a new
1340     forcing field applied at each time steps,
1341 heimbach 1.4 the general problem may be thought of as
1342     a new control variable at each time step
1343     (or, if the perturbation is averaged over a certain period,
1344     at each $ N $ timesteps), i.e.
1345 adcroft 1.1 \[
1346     u_{\rm forcing} \, = \,
1347     \{ \, u_{\rm forcing} ( t_n ) \, \}_{
1348     n \, = \, 1, \ldots , {\rm nTimeSteps} }
1349     \]
1350     %
1351     In the current example an equilibrium state is considered,
1352     and only an initial perturbation to
1353     surface forcing is applied with respect to the
1354     equilibrium state.
1355     A time dependent treatment of the surface forcing is
1356     implemented in the ECCO environment, involving the
1357     calendar ({\it cal}~) and external forcing ({\it exf}~) packages.
1358     %
1359     \item
1360     \fbox{
1361     \begin{minipage}{12cm}
1362     {\it ctrl\_map\_params} (parameter sensitivity):
1363     \end{minipage}
1364     }
1365     \\
1366     %
1367     This routine is not yet implemented, but would proceed
1368     proceed along the same lines as the initial value sensitivity.
1369 heimbach 1.4 The mixing parameters {\bf diffkr} and {\bf kapgm}
1370     are currently added as controls in {\it ctrl\_map\_ini.F}.
1371 adcroft 1.1 %
1372     \end{itemize}
1373     %
1374    
1375     \subsubsection{Output of adjoint variables and gradient}
1376     %
1377 heimbach 1.4 Several ways exist to generate output of adjoint fields.
1378 adcroft 1.1 %
1379     \begin{itemize}
1380     %
1381     \item
1382     \fbox{
1383     \begin{minipage}{12cm}
1384 heimbach 1.4 {\it ctrl\_map\_ini, ctrl\_map\_forcing}:
1385 adcroft 1.1 \end{minipage}
1386     }
1387     \\
1388     \begin{itemize}
1389     %
1390 heimbach 1.4 \item {\bf xx\_...}: the control variable fields \\
1391     Before the forward integration, the control
1392     variables are read from file {\bf xx\_ ...} and added to
1393     the model field.
1394 adcroft 1.1 %
1395     \item {\bf adxx\_...}: the adjoint variable fields, i.e. the gradient
1396 heimbach 1.4 $ \nabla _{u}{\cal J} $ for each control variable \\
1397     After the adjoint integration the corresponding adjoint
1398     variables are written to {\bf adxx\_ ...}.
1399 adcroft 1.1 %
1400 heimbach 1.4 \end{itemize}
1401 adcroft 1.1 %
1402 heimbach 1.4 \item
1403     \fbox{
1404     \begin{minipage}{12cm}
1405     {\it ctrl\_unpack, ctrl\_pack}:
1406     \end{minipage}
1407     }
1408     \\
1409     %
1410     \begin{itemize}
1411     %
1412     \item {\bf vector\_ctrl}: the control vector \\
1413 cnh 1.7 At the very beginning of the model initialization,
1414 heimbach 1.4 the updated compressed control vector is read (or initialised)
1415     and distributed to 2-dim. and 3-dim. control variable fields.
1416     %
1417     \item {\bf vector\_grad}: the gradient vector \\
1418     At the very end of the adjoint integration,
1419     the 2-dim. and 3-dim. adjoint variables are read,
1420     compressed to a single vector and written to file.
1421 adcroft 1.1 %
1422     \end{itemize}
1423     %
1424     \item
1425     \fbox{
1426     \begin{minipage}{12cm}
1427     {\it addummy\_in\_stepping}:
1428     \end{minipage}
1429     }
1430     \\
1431     In addition to writing the gradient at the end of the
1432 heimbach 1.4 forward/adjoint integration, many more adjoint variables
1433     of the model state
1434     at intermediate times can be written using S/R
1435 adcroft 1.1 {\it addummy\_in\_stepping}.
1436     This routine is part of the adjoint support package
1437     {\it pkg/autodiff} (cf.f. below).
1438 heimbach 1.15 The procedure is enabled using via the CPP-option
1439     {\bf ALLOW\_AUTODIFF\_MONITOR} (file {\it ECCO\_CPPOPTIONS.h}).
1440 adcroft 1.1 To be part of the adjoint code, the corresponding S/R
1441     {\it dummy\_in\_stepping} has to be called in the forward
1442     model (S/R {\it the\_main\_loop}) at the appropriate place.
1443 heimbach 1.15 The adjoint common blocks are extracted from the adjoint code
1444     via the header file {\it adcommon.h}.
1445 adcroft 1.1
1446     {\it dummy\_in\_stepping} is essentially empty,
1447     the corresponding adjoint routine is hand-written rather
1448     than generated automatically.
1449     Appropriate flow directives ({\it dummy\_in\_stepping.flow})
1450     ensure that TAMC does not automatically
1451     generate {\it addummy\_in\_stepping} by trying to differentiate
1452 heimbach 1.4 {\it dummy\_in\_stepping}, but instead refers to
1453     the hand-written routine.
1454 adcroft 1.1
1455     {\it dummy\_in\_stepping} is called in the forward code
1456     at the beginning of each
1457     timestep, before the call to {\it dynamics}, thus ensuring
1458     that {\it addummy\_in\_stepping} is called at the end of
1459     each timestep in the adjoint calculation, after the call to
1460     {\it addynamics}.
1461    
1462     {\it addummy\_in\_stepping} includes the header files
1463 heimbach 1.4 {\it adcommon.h}.
1464     This header file is also hand-written. It contains
1465     the common blocks
1466     {\bf /addynvars\_r/}, {\bf /addynvars\_cd/},
1467     {\bf /addynvars\_diffkr/}, {\bf /addynvars\_kapgm/},
1468 adcroft 1.1 {\bf /adtr1\_r/}, {\bf /adffields/},
1469     which have been extracted from the adjoint code to enable
1470     access to the adjoint variables.
1471 heimbach 1.15
1472     {\bf WARNING:} If the structure of the common blocks
1473     {\bf /dynvars\_r/}, {\bf /dynvars\_cd/}, etc., changes
1474     similar changes will occur in the adjoint common blocks.
1475     Therefore, consistency between the TAMC-generated common blocks
1476     and those in {\it adcommon.h} have to be checked.
1477 adcroft 1.1 %
1478     \end{itemize}
1479    
1480    
1481     \subsubsection{Control variable handling for
1482     optimization applications}
1483    
1484     In optimization mode the cost function $ {\cal J}(u) $ is sought
1485     to be minimized with respect to a set of control variables
1486     $ \delta {\cal J} \, = \, 0 $, in an iterative manner.
1487     The gradient $ \nabla _{u}{\cal J} |_{u_{[k]}} $ together
1488     with the value of the cost function itself $ {\cal J}(u_{[k]}) $
1489     at iteration step $ k $ serve
1490     as input to a minimization routine (e.g. quasi-Newton method,
1491 heimbach 1.9 conjugate gradient, ... \cite{gil-lem:89})
1492 heimbach 1.4 to compute an update in the
1493 adcroft 1.1 control variable for iteration step $k+1$
1494     \[
1495     u_{[k+1]} \, = \, u_{[0]} \, + \, \Delta u_{[k+1]}
1496     \quad \mbox{satisfying} \quad
1497     {\cal J} \left( u_{[k+1]} \right) \, < \, {\cal J} \left( u_{[k]} \right)
1498     \]
1499     $ u_{[k+1]} $ then serves as input for a forward/adjoint run
1500     to determine $ {\cal J} $ and $ \nabla _{u}{\cal J} $ at iteration step
1501     $ k+1 $.
1502     Tab. \ref{???} sketches the flow between forward/adjoint model
1503     and the minimization routine.
1504    
1505     \begin{eqnarray*}
1506 heimbach 1.4 \scriptsize
1507 adcroft 1.1 \begin{array}{ccccc}
1508     u_{[0]} \,\, , \,\, \Delta u_{[k]} & ~ & ~ & ~ & ~ \\
1509     {\Big\downarrow}
1510     & ~ & ~ & ~ & ~ \\
1511     ~ & ~ & ~ & ~ & ~ \\
1512     \hline
1513     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1514     \multicolumn{1}{|c}{
1515     u_{[k]} = u_{[0]} + \Delta u_{[k]}} &
1516     \stackrel{\bf forward}{\bf \longrightarrow} &
1517     v_{[k]} = M \left( u_{[k]} \right) &
1518     \stackrel{\bf forward}{\bf \longrightarrow} &
1519     \multicolumn{1}{c|}{
1520     {\cal J}_{[k]} = {\cal J} \left( M \left( u_{[k]} \right) \right)} \\
1521     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1522     \hline
1523 heimbach 1.4 \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1524     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{{\Big\downarrow}} \\
1525     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1526 adcroft 1.1 \hline
1527     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1528     \multicolumn{1}{|c}{
1529     \nabla_u {\cal J}_{[k]} (\delta {\cal J}) =
1530 heimbach 1.4 T^{\ast} \cdot \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J})} &
1531 adcroft 1.1 \stackrel{\bf adjoint}{\mathbf \longleftarrow} &
1532     ad \, v_{[k]} (\delta {\cal J}) =
1533     \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J}) &
1534     \stackrel{\bf adjoint}{\mathbf \longleftarrow} &
1535     \multicolumn{1}{c|}{ ad \, {\cal J} = \delta {\cal J}} \\
1536     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1537     \hline
1538     ~ & ~ & ~ & ~ & ~ \\
1539 heimbach 1.4 \hspace*{15ex}{\Bigg\downarrow}
1540     \quad {\cal J}_{[k]}, \quad \nabla_u {\cal J}_{[k]}
1541     & ~ & ~ & ~ & ~ \\
1542 adcroft 1.1 ~ & ~ & ~ & ~ & ~ \\
1543     \hline
1544     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1545     \multicolumn{1}{|c}{
1546     {\cal J}_{[k]} \,\, , \,\, \nabla_u {\cal J}_{[k]}} &
1547     {\mathbf \longrightarrow} & \text{\bf minimisation} &
1548     {\mathbf \longrightarrow} &
1549     \multicolumn{1}{c|}{ \Delta u_{[k+1]}} \\
1550     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1551     \hline
1552     ~ & ~ & ~ & ~ & ~ \\
1553     ~ & ~ & ~ & ~ & \Big\downarrow \\
1554     ~ & ~ & ~ & ~ & \Delta u_{[k+1]} \\
1555     \end{array}
1556     \end{eqnarray*}
1557    
1558     The routines {\it ctrl\_unpack} and {\it ctrl\_pack} provide
1559     the link between the model and the minimization routine.
1560     As described in Section \ref{???}
1561     the {\it unpack} and {\it pack} routines read and write
1562     control and gradient {\it vectors} which are compressed
1563     to contain only wet points, in addition to the full
1564     2-dim. and 3-dim. fields.
1565     The corresponding I/O flow looks as follows:
1566    
1567     \vspace*{0.5cm}
1568    
1569 heimbach 1.4 {\scriptsize
1570 adcroft 1.1 \begin{tabular}{ccccc}
1571     {\bf vector\_ctrl\_$<$k$>$ } & ~ & ~ & ~ & ~ \\
1572     {\big\downarrow} & ~ & ~ & ~ & ~ \\
1573     \cline{1-1}
1574     \multicolumn{1}{|c|}{\it ctrl\_unpack} & ~ & ~ & ~ & ~ \\
1575     \cline{1-1}
1576     {\big\downarrow} & ~ & ~ & ~ & ~ \\
1577     \cline{3-3}
1578     \multicolumn{1}{l}{\bf xx\_theta0...$<$k$>$} & ~ &
1579     \multicolumn{1}{|c|}{~} & ~ & ~ \\
1580 heimbach 1.4 \multicolumn{1}{l}{\bf xx\_salt0...$<$k$>$} &
1581     $\stackrel{\mbox{read}}{\longrightarrow}$ &
1582 adcroft 1.1 \multicolumn{1}{|c|}{forward integration} & ~ & ~ \\
1583     \multicolumn{1}{l}{\bf \vdots} & ~ & \multicolumn{1}{|c|}{~}
1584     & ~ & ~ \\
1585     \cline{3-3}
1586 heimbach 1.4 ~ & ~ & $\downarrow$ & ~ & ~ \\
1587 adcroft 1.1 \cline{3-3}
1588     ~ & ~ &
1589     \multicolumn{1}{|c|}{~} & ~ &
1590     \multicolumn{1}{l}{\bf adxx\_theta0...$<$k$>$} \\
1591     ~ & ~ & \multicolumn{1}{|c|}{adjoint integration} &
1592 heimbach 1.4 $\stackrel{\mbox{write}}{\longrightarrow}$ &
1593 adcroft 1.1 \multicolumn{1}{l}{\bf adxx\_salt0...$<$k$>$} \\
1594     ~ & ~ & \multicolumn{1}{|c|}{~}
1595     & ~ & \multicolumn{1}{l}{\bf \vdots} \\
1596     \cline{3-3}
1597     ~ & ~ & ~ & ~ & {\big\downarrow} \\
1598     \cline{5-5}
1599     ~ & ~ & ~ & ~ & \multicolumn{1}{|c|}{\it ctrl\_pack} \\
1600     \cline{5-5}
1601     ~ & ~ & ~ & ~ & {\big\downarrow} \\
1602     ~ & ~ & ~ & ~ & {\bf vector\_grad\_$<$k$>$ } \\
1603     \end{tabular}
1604 heimbach 1.4 }
1605 adcroft 1.1
1606     \vspace*{0.5cm}
1607    
1608    
1609 heimbach 1.4 {\it ctrl\_unpack} reads the updated control vector
1610 adcroft 1.1 {\bf vector\_ctrl\_$<$k$>$}.
1611     It distributes the different control variables to
1612     2-dim. and 3-dim. files {\it xx\_...$<$k$>$}.
1613 heimbach 1.4 At the start of the forward integration the control variables
1614     are read from {\it xx\_...$<$k$>$} and added to the
1615     field.
1616     Correspondingly, at the end of the adjoint integration
1617     the adjoint fields are written
1618 adcroft 1.1 to {\it adxx\_...$<$k$>$}, again via the active file routines.
1619 heimbach 1.4 Finally, {\it ctrl\_pack} collects all adjoint files
1620 adcroft 1.1 and writes them to the compressed vector file
1621     {\bf vector\_grad\_$<$k$>$}.

  ViewVC Help
Powered by ViewVC 1.1.22