/[MITgcm]/manual/s_autodiff/text/doc_ad_2.tex
ViewVC logotype

Annotation of /manual/s_autodiff/text/doc_ad_2.tex

Parent Directory Parent Directory | Revision Log Revision Log | View Revision Graph Revision Graph


Revision 1.19 - (hide annotations) (download) (as text)
Tue Aug 2 22:26:58 2005 UTC (19 years, 11 months ago) by heimbach
Branch: MAIN
Changes since 1.18: +10 -5 lines
File MIME type: application/x-tex
Updating

1 heimbach 1.19 % $Header: /u/gcmpack/manual/part5/doc_ad_2.tex,v 1.18 2005/08/01 22:31:36 heimbach Exp $
2 heimbach 1.2 % $Name: $
3 adcroft 1.1
4 heimbach 1.18 Author: Patrick Heimbach
5    
6 adcroft 1.1 {\sf Automatic differentiation} (AD), also referred to as algorithmic
7     (or, more loosely, computational) differentiation, involves
8     automatically deriving code to calculate
9     partial derivatives from an existing fully non-linear prognostic code.
10     (see \cite{gri:00}).
11     A software tool is used that parses and transforms source files
12     according to a set of linguistic and mathematical rules.
13     AD tools are like source-to-source translators in that
14     they parse a program code as input and produce a new program code
15     as output.
16     However, unlike a pure source-to-source translation, the output program
17     represents a new algorithm, such as the evaluation of the
18     Jacobian, the Hessian, or higher derivative operators.
19     In principle, a variety of derived algorithms
20     can be generated automatically in this way.
21    
22     The MITGCM has been adapted for use with the
23 heimbach 1.4 Tangent linear and Adjoint Model Compiler (TAMC) and its successor TAF
24 adcroft 1.1 (Transformation of Algorithms in Fortran), developed
25     by Ralf Giering (\cite{gie-kam:98}, \cite{gie:99,gie:00}).
26 cnh 1.7 The first application of the adjoint of the MITGCM for sensitivity
27 adcroft 1.1 studies has been published by \cite{maro-eta:99}.
28     \cite{sta-eta:97,sta-eta:01} use the MITGCM and its adjoint
29     for ocean state estimation studies.
30 heimbach 1.4 In the following we shall refer to TAMC and TAF synonymously,
31     except were explicitly stated otherwise.
32 adcroft 1.1
33     TAMC exploits the chain rule for computing the first
34     derivative of a function with
35     respect to a set of input variables.
36     Treating a given forward code as a composition of operations --
37 heimbach 1.4 each line representing a compositional element, the chain rule is
38 adcroft 1.1 rigorously applied to the code, line by line. The resulting
39     tangent linear or adjoint code,
40     then, may be thought of as the composition in
41     forward or reverse order, respectively, of the
42 heimbach 1.4 Jacobian matrices of the forward code's compositional elements.
43 adcroft 1.1
44     %**********************************************************************
45     \section{Some basic algebra}
46     \label{sec_ad_algebra}
47 edhill 1.17 \begin{rawhtml}
48     <!-- CMIREDIR:sec_ad_algebra: -->
49     \end{rawhtml}
50 adcroft 1.1 %**********************************************************************
51    
52     Let $ \cal{M} $ be a general nonlinear, model, i.e. a
53     mapping from the $m$-dimensional space
54     $U \subset I\!\!R^m$ of input variables
55     $\vec{u}=(u_1,\ldots,u_m)$
56     (model parameters, initial conditions, boundary conditions
57     such as forcing functions) to the $n$-dimensional space
58     $V \subset I\!\!R^n$ of
59     model output variable $\vec{v}=(v_1,\ldots,v_n)$
60 cnh 1.7 (model state, model diagnostics, objective function, ...)
61 adcroft 1.1 under consideration,
62     %
63     \begin{equation}
64     \begin{split}
65     {\cal M} \, : & \, U \,\, \longrightarrow \, V \\
66     ~ & \, \vec{u} \,\, \longmapsto \, \vec{v} \, = \,
67     {\cal M}(\vec{u})
68     \label{fulloperator}
69     \end{split}
70     \end{equation}
71     %
72     The vectors $ \vec{u} \in U $ and $ v \in V $ may be represented w.r.t.
73     some given basis vectors
74     $ {\rm span} (U) = \{ {\vec{e}_i} \}_{i = 1, \ldots , m} $ and
75     $ {\rm span} (V) = \{ {\vec{f}_j} \}_{j = 1, \ldots , n} $ as
76     \[
77     \vec{u} \, = \, \sum_{i=1}^{m} u_i \, {\vec{e}_i},
78     \qquad
79     \vec{v} \, = \, \sum_{j=1}^{n} v_j \, {\vec{f}_j}
80     \]
81    
82     Two routes may be followed to determine the sensitivity of the
83     output variable $\vec{v}$ to its input $\vec{u}$.
84    
85     \subsection{Forward or direct sensitivity}
86     %
87     Consider a perturbation to the input variables $\delta \vec{u}$
88     (typically a single component
89     $\delta \vec{u} = \delta u_{i} \, {\vec{e}_{i}}$).
90     Their effect on the output may be obtained via the linear
91     approximation of the model $ {\cal M}$ in terms of its Jacobian matrix
92     $ M $, evaluated in the point $u^{(0)}$ according to
93     %
94     \begin{equation}
95     \delta \vec{v} \, = \, M |_{\vec{u}^{(0)}} \, \delta \vec{u}
96     \label{tangent_linear}
97     \end{equation}
98     with resulting output perturbation $\delta \vec{v}$.
99     In components
100     $M_{j i} \, = \, \partial {\cal M}_{j} / \partial u_{i} $,
101     it reads
102     %
103     \begin{equation}
104     \delta v_{j} \, = \, \sum_{i}
105     \left. \frac{\partial {\cal M}_{j}}{\partial u_{i}} \right|_{u^{(0)}} \,
106     \delta u_{i}
107     \label{jacobi_matrix}
108     \end{equation}
109     %
110     Eq. (\ref{tangent_linear}) is the {\sf tangent linear model (TLM)}.
111     In contrast to the full nonlinear model $ {\cal M} $, the operator
112     $ M $ is just a matrix
113     which can readily be used to find the forward sensitivity of $\vec{v}$ to
114     perturbations in $u$,
115 heimbach 1.4 but if there are very many input variables $(\gg O(10^{6})$ for
116 adcroft 1.1 large-scale oceanographic application), it quickly becomes
117     prohibitive to proceed directly as in (\ref{tangent_linear}),
118     if the impact of each component $ {\bf e_{i}} $ is to be assessed.
119    
120     \subsection{Reverse or adjoint sensitivity}
121     %
122     Let us consider the special case of a
123     scalar objective function ${\cal J}(\vec{v})$ of the model output (e.g.
124     the total meridional heat transport,
125     the total uptake of $CO_{2}$ in the Southern
126     Ocean over a time interval,
127     or a measure of some model-to-data misfit)
128     %
129     \begin{eqnarray}
130     \begin{array}{cccccc}
131     {\cal J} \, : & U &
132     \longrightarrow & V &
133     \longrightarrow & I \!\! R \\
134     ~ & \vec{u} & \longmapsto & \vec{v}={\cal M}(\vec{u}) &
135     \longmapsto & {\cal J}(\vec{u}) = {\cal J}({\cal M}(\vec{u}))
136     \end{array}
137     \label{compo}
138     \end{eqnarray}
139     %
140 heimbach 1.4 The perturbation of $ {\cal J} $ around a fixed point $ {\cal J}_0 $,
141 adcroft 1.1 \[
142 heimbach 1.4 {\cal J} \, = \, {\cal J}_0 \, + \, \delta {\cal J}
143 adcroft 1.1 \]
144     can be expressed in both bases of $ \vec{u} $ and $ \vec{v} $
145     w.r.t. their corresponding inner product
146     $\left\langle \,\, , \,\, \right\rangle $
147     %
148     \begin{equation}
149     \begin{split}
150     {\cal J} & = \,
151     {\cal J} |_{\vec{u}^{(0)}} \, + \,
152     \left\langle \, \nabla _{u}{\cal J}^T |_{\vec{u}^{(0)}} \, , \, \delta \vec{u} \, \right\rangle
153     \, + \, O(\delta \vec{u}^2) \\
154     ~ & = \,
155     {\cal J} |_{\vec{v}^{(0)}} \, + \,
156     \left\langle \, \nabla _{v}{\cal J}^T |_{\vec{v}^{(0)}} \, , \, \delta \vec{v} \, \right\rangle
157     \, + \, O(\delta \vec{v}^2)
158     \end{split}
159     \label{deljidentity}
160     \end{equation}
161     %
162 heimbach 1.2 (note, that the gradient $ \nabla f $ is a co-vector, therefore
163 adcroft 1.1 its transpose is required in the above inner product).
164     Then, using the representation of
165     $ \delta {\cal J} =
166     \left\langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \right\rangle $,
167     the definition
168     of an adjoint operator $ A^{\ast} $ of a given operator $ A $,
169     \[
170     \left\langle \, A^{\ast} \vec{x} \, , \, \vec{y} \, \right\rangle =
171     \left\langle \, \vec{x} \, , \, A \vec{y} \, \right\rangle
172     \]
173     which for finite-dimensional vector spaces is just the
174     transpose of $ A $,
175     \[
176     A^{\ast} \, = \, A^T
177     \]
178 heimbach 1.4 and from eq. (\ref{tangent_linear}), (\ref{deljidentity}),
179     we note that
180 adcroft 1.1 (omitting $|$'s):
181     %
182     \begin{equation}
183     \delta {\cal J}
184     \, = \,
185     \left\langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \right\rangle
186     \, = \,
187     \left\langle \, \nabla _{v}{\cal J}^T \, , \, M \, \delta \vec{u} \, \right\rangle
188     \, = \,
189     \left\langle \, M^T \, \nabla _{v}{\cal J}^T \, , \,
190     \delta \vec{u} \, \right\rangle
191     \label{inner}
192     \end{equation}
193     %
194     With the identity (\ref{deljidentity}), we then find that
195     the gradient $ \nabla _{u}{\cal J} $ can be readily inferred by
196     invoking the adjoint $ M^{\ast } $ of the tangent linear model $ M $
197     %
198     \begin{equation}
199     \begin{split}
200     \nabla _{u}{\cal J}^T |_{\vec{u}} &
201     = \, M^T |_{\vec{u}} \cdot \nabla _{v}{\cal J}^T |_{\vec{v}} \\
202     ~ & = \, M^T |_{\vec{u}} \cdot \delta \vec{v}^{\ast} \\
203     ~ & = \, \delta \vec{u}^{\ast}
204     \end{split}
205     \label{adjoint}
206     \end{equation}
207     %
208     Eq. (\ref{adjoint}) is the {\sf adjoint model (ADM)},
209     in which $M^T$ is the adjoint (here, the transpose) of the
210     tangent linear operator $M$, $ \delta \vec{v}^{\ast} $
211     the adjoint variable of the model state $ \vec{v} $, and
212     $ \delta \vec{u}^{\ast} $ the adjoint variable of the control variable $ \vec{u} $.
213    
214     The {\sf reverse} nature of the adjoint calculation can be readily
215 heimbach 1.4 seen as follows.
216     Consider a model integration which consists of $ \Lambda $
217     consecutive operations
218     $ {\cal M}_{\Lambda} ( {\cal M}_{\Lambda-1} (
219     ...... ( {\cal M}_{\lambda} (
220     ......
221     ( {\cal M}_{1} ( {\cal M}_{0}(\vec{u}) )))) $,
222     where the ${\cal M}$'s could be the elementary steps, i.e. single lines
223     in the code of the model, or successive time steps of the
224     model integration,
225     starting at step 0 and moving up to step $\Lambda$, with intermediate
226     ${\cal M}_{\lambda} (\vec{u}) = \vec{v}^{(\lambda+1)}$ and final
227     ${\cal M}_{\Lambda} (\vec{u}) = \vec{v}^{(\Lambda+1)} = \vec{v}$.
228 cnh 1.7 Let ${\cal J}$ be a cost function which explicitly depends on the
229 heimbach 1.4 final state $\vec{v}$ only
230     (this restriction is for clarity reasons only).
231     %
232     ${\cal J}(u)$ may be decomposed according to:
233 adcroft 1.1 %
234     \begin{equation}
235     {\cal J}({\cal M}(\vec{u})) \, = \,
236     {\cal J} ( {\cal M}_{\Lambda} ( {\cal M}_{\Lambda-1} (
237     ...... ( {\cal M}_{\lambda} (
238     ......
239     ( {\cal M}_{1} ( {\cal M}_{0}(\vec{u}) )))))
240     \label{compos}
241     \end{equation}
242     %
243 heimbach 1.4 Then, according to the chain rule, the forward calculation reads,
244     in terms of the Jacobi matrices
245 adcroft 1.1 (we've omitted the $ | $'s which, nevertheless are important
246     to the aspect of {\it tangent} linearity;
247 heimbach 1.4 note also that by definition
248 adcroft 1.1 $ \langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \rangle
249     = \nabla_v {\cal J} \cdot \delta \vec{v} $ )
250     %
251     \begin{equation}
252     \begin{split}
253     \nabla_v {\cal J} (M(\delta \vec{u})) & = \,
254     \nabla_v {\cal J} \cdot M_{\Lambda}
255     \cdot ...... \cdot M_{\lambda} \cdot ...... \cdot
256     M_{1} \cdot M_{0} \cdot \delta \vec{u} \\
257     ~ & = \, \nabla_v {\cal J} \cdot \delta \vec{v} \\
258     \end{split}
259     \label{forward}
260     \end{equation}
261     %
262     whereas in reverse mode we have
263     %
264     \begin{equation}
265     \boxed{
266     \begin{split}
267     M^T ( \nabla_v {\cal J}^T) & = \,
268     M_{0}^T \cdot M_{1}^T
269     \cdot ...... \cdot M_{\lambda}^T \cdot ...... \cdot
270     M_{\Lambda}^T \cdot \nabla_v {\cal J}^T \\
271     ~ & = \, M_{0}^T \cdot M_{1}^T
272     \cdot ...... \cdot
273     \nabla_{v^{(\lambda)}} {\cal J}^T \\
274     ~ & = \, \nabla_u {\cal J}^T
275     \end{split}
276     }
277     \label{reverse}
278     \end{equation}
279     %
280     clearly expressing the reverse nature of the calculation.
281     Eq. (\ref{reverse}) is at the heart of automatic adjoint compilers.
282 heimbach 1.4 If the intermediate steps $\lambda$ in
283 adcroft 1.1 eqn. (\ref{compos}) -- (\ref{reverse})
284 heimbach 1.4 represent the model state (forward or adjoint) at each
285     intermediate time step as noted above, then correspondingly,
286     $ M^T (\delta \vec{v}^{(\lambda) \, \ast}) =
287     \delta \vec{v}^{(\lambda-1) \, \ast} $ for the adjoint variables.
288     It thus becomes evident that the adjoint calculation also
289     yields the adjoint of each model state component
290     $ \vec{v}^{(\lambda)} $ at each intermediate step $ \lambda $, namely
291 adcroft 1.1 %
292     \begin{equation}
293     \boxed{
294     \begin{split}
295     \nabla_{v^{(\lambda)}} {\cal J}^T |_{\vec{v}^{(\lambda)}}
296     & = \,
297     M_{\lambda}^T |_{\vec{v}^{(\lambda)}} \cdot ...... \cdot
298     M_{\Lambda}^T |_{\vec{v}^{(\lambda)}} \cdot \delta \vec{v}^{\ast} \\
299     ~ & = \, \delta \vec{v}^{(\lambda) \, \ast}
300     \end{split}
301     }
302     \end{equation}
303     %
304     in close analogy to eq. (\ref{adjoint})
305     We note in passing that that the $\delta \vec{v}^{(\lambda) \, \ast}$
306 heimbach 1.4 are the Lagrange multipliers of the model equations which determine
307     $ \vec{v}^{(\lambda)}$.
308 adcroft 1.1
309 cnh 1.7 In components, eq. (\ref{adjoint}) reads as follows.
310 adcroft 1.1 Let
311     \[
312     \begin{array}{rclcrcl}
313     \delta \vec{u} & = &
314     \left( \delta u_1,\ldots, \delta u_m \right)^T , & \qquad &
315     \delta \vec{u}^{\ast} \,\, = \,\, \nabla_u {\cal J}^T & = &
316     \left(
317     \frac{\partial {\cal J}}{\partial u_1},\ldots,
318     \frac{\partial {\cal J}}{\partial u_m}
319     \right)^T \\
320     \delta \vec{v} & = &
321     \left( \delta v_1,\ldots, \delta u_n \right)^T , & \qquad &
322     \delta \vec{v}^{\ast} \,\, = \,\, \nabla_v {\cal J}^T & = &
323     \left(
324     \frac{\partial {\cal J}}{\partial v_1},\ldots,
325     \frac{\partial {\cal J}}{\partial v_n}
326     \right)^T \\
327     \end{array}
328     \]
329     denote the perturbations in $\vec{u}$ and $\vec{v}$, respectively,
330 cnh 1.7 and their adjoint variables;
331 adcroft 1.1 further
332     \[
333     M \, = \, \left(
334     \begin{array}{ccc}
335     \frac{\partial {\cal M}_1}{\partial u_1} & \ldots &
336     \frac{\partial {\cal M}_1}{\partial u_m} \\
337     \vdots & ~ & \vdots \\
338     \frac{\partial {\cal M}_n}{\partial u_1} & \ldots &
339     \frac{\partial {\cal M}_n}{\partial u_m} \\
340     \end{array}
341     \right)
342     \]
343     is the Jacobi matrix of $ {\cal M} $
344     (an $ n \times m $ matrix)
345     such that $ \delta \vec{v} = M \cdot \delta \vec{u} $, or
346     \[
347     \delta v_{j}
348     \, = \, \sum_{i=1}^m M_{ji} \, \delta u_{i}
349     \, = \, \sum_{i=1}^m \, \frac{\partial {\cal M}_{j}}{\partial u_{i}}
350     \delta u_{i}
351     \]
352     %
353     Then eq. (\ref{adjoint}) takes the form
354     \[
355     \delta u_{i}^{\ast}
356     \, = \, \sum_{j=1}^n M_{ji} \, \delta v_{j}^{\ast}
357     \, = \, \sum_{j=1}^n \, \frac{\partial {\cal M}_{j}}{\partial u_{i}}
358     \delta v_{j}^{\ast}
359     \]
360     %
361     or
362     %
363     \[
364     \left(
365     \begin{array}{c}
366     \left. \frac{\partial}{\partial u_1} {\cal J} \right|_{\vec{u}^{(0)}} \\
367     \vdots \\
368     \left. \frac{\partial}{\partial u_m} {\cal J} \right|_{\vec{u}^{(0)}} \\
369     \end{array}
370     \right)
371     \, = \,
372     \left(
373     \begin{array}{ccc}
374     \left. \frac{\partial {\cal M}_1}{\partial u_1} \right|_{\vec{u}^{(0)}}
375     & \ldots &
376     \left. \frac{\partial {\cal M}_n}{\partial u_1} \right|_{\vec{u}^{(0)}} \\
377     \vdots & ~ & \vdots \\
378     \left. \frac{\partial {\cal M}_1}{\partial u_m} \right|_{\vec{u}^{(0)}}
379     & \ldots &
380     \left. \frac{\partial {\cal M}_n}{\partial u_m} \right|_{\vec{u}^{(0)}} \\
381     \end{array}
382     \right)
383     \cdot
384     \left(
385     \begin{array}{c}
386     \left. \frac{\partial}{\partial v_1} {\cal J} \right|_{\vec{v}} \\
387     \vdots \\
388     \left. \frac{\partial}{\partial v_n} {\cal J} \right|_{\vec{v}} \\
389     \end{array}
390     \right)
391     \]
392     %
393     Furthermore, the adjoint $ \delta v^{(\lambda) \, \ast} $
394     of any intermediate state $ v^{(\lambda)} $
395     may be obtained, using the intermediate Jacobian
396     (an $ n_{\lambda+1} \times n_{\lambda} $ matrix)
397     %
398     \[
399     M_{\lambda} \, = \,
400     \left(
401     \begin{array}{ccc}
402     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1}
403     & \ldots &
404     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
405     \vdots & ~ & \vdots \\
406     \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1}
407     & \ldots &
408     \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
409     \end{array}
410     \right)
411     \]
412     %
413     and the shorthand notation for the adjoint variables
414     $ \delta v^{(\lambda) \, \ast}_{j} = \frac{\partial}{\partial v^{(\lambda)}_{j}}
415     {\cal J}^T $, $ j = 1, \ldots , n_{\lambda} $,
416     for intermediate components, yielding
417 heimbach 1.4 \begin{equation}
418     \small
419     \begin{split}
420 adcroft 1.1 \left(
421     \begin{array}{c}
422     \delta v^{(\lambda) \, \ast}_1 \\
423     \vdots \\
424     \delta v^{(\lambda) \, \ast}_{n_{\lambda}} \\
425     \end{array}
426     \right)
427 heimbach 1.4 \, = &
428 adcroft 1.1 \left(
429     \begin{array}{ccc}
430     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1}
431 heimbach 1.4 & \ldots \,\, \ldots &
432 adcroft 1.1 \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1} \\
433     \vdots & ~ & \vdots \\
434     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}}
435 heimbach 1.4 & \ldots \,\, \ldots &
436 adcroft 1.1 \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
437     \end{array}
438     \right)
439 heimbach 1.4 \cdot
440 adcroft 1.1 %
441 heimbach 1.4 \\ ~ & ~
442     \\ ~ &
443 adcroft 1.1 %
444     \left(
445     \begin{array}{ccc}
446     \frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_1}
447     & \ldots &
448     \frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_1} \\
449     \vdots & ~ & \vdots \\
450     \vdots & ~ & \vdots \\
451     \frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}}
452     & \ldots &
453     \frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}} \\
454     \end{array}
455     \right)
456 heimbach 1.4 \cdot \, \ldots \, \cdot
457 adcroft 1.1 \left(
458     \begin{array}{c}
459     \delta v^{\ast}_1 \\
460     \vdots \\
461     \delta v^{\ast}_{n} \\
462     \end{array}
463     \right)
464 heimbach 1.4 \end{split}
465     \end{equation}
466 adcroft 1.1
467     Eq. (\ref{forward}) and (\ref{reverse}) are perhaps clearest in
468     showing the advantage of the reverse over the forward mode
469     if the gradient $\nabla _{u}{\cal J}$, i.e. the sensitivity of the
470     cost function $ {\cal J} $ with respect to {\it all} input
471     variables $u$
472     (or the sensitivity of the cost function with respect to
473     {\it all} intermediate states $ \vec{v}^{(\lambda)} $) are sought.
474     In order to be able to solve for each component of the gradient
475     $ \partial {\cal J} / \partial u_{i} $ in (\ref{forward})
476 cnh 1.7 a forward calculation has to be performed for each component separately,
477 adcroft 1.1 i.e. $ \delta \vec{u} = \delta u_{i} {\vec{e}_{i}} $
478     for the $i$-th forward calculation.
479     Then, (\ref{forward}) represents the
480     projection of $ \nabla_u {\cal J} $ onto the $i$-th component.
481     The full gradient is retrieved from the $ m $ forward calculations.
482     In contrast, eq. (\ref{reverse}) yields the full
483     gradient $\nabla _{u}{\cal J}$ (and all intermediate gradients
484     $\nabla _{v^{(\lambda)}}{\cal J}$) within a single reverse calculation.
485    
486 heimbach 1.4 Note, that if $ {\cal J} $ is a vector-valued function
487 adcroft 1.1 of dimension $ l > 1 $,
488     eq. (\ref{reverse}) has to be modified according to
489     \[
490     M^T \left( \nabla_v {\cal J}^T \left(\delta \vec{J}\right) \right)
491     \, = \,
492     \nabla_u {\cal J}^T \cdot \delta \vec{J}
493     \]
494 heimbach 1.4 where now $ \delta \vec{J} \in I\!\!R^l $ is a vector of
495 cnh 1.7 dimension $ l $.
496 adcroft 1.1 In this case $ l $ reverse simulations have to be performed
497     for each $ \delta J_{k}, \,\, k = 1, \ldots, l $.
498     Then, the reverse mode is more efficient as long as
499     $ l < n $, otherwise the forward mode is preferable.
500 cnh 1.7 Strictly, the reverse mode is called adjoint mode only for
501 adcroft 1.1 $ l = 1 $.
502    
503     A detailed analysis of the underlying numerical operations
504     shows that the computation of $\nabla _{u}{\cal J}$ in this way
505     requires about 2 to 5 times the computation of the cost function.
506     Alternatively, the gradient vector could be approximated
507     by finite differences, requiring $m$ computations
508     of the perturbed cost function.
509    
510     To conclude we give two examples of commonly used types
511     of cost functions:
512    
513     \paragraph{Example 1:
514     $ {\cal J} = v_{j} (T) $} ~ \\
515     The cost function consists of the $j$-th component of the model state
516     $ \vec{v} $ at time $T$.
517     Then $ \nabla_v {\cal J}^T = {\vec{f}_{j}} $ is just the $j$-th
518     unit vector. The $ \nabla_u {\cal J}^T $
519     is the projection of the adjoint
520     operator onto the $j$-th component ${\bf f_{j}}$,
521     \[
522     \nabla_u {\cal J}^T
523     \, = \, M^T \cdot \nabla_v {\cal J}^T
524     \, = \, \sum_{i} M^T_{ji} \, {\vec{e}_{i}}
525     \]
526    
527     \paragraph{Example 2:
528     $ {\cal J} = \langle \, {\cal H}(\vec{v}) - \vec{d} \, ,
529     \, {\cal H}(\vec{v}) - \vec{d} \, \rangle $} ~ \\
530 heimbach 1.4 The cost function represents the quadratic model vs. data misfit.
531 adcroft 1.1 Here, $ \vec{d} $ is the data vector and $ {\cal H} $ represents the
532     operator which maps the model state space onto the data space.
533     Then, $ \nabla_v {\cal J} $ takes the form
534     %
535     \begin{equation*}
536     \begin{split}
537     \nabla_v {\cal J}^T & = \, 2 \, \, H \cdot
538     \left( \, {\cal H}(\vec{v}) - \vec{d} \, \right) \\
539     ~ & = \, 2 \sum_{j} \left\{ \sum_k
540     \frac{\partial {\cal H}_k}{\partial v_{j}}
541     \left( {\cal H}_k (\vec{v}) - d_k \right)
542     \right\} \, {\vec{f}_{j}} \\
543     \end{split}
544     \end{equation*}
545     %
546     where $H_{kj} = \partial {\cal H}_k / \partial v_{j} $ is the
547     Jacobi matrix of the data projection operator.
548     Thus, the gradient $ \nabla_u {\cal J} $ is given by the
549     adjoint operator,
550     driven by the model vs. data misfit:
551     \[
552     \nabla_u {\cal J}^T \, = \, 2 \, M^T \cdot
553     H \cdot \left( {\cal H}(\vec{v}) - \vec{d} \, \right)
554     \]
555    
556     \subsection{Storing vs. recomputation in reverse mode}
557     \label{checkpointing}
558    
559     We note an important aspect of the forward vs. reverse
560     mode calculation.
561 heimbach 1.4 Because of the local character of the derivative
562     (a derivative is defined w.r.t. a point along the trajectory),
563 adcroft 1.1 the intermediate results of the model trajectory
564     $\vec{v}^{(\lambda+1)}={\cal M}_{\lambda}(v^{(\lambda)})$
565 heimbach 1.15 may be required to evaluate the intermediate Jacobian
566 adcroft 1.1 $M_{\lambda}|_{\vec{v}^{(\lambda)}} \, \delta \vec{v}^{(\lambda)} $.
567 heimbach 1.15 This is the case e.g. for nonlinear expressions
568     (momentum advection, nonlinear equation of state), state-dependent
569     conditional statements (parameterization schemes).
570 adcroft 1.1 In the forward mode, the intermediate results are required
571     in the same order as computed by the full forward model ${\cal M}$,
572 heimbach 1.4 but in the reverse mode they are required in the reverse order.
573 adcroft 1.1 Thus, in the reverse mode the trajectory of the forward model
574     integration ${\cal M}$ has to be stored to be available in the reverse
575 heimbach 1.4 calculation. Alternatively, the complete model state up to the
576     point of evaluation has to be recomputed whenever its value is required.
577 adcroft 1.1
578     A method to balance the amount of recomputations vs.
579     storage requirements is called {\sf checkpointing}
580 heimbach 1.15 (e.g. \cite{gri:92}, \cite{res-eta:98}).
581 adcroft 1.6 It is depicted in \ref{fig:3levelcheck} for a 3-level checkpointing
582 heimbach 1.4 [as an example, we give explicit numbers for a 3-day
583 adcroft 1.1 integration with a 1-hourly timestep in square brackets].
584     \begin{itemize}
585     %
586     \item [$lev3$]
587     In a first step, the model trajectory is subdivided into
588     $ {n}^{lev3} $ subsections [$ {n}^{lev3} $=3 1-day intervals],
589     with the label $lev3$ for this outermost loop.
590 heimbach 1.4 The model is then integrated along the full trajectory,
591 heimbach 1.15 and the model state stored to disk only at every $ k_{i}^{lev3} $-th timestep
592 adcroft 1.1 [i.e. 3 times, at
593     $ i = 0,1,2 $ corresponding to $ k_{i}^{lev3} = 0, 24, 48 $].
594 heimbach 1.15 In addition, the cost function is computed, if needed.
595 adcroft 1.1 %
596     \item [$lev2$]
597 heimbach 1.4 In a second step each subsection itself is divided into
598 heimbach 1.15 $ {n}^{lev2} $ subsections
599 adcroft 1.1 [$ {n}^{lev2} $=4 6-hour intervals per subsection].
600     The model picks up at the last outermost dumped state
601 heimbach 1.4 $ v_{k_{n}^{lev3}} $ and is integrated forward in time along
602 adcroft 1.1 the last subsection, with the label $lev2$ for this
603     intermediate loop.
604 heimbach 1.15 The model state is now stored to disk at every $ k_{i}^{lev2} $-th
605 adcroft 1.1 timestep
606     [i.e. 4 times, at
607     $ i = 0,1,2,3 $ corresponding to $ k_{i}^{lev2} = 48, 54, 60, 66 $].
608     %
609     \item [$lev1$]
610 heimbach 1.4 Finally, the model picks up at the last intermediate dump state
611     $ v_{k_{n}^{lev2}} $ and is integrated forward in time along
612 heimbach 1.15 the last subsection, with the label $lev1$ for this
613 adcroft 1.1 intermediate loop.
614 heimbach 1.15 Within this sub-subsection only, parts of the model state is stored
615     to memory at every timestep
616 adcroft 1.1 [i.e. every hour $ i=0,...,5$ corresponding to
617     $ k_{i}^{lev1} = 66, 67, \ldots, 71 $].
618 heimbach 1.15 The final state $ v_n = v_{k_{n}^{lev1}} $ is reached
619     and the model state of all preceding timesteps along the last
620     innermost subsection are available, enabling integration backwards
621     in time along the last subsection.
622     The adjoint can thus be computed along this last
623     subsection $k_{n}^{lev2}$.
624 adcroft 1.1 %
625     \end{itemize}
626     %
627     This procedure is repeated consecutively for each previous
628 heimbach 1.15 subsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $
629 adcroft 1.1 carrying the adjoint computation to the initial time
630     of the subsection $k_{n}^{lev3}$.
631     Then, the procedure is repeated for the previous subsection
632     $k_{n-1}^{lev3}$
633     carrying the adjoint computation to the initial time
634     $k_{1}^{lev3}$.
635    
636     For the full model trajectory of
637     $ n^{lev3} \cdot n^{lev2} \cdot n^{lev1} $ timesteps
638     the required storing of the model state was significantly reduced to
639 heimbach 1.15 $ n^{lev2} + n^{lev3} $ to disk and roughly $ n^{lev1} $ to memory
640 adcroft 1.1 [i.e. for the 3-day integration with a total oof 72 timesteps
641 heimbach 1.15 the model state was stored 7 times to disk and roughly 6 times
642     to memory].
643 adcroft 1.1 This saving in memory comes at a cost of a required
644     3 full forward integrations of the model (one for each
645     checkpointing level).
646 heimbach 1.15 The optimal balance of storage vs. recomputation certainly depends
647     on the computing resources available and may be adjusted by
648     adjusting the partitioning among the
649     $ n^{lev3}, \,\, n^{lev2}, \,\, n^{lev1} $.
650 adcroft 1.1
651     \begin{figure}[t!]
652 adcroft 1.6 \begin{center}
653 adcroft 1.1 %\psdraft
654 adcroft 1.6 %\psfrag{v_k1^lev3}{\mathinfigure{v_{k_{1}^{lev3}}}}
655     %\psfrag{v_kn-1^lev3}{\mathinfigure{v_{k_{n-1}^{lev3}}}}
656     %\psfrag{v_kn^lev3}{\mathinfigure{v_{k_{n}^{lev3}}}}
657     %\psfrag{v_k1^lev2}{\mathinfigure{v_{k_{1}^{lev2}}}}
658     %\psfrag{v_kn-1^lev2}{\mathinfigure{v_{k_{n-1}^{lev2}}}}
659     %\psfrag{v_kn^lev2}{\mathinfigure{v_{k_{n}^{lev2}}}}
660     %\psfrag{v_k1^lev1}{\mathinfigure{v_{k_{1}^{lev1}}}}
661     %\psfrag{v_kn^lev1}{\mathinfigure{v_{k_{n}^{lev1}}}}
662     %\mbox{\epsfig{file=part5/checkpointing.eps, width=0.8\textwidth}}
663     \resizebox{5.5in}{!}{\includegraphics{part5/checkpointing.eps}}
664 adcroft 1.1 %\psfull
665 adcroft 1.6 \end{center}
666     \caption{
667     Schematic view of intermediate dump and restart for
668 adcroft 1.1 3-level checkpointing.}
669 heimbach 1.4 \label{fig:3levelcheck}
670 adcroft 1.1 \end{figure}
671    
672 heimbach 1.4 % \subsection{Optimal perturbations}
673     % \label{sec_optpert}
674 adcroft 1.1
675    
676 heimbach 1.4 % \subsection{Error covariance estimate and Hessian matrix}
677     % \label{sec_hessian}
678 adcroft 1.1
679     \newpage
680    
681     %**********************************************************************
682 heimbach 1.4 \section{TLM and ADM generation in general}
683 adcroft 1.1 \label{sec_ad_setup_gen}
684 edhill 1.17 \begin{rawhtml}
685     <!-- CMIREDIR:sec_ad_setup_gen: -->
686     \end{rawhtml}
687 adcroft 1.1 %**********************************************************************
688    
689     In this section we describe in a general fashion
690     the parts of the code that are relevant for automatic
691 heimbach 1.16 differentiation using the software tool TAF.
692 adcroft 1.1
693 heimbach 1.4 \input{part5/doc_ad_the_model}
694    
695 adcroft 1.6 The basic flow is depicted in \ref{fig:adthemodel}.
696 heimbach 1.19 If CPP option \texttt{ALLOW\_AUTODIFF\_TAMC} is defined,
697     the driver routine
698 heimbach 1.4 {\it the\_model\_main}, instead of calling {\it the\_main\_loop},
699 heimbach 1.19 invokes the adjoint of this routine, {\it adthe\_main\_loop}
700     (case \texttt{\#define ALLOW\_ADJOINT\_RUN}), or
701     the tangent linear of this routine {\it g\_the\_main\_loop}
702     (case \texttt{\#define ALLOW\_TANGENTLINEAR\_RUN}),
703     which are the toplevel routines in terms of automatic differentiation.
704     The routines {\it adthe\_main\_loop} or {\it g\_the\_main\_loop}
705     are generated by TAF.
706 heimbach 1.16 It contains both the forward integration of the full model, the
707     cost function calculation,
708 heimbach 1.4 any additional storing that is required for efficient checkpointing,
709     and the reverse integration of the adjoint model.
710 heimbach 1.16
711     [DESCRIBE IN A SEPARATE SECTION THE WORKING OF THE TLM]
712    
713     In Fig. \ref{fig:adthemodel}
714     the structure of {\it adthe\_main\_loop} has been strongly
715     simplified to focus on the essentials; in particular, no checkpointing
716 heimbach 1.4 procedures are shown here.
717     Prior to the call of {\it adthe\_main\_loop}, the routine
718 heimbach 1.15 {\it ctrl\_unpack} is invoked to unpack the control vector
719     or initialise the control variables.
720     Following the call of {\it adthe\_main\_loop},
721     the routine {\it ctrl\_pack}
722 heimbach 1.4 is invoked to pack the control vector
723     (cf. Section \ref{section_ctrl}).
724     If gradient checks are to be performed, the option
725     {\tt ALLOW\_GRADIENT\_CHECK} is defined. In this case
726     the driver routine {\it grdchk\_main} is called after
727     the gradient has been computed via the adjoint
728     (cf. Section \ref{section_grdchk}).
729    
730 heimbach 1.16 %------------------------------------------------------------------
731    
732     \subsection{General setup
733     \label{section_ad_setup}}
734    
735     In order to configure AD-related setups the following packages need
736     to be enabled:
737     {\it
738     \begin{table}[h!]
739     \begin{tabular}{l}
740     autodiff \\
741     ctrl \\
742     cost \\
743     grdchk \\
744     \end{tabular}
745     \end{table}
746     }
747     The packages are enabled by adding them to your experiment-specific
748     configuration file
749     {\it packages.conf} (see Section ???).
750    
751     The following AD-specific CPP option files need to be customized:
752     %
753     \begin{itemize}
754     %
755     \item {\it ECCO\_CPPOPTIONS.h} \\
756     This header file collects CPP options for the packages
757     {\it autodiff, cost, ctrl} as well as AD-unrelated options for
758     the external forcing package {\it exf}.
759     \footnote{NOTE: These options are not set in their package-specific
760     headers such as {\it COST\_CPPOPTIONS.h}, but are instead collected
761     in the single header file {\it ECCO\_CPPOPTIONS.h}.
762     The package-specific header files serve as simple
763     placeholders at this point.}
764     %
765     \item {\it tamc.h} \\
766     This header configures the splitting of the time stepping loop
767     w.r.t. the 3-level checkpointing (see section ???).
768    
769     %
770     \end{itemize}
771    
772     %------------------------------------------------------------------
773    
774     \subsection{Building the AD code
775     \label{section_ad_build}}
776    
777     The build process of an AD code is very similar to building
778     the forward model. However, depending on which AD code one wishes
779     to generate, and on which AD tool is available (TAF or TAMC),
780     the following {\tt make} targets are available:
781    
782     \begin{table}[h!]
783     {\footnotesize
784     \begin{tabular}{ccll}
785     ~ & {\it AD-target} & {\it output} & {\it description} \\
786     \hline
787     \hline
788     (1) & {\tt <MODE><TOOL>only} & {\tt <MODE>\_<TOOL>\_output.f} &
789     generates code for $<$MODE$>$ using $<$TOOL$>$ \\
790     ~ & ~ & ~ & no {\tt make} dependencies on {\tt .F .h} \\
791     ~ & ~ & ~ & useful for compiling on remote platforms \\
792     \hline
793     (2) & {\tt <MODE><TOOL>} & {\tt <MODE>\_<TOOL>\_output.f} &
794     generates code for $<$MODE$>$ using $<$TOOL$>$ \\
795     ~ & ~ & ~ & includes {\tt make} dependencies on {\tt .F .h} \\
796     ~ & ~ & ~ & i.e. input for $<$TOOL$>$ may be re-generated \\
797     \hline
798     (3) & {\tt <MODE>all} & {\tt mitgcmuv\_<MODE>} &
799     generates code for $<$MODE$>$ using $<$TOOL$>$ \\
800     ~ & ~ & ~ & and compiles all code \\
801     ~ & ~ & ~ & (use of TAF is set as default) \\
802     \hline
803     \hline
804     \end{tabular}
805     }
806     \end{table}
807     %
808     Here, the following placeholders are used
809     %
810     \begin{itemize}
811     %
812     \item [$<$TOOL$>$]
813     %
814     \begin{itemize}
815     %
816     \item {\tt TAF}
817     \item {\tt TAMC}
818     %
819     \end{itemize}
820     %
821     \item [$<$MODE$>$]
822     %
823     \begin{itemize}
824     %
825     \item {\tt ad} generates the adjoint model (ADM)
826     \item {\tt ftl} generates the tangent linear model (TLM)
827     \item {\tt svd} generates both ADM and TLM for \\
828     singular value decomposition (SVD) type calculations
829     %
830     \end{itemize}
831     %
832     \end{itemize}
833    
834     For example, to generate the adjoint model using TAF after routines ({\tt .F})
835     or headers ({\tt .h}) have been modified, but without compilation,
836     type {\tt make adtaf};
837     or, to generate the tangent linear model using TAMC without
838     re-generating the input code, type {\tt make ftltamconly}.
839    
840    
841     A typical full build process to generate the ADM via TAF would
842     look like follows:
843     \begin{verbatim}
844     % mkdir build
845     % cd build
846     % ../../../tools/genmake2 -mods=../code_ad
847     % make depend
848     % make adall
849     \end{verbatim}
850    
851     %------------------------------------------------------------------
852    
853     \subsection{The AD build process in detail
854     \label{section_ad_build_detail}}
855    
856     The {\tt make <MODE>all} target consists of the following procedures:
857    
858     \begin{enumerate}
859     %
860     \item
861     A header file {\tt AD\_CONFIG.h} is generated which contains a CPP option
862     on which code ought to be generated. Depending on the {\tt make} target,
863     the contents is
864     \begin{itemize}
865     \item
866     {\tt \#define ALLOW\_ADJOINT\_RUN}
867     \item
868     {\tt \#define ALLOW\_TANGENTLINEAR\_RUN}
869     \item
870     {\tt \#define ALLOW\_ECCO\_OPTIMIZATION}
871     \end{itemize}
872     %
873     \item
874     A single file {\tt <MODE>\_input\_code.f} is concatenated
875     consisting of all {\tt .f} files that are part of the list {\bf AD\_FILES}
876     and all {\tt .flow} files that are part of the list {\bf AD\_FLOW\_FILES}.
877     %
878     \item
879     The AD tool is invoked with the {\bf <MODE>\_<TOOL>\_FLAGS}.
880     The default AD tool flags in {\tt genmake2} can be overrwritten by
881     an {\tt adjoint\_options} file (similar to the platform-specific
882     {\tt build\_options}, see Section ???.
883     The AD tool writes the resulting AD code into the file
884     {\tt <MODE>\_input\_code\_ad.f}
885     %
886     \item
887     A short sed script {\tt adjoint\_sed} is applied to
888     {\tt <MODE>\_input\_code\_ad.f}
889     to reinstate {\bf myThid} into the CALL argument list of active file I/O.
890     The result is written to file {\tt <MODE>\_<TOOL>\_output.f}.
891     %
892     \item
893     All routines are compiled and an executable is generated
894     (see Table ???).
895     %
896     \end{enumerate}
897    
898     \subsubsection{The list AD\_FILES and {\tt .list} files}
899    
900     Not all routines are presented to the AD tool.
901     Routines typically hidden are diagnostics routines which
902     do not influence the cost function, but may create
903     artificial flow dependencies such as I/O of active variables.
904    
905     {\tt genmake2} generates a list (or variable) {\bf AD\_FILES}
906     which contains all routines that are shown to the AD tool.
907     This list is put together from all files with suffix {\tt .list}
908     that {\tt genmake2} finds in its search directories.
909     The list file for the core MITgcm routines is in {\tt model/src/}
910     is called {\tt model\_ad\_diff.list}.
911     Note that no wrapper routine is shown to TAF. These are either
912     not visible at all to the AD code, or hand-written AD code
913     is available (see next section).
914    
915     Each package directory contains its package-specific
916     list file {\tt <PKG>\_ad\_diff.list}. For example,
917     {\tt pkg/ptracers/} contains the file {\tt ptracers\_ad\_diff.list}.
918     Thus, enabling a package will automatically extend the
919     {\bf AD\_FILES} list of {\tt genmake2} to incorporate the
920     package-specific routines.
921     Note that you will need to regenerate the {\tt Makefile} if
922     you enable a package (e.g. by adding it to {\tt packages.conf})
923     and a {\tt Makefile} already exists.
924    
925     \subsubsection{The list AD\_FLOW\_FILES and {\tt .flow} files}
926    
927     TAMC and TAF can evaluate user-specified directives
928     that start with a specific syntax ({\tt CADJ}, {\tt C\$TAF}, {\tt !\$TAF}).
929     The main categories of directives are STORE directives and
930     FLOW directives. Here, we are concerned with flow directives,
931     store directives are treated elsewhere.
932    
933     Flow directives enable the AD tool to evaluate how it should treat
934     routines that are 'hidden' by the user, i.e. routines which are
935     not contained in the {\bf AD\_FILES} list (see previous section),
936     but which are called in part of the code that the AD tool does see.
937     The flow directive tell the AD tool
938     %
939     \begin{itemize}
940     %
941     \item which subroutine arguments are input/output
942     \item which subroutine arguments are active
943     \item which subroutine arguments are required to compute the cost
944     \item which subroutine arguments are dependent
945     %
946     \end{itemize}
947     %
948     The syntax for the flow directives can be found in the
949     AD tool manuals.
950    
951     {\tt genmake2} generates a list (or variable) {\bf AD\_FLOW\_FILES}
952     which contains all files with suffix{\tt .flow} that it finds
953     in its search directories.
954     The flow directives for the core MITgcm routines of
955     {\tt eesupp/src/} and {\tt model/src/}
956     reside in {\tt pkg/autodiff/}.
957     This directory also contains hand-written adjoint code
958     for the MITgcm WRAPPER (see Section ???).
959    
960     Flow directives for package-specific routines are contained in
961     the corresponding package directories in the file
962     {\tt <PKG>\_ad.flow}, e.g. ptracers-specific directives are in
963     {\tt ptracers\_ad.flow}.
964    
965     \subsubsection{Store directives for 3-level checkpointing}
966    
967     The storing that is required at each period of the
968     3-level checkpointing is controled by three
969     top-level headers.
970    
971     \begin{verbatim}
972     do ilev_3 = 1, nchklev_3
973     # include ``checkpoint_lev3.h''
974     do ilev_2 = 1, nchklev_2
975     # include ``checkpoint_lev2.h''
976     do ilev_1 = 1, nchklev_1
977     # include ``checkpoint_lev1.h''
978    
979     ...
980    
981     end do
982     end do
983     end do
984     \end{verbatim}
985    
986     All files {\tt checkpoint\_lev?.h} are contained in directory
987     {\tt pkg/autodiff/}.
988    
989    
990     \subsubsection{Changing the default AD tool flags: ad\_options files}
991    
992    
993     \subsubsection{Hand-written adjoint code}
994    
995     %------------------------------------------------------------------
996    
997 heimbach 1.4 \subsection{The cost function (dependent variable)
998     \label{section_cost}}
999 adcroft 1.1
1000     The cost function $ {\cal J} $ is referred to as the {\sf dependent variable}.
1001     It is a function of the input variables $ \vec{u} $ via the composition
1002     $ {\cal J}(\vec{u}) \, = \, {\cal J}(M(\vec{u})) $.
1003 heimbach 1.15 The input are referred to as the
1004 adcroft 1.1 {\sf independent variables} or {\sf control variables}.
1005     All aspects relevant to the treatment of the cost function $ {\cal J} $
1006 cnh 1.7 (parameter setting, initialization, accumulation,
1007 heimbach 1.4 final evaluation), are controlled by the package {\it pkg/cost}.
1008 heimbach 1.15 The aspects relevant to the treatment of the independent variables
1009     are controlled by the package {\it pkg/ctrl} and will be treated
1010     in the next section.
1011 heimbach 1.4
1012     \input{part5/doc_cost_flow}
1013 adcroft 1.1
1014 heimbach 1.16 \subsubsection{Enabling the package}
1015    
1016 adcroft 1.1 \fbox{
1017     \begin{minipage}{12cm}
1018 heimbach 1.16 {\it packages.conf}, {\it ECCO\_CPPOPTIONS.h}
1019 adcroft 1.1 \end{minipage}
1020     }
1021 heimbach 1.16 \begin{itemize}
1022 adcroft 1.1 %
1023 heimbach 1.16 \item
1024     The package is enabled by adding {\it cost} to your file {\it packages.conf}
1025     (see Section ???)
1026 adcroft 1.1 %
1027 heimbach 1.16 \item
1028    
1029    
1030     \end{itemize}
1031 adcroft 1.1 %
1032 heimbach 1.16
1033 heimbach 1.15 N.B.: In general the following packages ought to be enabled
1034     simultaneously: {\it autodiff, cost, ctrl}.
1035 heimbach 1.4 The basic CPP option to enable the cost function is {\bf ALLOW\_COST}.
1036     Each specific cost function contribution has its own option.
1037     For the present example the option is {\bf ALLOW\_COST\_TRACER}.
1038     All cost-specific options are set in {\it ECCO\_CPPOPTIONS.h}
1039 adcroft 1.1 Since the cost function is usually used in conjunction with
1040     automatic differentiation, the CPP option
1041 heimbach 1.15 {\bf ALLOW\_ADJOINT\_RUN} (file {\it CPP\_OPTIONS.h}) and
1042     {\bf ALLOW\_AUTODIFF\_TAMC} (file {\it ECCO\_CPPOPTIONS.h})
1043     should be defined.
1044 adcroft 1.1
1045 cnh 1.7 \subsubsection{Initialization}
1046 adcroft 1.1 %
1047 cnh 1.7 The initialization of the {\it cost} package is readily enabled
1048 heimbach 1.15 as soon as the CPP option {\bf ALLOW\_COST} is defined.
1049 adcroft 1.1 %
1050     \begin{itemize}
1051     %
1052     \item
1053     \fbox{
1054     \begin{minipage}{12cm}
1055     Parameters: {\it cost\_readparms}
1056     \end{minipage}
1057     }
1058     \\
1059     This S/R
1060     reads runtime flags and parameters from file {\it data.cost}.
1061     For the present example the only relevant parameter read
1062     is {\bf mult\_tracer}. This multiplier enables different
1063     cost function contributions to be switched on
1064     ( = 1.) or off ( = 0.) at runtime.
1065     For more complex cost functions which involve model vs. data
1066     misfits, the corresponding data filenames and data
1067     specifications (start date and time, period, ...) are read
1068     in this S/R.
1069     %
1070     \item
1071     \fbox{
1072     \begin{minipage}{12cm}
1073     Variables: {\it cost\_init}
1074     \end{minipage}
1075     }
1076     \\
1077     This S/R
1078 cnh 1.7 initializes the different cost function contributions.
1079     The contribution for the present example is {\bf objf\_tracer}
1080 adcroft 1.1 which is defined on each tile (bi,bj).
1081     %
1082     \end{itemize}
1083     %
1084 heimbach 1.4 \subsubsection{Accumulation}
1085 adcroft 1.1 %
1086     \begin{itemize}
1087     %
1088     \item
1089     \fbox{
1090     \begin{minipage}{12cm}
1091     {\it cost\_tile}, {\it cost\_tracer}
1092     \end{minipage}
1093     }
1094     \end{itemize}
1095     %
1096     The 'driver' routine
1097     {\it cost\_tile} is called at the end of each time step.
1098     Within this 'driver' routine, S/R are called for each of
1099     the chosen cost function contributions.
1100     In the present example ({\bf ALLOW\_COST\_TRACER}),
1101     S/R {\it cost\_tracer} is called.
1102     It accumulates {\bf objf\_tracer} according to eqn. (\ref{???}).
1103     %
1104     \subsubsection{Finalize all contributions}
1105     %
1106     \begin{itemize}
1107     %
1108     \item
1109     \fbox{
1110     \begin{minipage}{12cm}
1111     {\it cost\_final}
1112     \end{minipage}
1113     }
1114     \end{itemize}
1115     %
1116     At the end of the forward integration S/R {\it cost\_final}
1117     is called. It accumulates the total cost function {\bf fc}
1118     from each contribution and sums over all tiles:
1119     \begin{equation}
1120     {\cal J} \, = \,
1121     {\rm fc} \, = \,
1122 heimbach 1.15 {\rm mult\_tracer} \sum_{\text{global sum}} \sum_{bi,\,bj}^{nSx,\,nSy}
1123 adcroft 1.1 {\rm objf\_tracer}(bi,bj) \, + \, ...
1124     \end{equation}
1125     %
1126     The total cost function {\bf fc} will be the
1127     'dependent' variable in the argument list for TAMC, i.e.
1128     \begin{verbatim}
1129     tamc -output 'fc' ...
1130     \end{verbatim}
1131    
1132 cnh 1.3 %%%% \end{document}
1133 adcroft 1.1
1134     \input{part5/doc_ad_the_main}
1135    
1136 heimbach 1.4 \subsection{The control variables (independent variables)
1137     \label{section_ctrl}}
1138 adcroft 1.1
1139     The control variables are a subset of the model input
1140     (initial conditions, boundary conditions, model parameters).
1141     Here we identify them with the variable $ \vec{u} $.
1142     All intermediate variables whose derivative w.r.t. control
1143 heimbach 1.4 variables do not vanish are called {\sf active variables}.
1144 adcroft 1.1 All subroutines whose derivative w.r.t. the control variables
1145     don't vanish are called {\sf active routines}.
1146     Read and write operations from and to file can be viewed
1147     as variable assignments. Therefore, files to which
1148     active variables are written and from which active variables
1149     are read are called {\sf active files}.
1150     All aspects relevant to the treatment of the control variables
1151 cnh 1.7 (parameter setting, initialization, perturbation)
1152     are controlled by the package {\it pkg/ctrl}.
1153 adcroft 1.1
1154 heimbach 1.4 \input{part5/doc_ctrl_flow}
1155    
1156 adcroft 1.1 \subsubsection{genmake and CPP options}
1157     %
1158     \begin{itemize}
1159     %
1160     \item
1161     \fbox{
1162     \begin{minipage}{12cm}
1163     {\it genmake}, {\it CPP\_OPTIONS.h}, {\it ECCO\_CPPOPTIONS.h}
1164     \end{minipage}
1165     }
1166     \end{itemize}
1167     %
1168     To enable the directory to be included to the compile list,
1169     {\bf ctrl} has to be added to the {\bf enable} list in
1170 heimbach 1.15 {\it .genmakerc} or in {\it genmake} itself (analogous to {\it cost}
1171     package, cf. previous section).
1172 adcroft 1.1 Each control variable is enabled via its own CPP option
1173     in {\it ECCO\_CPPOPTIONS.h}.
1174    
1175 cnh 1.7 \subsubsection{Initialization}
1176 adcroft 1.1 %
1177     \begin{itemize}
1178     %
1179     \item
1180     \fbox{
1181     \begin{minipage}{12cm}
1182     Parameters: {\it ctrl\_readparms}
1183     \end{minipage}
1184     }
1185     \\
1186     %
1187     This S/R
1188     reads runtime flags and parameters from file {\it data.ctrl}.
1189     For the present example the file contains the file names
1190     of each control variable that is used.
1191     In addition, the number of wet points for each control
1192     variable and the net dimension of the space of control
1193     variables (counting wet points only) {\bf nvarlength}
1194     is determined.
1195     Masks for wet points for each tile {\bf (bi,\,bj)}
1196     and vertical layer {\bf k} are generated for the three
1197     relevant categories on the C-grid:
1198     {\bf nWetCtile} for tracer fields,
1199     {\bf nWetWtile} for zonal velocity fields,
1200     {\bf nWetStile} for meridional velocity fields.
1201     %
1202     \item
1203     \fbox{
1204     \begin{minipage}{12cm}
1205     Control variables, control vector,
1206     and their gradients: {\it ctrl\_unpack}
1207     \end{minipage}
1208     }
1209     \\
1210     %
1211     Two important issues related to the handling of the control
1212     variables in the MITGCM need to be addressed.
1213     First, in order to save memory, the control variable arrays
1214     are not kept in memory, but rather read from file and added
1215 cnh 1.7 to the initial fields during the model initialization phase.
1216 adcroft 1.1 Similarly, the corresponding adjoint fields which represent
1217     the gradient of the cost function w.r.t. the control variables
1218 heimbach 1.4 are written to file at the end of the adjoint integration.
1219 adcroft 1.1 Second, in addition to the files holding the 2-dim. and 3-dim.
1220 heimbach 1.4 control variables and the corresponding cost gradients,
1221     a 1-dim. {\sf control vector}
1222 adcroft 1.1 and {\sf gradient vector} are written to file. They contain
1223     only the wet points of the control variables and the corresponding
1224     gradient.
1225     This leads to a significant data compression.
1226 heimbach 1.4 Furthermore, an option is available
1227     ({\tt ALLOW\_NONDIMENSIONAL\_CONTROL\_IO}) to
1228     non-dimensionalise the control and gradient vector,
1229     which otherwise would contain different pieces of different
1230     magnitudes and units.
1231     Finally, the control and gradient vector can be passed to a
1232 adcroft 1.1 minimization routine if an update of the control variables
1233     is sought as part of a minimization exercise.
1234    
1235     The files holding fields and vectors of the control variables
1236     and gradient are generated and initialised in S/R {\it ctrl\_unpack}.
1237     %
1238     \end{itemize}
1239    
1240     \subsubsection{Perturbation of the independent variables}
1241     %
1242 heimbach 1.4 The dependency flow for differentiation w.r.t. the controls
1243     starts with adding a perturbation onto the input variable,
1244 adcroft 1.1 thus defining the independent or control variables for TAMC.
1245 heimbach 1.4 Three types of controls may be considered:
1246 adcroft 1.1 %
1247     \begin{itemize}
1248     %
1249     \item
1250     \fbox{
1251     \begin{minipage}{12cm}
1252     {\it ctrl\_map\_ini} (initial value sensitivity):
1253     \end{minipage}
1254     }
1255     \\
1256     %
1257     Consider as an example the initial tracer distribution
1258     {\bf tr1} as control variable.
1259     After {\bf tr1} has been initialised in
1260 heimbach 1.4 {\it ini\_tr1} (dynamical variables such as
1261 adcroft 1.1 temperature and salinity are initialised in {\it ini\_fields}),
1262     a perturbation anomaly is added to the field in S/R
1263     {\it ctrl\_map\_ini}
1264     %
1265     \begin{equation}
1266     \begin{split}
1267     u & = \, u_{[0]} \, + \, \Delta u \\
1268     {\bf tr1}(...) & = \, {\bf tr1_{ini}}(...) \, + \, {\bf xx\_tr1}(...)
1269     \label{perturb}
1270     \end{split}
1271     \end{equation}
1272     %
1273 heimbach 1.4 {\bf xx\_tr1} is a 3-dim. global array
1274 adcroft 1.1 holding the perturbation. In the case of a simple
1275     sensitivity study this array is identical to zero.
1276 heimbach 1.4 However, it's specification is essential in the context
1277     of automatic differentiation since TAMC
1278 adcroft 1.1 treats the corresponding line in the code symbolically
1279     when determining the differentiation chain and its origin.
1280     Thus, the variable names are part of the argument list
1281     when calling TAMC:
1282     %
1283     \begin{verbatim}
1284     tamc -input 'xx_tr1 ...' ...
1285     \end{verbatim}
1286     %
1287     Now, as mentioned above, the MITGCM avoids maintaining
1288     an array for each control variable by reading the
1289     perturbation to a temporary array from file.
1290     To ensure the symbolic link to be recognized by TAMC, a scalar
1291     dummy variable {\bf xx\_tr1\_dummy} is introduced
1292     and an 'active read' routine of the adjoint support
1293     package {\it pkg/autodiff} is invoked.
1294     The read-procedure is tagged with the variable
1295 cnh 1.7 {\bf xx\_tr1\_dummy} enabling TAMC to recognize the
1296     initialization of the perturbation.
1297 adcroft 1.1 The modified call of TAMC thus reads
1298     %
1299     \begin{verbatim}
1300     tamc -input 'xx_tr1_dummy ...' ...
1301     \end{verbatim}
1302     %
1303     and the modified operation to (\ref{perturb})
1304     in the code takes on the form
1305     %
1306     \begin{verbatim}
1307     call active_read_xyz(
1308     & ..., tmpfld3d, ..., xx_tr1_dummy, ... )
1309    
1310     tr1(...) = tr1(...) + tmpfld3d(...)
1311     \end{verbatim}
1312     %
1313     Note, that reading an active variable corresponds
1314     to a variable assignment. Its derivative corresponds
1315 heimbach 1.15 to a write statement of the adjoint variable, followed by
1316     a reset.
1317 adcroft 1.1 The 'active file' routines have been designed
1318 heimbach 1.4 to support active read and corresponding adjoint active write
1319     operations (and vice versa).
1320 adcroft 1.1 %
1321     \item
1322     \fbox{
1323     \begin{minipage}{12cm}
1324     {\it ctrl\_map\_forcing} (boundary value sensitivity):
1325     \end{minipage}
1326     }
1327     \\
1328     %
1329     The handling of boundary values as control variables
1330     proceeds exactly analogous to the initial values
1331     with the symbolic perturbation taking place in S/R
1332     {\it ctrl\_map\_forcing}.
1333     Note however an important difference:
1334     Since the boundary values are time dependent with a new
1335     forcing field applied at each time steps,
1336 heimbach 1.4 the general problem may be thought of as
1337     a new control variable at each time step
1338     (or, if the perturbation is averaged over a certain period,
1339     at each $ N $ timesteps), i.e.
1340 adcroft 1.1 \[
1341     u_{\rm forcing} \, = \,
1342     \{ \, u_{\rm forcing} ( t_n ) \, \}_{
1343     n \, = \, 1, \ldots , {\rm nTimeSteps} }
1344     \]
1345     %
1346     In the current example an equilibrium state is considered,
1347     and only an initial perturbation to
1348     surface forcing is applied with respect to the
1349     equilibrium state.
1350     A time dependent treatment of the surface forcing is
1351     implemented in the ECCO environment, involving the
1352     calendar ({\it cal}~) and external forcing ({\it exf}~) packages.
1353     %
1354     \item
1355     \fbox{
1356     \begin{minipage}{12cm}
1357     {\it ctrl\_map\_params} (parameter sensitivity):
1358     \end{minipage}
1359     }
1360     \\
1361     %
1362     This routine is not yet implemented, but would proceed
1363     proceed along the same lines as the initial value sensitivity.
1364 heimbach 1.4 The mixing parameters {\bf diffkr} and {\bf kapgm}
1365     are currently added as controls in {\it ctrl\_map\_ini.F}.
1366 adcroft 1.1 %
1367     \end{itemize}
1368     %
1369    
1370     \subsubsection{Output of adjoint variables and gradient}
1371     %
1372 heimbach 1.4 Several ways exist to generate output of adjoint fields.
1373 adcroft 1.1 %
1374     \begin{itemize}
1375     %
1376     \item
1377     \fbox{
1378     \begin{minipage}{12cm}
1379 heimbach 1.4 {\it ctrl\_map\_ini, ctrl\_map\_forcing}:
1380 adcroft 1.1 \end{minipage}
1381     }
1382     \\
1383     \begin{itemize}
1384     %
1385 heimbach 1.4 \item {\bf xx\_...}: the control variable fields \\
1386     Before the forward integration, the control
1387     variables are read from file {\bf xx\_ ...} and added to
1388     the model field.
1389 adcroft 1.1 %
1390     \item {\bf adxx\_...}: the adjoint variable fields, i.e. the gradient
1391 heimbach 1.4 $ \nabla _{u}{\cal J} $ for each control variable \\
1392     After the adjoint integration the corresponding adjoint
1393     variables are written to {\bf adxx\_ ...}.
1394 adcroft 1.1 %
1395 heimbach 1.4 \end{itemize}
1396 adcroft 1.1 %
1397 heimbach 1.4 \item
1398     \fbox{
1399     \begin{minipage}{12cm}
1400     {\it ctrl\_unpack, ctrl\_pack}:
1401     \end{minipage}
1402     }
1403     \\
1404     %
1405     \begin{itemize}
1406     %
1407     \item {\bf vector\_ctrl}: the control vector \\
1408 cnh 1.7 At the very beginning of the model initialization,
1409 heimbach 1.4 the updated compressed control vector is read (or initialised)
1410     and distributed to 2-dim. and 3-dim. control variable fields.
1411     %
1412     \item {\bf vector\_grad}: the gradient vector \\
1413     At the very end of the adjoint integration,
1414     the 2-dim. and 3-dim. adjoint variables are read,
1415     compressed to a single vector and written to file.
1416 adcroft 1.1 %
1417     \end{itemize}
1418     %
1419     \item
1420     \fbox{
1421     \begin{minipage}{12cm}
1422     {\it addummy\_in\_stepping}:
1423     \end{minipage}
1424     }
1425     \\
1426     In addition to writing the gradient at the end of the
1427 heimbach 1.4 forward/adjoint integration, many more adjoint variables
1428     of the model state
1429     at intermediate times can be written using S/R
1430 adcroft 1.1 {\it addummy\_in\_stepping}.
1431     This routine is part of the adjoint support package
1432     {\it pkg/autodiff} (cf.f. below).
1433 heimbach 1.15 The procedure is enabled using via the CPP-option
1434     {\bf ALLOW\_AUTODIFF\_MONITOR} (file {\it ECCO\_CPPOPTIONS.h}).
1435 adcroft 1.1 To be part of the adjoint code, the corresponding S/R
1436     {\it dummy\_in\_stepping} has to be called in the forward
1437     model (S/R {\it the\_main\_loop}) at the appropriate place.
1438 heimbach 1.15 The adjoint common blocks are extracted from the adjoint code
1439     via the header file {\it adcommon.h}.
1440 adcroft 1.1
1441     {\it dummy\_in\_stepping} is essentially empty,
1442     the corresponding adjoint routine is hand-written rather
1443     than generated automatically.
1444     Appropriate flow directives ({\it dummy\_in\_stepping.flow})
1445     ensure that TAMC does not automatically
1446     generate {\it addummy\_in\_stepping} by trying to differentiate
1447 heimbach 1.4 {\it dummy\_in\_stepping}, but instead refers to
1448     the hand-written routine.
1449 adcroft 1.1
1450     {\it dummy\_in\_stepping} is called in the forward code
1451     at the beginning of each
1452     timestep, before the call to {\it dynamics}, thus ensuring
1453     that {\it addummy\_in\_stepping} is called at the end of
1454     each timestep in the adjoint calculation, after the call to
1455     {\it addynamics}.
1456    
1457     {\it addummy\_in\_stepping} includes the header files
1458 heimbach 1.4 {\it adcommon.h}.
1459     This header file is also hand-written. It contains
1460     the common blocks
1461     {\bf /addynvars\_r/}, {\bf /addynvars\_cd/},
1462     {\bf /addynvars\_diffkr/}, {\bf /addynvars\_kapgm/},
1463 adcroft 1.1 {\bf /adtr1\_r/}, {\bf /adffields/},
1464     which have been extracted from the adjoint code to enable
1465     access to the adjoint variables.
1466 heimbach 1.15
1467     {\bf WARNING:} If the structure of the common blocks
1468     {\bf /dynvars\_r/}, {\bf /dynvars\_cd/}, etc., changes
1469     similar changes will occur in the adjoint common blocks.
1470     Therefore, consistency between the TAMC-generated common blocks
1471     and those in {\it adcommon.h} have to be checked.
1472 adcroft 1.1 %
1473     \end{itemize}
1474    
1475    
1476     \subsubsection{Control variable handling for
1477     optimization applications}
1478    
1479     In optimization mode the cost function $ {\cal J}(u) $ is sought
1480     to be minimized with respect to a set of control variables
1481     $ \delta {\cal J} \, = \, 0 $, in an iterative manner.
1482     The gradient $ \nabla _{u}{\cal J} |_{u_{[k]}} $ together
1483     with the value of the cost function itself $ {\cal J}(u_{[k]}) $
1484     at iteration step $ k $ serve
1485     as input to a minimization routine (e.g. quasi-Newton method,
1486 heimbach 1.9 conjugate gradient, ... \cite{gil-lem:89})
1487 heimbach 1.4 to compute an update in the
1488 adcroft 1.1 control variable for iteration step $k+1$
1489     \[
1490     u_{[k+1]} \, = \, u_{[0]} \, + \, \Delta u_{[k+1]}
1491     \quad \mbox{satisfying} \quad
1492     {\cal J} \left( u_{[k+1]} \right) \, < \, {\cal J} \left( u_{[k]} \right)
1493     \]
1494     $ u_{[k+1]} $ then serves as input for a forward/adjoint run
1495     to determine $ {\cal J} $ and $ \nabla _{u}{\cal J} $ at iteration step
1496     $ k+1 $.
1497     Tab. \ref{???} sketches the flow between forward/adjoint model
1498     and the minimization routine.
1499    
1500     \begin{eqnarray*}
1501 heimbach 1.4 \scriptsize
1502 adcroft 1.1 \begin{array}{ccccc}
1503     u_{[0]} \,\, , \,\, \Delta u_{[k]} & ~ & ~ & ~ & ~ \\
1504     {\Big\downarrow}
1505     & ~ & ~ & ~ & ~ \\
1506     ~ & ~ & ~ & ~ & ~ \\
1507     \hline
1508     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1509     \multicolumn{1}{|c}{
1510     u_{[k]} = u_{[0]} + \Delta u_{[k]}} &
1511     \stackrel{\bf forward}{\bf \longrightarrow} &
1512     v_{[k]} = M \left( u_{[k]} \right) &
1513     \stackrel{\bf forward}{\bf \longrightarrow} &
1514     \multicolumn{1}{c|}{
1515     {\cal J}_{[k]} = {\cal J} \left( M \left( u_{[k]} \right) \right)} \\
1516     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1517     \hline
1518 heimbach 1.4 \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1519     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{{\Big\downarrow}} \\
1520     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1521 adcroft 1.1 \hline
1522     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1523     \multicolumn{1}{|c}{
1524     \nabla_u {\cal J}_{[k]} (\delta {\cal J}) =
1525 heimbach 1.4 T^{\ast} \cdot \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J})} &
1526 adcroft 1.1 \stackrel{\bf adjoint}{\mathbf \longleftarrow} &
1527     ad \, v_{[k]} (\delta {\cal J}) =
1528     \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J}) &
1529     \stackrel{\bf adjoint}{\mathbf \longleftarrow} &
1530     \multicolumn{1}{c|}{ ad \, {\cal J} = \delta {\cal J}} \\
1531     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1532     \hline
1533     ~ & ~ & ~ & ~ & ~ \\
1534 heimbach 1.4 \hspace*{15ex}{\Bigg\downarrow}
1535     \quad {\cal J}_{[k]}, \quad \nabla_u {\cal J}_{[k]}
1536     & ~ & ~ & ~ & ~ \\
1537 adcroft 1.1 ~ & ~ & ~ & ~ & ~ \\
1538     \hline
1539     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1540     \multicolumn{1}{|c}{
1541     {\cal J}_{[k]} \,\, , \,\, \nabla_u {\cal J}_{[k]}} &
1542     {\mathbf \longrightarrow} & \text{\bf minimisation} &
1543     {\mathbf \longrightarrow} &
1544     \multicolumn{1}{c|}{ \Delta u_{[k+1]}} \\
1545     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1546     \hline
1547     ~ & ~ & ~ & ~ & ~ \\
1548     ~ & ~ & ~ & ~ & \Big\downarrow \\
1549     ~ & ~ & ~ & ~ & \Delta u_{[k+1]} \\
1550     \end{array}
1551     \end{eqnarray*}
1552    
1553     The routines {\it ctrl\_unpack} and {\it ctrl\_pack} provide
1554     the link between the model and the minimization routine.
1555     As described in Section \ref{???}
1556     the {\it unpack} and {\it pack} routines read and write
1557     control and gradient {\it vectors} which are compressed
1558     to contain only wet points, in addition to the full
1559     2-dim. and 3-dim. fields.
1560     The corresponding I/O flow looks as follows:
1561    
1562     \vspace*{0.5cm}
1563    
1564 heimbach 1.4 {\scriptsize
1565 adcroft 1.1 \begin{tabular}{ccccc}
1566     {\bf vector\_ctrl\_$<$k$>$ } & ~ & ~ & ~ & ~ \\
1567     {\big\downarrow} & ~ & ~ & ~ & ~ \\
1568     \cline{1-1}
1569     \multicolumn{1}{|c|}{\it ctrl\_unpack} & ~ & ~ & ~ & ~ \\
1570     \cline{1-1}
1571     {\big\downarrow} & ~ & ~ & ~ & ~ \\
1572     \cline{3-3}
1573     \multicolumn{1}{l}{\bf xx\_theta0...$<$k$>$} & ~ &
1574     \multicolumn{1}{|c|}{~} & ~ & ~ \\
1575 heimbach 1.4 \multicolumn{1}{l}{\bf xx\_salt0...$<$k$>$} &
1576     $\stackrel{\mbox{read}}{\longrightarrow}$ &
1577 adcroft 1.1 \multicolumn{1}{|c|}{forward integration} & ~ & ~ \\
1578     \multicolumn{1}{l}{\bf \vdots} & ~ & \multicolumn{1}{|c|}{~}
1579     & ~ & ~ \\
1580     \cline{3-3}
1581 heimbach 1.4 ~ & ~ & $\downarrow$ & ~ & ~ \\
1582 adcroft 1.1 \cline{3-3}
1583     ~ & ~ &
1584     \multicolumn{1}{|c|}{~} & ~ &
1585     \multicolumn{1}{l}{\bf adxx\_theta0...$<$k$>$} \\
1586     ~ & ~ & \multicolumn{1}{|c|}{adjoint integration} &
1587 heimbach 1.4 $\stackrel{\mbox{write}}{\longrightarrow}$ &
1588 adcroft 1.1 \multicolumn{1}{l}{\bf adxx\_salt0...$<$k$>$} \\
1589     ~ & ~ & \multicolumn{1}{|c|}{~}
1590     & ~ & \multicolumn{1}{l}{\bf \vdots} \\
1591     \cline{3-3}
1592     ~ & ~ & ~ & ~ & {\big\downarrow} \\
1593     \cline{5-5}
1594     ~ & ~ & ~ & ~ & \multicolumn{1}{|c|}{\it ctrl\_pack} \\
1595     \cline{5-5}
1596     ~ & ~ & ~ & ~ & {\big\downarrow} \\
1597     ~ & ~ & ~ & ~ & {\bf vector\_grad\_$<$k$>$ } \\
1598     \end{tabular}
1599 heimbach 1.4 }
1600 adcroft 1.1
1601     \vspace*{0.5cm}
1602    
1603    
1604 heimbach 1.4 {\it ctrl\_unpack} reads the updated control vector
1605 adcroft 1.1 {\bf vector\_ctrl\_$<$k$>$}.
1606     It distributes the different control variables to
1607     2-dim. and 3-dim. files {\it xx\_...$<$k$>$}.
1608 heimbach 1.4 At the start of the forward integration the control variables
1609     are read from {\it xx\_...$<$k$>$} and added to the
1610     field.
1611     Correspondingly, at the end of the adjoint integration
1612     the adjoint fields are written
1613 adcroft 1.1 to {\it adxx\_...$<$k$>$}, again via the active file routines.
1614 heimbach 1.4 Finally, {\it ctrl\_pack} collects all adjoint files
1615 adcroft 1.1 and writes them to the compressed vector file
1616     {\bf vector\_grad\_$<$k$>$}.

  ViewVC Help
Powered by ViewVC 1.1.22