/[MITgcm]/manual/s_autodiff/text/doc_ad_2.tex
ViewVC logotype

Annotation of /manual/s_autodiff/text/doc_ad_2.tex

Parent Directory Parent Directory | Revision Log Revision Log | View Revision Graph Revision Graph


Revision 1.20 - (hide annotations) (download) (as text)
Wed Apr 5 02:27:33 2006 UTC (19 years, 3 months ago) by edhill
Branch: MAIN
Changes since 1.19: +33 -40 lines
File MIME type: application/x-tex
refer to the model uniformly as "MITgcm" and treat it as a proper noun

1 edhill 1.20 % $Header: /u/gcmpack/manual/part5/doc_ad_2.tex,v 1.19 2005/08/02 22:26:58 heimbach Exp $
2 heimbach 1.2 % $Name: $
3 adcroft 1.1
4 heimbach 1.18 Author: Patrick Heimbach
5    
6 adcroft 1.1 {\sf Automatic differentiation} (AD), also referred to as algorithmic
7 edhill 1.20 (or, more loosely, computational) differentiation, involves
8     automatically deriving code to calculate partial derivatives from an
9     existing fully non-linear prognostic code. (see \cite{gri:00}). A
10     software tool is used that parses and transforms source files
11     according to a set of linguistic and mathematical rules. AD tools are
12     like source-to-source translators in that they parse a program code as
13     input and produce a new program code as output. However, unlike a
14     pure source-to-source translation, the output program represents a new
15     algorithm, such as the evaluation of the Jacobian, the Hessian, or
16     higher derivative operators. In principle, a variety of derived
17     algorithms can be generated automatically in this way.
18    
19     MITgcm has been adapted for use with the Tangent linear and Adjoint
20     Model Compiler (TAMC) and its successor TAF (Transformation of
21     Algorithms in Fortran), developed by Ralf Giering (\cite{gie-kam:98},
22     \cite{gie:99,gie:00}). The first application of the adjoint of MITgcm
23     for sensitivity studies has been published by \cite{maro-eta:99}.
24     \cite{sta-eta:97,sta-eta:01} use MITgcm and its adjoint for ocean
25     state estimation studies. In the following we shall refer to TAMC and
26     TAF synonymously, except were explicitly stated otherwise.
27    
28     TAMC exploits the chain rule for computing the first derivative of a
29     function with respect to a set of input variables. Treating a given
30     forward code as a composition of operations -- each line representing
31     a compositional element, the chain rule is rigorously applied to the
32     code, line by line. The resulting tangent linear or adjoint code,
33     then, may be thought of as the composition in forward or reverse
34     order, respectively, of the Jacobian matrices of the forward code's
35     compositional elements.
36 adcroft 1.1
37     %**********************************************************************
38     \section{Some basic algebra}
39     \label{sec_ad_algebra}
40 edhill 1.17 \begin{rawhtml}
41     <!-- CMIREDIR:sec_ad_algebra: -->
42     \end{rawhtml}
43 adcroft 1.1 %**********************************************************************
44    
45     Let $ \cal{M} $ be a general nonlinear, model, i.e. a
46     mapping from the $m$-dimensional space
47     $U \subset I\!\!R^m$ of input variables
48     $\vec{u}=(u_1,\ldots,u_m)$
49     (model parameters, initial conditions, boundary conditions
50     such as forcing functions) to the $n$-dimensional space
51     $V \subset I\!\!R^n$ of
52     model output variable $\vec{v}=(v_1,\ldots,v_n)$
53 cnh 1.7 (model state, model diagnostics, objective function, ...)
54 adcroft 1.1 under consideration,
55     %
56     \begin{equation}
57     \begin{split}
58     {\cal M} \, : & \, U \,\, \longrightarrow \, V \\
59     ~ & \, \vec{u} \,\, \longmapsto \, \vec{v} \, = \,
60     {\cal M}(\vec{u})
61     \label{fulloperator}
62     \end{split}
63     \end{equation}
64     %
65     The vectors $ \vec{u} \in U $ and $ v \in V $ may be represented w.r.t.
66     some given basis vectors
67     $ {\rm span} (U) = \{ {\vec{e}_i} \}_{i = 1, \ldots , m} $ and
68     $ {\rm span} (V) = \{ {\vec{f}_j} \}_{j = 1, \ldots , n} $ as
69     \[
70     \vec{u} \, = \, \sum_{i=1}^{m} u_i \, {\vec{e}_i},
71     \qquad
72     \vec{v} \, = \, \sum_{j=1}^{n} v_j \, {\vec{f}_j}
73     \]
74    
75     Two routes may be followed to determine the sensitivity of the
76     output variable $\vec{v}$ to its input $\vec{u}$.
77    
78     \subsection{Forward or direct sensitivity}
79     %
80     Consider a perturbation to the input variables $\delta \vec{u}$
81     (typically a single component
82     $\delta \vec{u} = \delta u_{i} \, {\vec{e}_{i}}$).
83     Their effect on the output may be obtained via the linear
84     approximation of the model $ {\cal M}$ in terms of its Jacobian matrix
85     $ M $, evaluated in the point $u^{(0)}$ according to
86     %
87     \begin{equation}
88     \delta \vec{v} \, = \, M |_{\vec{u}^{(0)}} \, \delta \vec{u}
89     \label{tangent_linear}
90     \end{equation}
91     with resulting output perturbation $\delta \vec{v}$.
92     In components
93     $M_{j i} \, = \, \partial {\cal M}_{j} / \partial u_{i} $,
94     it reads
95     %
96     \begin{equation}
97     \delta v_{j} \, = \, \sum_{i}
98     \left. \frac{\partial {\cal M}_{j}}{\partial u_{i}} \right|_{u^{(0)}} \,
99     \delta u_{i}
100     \label{jacobi_matrix}
101     \end{equation}
102     %
103     Eq. (\ref{tangent_linear}) is the {\sf tangent linear model (TLM)}.
104     In contrast to the full nonlinear model $ {\cal M} $, the operator
105     $ M $ is just a matrix
106     which can readily be used to find the forward sensitivity of $\vec{v}$ to
107     perturbations in $u$,
108 heimbach 1.4 but if there are very many input variables $(\gg O(10^{6})$ for
109 adcroft 1.1 large-scale oceanographic application), it quickly becomes
110     prohibitive to proceed directly as in (\ref{tangent_linear}),
111     if the impact of each component $ {\bf e_{i}} $ is to be assessed.
112    
113     \subsection{Reverse or adjoint sensitivity}
114     %
115     Let us consider the special case of a
116     scalar objective function ${\cal J}(\vec{v})$ of the model output (e.g.
117     the total meridional heat transport,
118     the total uptake of $CO_{2}$ in the Southern
119     Ocean over a time interval,
120     or a measure of some model-to-data misfit)
121     %
122     \begin{eqnarray}
123     \begin{array}{cccccc}
124     {\cal J} \, : & U &
125     \longrightarrow & V &
126     \longrightarrow & I \!\! R \\
127     ~ & \vec{u} & \longmapsto & \vec{v}={\cal M}(\vec{u}) &
128     \longmapsto & {\cal J}(\vec{u}) = {\cal J}({\cal M}(\vec{u}))
129     \end{array}
130     \label{compo}
131     \end{eqnarray}
132     %
133 heimbach 1.4 The perturbation of $ {\cal J} $ around a fixed point $ {\cal J}_0 $,
134 adcroft 1.1 \[
135 heimbach 1.4 {\cal J} \, = \, {\cal J}_0 \, + \, \delta {\cal J}
136 adcroft 1.1 \]
137     can be expressed in both bases of $ \vec{u} $ and $ \vec{v} $
138     w.r.t. their corresponding inner product
139     $\left\langle \,\, , \,\, \right\rangle $
140     %
141     \begin{equation}
142     \begin{split}
143     {\cal J} & = \,
144     {\cal J} |_{\vec{u}^{(0)}} \, + \,
145     \left\langle \, \nabla _{u}{\cal J}^T |_{\vec{u}^{(0)}} \, , \, \delta \vec{u} \, \right\rangle
146     \, + \, O(\delta \vec{u}^2) \\
147     ~ & = \,
148     {\cal J} |_{\vec{v}^{(0)}} \, + \,
149     \left\langle \, \nabla _{v}{\cal J}^T |_{\vec{v}^{(0)}} \, , \, \delta \vec{v} \, \right\rangle
150     \, + \, O(\delta \vec{v}^2)
151     \end{split}
152     \label{deljidentity}
153     \end{equation}
154     %
155 heimbach 1.2 (note, that the gradient $ \nabla f $ is a co-vector, therefore
156 adcroft 1.1 its transpose is required in the above inner product).
157     Then, using the representation of
158     $ \delta {\cal J} =
159     \left\langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \right\rangle $,
160     the definition
161     of an adjoint operator $ A^{\ast} $ of a given operator $ A $,
162     \[
163     \left\langle \, A^{\ast} \vec{x} \, , \, \vec{y} \, \right\rangle =
164     \left\langle \, \vec{x} \, , \, A \vec{y} \, \right\rangle
165     \]
166     which for finite-dimensional vector spaces is just the
167     transpose of $ A $,
168     \[
169     A^{\ast} \, = \, A^T
170     \]
171 heimbach 1.4 and from eq. (\ref{tangent_linear}), (\ref{deljidentity}),
172     we note that
173 adcroft 1.1 (omitting $|$'s):
174     %
175     \begin{equation}
176     \delta {\cal J}
177     \, = \,
178     \left\langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \right\rangle
179     \, = \,
180     \left\langle \, \nabla _{v}{\cal J}^T \, , \, M \, \delta \vec{u} \, \right\rangle
181     \, = \,
182     \left\langle \, M^T \, \nabla _{v}{\cal J}^T \, , \,
183     \delta \vec{u} \, \right\rangle
184     \label{inner}
185     \end{equation}
186     %
187     With the identity (\ref{deljidentity}), we then find that
188     the gradient $ \nabla _{u}{\cal J} $ can be readily inferred by
189     invoking the adjoint $ M^{\ast } $ of the tangent linear model $ M $
190     %
191     \begin{equation}
192     \begin{split}
193     \nabla _{u}{\cal J}^T |_{\vec{u}} &
194     = \, M^T |_{\vec{u}} \cdot \nabla _{v}{\cal J}^T |_{\vec{v}} \\
195     ~ & = \, M^T |_{\vec{u}} \cdot \delta \vec{v}^{\ast} \\
196     ~ & = \, \delta \vec{u}^{\ast}
197     \end{split}
198     \label{adjoint}
199     \end{equation}
200     %
201     Eq. (\ref{adjoint}) is the {\sf adjoint model (ADM)},
202     in which $M^T$ is the adjoint (here, the transpose) of the
203     tangent linear operator $M$, $ \delta \vec{v}^{\ast} $
204     the adjoint variable of the model state $ \vec{v} $, and
205     $ \delta \vec{u}^{\ast} $ the adjoint variable of the control variable $ \vec{u} $.
206    
207     The {\sf reverse} nature of the adjoint calculation can be readily
208 heimbach 1.4 seen as follows.
209     Consider a model integration which consists of $ \Lambda $
210     consecutive operations
211     $ {\cal M}_{\Lambda} ( {\cal M}_{\Lambda-1} (
212     ...... ( {\cal M}_{\lambda} (
213     ......
214     ( {\cal M}_{1} ( {\cal M}_{0}(\vec{u}) )))) $,
215     where the ${\cal M}$'s could be the elementary steps, i.e. single lines
216     in the code of the model, or successive time steps of the
217     model integration,
218     starting at step 0 and moving up to step $\Lambda$, with intermediate
219     ${\cal M}_{\lambda} (\vec{u}) = \vec{v}^{(\lambda+1)}$ and final
220     ${\cal M}_{\Lambda} (\vec{u}) = \vec{v}^{(\Lambda+1)} = \vec{v}$.
221 cnh 1.7 Let ${\cal J}$ be a cost function which explicitly depends on the
222 heimbach 1.4 final state $\vec{v}$ only
223     (this restriction is for clarity reasons only).
224     %
225     ${\cal J}(u)$ may be decomposed according to:
226 adcroft 1.1 %
227     \begin{equation}
228     {\cal J}({\cal M}(\vec{u})) \, = \,
229     {\cal J} ( {\cal M}_{\Lambda} ( {\cal M}_{\Lambda-1} (
230     ...... ( {\cal M}_{\lambda} (
231     ......
232     ( {\cal M}_{1} ( {\cal M}_{0}(\vec{u}) )))))
233     \label{compos}
234     \end{equation}
235     %
236 heimbach 1.4 Then, according to the chain rule, the forward calculation reads,
237     in terms of the Jacobi matrices
238 adcroft 1.1 (we've omitted the $ | $'s which, nevertheless are important
239     to the aspect of {\it tangent} linearity;
240 heimbach 1.4 note also that by definition
241 adcroft 1.1 $ \langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \rangle
242     = \nabla_v {\cal J} \cdot \delta \vec{v} $ )
243     %
244     \begin{equation}
245     \begin{split}
246     \nabla_v {\cal J} (M(\delta \vec{u})) & = \,
247     \nabla_v {\cal J} \cdot M_{\Lambda}
248     \cdot ...... \cdot M_{\lambda} \cdot ...... \cdot
249     M_{1} \cdot M_{0} \cdot \delta \vec{u} \\
250     ~ & = \, \nabla_v {\cal J} \cdot \delta \vec{v} \\
251     \end{split}
252     \label{forward}
253     \end{equation}
254     %
255     whereas in reverse mode we have
256     %
257     \begin{equation}
258     \boxed{
259     \begin{split}
260     M^T ( \nabla_v {\cal J}^T) & = \,
261     M_{0}^T \cdot M_{1}^T
262     \cdot ...... \cdot M_{\lambda}^T \cdot ...... \cdot
263     M_{\Lambda}^T \cdot \nabla_v {\cal J}^T \\
264     ~ & = \, M_{0}^T \cdot M_{1}^T
265     \cdot ...... \cdot
266     \nabla_{v^{(\lambda)}} {\cal J}^T \\
267     ~ & = \, \nabla_u {\cal J}^T
268     \end{split}
269     }
270     \label{reverse}
271     \end{equation}
272     %
273     clearly expressing the reverse nature of the calculation.
274     Eq. (\ref{reverse}) is at the heart of automatic adjoint compilers.
275 heimbach 1.4 If the intermediate steps $\lambda$ in
276 adcroft 1.1 eqn. (\ref{compos}) -- (\ref{reverse})
277 heimbach 1.4 represent the model state (forward or adjoint) at each
278     intermediate time step as noted above, then correspondingly,
279     $ M^T (\delta \vec{v}^{(\lambda) \, \ast}) =
280     \delta \vec{v}^{(\lambda-1) \, \ast} $ for the adjoint variables.
281     It thus becomes evident that the adjoint calculation also
282     yields the adjoint of each model state component
283     $ \vec{v}^{(\lambda)} $ at each intermediate step $ \lambda $, namely
284 adcroft 1.1 %
285     \begin{equation}
286     \boxed{
287     \begin{split}
288     \nabla_{v^{(\lambda)}} {\cal J}^T |_{\vec{v}^{(\lambda)}}
289     & = \,
290     M_{\lambda}^T |_{\vec{v}^{(\lambda)}} \cdot ...... \cdot
291     M_{\Lambda}^T |_{\vec{v}^{(\lambda)}} \cdot \delta \vec{v}^{\ast} \\
292     ~ & = \, \delta \vec{v}^{(\lambda) \, \ast}
293     \end{split}
294     }
295     \end{equation}
296     %
297     in close analogy to eq. (\ref{adjoint})
298     We note in passing that that the $\delta \vec{v}^{(\lambda) \, \ast}$
299 heimbach 1.4 are the Lagrange multipliers of the model equations which determine
300     $ \vec{v}^{(\lambda)}$.
301 adcroft 1.1
302 cnh 1.7 In components, eq. (\ref{adjoint}) reads as follows.
303 adcroft 1.1 Let
304     \[
305     \begin{array}{rclcrcl}
306     \delta \vec{u} & = &
307     \left( \delta u_1,\ldots, \delta u_m \right)^T , & \qquad &
308     \delta \vec{u}^{\ast} \,\, = \,\, \nabla_u {\cal J}^T & = &
309     \left(
310     \frac{\partial {\cal J}}{\partial u_1},\ldots,
311     \frac{\partial {\cal J}}{\partial u_m}
312     \right)^T \\
313     \delta \vec{v} & = &
314     \left( \delta v_1,\ldots, \delta u_n \right)^T , & \qquad &
315     \delta \vec{v}^{\ast} \,\, = \,\, \nabla_v {\cal J}^T & = &
316     \left(
317     \frac{\partial {\cal J}}{\partial v_1},\ldots,
318     \frac{\partial {\cal J}}{\partial v_n}
319     \right)^T \\
320     \end{array}
321     \]
322     denote the perturbations in $\vec{u}$ and $\vec{v}$, respectively,
323 cnh 1.7 and their adjoint variables;
324 adcroft 1.1 further
325     \[
326     M \, = \, \left(
327     \begin{array}{ccc}
328     \frac{\partial {\cal M}_1}{\partial u_1} & \ldots &
329     \frac{\partial {\cal M}_1}{\partial u_m} \\
330     \vdots & ~ & \vdots \\
331     \frac{\partial {\cal M}_n}{\partial u_1} & \ldots &
332     \frac{\partial {\cal M}_n}{\partial u_m} \\
333     \end{array}
334     \right)
335     \]
336     is the Jacobi matrix of $ {\cal M} $
337     (an $ n \times m $ matrix)
338     such that $ \delta \vec{v} = M \cdot \delta \vec{u} $, or
339     \[
340     \delta v_{j}
341     \, = \, \sum_{i=1}^m M_{ji} \, \delta u_{i}
342     \, = \, \sum_{i=1}^m \, \frac{\partial {\cal M}_{j}}{\partial u_{i}}
343     \delta u_{i}
344     \]
345     %
346     Then eq. (\ref{adjoint}) takes the form
347     \[
348     \delta u_{i}^{\ast}
349     \, = \, \sum_{j=1}^n M_{ji} \, \delta v_{j}^{\ast}
350     \, = \, \sum_{j=1}^n \, \frac{\partial {\cal M}_{j}}{\partial u_{i}}
351     \delta v_{j}^{\ast}
352     \]
353     %
354     or
355     %
356     \[
357     \left(
358     \begin{array}{c}
359     \left. \frac{\partial}{\partial u_1} {\cal J} \right|_{\vec{u}^{(0)}} \\
360     \vdots \\
361     \left. \frac{\partial}{\partial u_m} {\cal J} \right|_{\vec{u}^{(0)}} \\
362     \end{array}
363     \right)
364     \, = \,
365     \left(
366     \begin{array}{ccc}
367     \left. \frac{\partial {\cal M}_1}{\partial u_1} \right|_{\vec{u}^{(0)}}
368     & \ldots &
369     \left. \frac{\partial {\cal M}_n}{\partial u_1} \right|_{\vec{u}^{(0)}} \\
370     \vdots & ~ & \vdots \\
371     \left. \frac{\partial {\cal M}_1}{\partial u_m} \right|_{\vec{u}^{(0)}}
372     & \ldots &
373     \left. \frac{\partial {\cal M}_n}{\partial u_m} \right|_{\vec{u}^{(0)}} \\
374     \end{array}
375     \right)
376     \cdot
377     \left(
378     \begin{array}{c}
379     \left. \frac{\partial}{\partial v_1} {\cal J} \right|_{\vec{v}} \\
380     \vdots \\
381     \left. \frac{\partial}{\partial v_n} {\cal J} \right|_{\vec{v}} \\
382     \end{array}
383     \right)
384     \]
385     %
386     Furthermore, the adjoint $ \delta v^{(\lambda) \, \ast} $
387     of any intermediate state $ v^{(\lambda)} $
388     may be obtained, using the intermediate Jacobian
389     (an $ n_{\lambda+1} \times n_{\lambda} $ matrix)
390     %
391     \[
392     M_{\lambda} \, = \,
393     \left(
394     \begin{array}{ccc}
395     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1}
396     & \ldots &
397     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
398     \vdots & ~ & \vdots \\
399     \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1}
400     & \ldots &
401     \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
402     \end{array}
403     \right)
404     \]
405     %
406     and the shorthand notation for the adjoint variables
407     $ \delta v^{(\lambda) \, \ast}_{j} = \frac{\partial}{\partial v^{(\lambda)}_{j}}
408     {\cal J}^T $, $ j = 1, \ldots , n_{\lambda} $,
409     for intermediate components, yielding
410 heimbach 1.4 \begin{equation}
411     \small
412     \begin{split}
413 adcroft 1.1 \left(
414     \begin{array}{c}
415     \delta v^{(\lambda) \, \ast}_1 \\
416     \vdots \\
417     \delta v^{(\lambda) \, \ast}_{n_{\lambda}} \\
418     \end{array}
419     \right)
420 heimbach 1.4 \, = &
421 adcroft 1.1 \left(
422     \begin{array}{ccc}
423     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1}
424 heimbach 1.4 & \ldots \,\, \ldots &
425 adcroft 1.1 \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1} \\
426     \vdots & ~ & \vdots \\
427     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}}
428 heimbach 1.4 & \ldots \,\, \ldots &
429 adcroft 1.1 \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
430     \end{array}
431     \right)
432 heimbach 1.4 \cdot
433 adcroft 1.1 %
434 heimbach 1.4 \\ ~ & ~
435     \\ ~ &
436 adcroft 1.1 %
437     \left(
438     \begin{array}{ccc}
439     \frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_1}
440     & \ldots &
441     \frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_1} \\
442     \vdots & ~ & \vdots \\
443     \vdots & ~ & \vdots \\
444     \frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}}
445     & \ldots &
446     \frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}} \\
447     \end{array}
448     \right)
449 heimbach 1.4 \cdot \, \ldots \, \cdot
450 adcroft 1.1 \left(
451     \begin{array}{c}
452     \delta v^{\ast}_1 \\
453     \vdots \\
454     \delta v^{\ast}_{n} \\
455     \end{array}
456     \right)
457 heimbach 1.4 \end{split}
458     \end{equation}
459 adcroft 1.1
460     Eq. (\ref{forward}) and (\ref{reverse}) are perhaps clearest in
461     showing the advantage of the reverse over the forward mode
462     if the gradient $\nabla _{u}{\cal J}$, i.e. the sensitivity of the
463     cost function $ {\cal J} $ with respect to {\it all} input
464     variables $u$
465     (or the sensitivity of the cost function with respect to
466     {\it all} intermediate states $ \vec{v}^{(\lambda)} $) are sought.
467     In order to be able to solve for each component of the gradient
468     $ \partial {\cal J} / \partial u_{i} $ in (\ref{forward})
469 cnh 1.7 a forward calculation has to be performed for each component separately,
470 adcroft 1.1 i.e. $ \delta \vec{u} = \delta u_{i} {\vec{e}_{i}} $
471     for the $i$-th forward calculation.
472     Then, (\ref{forward}) represents the
473     projection of $ \nabla_u {\cal J} $ onto the $i$-th component.
474     The full gradient is retrieved from the $ m $ forward calculations.
475     In contrast, eq. (\ref{reverse}) yields the full
476     gradient $\nabla _{u}{\cal J}$ (and all intermediate gradients
477     $\nabla _{v^{(\lambda)}}{\cal J}$) within a single reverse calculation.
478    
479 heimbach 1.4 Note, that if $ {\cal J} $ is a vector-valued function
480 adcroft 1.1 of dimension $ l > 1 $,
481     eq. (\ref{reverse}) has to be modified according to
482     \[
483     M^T \left( \nabla_v {\cal J}^T \left(\delta \vec{J}\right) \right)
484     \, = \,
485     \nabla_u {\cal J}^T \cdot \delta \vec{J}
486     \]
487 heimbach 1.4 where now $ \delta \vec{J} \in I\!\!R^l $ is a vector of
488 cnh 1.7 dimension $ l $.
489 adcroft 1.1 In this case $ l $ reverse simulations have to be performed
490     for each $ \delta J_{k}, \,\, k = 1, \ldots, l $.
491     Then, the reverse mode is more efficient as long as
492     $ l < n $, otherwise the forward mode is preferable.
493 cnh 1.7 Strictly, the reverse mode is called adjoint mode only for
494 adcroft 1.1 $ l = 1 $.
495    
496     A detailed analysis of the underlying numerical operations
497     shows that the computation of $\nabla _{u}{\cal J}$ in this way
498     requires about 2 to 5 times the computation of the cost function.
499     Alternatively, the gradient vector could be approximated
500     by finite differences, requiring $m$ computations
501     of the perturbed cost function.
502    
503     To conclude we give two examples of commonly used types
504     of cost functions:
505    
506     \paragraph{Example 1:
507     $ {\cal J} = v_{j} (T) $} ~ \\
508     The cost function consists of the $j$-th component of the model state
509     $ \vec{v} $ at time $T$.
510     Then $ \nabla_v {\cal J}^T = {\vec{f}_{j}} $ is just the $j$-th
511     unit vector. The $ \nabla_u {\cal J}^T $
512     is the projection of the adjoint
513     operator onto the $j$-th component ${\bf f_{j}}$,
514     \[
515     \nabla_u {\cal J}^T
516     \, = \, M^T \cdot \nabla_v {\cal J}^T
517     \, = \, \sum_{i} M^T_{ji} \, {\vec{e}_{i}}
518     \]
519    
520     \paragraph{Example 2:
521     $ {\cal J} = \langle \, {\cal H}(\vec{v}) - \vec{d} \, ,
522     \, {\cal H}(\vec{v}) - \vec{d} \, \rangle $} ~ \\
523 heimbach 1.4 The cost function represents the quadratic model vs. data misfit.
524 adcroft 1.1 Here, $ \vec{d} $ is the data vector and $ {\cal H} $ represents the
525     operator which maps the model state space onto the data space.
526     Then, $ \nabla_v {\cal J} $ takes the form
527     %
528     \begin{equation*}
529     \begin{split}
530     \nabla_v {\cal J}^T & = \, 2 \, \, H \cdot
531     \left( \, {\cal H}(\vec{v}) - \vec{d} \, \right) \\
532     ~ & = \, 2 \sum_{j} \left\{ \sum_k
533     \frac{\partial {\cal H}_k}{\partial v_{j}}
534     \left( {\cal H}_k (\vec{v}) - d_k \right)
535     \right\} \, {\vec{f}_{j}} \\
536     \end{split}
537     \end{equation*}
538     %
539     where $H_{kj} = \partial {\cal H}_k / \partial v_{j} $ is the
540     Jacobi matrix of the data projection operator.
541     Thus, the gradient $ \nabla_u {\cal J} $ is given by the
542     adjoint operator,
543     driven by the model vs. data misfit:
544     \[
545     \nabla_u {\cal J}^T \, = \, 2 \, M^T \cdot
546     H \cdot \left( {\cal H}(\vec{v}) - \vec{d} \, \right)
547     \]
548    
549     \subsection{Storing vs. recomputation in reverse mode}
550     \label{checkpointing}
551    
552     We note an important aspect of the forward vs. reverse
553     mode calculation.
554 heimbach 1.4 Because of the local character of the derivative
555     (a derivative is defined w.r.t. a point along the trajectory),
556 adcroft 1.1 the intermediate results of the model trajectory
557     $\vec{v}^{(\lambda+1)}={\cal M}_{\lambda}(v^{(\lambda)})$
558 heimbach 1.15 may be required to evaluate the intermediate Jacobian
559 adcroft 1.1 $M_{\lambda}|_{\vec{v}^{(\lambda)}} \, \delta \vec{v}^{(\lambda)} $.
560 heimbach 1.15 This is the case e.g. for nonlinear expressions
561     (momentum advection, nonlinear equation of state), state-dependent
562     conditional statements (parameterization schemes).
563 adcroft 1.1 In the forward mode, the intermediate results are required
564     in the same order as computed by the full forward model ${\cal M}$,
565 heimbach 1.4 but in the reverse mode they are required in the reverse order.
566 adcroft 1.1 Thus, in the reverse mode the trajectory of the forward model
567     integration ${\cal M}$ has to be stored to be available in the reverse
568 heimbach 1.4 calculation. Alternatively, the complete model state up to the
569     point of evaluation has to be recomputed whenever its value is required.
570 adcroft 1.1
571     A method to balance the amount of recomputations vs.
572     storage requirements is called {\sf checkpointing}
573 heimbach 1.15 (e.g. \cite{gri:92}, \cite{res-eta:98}).
574 adcroft 1.6 It is depicted in \ref{fig:3levelcheck} for a 3-level checkpointing
575 heimbach 1.4 [as an example, we give explicit numbers for a 3-day
576 adcroft 1.1 integration with a 1-hourly timestep in square brackets].
577     \begin{itemize}
578     %
579     \item [$lev3$]
580     In a first step, the model trajectory is subdivided into
581     $ {n}^{lev3} $ subsections [$ {n}^{lev3} $=3 1-day intervals],
582     with the label $lev3$ for this outermost loop.
583 heimbach 1.4 The model is then integrated along the full trajectory,
584 heimbach 1.15 and the model state stored to disk only at every $ k_{i}^{lev3} $-th timestep
585 adcroft 1.1 [i.e. 3 times, at
586     $ i = 0,1,2 $ corresponding to $ k_{i}^{lev3} = 0, 24, 48 $].
587 heimbach 1.15 In addition, the cost function is computed, if needed.
588 adcroft 1.1 %
589     \item [$lev2$]
590 heimbach 1.4 In a second step each subsection itself is divided into
591 heimbach 1.15 $ {n}^{lev2} $ subsections
592 adcroft 1.1 [$ {n}^{lev2} $=4 6-hour intervals per subsection].
593     The model picks up at the last outermost dumped state
594 heimbach 1.4 $ v_{k_{n}^{lev3}} $ and is integrated forward in time along
595 adcroft 1.1 the last subsection, with the label $lev2$ for this
596     intermediate loop.
597 heimbach 1.15 The model state is now stored to disk at every $ k_{i}^{lev2} $-th
598 adcroft 1.1 timestep
599     [i.e. 4 times, at
600     $ i = 0,1,2,3 $ corresponding to $ k_{i}^{lev2} = 48, 54, 60, 66 $].
601     %
602     \item [$lev1$]
603 heimbach 1.4 Finally, the model picks up at the last intermediate dump state
604     $ v_{k_{n}^{lev2}} $ and is integrated forward in time along
605 heimbach 1.15 the last subsection, with the label $lev1$ for this
606 adcroft 1.1 intermediate loop.
607 heimbach 1.15 Within this sub-subsection only, parts of the model state is stored
608     to memory at every timestep
609 adcroft 1.1 [i.e. every hour $ i=0,...,5$ corresponding to
610     $ k_{i}^{lev1} = 66, 67, \ldots, 71 $].
611 heimbach 1.15 The final state $ v_n = v_{k_{n}^{lev1}} $ is reached
612     and the model state of all preceding timesteps along the last
613     innermost subsection are available, enabling integration backwards
614     in time along the last subsection.
615     The adjoint can thus be computed along this last
616     subsection $k_{n}^{lev2}$.
617 adcroft 1.1 %
618     \end{itemize}
619     %
620     This procedure is repeated consecutively for each previous
621 heimbach 1.15 subsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $
622 adcroft 1.1 carrying the adjoint computation to the initial time
623     of the subsection $k_{n}^{lev3}$.
624     Then, the procedure is repeated for the previous subsection
625     $k_{n-1}^{lev3}$
626     carrying the adjoint computation to the initial time
627     $k_{1}^{lev3}$.
628    
629     For the full model trajectory of
630     $ n^{lev3} \cdot n^{lev2} \cdot n^{lev1} $ timesteps
631     the required storing of the model state was significantly reduced to
632 heimbach 1.15 $ n^{lev2} + n^{lev3} $ to disk and roughly $ n^{lev1} $ to memory
633 adcroft 1.1 [i.e. for the 3-day integration with a total oof 72 timesteps
634 heimbach 1.15 the model state was stored 7 times to disk and roughly 6 times
635     to memory].
636 adcroft 1.1 This saving in memory comes at a cost of a required
637     3 full forward integrations of the model (one for each
638     checkpointing level).
639 heimbach 1.15 The optimal balance of storage vs. recomputation certainly depends
640     on the computing resources available and may be adjusted by
641     adjusting the partitioning among the
642     $ n^{lev3}, \,\, n^{lev2}, \,\, n^{lev1} $.
643 adcroft 1.1
644     \begin{figure}[t!]
645 adcroft 1.6 \begin{center}
646 adcroft 1.1 %\psdraft
647 adcroft 1.6 %\psfrag{v_k1^lev3}{\mathinfigure{v_{k_{1}^{lev3}}}}
648     %\psfrag{v_kn-1^lev3}{\mathinfigure{v_{k_{n-1}^{lev3}}}}
649     %\psfrag{v_kn^lev3}{\mathinfigure{v_{k_{n}^{lev3}}}}
650     %\psfrag{v_k1^lev2}{\mathinfigure{v_{k_{1}^{lev2}}}}
651     %\psfrag{v_kn-1^lev2}{\mathinfigure{v_{k_{n-1}^{lev2}}}}
652     %\psfrag{v_kn^lev2}{\mathinfigure{v_{k_{n}^{lev2}}}}
653     %\psfrag{v_k1^lev1}{\mathinfigure{v_{k_{1}^{lev1}}}}
654     %\psfrag{v_kn^lev1}{\mathinfigure{v_{k_{n}^{lev1}}}}
655     %\mbox{\epsfig{file=part5/checkpointing.eps, width=0.8\textwidth}}
656     \resizebox{5.5in}{!}{\includegraphics{part5/checkpointing.eps}}
657 adcroft 1.1 %\psfull
658 adcroft 1.6 \end{center}
659     \caption{
660     Schematic view of intermediate dump and restart for
661 adcroft 1.1 3-level checkpointing.}
662 heimbach 1.4 \label{fig:3levelcheck}
663 adcroft 1.1 \end{figure}
664    
665 heimbach 1.4 % \subsection{Optimal perturbations}
666     % \label{sec_optpert}
667 adcroft 1.1
668    
669 heimbach 1.4 % \subsection{Error covariance estimate and Hessian matrix}
670     % \label{sec_hessian}
671 adcroft 1.1
672     \newpage
673    
674     %**********************************************************************
675 heimbach 1.4 \section{TLM and ADM generation in general}
676 adcroft 1.1 \label{sec_ad_setup_gen}
677 edhill 1.17 \begin{rawhtml}
678     <!-- CMIREDIR:sec_ad_setup_gen: -->
679     \end{rawhtml}
680 adcroft 1.1 %**********************************************************************
681    
682     In this section we describe in a general fashion
683     the parts of the code that are relevant for automatic
684 heimbach 1.16 differentiation using the software tool TAF.
685 adcroft 1.1
686 heimbach 1.4 \input{part5/doc_ad_the_model}
687    
688 adcroft 1.6 The basic flow is depicted in \ref{fig:adthemodel}.
689 heimbach 1.19 If CPP option \texttt{ALLOW\_AUTODIFF\_TAMC} is defined,
690     the driver routine
691 heimbach 1.4 {\it the\_model\_main}, instead of calling {\it the\_main\_loop},
692 heimbach 1.19 invokes the adjoint of this routine, {\it adthe\_main\_loop}
693     (case \texttt{\#define ALLOW\_ADJOINT\_RUN}), or
694     the tangent linear of this routine {\it g\_the\_main\_loop}
695     (case \texttt{\#define ALLOW\_TANGENTLINEAR\_RUN}),
696     which are the toplevel routines in terms of automatic differentiation.
697     The routines {\it adthe\_main\_loop} or {\it g\_the\_main\_loop}
698     are generated by TAF.
699 heimbach 1.16 It contains both the forward integration of the full model, the
700     cost function calculation,
701 heimbach 1.4 any additional storing that is required for efficient checkpointing,
702     and the reverse integration of the adjoint model.
703 heimbach 1.16
704     [DESCRIBE IN A SEPARATE SECTION THE WORKING OF THE TLM]
705    
706     In Fig. \ref{fig:adthemodel}
707     the structure of {\it adthe\_main\_loop} has been strongly
708     simplified to focus on the essentials; in particular, no checkpointing
709 heimbach 1.4 procedures are shown here.
710     Prior to the call of {\it adthe\_main\_loop}, the routine
711 heimbach 1.15 {\it ctrl\_unpack} is invoked to unpack the control vector
712     or initialise the control variables.
713     Following the call of {\it adthe\_main\_loop},
714     the routine {\it ctrl\_pack}
715 heimbach 1.4 is invoked to pack the control vector
716     (cf. Section \ref{section_ctrl}).
717     If gradient checks are to be performed, the option
718     {\tt ALLOW\_GRADIENT\_CHECK} is defined. In this case
719     the driver routine {\it grdchk\_main} is called after
720     the gradient has been computed via the adjoint
721     (cf. Section \ref{section_grdchk}).
722    
723 heimbach 1.16 %------------------------------------------------------------------
724    
725     \subsection{General setup
726     \label{section_ad_setup}}
727    
728     In order to configure AD-related setups the following packages need
729     to be enabled:
730     {\it
731     \begin{table}[h!]
732     \begin{tabular}{l}
733     autodiff \\
734     ctrl \\
735     cost \\
736     grdchk \\
737     \end{tabular}
738     \end{table}
739     }
740     The packages are enabled by adding them to your experiment-specific
741     configuration file
742     {\it packages.conf} (see Section ???).
743    
744     The following AD-specific CPP option files need to be customized:
745     %
746     \begin{itemize}
747     %
748     \item {\it ECCO\_CPPOPTIONS.h} \\
749     This header file collects CPP options for the packages
750     {\it autodiff, cost, ctrl} as well as AD-unrelated options for
751     the external forcing package {\it exf}.
752     \footnote{NOTE: These options are not set in their package-specific
753     headers such as {\it COST\_CPPOPTIONS.h}, but are instead collected
754     in the single header file {\it ECCO\_CPPOPTIONS.h}.
755     The package-specific header files serve as simple
756     placeholders at this point.}
757     %
758     \item {\it tamc.h} \\
759     This header configures the splitting of the time stepping loop
760     w.r.t. the 3-level checkpointing (see section ???).
761    
762     %
763     \end{itemize}
764    
765     %------------------------------------------------------------------
766    
767     \subsection{Building the AD code
768     \label{section_ad_build}}
769    
770     The build process of an AD code is very similar to building
771     the forward model. However, depending on which AD code one wishes
772     to generate, and on which AD tool is available (TAF or TAMC),
773     the following {\tt make} targets are available:
774    
775     \begin{table}[h!]
776     {\footnotesize
777     \begin{tabular}{ccll}
778     ~ & {\it AD-target} & {\it output} & {\it description} \\
779     \hline
780     \hline
781     (1) & {\tt <MODE><TOOL>only} & {\tt <MODE>\_<TOOL>\_output.f} &
782     generates code for $<$MODE$>$ using $<$TOOL$>$ \\
783     ~ & ~ & ~ & no {\tt make} dependencies on {\tt .F .h} \\
784     ~ & ~ & ~ & useful for compiling on remote platforms \\
785     \hline
786     (2) & {\tt <MODE><TOOL>} & {\tt <MODE>\_<TOOL>\_output.f} &
787     generates code for $<$MODE$>$ using $<$TOOL$>$ \\
788     ~ & ~ & ~ & includes {\tt make} dependencies on {\tt .F .h} \\
789     ~ & ~ & ~ & i.e. input for $<$TOOL$>$ may be re-generated \\
790     \hline
791     (3) & {\tt <MODE>all} & {\tt mitgcmuv\_<MODE>} &
792     generates code for $<$MODE$>$ using $<$TOOL$>$ \\
793     ~ & ~ & ~ & and compiles all code \\
794     ~ & ~ & ~ & (use of TAF is set as default) \\
795     \hline
796     \hline
797     \end{tabular}
798     }
799     \end{table}
800     %
801     Here, the following placeholders are used
802     %
803     \begin{itemize}
804     %
805     \item [$<$TOOL$>$]
806     %
807     \begin{itemize}
808     %
809     \item {\tt TAF}
810     \item {\tt TAMC}
811     %
812     \end{itemize}
813     %
814     \item [$<$MODE$>$]
815     %
816     \begin{itemize}
817     %
818     \item {\tt ad} generates the adjoint model (ADM)
819     \item {\tt ftl} generates the tangent linear model (TLM)
820     \item {\tt svd} generates both ADM and TLM for \\
821     singular value decomposition (SVD) type calculations
822     %
823     \end{itemize}
824     %
825     \end{itemize}
826    
827     For example, to generate the adjoint model using TAF after routines ({\tt .F})
828     or headers ({\tt .h}) have been modified, but without compilation,
829     type {\tt make adtaf};
830     or, to generate the tangent linear model using TAMC without
831     re-generating the input code, type {\tt make ftltamconly}.
832    
833    
834     A typical full build process to generate the ADM via TAF would
835     look like follows:
836     \begin{verbatim}
837     % mkdir build
838     % cd build
839     % ../../../tools/genmake2 -mods=../code_ad
840     % make depend
841     % make adall
842     \end{verbatim}
843    
844     %------------------------------------------------------------------
845    
846     \subsection{The AD build process in detail
847     \label{section_ad_build_detail}}
848    
849     The {\tt make <MODE>all} target consists of the following procedures:
850    
851     \begin{enumerate}
852     %
853     \item
854     A header file {\tt AD\_CONFIG.h} is generated which contains a CPP option
855     on which code ought to be generated. Depending on the {\tt make} target,
856     the contents is
857     \begin{itemize}
858     \item
859     {\tt \#define ALLOW\_ADJOINT\_RUN}
860     \item
861     {\tt \#define ALLOW\_TANGENTLINEAR\_RUN}
862     \item
863     {\tt \#define ALLOW\_ECCO\_OPTIMIZATION}
864     \end{itemize}
865     %
866     \item
867     A single file {\tt <MODE>\_input\_code.f} is concatenated
868     consisting of all {\tt .f} files that are part of the list {\bf AD\_FILES}
869     and all {\tt .flow} files that are part of the list {\bf AD\_FLOW\_FILES}.
870     %
871     \item
872     The AD tool is invoked with the {\bf <MODE>\_<TOOL>\_FLAGS}.
873     The default AD tool flags in {\tt genmake2} can be overrwritten by
874     an {\tt adjoint\_options} file (similar to the platform-specific
875     {\tt build\_options}, see Section ???.
876     The AD tool writes the resulting AD code into the file
877     {\tt <MODE>\_input\_code\_ad.f}
878     %
879     \item
880     A short sed script {\tt adjoint\_sed} is applied to
881     {\tt <MODE>\_input\_code\_ad.f}
882     to reinstate {\bf myThid} into the CALL argument list of active file I/O.
883     The result is written to file {\tt <MODE>\_<TOOL>\_output.f}.
884     %
885     \item
886     All routines are compiled and an executable is generated
887     (see Table ???).
888     %
889     \end{enumerate}
890    
891     \subsubsection{The list AD\_FILES and {\tt .list} files}
892    
893     Not all routines are presented to the AD tool.
894     Routines typically hidden are diagnostics routines which
895     do not influence the cost function, but may create
896     artificial flow dependencies such as I/O of active variables.
897    
898     {\tt genmake2} generates a list (or variable) {\bf AD\_FILES}
899     which contains all routines that are shown to the AD tool.
900     This list is put together from all files with suffix {\tt .list}
901     that {\tt genmake2} finds in its search directories.
902     The list file for the core MITgcm routines is in {\tt model/src/}
903     is called {\tt model\_ad\_diff.list}.
904     Note that no wrapper routine is shown to TAF. These are either
905     not visible at all to the AD code, or hand-written AD code
906     is available (see next section).
907    
908     Each package directory contains its package-specific
909     list file {\tt <PKG>\_ad\_diff.list}. For example,
910     {\tt pkg/ptracers/} contains the file {\tt ptracers\_ad\_diff.list}.
911     Thus, enabling a package will automatically extend the
912     {\bf AD\_FILES} list of {\tt genmake2} to incorporate the
913     package-specific routines.
914     Note that you will need to regenerate the {\tt Makefile} if
915     you enable a package (e.g. by adding it to {\tt packages.conf})
916     and a {\tt Makefile} already exists.
917    
918     \subsubsection{The list AD\_FLOW\_FILES and {\tt .flow} files}
919    
920     TAMC and TAF can evaluate user-specified directives
921     that start with a specific syntax ({\tt CADJ}, {\tt C\$TAF}, {\tt !\$TAF}).
922     The main categories of directives are STORE directives and
923     FLOW directives. Here, we are concerned with flow directives,
924     store directives are treated elsewhere.
925    
926     Flow directives enable the AD tool to evaluate how it should treat
927     routines that are 'hidden' by the user, i.e. routines which are
928     not contained in the {\bf AD\_FILES} list (see previous section),
929     but which are called in part of the code that the AD tool does see.
930     The flow directive tell the AD tool
931     %
932     \begin{itemize}
933     %
934     \item which subroutine arguments are input/output
935     \item which subroutine arguments are active
936     \item which subroutine arguments are required to compute the cost
937     \item which subroutine arguments are dependent
938     %
939     \end{itemize}
940     %
941     The syntax for the flow directives can be found in the
942     AD tool manuals.
943    
944     {\tt genmake2} generates a list (or variable) {\bf AD\_FLOW\_FILES}
945     which contains all files with suffix{\tt .flow} that it finds
946     in its search directories.
947     The flow directives for the core MITgcm routines of
948     {\tt eesupp/src/} and {\tt model/src/}
949     reside in {\tt pkg/autodiff/}.
950     This directory also contains hand-written adjoint code
951 edhill 1.20 for the MITgcm WRAPPER (section \ref{chap:sarch}).
952 heimbach 1.16
953     Flow directives for package-specific routines are contained in
954     the corresponding package directories in the file
955     {\tt <PKG>\_ad.flow}, e.g. ptracers-specific directives are in
956     {\tt ptracers\_ad.flow}.
957    
958     \subsubsection{Store directives for 3-level checkpointing}
959    
960     The storing that is required at each period of the
961     3-level checkpointing is controled by three
962     top-level headers.
963    
964     \begin{verbatim}
965     do ilev_3 = 1, nchklev_3
966     # include ``checkpoint_lev3.h''
967     do ilev_2 = 1, nchklev_2
968     # include ``checkpoint_lev2.h''
969     do ilev_1 = 1, nchklev_1
970     # include ``checkpoint_lev1.h''
971    
972     ...
973    
974     end do
975     end do
976     end do
977     \end{verbatim}
978    
979     All files {\tt checkpoint\_lev?.h} are contained in directory
980     {\tt pkg/autodiff/}.
981    
982    
983     \subsubsection{Changing the default AD tool flags: ad\_options files}
984    
985    
986     \subsubsection{Hand-written adjoint code}
987    
988     %------------------------------------------------------------------
989    
990 heimbach 1.4 \subsection{The cost function (dependent variable)
991     \label{section_cost}}
992 adcroft 1.1
993     The cost function $ {\cal J} $ is referred to as the {\sf dependent variable}.
994     It is a function of the input variables $ \vec{u} $ via the composition
995     $ {\cal J}(\vec{u}) \, = \, {\cal J}(M(\vec{u})) $.
996 heimbach 1.15 The input are referred to as the
997 adcroft 1.1 {\sf independent variables} or {\sf control variables}.
998     All aspects relevant to the treatment of the cost function $ {\cal J} $
999 cnh 1.7 (parameter setting, initialization, accumulation,
1000 heimbach 1.4 final evaluation), are controlled by the package {\it pkg/cost}.
1001 heimbach 1.15 The aspects relevant to the treatment of the independent variables
1002     are controlled by the package {\it pkg/ctrl} and will be treated
1003     in the next section.
1004 heimbach 1.4
1005     \input{part5/doc_cost_flow}
1006 adcroft 1.1
1007 heimbach 1.16 \subsubsection{Enabling the package}
1008    
1009 adcroft 1.1 \fbox{
1010     \begin{minipage}{12cm}
1011 heimbach 1.16 {\it packages.conf}, {\it ECCO\_CPPOPTIONS.h}
1012 adcroft 1.1 \end{minipage}
1013     }
1014 heimbach 1.16 \begin{itemize}
1015 adcroft 1.1 %
1016 heimbach 1.16 \item
1017     The package is enabled by adding {\it cost} to your file {\it packages.conf}
1018     (see Section ???)
1019 adcroft 1.1 %
1020 heimbach 1.16 \item
1021    
1022    
1023     \end{itemize}
1024 adcroft 1.1 %
1025 heimbach 1.16
1026 heimbach 1.15 N.B.: In general the following packages ought to be enabled
1027     simultaneously: {\it autodiff, cost, ctrl}.
1028 heimbach 1.4 The basic CPP option to enable the cost function is {\bf ALLOW\_COST}.
1029     Each specific cost function contribution has its own option.
1030     For the present example the option is {\bf ALLOW\_COST\_TRACER}.
1031     All cost-specific options are set in {\it ECCO\_CPPOPTIONS.h}
1032 adcroft 1.1 Since the cost function is usually used in conjunction with
1033     automatic differentiation, the CPP option
1034 heimbach 1.15 {\bf ALLOW\_ADJOINT\_RUN} (file {\it CPP\_OPTIONS.h}) and
1035     {\bf ALLOW\_AUTODIFF\_TAMC} (file {\it ECCO\_CPPOPTIONS.h})
1036     should be defined.
1037 adcroft 1.1
1038 cnh 1.7 \subsubsection{Initialization}
1039 adcroft 1.1 %
1040 cnh 1.7 The initialization of the {\it cost} package is readily enabled
1041 heimbach 1.15 as soon as the CPP option {\bf ALLOW\_COST} is defined.
1042 adcroft 1.1 %
1043     \begin{itemize}
1044     %
1045     \item
1046     \fbox{
1047     \begin{minipage}{12cm}
1048     Parameters: {\it cost\_readparms}
1049     \end{minipage}
1050     }
1051     \\
1052     This S/R
1053     reads runtime flags and parameters from file {\it data.cost}.
1054     For the present example the only relevant parameter read
1055     is {\bf mult\_tracer}. This multiplier enables different
1056     cost function contributions to be switched on
1057     ( = 1.) or off ( = 0.) at runtime.
1058     For more complex cost functions which involve model vs. data
1059     misfits, the corresponding data filenames and data
1060     specifications (start date and time, period, ...) are read
1061     in this S/R.
1062     %
1063     \item
1064     \fbox{
1065     \begin{minipage}{12cm}
1066     Variables: {\it cost\_init}
1067     \end{minipage}
1068     }
1069     \\
1070     This S/R
1071 cnh 1.7 initializes the different cost function contributions.
1072     The contribution for the present example is {\bf objf\_tracer}
1073 adcroft 1.1 which is defined on each tile (bi,bj).
1074     %
1075     \end{itemize}
1076     %
1077 heimbach 1.4 \subsubsection{Accumulation}
1078 adcroft 1.1 %
1079     \begin{itemize}
1080     %
1081     \item
1082     \fbox{
1083     \begin{minipage}{12cm}
1084     {\it cost\_tile}, {\it cost\_tracer}
1085     \end{minipage}
1086     }
1087     \end{itemize}
1088     %
1089     The 'driver' routine
1090     {\it cost\_tile} is called at the end of each time step.
1091     Within this 'driver' routine, S/R are called for each of
1092     the chosen cost function contributions.
1093     In the present example ({\bf ALLOW\_COST\_TRACER}),
1094     S/R {\it cost\_tracer} is called.
1095     It accumulates {\bf objf\_tracer} according to eqn. (\ref{???}).
1096     %
1097     \subsubsection{Finalize all contributions}
1098     %
1099     \begin{itemize}
1100     %
1101     \item
1102     \fbox{
1103     \begin{minipage}{12cm}
1104     {\it cost\_final}
1105     \end{minipage}
1106     }
1107     \end{itemize}
1108     %
1109     At the end of the forward integration S/R {\it cost\_final}
1110     is called. It accumulates the total cost function {\bf fc}
1111     from each contribution and sums over all tiles:
1112     \begin{equation}
1113     {\cal J} \, = \,
1114     {\rm fc} \, = \,
1115 heimbach 1.15 {\rm mult\_tracer} \sum_{\text{global sum}} \sum_{bi,\,bj}^{nSx,\,nSy}
1116 adcroft 1.1 {\rm objf\_tracer}(bi,bj) \, + \, ...
1117     \end{equation}
1118     %
1119     The total cost function {\bf fc} will be the
1120     'dependent' variable in the argument list for TAMC, i.e.
1121     \begin{verbatim}
1122     tamc -output 'fc' ...
1123     \end{verbatim}
1124    
1125 cnh 1.3 %%%% \end{document}
1126 adcroft 1.1
1127     \input{part5/doc_ad_the_main}
1128    
1129 heimbach 1.4 \subsection{The control variables (independent variables)
1130     \label{section_ctrl}}
1131 adcroft 1.1
1132     The control variables are a subset of the model input
1133     (initial conditions, boundary conditions, model parameters).
1134     Here we identify them with the variable $ \vec{u} $.
1135     All intermediate variables whose derivative w.r.t. control
1136 heimbach 1.4 variables do not vanish are called {\sf active variables}.
1137 adcroft 1.1 All subroutines whose derivative w.r.t. the control variables
1138     don't vanish are called {\sf active routines}.
1139     Read and write operations from and to file can be viewed
1140     as variable assignments. Therefore, files to which
1141     active variables are written and from which active variables
1142     are read are called {\sf active files}.
1143     All aspects relevant to the treatment of the control variables
1144 cnh 1.7 (parameter setting, initialization, perturbation)
1145     are controlled by the package {\it pkg/ctrl}.
1146 adcroft 1.1
1147 heimbach 1.4 \input{part5/doc_ctrl_flow}
1148    
1149 adcroft 1.1 \subsubsection{genmake and CPP options}
1150     %
1151     \begin{itemize}
1152     %
1153     \item
1154     \fbox{
1155     \begin{minipage}{12cm}
1156     {\it genmake}, {\it CPP\_OPTIONS.h}, {\it ECCO\_CPPOPTIONS.h}
1157     \end{minipage}
1158     }
1159     \end{itemize}
1160     %
1161     To enable the directory to be included to the compile list,
1162     {\bf ctrl} has to be added to the {\bf enable} list in
1163 heimbach 1.15 {\it .genmakerc} or in {\it genmake} itself (analogous to {\it cost}
1164     package, cf. previous section).
1165 adcroft 1.1 Each control variable is enabled via its own CPP option
1166     in {\it ECCO\_CPPOPTIONS.h}.
1167    
1168 cnh 1.7 \subsubsection{Initialization}
1169 adcroft 1.1 %
1170     \begin{itemize}
1171     %
1172     \item
1173     \fbox{
1174     \begin{minipage}{12cm}
1175     Parameters: {\it ctrl\_readparms}
1176     \end{minipage}
1177     }
1178     \\
1179     %
1180     This S/R
1181     reads runtime flags and parameters from file {\it data.ctrl}.
1182     For the present example the file contains the file names
1183     of each control variable that is used.
1184     In addition, the number of wet points for each control
1185     variable and the net dimension of the space of control
1186     variables (counting wet points only) {\bf nvarlength}
1187     is determined.
1188     Masks for wet points for each tile {\bf (bi,\,bj)}
1189     and vertical layer {\bf k} are generated for the three
1190     relevant categories on the C-grid:
1191     {\bf nWetCtile} for tracer fields,
1192     {\bf nWetWtile} for zonal velocity fields,
1193     {\bf nWetStile} for meridional velocity fields.
1194     %
1195     \item
1196     \fbox{
1197     \begin{minipage}{12cm}
1198     Control variables, control vector,
1199     and their gradients: {\it ctrl\_unpack}
1200     \end{minipage}
1201     }
1202     \\
1203     %
1204     Two important issues related to the handling of the control
1205 edhill 1.20 variables in MITgcm need to be addressed.
1206 adcroft 1.1 First, in order to save memory, the control variable arrays
1207     are not kept in memory, but rather read from file and added
1208 cnh 1.7 to the initial fields during the model initialization phase.
1209 adcroft 1.1 Similarly, the corresponding adjoint fields which represent
1210     the gradient of the cost function w.r.t. the control variables
1211 heimbach 1.4 are written to file at the end of the adjoint integration.
1212 adcroft 1.1 Second, in addition to the files holding the 2-dim. and 3-dim.
1213 heimbach 1.4 control variables and the corresponding cost gradients,
1214     a 1-dim. {\sf control vector}
1215 adcroft 1.1 and {\sf gradient vector} are written to file. They contain
1216     only the wet points of the control variables and the corresponding
1217     gradient.
1218     This leads to a significant data compression.
1219 heimbach 1.4 Furthermore, an option is available
1220     ({\tt ALLOW\_NONDIMENSIONAL\_CONTROL\_IO}) to
1221     non-dimensionalise the control and gradient vector,
1222     which otherwise would contain different pieces of different
1223     magnitudes and units.
1224     Finally, the control and gradient vector can be passed to a
1225 adcroft 1.1 minimization routine if an update of the control variables
1226     is sought as part of a minimization exercise.
1227    
1228     The files holding fields and vectors of the control variables
1229     and gradient are generated and initialised in S/R {\it ctrl\_unpack}.
1230     %
1231     \end{itemize}
1232    
1233     \subsubsection{Perturbation of the independent variables}
1234     %
1235 heimbach 1.4 The dependency flow for differentiation w.r.t. the controls
1236     starts with adding a perturbation onto the input variable,
1237 adcroft 1.1 thus defining the independent or control variables for TAMC.
1238 heimbach 1.4 Three types of controls may be considered:
1239 adcroft 1.1 %
1240     \begin{itemize}
1241     %
1242     \item
1243     \fbox{
1244     \begin{minipage}{12cm}
1245     {\it ctrl\_map\_ini} (initial value sensitivity):
1246     \end{minipage}
1247     }
1248     \\
1249     %
1250     Consider as an example the initial tracer distribution
1251     {\bf tr1} as control variable.
1252     After {\bf tr1} has been initialised in
1253 heimbach 1.4 {\it ini\_tr1} (dynamical variables such as
1254 adcroft 1.1 temperature and salinity are initialised in {\it ini\_fields}),
1255     a perturbation anomaly is added to the field in S/R
1256     {\it ctrl\_map\_ini}
1257     %
1258     \begin{equation}
1259     \begin{split}
1260     u & = \, u_{[0]} \, + \, \Delta u \\
1261     {\bf tr1}(...) & = \, {\bf tr1_{ini}}(...) \, + \, {\bf xx\_tr1}(...)
1262     \label{perturb}
1263     \end{split}
1264     \end{equation}
1265     %
1266 heimbach 1.4 {\bf xx\_tr1} is a 3-dim. global array
1267 adcroft 1.1 holding the perturbation. In the case of a simple
1268     sensitivity study this array is identical to zero.
1269 heimbach 1.4 However, it's specification is essential in the context
1270     of automatic differentiation since TAMC
1271 adcroft 1.1 treats the corresponding line in the code symbolically
1272     when determining the differentiation chain and its origin.
1273     Thus, the variable names are part of the argument list
1274     when calling TAMC:
1275     %
1276     \begin{verbatim}
1277     tamc -input 'xx_tr1 ...' ...
1278     \end{verbatim}
1279     %
1280 edhill 1.20 Now, as mentioned above, MITgcm avoids maintaining
1281 adcroft 1.1 an array for each control variable by reading the
1282     perturbation to a temporary array from file.
1283     To ensure the symbolic link to be recognized by TAMC, a scalar
1284     dummy variable {\bf xx\_tr1\_dummy} is introduced
1285     and an 'active read' routine of the adjoint support
1286     package {\it pkg/autodiff} is invoked.
1287     The read-procedure is tagged with the variable
1288 cnh 1.7 {\bf xx\_tr1\_dummy} enabling TAMC to recognize the
1289     initialization of the perturbation.
1290 adcroft 1.1 The modified call of TAMC thus reads
1291     %
1292     \begin{verbatim}
1293     tamc -input 'xx_tr1_dummy ...' ...
1294     \end{verbatim}
1295     %
1296     and the modified operation to (\ref{perturb})
1297     in the code takes on the form
1298     %
1299     \begin{verbatim}
1300     call active_read_xyz(
1301     & ..., tmpfld3d, ..., xx_tr1_dummy, ... )
1302    
1303     tr1(...) = tr1(...) + tmpfld3d(...)
1304     \end{verbatim}
1305     %
1306     Note, that reading an active variable corresponds
1307     to a variable assignment. Its derivative corresponds
1308 heimbach 1.15 to a write statement of the adjoint variable, followed by
1309     a reset.
1310 adcroft 1.1 The 'active file' routines have been designed
1311 heimbach 1.4 to support active read and corresponding adjoint active write
1312     operations (and vice versa).
1313 adcroft 1.1 %
1314     \item
1315     \fbox{
1316     \begin{minipage}{12cm}
1317     {\it ctrl\_map\_forcing} (boundary value sensitivity):
1318     \end{minipage}
1319     }
1320     \\
1321     %
1322     The handling of boundary values as control variables
1323     proceeds exactly analogous to the initial values
1324     with the symbolic perturbation taking place in S/R
1325     {\it ctrl\_map\_forcing}.
1326     Note however an important difference:
1327     Since the boundary values are time dependent with a new
1328     forcing field applied at each time steps,
1329 heimbach 1.4 the general problem may be thought of as
1330     a new control variable at each time step
1331     (or, if the perturbation is averaged over a certain period,
1332     at each $ N $ timesteps), i.e.
1333 adcroft 1.1 \[
1334     u_{\rm forcing} \, = \,
1335     \{ \, u_{\rm forcing} ( t_n ) \, \}_{
1336     n \, = \, 1, \ldots , {\rm nTimeSteps} }
1337     \]
1338     %
1339     In the current example an equilibrium state is considered,
1340     and only an initial perturbation to
1341     surface forcing is applied with respect to the
1342     equilibrium state.
1343     A time dependent treatment of the surface forcing is
1344     implemented in the ECCO environment, involving the
1345     calendar ({\it cal}~) and external forcing ({\it exf}~) packages.
1346     %
1347     \item
1348     \fbox{
1349     \begin{minipage}{12cm}
1350     {\it ctrl\_map\_params} (parameter sensitivity):
1351     \end{minipage}
1352     }
1353     \\
1354     %
1355     This routine is not yet implemented, but would proceed
1356     proceed along the same lines as the initial value sensitivity.
1357 heimbach 1.4 The mixing parameters {\bf diffkr} and {\bf kapgm}
1358     are currently added as controls in {\it ctrl\_map\_ini.F}.
1359 adcroft 1.1 %
1360     \end{itemize}
1361     %
1362    
1363     \subsubsection{Output of adjoint variables and gradient}
1364     %
1365 heimbach 1.4 Several ways exist to generate output of adjoint fields.
1366 adcroft 1.1 %
1367     \begin{itemize}
1368     %
1369     \item
1370     \fbox{
1371     \begin{minipage}{12cm}
1372 heimbach 1.4 {\it ctrl\_map\_ini, ctrl\_map\_forcing}:
1373 adcroft 1.1 \end{minipage}
1374     }
1375     \\
1376     \begin{itemize}
1377     %
1378 heimbach 1.4 \item {\bf xx\_...}: the control variable fields \\
1379     Before the forward integration, the control
1380     variables are read from file {\bf xx\_ ...} and added to
1381     the model field.
1382 adcroft 1.1 %
1383     \item {\bf adxx\_...}: the adjoint variable fields, i.e. the gradient
1384 heimbach 1.4 $ \nabla _{u}{\cal J} $ for each control variable \\
1385     After the adjoint integration the corresponding adjoint
1386     variables are written to {\bf adxx\_ ...}.
1387 adcroft 1.1 %
1388 heimbach 1.4 \end{itemize}
1389 adcroft 1.1 %
1390 heimbach 1.4 \item
1391     \fbox{
1392     \begin{minipage}{12cm}
1393     {\it ctrl\_unpack, ctrl\_pack}:
1394     \end{minipage}
1395     }
1396     \\
1397     %
1398     \begin{itemize}
1399     %
1400     \item {\bf vector\_ctrl}: the control vector \\
1401 cnh 1.7 At the very beginning of the model initialization,
1402 heimbach 1.4 the updated compressed control vector is read (or initialised)
1403     and distributed to 2-dim. and 3-dim. control variable fields.
1404     %
1405     \item {\bf vector\_grad}: the gradient vector \\
1406     At the very end of the adjoint integration,
1407     the 2-dim. and 3-dim. adjoint variables are read,
1408     compressed to a single vector and written to file.
1409 adcroft 1.1 %
1410     \end{itemize}
1411     %
1412     \item
1413     \fbox{
1414     \begin{minipage}{12cm}
1415     {\it addummy\_in\_stepping}:
1416     \end{minipage}
1417     }
1418     \\
1419     In addition to writing the gradient at the end of the
1420 heimbach 1.4 forward/adjoint integration, many more adjoint variables
1421     of the model state
1422     at intermediate times can be written using S/R
1423 adcroft 1.1 {\it addummy\_in\_stepping}.
1424     This routine is part of the adjoint support package
1425     {\it pkg/autodiff} (cf.f. below).
1426 heimbach 1.15 The procedure is enabled using via the CPP-option
1427     {\bf ALLOW\_AUTODIFF\_MONITOR} (file {\it ECCO\_CPPOPTIONS.h}).
1428 adcroft 1.1 To be part of the adjoint code, the corresponding S/R
1429     {\it dummy\_in\_stepping} has to be called in the forward
1430     model (S/R {\it the\_main\_loop}) at the appropriate place.
1431 heimbach 1.15 The adjoint common blocks are extracted from the adjoint code
1432     via the header file {\it adcommon.h}.
1433 adcroft 1.1
1434     {\it dummy\_in\_stepping} is essentially empty,
1435     the corresponding adjoint routine is hand-written rather
1436     than generated automatically.
1437     Appropriate flow directives ({\it dummy\_in\_stepping.flow})
1438     ensure that TAMC does not automatically
1439     generate {\it addummy\_in\_stepping} by trying to differentiate
1440 heimbach 1.4 {\it dummy\_in\_stepping}, but instead refers to
1441     the hand-written routine.
1442 adcroft 1.1
1443     {\it dummy\_in\_stepping} is called in the forward code
1444     at the beginning of each
1445     timestep, before the call to {\it dynamics}, thus ensuring
1446     that {\it addummy\_in\_stepping} is called at the end of
1447     each timestep in the adjoint calculation, after the call to
1448     {\it addynamics}.
1449    
1450     {\it addummy\_in\_stepping} includes the header files
1451 heimbach 1.4 {\it adcommon.h}.
1452     This header file is also hand-written. It contains
1453     the common blocks
1454     {\bf /addynvars\_r/}, {\bf /addynvars\_cd/},
1455     {\bf /addynvars\_diffkr/}, {\bf /addynvars\_kapgm/},
1456 adcroft 1.1 {\bf /adtr1\_r/}, {\bf /adffields/},
1457     which have been extracted from the adjoint code to enable
1458     access to the adjoint variables.
1459 heimbach 1.15
1460     {\bf WARNING:} If the structure of the common blocks
1461     {\bf /dynvars\_r/}, {\bf /dynvars\_cd/}, etc., changes
1462     similar changes will occur in the adjoint common blocks.
1463     Therefore, consistency between the TAMC-generated common blocks
1464     and those in {\it adcommon.h} have to be checked.
1465 adcroft 1.1 %
1466     \end{itemize}
1467    
1468    
1469     \subsubsection{Control variable handling for
1470     optimization applications}
1471    
1472     In optimization mode the cost function $ {\cal J}(u) $ is sought
1473     to be minimized with respect to a set of control variables
1474     $ \delta {\cal J} \, = \, 0 $, in an iterative manner.
1475     The gradient $ \nabla _{u}{\cal J} |_{u_{[k]}} $ together
1476     with the value of the cost function itself $ {\cal J}(u_{[k]}) $
1477     at iteration step $ k $ serve
1478     as input to a minimization routine (e.g. quasi-Newton method,
1479 heimbach 1.9 conjugate gradient, ... \cite{gil-lem:89})
1480 heimbach 1.4 to compute an update in the
1481 adcroft 1.1 control variable for iteration step $k+1$
1482     \[
1483     u_{[k+1]} \, = \, u_{[0]} \, + \, \Delta u_{[k+1]}
1484     \quad \mbox{satisfying} \quad
1485     {\cal J} \left( u_{[k+1]} \right) \, < \, {\cal J} \left( u_{[k]} \right)
1486     \]
1487     $ u_{[k+1]} $ then serves as input for a forward/adjoint run
1488     to determine $ {\cal J} $ and $ \nabla _{u}{\cal J} $ at iteration step
1489     $ k+1 $.
1490     Tab. \ref{???} sketches the flow between forward/adjoint model
1491     and the minimization routine.
1492    
1493     \begin{eqnarray*}
1494 heimbach 1.4 \scriptsize
1495 adcroft 1.1 \begin{array}{ccccc}
1496     u_{[0]} \,\, , \,\, \Delta u_{[k]} & ~ & ~ & ~ & ~ \\
1497     {\Big\downarrow}
1498     & ~ & ~ & ~ & ~ \\
1499     ~ & ~ & ~ & ~ & ~ \\
1500     \hline
1501     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1502     \multicolumn{1}{|c}{
1503     u_{[k]} = u_{[0]} + \Delta u_{[k]}} &
1504     \stackrel{\bf forward}{\bf \longrightarrow} &
1505     v_{[k]} = M \left( u_{[k]} \right) &
1506     \stackrel{\bf forward}{\bf \longrightarrow} &
1507     \multicolumn{1}{c|}{
1508     {\cal J}_{[k]} = {\cal J} \left( M \left( u_{[k]} \right) \right)} \\
1509     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1510     \hline
1511 heimbach 1.4 \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1512     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{{\Big\downarrow}} \\
1513     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1514 adcroft 1.1 \hline
1515     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1516     \multicolumn{1}{|c}{
1517     \nabla_u {\cal J}_{[k]} (\delta {\cal J}) =
1518 heimbach 1.4 T^{\ast} \cdot \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J})} &
1519 adcroft 1.1 \stackrel{\bf adjoint}{\mathbf \longleftarrow} &
1520     ad \, v_{[k]} (\delta {\cal J}) =
1521     \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J}) &
1522     \stackrel{\bf adjoint}{\mathbf \longleftarrow} &
1523     \multicolumn{1}{c|}{ ad \, {\cal J} = \delta {\cal J}} \\
1524     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1525     \hline
1526     ~ & ~ & ~ & ~ & ~ \\
1527 heimbach 1.4 \hspace*{15ex}{\Bigg\downarrow}
1528     \quad {\cal J}_{[k]}, \quad \nabla_u {\cal J}_{[k]}
1529     & ~ & ~ & ~ & ~ \\
1530 adcroft 1.1 ~ & ~ & ~ & ~ & ~ \\
1531     \hline
1532     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1533     \multicolumn{1}{|c}{
1534     {\cal J}_{[k]} \,\, , \,\, \nabla_u {\cal J}_{[k]}} &
1535     {\mathbf \longrightarrow} & \text{\bf minimisation} &
1536     {\mathbf \longrightarrow} &
1537     \multicolumn{1}{c|}{ \Delta u_{[k+1]}} \\
1538     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1539     \hline
1540     ~ & ~ & ~ & ~ & ~ \\
1541     ~ & ~ & ~ & ~ & \Big\downarrow \\
1542     ~ & ~ & ~ & ~ & \Delta u_{[k+1]} \\
1543     \end{array}
1544     \end{eqnarray*}
1545    
1546     The routines {\it ctrl\_unpack} and {\it ctrl\_pack} provide
1547     the link between the model and the minimization routine.
1548     As described in Section \ref{???}
1549     the {\it unpack} and {\it pack} routines read and write
1550     control and gradient {\it vectors} which are compressed
1551     to contain only wet points, in addition to the full
1552     2-dim. and 3-dim. fields.
1553     The corresponding I/O flow looks as follows:
1554    
1555     \vspace*{0.5cm}
1556    
1557 heimbach 1.4 {\scriptsize
1558 adcroft 1.1 \begin{tabular}{ccccc}
1559     {\bf vector\_ctrl\_$<$k$>$ } & ~ & ~ & ~ & ~ \\
1560     {\big\downarrow} & ~ & ~ & ~ & ~ \\
1561     \cline{1-1}
1562     \multicolumn{1}{|c|}{\it ctrl\_unpack} & ~ & ~ & ~ & ~ \\
1563     \cline{1-1}
1564     {\big\downarrow} & ~ & ~ & ~ & ~ \\
1565     \cline{3-3}
1566     \multicolumn{1}{l}{\bf xx\_theta0...$<$k$>$} & ~ &
1567     \multicolumn{1}{|c|}{~} & ~ & ~ \\
1568 heimbach 1.4 \multicolumn{1}{l}{\bf xx\_salt0...$<$k$>$} &
1569     $\stackrel{\mbox{read}}{\longrightarrow}$ &
1570 adcroft 1.1 \multicolumn{1}{|c|}{forward integration} & ~ & ~ \\
1571     \multicolumn{1}{l}{\bf \vdots} & ~ & \multicolumn{1}{|c|}{~}
1572     & ~ & ~ \\
1573     \cline{3-3}
1574 heimbach 1.4 ~ & ~ & $\downarrow$ & ~ & ~ \\
1575 adcroft 1.1 \cline{3-3}
1576     ~ & ~ &
1577     \multicolumn{1}{|c|}{~} & ~ &
1578     \multicolumn{1}{l}{\bf adxx\_theta0...$<$k$>$} \\
1579     ~ & ~ & \multicolumn{1}{|c|}{adjoint integration} &
1580 heimbach 1.4 $\stackrel{\mbox{write}}{\longrightarrow}$ &
1581 adcroft 1.1 \multicolumn{1}{l}{\bf adxx\_salt0...$<$k$>$} \\
1582     ~ & ~ & \multicolumn{1}{|c|}{~}
1583     & ~ & \multicolumn{1}{l}{\bf \vdots} \\
1584     \cline{3-3}
1585     ~ & ~ & ~ & ~ & {\big\downarrow} \\
1586     \cline{5-5}
1587     ~ & ~ & ~ & ~ & \multicolumn{1}{|c|}{\it ctrl\_pack} \\
1588     \cline{5-5}
1589     ~ & ~ & ~ & ~ & {\big\downarrow} \\
1590     ~ & ~ & ~ & ~ & {\bf vector\_grad\_$<$k$>$ } \\
1591     \end{tabular}
1592 heimbach 1.4 }
1593 adcroft 1.1
1594     \vspace*{0.5cm}
1595    
1596    
1597 heimbach 1.4 {\it ctrl\_unpack} reads the updated control vector
1598 adcroft 1.1 {\bf vector\_ctrl\_$<$k$>$}.
1599     It distributes the different control variables to
1600     2-dim. and 3-dim. files {\it xx\_...$<$k$>$}.
1601 heimbach 1.4 At the start of the forward integration the control variables
1602     are read from {\it xx\_...$<$k$>$} and added to the
1603     field.
1604     Correspondingly, at the end of the adjoint integration
1605     the adjoint fields are written
1606 adcroft 1.1 to {\it adxx\_...$<$k$>$}, again via the active file routines.
1607 heimbach 1.4 Finally, {\it ctrl\_pack} collects all adjoint files
1608 adcroft 1.1 and writes them to the compressed vector file
1609     {\bf vector\_grad\_$<$k$>$}.

  ViewVC Help
Powered by ViewVC 1.1.22