/[MITgcm]/manual/s_autodiff/text/doc_ad_2.tex
ViewVC logotype

Annotation of /manual/s_autodiff/text/doc_ad_2.tex

Parent Directory Parent Directory | Revision Log Revision Log | View Revision Graph Revision Graph


Revision 1.18 - (hide annotations) (download) (as text)
Mon Aug 1 22:31:36 2005 UTC (19 years, 11 months ago) by heimbach
Branch: MAIN
Changes since 1.17: +3 -1 lines
File MIME type: application/x-tex
Updating (mostly kpp.tex, and corrections to exf.tex)

1 heimbach 1.18 % $Header: /u/gcmpack/manual/part5/doc_ad_2.tex,v 1.17 2004/10/16 03:40:17 edhill Exp $
2 heimbach 1.2 % $Name: $
3 adcroft 1.1
4 heimbach 1.18 Author: Patrick Heimbach
5    
6 adcroft 1.1 {\sf Automatic differentiation} (AD), also referred to as algorithmic
7     (or, more loosely, computational) differentiation, involves
8     automatically deriving code to calculate
9     partial derivatives from an existing fully non-linear prognostic code.
10     (see \cite{gri:00}).
11     A software tool is used that parses and transforms source files
12     according to a set of linguistic and mathematical rules.
13     AD tools are like source-to-source translators in that
14     they parse a program code as input and produce a new program code
15     as output.
16     However, unlike a pure source-to-source translation, the output program
17     represents a new algorithm, such as the evaluation of the
18     Jacobian, the Hessian, or higher derivative operators.
19     In principle, a variety of derived algorithms
20     can be generated automatically in this way.
21    
22     The MITGCM has been adapted for use with the
23 heimbach 1.4 Tangent linear and Adjoint Model Compiler (TAMC) and its successor TAF
24 adcroft 1.1 (Transformation of Algorithms in Fortran), developed
25     by Ralf Giering (\cite{gie-kam:98}, \cite{gie:99,gie:00}).
26 cnh 1.7 The first application of the adjoint of the MITGCM for sensitivity
27 adcroft 1.1 studies has been published by \cite{maro-eta:99}.
28     \cite{sta-eta:97,sta-eta:01} use the MITGCM and its adjoint
29     for ocean state estimation studies.
30 heimbach 1.4 In the following we shall refer to TAMC and TAF synonymously,
31     except were explicitly stated otherwise.
32 adcroft 1.1
33     TAMC exploits the chain rule for computing the first
34     derivative of a function with
35     respect to a set of input variables.
36     Treating a given forward code as a composition of operations --
37 heimbach 1.4 each line representing a compositional element, the chain rule is
38 adcroft 1.1 rigorously applied to the code, line by line. The resulting
39     tangent linear or adjoint code,
40     then, may be thought of as the composition in
41     forward or reverse order, respectively, of the
42 heimbach 1.4 Jacobian matrices of the forward code's compositional elements.
43 adcroft 1.1
44     %**********************************************************************
45     \section{Some basic algebra}
46     \label{sec_ad_algebra}
47 edhill 1.17 \begin{rawhtml}
48     <!-- CMIREDIR:sec_ad_algebra: -->
49     \end{rawhtml}
50 adcroft 1.1 %**********************************************************************
51    
52     Let $ \cal{M} $ be a general nonlinear, model, i.e. a
53     mapping from the $m$-dimensional space
54     $U \subset I\!\!R^m$ of input variables
55     $\vec{u}=(u_1,\ldots,u_m)$
56     (model parameters, initial conditions, boundary conditions
57     such as forcing functions) to the $n$-dimensional space
58     $V \subset I\!\!R^n$ of
59     model output variable $\vec{v}=(v_1,\ldots,v_n)$
60 cnh 1.7 (model state, model diagnostics, objective function, ...)
61 adcroft 1.1 under consideration,
62     %
63     \begin{equation}
64     \begin{split}
65     {\cal M} \, : & \, U \,\, \longrightarrow \, V \\
66     ~ & \, \vec{u} \,\, \longmapsto \, \vec{v} \, = \,
67     {\cal M}(\vec{u})
68     \label{fulloperator}
69     \end{split}
70     \end{equation}
71     %
72     The vectors $ \vec{u} \in U $ and $ v \in V $ may be represented w.r.t.
73     some given basis vectors
74     $ {\rm span} (U) = \{ {\vec{e}_i} \}_{i = 1, \ldots , m} $ and
75     $ {\rm span} (V) = \{ {\vec{f}_j} \}_{j = 1, \ldots , n} $ as
76     \[
77     \vec{u} \, = \, \sum_{i=1}^{m} u_i \, {\vec{e}_i},
78     \qquad
79     \vec{v} \, = \, \sum_{j=1}^{n} v_j \, {\vec{f}_j}
80     \]
81    
82     Two routes may be followed to determine the sensitivity of the
83     output variable $\vec{v}$ to its input $\vec{u}$.
84    
85     \subsection{Forward or direct sensitivity}
86     %
87     Consider a perturbation to the input variables $\delta \vec{u}$
88     (typically a single component
89     $\delta \vec{u} = \delta u_{i} \, {\vec{e}_{i}}$).
90     Their effect on the output may be obtained via the linear
91     approximation of the model $ {\cal M}$ in terms of its Jacobian matrix
92     $ M $, evaluated in the point $u^{(0)}$ according to
93     %
94     \begin{equation}
95     \delta \vec{v} \, = \, M |_{\vec{u}^{(0)}} \, \delta \vec{u}
96     \label{tangent_linear}
97     \end{equation}
98     with resulting output perturbation $\delta \vec{v}$.
99     In components
100     $M_{j i} \, = \, \partial {\cal M}_{j} / \partial u_{i} $,
101     it reads
102     %
103     \begin{equation}
104     \delta v_{j} \, = \, \sum_{i}
105     \left. \frac{\partial {\cal M}_{j}}{\partial u_{i}} \right|_{u^{(0)}} \,
106     \delta u_{i}
107     \label{jacobi_matrix}
108     \end{equation}
109     %
110     Eq. (\ref{tangent_linear}) is the {\sf tangent linear model (TLM)}.
111     In contrast to the full nonlinear model $ {\cal M} $, the operator
112     $ M $ is just a matrix
113     which can readily be used to find the forward sensitivity of $\vec{v}$ to
114     perturbations in $u$,
115 heimbach 1.4 but if there are very many input variables $(\gg O(10^{6})$ for
116 adcroft 1.1 large-scale oceanographic application), it quickly becomes
117     prohibitive to proceed directly as in (\ref{tangent_linear}),
118     if the impact of each component $ {\bf e_{i}} $ is to be assessed.
119    
120     \subsection{Reverse or adjoint sensitivity}
121     %
122     Let us consider the special case of a
123     scalar objective function ${\cal J}(\vec{v})$ of the model output (e.g.
124     the total meridional heat transport,
125     the total uptake of $CO_{2}$ in the Southern
126     Ocean over a time interval,
127     or a measure of some model-to-data misfit)
128     %
129     \begin{eqnarray}
130     \begin{array}{cccccc}
131     {\cal J} \, : & U &
132     \longrightarrow & V &
133     \longrightarrow & I \!\! R \\
134     ~ & \vec{u} & \longmapsto & \vec{v}={\cal M}(\vec{u}) &
135     \longmapsto & {\cal J}(\vec{u}) = {\cal J}({\cal M}(\vec{u}))
136     \end{array}
137     \label{compo}
138     \end{eqnarray}
139     %
140 heimbach 1.4 The perturbation of $ {\cal J} $ around a fixed point $ {\cal J}_0 $,
141 adcroft 1.1 \[
142 heimbach 1.4 {\cal J} \, = \, {\cal J}_0 \, + \, \delta {\cal J}
143 adcroft 1.1 \]
144     can be expressed in both bases of $ \vec{u} $ and $ \vec{v} $
145     w.r.t. their corresponding inner product
146     $\left\langle \,\, , \,\, \right\rangle $
147     %
148     \begin{equation}
149     \begin{split}
150     {\cal J} & = \,
151     {\cal J} |_{\vec{u}^{(0)}} \, + \,
152     \left\langle \, \nabla _{u}{\cal J}^T |_{\vec{u}^{(0)}} \, , \, \delta \vec{u} \, \right\rangle
153     \, + \, O(\delta \vec{u}^2) \\
154     ~ & = \,
155     {\cal J} |_{\vec{v}^{(0)}} \, + \,
156     \left\langle \, \nabla _{v}{\cal J}^T |_{\vec{v}^{(0)}} \, , \, \delta \vec{v} \, \right\rangle
157     \, + \, O(\delta \vec{v}^2)
158     \end{split}
159     \label{deljidentity}
160     \end{equation}
161     %
162 heimbach 1.2 (note, that the gradient $ \nabla f $ is a co-vector, therefore
163 adcroft 1.1 its transpose is required in the above inner product).
164     Then, using the representation of
165     $ \delta {\cal J} =
166     \left\langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \right\rangle $,
167     the definition
168     of an adjoint operator $ A^{\ast} $ of a given operator $ A $,
169     \[
170     \left\langle \, A^{\ast} \vec{x} \, , \, \vec{y} \, \right\rangle =
171     \left\langle \, \vec{x} \, , \, A \vec{y} \, \right\rangle
172     \]
173     which for finite-dimensional vector spaces is just the
174     transpose of $ A $,
175     \[
176     A^{\ast} \, = \, A^T
177     \]
178 heimbach 1.4 and from eq. (\ref{tangent_linear}), (\ref{deljidentity}),
179     we note that
180 adcroft 1.1 (omitting $|$'s):
181     %
182     \begin{equation}
183     \delta {\cal J}
184     \, = \,
185     \left\langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \right\rangle
186     \, = \,
187     \left\langle \, \nabla _{v}{\cal J}^T \, , \, M \, \delta \vec{u} \, \right\rangle
188     \, = \,
189     \left\langle \, M^T \, \nabla _{v}{\cal J}^T \, , \,
190     \delta \vec{u} \, \right\rangle
191     \label{inner}
192     \end{equation}
193     %
194     With the identity (\ref{deljidentity}), we then find that
195     the gradient $ \nabla _{u}{\cal J} $ can be readily inferred by
196     invoking the adjoint $ M^{\ast } $ of the tangent linear model $ M $
197     %
198     \begin{equation}
199     \begin{split}
200     \nabla _{u}{\cal J}^T |_{\vec{u}} &
201     = \, M^T |_{\vec{u}} \cdot \nabla _{v}{\cal J}^T |_{\vec{v}} \\
202     ~ & = \, M^T |_{\vec{u}} \cdot \delta \vec{v}^{\ast} \\
203     ~ & = \, \delta \vec{u}^{\ast}
204     \end{split}
205     \label{adjoint}
206     \end{equation}
207     %
208     Eq. (\ref{adjoint}) is the {\sf adjoint model (ADM)},
209     in which $M^T$ is the adjoint (here, the transpose) of the
210     tangent linear operator $M$, $ \delta \vec{v}^{\ast} $
211     the adjoint variable of the model state $ \vec{v} $, and
212     $ \delta \vec{u}^{\ast} $ the adjoint variable of the control variable $ \vec{u} $.
213    
214     The {\sf reverse} nature of the adjoint calculation can be readily
215 heimbach 1.4 seen as follows.
216     Consider a model integration which consists of $ \Lambda $
217     consecutive operations
218     $ {\cal M}_{\Lambda} ( {\cal M}_{\Lambda-1} (
219     ...... ( {\cal M}_{\lambda} (
220     ......
221     ( {\cal M}_{1} ( {\cal M}_{0}(\vec{u}) )))) $,
222     where the ${\cal M}$'s could be the elementary steps, i.e. single lines
223     in the code of the model, or successive time steps of the
224     model integration,
225     starting at step 0 and moving up to step $\Lambda$, with intermediate
226     ${\cal M}_{\lambda} (\vec{u}) = \vec{v}^{(\lambda+1)}$ and final
227     ${\cal M}_{\Lambda} (\vec{u}) = \vec{v}^{(\Lambda+1)} = \vec{v}$.
228 cnh 1.7 Let ${\cal J}$ be a cost function which explicitly depends on the
229 heimbach 1.4 final state $\vec{v}$ only
230     (this restriction is for clarity reasons only).
231     %
232     ${\cal J}(u)$ may be decomposed according to:
233 adcroft 1.1 %
234     \begin{equation}
235     {\cal J}({\cal M}(\vec{u})) \, = \,
236     {\cal J} ( {\cal M}_{\Lambda} ( {\cal M}_{\Lambda-1} (
237     ...... ( {\cal M}_{\lambda} (
238     ......
239     ( {\cal M}_{1} ( {\cal M}_{0}(\vec{u}) )))))
240     \label{compos}
241     \end{equation}
242     %
243 heimbach 1.4 Then, according to the chain rule, the forward calculation reads,
244     in terms of the Jacobi matrices
245 adcroft 1.1 (we've omitted the $ | $'s which, nevertheless are important
246     to the aspect of {\it tangent} linearity;
247 heimbach 1.4 note also that by definition
248 adcroft 1.1 $ \langle \, \nabla _{v}{\cal J}^T \, , \, \delta \vec{v} \, \rangle
249     = \nabla_v {\cal J} \cdot \delta \vec{v} $ )
250     %
251     \begin{equation}
252     \begin{split}
253     \nabla_v {\cal J} (M(\delta \vec{u})) & = \,
254     \nabla_v {\cal J} \cdot M_{\Lambda}
255     \cdot ...... \cdot M_{\lambda} \cdot ...... \cdot
256     M_{1} \cdot M_{0} \cdot \delta \vec{u} \\
257     ~ & = \, \nabla_v {\cal J} \cdot \delta \vec{v} \\
258     \end{split}
259     \label{forward}
260     \end{equation}
261     %
262     whereas in reverse mode we have
263     %
264     \begin{equation}
265     \boxed{
266     \begin{split}
267     M^T ( \nabla_v {\cal J}^T) & = \,
268     M_{0}^T \cdot M_{1}^T
269     \cdot ...... \cdot M_{\lambda}^T \cdot ...... \cdot
270     M_{\Lambda}^T \cdot \nabla_v {\cal J}^T \\
271     ~ & = \, M_{0}^T \cdot M_{1}^T
272     \cdot ...... \cdot
273     \nabla_{v^{(\lambda)}} {\cal J}^T \\
274     ~ & = \, \nabla_u {\cal J}^T
275     \end{split}
276     }
277     \label{reverse}
278     \end{equation}
279     %
280     clearly expressing the reverse nature of the calculation.
281     Eq. (\ref{reverse}) is at the heart of automatic adjoint compilers.
282 heimbach 1.4 If the intermediate steps $\lambda$ in
283 adcroft 1.1 eqn. (\ref{compos}) -- (\ref{reverse})
284 heimbach 1.4 represent the model state (forward or adjoint) at each
285     intermediate time step as noted above, then correspondingly,
286     $ M^T (\delta \vec{v}^{(\lambda) \, \ast}) =
287     \delta \vec{v}^{(\lambda-1) \, \ast} $ for the adjoint variables.
288     It thus becomes evident that the adjoint calculation also
289     yields the adjoint of each model state component
290     $ \vec{v}^{(\lambda)} $ at each intermediate step $ \lambda $, namely
291 adcroft 1.1 %
292     \begin{equation}
293     \boxed{
294     \begin{split}
295     \nabla_{v^{(\lambda)}} {\cal J}^T |_{\vec{v}^{(\lambda)}}
296     & = \,
297     M_{\lambda}^T |_{\vec{v}^{(\lambda)}} \cdot ...... \cdot
298     M_{\Lambda}^T |_{\vec{v}^{(\lambda)}} \cdot \delta \vec{v}^{\ast} \\
299     ~ & = \, \delta \vec{v}^{(\lambda) \, \ast}
300     \end{split}
301     }
302     \end{equation}
303     %
304     in close analogy to eq. (\ref{adjoint})
305     We note in passing that that the $\delta \vec{v}^{(\lambda) \, \ast}$
306 heimbach 1.4 are the Lagrange multipliers of the model equations which determine
307     $ \vec{v}^{(\lambda)}$.
308 adcroft 1.1
309 cnh 1.7 In components, eq. (\ref{adjoint}) reads as follows.
310 adcroft 1.1 Let
311     \[
312     \begin{array}{rclcrcl}
313     \delta \vec{u} & = &
314     \left( \delta u_1,\ldots, \delta u_m \right)^T , & \qquad &
315     \delta \vec{u}^{\ast} \,\, = \,\, \nabla_u {\cal J}^T & = &
316     \left(
317     \frac{\partial {\cal J}}{\partial u_1},\ldots,
318     \frac{\partial {\cal J}}{\partial u_m}
319     \right)^T \\
320     \delta \vec{v} & = &
321     \left( \delta v_1,\ldots, \delta u_n \right)^T , & \qquad &
322     \delta \vec{v}^{\ast} \,\, = \,\, \nabla_v {\cal J}^T & = &
323     \left(
324     \frac{\partial {\cal J}}{\partial v_1},\ldots,
325     \frac{\partial {\cal J}}{\partial v_n}
326     \right)^T \\
327     \end{array}
328     \]
329     denote the perturbations in $\vec{u}$ and $\vec{v}$, respectively,
330 cnh 1.7 and their adjoint variables;
331 adcroft 1.1 further
332     \[
333     M \, = \, \left(
334     \begin{array}{ccc}
335     \frac{\partial {\cal M}_1}{\partial u_1} & \ldots &
336     \frac{\partial {\cal M}_1}{\partial u_m} \\
337     \vdots & ~ & \vdots \\
338     \frac{\partial {\cal M}_n}{\partial u_1} & \ldots &
339     \frac{\partial {\cal M}_n}{\partial u_m} \\
340     \end{array}
341     \right)
342     \]
343     is the Jacobi matrix of $ {\cal M} $
344     (an $ n \times m $ matrix)
345     such that $ \delta \vec{v} = M \cdot \delta \vec{u} $, or
346     \[
347     \delta v_{j}
348     \, = \, \sum_{i=1}^m M_{ji} \, \delta u_{i}
349     \, = \, \sum_{i=1}^m \, \frac{\partial {\cal M}_{j}}{\partial u_{i}}
350     \delta u_{i}
351     \]
352     %
353     Then eq. (\ref{adjoint}) takes the form
354     \[
355     \delta u_{i}^{\ast}
356     \, = \, \sum_{j=1}^n M_{ji} \, \delta v_{j}^{\ast}
357     \, = \, \sum_{j=1}^n \, \frac{\partial {\cal M}_{j}}{\partial u_{i}}
358     \delta v_{j}^{\ast}
359     \]
360     %
361     or
362     %
363     \[
364     \left(
365     \begin{array}{c}
366     \left. \frac{\partial}{\partial u_1} {\cal J} \right|_{\vec{u}^{(0)}} \\
367     \vdots \\
368     \left. \frac{\partial}{\partial u_m} {\cal J} \right|_{\vec{u}^{(0)}} \\
369     \end{array}
370     \right)
371     \, = \,
372     \left(
373     \begin{array}{ccc}
374     \left. \frac{\partial {\cal M}_1}{\partial u_1} \right|_{\vec{u}^{(0)}}
375     & \ldots &
376     \left. \frac{\partial {\cal M}_n}{\partial u_1} \right|_{\vec{u}^{(0)}} \\
377     \vdots & ~ & \vdots \\
378     \left. \frac{\partial {\cal M}_1}{\partial u_m} \right|_{\vec{u}^{(0)}}
379     & \ldots &
380     \left. \frac{\partial {\cal M}_n}{\partial u_m} \right|_{\vec{u}^{(0)}} \\
381     \end{array}
382     \right)
383     \cdot
384     \left(
385     \begin{array}{c}
386     \left. \frac{\partial}{\partial v_1} {\cal J} \right|_{\vec{v}} \\
387     \vdots \\
388     \left. \frac{\partial}{\partial v_n} {\cal J} \right|_{\vec{v}} \\
389     \end{array}
390     \right)
391     \]
392     %
393     Furthermore, the adjoint $ \delta v^{(\lambda) \, \ast} $
394     of any intermediate state $ v^{(\lambda)} $
395     may be obtained, using the intermediate Jacobian
396     (an $ n_{\lambda+1} \times n_{\lambda} $ matrix)
397     %
398     \[
399     M_{\lambda} \, = \,
400     \left(
401     \begin{array}{ccc}
402     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1}
403     & \ldots &
404     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
405     \vdots & ~ & \vdots \\
406     \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1}
407     & \ldots &
408     \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
409     \end{array}
410     \right)
411     \]
412     %
413     and the shorthand notation for the adjoint variables
414     $ \delta v^{(\lambda) \, \ast}_{j} = \frac{\partial}{\partial v^{(\lambda)}_{j}}
415     {\cal J}^T $, $ j = 1, \ldots , n_{\lambda} $,
416     for intermediate components, yielding
417 heimbach 1.4 \begin{equation}
418     \small
419     \begin{split}
420 adcroft 1.1 \left(
421     \begin{array}{c}
422     \delta v^{(\lambda) \, \ast}_1 \\
423     \vdots \\
424     \delta v^{(\lambda) \, \ast}_{n_{\lambda}} \\
425     \end{array}
426     \right)
427 heimbach 1.4 \, = &
428 adcroft 1.1 \left(
429     \begin{array}{ccc}
430     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_1}
431 heimbach 1.4 & \ldots \,\, \ldots &
432 adcroft 1.1 \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_1} \\
433     \vdots & ~ & \vdots \\
434     \frac{\partial ({\cal M}_{\lambda})_1}{\partial v^{(\lambda)}_{n_{\lambda}}}
435 heimbach 1.4 & \ldots \,\, \ldots &
436 adcroft 1.1 \frac{\partial ({\cal M}_{\lambda})_{n_{\lambda+1}}}{\partial v^{(\lambda)}_{n_{\lambda}}} \\
437     \end{array}
438     \right)
439 heimbach 1.4 \cdot
440 adcroft 1.1 %
441 heimbach 1.4 \\ ~ & ~
442     \\ ~ &
443 adcroft 1.1 %
444     \left(
445     \begin{array}{ccc}
446     \frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_1}
447     & \ldots &
448     \frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_1} \\
449     \vdots & ~ & \vdots \\
450     \vdots & ~ & \vdots \\
451     \frac{\partial ({\cal M}_{\lambda+1})_1}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}}
452     & \ldots &
453     \frac{\partial ({\cal M}_{\lambda+1})_{n_{\lambda+2}}}{\partial v^{(\lambda+1)}_{n_{\lambda+1}}} \\
454     \end{array}
455     \right)
456 heimbach 1.4 \cdot \, \ldots \, \cdot
457 adcroft 1.1 \left(
458     \begin{array}{c}
459     \delta v^{\ast}_1 \\
460     \vdots \\
461     \delta v^{\ast}_{n} \\
462     \end{array}
463     \right)
464 heimbach 1.4 \end{split}
465     \end{equation}
466 adcroft 1.1
467     Eq. (\ref{forward}) and (\ref{reverse}) are perhaps clearest in
468     showing the advantage of the reverse over the forward mode
469     if the gradient $\nabla _{u}{\cal J}$, i.e. the sensitivity of the
470     cost function $ {\cal J} $ with respect to {\it all} input
471     variables $u$
472     (or the sensitivity of the cost function with respect to
473     {\it all} intermediate states $ \vec{v}^{(\lambda)} $) are sought.
474     In order to be able to solve for each component of the gradient
475     $ \partial {\cal J} / \partial u_{i} $ in (\ref{forward})
476 cnh 1.7 a forward calculation has to be performed for each component separately,
477 adcroft 1.1 i.e. $ \delta \vec{u} = \delta u_{i} {\vec{e}_{i}} $
478     for the $i$-th forward calculation.
479     Then, (\ref{forward}) represents the
480     projection of $ \nabla_u {\cal J} $ onto the $i$-th component.
481     The full gradient is retrieved from the $ m $ forward calculations.
482     In contrast, eq. (\ref{reverse}) yields the full
483     gradient $\nabla _{u}{\cal J}$ (and all intermediate gradients
484     $\nabla _{v^{(\lambda)}}{\cal J}$) within a single reverse calculation.
485    
486 heimbach 1.4 Note, that if $ {\cal J} $ is a vector-valued function
487 adcroft 1.1 of dimension $ l > 1 $,
488     eq. (\ref{reverse}) has to be modified according to
489     \[
490     M^T \left( \nabla_v {\cal J}^T \left(\delta \vec{J}\right) \right)
491     \, = \,
492     \nabla_u {\cal J}^T \cdot \delta \vec{J}
493     \]
494 heimbach 1.4 where now $ \delta \vec{J} \in I\!\!R^l $ is a vector of
495 cnh 1.7 dimension $ l $.
496 adcroft 1.1 In this case $ l $ reverse simulations have to be performed
497     for each $ \delta J_{k}, \,\, k = 1, \ldots, l $.
498     Then, the reverse mode is more efficient as long as
499     $ l < n $, otherwise the forward mode is preferable.
500 cnh 1.7 Strictly, the reverse mode is called adjoint mode only for
501 adcroft 1.1 $ l = 1 $.
502    
503     A detailed analysis of the underlying numerical operations
504     shows that the computation of $\nabla _{u}{\cal J}$ in this way
505     requires about 2 to 5 times the computation of the cost function.
506     Alternatively, the gradient vector could be approximated
507     by finite differences, requiring $m$ computations
508     of the perturbed cost function.
509    
510     To conclude we give two examples of commonly used types
511     of cost functions:
512    
513     \paragraph{Example 1:
514     $ {\cal J} = v_{j} (T) $} ~ \\
515     The cost function consists of the $j$-th component of the model state
516     $ \vec{v} $ at time $T$.
517     Then $ \nabla_v {\cal J}^T = {\vec{f}_{j}} $ is just the $j$-th
518     unit vector. The $ \nabla_u {\cal J}^T $
519     is the projection of the adjoint
520     operator onto the $j$-th component ${\bf f_{j}}$,
521     \[
522     \nabla_u {\cal J}^T
523     \, = \, M^T \cdot \nabla_v {\cal J}^T
524     \, = \, \sum_{i} M^T_{ji} \, {\vec{e}_{i}}
525     \]
526    
527     \paragraph{Example 2:
528     $ {\cal J} = \langle \, {\cal H}(\vec{v}) - \vec{d} \, ,
529     \, {\cal H}(\vec{v}) - \vec{d} \, \rangle $} ~ \\
530 heimbach 1.4 The cost function represents the quadratic model vs. data misfit.
531 adcroft 1.1 Here, $ \vec{d} $ is the data vector and $ {\cal H} $ represents the
532     operator which maps the model state space onto the data space.
533     Then, $ \nabla_v {\cal J} $ takes the form
534     %
535     \begin{equation*}
536     \begin{split}
537     \nabla_v {\cal J}^T & = \, 2 \, \, H \cdot
538     \left( \, {\cal H}(\vec{v}) - \vec{d} \, \right) \\
539     ~ & = \, 2 \sum_{j} \left\{ \sum_k
540     \frac{\partial {\cal H}_k}{\partial v_{j}}
541     \left( {\cal H}_k (\vec{v}) - d_k \right)
542     \right\} \, {\vec{f}_{j}} \\
543     \end{split}
544     \end{equation*}
545     %
546     where $H_{kj} = \partial {\cal H}_k / \partial v_{j} $ is the
547     Jacobi matrix of the data projection operator.
548     Thus, the gradient $ \nabla_u {\cal J} $ is given by the
549     adjoint operator,
550     driven by the model vs. data misfit:
551     \[
552     \nabla_u {\cal J}^T \, = \, 2 \, M^T \cdot
553     H \cdot \left( {\cal H}(\vec{v}) - \vec{d} \, \right)
554     \]
555    
556     \subsection{Storing vs. recomputation in reverse mode}
557     \label{checkpointing}
558    
559     We note an important aspect of the forward vs. reverse
560     mode calculation.
561 heimbach 1.4 Because of the local character of the derivative
562     (a derivative is defined w.r.t. a point along the trajectory),
563 adcroft 1.1 the intermediate results of the model trajectory
564     $\vec{v}^{(\lambda+1)}={\cal M}_{\lambda}(v^{(\lambda)})$
565 heimbach 1.15 may be required to evaluate the intermediate Jacobian
566 adcroft 1.1 $M_{\lambda}|_{\vec{v}^{(\lambda)}} \, \delta \vec{v}^{(\lambda)} $.
567 heimbach 1.15 This is the case e.g. for nonlinear expressions
568     (momentum advection, nonlinear equation of state), state-dependent
569     conditional statements (parameterization schemes).
570 adcroft 1.1 In the forward mode, the intermediate results are required
571     in the same order as computed by the full forward model ${\cal M}$,
572 heimbach 1.4 but in the reverse mode they are required in the reverse order.
573 adcroft 1.1 Thus, in the reverse mode the trajectory of the forward model
574     integration ${\cal M}$ has to be stored to be available in the reverse
575 heimbach 1.4 calculation. Alternatively, the complete model state up to the
576     point of evaluation has to be recomputed whenever its value is required.
577 adcroft 1.1
578     A method to balance the amount of recomputations vs.
579     storage requirements is called {\sf checkpointing}
580 heimbach 1.15 (e.g. \cite{gri:92}, \cite{res-eta:98}).
581 adcroft 1.6 It is depicted in \ref{fig:3levelcheck} for a 3-level checkpointing
582 heimbach 1.4 [as an example, we give explicit numbers for a 3-day
583 adcroft 1.1 integration with a 1-hourly timestep in square brackets].
584     \begin{itemize}
585     %
586     \item [$lev3$]
587     In a first step, the model trajectory is subdivided into
588     $ {n}^{lev3} $ subsections [$ {n}^{lev3} $=3 1-day intervals],
589     with the label $lev3$ for this outermost loop.
590 heimbach 1.4 The model is then integrated along the full trajectory,
591 heimbach 1.15 and the model state stored to disk only at every $ k_{i}^{lev3} $-th timestep
592 adcroft 1.1 [i.e. 3 times, at
593     $ i = 0,1,2 $ corresponding to $ k_{i}^{lev3} = 0, 24, 48 $].
594 heimbach 1.15 In addition, the cost function is computed, if needed.
595 adcroft 1.1 %
596     \item [$lev2$]
597 heimbach 1.4 In a second step each subsection itself is divided into
598 heimbach 1.15 $ {n}^{lev2} $ subsections
599 adcroft 1.1 [$ {n}^{lev2} $=4 6-hour intervals per subsection].
600     The model picks up at the last outermost dumped state
601 heimbach 1.4 $ v_{k_{n}^{lev3}} $ and is integrated forward in time along
602 adcroft 1.1 the last subsection, with the label $lev2$ for this
603     intermediate loop.
604 heimbach 1.15 The model state is now stored to disk at every $ k_{i}^{lev2} $-th
605 adcroft 1.1 timestep
606     [i.e. 4 times, at
607     $ i = 0,1,2,3 $ corresponding to $ k_{i}^{lev2} = 48, 54, 60, 66 $].
608     %
609     \item [$lev1$]
610 heimbach 1.4 Finally, the model picks up at the last intermediate dump state
611     $ v_{k_{n}^{lev2}} $ and is integrated forward in time along
612 heimbach 1.15 the last subsection, with the label $lev1$ for this
613 adcroft 1.1 intermediate loop.
614 heimbach 1.15 Within this sub-subsection only, parts of the model state is stored
615     to memory at every timestep
616 adcroft 1.1 [i.e. every hour $ i=0,...,5$ corresponding to
617     $ k_{i}^{lev1} = 66, 67, \ldots, 71 $].
618 heimbach 1.15 The final state $ v_n = v_{k_{n}^{lev1}} $ is reached
619     and the model state of all preceding timesteps along the last
620     innermost subsection are available, enabling integration backwards
621     in time along the last subsection.
622     The adjoint can thus be computed along this last
623     subsection $k_{n}^{lev2}$.
624 adcroft 1.1 %
625     \end{itemize}
626     %
627     This procedure is repeated consecutively for each previous
628 heimbach 1.15 subsection $k_{n-1}^{lev2}, \ldots, k_{1}^{lev2} $
629 adcroft 1.1 carrying the adjoint computation to the initial time
630     of the subsection $k_{n}^{lev3}$.
631     Then, the procedure is repeated for the previous subsection
632     $k_{n-1}^{lev3}$
633     carrying the adjoint computation to the initial time
634     $k_{1}^{lev3}$.
635    
636     For the full model trajectory of
637     $ n^{lev3} \cdot n^{lev2} \cdot n^{lev1} $ timesteps
638     the required storing of the model state was significantly reduced to
639 heimbach 1.15 $ n^{lev2} + n^{lev3} $ to disk and roughly $ n^{lev1} $ to memory
640 adcroft 1.1 [i.e. for the 3-day integration with a total oof 72 timesteps
641 heimbach 1.15 the model state was stored 7 times to disk and roughly 6 times
642     to memory].
643 adcroft 1.1 This saving in memory comes at a cost of a required
644     3 full forward integrations of the model (one for each
645     checkpointing level).
646 heimbach 1.15 The optimal balance of storage vs. recomputation certainly depends
647     on the computing resources available and may be adjusted by
648     adjusting the partitioning among the
649     $ n^{lev3}, \,\, n^{lev2}, \,\, n^{lev1} $.
650 adcroft 1.1
651     \begin{figure}[t!]
652 adcroft 1.6 \begin{center}
653 adcroft 1.1 %\psdraft
654 adcroft 1.6 %\psfrag{v_k1^lev3}{\mathinfigure{v_{k_{1}^{lev3}}}}
655     %\psfrag{v_kn-1^lev3}{\mathinfigure{v_{k_{n-1}^{lev3}}}}
656     %\psfrag{v_kn^lev3}{\mathinfigure{v_{k_{n}^{lev3}}}}
657     %\psfrag{v_k1^lev2}{\mathinfigure{v_{k_{1}^{lev2}}}}
658     %\psfrag{v_kn-1^lev2}{\mathinfigure{v_{k_{n-1}^{lev2}}}}
659     %\psfrag{v_kn^lev2}{\mathinfigure{v_{k_{n}^{lev2}}}}
660     %\psfrag{v_k1^lev1}{\mathinfigure{v_{k_{1}^{lev1}}}}
661     %\psfrag{v_kn^lev1}{\mathinfigure{v_{k_{n}^{lev1}}}}
662     %\mbox{\epsfig{file=part5/checkpointing.eps, width=0.8\textwidth}}
663     \resizebox{5.5in}{!}{\includegraphics{part5/checkpointing.eps}}
664 adcroft 1.1 %\psfull
665 adcroft 1.6 \end{center}
666     \caption{
667     Schematic view of intermediate dump and restart for
668 adcroft 1.1 3-level checkpointing.}
669 heimbach 1.4 \label{fig:3levelcheck}
670 adcroft 1.1 \end{figure}
671    
672 heimbach 1.4 % \subsection{Optimal perturbations}
673     % \label{sec_optpert}
674 adcroft 1.1
675    
676 heimbach 1.4 % \subsection{Error covariance estimate and Hessian matrix}
677     % \label{sec_hessian}
678 adcroft 1.1
679     \newpage
680    
681     %**********************************************************************
682 heimbach 1.4 \section{TLM and ADM generation in general}
683 adcroft 1.1 \label{sec_ad_setup_gen}
684 edhill 1.17 \begin{rawhtml}
685     <!-- CMIREDIR:sec_ad_setup_gen: -->
686     \end{rawhtml}
687 adcroft 1.1 %**********************************************************************
688    
689     In this section we describe in a general fashion
690     the parts of the code that are relevant for automatic
691 heimbach 1.16 differentiation using the software tool TAF.
692 adcroft 1.1
693 heimbach 1.4 \input{part5/doc_ad_the_model}
694    
695 adcroft 1.6 The basic flow is depicted in \ref{fig:adthemodel}.
696 heimbach 1.16 If CPP option {\tt ALLOW\_AUTODIFF\_TAMC} is defined, the driver routine
697 heimbach 1.4 {\it the\_model\_main}, instead of calling {\it the\_main\_loop},
698     invokes the adjoint of this routine, {\it adthe\_main\_loop},
699 heimbach 1.16 which is the toplevel routine in terms of automatic differentiation.
700     The routine {\it adthe\_main\_loop} has been generated by TAF.
701     It contains both the forward integration of the full model, the
702     cost function calculation,
703 heimbach 1.4 any additional storing that is required for efficient checkpointing,
704     and the reverse integration of the adjoint model.
705 heimbach 1.16
706     [DESCRIBE IN A SEPARATE SECTION THE WORKING OF THE TLM]
707    
708     In Fig. \ref{fig:adthemodel}
709     the structure of {\it adthe\_main\_loop} has been strongly
710     simplified to focus on the essentials; in particular, no checkpointing
711 heimbach 1.4 procedures are shown here.
712     Prior to the call of {\it adthe\_main\_loop}, the routine
713 heimbach 1.15 {\it ctrl\_unpack} is invoked to unpack the control vector
714     or initialise the control variables.
715     Following the call of {\it adthe\_main\_loop},
716     the routine {\it ctrl\_pack}
717 heimbach 1.4 is invoked to pack the control vector
718     (cf. Section \ref{section_ctrl}).
719     If gradient checks are to be performed, the option
720     {\tt ALLOW\_GRADIENT\_CHECK} is defined. In this case
721     the driver routine {\it grdchk\_main} is called after
722     the gradient has been computed via the adjoint
723     (cf. Section \ref{section_grdchk}).
724    
725 heimbach 1.16 %------------------------------------------------------------------
726    
727     \subsection{General setup
728     \label{section_ad_setup}}
729    
730     In order to configure AD-related setups the following packages need
731     to be enabled:
732     {\it
733     \begin{table}[h!]
734     \begin{tabular}{l}
735     autodiff \\
736     ctrl \\
737     cost \\
738     grdchk \\
739     \end{tabular}
740     \end{table}
741     }
742     The packages are enabled by adding them to your experiment-specific
743     configuration file
744     {\it packages.conf} (see Section ???).
745    
746     The following AD-specific CPP option files need to be customized:
747     %
748     \begin{itemize}
749     %
750     \item {\it ECCO\_CPPOPTIONS.h} \\
751     This header file collects CPP options for the packages
752     {\it autodiff, cost, ctrl} as well as AD-unrelated options for
753     the external forcing package {\it exf}.
754     \footnote{NOTE: These options are not set in their package-specific
755     headers such as {\it COST\_CPPOPTIONS.h}, but are instead collected
756     in the single header file {\it ECCO\_CPPOPTIONS.h}.
757     The package-specific header files serve as simple
758     placeholders at this point.}
759     %
760     \item {\it tamc.h} \\
761     This header configures the splitting of the time stepping loop
762     w.r.t. the 3-level checkpointing (see section ???).
763    
764     %
765     \end{itemize}
766    
767     %------------------------------------------------------------------
768    
769     \subsection{Building the AD code
770     \label{section_ad_build}}
771    
772     The build process of an AD code is very similar to building
773     the forward model. However, depending on which AD code one wishes
774     to generate, and on which AD tool is available (TAF or TAMC),
775     the following {\tt make} targets are available:
776    
777     \begin{table}[h!]
778     {\footnotesize
779     \begin{tabular}{ccll}
780     ~ & {\it AD-target} & {\it output} & {\it description} \\
781     \hline
782     \hline
783     (1) & {\tt <MODE><TOOL>only} & {\tt <MODE>\_<TOOL>\_output.f} &
784     generates code for $<$MODE$>$ using $<$TOOL$>$ \\
785     ~ & ~ & ~ & no {\tt make} dependencies on {\tt .F .h} \\
786     ~ & ~ & ~ & useful for compiling on remote platforms \\
787     \hline
788     (2) & {\tt <MODE><TOOL>} & {\tt <MODE>\_<TOOL>\_output.f} &
789     generates code for $<$MODE$>$ using $<$TOOL$>$ \\
790     ~ & ~ & ~ & includes {\tt make} dependencies on {\tt .F .h} \\
791     ~ & ~ & ~ & i.e. input for $<$TOOL$>$ may be re-generated \\
792     \hline
793     (3) & {\tt <MODE>all} & {\tt mitgcmuv\_<MODE>} &
794     generates code for $<$MODE$>$ using $<$TOOL$>$ \\
795     ~ & ~ & ~ & and compiles all code \\
796     ~ & ~ & ~ & (use of TAF is set as default) \\
797     \hline
798     \hline
799     \end{tabular}
800     }
801     \end{table}
802     %
803     Here, the following placeholders are used
804     %
805     \begin{itemize}
806     %
807     \item [$<$TOOL$>$]
808     %
809     \begin{itemize}
810     %
811     \item {\tt TAF}
812     \item {\tt TAMC}
813     %
814     \end{itemize}
815     %
816     \item [$<$MODE$>$]
817     %
818     \begin{itemize}
819     %
820     \item {\tt ad} generates the adjoint model (ADM)
821     \item {\tt ftl} generates the tangent linear model (TLM)
822     \item {\tt svd} generates both ADM and TLM for \\
823     singular value decomposition (SVD) type calculations
824     %
825     \end{itemize}
826     %
827     \end{itemize}
828    
829     For example, to generate the adjoint model using TAF after routines ({\tt .F})
830     or headers ({\tt .h}) have been modified, but without compilation,
831     type {\tt make adtaf};
832     or, to generate the tangent linear model using TAMC without
833     re-generating the input code, type {\tt make ftltamconly}.
834    
835    
836     A typical full build process to generate the ADM via TAF would
837     look like follows:
838     \begin{verbatim}
839     % mkdir build
840     % cd build
841     % ../../../tools/genmake2 -mods=../code_ad
842     % make depend
843     % make adall
844     \end{verbatim}
845    
846     %------------------------------------------------------------------
847    
848     \subsection{The AD build process in detail
849     \label{section_ad_build_detail}}
850    
851     The {\tt make <MODE>all} target consists of the following procedures:
852    
853     \begin{enumerate}
854     %
855     \item
856     A header file {\tt AD\_CONFIG.h} is generated which contains a CPP option
857     on which code ought to be generated. Depending on the {\tt make} target,
858     the contents is
859     \begin{itemize}
860     \item
861     {\tt \#define ALLOW\_ADJOINT\_RUN}
862     \item
863     {\tt \#define ALLOW\_TANGENTLINEAR\_RUN}
864     \item
865     {\tt \#define ALLOW\_ECCO\_OPTIMIZATION}
866     \end{itemize}
867     %
868     \item
869     A single file {\tt <MODE>\_input\_code.f} is concatenated
870     consisting of all {\tt .f} files that are part of the list {\bf AD\_FILES}
871     and all {\tt .flow} files that are part of the list {\bf AD\_FLOW\_FILES}.
872     %
873     \item
874     The AD tool is invoked with the {\bf <MODE>\_<TOOL>\_FLAGS}.
875     The default AD tool flags in {\tt genmake2} can be overrwritten by
876     an {\tt adjoint\_options} file (similar to the platform-specific
877     {\tt build\_options}, see Section ???.
878     The AD tool writes the resulting AD code into the file
879     {\tt <MODE>\_input\_code\_ad.f}
880     %
881     \item
882     A short sed script {\tt adjoint\_sed} is applied to
883     {\tt <MODE>\_input\_code\_ad.f}
884     to reinstate {\bf myThid} into the CALL argument list of active file I/O.
885     The result is written to file {\tt <MODE>\_<TOOL>\_output.f}.
886     %
887     \item
888     All routines are compiled and an executable is generated
889     (see Table ???).
890     %
891     \end{enumerate}
892    
893     \subsubsection{The list AD\_FILES and {\tt .list} files}
894    
895     Not all routines are presented to the AD tool.
896     Routines typically hidden are diagnostics routines which
897     do not influence the cost function, but may create
898     artificial flow dependencies such as I/O of active variables.
899    
900     {\tt genmake2} generates a list (or variable) {\bf AD\_FILES}
901     which contains all routines that are shown to the AD tool.
902     This list is put together from all files with suffix {\tt .list}
903     that {\tt genmake2} finds in its search directories.
904     The list file for the core MITgcm routines is in {\tt model/src/}
905     is called {\tt model\_ad\_diff.list}.
906     Note that no wrapper routine is shown to TAF. These are either
907     not visible at all to the AD code, or hand-written AD code
908     is available (see next section).
909    
910     Each package directory contains its package-specific
911     list file {\tt <PKG>\_ad\_diff.list}. For example,
912     {\tt pkg/ptracers/} contains the file {\tt ptracers\_ad\_diff.list}.
913     Thus, enabling a package will automatically extend the
914     {\bf AD\_FILES} list of {\tt genmake2} to incorporate the
915     package-specific routines.
916     Note that you will need to regenerate the {\tt Makefile} if
917     you enable a package (e.g. by adding it to {\tt packages.conf})
918     and a {\tt Makefile} already exists.
919    
920     \subsubsection{The list AD\_FLOW\_FILES and {\tt .flow} files}
921    
922     TAMC and TAF can evaluate user-specified directives
923     that start with a specific syntax ({\tt CADJ}, {\tt C\$TAF}, {\tt !\$TAF}).
924     The main categories of directives are STORE directives and
925     FLOW directives. Here, we are concerned with flow directives,
926     store directives are treated elsewhere.
927    
928     Flow directives enable the AD tool to evaluate how it should treat
929     routines that are 'hidden' by the user, i.e. routines which are
930     not contained in the {\bf AD\_FILES} list (see previous section),
931     but which are called in part of the code that the AD tool does see.
932     The flow directive tell the AD tool
933     %
934     \begin{itemize}
935     %
936     \item which subroutine arguments are input/output
937     \item which subroutine arguments are active
938     \item which subroutine arguments are required to compute the cost
939     \item which subroutine arguments are dependent
940     %
941     \end{itemize}
942     %
943     The syntax for the flow directives can be found in the
944     AD tool manuals.
945    
946     {\tt genmake2} generates a list (or variable) {\bf AD\_FLOW\_FILES}
947     which contains all files with suffix{\tt .flow} that it finds
948     in its search directories.
949     The flow directives for the core MITgcm routines of
950     {\tt eesupp/src/} and {\tt model/src/}
951     reside in {\tt pkg/autodiff/}.
952     This directory also contains hand-written adjoint code
953     for the MITgcm WRAPPER (see Section ???).
954    
955     Flow directives for package-specific routines are contained in
956     the corresponding package directories in the file
957     {\tt <PKG>\_ad.flow}, e.g. ptracers-specific directives are in
958     {\tt ptracers\_ad.flow}.
959    
960     \subsubsection{Store directives for 3-level checkpointing}
961    
962     The storing that is required at each period of the
963     3-level checkpointing is controled by three
964     top-level headers.
965    
966     \begin{verbatim}
967     do ilev_3 = 1, nchklev_3
968     # include ``checkpoint_lev3.h''
969     do ilev_2 = 1, nchklev_2
970     # include ``checkpoint_lev2.h''
971     do ilev_1 = 1, nchklev_1
972     # include ``checkpoint_lev1.h''
973    
974     ...
975    
976     end do
977     end do
978     end do
979     \end{verbatim}
980    
981     All files {\tt checkpoint\_lev?.h} are contained in directory
982     {\tt pkg/autodiff/}.
983    
984    
985     \subsubsection{Changing the default AD tool flags: ad\_options files}
986    
987    
988     \subsubsection{Hand-written adjoint code}
989    
990     %------------------------------------------------------------------
991    
992 heimbach 1.4 \subsection{The cost function (dependent variable)
993     \label{section_cost}}
994 adcroft 1.1
995     The cost function $ {\cal J} $ is referred to as the {\sf dependent variable}.
996     It is a function of the input variables $ \vec{u} $ via the composition
997     $ {\cal J}(\vec{u}) \, = \, {\cal J}(M(\vec{u})) $.
998 heimbach 1.15 The input are referred to as the
999 adcroft 1.1 {\sf independent variables} or {\sf control variables}.
1000     All aspects relevant to the treatment of the cost function $ {\cal J} $
1001 cnh 1.7 (parameter setting, initialization, accumulation,
1002 heimbach 1.4 final evaluation), are controlled by the package {\it pkg/cost}.
1003 heimbach 1.15 The aspects relevant to the treatment of the independent variables
1004     are controlled by the package {\it pkg/ctrl} and will be treated
1005     in the next section.
1006 heimbach 1.4
1007     \input{part5/doc_cost_flow}
1008 adcroft 1.1
1009 heimbach 1.16 \subsubsection{Enabling the package}
1010    
1011 adcroft 1.1 \fbox{
1012     \begin{minipage}{12cm}
1013 heimbach 1.16 {\it packages.conf}, {\it ECCO\_CPPOPTIONS.h}
1014 adcroft 1.1 \end{minipage}
1015     }
1016 heimbach 1.16 \begin{itemize}
1017 adcroft 1.1 %
1018 heimbach 1.16 \item
1019     The package is enabled by adding {\it cost} to your file {\it packages.conf}
1020     (see Section ???)
1021 adcroft 1.1 %
1022 heimbach 1.16 \item
1023    
1024    
1025     \end{itemize}
1026 adcroft 1.1 %
1027 heimbach 1.16
1028 heimbach 1.15 N.B.: In general the following packages ought to be enabled
1029     simultaneously: {\it autodiff, cost, ctrl}.
1030 heimbach 1.4 The basic CPP option to enable the cost function is {\bf ALLOW\_COST}.
1031     Each specific cost function contribution has its own option.
1032     For the present example the option is {\bf ALLOW\_COST\_TRACER}.
1033     All cost-specific options are set in {\it ECCO\_CPPOPTIONS.h}
1034 adcroft 1.1 Since the cost function is usually used in conjunction with
1035     automatic differentiation, the CPP option
1036 heimbach 1.15 {\bf ALLOW\_ADJOINT\_RUN} (file {\it CPP\_OPTIONS.h}) and
1037     {\bf ALLOW\_AUTODIFF\_TAMC} (file {\it ECCO\_CPPOPTIONS.h})
1038     should be defined.
1039 adcroft 1.1
1040 cnh 1.7 \subsubsection{Initialization}
1041 adcroft 1.1 %
1042 cnh 1.7 The initialization of the {\it cost} package is readily enabled
1043 heimbach 1.15 as soon as the CPP option {\bf ALLOW\_COST} is defined.
1044 adcroft 1.1 %
1045     \begin{itemize}
1046     %
1047     \item
1048     \fbox{
1049     \begin{minipage}{12cm}
1050     Parameters: {\it cost\_readparms}
1051     \end{minipage}
1052     }
1053     \\
1054     This S/R
1055     reads runtime flags and parameters from file {\it data.cost}.
1056     For the present example the only relevant parameter read
1057     is {\bf mult\_tracer}. This multiplier enables different
1058     cost function contributions to be switched on
1059     ( = 1.) or off ( = 0.) at runtime.
1060     For more complex cost functions which involve model vs. data
1061     misfits, the corresponding data filenames and data
1062     specifications (start date and time, period, ...) are read
1063     in this S/R.
1064     %
1065     \item
1066     \fbox{
1067     \begin{minipage}{12cm}
1068     Variables: {\it cost\_init}
1069     \end{minipage}
1070     }
1071     \\
1072     This S/R
1073 cnh 1.7 initializes the different cost function contributions.
1074     The contribution for the present example is {\bf objf\_tracer}
1075 adcroft 1.1 which is defined on each tile (bi,bj).
1076     %
1077     \end{itemize}
1078     %
1079 heimbach 1.4 \subsubsection{Accumulation}
1080 adcroft 1.1 %
1081     \begin{itemize}
1082     %
1083     \item
1084     \fbox{
1085     \begin{minipage}{12cm}
1086     {\it cost\_tile}, {\it cost\_tracer}
1087     \end{minipage}
1088     }
1089     \end{itemize}
1090     %
1091     The 'driver' routine
1092     {\it cost\_tile} is called at the end of each time step.
1093     Within this 'driver' routine, S/R are called for each of
1094     the chosen cost function contributions.
1095     In the present example ({\bf ALLOW\_COST\_TRACER}),
1096     S/R {\it cost\_tracer} is called.
1097     It accumulates {\bf objf\_tracer} according to eqn. (\ref{???}).
1098     %
1099     \subsubsection{Finalize all contributions}
1100     %
1101     \begin{itemize}
1102     %
1103     \item
1104     \fbox{
1105     \begin{minipage}{12cm}
1106     {\it cost\_final}
1107     \end{minipage}
1108     }
1109     \end{itemize}
1110     %
1111     At the end of the forward integration S/R {\it cost\_final}
1112     is called. It accumulates the total cost function {\bf fc}
1113     from each contribution and sums over all tiles:
1114     \begin{equation}
1115     {\cal J} \, = \,
1116     {\rm fc} \, = \,
1117 heimbach 1.15 {\rm mult\_tracer} \sum_{\text{global sum}} \sum_{bi,\,bj}^{nSx,\,nSy}
1118 adcroft 1.1 {\rm objf\_tracer}(bi,bj) \, + \, ...
1119     \end{equation}
1120     %
1121     The total cost function {\bf fc} will be the
1122     'dependent' variable in the argument list for TAMC, i.e.
1123     \begin{verbatim}
1124     tamc -output 'fc' ...
1125     \end{verbatim}
1126    
1127 cnh 1.3 %%%% \end{document}
1128 adcroft 1.1
1129     \input{part5/doc_ad_the_main}
1130    
1131 heimbach 1.4 \subsection{The control variables (independent variables)
1132     \label{section_ctrl}}
1133 adcroft 1.1
1134     The control variables are a subset of the model input
1135     (initial conditions, boundary conditions, model parameters).
1136     Here we identify them with the variable $ \vec{u} $.
1137     All intermediate variables whose derivative w.r.t. control
1138 heimbach 1.4 variables do not vanish are called {\sf active variables}.
1139 adcroft 1.1 All subroutines whose derivative w.r.t. the control variables
1140     don't vanish are called {\sf active routines}.
1141     Read and write operations from and to file can be viewed
1142     as variable assignments. Therefore, files to which
1143     active variables are written and from which active variables
1144     are read are called {\sf active files}.
1145     All aspects relevant to the treatment of the control variables
1146 cnh 1.7 (parameter setting, initialization, perturbation)
1147     are controlled by the package {\it pkg/ctrl}.
1148 adcroft 1.1
1149 heimbach 1.4 \input{part5/doc_ctrl_flow}
1150    
1151 adcroft 1.1 \subsubsection{genmake and CPP options}
1152     %
1153     \begin{itemize}
1154     %
1155     \item
1156     \fbox{
1157     \begin{minipage}{12cm}
1158     {\it genmake}, {\it CPP\_OPTIONS.h}, {\it ECCO\_CPPOPTIONS.h}
1159     \end{minipage}
1160     }
1161     \end{itemize}
1162     %
1163     To enable the directory to be included to the compile list,
1164     {\bf ctrl} has to be added to the {\bf enable} list in
1165 heimbach 1.15 {\it .genmakerc} or in {\it genmake} itself (analogous to {\it cost}
1166     package, cf. previous section).
1167 adcroft 1.1 Each control variable is enabled via its own CPP option
1168     in {\it ECCO\_CPPOPTIONS.h}.
1169    
1170 cnh 1.7 \subsubsection{Initialization}
1171 adcroft 1.1 %
1172     \begin{itemize}
1173     %
1174     \item
1175     \fbox{
1176     \begin{minipage}{12cm}
1177     Parameters: {\it ctrl\_readparms}
1178     \end{minipage}
1179     }
1180     \\
1181     %
1182     This S/R
1183     reads runtime flags and parameters from file {\it data.ctrl}.
1184     For the present example the file contains the file names
1185     of each control variable that is used.
1186     In addition, the number of wet points for each control
1187     variable and the net dimension of the space of control
1188     variables (counting wet points only) {\bf nvarlength}
1189     is determined.
1190     Masks for wet points for each tile {\bf (bi,\,bj)}
1191     and vertical layer {\bf k} are generated for the three
1192     relevant categories on the C-grid:
1193     {\bf nWetCtile} for tracer fields,
1194     {\bf nWetWtile} for zonal velocity fields,
1195     {\bf nWetStile} for meridional velocity fields.
1196     %
1197     \item
1198     \fbox{
1199     \begin{minipage}{12cm}
1200     Control variables, control vector,
1201     and their gradients: {\it ctrl\_unpack}
1202     \end{minipage}
1203     }
1204     \\
1205     %
1206     Two important issues related to the handling of the control
1207     variables in the MITGCM need to be addressed.
1208     First, in order to save memory, the control variable arrays
1209     are not kept in memory, but rather read from file and added
1210 cnh 1.7 to the initial fields during the model initialization phase.
1211 adcroft 1.1 Similarly, the corresponding adjoint fields which represent
1212     the gradient of the cost function w.r.t. the control variables
1213 heimbach 1.4 are written to file at the end of the adjoint integration.
1214 adcroft 1.1 Second, in addition to the files holding the 2-dim. and 3-dim.
1215 heimbach 1.4 control variables and the corresponding cost gradients,
1216     a 1-dim. {\sf control vector}
1217 adcroft 1.1 and {\sf gradient vector} are written to file. They contain
1218     only the wet points of the control variables and the corresponding
1219     gradient.
1220     This leads to a significant data compression.
1221 heimbach 1.4 Furthermore, an option is available
1222     ({\tt ALLOW\_NONDIMENSIONAL\_CONTROL\_IO}) to
1223     non-dimensionalise the control and gradient vector,
1224     which otherwise would contain different pieces of different
1225     magnitudes and units.
1226     Finally, the control and gradient vector can be passed to a
1227 adcroft 1.1 minimization routine if an update of the control variables
1228     is sought as part of a minimization exercise.
1229    
1230     The files holding fields and vectors of the control variables
1231     and gradient are generated and initialised in S/R {\it ctrl\_unpack}.
1232     %
1233     \end{itemize}
1234    
1235     \subsubsection{Perturbation of the independent variables}
1236     %
1237 heimbach 1.4 The dependency flow for differentiation w.r.t. the controls
1238     starts with adding a perturbation onto the input variable,
1239 adcroft 1.1 thus defining the independent or control variables for TAMC.
1240 heimbach 1.4 Three types of controls may be considered:
1241 adcroft 1.1 %
1242     \begin{itemize}
1243     %
1244     \item
1245     \fbox{
1246     \begin{minipage}{12cm}
1247     {\it ctrl\_map\_ini} (initial value sensitivity):
1248     \end{minipage}
1249     }
1250     \\
1251     %
1252     Consider as an example the initial tracer distribution
1253     {\bf tr1} as control variable.
1254     After {\bf tr1} has been initialised in
1255 heimbach 1.4 {\it ini\_tr1} (dynamical variables such as
1256 adcroft 1.1 temperature and salinity are initialised in {\it ini\_fields}),
1257     a perturbation anomaly is added to the field in S/R
1258     {\it ctrl\_map\_ini}
1259     %
1260     \begin{equation}
1261     \begin{split}
1262     u & = \, u_{[0]} \, + \, \Delta u \\
1263     {\bf tr1}(...) & = \, {\bf tr1_{ini}}(...) \, + \, {\bf xx\_tr1}(...)
1264     \label{perturb}
1265     \end{split}
1266     \end{equation}
1267     %
1268 heimbach 1.4 {\bf xx\_tr1} is a 3-dim. global array
1269 adcroft 1.1 holding the perturbation. In the case of a simple
1270     sensitivity study this array is identical to zero.
1271 heimbach 1.4 However, it's specification is essential in the context
1272     of automatic differentiation since TAMC
1273 adcroft 1.1 treats the corresponding line in the code symbolically
1274     when determining the differentiation chain and its origin.
1275     Thus, the variable names are part of the argument list
1276     when calling TAMC:
1277     %
1278     \begin{verbatim}
1279     tamc -input 'xx_tr1 ...' ...
1280     \end{verbatim}
1281     %
1282     Now, as mentioned above, the MITGCM avoids maintaining
1283     an array for each control variable by reading the
1284     perturbation to a temporary array from file.
1285     To ensure the symbolic link to be recognized by TAMC, a scalar
1286     dummy variable {\bf xx\_tr1\_dummy} is introduced
1287     and an 'active read' routine of the adjoint support
1288     package {\it pkg/autodiff} is invoked.
1289     The read-procedure is tagged with the variable
1290 cnh 1.7 {\bf xx\_tr1\_dummy} enabling TAMC to recognize the
1291     initialization of the perturbation.
1292 adcroft 1.1 The modified call of TAMC thus reads
1293     %
1294     \begin{verbatim}
1295     tamc -input 'xx_tr1_dummy ...' ...
1296     \end{verbatim}
1297     %
1298     and the modified operation to (\ref{perturb})
1299     in the code takes on the form
1300     %
1301     \begin{verbatim}
1302     call active_read_xyz(
1303     & ..., tmpfld3d, ..., xx_tr1_dummy, ... )
1304    
1305     tr1(...) = tr1(...) + tmpfld3d(...)
1306     \end{verbatim}
1307     %
1308     Note, that reading an active variable corresponds
1309     to a variable assignment. Its derivative corresponds
1310 heimbach 1.15 to a write statement of the adjoint variable, followed by
1311     a reset.
1312 adcroft 1.1 The 'active file' routines have been designed
1313 heimbach 1.4 to support active read and corresponding adjoint active write
1314     operations (and vice versa).
1315 adcroft 1.1 %
1316     \item
1317     \fbox{
1318     \begin{minipage}{12cm}
1319     {\it ctrl\_map\_forcing} (boundary value sensitivity):
1320     \end{minipage}
1321     }
1322     \\
1323     %
1324     The handling of boundary values as control variables
1325     proceeds exactly analogous to the initial values
1326     with the symbolic perturbation taking place in S/R
1327     {\it ctrl\_map\_forcing}.
1328     Note however an important difference:
1329     Since the boundary values are time dependent with a new
1330     forcing field applied at each time steps,
1331 heimbach 1.4 the general problem may be thought of as
1332     a new control variable at each time step
1333     (or, if the perturbation is averaged over a certain period,
1334     at each $ N $ timesteps), i.e.
1335 adcroft 1.1 \[
1336     u_{\rm forcing} \, = \,
1337     \{ \, u_{\rm forcing} ( t_n ) \, \}_{
1338     n \, = \, 1, \ldots , {\rm nTimeSteps} }
1339     \]
1340     %
1341     In the current example an equilibrium state is considered,
1342     and only an initial perturbation to
1343     surface forcing is applied with respect to the
1344     equilibrium state.
1345     A time dependent treatment of the surface forcing is
1346     implemented in the ECCO environment, involving the
1347     calendar ({\it cal}~) and external forcing ({\it exf}~) packages.
1348     %
1349     \item
1350     \fbox{
1351     \begin{minipage}{12cm}
1352     {\it ctrl\_map\_params} (parameter sensitivity):
1353     \end{minipage}
1354     }
1355     \\
1356     %
1357     This routine is not yet implemented, but would proceed
1358     proceed along the same lines as the initial value sensitivity.
1359 heimbach 1.4 The mixing parameters {\bf diffkr} and {\bf kapgm}
1360     are currently added as controls in {\it ctrl\_map\_ini.F}.
1361 adcroft 1.1 %
1362     \end{itemize}
1363     %
1364    
1365     \subsubsection{Output of adjoint variables and gradient}
1366     %
1367 heimbach 1.4 Several ways exist to generate output of adjoint fields.
1368 adcroft 1.1 %
1369     \begin{itemize}
1370     %
1371     \item
1372     \fbox{
1373     \begin{minipage}{12cm}
1374 heimbach 1.4 {\it ctrl\_map\_ini, ctrl\_map\_forcing}:
1375 adcroft 1.1 \end{minipage}
1376     }
1377     \\
1378     \begin{itemize}
1379     %
1380 heimbach 1.4 \item {\bf xx\_...}: the control variable fields \\
1381     Before the forward integration, the control
1382     variables are read from file {\bf xx\_ ...} and added to
1383     the model field.
1384 adcroft 1.1 %
1385     \item {\bf adxx\_...}: the adjoint variable fields, i.e. the gradient
1386 heimbach 1.4 $ \nabla _{u}{\cal J} $ for each control variable \\
1387     After the adjoint integration the corresponding adjoint
1388     variables are written to {\bf adxx\_ ...}.
1389 adcroft 1.1 %
1390 heimbach 1.4 \end{itemize}
1391 adcroft 1.1 %
1392 heimbach 1.4 \item
1393     \fbox{
1394     \begin{minipage}{12cm}
1395     {\it ctrl\_unpack, ctrl\_pack}:
1396     \end{minipage}
1397     }
1398     \\
1399     %
1400     \begin{itemize}
1401     %
1402     \item {\bf vector\_ctrl}: the control vector \\
1403 cnh 1.7 At the very beginning of the model initialization,
1404 heimbach 1.4 the updated compressed control vector is read (or initialised)
1405     and distributed to 2-dim. and 3-dim. control variable fields.
1406     %
1407     \item {\bf vector\_grad}: the gradient vector \\
1408     At the very end of the adjoint integration,
1409     the 2-dim. and 3-dim. adjoint variables are read,
1410     compressed to a single vector and written to file.
1411 adcroft 1.1 %
1412     \end{itemize}
1413     %
1414     \item
1415     \fbox{
1416     \begin{minipage}{12cm}
1417     {\it addummy\_in\_stepping}:
1418     \end{minipage}
1419     }
1420     \\
1421     In addition to writing the gradient at the end of the
1422 heimbach 1.4 forward/adjoint integration, many more adjoint variables
1423     of the model state
1424     at intermediate times can be written using S/R
1425 adcroft 1.1 {\it addummy\_in\_stepping}.
1426     This routine is part of the adjoint support package
1427     {\it pkg/autodiff} (cf.f. below).
1428 heimbach 1.15 The procedure is enabled using via the CPP-option
1429     {\bf ALLOW\_AUTODIFF\_MONITOR} (file {\it ECCO\_CPPOPTIONS.h}).
1430 adcroft 1.1 To be part of the adjoint code, the corresponding S/R
1431     {\it dummy\_in\_stepping} has to be called in the forward
1432     model (S/R {\it the\_main\_loop}) at the appropriate place.
1433 heimbach 1.15 The adjoint common blocks are extracted from the adjoint code
1434     via the header file {\it adcommon.h}.
1435 adcroft 1.1
1436     {\it dummy\_in\_stepping} is essentially empty,
1437     the corresponding adjoint routine is hand-written rather
1438     than generated automatically.
1439     Appropriate flow directives ({\it dummy\_in\_stepping.flow})
1440     ensure that TAMC does not automatically
1441     generate {\it addummy\_in\_stepping} by trying to differentiate
1442 heimbach 1.4 {\it dummy\_in\_stepping}, but instead refers to
1443     the hand-written routine.
1444 adcroft 1.1
1445     {\it dummy\_in\_stepping} is called in the forward code
1446     at the beginning of each
1447     timestep, before the call to {\it dynamics}, thus ensuring
1448     that {\it addummy\_in\_stepping} is called at the end of
1449     each timestep in the adjoint calculation, after the call to
1450     {\it addynamics}.
1451    
1452     {\it addummy\_in\_stepping} includes the header files
1453 heimbach 1.4 {\it adcommon.h}.
1454     This header file is also hand-written. It contains
1455     the common blocks
1456     {\bf /addynvars\_r/}, {\bf /addynvars\_cd/},
1457     {\bf /addynvars\_diffkr/}, {\bf /addynvars\_kapgm/},
1458 adcroft 1.1 {\bf /adtr1\_r/}, {\bf /adffields/},
1459     which have been extracted from the adjoint code to enable
1460     access to the adjoint variables.
1461 heimbach 1.15
1462     {\bf WARNING:} If the structure of the common blocks
1463     {\bf /dynvars\_r/}, {\bf /dynvars\_cd/}, etc., changes
1464     similar changes will occur in the adjoint common blocks.
1465     Therefore, consistency between the TAMC-generated common blocks
1466     and those in {\it adcommon.h} have to be checked.
1467 adcroft 1.1 %
1468     \end{itemize}
1469    
1470    
1471     \subsubsection{Control variable handling for
1472     optimization applications}
1473    
1474     In optimization mode the cost function $ {\cal J}(u) $ is sought
1475     to be minimized with respect to a set of control variables
1476     $ \delta {\cal J} \, = \, 0 $, in an iterative manner.
1477     The gradient $ \nabla _{u}{\cal J} |_{u_{[k]}} $ together
1478     with the value of the cost function itself $ {\cal J}(u_{[k]}) $
1479     at iteration step $ k $ serve
1480     as input to a minimization routine (e.g. quasi-Newton method,
1481 heimbach 1.9 conjugate gradient, ... \cite{gil-lem:89})
1482 heimbach 1.4 to compute an update in the
1483 adcroft 1.1 control variable for iteration step $k+1$
1484     \[
1485     u_{[k+1]} \, = \, u_{[0]} \, + \, \Delta u_{[k+1]}
1486     \quad \mbox{satisfying} \quad
1487     {\cal J} \left( u_{[k+1]} \right) \, < \, {\cal J} \left( u_{[k]} \right)
1488     \]
1489     $ u_{[k+1]} $ then serves as input for a forward/adjoint run
1490     to determine $ {\cal J} $ and $ \nabla _{u}{\cal J} $ at iteration step
1491     $ k+1 $.
1492     Tab. \ref{???} sketches the flow between forward/adjoint model
1493     and the minimization routine.
1494    
1495     \begin{eqnarray*}
1496 heimbach 1.4 \scriptsize
1497 adcroft 1.1 \begin{array}{ccccc}
1498     u_{[0]} \,\, , \,\, \Delta u_{[k]} & ~ & ~ & ~ & ~ \\
1499     {\Big\downarrow}
1500     & ~ & ~ & ~ & ~ \\
1501     ~ & ~ & ~ & ~ & ~ \\
1502     \hline
1503     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1504     \multicolumn{1}{|c}{
1505     u_{[k]} = u_{[0]} + \Delta u_{[k]}} &
1506     \stackrel{\bf forward}{\bf \longrightarrow} &
1507     v_{[k]} = M \left( u_{[k]} \right) &
1508     \stackrel{\bf forward}{\bf \longrightarrow} &
1509     \multicolumn{1}{c|}{
1510     {\cal J}_{[k]} = {\cal J} \left( M \left( u_{[k]} \right) \right)} \\
1511     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1512     \hline
1513 heimbach 1.4 \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1514     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{{\Big\downarrow}} \\
1515     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1516 adcroft 1.1 \hline
1517     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1518     \multicolumn{1}{|c}{
1519     \nabla_u {\cal J}_{[k]} (\delta {\cal J}) =
1520 heimbach 1.4 T^{\ast} \cdot \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J})} &
1521 adcroft 1.1 \stackrel{\bf adjoint}{\mathbf \longleftarrow} &
1522     ad \, v_{[k]} (\delta {\cal J}) =
1523     \nabla_v {\cal J} |_{v_{[k]}} (\delta {\cal J}) &
1524     \stackrel{\bf adjoint}{\mathbf \longleftarrow} &
1525     \multicolumn{1}{c|}{ ad \, {\cal J} = \delta {\cal J}} \\
1526     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1527     \hline
1528     ~ & ~ & ~ & ~ & ~ \\
1529 heimbach 1.4 \hspace*{15ex}{\Bigg\downarrow}
1530     \quad {\cal J}_{[k]}, \quad \nabla_u {\cal J}_{[k]}
1531     & ~ & ~ & ~ & ~ \\
1532 adcroft 1.1 ~ & ~ & ~ & ~ & ~ \\
1533     \hline
1534     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1535     \multicolumn{1}{|c}{
1536     {\cal J}_{[k]} \,\, , \,\, \nabla_u {\cal J}_{[k]}} &
1537     {\mathbf \longrightarrow} & \text{\bf minimisation} &
1538     {\mathbf \longrightarrow} &
1539     \multicolumn{1}{c|}{ \Delta u_{[k+1]}} \\
1540     \multicolumn{1}{|c}{~} & ~ & ~ & ~ & \multicolumn{1}{c|}{~} \\
1541     \hline
1542     ~ & ~ & ~ & ~ & ~ \\
1543     ~ & ~ & ~ & ~ & \Big\downarrow \\
1544     ~ & ~ & ~ & ~ & \Delta u_{[k+1]} \\
1545     \end{array}
1546     \end{eqnarray*}
1547    
1548     The routines {\it ctrl\_unpack} and {\it ctrl\_pack} provide
1549     the link between the model and the minimization routine.
1550     As described in Section \ref{???}
1551     the {\it unpack} and {\it pack} routines read and write
1552     control and gradient {\it vectors} which are compressed
1553     to contain only wet points, in addition to the full
1554     2-dim. and 3-dim. fields.
1555     The corresponding I/O flow looks as follows:
1556    
1557     \vspace*{0.5cm}
1558    
1559 heimbach 1.4 {\scriptsize
1560 adcroft 1.1 \begin{tabular}{ccccc}
1561     {\bf vector\_ctrl\_$<$k$>$ } & ~ & ~ & ~ & ~ \\
1562     {\big\downarrow} & ~ & ~ & ~ & ~ \\
1563     \cline{1-1}
1564     \multicolumn{1}{|c|}{\it ctrl\_unpack} & ~ & ~ & ~ & ~ \\
1565     \cline{1-1}
1566     {\big\downarrow} & ~ & ~ & ~ & ~ \\
1567     \cline{3-3}
1568     \multicolumn{1}{l}{\bf xx\_theta0...$<$k$>$} & ~ &
1569     \multicolumn{1}{|c|}{~} & ~ & ~ \\
1570 heimbach 1.4 \multicolumn{1}{l}{\bf xx\_salt0...$<$k$>$} &
1571     $\stackrel{\mbox{read}}{\longrightarrow}$ &
1572 adcroft 1.1 \multicolumn{1}{|c|}{forward integration} & ~ & ~ \\
1573     \multicolumn{1}{l}{\bf \vdots} & ~ & \multicolumn{1}{|c|}{~}
1574     & ~ & ~ \\
1575     \cline{3-3}
1576 heimbach 1.4 ~ & ~ & $\downarrow$ & ~ & ~ \\
1577 adcroft 1.1 \cline{3-3}
1578     ~ & ~ &
1579     \multicolumn{1}{|c|}{~} & ~ &
1580     \multicolumn{1}{l}{\bf adxx\_theta0...$<$k$>$} \\
1581     ~ & ~ & \multicolumn{1}{|c|}{adjoint integration} &
1582 heimbach 1.4 $\stackrel{\mbox{write}}{\longrightarrow}$ &
1583 adcroft 1.1 \multicolumn{1}{l}{\bf adxx\_salt0...$<$k$>$} \\
1584     ~ & ~ & \multicolumn{1}{|c|}{~}
1585     & ~ & \multicolumn{1}{l}{\bf \vdots} \\
1586     \cline{3-3}
1587     ~ & ~ & ~ & ~ & {\big\downarrow} \\
1588     \cline{5-5}
1589     ~ & ~ & ~ & ~ & \multicolumn{1}{|c|}{\it ctrl\_pack} \\
1590     \cline{5-5}
1591     ~ & ~ & ~ & ~ & {\big\downarrow} \\
1592     ~ & ~ & ~ & ~ & {\bf vector\_grad\_$<$k$>$ } \\
1593     \end{tabular}
1594 heimbach 1.4 }
1595 adcroft 1.1
1596     \vspace*{0.5cm}
1597    
1598    
1599 heimbach 1.4 {\it ctrl\_unpack} reads the updated control vector
1600 adcroft 1.1 {\bf vector\_ctrl\_$<$k$>$}.
1601     It distributes the different control variables to
1602     2-dim. and 3-dim. files {\it xx\_...$<$k$>$}.
1603 heimbach 1.4 At the start of the forward integration the control variables
1604     are read from {\it xx\_...$<$k$>$} and added to the
1605     field.
1606     Correspondingly, at the end of the adjoint integration
1607     the adjoint fields are written
1608 adcroft 1.1 to {\it adxx\_...$<$k$>$}, again via the active file routines.
1609 heimbach 1.4 Finally, {\it ctrl\_pack} collects all adjoint files
1610 adcroft 1.1 and writes them to the compressed vector file
1611     {\bf vector\_grad\_$<$k$>$}.

  ViewVC Help
Powered by ViewVC 1.1.22