1 |
dfer |
1.10 |
% $Header: /u/gcmpack/manual/part3/case_studies/global_oce_estimation/global_oce_estimation.tex,v 1.9 2008/01/15 18:54:54 jmc Exp $ |
2 |
dfer |
1.1 |
% $Name: $ |
3 |
|
|
|
4 |
dfer |
1.5 |
\section[Global Ocean State Estimation Example]{Global Ocean State Estimation at $4^\circ$ Resolution} |
5 |
dfer |
1.1 |
\label{www:tutorials} |
6 |
|
|
\label{sect:eg-global_state_estimate} |
7 |
|
|
\begin{rawhtml} |
8 |
|
|
<!-- CMIREDIR:eg-global_state_estimate: --> |
9 |
|
|
\end{rawhtml} |
10 |
jmc |
1.9 |
\begin{center} |
11 |
|
|
(in directory: {\it verification/tutorial\_global\_oce\_optim/}) |
12 |
|
|
\end{center} |
13 |
dfer |
1.1 |
|
14 |
dfer |
1.3 |
\subsection{Overview} |
15 |
|
|
|
16 |
dfer |
1.5 |
Mean surface heat flux as a control variable : $Q_\mathrm{netm}$ |
17 |
dfer |
1.1 |
|
18 |
dfer |
1.10 |
This experiment illustrates the optimization capacity of the MITgcm. |
19 |
|
|
Using an ocean configuration with realistic geography and bathymetry on a |
20 |
|
|
$4\times4^\circ$ spherical polar grid, we estimate a time-independent surface |
21 |
|
|
heat flux adjustment $Q_\mathrm{netm}$ that attempts to bring the model |
22 |
|
|
climatology into consistency with observations (Levitus climatology). |
23 |
|
|
The files for this experiment can be found in the verification directory under |
24 |
molod |
1.7 |
tutorial\_global\_oce\_optim. |
25 |
dfer |
1.1 |
|
26 |
dfer |
1.10 |
This adjustment $Q_\mathrm{netm}$ (a 2D field only function of longitude and |
27 |
|
|
latitude) is the control variable of an optimization problem. It is inferred |
28 |
|
|
by an iterative procedure using an `adjoint technique' and a least-squares |
29 |
|
|
method (see, for example, Stammer et al. (2002) and Ferreira et al. (2005)). |
30 |
dfer |
1.1 |
|
31 |
|
|
The ocean model is run forward in time and the quality of the solution is |
32 |
|
|
determined by a cost function, $J_1$, a measure of the departure of the model |
33 |
|
|
climatology from observations: |
34 |
|
|
\begin{equation} |
35 |
|
|
J_1=\frac{1}{N}\sum_{i=1}^N \left[ \frac{\overline{T}_i-\overline{T}_i^{lev}}{\sigma_i^T}\right]^2 |
36 |
|
|
\end{equation} |
37 |
dfer |
1.10 |
where $\overline{T}_i$ is the average model potential temperature and |
38 |
|
|
$\overline{T}_i^{lev}$ the annual mean observed potential temperature at each |
39 |
|
|
grid point $i$. The differences are weighted by an {\it a priori} uncertainty |
40 |
|
|
$\sigma_i^T$ on observations (as provided by Levitus and Boyer 1994). The error |
41 |
|
|
$\sigma_i^T$ is only a function of depth and varies from 0.5 at the surface to |
42 |
|
|
0.05~K at the bottom of the ocean, mainly reflecting the decreasing |
43 |
|
|
temperature variance with depth (Fig. \ref{Error}a). A value of $J_1$ of |
44 |
|
|
order 1 means that the model is, on average, within observational |
45 |
|
|
uncertainties. |
46 |
dfer |
1.1 |
|
47 |
dfer |
1.10 |
The cost function also places constraints on the adjustment to insure it is |
48 |
|
|
"reasonable", i.e. of order of the uncertainties on the observed surface heat |
49 |
dfer |
1.1 |
flux: |
50 |
|
|
\begin{equation} |
51 |
dfer |
1.10 |
J_2 = \frac{1}{N} \sum_{i=1}^N \left[\frac{Q_\mathrm{netm}}{\sigma^Q_i} \right]^2 |
52 |
dfer |
1.1 |
\end{equation} |
53 |
dfer |
1.10 |
where $\sigma^Q_i$ are the {\it a priori} errors on the observed heat flux as |
54 |
|
|
estimated by Stammer et al. (2002) from 30\% of local root-mean-square |
55 |
|
|
variability of the NCEP forcing field (Fig \ref{Error}b ). |
56 |
dfer |
1.1 |
|
57 |
dfer |
1.10 |
The total cost function is defined as $J=\lambda_1 J_1+ \lambda_2 J_2$ where |
58 |
dfer |
1.1 |
$\lambda_1$ and $\lambda_2$ are weights controlling the relative contribution |
59 |
dfer |
1.10 |
of the two components. The adjoint model then yields the sensitivities |
60 |
dfer |
1.1 |
$\partial J/\partial Q_\mathrm{netm}$ of $J$ relative to the 2D fields |
61 |
dfer |
1.10 |
$Q_\mathrm{netm}$. Using a line-searching algorithm (Gilbert and |
62 |
|
|
Lemar\'{e}chal, 1989), $Q_\mathrm{netm}$ is adjusted then in the sense to |
63 |
|
|
reduce $J$ --- the procedure is repeated until convergence. |
64 |
|
|
|
65 |
|
|
In the following example, the configuration is identical to the ``Global Ocean |
66 |
|
|
circulation'' tutorial where more details can be found. In each iteration, the |
67 |
|
|
model is started from rest with temperature and salinity initial conditions |
68 |
|
|
taken from Levitus dataset and run for a year. The first guess |
69 |
|
|
$Q_\mathrm{netm}$ is chosen to be zero. The experiment is run for a year |
70 |
|
|
and the temperature cost function $J_1$ is computed using the annual averaged |
71 |
|
|
potential temperature. |
72 |
|
|
|
73 |
|
|
The experiment employs two executables: one for the MITgcm and its adjoint and |
74 |
|
|
one for the line-search algorithm (offline optimization). The implementation |
75 |
|
|
of the control variable $Q_\mathrm{netm}$, the cost function $J$ and the I/O |
76 |
|
|
required for the communication between the model and the line-search are |
77 |
|
|
described in detail in section 2. The compilation of the two executables is |
78 |
|
|
given in section 3. The method used to run the experiment is described in |
79 |
|
|
section 4. |
80 |
|
|
|
81 |
|
|
\begin{figure} [tpb] |
82 |
|
|
\begin{center} |
83 |
|
|
\includegraphics[width=\textwidth,height=.3\textheight]{part3/case_studies/global_oce_estimation/Error.eps} |
84 |
|
|
\caption{{\it A priori} errors used in the cost functions on a) the temperature and b) the surface heat flux.} |
85 |
|
|
\label{Error} |
86 |
|
|
\end{center} |
87 |
|
|
\end{figure} |
88 |
dfer |
1.1 |
|
89 |
|
|
Gilbert, J. C., and C. Lemar\'echal, 1989: Some numerical experiments with |
90 |
|
|
variable-storage quasi-Newton algorithms. \textit{Math. Programm.,} |
91 |
|
|
\textbf{45,} 407-435. |
92 |
|
|
|
93 |
dfer |
1.10 |
\subsection{Implementation of the control variable and the cost function} |
94 |
dfer |
1.1 |
|
95 |
dfer |
1.10 |
All subroutines that require modifications are found in |
96 |
|
|
verifications/Optim/code\_ad |
97 |
dfer |
1.1 |
|
98 |
dfer |
1.3 |
\subsubsection{The control variable} |
99 |
|
|
|
100 |
dfer |
1.10 |
The adjustment $Q_\mathrm{netm}$ is activated by setting |
101 |
|
|
ALLOW\_HFLUXM\_CONTROL to "define" in ECCO\_OPTIONS.h. |
102 |
dfer |
1.1 |
|
103 |
dfer |
1.10 |
It is first implemented as a ``normal'' forcing variable. It is defined in |
104 |
|
|
FFIELDS.h, initialized to zero in ini\_forcing.F, and then used in |
105 |
|
|
external\_forcing\_surf.F. $Q_\mathrm{netm}$ is made a control variable in |
106 |
|
|
the ctrl package by modifying the following subroutines: |
107 |
dfer |
1.1 |
|
108 |
dfer |
1.3 |
\begin{itemize} |
109 |
dfer |
1.10 |
\item ctrl\_init.F where $Q_\mathrm{netm}$ is defined as the control variable |
110 |
|
|
number 24, |
111 |
dfer |
1.1 |
|
112 |
dfer |
1.10 |
\item ctrl\_pack.F which writes, at the end of each iteration, the sensitivity |
113 |
|
|
of the cost function $\partial J/\partial Q_\mathrm{netm}$ in to a file to be |
114 |
|
|
used by the line-search algorithm, |
115 |
dfer |
1.1 |
|
116 |
dfer |
1.10 |
\item ctrl\_unpack.F which reads, at the start of each iteration, the updated |
117 |
|
|
perturbations as provided by the line-search algorithm, |
118 |
dfer |
1.1 |
|
119 |
dfer |
1.10 |
\item ctrl\_map\_forcing.F in which the updated perturbation is added to the |
120 |
|
|
first guess $Q_\mathrm{netm}$. |
121 |
dfer |
1.3 |
\end{itemize} |
122 |
dfer |
1.1 |
|
123 |
dfer |
1.10 |
Note also some minor changes in ctrl.h, ctrl\_readparams.F, and ctrl\_dummy.h |
124 |
|
|
(xx\_hfluxm\_file, fname\_hfluxm, xx\_hfluxm\_dummy). |
125 |
dfer |
1.1 |
|
126 |
dfer |
1.3 |
\subsubsection{Cost functions} |
127 |
|
|
|
128 |
dfer |
1.10 |
The cost functions are implemented using the {\it cost} package. |
129 |
dfer |
1.1 |
|
130 |
dfer |
1.3 |
\begin{itemize} |
131 |
dfer |
1.1 |
|
132 |
dfer |
1.10 |
\item The temperature cost function $J_1$ which measures the drift of the mean |
133 |
|
|
model temperature from the Levitus climatology is implemented cost\_temp.F. |
134 |
|
|
It is activated by ALLOW\_COST\_TEMP in ECCO\_OPTIONS.h. It requires the mean |
135 |
|
|
temperature of the model which is obtained by accumulating the temperature in |
136 |
|
|
cost\_tile.F (called at each time step). |
137 |
|
|
The value of the cost function is stored in {\it objf\_temp} and its weight |
138 |
|
|
$\lambda_1$ in {\it mult\_temp}. |
139 |
|
|
|
140 |
|
|
\item The heat flux cost function, penalizing the departure of the surface |
141 |
|
|
heat flux from observations is implemented in cost\_hflux.F, and activated by |
142 |
|
|
activated ALLOW\_COST\_HFLUXM in ECCO\_OPTIONS.h. The value of the cost |
143 |
|
|
function is stored in {\it objf\_hfluxm} and its weight $\lambda_2$ in |
144 |
|
|
{\it mult\_hfluxm}. |
145 |
dfer |
1.1 |
|
146 |
dfer |
1.3 |
\item The subroutine cost\_final.F calls the cost\_functions subroutines |
147 |
dfer |
1.10 |
and make the (weighted) sum of the various contributions. |
148 |
dfer |
1.1 |
|
149 |
dfer |
1.10 |
\item The various weights used in the cost functions are read in |
150 |
|
|
cost\_weights.F. The weight of the cost functions are read in |
151 |
|
|
cost\_readparams.F from the input file data.cost. |
152 |
dfer |
1.1 |
|
153 |
dfer |
1.3 |
\end{itemize} |
154 |
|
|
|
155 |
dfer |
1.5 |
|
156 |
|
|
\subsection{Code Configuration} |
157 |
|
|
\label{www:tutorials} |
158 |
|
|
\label{SEC:eg_fourl_code_config} |
159 |
|
|
|
160 |
dfer |
1.10 |
The model configuration for this experiment resides under the directory |
161 |
|
|
{\it verification/tutorial\_global\_oce\_optim/}. The experiment files in |
162 |
|
|
code\_ad/ and input\_ad/ contain the code customizations and parameter |
163 |
|
|
settings. Most of them are identical to those used in the Global Ocean |
164 |
|
|
experiment. Below we describe the customizations to these files required for |
165 |
|
|
the experiment. |
166 |
|
|
|
167 |
|
|
\subsubsection{Compilation-time customizations in {\it code\_ad/}} |
168 |
|
|
|
169 |
|
|
ECCO_CPPOPTIONS.h the optimization. |
170 |
|
|
|
171 |
|
|
\begin{itemize} |
172 |
|
|
\item ALLOW\_ECCO\_OPTIMIZATION |
173 |
|
|
|
174 |
|
|
\item ALLOW\_COST ALLOW\_COST\_TEMP ALLOW\_COST\_HFLUXM |
175 |
|
|
|
176 |
|
|
\item ALLOW\_HFLUXM\_CONTROL |
177 |
|
|
\end{itemize} |
178 |
|
|
|
179 |
|
|
\subsubsection{Running-time customizations in {\it input\_ad/}} |
180 |
|
|
|
181 |
|
|
\begin{itemize} |
182 |
|
|
|
183 |
|
|
\item {\it data} is similar to |
184 |
|
|
|
185 |
|
|
\item {\it data.optim} specifies the iteration number. |
186 |
|
|
|
187 |
|
|
\item {\it data.ctrl} is used, in particular, to specify the |
188 |
|
|
name of the sensitivity and adjustment files associated to a control |
189 |
|
|
variable. |
190 |
|
|
|
191 |
|
|
\item {\it data.cost} |
192 |
|
|
|
193 |
|
|
\item {\it data.pkg} |
194 |
|
|
|
195 |
|
|
\item {\it Err\_hflux.bin} and {\it Err\_levitus\_15layer.bin} are the |
196 |
|
|
files containing the heat flux and potential temperature uncertainties, |
197 |
|
|
respectively. |
198 |
dfer |
1.5 |
|
199 |
dfer |
1.10 |
\item {\it Err\_levitus\_15layer.bin} |
200 |
dfer |
1.5 |
|
201 |
dfer |
1.10 |
\item Subdirectory {\it OPTIM} |
202 |
|
|
|
203 |
|
|
\end{itemize} |
204 |
dfer |
1.5 |
|
205 |
dfer |
1.3 |
\subsection{Compiling} |
206 |
dfer |
1.1 |
|
207 |
|
|
The optimization experiment requires two executables: 1) the |
208 |
dfer |
1.10 |
MITgcm and its adjoint ({\it mitgcmuv\_ad}) and 2) the line-search |
209 |
|
|
algorithm ({\it optim.x}). |
210 |
dfer |
1.1 |
|
211 |
edhill |
1.6 |
\subsubsection{Compilation of MITgcm and its adjoint: {\it mitcgmuv\_ad}} |
212 |
dfer |
1.1 |
|
213 |
|
|
Before compiling, first note that, in the directory code\_ad/, two files |
214 |
|
|
must be updated: |
215 |
dfer |
1.5 |
\begin{itemize} |
216 |
|
|
\item code\_ad\_diff.list which lists new subroutines to be compiled |
217 |
dfer |
1.10 |
by the TAF software (cost\_temp.F and cost\_hflux.F here), |
218 |
dfer |
1.1 |
|
219 |
dfer |
1.5 |
\item the adjoint\_hfluxm files which provides a list of the control variables |
220 |
dfer |
1.10 |
and the name of cost function to the TAF software. |
221 |
dfer |
1.5 |
|
222 |
|
|
\end{itemize} |
223 |
dfer |
1.1 |
|
224 |
|
|
Then, in the directory build\_ad/, type: |
225 |
dfer |
1.5 |
\begin{verbatim} |
226 |
|
|
% ../../../tools/genmake2 -mods=../code\_ad -adof=../code\_ad/adjoint\_hfluxm |
227 |
|
|
% make depend |
228 |
|
|
% make adall |
229 |
|
|
\end{verbatim} |
230 |
dfer |
1.10 |
to generate the MITgcm executable {\it mitgcmuv\_ad}. |
231 |
dfer |
1.1 |
|
232 |
edhill |
1.6 |
\subsubsection{Compilation of the line-search algorithm: {\it optim.x}} |
233 |
dfer |
1.1 |
|
234 |
dfer |
1.10 |
This is done from the directories lsopt/ and optim/ (under MITgcm/). In lsopt/, |
235 |
|
|
unzip the blash1 library adapted to your platform, and change the Makefile |
236 |
|
|
accordingly. Compile with: |
237 |
dfer |
1.5 |
\begin{verbatim} |
238 |
|
|
% make all |
239 |
|
|
\end{verbatim} |
240 |
|
|
(more details in lsopt\_doc.txt) |
241 |
dfer |
1.1 |
|
242 |
dfer |
1.10 |
In optim/, the path of the directory where {\it mitgcm\_ad} was compiled |
243 |
|
|
must be specified in the Makefile in the variable INCLUDEDIRS. The file name |
244 |
|
|
of the control variable (xx\_hfluxm\_file here) must be added to the name list |
245 |
|
|
read by optim\_num.F. Then use |
246 |
dfer |
1.5 |
\begin{verbatim} |
247 |
|
|
% make depend |
248 |
|
|
\end{verbatim} |
249 |
|
|
and |
250 |
|
|
\begin{verbatim} |
251 |
|
|
% make |
252 |
|
|
\end{verbatim} |
253 |
edhill |
1.6 |
to generate the line-search executable {\it optim.x}. |
254 |
dfer |
1.1 |
|
255 |
dfer |
1.3 |
\subsection{Running the estimation} |
256 |
dfer |
1.1 |
|
257 |
dfer |
1.10 |
Copy the {\it mitgcmuv\_ad} executable to input\_ad/ and {\it optim.x} to the |
258 |
|
|
subdirectory input\_ad/OPTIM/. Move into input\_ad/. The first iteration is |
259 |
|
|
somewhat particular and is best done "by hand" while the following iterations |
260 |
|
|
can be run automatically (see below). Check that the iteration number is set |
261 |
|
|
to 0 in data.optim and run the MITgcm: |
262 |
dfer |
1.3 |
\begin{verbatim} |
263 |
|
|
% ./mitgcmuv_ad |
264 |
|
|
\end{verbatim} |
265 |
dfer |
1.1 |
|
266 |
dfer |
1.5 |
The output files adxx\_hfluxm.0000000000.* and xx\_hfluxm.0000000000.* contain |
267 |
|
|
the sensitivity of the cost function to $Q_\mathrm{netm}$ and the perturbations |
268 |
dfer |
1.10 |
to $Q_\mathrm{netm}$ (zero at the first iteration), respectively. Two other |
269 |
|
|
files called ecco\_cost.. and ecco\_ctrl are also generated. They essentially |
270 |
|
|
contain the same information as the adxx\_* and xx\_* files, but in a |
271 |
|
|
compressed format. These two files are the only ones involved in the |
272 |
|
|
communication between the adjoint model {\it mitgcmuv\_ad} and the line-search |
273 |
|
|
algorithm {\it optim.x}. The ecco\_cost*n is an output of the adjoint model at |
274 |
|
|
iteration $n$ and an input of the line-search. The latter returns an updated |
275 |
|
|
perturbation in ecco\_ctrl*n+1 to be used as an input of the adjoint model at |
276 |
|
|
iteration n+1. |
277 |
|
|
|
278 |
|
|
At the first iteration, move these two files (ecco\_cost and ecco\_ctrl) to |
279 |
|
|
OPTIM/, open data.optim and check the iteration number is set to 0. The target |
280 |
|
|
cost function {\it fmin} needs to be specified too: a rule of thumb suggests |
281 |
|
|
it should be set to about 0.95-0.90 times the value of the cost function at |
282 |
|
|
the first iteration. This value is only used at the first iteration and does |
283 |
|
|
not need to be updated afterward. However, it implicitly specifies the |
284 |
|
|
``pace'' at which the cost function is going down (if you are lucky and it does |
285 |
|
|
indeed diminish!). More in the ECCO section maybe ? |
286 |
dfer |
1.4 |
|
287 |
|
|
Once this is done, run the line-search algorithm: |
288 |
|
|
\begin{verbatim} |
289 |
|
|
% ./optim.x |
290 |
|
|
\end{verbatim} |
291 |
dfer |
1.5 |
which computes the updated perturbations for iteration 1 ecco\_ctrl\_1. |
292 |
dfer |
1.4 |
|
293 |
edhill |
1.6 |
The following iterations can be executed automatically using the shell |
294 |
dfer |
1.10 |
script {\it cycsh} found in input\_ad/. This script will take care of changing |
295 |
|
|
the iteration numbers in the data.optim, launch the adjoint model, clean and |
296 |
|
|
store the outputs, move the ecco\_cost* and ecco\_ctrl* files, and run the |
297 |
|
|
line-search algorithm. Edit {\it cycsh} to specify the prefix of the |
298 |
|
|
directories used to store the outputs and the maximum number of iteration. |