
\section{Adjoint sensitivities of the MITsim}
\label{sec:adjoint}

\subsection{The adjoint of MITsim}

The ability to generate tangent linear and adjoint model components
of the MITsim has been a main design task.
For the ocean the adjoint capability has proven to be an
invaluable tool for sensitivity analysis as well as state estimation.
In short, the adjoint enables very efficient computation of the gradient
of scalar-valued model diagnostics (called cost function or objective function)
with respect to many model ``variables''.
These variables can be two- or three-dimensional fields of initial 
conditions, model parameters such as mixing coefficients, or
time-varying surface or lateral (open) boundary conditions.
When combined, these variables span a potentially high-dimensional
(e.g. $O(10^8)$) so-called control space. Performing parameter perturbations
to assess model sensitivities quickly becomes prohibitive at these scales.
Alternatively, (time-varying) sensitivities of the objective function 
to any element of the  control space can be computed very efficiently in 
one single adjoint 
model integration, provided an efficient adjoint model is available.
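
To make the cost argument concrete, the following rough count may help.
It assumes, purely for illustration, a control vector $\mathbf{u}$ with
$n$ elements and an objective function $J(\mathbf{u})$; these symbols
are introduced here and are not part of the model code.
One-sided finite differences approximate each gradient component separately,
\[
  \frac{\partial J}{\partial u_i} \approx
  \frac{J(\mathbf{u} + \epsilon\,\mathbf{e}_i) - J(\mathbf{u})}{\epsilon},
  \qquad i = 1, \ldots, n,
\]
where $\mathbf{e}_i$ is the $i$-th unit vector and $\epsilon$ a small
perturbation amplitude. This takes $n+1$ forward integrations, i.e. of
the order of $10^8$ model runs for $n = O(10^8)$.
The adjoint instead delivers all $n$ components of
$\nabla_{\mathbf{u}} J$ from one forward and one reverse integration,
at a cost that is typically a small multiple of a single forward run
and does not grow with $n$.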

[REFERENCES]


The adjoint operator (ADM) is the transpose of the tangent linear operator (TLM)
of the full (in general nonlinear) forward model, i.e. the MITsim.
The TLM maps perturbations of elements of the control space
(e.g. initial ice thickness distribution)
via the model Jacobian
to a perturbation in the objective function 
(e.g. sea-ice export at the end of the integration interval).
\textit{Tangent} linearity ensures that the derivatives are evaluated
with respect to the underlying model trajectory at each point in time.
This is crucial for nonlinear trajectories and the presence of different
regimes (e.g. effect of the sea-ice growth term at or away from the
freezing point of the ocean surface).
Ensuring tangent linearity can be easily achieved by integrating
the full model in sync with the TLM to provide the underlying model state.
Ensuring \textit{tangent} adjoints is equally crucial, but much more
difficult to achieve because of the reverse nature of the integration:
the adjoint accumulates sensitivities backward in time,
starting from a unit perturbation of the objective function.
The adjoint model requires the model state in reverse order.
This presents one of the major complications in deriving an
exact, i.e. \textit{tangent} adjoint model.
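
The relation between TLM and ADM, and the reason for the reverse-order
state requirement, can be sketched with the chain rule.
The notation is an assumption made here for illustration: the model
advances a state $\mathbf{x}_{k-1}$ to
$\mathbf{x}_k = \mathcal{M}_k(\mathbf{x}_{k-1})$ for $k = 1, \ldots, N$,
the control is the initial state $\mathbf{x}_0 = \mathbf{u}$, and the
objective function is $J = J(\mathbf{x}_N)$.
With the Jacobians
$\mathbf{M}_k = \partial \mathcal{M}_k / \partial \mathbf{x}$
evaluated at $\mathbf{x}_{k-1}$, the TLM propagates a perturbation
forward in time,
\[
  \delta J = \left( \nabla_{\mathbf{x}_N} J \right)^T
  \mathbf{M}_N \, \mathbf{M}_{N-1} \cdots \mathbf{M}_1 \, \delta \mathbf{u},
\]
whereas the ADM applies the transposed operators in reverse order,
\[
  \nabla_{\mathbf{u}} J =
  \mathbf{M}_1^T \, \mathbf{M}_2^T \cdots \mathbf{M}_N^T \,
  \nabla_{\mathbf{x}_N} J .
\]
Evaluating the second expression from right to left, the first operator
needed is $\mathbf{M}_N^T$, which depends on $\mathbf{x}_{N-1}$, and the
last is $\mathbf{M}_1^T$, which depends on $\mathbf{x}_0$.
The forward trajectory is therefore consumed backward in time.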

Following closely the development and maintenance of TLM and ADM
components of the MITgcm we have relied heavily on the
automatic differentiation (AD) tool
``Transformation of Algorithms in Fortran'' (TAF)
developed by Fastopt (Giering and Kaminski, 1998)
to derive TLM and ADM code of the MITsim.
Briefly, the nonlinear parent model is fed to the AD tool, which produces
derivative code for the specified control space and objective function.
Following this approach has (apart from its evident success)
several advantages:
(1) the adjoint model is the exact adjoint operator of the parent model,
(2) the adjoint model can be kept up to date with respect to ongoing
development of the parent model, and adjustments to the parent model
to extend the automatically generated adjoint are incremental changes
only, rather than extensive re-developments,
(3) the parallel structure of the parent model is preserved 
by the adjoint model, ensuring efficient use in high performance
computing environments.

Some initial code adjustments are required to support dependency analysis
of the flow reversal and to work around certain language constructs that
may lead to irreducible flow graphs (e.g. GOTO statements).
The problem of providing the required model state in reverse order
at the time of evaluating nonlinear or conditional
derivatives is solved by balancing
storage against recomputation of the model state in a multi-level
checkpointing loop.
Again, an initial code adjustment is required to support TAF's
checkpointing capability.
The code adjustments are sufficiently simple so as not to cause
major limitations to the full nonlinear parent model.
Once in place, an adjoint model of a new model configuration
may be derived in about 10 minutes.
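
As a rough illustration of the trade-off between storing and
recomputing, consider a hypothetical integration of $N$ time steps
split into three nested checkpointing levels with $N = n_1 n_2 n_3$.
The symbols and the numbers below are assumptions chosen for
illustration, not the configuration used here.
Storing the full state at every step would require $N$ stored states.
With three levels, the first sweep stores $n_3$ outer checkpoints, each
outer segment is rerun to store $n_2$ intermediate checkpoints, and each
intermediate segment is rerun to store the $n_1$ states needed by the
adjoint, so that
\[
  \mathrm{stored\ states} \sim n_1 + n_2 + n_3,
  \qquad
  \mathrm{forward\ steps} \sim 3N .
\]
For example, $N = 1000$ with $n_1 = n_2 = n_3 = 10$ requires about
30 stored states instead of 1000, at the price of roughly three forward
sweeps instead of one.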

[HIGHLIGHT COUPLED NATURE OF THE ADJOINT!]

\subsection{Special considerations}

\begin{itemize}
\item growth term(?)
\item small active denominators
\item dynamic solver (implicit function theorem)
\item approximate adjoints
\end{itemize}


\subsection{An example: sensitivities of sea-ice export through Fram Strait}

We demonstrate the power of the adjoint method
in the context of investigating sea-ice export sensitivities through Fram Strait
(for details of this study see Heimbach et al., 2007).
%\citep[for details of this study see][]{heimbach07}. %Heimbach et al., 2007).
The domain chosen is a coarsened version of the Arctic face of the
high-resolution cubed-sphere configuration of the ECCO2 project
\citep[see][]{menemenlis05}. It covers the entire Arctic,
extends into the North Pacific so as to cover all
ice-covered regions, and comprises parts of the North Atlantic
down to XXN to enable analysis of remote influences of the
North Atlantic Current on sea-ice variability and export.
The horizontal resolution varies between XX and YY km
with 50 unevenly spaced vertical levels.
The adjoint models run efficiently on 80 processors
(benchmarks have been performed both on an SGI Altix as well as an 
IBM SP5 at NASA/ARC).

Following a 1-year spinup, the model has been integrated for four
years between 1992 and 1995. It is forced using realistic 6-hourly
NCEP/NCAR atmospheric state variables. Over the open ocean these are
converted into air-sea fluxes via the bulk formulae of
\citet{large04}.  Derivation of air-sea fluxes in the presence of
sea-ice is handled by the ice model as described in \refsec{model}.
The objective function chosen is sea-ice export through Fram Strait
computed for December 1995.  The adjoint model computes sensitivities
of sea-ice export back in time from 1995 to 1992 along this
trajectory.  In principle all adjoint model variables (i.e., Lagrange
multipliers) of the coupled ocean/sea-ice model are available to
analyze the transient sensitivity behaviour of the ocean and sea-ice
state.  Over the open ocean, the adjoint of the bulk formula scheme
computes sensitivities to the time-varying atmospheric state.  Over
ice-covered parts, the sea-ice adjoint converts surface ocean
sensitivities to atmospheric sensitivities.

\reffig{4yradjheff}(a--d) depict sensitivities of sea-ice export
through Fram Strait in December 1995 to changes in sea-ice thickness
12, 24, 36, and 48 months back in time. Corresponding sensitivities to
ocean surface temperature are depicted in
\reffig{4yradjthetalev1}(a--d).  The main characteristic is
consistency with the expected advection of sea-ice over the time
scales considered.  The generally positive pattern means that an
increase in sea-ice thickness at location $(x,y)$ and time $t$ will
increase sea-ice export through Fram Strait at time $T_e$.  The largest
distances from Fram Strait indicate the fastest sea-ice advection over the
time span considered.  The ice thickness sensitivities correspond
closely to the ocean surface sensitivities, but are of opposite sign:
an increase in temperature will cause ice to melt, decrease ice
thickness, and therefore decrease sea-ice export at time $T_e$.
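
The sign relation can be read off a first-order expansion.
As a heuristic sketch, write $J$ for the sea-ice export at time $T_e$,
$h$ for ice thickness and $\theta$ for ocean surface temperature
(symbols introduced here for illustration).
A small thickness perturbation at time $t$ changes the export by
\[
  \delta J \approx \sum_{x,y}
  \frac{\partial J}{\partial h(x,y,t)} \, \delta h(x,y,t),
\]
so a positive sensitivity means that thicker ice at $(x,y,t)$ raises $J$.
Under the assumption that surface warming acts on $J$ mainly through
melting, the chain rule gives
\[
  \frac{\partial J}{\partial \theta} \approx
  \frac{\partial J}{\partial h} \, \frac{\partial h}{\partial \theta},
  \qquad
  \frac{\partial h}{\partial \theta} < 0,
\]
which has the opposite sign to $\partial J / \partial h$ wherever the
latter is positive.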

The picture is fundamentally different and much more complex
for sensitivities to ocean temperatures away from the surface.
\reffig{4yradjthetalev10??}(a--d) depict ice export sensitivities to
temperatures at roughly 400 m depth.
Primary features are the effect of the heat transport of the North
Atlantic Current, which feeds into the West Spitsbergen Current,
the circulation around Svalbard, and ...

\begin{figure}[t!]
\centerline{
\subfigure[{\footnotesize -12 months}]
{\includegraphics*[width=0.44\linewidth]{\fpath/run_4yr_ADJheff_arc_lev1_tim072_cmax2.0E+02.eps}}
%\includegraphics*[width=.3\textwidth]{H_c.bin_res_100_lev1.pdf}
%
\subfigure[{\footnotesize -24 months}]
{\includegraphics*[width=0.44\linewidth]{\fpath/run_4yr_ADJheff_arc_lev1_tim145_cmax2.0E+02.eps}}
}

\centerline{
\subfigure[{\footnotesize
-36 months}]
{\includegraphics*[width=0.44\linewidth]{\fpath/run_4yr_ADJheff_arc_lev1_tim218_cmax2.0E+02.eps}}
%
\subfigure[{\footnotesize
-48 months}]
{\includegraphics*[width=0.44\linewidth]{\fpath/run_4yr_ADJheff_arc_lev1_tim292_cmax2.0E+02.eps}}
}
\caption{Sensitivity of sea-ice export through Fram Strait in December 1995 to
sea-ice thickness at various prior times.
\label{fig:4yradjheff}}
\end{figure}


\begin{figure}[t!]
\centerline{
\subfigure[{\footnotesize -12 months}]
{\includegraphics*[width=0.44\linewidth]{\fpath/run_4yr_ADJtheta_arc_lev1_tim072_cmax5.0E+01.eps}}
%\includegraphics*[width=.3\textwidth]{H_c.bin_res_100_lev1}
%
\subfigure[{\footnotesize -24 months}]
{\includegraphics*[width=0.44\linewidth]{\fpath/run_4yr_ADJtheta_arc_lev1_tim145_cmax5.0E+01.eps}}
}

\centerline{
\subfigure[{\footnotesize
-36 months}] 
{\includegraphics*[width=0.44\linewidth]{\fpath/run_4yr_ADJtheta_arc_lev1_tim218_cmax5.0E+01.eps}}
%
\subfigure[{\footnotesize
-48 months}]
{\includegraphics*[width=0.44\linewidth]{\fpath/run_4yr_ADJtheta_arc_lev1_tim292_cmax5.0E+01.eps}}
}
\caption{Same as \reffig{4yradjheff} but for sea surface temperature.
\label{fig:4yradjthetalev1}}
\end{figure}

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "ceaice"
%%% End: 