\section{ECCO: model-data comparisons using gridded data sets}
\label{sec:pkg:ecco}
\begin{rawhtml}
<!-- CMIREDIR:package_ecco: -->
\end{rawhtml}

\def\mitgcmCheckpointVersion{65x}

The functionalities implemented in \texttt{pkg/ecco} are: (1) output time-averaged model fields to compare with gridded data sets; (2) compute normalized model-data distances or cost functions. The former is achieved as the model runs forward in time, whereas the latter occurs after time integration has completed. Following \cite{for-eta:15}, the resulting `cost function' is formulated generically as
\begin{align}
\mathcal{J}(\vec{u}) &= \sum_i \alpha_i \left(\vec{d}_i^T R_i^{-1} \vec{d}_i\right) + \sum_j \beta_j \vec{u}^T\vec{u}, \label{eq:Jtotal} \\
\vec{d}_i &= \mathcal{P}(\vec{m}_i - \vec{o}_i), \label{eq:Jposproc} \\
\vec{m}_i &= \mathcal{S}\mathcal{D}\mathcal{M}(\vec{v}), \label{eq:Jpreproc} \\
\vec{v} &= \mathcal{Q}(\vec{u}), \label{eq:Upreproc} \\
\vec{u} &= \mathcal{R}(\vec{u}') \label{eq:Uprecond}
\end{align}
using symbols defined in table~\ref{tbl:gencost_symbols}. Per Eq.~\eqref{eq:Jpreproc}, model counterparts ($\vec{m}_i$) to observational data ($\vec{o}_i$) derive from adjustable model parameters ($\vec{v}$) through the model dynamics integration ($\mathcal{M}$), diagnostic calculations ($\mathcal{D}$), and averaging in space and time ($\mathcal{S}$). Alternatively, $\mathcal{S}$ stands for subsampling in space and time (section~\ref{sec:pkg:profiles}). Plain model-data misfits ($\vec{m}_i-\vec{o}_i$) can be penalized directly in Eq.~\eqref{eq:Jtotal}, but penalized misfits ($\vec{d}_i$) more generally derive from $\vec{m}_i-\vec{o}_i$ through the generic post-processor $\mathcal{P}$ (Eq.~\eqref{eq:Jposproc}). Eqs.~\eqref{eq:Upreproc}--\eqref{eq:Uprecond} pertain to the model control parameter adjustment capabilities described in section~\ref{sec:pkg:ctrl}.

\begin{table}[!ht]
\centering
\begin{tabular}{rl}
symbol & definition \\ \hline
$\vec{u}$ & vector of nondimensional control variables \\
$\vec{v}$ & vector of dimensional control variables \\
$\alpha_i, \beta_j$ & misfit and control cost function multipliers (1 by default) \\
$R_i$ & data error covariance matrix ($R_i^{-1}$ are weights) \\
$\vec{d}_i$ & a set of model-data differences \\
$\vec{o}_i$ & observational data vector \\
$\vec{m}_i$ & model counterpart to $\vec{o}_i$ \\
$\mathcal{P}$ & post-processing operator (e.g., a smoother) \\
$\mathcal{M}$ & forward model dynamics operator \\
$\mathcal{D}$ & diagnostic computation operator \\
$\mathcal{S}$ & averaging/subsampling operator \\
$\mathcal{Q}$ & pre-processing operator \\
$\mathcal{R}$ & pre-conditioning operator
\end{tabular}
\caption{Symbol definitions for pkg/ecco and pkg/ctrl generic cost functions.}
\label{tbl:gencost_symbols}
\end{table}

\subsection{Generic Cost Function Terms} \label{costgen}

The parameters available for configuring generic cost terms in \texttt{data.ecco} are given in table~\ref{tbl:gencost_ecco_params}, and examples of possible specifications are available in:
\begin{itemize}
\itemsep0em
\item MITgcm\_contrib/verification\_other/global\_oce\_cs32/input/data.ecco
\item MITgcm\_contrib/verification\_other/global\_oce\_cs32/input\_ad.sens/data.ecco
\item MITgcm\_contrib/gael/verification/global\_oce\_llc90/input.ecco\_v4/data.ecco
\end{itemize}

\noindent
The gridded observation file name is specified by \texttt{gencost\_datafile}. Observational time series may be provided as one big file or split into yearly files ending in \texttt{`\_1992'}, \texttt{`\_1993'}, etc. The corresponding physical variable in $\vec{m}_i$ is specified via the first characters in \texttt{gencost\_barfile}. The list of implemented variables (as of checkpoint \mitgcmCheckpointVersion) is reported in table~\ref{tbl:gencost_ecco_barfile}. A file named according to \texttt{gencost\_barfile} will be created, and averaged fields will be written to it progressively as the model integration proceeds forward in time. At the end of the integration this file is re-read by \texttt{cost\_generic} to compute the cost function and (if \texttt{gencost\_outputlevel} is greater than 0) to output model-data misfit fields (i.e., $\vec{d}_i$) for offline analysis and visualization.

In the current implementation, model-data error covariance matrices $R_i$ omit non-diagonal terms. Specifying $R_i$ thus boils down to providing uncertainty fields ($\sigma_i$ such that $R_i=\sigma_i^2$) in a file whose name is specified via \texttt{gencost\_errfile}. By default $\sigma_i$ is assumed time-invariant, but a $\sigma_i$ time series (of the same length as the $\vec{o}_i$ time series) can be provided using the \texttt{variaweight} option of \texttt{gencost\_preproc}. By default a cost function is quadratic, but $\vec{d}_i^T R_i^{-1} \vec{d}_i$ can be replaced with $R_i^{-1/2} \vec{d}_i$ using the \texttt{nosumsq} option.
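
As a worked illustration of the diagonal case (a sketch only; the element index $k$ is introduced here for illustration and is not notation used by the package), each misfit term of Eq.~\eqref{eq:Jtotal} then reduces to a sum of squared, uncertainty-normalized misfits:
\begin{equation*}
\vec{d}_i^T R_i^{-1} \vec{d}_i = \sum_k \left( \frac{d_{i,k}}{\sigma_{i,k}} \right)^2 ,
\end{equation*}
where $d_{i,k}$ and $\sigma_{i,k}$ denote the $k$-th elements of $\vec{d}_i$ and of the corresponding uncertainty field. A misfit equal to one standard error at every point therefore contributes one unit per data point to the cost function.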

In principle, any periodicity should be possible, but only \texttt{`day'}, \texttt{`month'}, \texttt{`step'}, and \texttt{`const'} are implemented for \texttt{gencost\_avgperiod}. If two different averages of the same variable are needed for separate cost function terms (e.g., daily and monthly), then an extension starting with \texttt{`\_'} should be added to \texttt{gencost\_barfile} (such as \texttt{`\_day'} and \texttt{`\_mon'}).\footnote{\texttt{ecco\_check} may be missing a test for conflicting names.} If two cost function terms use the same variable and periodicity, however, then using a common \texttt{gencost\_barfile} saves disk space.

Climatologies of $\vec{m}_i$ can be formed from the time series of model averages in order to compare with climatologies of $\vec{o}_i$. This is done by activating the \texttt{`clim'} option via \texttt{gencost\_preproc} and setting the corresponding \texttt{gencost\_preproc\_i} integer parameter to the number of records (i.e., months, days, or time steps) per climatological cycle. The generic post-processor ($\mathcal{P}$ in Eq.~\eqref{eq:Jposproc}) allows model-data misfits to be, for example, smoothed in space by setting \texttt{gencost\_posproc} to \texttt{`smooth'}. Other options associated with the computation of Eq.~\eqref{eq:Jtotal} are summarized in table~\ref{tbl:gencost_ecco_preproc} and further discussed below. Multiple \texttt{gencost\_preproc} / \texttt{gencost\_posproc} options may be specified per cost term.
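
The settings discussed so far could be combined as in the following \texttt{data.ecco} fragment. It is a minimal sketch rather than a tested configuration: the file names (\texttt{sst\_obs}, \texttt{sst\_sigma}, \texttt{smoothing\_scales}), the cost term names, and the number of smoother time steps are hypothetical placeholders, and the index order of the \texttt{gencost\_preproc} and \texttt{gencost\_posproc} arrays is assumed from the \texttt{NGENPPROC}$\times$\texttt{NGENCOST} sizing noted in table~\ref{tbl:gencost_ecco_params}.
\begin{verbatim}
 &ecco_gencost_nml
# term 1: monthly SST compared as a 12-record climatology
 gencost_name(1)        = 'sst_example_clim',
 gencost_datafile(1)    = 'sst_obs',
 gencost_errfile(1)     = 'sst_sigma',
 gencost_barfile(1)     = 'm_sst_mon',
 gencost_avgperiod(1)   = 'month',
 gencost_preproc(1,1)   = 'clim',
 gencost_preproc_i(1,1) = 12,
 gencost_outputlevel(1) = 1,
 mult_gencost(1)        = 1.,
# term 2: daily SST with spatially smoothed misfits
 gencost_name(2)        = 'sst_example_day',
 gencost_datafile(2)    = 'sst_obs',
 gencost_errfile(2)     = 'sst_sigma',
 gencost_barfile(2)     = 'm_sst_day',
 gencost_avgperiod(2)   = 'day',
 gencost_posproc(1,2)   = 'smooth',
 gencost_posproc_c(1,2) = 'smoothing_scales',
 gencost_posproc_i(1,2) = 300,
 /
\end{verbatim}
The two terms use distinct \texttt{gencost\_barfile} extensions (\texttt{\_mon} and \texttt{\_day}) because they average the same variable over different periods.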

In general the specification of \texttt{gencost\_name} is optional, has no impact on the end result, and only serves to identify cost function terms in the model text outputs (STDOUT.0000, STDERR.0000, costfunction000). The exceptions are the names listed in table~\ref{tbl:gencost_ecco_name}, which activate alternative cost function codes (in place of \texttt{cost\_generic.F}) described in the following subsections. The specification of \texttt{gencost\_mask} allows the user to specify whether the gridded data input and model counterparts are located at tracer points (`c'; the default) or velocity points (`w' or `s'). Note, however, that the `c' option (not `w' or `s') should be used when gridded velocity data are provided as zonal/meridional components at tracer points.

\begin{table}[!ht]
\centering
\begin{tabular}{lll}
parameter & type & function \\ \hline
%\texttt{using\_gencost} & logical & Turns specified generic cost term on. \\
\texttt{gencost\_name} & character(*) & Name of cost term \\
\texttt{gencost\_barfile} & character(*) & File to receive model counterpart $\vec{m}_i$ (see table~\ref{tbl:gencost_ecco_barfile}) \\
\texttt{gencost\_datafile} & character(*) & File containing observational data $\vec{o}_i$ \\
\texttt{gencost\_avgperiod} & character(5) & Averaging period for $\vec{o}_i$ and $\vec{m}_i$ (see text) \\
\texttt{gencost\_outputlevel} & integer & Greater than 0 will output misfit fields \\
\texttt{gencost\_errfile} & character(*) & File containing diagonal of error matrix $R_i$ \\
\texttt{mult\_gencost} & real & Multiplier $\alpha_i$ (default: 1) \\
\hline
\texttt{gencost\_preproc} & character(*) & Preprocessor names \\
\texttt{gencost\_preproc\_c} & character(*) & Preprocessor character arguments \\
\texttt{gencost\_preproc\_i} & integer(*) & Preprocessor integer arguments \\
\texttt{gencost\_preproc\_r} & real(*) & Preprocessor real arguments \\
\texttt{gencost\_posproc} & character(*) & Post-processor names \\
\texttt{gencost\_posproc\_c} & character(*) & Post-processor character arguments \\
\texttt{gencost\_posproc\_i} & integer(*) & Post-processor integer arguments \\
\texttt{gencost\_posproc\_r} & real(*) & Post-processor real arguments \\
\hline
\texttt{gencost\_mask} & character(1) & Location of $\vec{m}_i$ \\
\texttt{gencost\_spmin} & real & Data less than this value will be omitted \\
\texttt{gencost\_spmax} & real & Data greater than this value will be omitted \\
\texttt{gencost\_spzero} & real & Data points equal to this value will be omitted \\
\texttt{gencost\_startdate1} & integer & Start date of observations (YYYYMMDD) \\
\texttt{gencost\_startdate2} & integer & Start date of observations (HHMMSS) \\
\texttt{gencost\_is3d} & logical & Needs to be true for 3D fields \\
\hline
\texttt{gencost\_smooth2Ddiffnbt} & integer & Smoother \# of time steps (sec.~\ref{v4custom} only) \\
\texttt{gencost\_scalefile} & character(*) & Smoothing scale file (sec.~\ref{v4custom} only) \\
\texttt{gencost\_nrecperiod} & integer & Deprecated \\
\texttt{gencost\_timevaryweight} & logical & Deprecated \\
\texttt{gencost\_enddate1} & integer & Not fully implemented \\
\texttt{gencost\_enddate2} & integer & Not fully implemented \\
\end{tabular}
\caption{Parameters in \texttt{ecco\_gencost\_nml} namelist in \texttt{data.ecco}. All parameters are vectors of length \texttt{NGENCOST} (the \# of available cost terms) except for \texttt{gencost\_*proc*}, which are arrays of size \texttt{NGENPPROC}$\times$\texttt{NGENCOST}. Notes: \texttt{gencost\_is3d} will automatically be reset to true in cases listed as 3D in table~\ref{tbl:gencost_ecco_barfile}; the last group of parameters should be disregarded except for the section~\ref{v4custom} special cases.}
\label{tbl:gencost_ecco_params}
\end{table}

\begin{table}[!ht]
\centering
\begin{tabular}{lll}
variable name & description & remarks \\ \hline\hline
\texttt{m\_eta} & sea surface height & free surface + corrections \\
\texttt{m\_sst} & sea surface temperature & first level potential temperature \\
\texttt{m\_sss} & sea surface salinity & first level salinity \\
\texttt{m\_bp} & bottom pressure & \\ \hline
\texttt{m\_ustress} & zonal wind stress & \\
\texttt{m\_vstress} & meridional wind stress & \\
\texttt{m\_uwind} & zonal wind & \\
\texttt{m\_vwind} & meridional wind & \\
\texttt{m\_atemp} & atmospheric temperature & \\
\texttt{m\_aqh} & atmospheric specific humidity & \\
\texttt{m\_precip} & precipitation & \\
\texttt{m\_swdown} & downward shortwave & \\
\texttt{m\_lwdown} & downward longwave & \\
\texttt{m\_wspeed} & wind speed & \\ \hline
\texttt{m\_siarea} & sea-ice area & \\
\texttt{m\_siheff} & sea-ice effective thickness & \\
\texttt{m\_sihsnow} & snow effective thickness & \\ \hline
\texttt{m\_theta} & potential temperature & three-dimensional \\
\texttt{m\_salt} & salinity & three-dimensional \\
\texttt{m\_UE} & zonal velocity & three-dimensional \\
\texttt{m\_VN} & meridional velocity & three-dimensional \\ \hline
\texttt{m\_diffkr} & vertical/diapycnal diffusivity & three-dimensional, constant \\
\texttt{m\_kapgm} & GM diffusivity & three-dimensional, constant \\
\texttt{m\_kapredi} & isopycnal diffusivity & three-dimensional, constant \\
\texttt{m\_geothermalflux} & geothermal heat flux & constant \\
\texttt{m\_bottomdrag} & bottom drag & constant \\
\end{tabular}
\caption{Implemented \texttt{gencost\_barfile} options (as of checkpoint \mitgcmCheckpointVersion) that can be used via \texttt{cost\_generic.F} as explained in section~\ref{costgen}. An extension starting with `\_' can be appended at the end of the variable name to distinguish between separate cost function terms.}
\label{tbl:gencost_ecco_barfile}
\end{table}

\begin{table}[!ht]
\centering
\begin{tabular}{lll}
name & description & specs needed via \texttt{gencost\_preproc\_i}, \texttt{\_r}, or \texttt{\_c} \\ \hline\hline
\texttt{gencost\_preproc} \\ \hline
\texttt{clim} & Use climatological misfits & integer: no.\ of records per climatological cycle \\
\texttt{mean} & Use time mean of misfits & --- \\
\texttt{anom} & Use anomalies from time mean & --- \\
\texttt{variaweight} & Use time-varying weight ($R_i^{-1}$) & --- \\
\texttt{nosumsq} & Use linear misfits & --- \\
\texttt{factor} & Multiply $\vec{m}_i$ by a scaling factor & real: the scaling factor \\ \hline \hline
\texttt{gencost\_posproc} \\ \hline
\texttt{smooth} & Smooth misfits & character: smoothing scale file \\
 & & integer: smoother \# of time steps \\
\end{tabular}
\caption{\texttt{gencost\_preproc} and \texttt{gencost\_posproc} options implemented as of checkpoint \mitgcmCheckpointVersion.}
\label{tbl:gencost_ecco_preproc}
\end{table}

\clearpage

\subsection{Custom Cost Function Terms} \label{v4custom}

This section pertains to the special cases of \texttt{cost\_gencost\_bpv4.F}, \texttt{cost\_gencost\_seaicev4.F}, \texttt{cost\_gencost\_sshv4.F}, and \texttt{cost\_gencost\_sstv4.F}.

If \texttt{gencost\_name} is any of the \texttt{sshv4} or \texttt{sstv4} terms, additional smoothing arguments can be specified: \texttt{gencost\_scalefile} points to a file containing the smoothing scales and \texttt{gencost\_smooth2Ddiffnbt} gives the number of smoothing steps. These parameters are otherwise not used.

\subsection{Boxmean Generic Cost Function} \label{genboxmean}

This section pertains to the special case of \texttt{cost\_gencost\_boxmean.F} and is very much a work in progress. The box mean generic cost function penalizes the mean of a model field over a box (a non-quadratic cost function). To use this cost function, set \texttt{gencost\_name = 'boxmean*'}, where \texttt{*} is an optional suffix starting with \texttt{`\_'}. The model field of interest is specified by \texttt{gencost\_barfile}. Currently valid options are \texttt{m\_boxmean\_theta}, \texttt{m\_boxmean\_salt}, and \texttt{m\_boxmean\_eta}. The ``box'' is a mask of ones and zeros read from files whose prefix is given by the string \texttt{gencost\_errfile}. Different files contain the horizontal, vertical, and temporal masks; these files are distinguished by the suffixes \texttt{`C'}, \texttt{`K'}, and \texttt{`T'}, respectively. For example, if \texttt{gencost\_errfile = 'foo\_mask'}, then the horizontal, vertical, and temporal mask files are named \texttt{foo\_maskC}, \texttt{foo\_maskK}, and \texttt{foo\_maskT}. Note that the horizontal mask can have an arbitrary shape, so the ``box'' is not necessarily rectangular.
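
A \texttt{data.ecco} fragment for one box mean term might look as follows. This is a sketch based only on the description above, not a tested configuration: the term index and the \texttt{\_example} suffix are illustrative, \texttt{foo\_mask} is the hypothetical prefix used in the text, and further parameters (e.g., \texttt{gencost\_avgperiod}) may also be required.
\begin{verbatim}
 &ecco_gencost_nml
# box mean of potential temperature (illustrative term index)
 gencost_name(1)    = 'boxmean_example',
 gencost_barfile(1) = 'm_boxmean_theta',
# mask prefix: foo_maskC, foo_maskK and foo_maskT are read
 gencost_errfile(1) = 'foo_mask',
 /
\end{verbatim}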

\subsection{Transport Generic Cost Function} \label{gentrsp}

This section pertains to the special case of \texttt{cost\_gencost\_transp.F} and is very much a work in progress. The transport generic cost function penalizes the transport of volume, heat, or salt through a specified section (a non-quadratic cost function). To use this cost function, set \texttt{gencost\_name = 'transp*'}, where \texttt{*} is an optional suffix starting with \texttt{`\_'}, and set \texttt{gencost\_barfile} to one of \texttt{m\_trVol}, \texttt{m\_trHeat}, and \texttt{m\_trSalt}. The section is specified by a mask of ones and zeros denoting the ``west'' and ``south'' faces through which to compute transport. The prefix for the mask files is given by the string \texttt{gencost\_errfile}, with the suffixes \texttt{`W'} and \texttt{`S'} denoting the ``west'' and ``south'' faces, respectively. There does not appear to be a suffix denoting a temporal mask, but temporal masking could be achieved using the \texttt{variaweight} option of \texttt{gencost\_preproc}.
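
A corresponding \texttt{data.ecco} fragment for one transport term might look as follows. As above, this is an untested sketch: the term index, the \texttt{\_example} suffix, and the \texttt{sec\_mask} prefix are hypothetical, and the mask file names in the comment are assumed by analogy with the box mean naming in section~\ref{genboxmean}.
\begin{verbatim}
 &ecco_gencost_nml
# volume transport through a section (illustrative term index)
 gencost_name(1)    = 'transp_example',
 gencost_barfile(1) = 'm_trVol',
# mask prefix: sec_maskW and sec_maskS would be read (assumed)
 gencost_errfile(1) = 'sec_mask',
 /
\end{verbatim}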

\begin{table}[!ht]
\centering
\begin{tabular}{lll}
name & description & remarks \\ \hline\hline
\texttt{sshv4-mdt} & sea surface height & mean dynamic topography (SLA + geod) \\
\texttt{sshv4-tp} & sea surface height & TOPEX SLA \\
\texttt{sshv4-ers} & sea surface height & ERS SLA \\
\texttt{sshv4-gfo} & sea surface height & \\
\texttt{sshv4-lsc} & sea surface height & \\
\texttt{sshv4-gmsl} & sea surface height & \\
\texttt{bpv4-grace} & bottom pressure & \\
\texttt{sstv4-amsre} & sea surface temperature & \\
\texttt{sstv4-amsre-lsc} & sea surface temperature & \\ \hline
\texttt{boxmean} & mean over a box & specify box \\
\texttt{transp} & transport across section & specify section \\ \hline
\texttt{si4-cons} & sea ice concentration & \\
\texttt{si4-deconc} & model sea ice deficiency & \\
\texttt{si4-exconc} & model sea ice excess & \\
\end{tabular}
\caption{Pre-defined \texttt{gencost\_name} options associated with the sections~\ref{v4custom}, \ref{genboxmean}, \ref{gentrsp} special cases.}
\label{tbl:gencost_ecco_name}
\end{table}

\begin{table}[!ht]
\centering
\begin{tabular}{lll}
variable name & description & remarks \\ \hline\hline
\texttt{m\_trVol} & volume transport & three-dimensional, specify section \\
\texttt{m\_trHeat} & heat transport & three-dimensional, specify section \\
\texttt{m\_trSalt} & salt transport & three-dimensional, specify section \\
\texttt{m\_boxmean\_theta} & mean of theta over box & specify box \\
\texttt{m\_boxmean\_salt} & mean of salt over box & specify box \\
\texttt{m\_boxmean\_eta} & mean of SSH over box & specify box
\end{tabular}
\caption{Implemented \texttt{gencost\_barfile} options (as of checkpoint \mitgcmCheckpointVersion) that can be used via \texttt{cost\_gencost\_boxmean.F} or \texttt{cost\_gencost\_transp.F} (see sections~\ref{genboxmean} and~\ref{gentrsp}).}
\label{tbl:gencost_ecco_barfile_custom}
\end{table}

\subsection{Key Routines}

TBA... \texttt{cost\_generic.F}, \texttt{ecco\_check.F}, \texttt{ecco\_phys.F}, \texttt{ecco\_summary.F}, \texttt{ecco\_readparms.F}, ...

\subsection{Compile Options}

TBA... \texttt{ALLOW\_GENCOST3D}, \texttt{ALLOW\_PSBAR\_STERIC}, \texttt{ECCO\_CTRL\_DEPRECATED}, ...