% MITgcm manual: manual/s_ecco/text/ecco.tex
% Revision 1.4, Wed Sep 21 12:50:01 2016 UTC, by gforget (branch MAIN)
% Log: document the reconversion of gencost_mask + a few improvements

\section{ECCO: model-data comparisons using gridded data sets}
\label{sec:pkg:ecco}
\begin{rawhtml}
<!-- CMIREDIR:package_ecco: -->
\end{rawhtml}

\def\mitgcmCheckpointVersion{65z}

The functionalities implemented in \texttt{pkg/ecco} are: (1) output time-averaged model fields to compare with gridded data sets; (2) compute normalized model-data distances (i.e., cost functions); (3) compute averages and transports (i.e., integrals). The first is achieved as the model runs forward in time, whereas the other two occur after time integration has completed. Following \cite{for-eta:15}, the total cost function is formulated generically as
\begin{align}
\mathcal{J}(\vec{u}) &= \sum_i \alpha_i \left(\vec{d}_i^T R_i^{-1} \vec{d}_i\right) + \sum_j \beta_j \vec{u}^T\vec{u}, \label{eq:Jtotal} \\
\vec{d}_i &= \mathcal{P}(\vec{m}_i - \vec{o}_i), \label{eq:Jposproc} \\
\vec{m}_i &= \mathcal{S}\mathcal{D}\mathcal{M}(\vec{v}), \label{eq:Jpreproc} \\
\vec{v} &= \mathcal{Q}(\vec{u}), \label{eq:Upreproc} \\
\vec{u} &= \mathcal{R}(\vec{u}') \label{eq:Uprecond}
\end{align}
using symbols defined in table~\ref{tbl:gencost_symbols}. Per Eq.~\eqref{eq:Jpreproc}, model counterparts ($\vec{m}_i$) to observational data ($\vec{o}_i$) derive from adjustable model parameters ($\vec{v}$) through model dynamics integration ($\mathcal{M}$), diagnostic calculations ($\mathcal{D}$), and averaging in space and time ($\mathcal{S}$). Alternatively, $\mathcal{S}$ stands for subsampling in space and time (section~\ref{sec:pkg:profiles}). Plain model-data misfits ($\vec{m}_i-\vec{o}_i$) can be penalized directly in Eq.~\eqref{eq:Jtotal}, but penalized misfits ($\vec{d}_i$) more generally derive from $\vec{m}_i-\vec{o}_i$ through the generic post-processor $\mathcal{P}$ (Eq.~\eqref{eq:Jposproc}). Eqs.~\eqref{eq:Upreproc}--\eqref{eq:Uprecond} pertain to model control parameter adjustment capabilities described in section~\ref{sec:pkg:ctrl}.

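
As a reading aid, Eqs.~\eqref{eq:Jposproc}--\eqref{eq:Uprecond} can be chained into a single expression. This is only a sketch obtained by direct substitution, assuming that the operator products are read as compositions applied from right to left:
\begin{equation*}
\vec{d}_i = \mathcal{P}\left(\mathcal{S}\mathcal{D}\mathcal{M}\left(\mathcal{Q}\left(\mathcal{R}(\vec{u}')\right)\right) - \vec{o}_i\right).
\end{equation*}
Each cost function term in Eq.~\eqref{eq:Jtotal} is thus a function of the preconditioned controls $\vec{u}'$ alone once the data $\vec{o}_i$ and weights $R_i^{-1}$ are fixed.
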
\begin{table}[!ht]
\centering
\begin{tabular}{rl}
symbol & definition \\ \hline
$\vec{u}$ & vector of nondimensional control variables \\
$\vec{v}$ & vector of dimensional control variables \\
$\alpha_i, \beta_j$ & misfit and control cost function multipliers (1 by default) \\
$R_i$ & data error covariance matrix ($R_i^{-1}$ are weights) \\
$\vec{d}_i$ & a set of model-data differences \\
$\vec{o}_i$ & observational data vector \\
$\vec{m}_i$ & model counterpart to $\vec{o}_i$ \\
$\mathcal{P}$ & post-processing operator (e.g., a smoother) \\
$\mathcal{M}$ & forward model dynamics operator \\
$\mathcal{D}$ & diagnostic computation operator \\
$\mathcal{S}$ & averaging/subsampling operator \\
$\mathcal{Q}$ & pre-processing operator \\
$\mathcal{R}$ & pre-conditioning operator
\end{tabular}
\caption{Symbol definitions for pkg/ecco and pkg/ctrl generic cost functions.}
\label{tbl:gencost_symbols}
\end{table}

\subsection{Generic Cost Function} \label{costgen}

The parameters available for configuring generic cost function terms in \texttt{data.ecco} are given in table~\ref{tbl:gencost_ecco_params}, and example specifications are available in:
\begin{itemize}
\itemsep0em
\item MITgcm\_contrib/verification\_other/global\_oce\_cs32/input/data.ecco
\item MITgcm\_contrib/verification\_other/global\_oce\_cs32/input\_ad.sens/data.ecco
\item MITgcm\_contrib/gael/verification/global\_oce\_llc90/input.ecco\_v4/data.ecco
\end{itemize}

\noindent
The gridded observation file name is specified by \texttt{gencost\_datafile}. Observational time series may be provided as one big file or split into yearly files ending in `\_1992', `\_1993', etc. The corresponding $\vec{m}_i$ physical variable is specified via the \texttt{gencost\_barfile} root (see table~\ref{tbl:gencost_ecco_barfile}). A file named as specified by \texttt{gencost\_barfile} is created, and averaged fields are written to it progressively as the model steps forward in time. After the final time step this file is re-read by \texttt{cost\_generic.F} to compute the corresponding cost function term. If \texttt{gencost\_outputlevel} = 1 and \texttt{gencost\_name}=`foo' then \texttt{cost\_generic.F} outputs model-data misfit fields (i.e., $\vec{d}_i$) to a file named `misfit\_foo.data' for offline analysis and visualization.

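
The settings described above could be combined as in the following sketch of a single cost function term in \texttt{data.ecco}. It is an illustration only: the file names \texttt{sst\_obs\_monthly} and \texttt{sst\_sigma.bin} and the term name \texttt{sst\_example} are hypothetical, and the other namelists that \texttt{data.ecco} may require are omitted.
\begin{verbatim}
 &ECCO_GENCOST_NML
 gencost_name(1)        = 'sst_example',
 gencost_barfile(1)     = 'm_sst_mon',
 gencost_datafile(1)    = 'sst_obs_monthly',
 gencost_errfile(1)     = 'sst_sigma.bin',
 gencost_avgperiod(1)   = 'month',
 gencost_outputlevel(1) = 1,
 mult_gencost(1)        = 1.,
 &
\end{verbatim}
With these hypothetical settings, and following the naming rule stated above, misfit fields would be written to `misfit\_sst\_example.data'. If the observations were split into yearly files, they would be named `sst\_obs\_monthly\_1992', `sst\_obs\_monthly\_1993', etc.
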
In the current implementation, model-data error covariance matrices $R_i$ omit non-diagonal terms. Specifying $R_i$ thus boils down to providing uncertainty fields ($\sigma_i$, such that the diagonal of $R_i$ is $\sigma_i^2$) in a file specified via \texttt{gencost\_errfile}. By default $\sigma_i$ is assumed to be time-invariant, but a $\sigma_i$ time series of the same length as the $\vec{o}_i$ time series can be provided using the \texttt{variaweight} option (table~\ref{tbl:gencost_ecco_preproc}). By default cost functions are quadratic, but $\vec{d}_i^T R_i^{-1} \vec{d}_i$ can be replaced with $R_i^{-1/2} \vec{d}_i$ using the \texttt{nosumsq} option (table~\ref{tbl:gencost_ecco_preproc}).

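
As a worked illustration of the diagonal case, let $d_{i,k}$ and $\sigma_{i,k}$ denote the $k$-th entries of $\vec{d}_i$ and of the uncertainty field (an index notation introduced here only for this sketch). The quadratic term then reduces to a sum of squared normalized misfits:
\begin{equation*}
\vec{d}_i^T R_i^{-1} \vec{d}_i = \sum_k \frac{d_{i,k}^2}{\sigma_{i,k}^2}.
\end{equation*}
For example, two misfits $d_{i,1}=1$ and $d_{i,2}=-2$ with uncertainties $\sigma_{i,1}=0.5$ and $\sigma_{i,2}=1$ contribute $(1/0.5)^2+(-2/1)^2 = 4+4 = 8$. Under the \texttt{nosumsq} option, and assuming that the entries of $R_i^{-1/2}\vec{d}_i$ are simply summed, the same two points would instead contribute $1/0.5+(-2)/1 = 0$, which shows that linear misfits of opposite sign can cancel.
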
In principle, any averaging frequency should be possible, but only {`day'}, {`month'}, {`step'}, and {`const'} are implemented for \texttt{gencost\_avgperiod}. If two different averaging frequencies are needed for a variable used in multiple cost function terms (e.g., daily and monthly) then an extension starting with `\_' should be added to \texttt{gencost\_barfile} (such as `\_day' and `\_mon').\footnote{\texttt{ecco\_check} may be missing a test for conflicting names.} If two cost function terms use the same variable and frequency, however, then using a common \texttt{gencost\_barfile} saves disk space.

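
The naming convention above might be applied as in the following namelist fragment (a sketch; the choice of \texttt{m\_sst} and of three terms is arbitrary). Terms 1 and 2 use different averaging frequencies and therefore distinct \texttt{gencost\_barfile} names, whereas terms 2 and 3 share one monthly file:
\begin{verbatim}
 gencost_barfile(1)   = 'm_sst_day',
 gencost_avgperiod(1) = 'day',
 gencost_barfile(2)   = 'm_sst_mon',
 gencost_avgperiod(2) = 'month',
 gencost_barfile(3)   = 'm_sst_mon',
 gencost_avgperiod(3) = 'month',
\end{verbatim}
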
Climatologies of $\vec{m}_i$ can be formed from the time series of model averages in order to compare with climatologies of $\vec{o}_i$ by activating the `clim' option via \texttt{gencost\_preproc} and setting the corresponding \texttt{gencost\_preproc\_i} integer parameter to the number of records (i.e., the number of months, days, or time steps) per climatological cycle. The generic post-processor ($\mathcal{P}$ in Eq.~\eqref{eq:Jposproc}) also allows model-data misfits to be, for example, smoothed in space by setting \texttt{gencost\_posproc} to {`smooth'} and specifying the smoother parameters via \texttt{gencost\_posproc\_c} and \texttt{gencost\_posproc\_i} (see table~\ref{tbl:gencost_ecco_preproc}). Other options associated with the computation of Eq.~\eqref{eq:Jtotal} are summarized in table~\ref{tbl:gencost_ecco_preproc} and further discussed below. Multiple \texttt{gencost\_preproc} / \texttt{gencost\_posproc} options may be specified per cost term.

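
A monthly climatology misfit that is also smoothed in space might be requested as sketched below. The smoothing scale file name \texttt{sst\_smooth\_scale.bin} and the value of 300 smoother time steps are hypothetical placeholders. The fragment uses the first option slot of the first cost term, so that both indices equal 1 and the index order of the \texttt{gencost\_*proc*} arrays does not matter here.
\begin{verbatim}
 gencost_barfile(1)     = 'm_sst_mon',
 gencost_avgperiod(1)   = 'month',
 gencost_preproc(1,1)   = 'clim',
 gencost_preproc_i(1,1) = 12,
 gencost_posproc(1,1)   = 'smooth',
 gencost_posproc_c(1,1) = 'sst_smooth_scale.bin',
 gencost_posproc_i(1,1) = 300,
\end{verbatim}
Here 12 is the number of monthly records per climatological cycle, i.e., one annual cycle of monthly averages.
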
In general the specification of \texttt{gencost\_name} is optional, has no impact on the end result, and only serves to distinguish between cost function terms in the model output (STDOUT.0000, STDERR.0000, costfunction000, misfit*.data). The exceptions listed in table~\ref{tbl:gencost_ecco_name}, however, activate alternative cost function codes (in place of \texttt{cost\_generic.F}) described in section~\ref{v4custom}. In this section and in table~\ref{tbl:gencost_ecco_barfile} (unlike in other parts of the manual) `zonal' / `meridional' are to be taken literally and these components are centered (i.e., not at the staggered model velocity points). Preparing gridded velocity data sets for use in cost functions thus boils down to interpolating them to XC / YC.

\begin{table}[!ht]
\centering
\begin{tabular}{lll}
parameter & type & function \\ \hline
%\texttt{using\_gencost} & logical & Turns specified generic cost term on. \\
\texttt{gencost\_name} & character(*) & Name of cost term \\
\texttt{gencost\_barfile} & character(*) & File to receive model counterpart $\vec{m}_i$ (see table~\ref{tbl:gencost_ecco_barfile}) \\
\texttt{gencost\_datafile} & character(*) & File containing observational data $\vec{o}_i$ \\
\texttt{gencost\_avgperiod} & character(5) & Averaging period for $\vec{o}_i$ and $\vec{m}_i$ (see text) \\
\texttt{gencost\_outputlevel} & integer & Greater than 0 will output misfit fields \\
\texttt{gencost\_errfile} & character(*) & Uncertainty field name (not used in section~\ref{intgen}) \\
\texttt{gencost\_mask} & character(*) & Mask file name root (used only in section~\ref{intgen}) \\
\texttt{mult\_gencost} & real & Multiplier $\alpha_i$ (default: 1) \\
\hline
\texttt{gencost\_preproc} & character(*) & Preprocessor names \\
\texttt{gencost\_preproc\_c} & character(*) & Preprocessor character arguments \\
\texttt{gencost\_preproc\_i} & integer(*) & Preprocessor integer arguments \\
\texttt{gencost\_preproc\_r} & real(*) & Preprocessor real arguments \\
\texttt{gencost\_posproc} & character(*) & Post-processor names \\
\texttt{gencost\_posproc\_c} & character(*) & Post-processor character arguments \\
\texttt{gencost\_posproc\_i} & integer(*) & Post-processor integer arguments \\
\texttt{gencost\_posproc\_r} & real(*) & Post-processor real arguments \\
\hline
\texttt{gencost\_spmin} & real & Data less than this value will be omitted \\
\texttt{gencost\_spmax} & real & Data greater than this value will be omitted \\
\texttt{gencost\_spzero} & real & Data points equal to this value will be omitted \\
\texttt{gencost\_startdate1} & integer & Start date of observations (YYYYMMDD) \\
\texttt{gencost\_startdate2} & integer & Start date of observations (HHMMSS) \\
\texttt{gencost\_is3d} & logical & Needs to be true for 3D fields \\
\hline
\texttt{gencost\_enddate1} & integer & Not fully implemented (used only in sec.~\ref{v4custom}) \\
\texttt{gencost\_enddate2} & integer & Not fully implemented (used only in sec.~\ref{v4custom}) \\
\end{tabular}
\caption{Parameters in \texttt{ecco\_gencost\_nml} namelist in \texttt{data.ecco}. All parameters are vectors of length \texttt{NGENCOST} (the \# of available cost terms) except for \texttt{gencost\_*proc*}, which are arrays of size \texttt{NGENPPROC}$\times$\texttt{NGENCOST}. Note: \texttt{gencost\_is3d} is automatically reset to true in all 3D cases in table~\ref{tbl:gencost_ecco_barfile}.}
\label{tbl:gencost_ecco_params}
\end{table}

\begin{table}[!ht]
\centering
\begin{tabular}{lll}
variable name & description & remarks \\ \hline\hline
\texttt{m\_eta} & sea surface height & free surface + ice + global steric correction \\
\texttt{m\_sst} & sea surface temperature & first level potential temperature \\
\texttt{m\_sss} & sea surface salinity & first level salinity \\
\texttt{m\_bp} & bottom pressure & phiHydLow \\
\texttt{m\_siarea} & sea-ice area & from pkg/seaice \\
\texttt{m\_siheff} & sea-ice effective thickness & from pkg/seaice \\
\texttt{m\_sihsnow} & snow effective thickness & from pkg/seaice \\ \hline
\texttt{m\_theta} & potential temperature & three-dimensional \\
\texttt{m\_salt} & salinity & three-dimensional \\
\texttt{m\_UE} & zonal velocity & three-dimensional \\
\texttt{m\_VN} & meridional velocity & three-dimensional \\ \hline
\texttt{m\_ustress} & zonal wind stress & from pkg/exf \\
\texttt{m\_vstress} & meridional wind stress & from pkg/exf \\
\texttt{m\_uwind} & zonal wind & from pkg/exf \\
\texttt{m\_vwind} & meridional wind & from pkg/exf \\
\texttt{m\_atemp} & atmospheric temperature & from pkg/exf \\
\texttt{m\_aqh} & atmospheric specific humidity & from pkg/exf \\
\texttt{m\_precip} & precipitation & from pkg/exf \\
\texttt{m\_swdown} & downward shortwave & from pkg/exf \\
\texttt{m\_lwdown} & downward longwave & from pkg/exf \\
\texttt{m\_wspeed} & wind speed & from pkg/exf \\ \hline
\texttt{m\_diffkr} & vertical/diapycnal diffusivity & three-dimensional, constant \\
\texttt{m\_kapgm} & GM diffusivity & three-dimensional, constant \\
\texttt{m\_kapredi} & isopycnal diffusivity & three-dimensional, constant \\
\texttt{m\_geothermalflux} & geothermal heat flux & constant \\
\texttt{m\_bottomdrag} & bottom drag & constant \\
\end{tabular}
\caption{Implemented \texttt{gencost\_barfile} options (as of checkpoint \mitgcmCheckpointVersion) that can be used via \texttt{cost\_generic.F} (section~\ref{costgen}). An extension starting with `\_' can be appended at the end of the variable name to distinguish between separate cost function terms. Note: the `m\_eta' formula depends on the \texttt{ATMOSPHERIC\_LOADING} and \texttt{ALLOW\_PSBAR\_STERIC} compile time options and `useRealFreshWaterFlux' run time parameter.}
\label{tbl:gencost_ecco_barfile}
\end{table}

\begin{table}[!ht]
\centering
\begin{tabular}{lll}
name & description & specs needed via \texttt{gencost\_preproc\_i}, \texttt{\_r}, or \texttt{\_c} \\ \hline\hline
\texttt{gencost\_preproc} \\ \hline
\texttt{clim} & Use climatological misfits & integer: no.\ of records per climatological cycle \\
\texttt{mean} & Use time mean of misfits & --- \\
\texttt{anom} & Use anomalies from time mean & --- \\
\texttt{variaweight} & Use time-varying weight $W_i$ & --- \\
\texttt{nosumsq} & Use linear misfits & --- \\
\texttt{factor} & Multiply $\vec{m}_i$ by a scaling factor & real: the scaling factor \\ \hline \hline
\texttt{gencost\_posproc} \\ \hline
\texttt{smooth} & Smooth misfits & character: smoothing scale file \\
& & integer: smoother \# of time steps \\
\end{tabular}
\caption{\texttt{gencost\_preproc} and \texttt{gencost\_posproc} options implemented as of checkpoint \mitgcmCheckpointVersion. Note: the distinction between \texttt{gencost\_preproc} and \texttt{gencost\_posproc} seems unclear and may be revisited in the future.}
\label{tbl:gencost_ecco_preproc}
\end{table}

\clearpage

\subsection{Generic Integral Function} \label{intgen}

The functionality described in this section is operated by \texttt{cost\_gencost\_boxmean.F}. It is primarily aimed at obtaining a mechanistic understanding of a chosen physical variable via adjoint sensitivity computations (see Chapter~\ref{chap:autodiff}) as done for example in \cite{maro-eta:99,heim-eta:11,fuku-etal:14}. Thus the quadratic term in Eq.~\eqref{eq:Jtotal} ($\vec{d}_i^T R_i^{-1} \vec{d}_i$) is by default replaced with a scalar $d_i$\footnote{The quadratic option in fact does not yet exist in \texttt{cost\_gencost\_boxmean.F}.} that derives from model fields through a generic integral formula (Eq.~\eqref{eq:Jpreproc}). The specification of \texttt{gencost\_barfile} again selects the physical variable type. Currently valid options for \texttt{cost\_gencost\_boxmean.F} are reported in table~\ref{tbl:genint_ecco_barfile}. A suffix starting with \texttt{`\_'} can again be appended to \texttt{gencost\_barfile}.
% and the basic averaging frequency is specified via \texttt{gencost\_avgperiod}.

The integral formula is defined by masks provided via binary files whose names are specified via \texttt{gencost\_mask}. There are two cases: (1) if \texttt{gencost\_mask = `foo\_mask'} and \texttt{gencost\_barfile} is of the `m\_boxmean*' type then the model will search for horizontal, vertical, and temporal mask files named \texttt{foo\_maskC}, \texttt{foo\_maskK}, and \texttt{foo\_maskT}; (2) if instead \texttt{gencost\_barfile} is of the `m\_horflux\_*' type then the model will search for \texttt{foo\_maskW}, \texttt{foo\_maskS}, \texttt{foo\_maskK}, and \texttt{foo\_maskT}.

The `C' mask or the `W' / `S' masks are expected to be two-dimensional fields. The `K' and `T' masks (both optional; all 1 by default) are expected to be one-dimensional vectors. The `K' vector length should match Nr. The `T' vector length should match the \# of records that the specification of \texttt{gencost\_avgperiod} implies, but there is no restriction on its values. In case \#1 (`m\_boxmean*') the `C' and `K' masks should consist of $+1$ and $0$ values and a volume average will be computed accordingly. In case \#2 (`m\_horflux*') the `W', `S', and `K' masks should consist of $+1$, $-1$, and $0$ values and an integrated horizontal transport (or overturning) will be computed accordingly.

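
The two cases above might be configured as in the following namelist fragment. This is a sketch under stated assumptions: the mask roots \texttt{box\_mask} and \texttt{sec\_mask} are hypothetical, and the choice of monthly averaging is arbitrary.
\begin{verbatim}
 gencost_barfile(1)   = 'm_boxmean_theta',
 gencost_mask(1)      = 'box_mask',
 gencost_avgperiod(1) = 'month',
 gencost_barfile(2)   = 'm_horflux_vol',
 gencost_mask(2)      = 'sec_mask',
 gencost_avgperiod(2) = 'month',
\end{verbatim}
Following the rules stated above, term 1 would look for \texttt{box\_maskC}, \texttt{box\_maskK}, and \texttt{box\_maskT}, and term 2 for \texttt{sec\_maskW}, \texttt{sec\_maskS}, \texttt{sec\_maskK}, and \texttt{sec\_maskT}. With monthly averaging, each `T' vector would need one entry per monthly record of the run.
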
\begin{table}[!ht]
\centering
\begin{tabular}{lll}
variable name & description & remarks \\ \hline\hline
\texttt{m\_boxmean\_theta} & mean of theta over box & specify box \\
\texttt{m\_boxmean\_salt} & mean of salt over box & specify box \\
\texttt{m\_boxmean\_eta} & mean of SSH over box & specify box \\
\hline
\texttt{m\_horflux\_vol} & volume transport through section & specify transect \\
\end{tabular}
\caption{Implemented \texttt{gencost\_barfile} options (as of checkpoint \mitgcmCheckpointVersion) that can be used via \texttt{cost\_gencost\_boxmean.F} (section~\ref{intgen}).}
\label{tbl:genint_ecco_barfile}
\end{table}

\subsection{Custom Cost Functions} \label{v4custom}

This section (very much a work in progress...) pertains to the special cases of \texttt{cost\_gencost\_bpv4.F}, \texttt{cost\_gencost\_seaicev4.F}, \texttt{cost\_gencost\_sshv4.F}, \texttt{cost\_gencost\_sstv4.F}, and \texttt{cost\_gencost\_transp.F}. The \texttt{cost\_gencost\_transp.F} function can be used to compute a transport of volume, heat, or salt through a specified section (a non-quadratic cost function). To this end one sets \texttt{gencost\_name = `transp*'}, where \texttt{*} is an optional suffix starting with \texttt{`\_'}, and sets \texttt{gencost\_barfile} to one of \texttt{m\_trVol}, \texttt{m\_trHeat}, and \texttt{m\_trSalt}.

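
A volume transport term might thus be requested as sketched below. The mask root \texttt{sec\_mask} is hypothetical, and the use of \texttt{gencost\_mask} here is an assumption based on the remark in table~\ref{tbl:gencost_ecco_name} that the section is specified as in section~\ref{intgen}.
\begin{verbatim}
 gencost_name(1)    = 'transp_trVol',
 gencost_barfile(1) = 'm_trVol',
 gencost_mask(1)    = 'sec_mask',
\end{verbatim}
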
\begin{table}[!ht]
\centering
\begin{tabular}{lll}
name & description & remarks \\ \hline\hline
\texttt{sshv4-mdt} & sea surface height & mean dynamic topography (SSH - geod) \\
\texttt{sshv4-tp} & sea surface height & Along-Track Topex/Jason SLA (level 3) \\
\texttt{sshv4-ers} & sea surface height & Along-Track ERS/Envisat SLA (level 3) \\
\texttt{sshv4-gfo} & sea surface height & Along-Track GFO class SLA (level 3) \\
\texttt{sshv4-lsc} & sea surface height & Large-Scale SLA (from the above) \\
\texttt{sshv4-gmsl} & sea surface height & Global-Mean SLA (from the above) \\ \hline
\texttt{bpv4-grace} & bottom pressure & GRACE maps (level 4) \\ \hline
\texttt{sstv4-amsre} & sea surface temperature & Along-Swath SST (level 3) \\
\texttt{sstv4-amsre-lsc} & sea surface temperature & Large-Scale SST (from the above) \\ \hline
\texttt{si4-cons} & sea ice concentration & needs sea-ice adjoint (level 4) \\
\texttt{si4-deconc} & model sea ice deficiency & proxy penalty (from the above) \\
\texttt{si4-exconc} & model sea ice excess & proxy penalty (from the above) \\ \hline
\texttt{transp\_trVol} & volume transport & specify section as in section~\ref{intgen} \\
\texttt{transp\_trHeat} & heat transport & specify section as in section~\ref{intgen} \\
\texttt{transp\_trSalt} & salt transport & specify section as in section~\ref{intgen} \\
\end{tabular}
\caption{Pre-defined \texttt{gencost\_name} special cases (as of checkpoint \mitgcmCheckpointVersion; section~\ref{v4custom}).}
\label{tbl:gencost_ecco_name}
\end{table}

\subsection{Key Routines}

TBA... \texttt{cost\_generic.F}, \texttt{cost\_gencost\_boxmean.F}, \texttt{ecco\_phys.F}, \texttt{cost\_gencost\_customize.F}, \texttt{cost\_averagesfields.F}, \texttt{ecco\_toolbox.F}, ... \texttt{ecco\_readparms.F}, \texttt{ecco\_check.F}, \texttt{ecco\_summary.F}, ...

\subsection{Compile Options}

TBA... ALLOW\_GENCOST3D, ALLOW\_PSBAR\_STERIC, ECCO\_CTRL\_DEPRECATED, ...
