--- manual/s_phys_pkgs/mnc.tex 2004/04/06 16:48:33 1.9
+++ manual/s_phys_pkgs/mnc.tex 2005/07/18 14:00:00 1.15
@@ -1,8 +1,11 @@
-% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/Attic/mnc.tex,v 1.9 2004/04/06 16:48:33 edhill Exp $
+% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/Attic/mnc.tex,v 1.15 2005/07/18 14:00:00 edhill Exp $
% $Name: $
\section{NetCDF I/O Integration: MNC}
\label{sec:pkg:mnc}
+\begin{rawhtml}
+
+\end{rawhtml}
The \texttt{mnc} package is a set of convenience routines written to
expedite the process of creating, appending, and reading NetCDF files.
@@ -17,8 +20,345 @@
\end{verbatim}
\begin{rawhtml} \end{rawhtml}
+Since it is a ``wrapper'' for netCDF, MNC depends upon the Fortran-77
+interface included with the standard netCDF v3.x library which is
+often called \texttt{libnetcdf.a}. Please contact your local systems
+administrators or the
+\begin{rawhtml} \end{rawhtml}
+MITgcm-support
+\begin{rawhtml} \end{rawhtml}
+list for help building and installing netCDF for your particular
+platform.
+
+
+\subsection{Using MNC}
+
+\subsubsection{MNC Configuration}
+
+As with all MITgcm packages, MNC can be turned on or off at compile time
+using the \texttt{packages.conf} file or the \texttt{genmake2}
+\texttt{-enable=mnc} or \texttt{-disable=mnc} switches.
+
+While MNC is likely to work ``as is'', there are a few compile--time
+constants that may need to be increased for simulations that employ
+large numbers of tiles within each process. Note that the important
+quantity is the maximum number of tiles \textbf{per process}. Since
+MPI configurations tend to distribute large numbers of tiles over
+relatively large numbers of MPI processes, these constants will rarely
+need to be increased.
+
+If MNC runs out of space within its ``lookup'' tables during a
+simulation, then it will provide an error message along with a
+recommendation of which parameter to increase. The parameters are all
+located within \filelink{pkg/mnc/mnc\_common.h}{pkg-mnc-mnc_common.h}
+and the ones that may need to be increased are:
+
+\begin{center}
+ {\footnotesize
+ \begin{tabular}[htb]{|l|r|l|}\hline
+ \textbf{Name} &
+ \textbf{Default} & \textbf{Description} \\\hline
+ & & \\
+ \texttt{MNC\_MAX\_ID} & 1000 &
+ \textbf{IDs for various low-level entities} \\
+ \texttt{MNC\_MAX\_INFO} & 400 &
+ \textbf{IDs (mostly for object sizes)} \\
+ \texttt{MNC\_CW\_MAX\_I} & 150 &
+ \textbf{IDs for the ``wrapper'' layer} \\\hline
+ \end{tabular}
+ }
+\end{center}
+
+In those rare cases where MNC ``out-of-memory'' error messages are
+encountered, it is a good idea to increase the too-small parameter by
+a factor of \textbf{2--10} in order to avoid wasting time on an
+iterative compile--test sequence.
+
+
+\subsubsection{MNC Inputs}
+
+Like most MITgcm packages, all of MNC can be turned on/off at runtime
+using a single flag in \texttt{data.pkg}
+\begin{center}
+ {\footnotesize
+ \begin{tabular}[htb]{|l|c|l|l|}\hline
+ \textbf{Name} & \textbf{T} &
+ \textbf{Default} & \textbf{Description} \\\hline
+ & & & \\
+ \texttt{useMNC} & L & \texttt{.FALSE.} &
+ overall MNC ON/OFF switch \\\hline
+ \end{tabular}
+ }
+\end{center}
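+
+As a minimal sketch, a \texttt{data.pkg} file that switches MNC on
+might look like the following. The namelist group name
+\texttt{PACKAGES} and the terminator follow common MITgcm usage and
+are assumptions here, so please check them against an existing
+\texttt{data.pkg}: {\footnotesize
+\begin{verbatim}
+ &PACKAGES
+ useMNC=.TRUE.,
+ &
+\end{verbatim}
+}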
+
+One important MNC--related flag is present in the main \texttt{data}
+namelist file in the \texttt{PARM03} section and it is:
+\begin{center}
+ {\footnotesize
+ \begin{tabular}[htb]{|l|c|l|l|}\hline
+ \textbf{Name} & \textbf{T} &
+ \textbf{Default} & \textbf{Description} \\\hline
+ & & & \\
+ \texttt{outputTypesInclusive} & L & \texttt{.FALSE.} &
+ use all available output ``types'' \\\hline
+ \end{tabular}
+ }
+\end{center}
+which specifies that turning on MNC for a particular type of output
+should not simultaneously turn off the default output method as it
+normally does. Usually, this option is only used for debugging
+purposes since it is inefficient to write output types using both MNC
+and MDSIO or ASCII output. This option can also be helpful when
+transitioning from MDSIO to MNC since the output can be readily
+compared.
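+
+For such a comparison run, a hedged sketch of the relevant fragment
+of the main \texttt{data} file is shown here. All other
+\texttt{PARM03} entries are omitted and the terminator is assumed to
+match the one used elsewhere in the file: {\footnotesize
+\begin{verbatim}
+ &PARM03
+ outputTypesInclusive=.TRUE.,
+ &
+\end{verbatim}
+}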
+
+For run-time configuration, most of the MNC--related model parameters
+are contained within a Fortran namelist file called
+\texttt{data.mnc}. The available parameters currently include:
+\begin{center}
+ {\footnotesize
+ \begin{tabular}[htb]{|l|c|l|l|}\hline
+ \textbf{Name} & \textbf{T} &
+ \textbf{Default} & \textbf{Description} \\\hline
+ & & & \\
+ \texttt{mnc\_use\_outdir} & L & \texttt{.FALSE.} &
+ create a directory for output \\
+ \ \ \texttt{mnc\_outdir\_str} & S & \texttt{'mnc\_'} &
+ output directory name \\
+ \ \ \texttt{mnc\_outdir\_date} & L & \texttt{.FALSE.} &
+ embed date in the outdir name \\
+ \ \ \texttt{mnc\_outdir\_num} & L & \texttt{.FALSE.} &
+ optional \\
+ \texttt{pickup\_write\_mnc} & L & \texttt{.FALSE.} &
+ use MNC to write pickup files \\
+ \texttt{pickup\_read\_mnc} & L & \texttt{.FALSE.} &
+ use MNC to read pickup files \\
+ \texttt{mnc\_use\_indir} & L & \texttt{.FALSE.} &
+ use a directory (path) for input \\
+ \ \ \texttt{mnc\_indir\_str} & S & \texttt{''} &
+ input directory (or path) name \\
+ \texttt{snapshot\_mnc} & L & \texttt{.FALSE.} &
+ write \texttt{snapshot} output w/MNC \\
+ \texttt{monitor\_mnc} & L & \texttt{.FALSE.} &
+ write \texttt{monitor} output w/MNC \\
+ \texttt{timeave\_mnc} & L & \texttt{.FALSE.} &
+ write \texttt{timeave} output w/MNC \\
+ \texttt{autodiff\_mnc} & L & \texttt{.FALSE.} &
+ write \texttt{autodiff} output w/MNC \\
+ \texttt{mnc\_max\_fsize} & R & 2.1e+09 &
+ max allowable file size \\
+ \texttt{readgrid\_mnc} & L & \texttt{.FALSE.} &
+ read grid quantities using MNC \\
+ \texttt{mnc\_echo\_gvtypes} & L & \texttt{.FALSE.} &
+ list pre-defined ``types'' (debug) \\\hline
+ \end{tabular}
+ }
+\end{center}
+
+Unlike the older MDSIO method, MNC has the ability to create or use
+existing output directories. If either \texttt{mnc\_outdir\_date} or
+\texttt{mnc\_outdir\_num} is true, then MNC will try to create
+directories on a \textit{PER PROCESS} basis for its output. This
+means that a single directory will be created for a non-MPI run and
+multiple directories (one per MPI process) will be created for an MPI
+run. This approach was chosen since it works safely on both shared
+global file systems (such as NFS and AFS) and on local
+(per-compute-node) file systems. If both
+\texttt{mnc\_outdir\_date} and \texttt{mnc\_outdir\_num} are false,
+then the MNC package will assume that the directory specified in
+\texttt{mnc\_outdir\_str} already exists and will use it. This allows
+the user to create and specify directories outside of the model.
+
+For input, MNC can use a single global input directory. This is
+just a convenience that allows MNC to gather all of its input files from a
+path other than the current working directory. As with MDSIO, the
+default is to use the current working directory.
+
+The flags \texttt{snapshot\_mnc}, \texttt{monitor\_mnc},
+\texttt{timeave\_mnc}, and \texttt{autodiff\_mnc} allow the user to
+turn on MNC for particular ``types'' of output. If a type is
+selected, then MNC will be used for all output that matches that type.
+This applies to output from the main model and from all of the
+optional MITgcm packages. Mostly, the names used here correspond to
+the names used for the output frequencies in the main \texttt{data}
+namelist file.
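+
+Putting these pieces together, one possible \texttt{data.mnc} is
+sketched here. It asks MNC to create numbered per-process output
+directories and to handle the snapshot and monitor output. The
+parameter names come from the table above, while the group name
+\texttt{MNC\_01}, the terminator, and the directory prefix
+\texttt{mnc\_out\_} are assumptions made for illustration:
+{\footnotesize
+\begin{verbatim}
+ &MNC_01
+ mnc_use_outdir=.TRUE.,
+ mnc_outdir_str='mnc_out_',
+ mnc_outdir_num=.TRUE.,
+ snapshot_mnc=.TRUE.,
+ monitor_mnc=.TRUE.,
+ &
+\end{verbatim}
+}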
+
+The \texttt{mnc\_max\_fsize} parameter is a convenience added to help
+users work around common file size limitations. On many computer
+systems, the operating system, the file system(s), and/or the
+netCDF libraries are unable to handle files greater than two or four
+gigabytes in size. The MNC package is able to work within this
+limitation by creating new files which grow along the netCDF
+``unlimited'' (usually, time) dimension. The default value for this
+parameter is just slightly less than 2GB which is safe on virtually
+all operating systems. Essentially, this feature is a way to
+intelligently and automatically split files output along the unlimited
+dimension. On systems that support large file sizes, these splits can
+be readily concatenated (that is, un-done) using tools such as the
+netCDF Operators (with \texttt{ncrcat}) which are available at:
+\begin{rawhtml} \end{rawhtml}
+\begin{verbatim}
+http://nco.sourceforge.net/
+\end{verbatim}
+\begin{rawhtml} \end{rawhtml}
+
+Additional MNC--related parameters may be contained within each
+package. Please see the individual packages for descriptions of their
+use of MNC.
+
+
+\subsubsection{MNC Output}
+
+Depending upon the flags used, MNC will produce zero or more
+directories containing one or more netCDF files as output. These
+files are either mostly or entirely compliant with the netCDF ``CF''
+convention (v1.0) and any conformance issues will be fixed over time.
+The patterns used for file names are:
+\begin{center}
+\texttt{BASENAME.nIter0.tileNum.seqNum.nc}
+\end{center}
+and an example is:
+\begin{center}
+\texttt{grid.0000000000.000001.0000.nc}
+\end{center}
+where \texttt{BASENAME} is the name selected to represent a set of
+variables written together, \texttt{nIter0} is the starting iteration
+number as specified in the main \texttt{data} namelist input file and
+written in a zero-filled 10-digit format, \texttt{tileNum} is the
+six-digit zero-filled tile number, \texttt{seqNum} is a four-digit
+zero-filled sequence number used when maximum allowable file sizes
+are too small to contain all of the output for a particular type
+within one run (new files are created with sequential numbers as files
+reach the maximum file size limit), and \texttt{.nc} is the file
+suffix specified by the current netCDF ``CF'' conventions.
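+
+The field widths described above can be reproduced with a single
+Fortran format. The following stand-alone program is only a sketch
+and is not taken from the MNC source; the program name and the
+variable \texttt{fname} are hypothetical, and the 80-character buffer
+simply mirrors the file name length limit noted under Runtime
+Troubleshooting: {\footnotesize
+\begin{verbatim}
+      PROGRAM MNCNAME
+C     Hypothetical stand-alone sketch; not part of the MNC package
+      IMPLICIT NONE
+      CHARACTER*(80) fname
+      INTEGER nIter0, tileNum, seqNum
+      nIter0  = 0
+      tileNum = 1
+      seqNum  = 0
+C     BASENAME.nIter0.tileNum.seqNum.nc with zero-filled fields
+      WRITE(fname,'(A,A,I10.10,A,I6.6,A,I4.4,A)')
+     &     'grid', '.', nIter0, '.', tileNum, '.', seqNum, '.nc'
+      PRINT *, fname
+      END
+\end{verbatim}
+}
+With the values shown, the program builds the example name
+\texttt{grid.0000000000.000001.0000.nc} given above.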
+
+Some example \texttt{BASENAME} values are:
+\begin{description}
+\item[grid] contains the variables that describe the various grid
+ constants related to locations, lengths, areas, etc.
+\item[state] contains the variables output at the snapshot or
+ \texttt{dumpFreq} time frequency
+\item[pickup.ckptA, pickup.ckptB] are the ``rolling'' checkpoint files
+\item[tave] contains the time-averaged quantities from the main model
+\end{description}
+
+All MNC output is currently done in a ``file-per-tile'' fashion since
+most NetCDF v3.x implementations cannot write safely within MPI or
+multi-threaded environments. This tiling is done in a global fashion
+and the tile numbers are appended to the base names as described
+above. Some scripts to manipulate MNC output are available at
+\texttt{MITgcm/utils/matlab/} which includes a spatial ``assembly''
+script called \texttt{MITgcm/utils/matlab/mnc\_assembly.m}.
+
+More general manipulations can be performed on netCDF files with
+\begin{rawhtml} \end{rawhtml}
+\begin{verbatim}
+the NetCDF Operators (``NCO'')
+at http://nco.sourceforge.net
+\end{verbatim}
+\begin{rawhtml} \end{rawhtml}
+or with
+\begin{rawhtml} \end{rawhtml}
+\begin{verbatim}
+the Climate Data Operators (``CDO'')
+at http://www.mpimet.mpg.de/~cdo/
+\end{verbatim}
+\begin{rawhtml} \end{rawhtml}
+
+Unlike the older MDSIO routines, MNC reads and writes variables on
+different ``grids'' depending upon their location on, for instance, an
+Arakawa C--grid. The following table provides examples:
+\begin{center}
+ {\footnotesize
+ \begin{tabular}[htb]{|l|c|c|c|}\hline
+ \textbf{Name} & \textbf{C--grid location} &
+ \textbf{\# in X} & \textbf{\# in Y} \\\hline
+ Temperature & mass & \texttt{sNx} & \texttt{sNy} \\
+ Salinity & mass & \texttt{sNx} & \texttt{sNy} \\
+ U velocity & U & \texttt{sNx+1} & \texttt{sNy} \\
+ V velocity & V & \texttt{sNx} & \texttt{sNy+1} \\
+ Vorticity & vorticity & \texttt{sNx+1} & \texttt{sNy+1} \\\hline
+ \end{tabular}
+ }
+\end{center}
+and the intent is two--fold:
+\begin{enumerate}
+\item For some grid topologies it is impossible to output all
+ quantities using only \texttt{sNx,sNy} arrays for every tile. Two
+ examples of this failure are the missing corners problem for
+ vorticity values on the cubesphere and the velocity edge values for
+ some open--boundary domains.
+\item Writing quantities located on velocity or vorticity points with
+ the above scheme introduces a very small data redundancy. However,
+ any slight inconvenience is easily offset by the ease with which one
+ can, on every individual tile, interpolate these values to mass
+ points without having to perform an ``exchange'' (or
+ ``halo-filling'') operation to collect the values from neighboring
+ tiles. This makes the most common post--processing operations much
+ easier to implement.
+\end{enumerate}
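+
+To illustrate the second point, the sketch below averages a U-point
+field of size \texttt{(sNx+1,sNy)} onto the \texttt{(sNx,sNy)} mass
+points of one tile using only data local to that tile. It is a
+hypothetical stand-alone program: the tile sizes, the array names
+\texttt{u} and \texttt{uc}, and the unweighted two-point average
+(which ignores land masks and grid metrics) are all assumptions made
+for illustration: {\footnotesize
+\begin{verbatim}
+      PROGRAM U2MASS
+C     Hypothetical sketch: sizes and array names are illustrative
+      IMPLICIT NONE
+      INTEGER sNx, sNy
+      PARAMETER ( sNx = 4, sNy = 3 )
+      REAL*8 u(sNx+1,sNy), uc(sNx,sNy)
+      INTEGER i, j
+C     Fill u with a simple linear ramp in x
+      DO j = 1, sNy
+        DO i = 1, sNx+1
+          u(i,j) = DBLE(i-1)
+        ENDDO
+      ENDDO
+C     Average the two U points that bound each mass cell
+      DO j = 1, sNy
+        DO i = 1, sNx
+          uc(i,j) = 0.5D0*( u(i,j) + u(i+1,j) )
+        ENDDO
+      ENDDO
+      PRINT *, uc(1,1), uc(sNx,1)
+      END
+\end{verbatim}
+}
+Because the extra column \texttt{u(sNx+1,j)} is stored with the tile,
+the loop never needs values from a neighboring tile.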
+
+
+\subsection{MNC Troubleshooting}
+
+\subsubsection{Build Troubleshooting}
+
+In order to build MITgcm with MNC enabled, the netCDF v3.x Fortran-77
+(not Fortran-90) library must be available. This library is composed
+of a single header file (called \texttt{netcdf.inc}) and a single
+library file (usually called \texttt{libnetcdf.a}) and it must be
+built with the same compiler (or a binary-compatible compiler) with
+compatible compiler options as the one used to build MITgcm.
+
+For more details concerning the netCDF build and install process,
+please visit the netCDF home page at:
+\begin{rawhtml} \end{rawhtml}
+\begin{verbatim}
+http://www.unidata.ucar.edu/packages/netcdf/
+\end{verbatim}
+\begin{rawhtml} \end{rawhtml}
+which includes an extensive list of known--good netCDF configurations
+for various platforms.
+
+\subsubsection{Runtime Troubleshooting}
+
+Please be aware of the following:
+
+\begin{itemize}
+\item As a safety feature, the MNC package does not, by default, allow
+ pre-existing files to be appended to or overwritten. This is in
+ contrast to the older MDSIO package which will, without any warning,
+ overwrite existing files. If MITgcm aborts with an error message
+ about the inability to open or write to a netCDF file, please check
+ \textbf{first} whether you are attempting to overwrite files from a
+ previous run.
+
+\item The constraints placed upon the ``unlimited'' (or ``record'')
+ dimension inherent with NetCDF v3.x make it very inefficient to put
+ variables written at potentially different intervals within the same
+ file. For this reason, MNC output is split into groups of files
+ which attempt to reflect the nature of their content.
+
+\item On many systems, netCDF has practical file size limits on the
+ order of 2--4GB (the maximum memory addressable with 32-bit pointers
+ or pointer differences) due to a lack of operating system, compiler,
+ and/or library support. The latest revisions of netCDF v3.x have
+ large file support and, on some operating systems, file sizes are
+ only limited by available disk space.
+
+\item There is an 80-character limit to the total length of all file
+ names. This limit includes the directory (or path) since paths and
+ file names are internally appended. Generally, file names will not
+ exceed the limit and paths can usually be shortened using, for
+ example, soft links.
+
+\item MNC does not (yet) provide a mechanism for reading information
+ from a single ``global'' file as can be done with the MDSIO
+ package. This is in progress.
+\end{itemize}
+
-\subsection{Introduction}
+\subsection{MNC Internals}
The \texttt{mnc} package is a two-level convenience library (or
``wrapper'') for most of the NetCDF Fortran API. Its purpose is to
@@ -66,9 +406,7 @@
\end{description}
-\subsection{Using MNC}
-
-\subsubsection{Grid--Types and Variable--Types}
+\subsubsection{MNC Grid--Types and Variable--Types}
As a convenience for users, the MNC package includes numerous routines
to aid in the writing of data to NetCDF format. Probably the biggest
@@ -127,7 +465,7 @@
and writing variables.
-\subsubsection{An Example}
+\subsubsection{Using MNC: Examples}
Writing variables to NetCDF files can be accomplished in as few as two
function calls. The first function call defines a variable type,
@@ -175,7 +513,7 @@
\begin{verbatim}
C Write dynvars using the MNC package
CALL MNC_CW_SET_UDIM('state', -1, myThid)
- CALL MNC_CW_RL_W('I','state',0,0,'iter', myIter, myThid)
+ CALL MNC_CW_I_W('I','state',0,0,'iter', myIter, myThid)
CALL MNC_CW_SET_UDIM('state', 0, myThid)
CALL MNC_CW_RL_W('D','state',0,0,'model_time',myTime, myThid)
CALL MNC_CW_RL_W('D','state',0,0,'U', uVel, myThid)
@@ -183,45 +521,31 @@
\end{verbatim}
}
+While it is easiest to write variables within typical 2D and 3D fields
+where all data is known at a given time, it is also possible to write
+fields where only a portion (\textit{e.g.} a ``slab'' or ``slice'') is
+known at a given instant. An example is provided within
+\filelink{pkg/mom\_vecinv/mom\_vecinv.F}{pkg-mom_vecinv-mom_vecinv.F}
+where an offset vector is used: {\footnotesize
+\begin{verbatim}
+ IF (useMNC .AND. snapshot_mnc) THEN
+ CALL MNC_CW_RL_W_OFFSET('D','mom_vi',bi,bj, 'fV', uCf,
+ & offsets, myThid)
+ CALL MNC_CW_RL_W_OFFSET('D','mom_vi',bi,bj, 'fU', vCf,
+ & offsets, myThid)
+ ENDIF
+\end{verbatim}
+}
+to write a 3D field one depth slice at a time.
-\subsubsection{Parameters}
-
-All the MNC parameters are contained within a file named
-\texttt{data.mnc}. If this file does not exist, then the MNC package
-will interpret that as an indication that it is not to be used. If
-the \texttt{data.mnc} does exist, then it may contain the following
-parameters:
-
-\begin{center}
- {\footnotesize
- \begin{tabular}[htb]{|l|l|l|l|}\hline
- & & & \\
- \textbf{Name} & \textbf{Type} &
- \textbf{Default} & \textbf{Description} \\\hline
- & & & \\
- \texttt{useMNC} & Logical & \texttt{.FALSE.} &
- \textbf{overall MNC ON/OFF switch} \\
- \texttt{mnc\_echo\_gvtypes} & Logical & \texttt{.FALSE.} &
- echo pre-defined ``types'' to STDOUT? \\
- \texttt{mnc\_use\_outdir} & Logical & \texttt{.FALSE.} &
- create a directory for output? \\
- \texttt{mnc\_outdir\_str} & String & \texttt{'mnc\_'} &
- output directory name \\
- \texttt{mnc\_outdir\_date} & Logical & \texttt{.FALSE.} &
- embed date in output directory name? \\
- \texttt{mnc\_pickup\_write} & Logical & \texttt{.FALSE.} &
- use MNC to write (create) pickup files? \\
- \texttt{mnc\_pickup\_read} & Logical & \texttt{.FALSE.} &
- use MNC to read pickup files? \\
- \texttt{mnc\_use\_indir} & Logical & \texttt{.FALSE.} &
- use a directory (path) for input? \\
- \texttt{mnc\_indir\_str} & String & \texttt{''} &
- input directory (or path) name \\
- \texttt{mnc\_use\_for\_mon} & Logical & \texttt{.FALSE.} &
- write \texttt{monitor} output using MNC? \\\hline
- \end{tabular}
- }
-\end{center}
-
-%\subsection{Package Reference}
+Each element in the offset vector corresponds (in order) to the
+dimensions of the ``full'' (or virtual) array and specifies which are
+known at the time of the call. A zero within the offset array means
+that all values along that dimension are available while a positive
+integer means that only values along that index of the dimension are
+available. In all cases, the matrix passed is assumed to start (that
+is, have an in-memory structure) coinciding with the start of the
+specified slice. Thus, using this offset array mechanism, a slice
+can be written along any single dimension or combinations of
+dimensions.
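+
+As a hedged illustration of these rules, the fragment below prepares
+an offset vector for writing depth level \texttt{k} of a 3D field.
+It is a fragment rather than a complete routine, and it assumes that
+\texttt{k} is the current depth index, that \texttt{i} is a spare
+loop index, and that the offset vector has nine elements; the actual
+declaration in \texttt{mom\_vecinv.F} should be checked before
+reusing it: {\footnotesize
+\begin{verbatim}
+C     Assumed declarations (vector length 9 is an assumption)
+      INTEGER offsets(9)
+      INTEGER i, k
+
+C     Zero: all values along that dimension are available
+      DO i = 1,9
+        offsets(i) = 0
+      ENDDO
+C     Positive: only index k of the third (depth) dimension
+      offsets(3) = k
+      CALL MNC_CW_RL_W_OFFSET('D','mom_vi',bi,bj, 'fV', uCf,
+     &     offsets, myThid)
+\end{verbatim}
+}
+Here the first two entries stay zero because the full horizontal
+extent of the tile is known, and only the third entry selects a
+single level.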