--- manual/s_phys_pkgs/mnc.tex 2004/04/06 16:48:33 1.9 +++ manual/s_phys_pkgs/mnc.tex 2005/07/18 14:00:00 1.15 @@ -1,8 +1,11 @@ -% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/Attic/mnc.tex,v 1.9 2004/04/06 16:48:33 edhill Exp $ +% $Header: /home/ubuntu/mnt/e9_copy/manual/s_phys_pkgs/Attic/mnc.tex,v 1.15 2005/07/18 14:00:00 edhill Exp $ % $Name: $ \section{NetCDF I/O Integration: MNC} \label{sec:pkg:mnc} +\begin{rawhtml} + +\end{rawhtml} The \texttt{mnc} package is a set of convenience routines written to expedite the process of creating, appending, and reading NetCDF files. @@ -17,8 +20,345 @@ \end{verbatim} \begin{rawhtml} \end{rawhtml} +Since it is a ``wrapper'' for netCDF, MNC depends upon the Fortran-77 +interface included with the standard netCDF v3.x library which is +often called \texttt{libnetcdf.a}. Please contact your local systems +administrators or the +\begin{rawhtml} \end{rawhtml} +MITgcm-support +\begin{rawhtml} \end{rawhtml} +list for help building and installing netCDF for your particular +platform. + + +\subsection{Using MNC} + +\subsubsection{MNC Configuration} + +As with all MITgcm packages, MNC can be turned on or off at compile time +using the \texttt{packages.conf} file or the \texttt{genmake2} +\texttt{-enable=mnc} or \texttt{-disable=mnc} switches. + +While MNC is likely to work ``as is'', there are a few compile--time +constants that may need to be increased for simulations that employ +large numbers of tiles within each process. Note that the important +quantity is the maximum number of tiles \textbf{per process}. Since +MPI configurations tend to distribute large numbers of tiles over +relatively large numbers of MPI processes, these constants will rarely +need to be increased. + +If MNC runs out of space within its ``lookup'' tables during a +simulation, then it will provide an error message along with a +recommendation of which parameter to increase. The parameters are all +located within \filelink{pkg/mnc/mnc\_common.h}{pkg-mnc-mnc_common.h} +and the ones that may need to be increased are: + +\begin{center} + {\footnotesize + \begin{tabular}[htb]{|l|r|l|}\hline + \textbf{Name} & + \textbf{Default} & \textbf{Description} \\\hline + & & \\ + \texttt{MNC\_MAX\_ID} & 1000 & + \textbf{IDs for various low-level entities} \\ + \texttt{MNC\_MAX\_INFO} & 400 & + \textbf{IDs (mostly for object sizes)} \\ + \texttt{MNC\_CW\_MAX\_I} & 150 & + \textbf{IDs for the ``wrapper'' layer} \\\hline + \end{tabular} + } +\end{center} + +In those rare cases where MNC ``out-of-memory'' error messages are +encountered, it is a good idea to increase the too-small parameter by +a factor of \textbf{2--10} in order to avoid wasting time on an +iterative compile--test sequence. + + +\subsubsection{MNC Inputs} + +Like most MITgcm packages, all of MNC can be turned on/off at runtime +using a single flag in \texttt{data.pkg} +\begin{center} + {\footnotesize + \begin{tabular}[htb]{|l|c|l|l|}\hline + \textbf{Name} & \textbf{T} & + \textbf{Default} & \textbf{Description} \\\hline + & & & \\ + \texttt{useMNC} & L & \texttt{.FALSE.} & + overall MNC ON/OFF switch \\\hline + \end{tabular} + } +\end{center} + +One important MNC--related flag is present in the main \texttt{data} +namelist file in the \texttt{PARM03} section and it is: +\begin{center} + {\footnotesize + \begin{tabular}[htb]{|l|c|l|l|}\hline + \textbf{Name} & \textbf{T} & + \textbf{Default} & \textbf{Description} \\\hline + & & & \\ + \texttt{outputTypesInclusive} & L & \texttt{.FALSE.} & + use all available output ``types'' \\\hline + \end{tabular} + } +\end{center} +which specifies that turning on MNC for a particular type of output +should not simultaneously turn off the default output method as it +normally does. Usually, this option is only used for debugging +purposes since it is inefficient to write output types using both MNC +and MDSIO or ASCII output. This option can also be helpful when +transitioning from MDSIO to MNC since the output can be readily +compared. + +For run-time configuration, most of the MNC--related model parameters +are contained within a Fortran namelist file called +\texttt{data.mnc}. The availabe parameters currently include: +\begin{center} + {\footnotesize + \begin{tabular}[htb]{|l|c|l|l|}\hline + \textbf{Name} & \textbf{T} & + \textbf{Default} & \textbf{Description} \\\hline + & & & \\ + \texttt{mnc\_use\_outdir} & L & \texttt{.FALSE.} & + create a directory for output \\ + \ \ \texttt{mnc\_outdir\_str} & S & \texttt{'mnc\_'} & + output directory name \\ + \ \ \texttt{mnc\_outdir\_date} & L & \texttt{.FALSE.} & + embed date in the outdir name \\ + \ \ \texttt{mnc\_outdir\_num} & L & \texttt{.FALSE.} & + optional \\ + \texttt{pickup\_write\_mnc} & L & \texttt{.FALSE.} & + use MNC to write pickup files \\ + \texttt{pickup\_read\_mnc} & L & \texttt{.FALSE.} & + use MNC to read pickup files \\ + \texttt{mnc\_use\_indir} & L & \texttt{.FALSE.} & + use a directory (path) for input \\ + \ \ \texttt{mnc\_indir\_str} & S & \texttt{''} & + input directory (or path) name \\ + \texttt{snapshot\_mnc} & L & \texttt{.FALSE.} & + write \texttt{snapshot} output w/MNC \\ + \texttt{monitor\_mnc} & L & \texttt{.FALSE.} & + write \texttt{monitor} output w/MNC \\ + \texttt{timeave\_mnc} & L & \texttt{.FALSE.} & + write \texttt{timeave} output w/MNC \\ + \texttt{autodiff\_mnc} & L & \texttt{.FALSE.} & + write \texttt{autodiff} output w/MNC \\ + \texttt{mnc\_max\_fsize} & R & 2.1e+09 & + max allowable file size \\ + \texttt{readgrid\_mnc} & L & \texttt{.FALSE.} & + read grid quantities using MNC \\ + \texttt{mnc\_echo\_gvtypes} & L & \texttt{.FALSE.} & + list pre-defined ``types'' (debug) \\\hline + \end{tabular} + } +\end{center} + +Unlike the older MDSIO method, MNC has the ability to create or use +existing output directories. If either \texttt{mnc\_outdir\_date} or +\texttt{mnc\_outdir\_num} is true, then MNC will try to create +directories on a \textit{PER PROCESS} basis for its output. This +means that a single directory will be created for a non-MPI run and +multiple directories (one per MPI process) will be created for an MPI +run. This approach was chosen since it works safely on both shared +global file systems (such as NFS and AFS) and on local +(per-compute-node) file systems. And if both +\texttt{mnc\_outdir\_date} and \texttt{mnc\_outdir\_num} are false, +then the MNC package will assume that the directory specified in +\texttt{mnc\_outdir\_str} already exists and will use it. This allows +the user to create and specify directories outside of the model. + +For input, MNC can use a single global input directory. This is a +just convenience that allows MNC to gather all of its input files from a +path other than the current working directory. As with MDSIO, the +default is to use the current working directory. + +The flags \texttt{snapshot\_mnc}, \texttt{monitor\_mnc}, +\texttt{timeave\_mnc}, and \texttt{autodiff\_mnc} allow the user to +turn on MNC for particular ``types'' of output. If a type is +selected, then MNC will be used for all output that matches that type. +This applies to output from the main model and from all of the +optional MITgcm packages. Mostly, the names used here correspond to +the names used for the output frequencies in the main \texttt{data} +namelist file. + +The \texttt{mnc\_max\_fsize} parameter is a convenience added to help +users work around common file size limitations. On many computer +systems, either the opterating system, the file system(s), and/or the +netCDF libraries are unable to handle files greater than two or four +gigabytes in size. The MNC package is able to work within this +limitation by creating new files which grow along the netCDF +``unlimited'' (usually, time) dimension. The default value for this +parameter is just slightly less than 2GB which is safe on virtually +all operating systems. Essentially, this feature is a way to +intelligently and automatically split files output along the unlimited +dimension. On systems that support large file sizes, these splits can +be readily concatenated (that is, un-done) using tools such as the +netCDF Operators (with \texttt{ncrcat}) which is available at: +\begin{rawhtml} \end{rawhtml} +\begin{verbatim} +http://nco.sourceforge.net/ +\end{verbatim} +\begin{rawhtml} \end{rawhtml} + +Additional MNC--related parameters may be contained within each +package. Please see the individual packages for descriptions of their +use of MNC. + + +\subsubsection{MNC Output} + +Depending upon the flags used, MNC will produce zero or more +directories containing one or more netCDF files as output. These +files are either mostly or entirely compliant with the netCDF ``CF'' +convention (v1.0) and any conformance issues will be fixed over time. +The patterns used for file names are: +\begin{center} +\texttt{BASENAME.nIter0.tileNum.seqNum.nc} +\end{center} +and an example is: +\begin{center} +\texttt{grid.0000000000.000001.0000.nc} +\end{center} +where \texttt{BASENAME} is the name selected to represent a set of +variables written together, \texttt{nIter0} is the starting iteration +number as specified in the main \texttt{data} namelist input file and +written in a zero-filled 10-digit format, \texttt{tileNum} is the +six-digit zero-filled tile number, \texttt{seqnum} is a four-digit +zero-filled sequence number used when maximum allowable files sizes +are too small to contain all of the output for a particular type +within one run (new files are created with sequential numbers as files +reach the maximum file size limit), and \texttt{.nc} is the file +suffix specified by the current netCDF ``CF'' conventions. + +Some example \texttt{BASENAME} values are: +\begin{description} +\item[grid] contains the variables that describe the various grid + constants related to locations, lengths, areas, etc. +\item[state] contains the variables output at the snapshot or + \texttt{dumpFreq} time frequency +\item[pickup.ckptA, pickup.ckptB] are the ``rolling'' checkpoint files +\item[tave] contains the time-averaged quantities from the main model +\end{description} + +All MNC output is currently done in a ``file-per-tile'' fashion since +most NetCDF v3.x implementions cannot write safely within MPI or +multi-threaded environments. This tiling is done in a global fashion +and the tile numbers are appended to the base names as described +above. Some scripts to manipulate MNC output are available at +\texttt{MITgcm/utils/matlab/} which includes a spatial ``assembly'' +script called \texttt{MITgcm/utils/matlab/mnc\_assembly.m}. + +More general manipulations can be performed on netCDF files with +\begin{rawhtml} \end{rawhtml} +\begin{verbatim} +the NetCDF Operators (``NCO'') +at http://nco.sourceforge.net +\end{verbatim} +\begin{rawhtml} \end{rawhtml} +or with +\begin{rawhtml} \end{rawhtml} +\begin{verbatim} +the Climate Data Operators (``CDO'') +at http://www.mpimet.mpg.de/~cdo/ +\end{verbatim} +\begin{rawhtml} \end{rawhtml} + +Unlike the older MDSIO routines, MNC reads and writes variables on +different ``grids'' depending upon their location on, for instance, an +Arakawa C--grid. The following table provides examples: +\begin{center} + {\footnotesize + \begin{tabular}[htb]{|l|c|c|c|}\hline + \textbf{Name} & \textbf{C--grid location} & + \textbf{\# in X} & \textbf{\# in Y} \\\hline + Temperature & mass & \texttt{sNx} & \texttt{sNy} \\ + Salinity & mass & \texttt{sNx} & \texttt{sNy} \\ + U velocity & U & \texttt{sNx+1} & \texttt{sNy} \\ + V velocity & V & \texttt{sNx} & \texttt{sNy+1} \\ + Vorticity & vorticity & \texttt{sNx+1} & \texttt{sNy+1} \\\hline + \end{tabular} + } +\end{center} +and the intent is two--fold: +\begin{enumerate} +\item For some grid topologies it is impossible to output all + quantities using only \texttt{sNx,sNy} arrays for every tile. Two + examples of this failure are the missing corners problem for + vorticity values on the cubesphere and the velocity edge values for + some open--boundary domains. +\item Writing quantities located on velocity or vorticity points with + the above scheme introduces a very small data redundancy. However, + any slight inconvenience is easily offset by the ease with which one + can, on every individual tile, interpolate these values to mass + points without having to perform an ``exchange'' (or + ``halo-filling'') operation to collect the values from neighboring + tiles. This makes the most common post--processing operations much + easier to implement. +\end{enumerate} + + +\subsection{MNC Troubleshooting} + +\subsubsection{Build Troubleshooting} + +In order to build MITgcm with MNC enabled, the netCDF v3.x Fortran-77 +(not Fortran-90) library must be available. This library is compposed +of a single header file (called \texttt{netcdf.inc}) and a single +library file (usually called \texttt{libnetcdf.a}) and it must be +built with the same compiler (or a binary-compatible compiler) with +compatible compiler options as the one used to build MITgcm. + +For more details concerning the netCDF build and install process, +please visit the netCDF home page at: +\begin{rawhtml} \end{rawhtml} +\begin{verbatim} +http://www.unidata.ucar.edu/packages/netcdf/ +\end{verbatim} +\begin{rawhtml} \end{rawhtml} +which includes an extensive list of known--good netCDF configurations +for various platforms + +\subsubsection{Runtime Troubleshooting} + +Please be aware of the following: + +\begin{itemize} +\item As a safety feature, the MNC package does not, by default, allow + pre-existing files to be appended to or overwritten. This is in + contrast to the older MDSIO package which will, without any warning, + overwrite existing files. If MITgcm aborts with an error message + about the inability to open or write to a netCDF file, please check + \textbf{first} whether you are attempting to overwrite files from a + previous run. + +\item The constraints placed upon the ``unlimited'' (or ``record'') + dimension inherent with NetCDF v3.x make it very inefficient to put + variables written at potentially different intervals within the same + file. For this reason, MNC output is split into groups of files + which attempt to reflect the nature of their content. + +\item On many systems, netCDF has practical file size limits on the + order of 2--4GB (the maximium memory addressable with 32bit pointers + or pointer differences) due to a lack of operating system, compiler, + and/or library support. The latest revisions of netCDF v3.x have + large file support and, on some operating systems, file sizes are + only limited by available disk space. + +\item There is an 80 character limit to the total length of all file + names. This limit includes the directory (or path) since paths and + file names are internally appended. Generally, file names will not + exceed the limit and paths can usually be shortened using, for + example, soft links. + +\item MNC does not (yet) provide a mechanism for reading information + from a single ``global'' file as can be done with the MDSIO + package. This is in progress. +\end{itemize} + -\subsection{Introduction} +\subsection{MNC Internals} The \texttt{mnc} package is a two-level convenience library (or ``wrapper'') for most of the NetCDF Fortran API. Its purpose is to @@ -66,9 +406,7 @@ \end{description} -\subsection{Using MNC} - -\subsubsection{Grid--Types and Variable--Types} +\subsubsection{MNC Grid--Types and Variable--Types} As a convenience for users, the MNC package includes numerous routines to aid in the writing of data to NetCDF format. Probably the biggest @@ -127,7 +465,7 @@ and writing variables. -\subsubsection{An Example} +\subsubsection{Using MNC: Examples} Writing variables to NetCDF files can be accomplished in as few as two function calls. The first function call defines a variable type, @@ -175,7 +513,7 @@ \begin{verbatim} C Write dynvars using the MNC package CALL MNC_CW_SET_UDIM('state', -1, myThid) - CALL MNC_CW_RL_W('I','state',0,0,'iter', myIter, myThid) + CALL MNC_CW_I_W('I','state',0,0,'iter', myIter, myThid) CALL MNC_CW_SET_UDIM('state', 0, myThid) CALL MNC_CW_RL_W('D','state',0,0,'model_time',myTime, myThid) CALL MNC_CW_RL_W('D','state',0,0,'U', uVel, myThid) @@ -183,45 +521,31 @@ \end{verbatim} } +While it is easiest to write variables within typical 2D and 3D fields +where all data is known at a given time, it is also possible to write +fields where only a portion (\textit{eg.} a ``slab'' or ``slice'') is +known at a given instant. An example is provided within +\filelink{pkg/mom\_vecinv/mom\_vecinv.F}{pkg-mom_vecinv-mom_vecinv.F} +where an offset vector is used: {\footnotesize +\begin{verbatim} + IF (useMNC .AND. snapshot_mnc) THEN + CALL MNC_CW_RL_W_OFFSET('D','mom_vi',bi,bj, 'fV', uCf, + & offsets, myThid) + CALL MNC_CW_RL_W_OFFSET('D','mom_vi',bi,bj, 'fU', vCf, + & offsets, myThid) + ENDIF +\end{verbatim} +} +to write a 3D field one depth slice at a time. -\subsubsection{Parameters} - -All the MNC parameters are contained within a file named -\texttt{data.mnc}. If this file does not exist, then the MNC package -will interpret that as an indication that it is not to be used. If -the \texttt{data.mnc} does exist, then it may contain the following -parameters: - -\begin{center} - {\footnotesize - \begin{tabular}[htb]{|l|l|l|l|}\hline - & & & \\ - \textbf{Name} & \textbf{Type} & - \textbf{Default} & \textbf{Description} \\\hline - & & & \\ - \texttt{useMNC} & Logical & \texttt{.FALSE.} & - \textbf{overall MNC ON/OFF switch} \\ - \texttt{mnc\_echo\_gvtypes} & Logical & \texttt{.FALSE.} & - echo pre-defined ``types'' to STDOUT? \\ - \texttt{mnc\_use\_outdir} & Logical & \texttt{.FALSE.} & - create a directory for output? \\ - \texttt{mnc\_outdir\_str} & String & \texttt{'mnc\_'} & - output directory name \\ - \texttt{mnc\_outdir\_date} & Logical & \texttt{.FALSE.} & - embed date in output directory name? \\ - \texttt{mnc\_pickup\_write} & Logical & \texttt{.FALSE.} & - use MNC to write (create) pickup files? \\ - \texttt{mnc\_pickup\_read} & Logical & \texttt{.FALSE.} & - use MNC to read pickup files? \\ - \texttt{mnc\_use\_indir} & Logical & \texttt{.FALSE.} & - use a directory (path) for input? \\ - \texttt{mnc\_indir\_str} & String & \texttt{''} & - input directory (or path) name \\ - \texttt{mnc\_use\_for\_mon} & Logical & \texttt{.FALSE.} & - write \texttt{monitor} output using MNC? \\\hline - \end{tabular} - } -\end{center} - -%\subsection{Package Reference} +Each element in the offset vector corresponds (in order) to the +dimensions of the ``full'' (or virtual) array and specifies which are +known at the time of the call. A zero within the offset array means +that all values along that dimension are available while a positive +integer means that only values along that index of the dimension are +available. In all cases, the matrix passed is assumed to start (that +is, have an in-memory structure) coinciding with the start of the +specified slice. Thus, using this offset array mechanism, a slice +can be written along any single dimension or combinations of +dimensions.