MITgcm Development HOWTO

Ed Hill III <eh3@mit.edu>

Revision 0.01, 2003-08-07, eh3: Initial version.

This document describes how to develop software for the MITgcm project.
Introduction

The purpose of this document is to help new developers get "up to speed" with MITgcm development.

New Versions of This Document

You can obtain the latest version of this document online in various formats.

Feedback and Corrections

If you have questions or comments about this document, please feel free to contact the authors.

Background

User Manual

Before jumping into development, please familiarize yourself with the MITgcm user manual, which is available on the main web page. This document contains volumes of useful information and is included here by reference. A "snapshot" or development version of the user manual may also be available, though this is only put on the web for testing purposes.

Prerequisites

To develop for the MITgcm project you will need a UNIX or UNIX-like set of build tools, including the following:
  - CVS client
  - make or (preferably) GNU make
  - FORTRAN compiler
  - C compiler
  - [ba]sh and [t]csh shells
  - PERL
  - LaTeX and LaTeX2HTML
Essentially all of the work described here has been tested on recent versions of Red Hat Linux (e.g., 7.3 through 9). Except where noted, all shell commands are given in bash syntax.
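A quick sanity check for these prerequisites can be scripted. This is only a sketch: the command names probed below (e.g. f77, latex2html) are typical ones and may differ on your system.

```shell
# Report which of the required build tools are on the PATH.
# The exact command names (f77, gmake, ...) vary between systems.
for tool in cvs make gmake f77 cc bash csh perl latex latex2html; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

Any tool reported as missing should be installed before attempting the builds described later in this document.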
CVS Repository Layout

Unlike many open source projects, the MITgcm CVS tree does not follow a simple "src", "docs", "share", and "test" directory layout. Instead, there are multiple higher-level directories that each, to some extent, depend upon the presence of the others. The tree currently resembles:

  gcmpack/
    MITgcm-contrib          contributed code
      CS-regrid             goes into utils
    cvspolicy.html          -save-
    CVSROOT                 -save-
    development             experimental stuff
    manual                  -save-
    misc                    -?-
    MITgcm                  code
      adjoint               fold into genmake
      bin                   stub for ecco build
      compare01             old from 20th century
      diags                 timeave f77 in pkgs now
      doc                   tags -- connect to real docs?
      eesupp                cnh?
      exe                   ecco user build              *-
      jobs                  runtime shell scripts for     |
                            various platforms             |
      lsopt                 line search                  m|
      model                 main dynamics (core)         e|
      optimization_drivers  ?                            r|
      optim                 line search interface        g|
      pkg                   alternate and optional       e|
                            numerics, etc.               *-
      tools                 ?                            *-
      tutorial_examples     documented tests;             |
                            only populated on release1    |
                            branch and not validated      |
                            during "testscript"           |
      utils                                              *-
      verification          std tests
    mitgcmdoc -> manual     -remove-
    mitgcm.org              build web site
    models                  -?-
    packages                -?-
    preprocess              -?-
    tmp                     -?-

Efforts are underway to reduce this complexity.

Branches

As shown in the online ViewCVS-generated tree, the MITgcm codebase is split into two branches or "lines" under which development proceeds. These two lines are referred to as the "MAIN" and "ecco" versions of the code. While not identical, the bulk of the MAIN and ecco lines are composed of files from the same codebase.

Periodically, a "Release" branch is formed from the "MAIN" development branch. This is done in order to create a relatively stable reference point for both users and developers. The intent is that once a release branch has been created, only bug-fixes will be added to it. Meanwhile, development (which might "break" or otherwise render invalid the documentation, tutorials, and/or examples contained within a release branch) is allowed to continue along the MAIN and ecco lines.
Tagging

The intent of tagging is to create "known-good" checkpoints that developers can use as references. Traditionally, MITgcm tagging has maintained the following conventions:

1. A developer checks out code into a local CVS-managed directory, makes various changes/additions, tests these edits, and eventually reaches a point where (s)he is satisfied that the changes form a new "useful" point in the evolution of the code.

2. The developer then runs the testscript shell script to see if any problems are introduced. While not intended to be exhaustive, the test cases within the verification directory do provide some indication of whether gross errors have been introduced.

3. Having satisfied him- or herself that the changes are ready to be committed to the CVS repository, the developer then:

   - adds a "checkpointXY_pre" comment (where X is a checkpoint number and Y is a letter) to the tag-index file and checks it into the CVS repository

   - submits the set of changes to the CVS repository and adds comments to tag-index describing what the changes are, along with a matching "checkpointXY_post" entry

The result of this tagging procedure is a sequence of development checkpoints with comments, which resembles:

  checkpoint50e_post
  o make KPP work with PTRACERS
    - fix gad_calc_rhs to call new routine kpp_transport_ptr, which is
      nearly a copy of kpp_transport_s
    - there is no analogue to SurfaceTendencyS, so I have to use
      gPtr (of the surface layer) instead
  o add a new platform SunFire+mpi (SunFire 15000) to genmake
  checkpoint50e_pre

  checkpoint50d_post
  o change kpp output from multiple-record state files to single-record
    state files analogous to write_state.F
  o reduce the output frequency of cg3d-related stuff to the monitor
    frequency, analogous to the cg2d-related output
  o fix a small problem in ptracers_write_checkpoint.F: len(suff)=512,
    so that writing to internal file fn (with length 512) fails
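The tagging procedure above can be sketched as a short shell session. The checkpoint name used here is hypothetical (the real number and letter come from the existing entries in tag-index), and the cvs steps are shown only as comments:

```shell
# Hypothetical next checkpoint name; take the real one from tag-index.
CKPT="checkpoint50f"

# 1. Run the verification suite before committing:
#      ./testscript
# 2. Add a ${CKPT}_pre comment to tag-index and check it in:
#      cvs commit -m "${CKPT}_pre" tag-index
# 3. Commit the changes, then describe them in tag-index under a
#    matching ${CKPT}_post entry and check that in too:
#      cvs commit -m "${CKPT}_post" tag-index

# Generate a skeleton tag-index entry in the format shown above:
printf '%s\n' \
    "${CKPT}_post" \
    "o describe the first change here" \
    "o describe the second change here" \
    "${CKPT}_pre"
```

The printed skeleton follows the post/pre bracketing convention of the checkpoint log excerpts shown above.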
  checkpoint50d_pre

This information can be used to refer to various stages of the code development. For example, bugs can be traced to individual sets of CVS check-ins based upon their first appearance when comparing the results from different checkpoints.

Editing the Documentation

Getting the Docs and Code

The first step towards editing the documentation is to check out a copy of the code, docs, and build scripts from the CVS server using:

  $ export CVS_RSH=ssh
  $ export CVSROOT=':ext:auden.lcs.mit.edu:/u/u3/gcmpack'
  $ mkdir scratch
  $ cvs co MITgcm manual mitgcm.org

These commands extract the necessary information from the CVS server and create a temporary directory (called scratch) for the storage of the HTML and other files that will be created. Please note that you must either create scratch as shown or edit the various Makefiles and scripts used to create the documentation.

Editing

The documentation is contained in the manual directory in raw LaTeX format. The main document is manual.tex, and it uses \input{}s to include the chapters and subsections. Since the same LaTeX source is used to produce PostScript, PDF, and HTML output, care should be taken to follow certain conventions. Two of the most important are the usage of the \filelink{}{} and \varlink{}{} commands. Both of these commands have been defined to simplify the connection between the automatically generated ("code browser") HTML and the HTML version of the manual produced by LaTeX2HTML. They each take two arguments (corresponding to the contents of the two sets of curly braces): the text that the author wishes to be "wrapped" within the link, and a specially formatted link that is relative to the MITgcm directory within the CVS tree.
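The link argument can be generated mechanically from a file's absolute path: the leading components up to and including "MITgcm" are dropped, and each remaining "/" becomes a "-". A small sketch with sed, using the hypothetical /foo/MITgcm path from this document:

```shell
# Turn an absolute source path into the link form used by \filelink{}{}:
# drop everything up to and including the "MITgcm/" component, then
# replace each remaining "/" with "-".
path="/foo/MITgcm/path/to/the/file_name.F"
echo "$path" | sed -e 's|.*MITgcm/||' -e 's|/|-|g'
# -> path-to-the-file_name.F
```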
The result is a command that resembles either a reference to a variable or subroutine name, such as \varlink{tRef}{tRef}, or a reference to a file, such as \filelink{tRef}{path-to-the-file_name.F}, where the absolute path to the file is of the form /foo/MITgcm/path/to/the/file_name.F. (Please note how the leading "/foo/MITgcm" component of the path is dropped, leaving the path relative to the head of the code directory, and how each directory separator "/" is turned into a "-".)

Building

Given the directory structure of the checkout described above, the entire documentation for the web site can be built using:

  $ cd mitgcm.org/devel/buildweb
  $ make All

This builds the PDF from the LaTeX source, creates the HTML output from the LaTeX source, parses the FORTRAN code base to produce a hyperlinked HTML version of the source, and then determines the cross-linking between the various HTML components. If there are no errors, the result of the build process (which can take 30+ minutes on a P4/2.5GHz) will be contained within a single directory called scratch/dev_docs. This is a freshly built version of the entire on-line users manual. If you have the correct permissions, it can be directly copied to the web server area:

  $ mv scratch/dev_docs /u/u0/httpd/html

and the update is complete.

Coding

Packages

Optional parts of the code have been separated from the MITgcmUV core driver code and organised into packages. The packaging structure provides a mechanism for maintaining suites of code, specific to particular classes of problems, in a way that is cleanly separated from the generic fluid dynamical engine. The MITgcmUV packaging structure is described below using generic package names ${pkg}; a concrete example of a package is the code for implementing GM/Redi mixing, which uses the package name gmredi.

Chris's Notes...

MITgcmUV Packages
=================

Optional parts of code are separated from the MITgcmUV core driver code and organised into packages.
The packaging structure provides a mechanism for maintaining suites of code, specific to particular classes of problem, in a way that is cleanly separated from the generic fluid dynamical engine. The MITgcmUV packaging structure is described below using generic package names ${pkg}. A concrete example of a package is the code for implementing GM/Redi mixing. This code uses the package names

 * ${PKG} = GMREDI
 * ${pkg} = gmredi
 * ${Pkg} = gmRedi

Package states
==============

Packages can be in any one of four states: included, excluded, enabled, or disabled, as follows:

 included(excluded)  compile-time state which includes(excludes) package
                     code and routine calls from compilation/linking, etc.

 enabled(disabled)   run-time state which enables(disables) package code
                     execution.

Every call to a ${pkg}_... routine from outside the package should be placed within both an "#ifdef ALLOW_${PKG} ... #endif" block and an "if ( use${Pkg} ) then ... endif" block. Package states are generally not expected to change during a model run.

Package structure
=================

o Each package gets its runtime configuration parameters from a file named "data.${pkg}". Package runtime configuration options are imported into a common block held in a header file called "${PKG}.h".

o The core driver part of the model can check for runtime enabling or disabling of individual packages through the logical flags use${Pkg}. This information is loaded from a global package setup file called "data.pkg". The use${Pkg} flags are not used within individual packages.

o Included in "${PKG}.h" is a logical flag called ${Pkg}IsOn. The "${PKG}.h" header file can be imported by other packages to check dependencies and requirements from other packages (see the "Package Boot Sequence" section). NOTE: This procedure is not presently implemented, neither for kpp nor for gmRedi.

CPP Flags
=========

1. Within the core driver code, flags of the form ALLOW_${PKG} are used to include or exclude whole packages.
The ALLOW_${PKG} flags are included from a PKG_CPP_OPTIONS block which is currently held in-line in the CPP_OPTIONS.h header file, e.g.

  Core model code
  .....
  #include "CPP_OPTIONS.h"
    :
  #ifdef ALLOW_${PKG}
    if ( use${Pkg} ) CALL ${PKG}_DO_SOMETHING(...)
  #endif

2. Within an individual package, a header file, "${PKG}_OPTIONS.h", is used to set CPP flags specific to that package. It is not recommended to include this file in "CPP_OPTIONS.h".

Package Boot Sequence
=====================

Calls to package routines within the core code timestepping loop can vary. However, all packages follow a required "boot" sequence outlined here:

  1. S/R PACKAGES_BOOT()
       CALL OPEN_COPY_DATA_FILE( 'data.pkg', 'PACKAGES_BOOT', ... )

  2. S/R PACKAGES_READPARMS()
       #ifdef ALLOW_${PKG}
         if ( use${Pkg} )
       &    CALL ${PKG}_READPARMS( retCode )
       #endif

  3. S/R PACKAGES_CHECK()
       #ifdef ALLOW_${PKG}
         if ( use${Pkg} )
       &    CALL ${PKG}_CHECK( retCode )
       #else
         if ( use${Pkg} )
       &    CALL PACKAGES_CHECK_ERROR('${PKG}')
       #endif

  4. S/R PACKAGES_INIT()
       #ifdef ALLOW_${PKG}
         if ( use${Pkg} )
       &    CALL ${PKG}_INIT( retCode )
       #endif

Description
===========

- ${PKG}_READPARMS() is responsible for reading in the package parameters file data.${pkg} and storing the package parameters in "${PKG}.h".
  -> called in INITIALISE_FIXED

- ${PKG}_CHECK() is responsible for validating basic package setup and inter-package dependencies. ${PKG}_CHECK can import other package parameters that it may need to check; this is done through the header files "${PKG}.h". It is assumed that parameters owned by other packages will not be reset during ${PKG}_CHECK().
  -> called in INITIALISE_FIXED

- ${PKG}_INIT() is responsible for completing the internal setup of a package. This routine is called after the core model state has been completely initialised but before the core model timestepping starts.
  -> called in INITIALISE_VARIA

Summary
=======

- CPP options:
    * ALLOW_${PKG}         include/exclude package for compilation

- FORTRAN logicals:
    * use${Pkg}            enable package for execution at runtime
                           -> declared in PARAMS.h
    * ${Pkg}IsOn           for package cross-dependency check
                           -> declared in ${PKG}.h
                           N.B.: not presently used!

- Header files:
    * ${PKG}_OPTIONS.h     further package-specific CPP options
    * ${PKG}.h             package-specific common block variables, fields

- FORTRAN source files:
    * ${pkg}_readparms.F   reads parameters from file data.${pkg}
    * ${pkg}_check.F       checks package dependencies and consistencies
    * ${pkg}_init.F        initialises package-related fields
    * ${pkg}_... .F        other package source code

- Parameter file:
    * data.${pkg}          parameter file
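Following the naming conventions in this summary, the minimal file set for a new package can be laid out as sketched below. The package name "mypkg" and the pkg/ location are purely illustrative:

```shell
pkg=mypkg                      # hypothetical package name
mkdir -p pkg/$pkg

# Header files: package-specific CPP options and common block variables
touch pkg/$pkg/MYPKG_OPTIONS.h pkg/$pkg/MYPKG.h

# FORTRAN entry points required by the package boot sequence
touch pkg/$pkg/${pkg}_readparms.F \
      pkg/$pkg/${pkg}_check.F \
      pkg/$pkg/${pkg}_init.F

ls pkg/$pkg
```

A real package would also need its runtime parameter file data.${pkg} in the experiment directory, plus the ALLOW_${PKG} and use${Pkg} hooks in the core code described above.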