Diff of /MITgcm_contrib/llc_hires/llc_90/code-async/readme.txt

-revision 1.1 by dimitri,
Tue Oct  3 00:09:12 2017 UTC
+revision 1.5 by dimitri,
Wed Mar  6 16:54:57 2019 UTC
 Line 1
- Right now some sizes need to be configured manually:
- recvTask.c   lines 79-82
- #define NUM_X   4320
- #define NUM_Y   56160L                     // get rid of this someday
- #define NUM_Z   90
- #define MULTDIM  7
- and
- readtile_mpiio.c    lines 115-119
-     facetElements1D = 4320;
-     tileSizeX = 72;
-     tileSizeY = 72;
-     xGhosts = 8;
-     yGhosts = 8;
+ From Bron Nelson, February 28, 2019
+ In the newest version, it is no longer necessary to hand-edit the
+ constants in "recvTask.c" and "readtile_mpiio.c".  Instead, the file
+ "SIZE.h" has been modified in two ways:
+   (1) SIZE.h  now includes the constant "sFacet"
+   (2) SIZE.h  may now be #included in both C and Fortran files
+ This means that "recvTask.c" and "readtile_mpiio.c" now get the
+ information they need directly from "SIZE.h", so the magic constants
+ for the run only need to be edited in one place (namely, SIZE.h).
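As a rough illustration of what "deriving the constants from SIZE.h" could look like on the C side, the sketch below rebuilds the two old hand-edited values (4320 and 56160) from a single facet size. It is not the actual code in recvTask.c: the fallback definition of sFacet stands in for the real SIZE.h, and NUM_FACET_TILES, num_x and num_y are hypothetical names. The factor 13 is an assumption read off the old constants (56160 / 4320).

```c
#include <stdint.h>

/* Stand-in for what SIZE.h would supply. 4320 is the facet size that
 * appears in the removed hand-edited constants. */
#ifndef sFacet
#define sFacet 4320
#endif

/* Assumption: the old NUM_Y (56160) is 13 times the old NUM_X (4320). */
#define NUM_FACET_TILES 13

/* Hypothetical helpers: global X and Y extents derived from sFacet. */
static int64_t num_x(void) { return (int64_t)sFacet; }
static int64_t num_y(void) { return (int64_t)NUM_FACET_TILES * sFacet; }
```

With sFacet left at 4320, num_x() gives 4320 and num_y() gives 56160, matching the values that previously had to be typed in by hand.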
  One tile per rank is recommended, mostly for pickup input performance,
- but it is not strictly necessary.
+ but it is not strictly necessary.  A minimum of one full node of I/O
+ ranks is required.  The async I/O does allocate whole nodes to be
+ either an I/O node, or a compute node.  It is permitted for the last
+ *compute* node to have a "ragged edge", i.e., have fewer MPI processes
+ on it than the other nodes do.  But the I/O nodes are all "full size".
+ The other minimum value the I/O code requires is that there must
+ be at least one core for each field you want to write, e.g., if you
+ are dumping 20 different fields, there must be at least 20 cores
+ allocated to the I/O.  Note that the 20 (or whatever) number is
+ *aggregate* across all the I/O nodes, NOT a "per node" number.
+ Another important constraint is that the total memory on all the I/O
+ nodes *collectively* needs to be twice as big as the largest epoch you
+ write.  So, if you are writing a 1.5 TB pickup dump, then you should
+ have a sum total of 3 TB of memory (or more) on the set of I/O nodes.
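The three minimums above (at least one full I/O node, at least one I/O core per field in aggregate, and collective I/O memory of at least twice the largest epoch) can be checked before submitting a job. The following is only a sketch: check_io_layout and its parameters are hypothetical and not part of the MITgcm code, and it assumes every I/O node has the same number of cores and the same amount of memory.

```c
#include <stdint.h>

/* Hypothetical pre-flight check for an async I/O layout.
 * Returns 0 if all three minimums hold, otherwise the number of the
 * first rule that fails:
 *   1 = fewer than one full I/O node
 *   2 = fewer I/O cores (summed over all I/O nodes) than fields
 *   3 = total I/O memory below twice the largest epoch */
int check_io_layout(int io_nodes, int cores_per_node, int num_fields,
                    int64_t largest_epoch_bytes,
                    int64_t mem_per_io_node_bytes)
{
    if (io_nodes < 1 || cores_per_node < 1)
        return 1;

    /* The field count is compared with the aggregate, not per node. */
    int64_t io_cores = (int64_t)io_nodes * cores_per_node;
    if (io_cores < num_fields)
        return 2;

    /* Memory is summed over the whole set of I/O nodes. */
    int64_t total_mem = (int64_t)io_nodes * mem_per_io_node_bytes;
    if (total_mem < 2 * largest_epoch_bytes)
        return 3;

    return 0;
}
```

For the 1.5 TB pickup mentioned above and made-up nodes with 384 GB each, check_io_layout(8, 28, 20, 1500000000000LL, 384000000000LL) passes, while the same call with 7 nodes fails rule 3 because 7 x 384 GB falls short of 3 TB.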
  Choose dumpFreq and pChkptFreq as usual. We're not set up
  to do the rolling checkpoints yet. It'll dump u,v,t, and etan now -
  send me a list of other fields you want, as it is rather involved
  to change them. But this should be enough to see if it works.
+ Set run-time parameter: useSingleCPUio=.FALSE.
+ Only a couple of files are different from previous version.
+ But note in particular that  "SIZE.h"  is a new file in that directory,
+ and "recvTask.c" has a huge number of changes.
+ The input scheme implemented here is only invoked on
+ the 64bit pickup files.  It is specific to the LLC decomposition and will
+ not work on e.g. the Monterey high-res simulations we did a couple years
+ ago.  (Although the code should work for any facet size specified
+ in SIZE.h.)  The format of SIZE.h was changed so that it can be included
+ in both C and Fortran files, and I also added the "sFacet" constant that
+ specifies the base facet size (e.g. 1080).  So SIZE.h will probably look
+ kinda weird at first, but shouldn't be hard to figure out.  The major
+ advantage is that now you no longer need to edit any magic constants in
+ recvTask.c and readtile_mpiio.c - they now derive the info they need by
+ directly including SIZE.h.
+ The code now automatically figures out how many ranks are running per node.
+ You can run with whatever number of ranks per node that you want, but the
+ number needs to be consistent for all nodes (except possibly the last node,
+ which can be short).
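To see what a given choice of ranks per node implies, the arithmetic can be sketched as below. This is not how recvTask.c detects the layout (the note does not say); split_ranks and node_layout are hypothetical names used only to show how a total rank count divides into full nodes plus an optional short last node.

```c
/* Hypothetical description of how a job's ranks fall onto nodes. */
typedef struct {
    int full_nodes;       /* nodes carrying exactly ranks_per_node ranks */
    int last_node_ranks;  /* ranks on the short last node, 0 if none */
} node_layout;

/* Split total_ranks into full nodes and a possibly short last node.
 * Assumes ranks_per_node > 0. */
node_layout split_ranks(int total_ranks, int ranks_per_node)
{
    node_layout layout;
    layout.full_nodes = total_ranks / ranks_per_node;
    layout.last_node_ranks = total_ranks % ranks_per_node;
    return layout;
}
```

For example, 100 ranks at 28 per node give three full nodes and a last node with 16 ranks. Since the I/O nodes must all be full size, the short node has to be a compute node.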
+ The initial burst of output generated by recvTask.c (the "map" describing
+ the way the I/O processes are allocated) is now somewhat longer and more
+ detailed, but can continue to be ignored.
+ I did NOT try to cure the "integer" problem.  It seems that the code is
+ getting fairly close to bumping into the 2G (i.e. 2^31) limit on numbers
+ that fit into a default integer.  I *think* you can probably do one more
+ doubling of the resolution (to 8640), but I'm also pretty sure that going
+ past that will break the code.
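The note does not say which quantity is approaching 2^31. One count that happens to match the estimate is the number of horizontal points in a single level, taken here as 13 * sFacet * sFacet. That formula is an assumption based on the old constants (NUM_X = 4320, NUM_Y = 56160 = 13 * 4320), and level_points and fits_default_int are hypothetical names.

```c
#include <stdint.h>

/* Assumed count of horizontal points in one level: 13 * facet^2.
 * Computed in 64 bits so the result itself cannot overflow here. */
int64_t level_points(int64_t facet)
{
    return 13 * facet * facet;
}

/* Nonzero if n fits in a signed 32-bit ("default") integer. */
int fits_default_int(int64_t n)
{
    return n <= INT32_MAX;
}
```

Under that assumption a facet size of 4320 gives 242,611,200 points and 8640 gives 970,444,800, both below 2^31, while 17280 gives 3,881,779,200, which no longer fits.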
