From Bron Nelson, February 28, 2019

In the newest version, it is no longer necessary to hand-edit the
constants in "recvTask.c" and "readtile_mpiio.c". Instead, the file
"SIZE.h" has been modified in two ways:
(1) SIZE.h now includes the constant "sFacet"
(2) SIZE.h may now be #include'd in both C and Fortran files
This means that "recvTask.c" and "readtile_mpiio.c" now get the
information they need directly from "SIZE.h", so the magic constants
for the run only need to be edited in one place (namely, SIZE.h).

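The trick that lets one header serve both languages is the C
preprocessor, which MITgcm's .F sources already pass through. A
minimal sketch of the idea (not the actual file contents; values
are illustrative):

    /* Sketch of a dual-language header -- NOT the real SIZE.h.
     * Plain #define lines are visible both to the C compiler and to
     * the preprocessed Fortran (.F) sources:
     */
    #define sFacet 1080    /* base facet size for this run */
    #define sNx    30      /* tile dimensions (illustrative) */
    #define sNy    30
    /* Anything Fortran-only (PARAMETER statements, etc.) can be
     * fenced behind an #ifdef so the C compiler never sees it,
     * and vice versa.
     */
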
One tile per rank is recommended, mostly for pickup input performance,
but it is not strictly necessary. A minimum of one full node of I/O
ranks is required. The async I/O allocates whole nodes as either
I/O nodes or compute nodes. It is permitted for the last *compute*
node to have a "ragged edge", i.e., to have fewer MPI processes on it
than the other nodes do. But the I/O nodes are all "full size".

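To make the whole-node rule concrete, a rank's role could be decided
as in the sketch below. All names are mine, and the "I/O nodes come
first" layout is an assumption, not necessarily what the code does:

    /* sketch: every node is wholly I/O or wholly compute */
    static int isIORank(int myRank, int ranksPerNode, int numIONodes)
    {
        int myNode = myRank / ranksPerNode;  /* node this rank is on */
        return myNode < numIONodes;          /* assumed: I/O nodes first */
    }
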
The other minimum the I/O code imposes is that there must be at least
one core for each field you want to write; e.g., if you are dumping 20
different fields, there must be at least 20 cores allocated to the
I/O. Note that the 20 (or whatever) number is an *aggregate* across
all the I/O nodes, NOT a "per node" number.

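As a sketch of that constraint (names are mine): 2 I/O nodes with 16
ranks each give 32 aggregate I/O cores, which comfortably covers 20
fields even though neither node has 20 cores by itself.

    #include <assert.h>
    /* sketch: the field/core minimum is aggregate, not per node */
    static void checkFieldCores(int numIONodes, int ranksPerNode,
                                int numFields)
    {
        assert(numIONodes * ranksPerNode >= numFields);
    }
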
Another important constraint is that the total memory on all the I/O
nodes *collectively* needs to be twice as big as the largest epoch you
write. So, if you are writing a 1.5 TB pickup dump, then you should
have a sum total of 3 TB of memory (or more) on the set of I/O nodes.

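(For instance, with hypothetical 192 GB nodes, that 3 TB floor means
at least 16 I/O nodes: 16 x 192 GB = 3,072 GB.)
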
Choose dumpFreq and pChkptFreq as usual. We're not set up to do the
rolling checkpoints yet. It'll dump u, v, t, and etan now; send me a
list of other fields you want, as it is rather involved to change
them. But this should be enough to see if it works.

Set run-time parameter: useSingleCPUio=.FALSE.

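In the "data" namelist file this should look something like the
sketch below (frequencies are illustrative, in seconds; I believe
useSingleCPUio belongs in PARM01 and the frequencies in PARM03, but
check against your existing file):

     &PARM01
      useSingleCPUio=.FALSE.,
     &

     &PARM03
      dumpFreq=86400.,
      pChkptFreq=864000.,
     &
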
Only a couple of files are different from the previous version. But
note in particular that "SIZE.h" is a new file in that directory, and
"recvTask.c" has a huge number of changes.

The input scheme implemented here is only invoked on the 64-bit pickup
files. It is specific to the LLC decomposition and will not work on,
e.g., the Monterey high-res simulations we did a couple of years ago
(although the code should work for any facet size, as specified in
SIZE.h). The format of SIZE.h was changed so that it can be included
in both C and Fortran files, and I also added the "sFacet" constant,
which specifies the base facet size (e.g. 1080). So SIZE.h will
probably look kinda weird at first, but shouldn't be hard to figure
out. The major advantage is that you no longer need to edit any magic
constants in recvTask.c and readtile_mpiio.c; they now derive the info
they need by directly including SIZE.h.

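As a rough illustration of what the C side can now compute for itself
(my sketch, not code from recvTask.c; it uses the usual LLC accounting
of thirteen sFacet-by-sFacet squares):

    #include "SIZE.h"
    /* sketch: global 2D point count derived from sFacet instead of
     * a hand-edited magic constant */
    static long long llcPoints2D(void)
    {
        return 13LL * sFacet * sFacet;   /* e.g. 13 * 1080 * 1080 */
    }
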
The code now automatically figures out how many ranks are running per
node. You can run with whatever number of ranks per node you want, but
the number needs to be consistent for all nodes (except possibly the
last node, which can be short).

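One standard MPI-3 way to make that determination is shown below; this
is a sketch of the technique, not necessarily what recvTask.c actually
does:

    #include <mpi.h>
    /* sketch: count the ranks sharing this node's memory */
    static int ranksOnMyNode(void)
    {
        MPI_Comm nodeComm;
        int n;
        MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED,
                            0, MPI_INFO_NULL, &nodeComm);
        MPI_Comm_size(nodeComm, &n);
        MPI_Comm_free(&nodeComm);
        return n;
    }
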
The initial burst of output generated by recvTask.c (the "map"
describing the way the I/O processes are allocated) is now somewhat
longer and more detailed, but can continue to be ignored.

I did NOT try to cure the "integer" problem. It seems that the code is
getting fairly close to bumping into the 2G (i.e. 2^31) limit on
numbers that fit into a default integer. I *think* you can probably do
one more doubling of the resolution (to 8640), but I'm also pretty
sure that going past that will break the code.
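
To put numbers on that (my arithmetic, using the thirteen-squares
point count from the sketch above):

    13 * 8640^2  =   970,444,800  <  2^31 - 1 = 2,147,483,647  (fits)
    13 * 17280^2 = 3,881,779,200  >  2^31 - 1                  (overflows)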