From Bron, February 28, 2019

In the newest version, it is no longer necessary to hand-edit the
constants in "recvTask.c" and "readtile_mpiio.c". Instead, the file
"SIZE.h" has been modified in two ways:
  (1) SIZE.h now includes the constant "sFacet"
  (2) SIZE.h may now be #included in both C and Fortran files
This means that "recvTask.c" and "readtile_mpiio.c" now get the
information they need directly from "SIZE.h", so the magic constants
for the run only need to be edited in one place (namely, SIZE.h).
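As a rough illustration of the single-source idea, the C sketch below
derives tile counts from "sFacet" instead of hard-coding them. The
#define lines are a hypothetical stand-in for what including SIZE.h
might provide; the names EX_SNX and EX_SNY, their values, and the helper
functions are assumptions made for this example, not the contents of
the real file.

```c
#include <assert.h>

/* Hypothetical stand-in for what '#include "SIZE.h"' might provide.
 * Plain preprocessor definitions are one way a header can be shared
 * between C and preprocessed Fortran; the real file may differ. */
#define sFacet 1080   /* base facet size (example value from the text) */
#define EX_SNX 90     /* hypothetical tile width in grid points  */
#define EX_SNY 90     /* hypothetical tile height in grid points */

/* Tiles along each edge of one sFacet x sFacet square. */
static int tiles_per_edge_x(void) { return sFacet / EX_SNX; }
static int tiles_per_edge_y(void) { return sFacet / EX_SNY; }

/* Tiles needed to cover one sFacet x sFacet square. */
static int tiles_per_square(void)
{
    /* The tile size must divide the facet size evenly. */
    assert(sFacet % EX_SNX == 0 && sFacet % EX_SNY == 0);
    return tiles_per_edge_x() * tiles_per_edge_y();
}
```

With these example values, changing the one sFacet definition is enough
to change every derived count; nothing else needs to be edited by hand.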

One tile per rank is recommended, mostly for pickup input performance,
but it is not strictly necessary. A minimum of one full node of I/O
ranks is required. The async I/O allocates whole nodes: each node is
either an I/O node or a compute node. It is permitted for the last
*compute* node to have a "ragged edge", i.e., to have fewer MPI processes
on it than the other nodes do. But the I/O nodes are all "full size".

The other minimum the I/O code requires is at least one core for each
field you want to write, e.g., if you are dumping 20 different fields,
there must be at least 20 cores allocated to the I/O. Note that the 20
(or whatever) number is *aggregate* across all the I/O nodes, NOT a
"per node" number.
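The two minimums above can be written down as a small pre-flight check.
This is only a sketch: the function name and its parameters are
hypothetical, it treats one "core" as one MPI rank, and it is not part
of recvTask.c.

```c
#include <stdbool.h>

/* Hypothetical helper: checks the two minimums described above.
 *   io_ranks        total MPI ranks given to I/O, summed over all I/O nodes
 *   ranks_per_node  MPI ranks on a full node
 *   num_fields      number of distinct fields to be dumped
 * Assumption: one "core" in the text corresponds to one MPI rank. */
static bool io_allocation_ok(int io_ranks, int ranks_per_node, int num_fields)
{
    if (ranks_per_node <= 0 || num_fields < 0) return false;
    /* I/O nodes are always full: at least one, and a whole number of them. */
    if (io_ranks < ranks_per_node) return false;
    if (io_ranks % ranks_per_node != 0) return false;
    /* One rank per field, counted in aggregate across all I/O nodes. */
    return io_ranks >= num_fields;
}
```

For example, 20 fields with 16 ranks per node need two I/O nodes (32
ranks): one node alone is a full node but has too few ranks in total.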

Choose dumpFreq and pChkptFreq as usual. We're not set up
to do the rolling checkpoints yet. It'll dump u, v, t, and etan for now;
send me a list of other fields you want, as it is rather involved
to change them. But this should be enough to see if it works.

Set run-time parameter: useSingleCPUio=.FALSE.

Only a couple of files are different from the previous version.
But note in particular that "SIZE.h" is a new file in that directory,
and "recvTask.c" has a huge number of changes.

The input scheme implemented here is only invoked on
the 64-bit pickup files. It is specific to the LLC decomposition and will
not work on, e.g., the Monterey high-res simulations we did a couple of
years ago. (Although the code should work for any facet size as specified
in SIZE.h.) The format of SIZE.h was changed so that it can be included
in both C and Fortran files, and I also added the "sFacet" constant that
specifies the base facet size (e.g. 1080). So SIZE.h will probably look
kinda weird at first, but shouldn't be hard to figure out. The major
advantage is that you no longer need to edit any magic constants in
recvTask.c and readtile_mpiio.c; they now derive the info they need by
directly including SIZE.h.

The code now automatically figures out how many ranks are running per node.
You can run with whatever number of ranks per node you want, but the
number needs to be consistent for all nodes (except possibly the last node,
which can be short).
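To see what a "short" last node means for a given job size, the sketch
below works out the compute-node layout from the number of compute ranks
and the ranks per node. The struct and function names are hypothetical
and the arithmetic is only an illustration; it is not how recvTask.c
detects the layout.

```c
/* Hypothetical description of the compute side of a run. */
struct compute_layout {
    int nodes;            /* compute nodes in use, including a short last one */
    int last_node_ranks;  /* MPI ranks placed on the last compute node */
};

/* Assumes compute_ranks > 0 and ranks_per_node > 0. Every node is full
 * except possibly the last, which holds whatever ranks are left over. */
static struct compute_layout layout_compute_nodes(int compute_ranks,
                                                  int ranks_per_node)
{
    struct compute_layout out;
    int rem = compute_ranks % ranks_per_node;
    out.nodes = compute_ranks / ranks_per_node + (rem != 0 ? 1 : 0);
    out.last_node_ranks = (rem != 0) ? rem : ranks_per_node;
    return out;
}
```

For instance, 100 compute ranks at 28 ranks per node occupy four nodes,
with 16 ranks on the last one.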

The initial burst of output generated by recvTask.c (the "map" describing
the way the I/O processes are allocated) is now somewhat longer and more
detailed, but can continue to be ignored.

I did NOT try to cure the "integer" problem. It seems that the code is
getting fairly close to bumping into the 2G (i.e. 2^31) limit on numbers
that fit into a default integer. I *think* you can probably do one more
doubling of the resolution (to 8640), but I'm also pretty sure that going
past that will break the code.
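The note does not say which quantity approaches 2^31. One candidate,
assumed here purely for illustration, is the number of horizontal grid
points in one level of an LLC grid, taken as 13 squares of sFacet x
sFacet points. The sketch computes that count in 64-bit arithmetic and
tests whether it would still fit in a 32-bit signed integer; the
function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Horizontal points in one level of an LLC grid, assuming the usual
 * layout of 13 squares of s_facet x s_facet points each.
 * Computed in 64 bits so the result itself cannot overflow here. */
static int64_t llc_points_per_level(int64_t s_facet)
{
    return 13 * s_facet * s_facet;
}

/* True if n can be stored in a 32-bit signed ("default") integer. */
static bool fits_default_integer(int64_t n)
{
    return n <= INT32_MAX;
}
```

Under that assumption, sFacet = 8640 gives 970,444,800 points, which
fits, while sFacet = 17280 gives 3,881,779,200, which does not. That is
consistent with the estimate above, but it does not confirm which
counter in the code would actually overflow.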