Format for raw output files from MITgcmUV
=========================================

Introduction
------------
When running in parallel mode with multiple processes the MITgcmUV
model operates as N separate programs, each responsible for its "local"
region of the "total" model domain. Synchronisation and sharing of data between
these processes is done explicitly by calls to data exchange and
barrier routines. Consequently there is no single program that has
a view of the whole model domain as the code is running. Any simple
I/O can only operate on the local region of the model domain. I/O
operations to and from datasets that represent the total domain need
to address the multiple process behavior explicitly.
Under MITgcmUV there is a set of I/O support routines that mask the
details of this process and enable end-users to read and write datasets
in a straightforward manner. The routines use the following design
strategy:
 o Input datasets are for the total domain
 o Output datasets are for the local domain
 o A separate program "joinds" is provided which joins a set of
   local domain datasets together to form a total model domain dataset.

MITgcmUV IO support routines
----------------------------
 - READ_FLD_XY_RS
 - READ_FLD_XY_RL
 - READ_FLD_XYZ_RS
 - READ_FLD_XYZ_RL

 - WRITE_FLD_XY_RS
 - WRITE_FLD_XY_RL
 - WRITE_FLD_XYZ_RS
 - WRITE_FLD_XYZ_RL

Dataset format
--------------
Datasets are written using the standard Fortran 77 sequential binary
file format. The Fortran IO statements in the model code do not specify any
particular byte order or number representation; however, compile and
run-time flags are used on some platforms.
On DEC platforms the IO form is set to big-endian by default with a compile time
flag. On CRAY platforms a runtime flag is normally used to select IEEE
representation. The Fortran 77 sequential binary file format is
    4 byte header
    data
    4 byte terminator
The header and terminator are unsigned integers which give the length
of the data section in bytes. This format is standard over all UNIX
platforms. In Fortran this style of file is generated by code of the
form

      REAL A(dim1, dim2, ..... )
      OPEN(unitnumber,FILE=filename,FORM='UNFORMATTED')
      WRITE(unitnumber) A
      END

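As a rough illustration of this record layout outside Fortran, the following
Python sketch writes and reads one record using only the standard library.
The function names write_record and read_record are hypothetical and are not
part of MITgcmUV. The sketch assumes 4 byte markers and a byte order chosen
by the caller ('>' for big-endian, '<' for little-endian).

```python
import io
import struct


def write_record(stream, values, byteorder=">", fmt="f"):
    """Write one sequential unformatted record: header, data, terminator.

    byteorder: '>' big-endian or '<' little-endian (an assumption of this
    sketch; the model selects byte order with compile or run-time flags).
    fmt: 'f' for 4 byte reals, 'd' for 8 byte reals.
    """
    payload = struct.pack(f"{byteorder}{len(values)}{fmt}", *values)
    marker = struct.pack(f"{byteorder}I", len(payload))  # length in bytes
    stream.write(marker + payload + marker)


def read_record(stream, byteorder=">", fmt="f"):
    """Read one record and check that header and terminator agree."""
    (nbytes,) = struct.unpack(f"{byteorder}I", stream.read(4))
    payload = stream.read(nbytes)
    (trailer,) = struct.unpack(f"{byteorder}I", stream.read(4))
    if trailer != nbytes:
        raise ValueError("header and terminator lengths differ")
    count = nbytes // struct.calcsize(byteorder + fmt)
    return list(struct.unpack(f"{byteorder}{count}{fmt}", payload))


# Round trip through an in-memory buffer instead of a real file.
buf = io.BytesIO()
write_record(buf, [1.0, 2.0, 3.0])
buf.seek(0)
print(read_record(buf))  # [1.0, 2.0, 3.0]
```

Three 4 byte reals give a 12 byte data section, so the whole record occupies
4 + 12 + 4 = 20 bytes.
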
The data is sequenced in the standard Fortran convention of the left-most
index varying fastest. This convention holds for datasets of any dimension:
one-dimensional, two-dimensional, three-dimensional and four-dimensional or
higher datasets are all written this way.

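To make the ordering concrete, here is a small Python sketch that computes
the zero-based position of element A(i1, i2, ...) within the data section,
given Fortran's one-based indices. The helper name flat_offset is
hypothetical.

```python
def flat_offset(index, dims):
    """Zero-based element offset of a one-based Fortran index tuple.

    The left-most index varies fastest, so its stride is 1, the second
    index has stride dims[0], the third dims[0]*dims[1], and so on.
    """
    if len(index) != len(dims):
        raise ValueError("index and dims must have the same length")
    offset, stride = 0, 1
    for i, n in zip(index, dims):
        if not 1 <= i <= n:
            raise IndexError("index out of range")
        offset += (i - 1) * stride
        stride *= n
    return offset


# For REAL A(90, 40): A(1,1) comes first, A(2,1) second, A(1,2) is 91st.
print(flat_offset((1, 1), (90, 40)))  # 0
print(flat_offset((2, 1), (90, 40)))  # 1
print(flat_offset((1, 2), (90, 40)))  # 90
```

Multiplying the offset by the element size (4 or 8 bytes) and adding the
4 byte header gives the byte position of the element within the file.
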
Multiprocess support
--------------------
The format described above is used for multi-process simulations. In this
case the data is written to separate files with each process writing data that
is local to it. To support this approach a file naming convention is used and a second
file of "meta" information accompanies the data. The naming convention
is used to avoid duplicate names and to make it easy to identify sets of
files that together represent the total domain data. The meta file contains
information about the extent of the sub-domain within each file.
The naming convention used is
    PREF.SUFF.pPNUMBER.tTNUMBER.data
    PREF.SUFF.pPNUMBER.tTNUMBER.meta

where
    PREF    - Is a field identifying the data within the file. For
              temperature PREF is T, for zonal velocity PREF is U etc...
    SUFF    - Is a field identifying the "instance" of the data within the
              file. The instance is typically the time level. In general
              the instance will be a model timestep number.
    PNUMBER - Is a process number used to identify which process of
              a multi-process run generated this data. The number ranges
              from 0 to (number of processes)-1.
    TNUMBER - Is a thread number used to identify which thread of a
              multi-threaded run generated this data. The number ranges
              from 0 to (number of threads)-1.

The .data suffix identifies the file containing the actual data.
The .meta suffix identifies the file containing textual information
indicating the extent of the domain written to the .data file.

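The naming convention can be illustrated with a short Python sketch. The
helper names are hypothetical. The four-digit zero padding of the process
and thread numbers is an assumption taken from example names such as
T.0000002880.p0000.t0000.meta, and the sketch also assumes that PREF and
SUFF contain no "." characters.

```python
import re


def tile_file_names(pref, suff, pnumber, tnumber):
    """Return the (.data, .meta) file names for one process and thread."""
    # Four-digit zero padding is assumed from the example file names.
    stem = f"{pref}.{suff}.p{pnumber:04d}.t{tnumber:04d}"
    return stem + ".data", stem + ".meta"


_NAME = re.compile(
    r"^(?P<pref>[^.]+)\.(?P<suff>[^.]+)"
    r"\.p(?P<pnumber>\d+)\.t(?P<tnumber>\d+)\.(?P<kind>data|meta)$"
)


def parse_tile_file_name(name):
    """Split a file name into its PREF, SUFF, PNUMBER and TNUMBER parts."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a PREF.SUFF.pPNUMBER.tTNUMBER name: {name}")
    parts = match.groupdict()
    parts["pnumber"] = int(parts["pnumber"])
    parts["tnumber"] = int(parts["tnumber"])
    return parts


print(tile_file_names("T", "0000002880", 3, 0))
print(parse_tile_file_name("T.0000002880.p0003.t0000.meta"))
```

Grouping parsed names by their (pref, suff) pair is one way to identify the
set of files that together represent the total domain for one field and one
instance.
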
.meta file Format
-----------------
This file contains a set of parameters that are specified using the
generic parameter specification format used in GCMPACK software. This
format consists of a sequence of assignments and comments.
Assignments have the form
    keyword = [ val-list ];

where
    keyword  is a text string
    val-list is a sequence of one or more fields separated by commas

Comments are preceded by // or # characters or contained in
/* */ pairs.
The keywords contained in a .meta file are
    id      - This is a numeric identifier. It can be used to
              verify consistency over a set of .meta files.
    nDims   - This is a single integer indicating the dimensionality
              of the data in the .data file.
    dimList - This is a sequence of triplets. There is one triplet for
              each dimension and the triplets are ordered in the same
              way as the dimensions. Each triplet is made of three integers.
              The first integer gives the domain extent globally for
              the associated dimension.
              The second integer gives the low coordinate for the values
              within the .data file for the associated dimension.
              The third integer gives the high coordinate for the values
              within the .data file for the associated dimension.
              Thus for a .data file containing the quadrant that spans
              points 46 to 90 of the first dimension and points 1 to 20
              of the second dimension of a global domain of size 90 x 40
              the .meta file might read
                  nDims = [ 2 ];
                  dimList = [ 90, 46, 90, 40, 1, 20];
              For the same region of a global domain of size 90 x 40 x 33,
              with all 33 points of the third dimension held in the file,
              the .meta file would read
                  nDims = [ 3 ];
                  dimList = [ 90, 46, 90, 40, 1, 20, 33, 1, 33];

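The .meta format described above is simple enough to parse with a few
regular expressions. The following Python sketch is illustrative only: the
function names are hypothetical, it assumes every field in a val-list is an
integer, and it is not the parser used by GCMPACK.

```python
import re


def parse_meta(text):
    """Parse 'keyword = [ val-list ];' assignments from .meta file text."""
    text = re.sub(r"/\*.*?\*/", " ", text, flags=re.DOTALL)  # /* ... */
    text = re.sub(r"(//|#).*", " ", text)  # // and # run to end of line
    params = {}
    pattern = r"(\w+)\s*=\s*\[(.*?)\]\s*;"
    for keyword, val_list in re.findall(pattern, text, flags=re.DOTALL):
        # Integer fields only: an assumption of this sketch.
        params[keyword] = [int(v) for v in val_list.split(",") if v.strip()]
    return params


def dim_triplets(params):
    """Group dimList into (global extent, low, high), one per dimension."""
    flat = params["dimList"]
    if len(flat) != 3 * params["nDims"][0]:
        raise ValueError("dimList length does not match nDims")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


example = """
// one quadrant of a 90 x 40 domain
nDims = [ 2 ];
dimList = [ 90, 46, 90, 40, 1, 20];
"""
meta = parse_meta(example)
print(dim_triplets(meta))  # [(90, 46, 90), (40, 1, 20)]
```

The number of values held in the matching .data file is the product of
(high - low + 1) over all triplets, here 45 * 20 = 900.
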
Example MATLAB program to join files
------------------------------------
The following MATLAB script joins together a collection of files that
were written in split form. The files to join are indicated by a
user-defined PREF.SUFF pair, e.g. T.0000002800. The script uses the UNIX
ls command to find all .meta files whose names start with T.0000002800
and then scans them to extract the dimensions. It then merges all
the sections together to form a complete representation of the global
dataset.
>> function [AA] = rdmeta(fname,varargin)
>> %
>> % Read MITgcmUV Meta/Data files
>> %
>> % A = RDMETA(FNAME) reads data described by meta/data file format.
>> % FNAME is a string containing the "head" of the file names.
>> %
>> % eg. To load the meta-data files
>> %     T.0000002880.p0000.t0000.meta, T.0000002880.p0000.t0000.data
>> %     T.0000002880.p0001.t0000.meta, T.0000002880.p0001.t0000.data
>> %     T.0000002880.p0002.t0000.meta, T.0000002880.p0002.t0000.data
>> %     T.0000002880.p0003.t0000.meta, T.0000002880.p0003.t0000.data
>> % use
>> %     >> A=rdmeta('T.0000002880');
>> %
>> % A = RDMETA(FNAME,MACHINEFORMAT) allows the machine format to be specified
>> % where MACHINEFORMAT is one of the following strings:
>> %
>> %   'native'      or 'n' - local machine format - the default
>> %   'ieee-le'     or 'l' - IEEE floating point with little-endian
>> %                          byte ordering
>> %   'ieee-be'     or 'b' - IEEE floating point with big-endian
>> %                          byte ordering
>> %   'cray'        or 'c' - Cray floating point with big-endian
>> %                          byte ordering
>> %   'ieee-le.l64' or 'a' - IEEE floating point with little-endian
>> %                          byte ordering and 64 bit long data type
>> %   'ieee-be.l64' or 's' - IEEE floating point with big-endian byte
>> %                          ordering and 64 bit long data type.
>> %
>> % The VAX formats ('vaxd' or 'd', 'vaxg' or 'g') are not handled by
>> % this function.
>> %
>>
>> % Default options
>> ieee='n';
>>
>> % Check optional arguments
>> % (strcmp is used because == fails for strings of different length)
>> args=char(varargin);
>> while (size(args,1) > 0)
>>  opt=deblank(args(1,:));
>>  if strcmp(opt,'n') || strcmp(opt,'native')
>>   ieee='n';
>>  elseif strcmp(opt,'l') || strcmp(opt,'ieee-le')
>>   ieee='l';
>>  elseif strcmp(opt,'b') || strcmp(opt,'ieee-be')
>>   ieee='b';
>>  elseif strcmp(opt,'c') || strcmp(opt,'cray')
>>   ieee='c';
>>  elseif strcmp(opt,'a') || strcmp(opt,'ieee-le.l64')
>>   ieee='a';
>>  elseif strcmp(opt,'s') || strcmp(opt,'ieee-be.l64')
>>   ieee='s';
>>  else
>>   error('Optional argument %s is unknown', opt)
>>  end
>>  args=args(2:end,:);
>> end
>>
>> % Match name of all meta-files
>> allfiles=ls([fname '*.meta']);
>>
>> % Beginning and end of strings
>> % (assumes the names in allfiles are separated by a single character)
>> Iend=findstr(allfiles,'.meta')+4;
>> Ibeg=[1 Iend(1:end-1)+2];
>>
>> % Loop through allfiles
>> for j=1:numel(Ibeg),
>>
>>  % Read meta- and data-file
>>  [A,N] = localrdmeta(allfiles(Ibeg(j):Iend(j)),ieee);
>>
>>  % Rows of N: global extent, low index, high index (one column per dimension)
>>  bdims=N(1,:);
>>  r0=N(2,:);
>>  rN=N(3,:);
>>  ndims=numel(bdims);
>>  if (ndims == 1)
>>   AA(r0(1):rN(1))=A;
>>  elseif (ndims == 2)
>>   AA(r0(1):rN(1),r0(2):rN(2))=A;
>>  elseif (ndims == 3)
>>   AA(r0(1):rN(1),r0(2):rN(2),r0(3):rN(3))=A;
>>  elseif (ndims == 4)
>>   AA(r0(1):rN(1),r0(2):rN(2),r0(3):rN(3),r0(4):rN(4))=A;
>>  else
>>   error('Dimension of data set is larger than currently coded. Sorry!')
>>  end
>>
>> end
>>
>> %-------------------------------------------------------------------------------
>>
>> function [A,N] = localrdmeta(fname,ieee)
>>
>> mname=fname;
>> dname=strrep(mname,'.meta','.data');
>>
>> % Read and interpret Meta file
>> fid = fopen(mname,'r');
>> if (fid == -1)
>>  error('File %s could not be opened', mname)
>> end
>>
>> % Scan each line of the Meta file
>> allstr=' ';
>> keepgoing = 1;
>> while keepgoing > 0,
>>  line = fgetl(fid);
>>  if ~ischar(line)
>>   % fgetl returns -1 at end of file
>>   keepgoing=-1;
>>  else
>>   % Strip out "(PID.TID *.*)" by finding first ")", if there is one
>>   ind=findstr(line,')'); if ~isempty(ind), line=line(ind(1)+1:end); end
>>   % Remove comments of form //
>>   line=[line ' //']; ind=findstr(line,'//'); line=line(1:ind(1)-1);
>>   % Add to total string
>>   allstr=[allstr line];
>>  end
>> end
>>
>> % Close meta file
>> fclose(fid);
>>
>> % Strip out comments of form /* ... */
>> ind1=findstr(allstr,'/*'); ind2=findstr(allstr,'*/');
>> if numel(ind1) ~= numel(ind2)
>>  error('The /* ... */ comments are not properly paired')
>> end
>> while size(ind1,2) > 0
>>  % '*/' is two characters long, so the text resumes at ind2(1)+2
>>  allstr=[allstr(1:ind1(1)-1) allstr(ind2(1)+2:end)];
>>  ind1=findstr(allstr,'/*'); ind2=findstr(allstr,'*/');
>> end
>>
>> % Evaluate the assignments; keywords are lower-cased (dimList -> dimlist)
>> eval(lower(allstr));
>>
>> % One column per dimension: [global extent; low index; high index]
>> N=reshape( dimlist , 3 , prod(size(dimlist))/3 );
>>
>> % Open data file
>> fid=fopen(dname,'r',ieee);
>> if (fid == -1)
>>  error('File %s could not be opened', dname)
>> end
>>
>> % Read record size in bytes
>> recsz=fread(fid,1,'uint32');
>> ldims=N(3,:)-N(2,:)+1;
>> numels=prod(ldims);
>>
>> % Bytes per element: 4 for real*4 data, 8 for real*8 data
>> rat=recsz/numels;
>> if rat == 4
>>  A=fread(fid,numels,'real*4');
>> elseif rat == 8
>>  A=fread(fid,numels,'real*8');
>> else
>>  fclose(fid);
>>  error(['Ratio between record size and size in meta-file inconsistent: ' ...
>>         'implied size in meta-file = %d, record size in data-file = %d'], ...
>>         numels, recsz)
>> end
>>
>> erecsz=fread(fid,1,'uint32');
>> if erecsz ~= recsz
>>  warning('Record sizes at beginning and end of file are inconsistent')
>> end
>>
>> fclose(fid);
>>
>> % The trailing 1 keeps reshape valid for one-dimensional data
>> A=reshape(A,[ldims 1]);
>>
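
The merging step performed by the script above can also be sketched in
Python using only the standard library. This is an illustrative sketch and
not a port of rdmeta: the function name place_tile is hypothetical, and it
assumes the values of each tile have already been read into a flat list in
file order, with the dimList triplets available as tuples.

```python
from itertools import product


def place_tile(global_flat, triplets, tile_flat):
    """Copy one tile into a flat global array (left-most index fastest).

    triplets: one (global extent, low, high) tuple per dimension, as in
    the dimList keyword; low and high are one-based and inclusive.
    """
    extents = [g for g, _, _ in triplets]
    ranges = [range(lo, hi + 1) for _, lo, hi in triplets]
    # product() varies its LAST argument fastest, so feed it the
    # dimensions reversed to visit the tile in Fortran storage order.
    for pos, rev_index in enumerate(product(*reversed(ranges))):
        index = rev_index[::-1]
        offset, stride = 0, 1
        for i, n in zip(index, extents):
            offset += (i - 1) * stride
            stride *= n
        global_flat[offset] = tile_flat[pos]


# Two tiles that together cover a 4 x 2 domain.
whole = [None] * 8
place_tile(whole, [(4, 1, 2), (2, 1, 2)], [11, 21, 12, 22])
place_tile(whole, [(4, 3, 4), (2, 1, 2)], [31, 41, 32, 42])
print(whole)  # [11, 21, 31, 41, 12, 22, 32, 42]
```

Any entry still set to None after all tiles have been placed marks a part
of the global domain that no file covered.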