/[MITgcm]/manual/s_software/text/sarch.tex
Diff of /manual/s_software/text/sarch.tex

revision 1.9 by adcroft, Wed Apr 24 19:23:58 2002 UTC revision 1.16 by afe, Thu Jan 29 16:02:58 2004 UTC
# Line 97  and operating systems. This allows numer Line 97  and operating systems. This allows numer
97  \resizebox{!}{4.5in}{\includegraphics{part4/fit_in_wrapper.eps}}  \resizebox{!}{4.5in}{\includegraphics{part4/fit_in_wrapper.eps}}
98  \end{center}  \end{center}
99  \caption{  \caption{
100  Numerical code is written too fit within a software support  Numerical code is written to fit within a software support
101  infrastructure called WRAPPER. The WRAPPER is portable and  infrastructure called WRAPPER. The WRAPPER is portable and
102  can be specialized for a wide range of specific target hardware and  can be specialized for a wide range of specific target hardware and
103  programming environments, without impacting numerical code that fits  programming environments, without impacting numerical code that fits
# Line 120  uniprocessor and multi-processor Sun sys Line 120  uniprocessor and multi-processor Sun sys
120  (UMA) and non-uniform memory access (NUMA) designs. Significant work has also  (UMA) and non-uniform memory access (NUMA) designs. Significant work has also
121  been undertaken on x86 cluster systems, Alpha processor based clustered SMP  been undertaken on x86 cluster systems, Alpha processor based clustered SMP
122  systems, and on cache-coherent NUMA (CC-NUMA) systems from Silicon Graphics.  systems, and on cache-coherent NUMA (CC-NUMA) systems from Silicon Graphics.
123  The MITgcm code, operating within the WRAPPER, is also used routinely used on  The MITgcm code, operating within the WRAPPER, is also routinely used on
124  large scale MPP systems (for example T3E systems and IBM SP systems). In all  large scale MPP systems (for example T3E systems and IBM SP systems). In all
125  cases numerical code, operating within the WRAPPER, performs and scales very  cases numerical code, operating within the WRAPPER, performs and scales very
126  competitively with equivalent numerical code that has been modified to contain  competitively with equivalent numerical code that has been modified to contain
# Line 661  Within a {\em bi}, {\em bj} loop Line 661  Within a {\em bi}, {\em bj} loop
661  computation is performed concurrently over as many processes and threads  computation is performed concurrently over as many processes and threads
662  as there are physical processors available to compute.  as there are physical processors available to compute.
663    
664    An exception to the use of {\em bi} and {\em bj} in loops arises in the
665    exchange routines used when the exch2 package is used with the cubed
666    sphere.  In this case {\em bj} is generally set to 1 and the loop runs from
667    1,{\em bi}.  Within the loop {\em bi} is used to retrieve the tile number,
668    which is then used to reference exchange parameters.
669    
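The difference between the two loop forms described in the added paragraph can be sketched as follows. This is a minimal illustration written in Python rather than the model's Fortran. The names `my_tile_list`, `exch_params`, `n_sx` and `n_sy` are hypothetical and are not taken from the exch2 package, and the upper loop bound is assumed here to be the number of tiles held locally.

```python
# Illustrative sketch only; every name below is a hypothetical stand-in.

def standard_tile_loop(n_sx, n_sy):
    """Visit every (bi, bj) tile index pair, as in the usual nested loops."""
    visited = []
    for bj in range(1, n_sy + 1):
        for bi in range(1, n_sx + 1):
            visited.append((bi, bj))
    return visited


def exch2_style_loop(my_tile_list, exch_params):
    """Single loop over bi with bj fixed at 1.

    bi is used only to look up a tile number, and the tile number is
    then used to reference that tile's exchange parameters.
    """
    bj = 1  # assumed fixed, following the description in the text
    looked_up = []
    for bi in range(1, len(my_tile_list) + 1):
        tile = my_tile_list[bi - 1]  # bi is 1-based, Python lists are 0-based
        looked_up.append((bi, bj, tile, exch_params[tile]))
    return looked_up
```

In the second form the pair ({\em bi}, {\em bj}) no longer identifies a position in a regular tile grid; {\em bi} is simply an index into a per-process list of tiles.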
670  The amount of computation that can be embedded  The amount of computation that can be embedded
671  a single loop over {\em bi} and {\em bj} varies for different parts of the  a single loop over {\em bi} and {\em bj} varies for different parts of the
672  MITgcm algorithm. Figure \ref{fig:bibj_extract} shows a code extract  MITgcm algorithm. Figure \ref{fig:bibj_extract} shows a code extract
# Line 781  The global domain size is again ninety g Line 787  The global domain size is again ninety g
787  forty grid points in y. The two sub-domains in each process will be computed  forty grid points in y. The two sub-domains in each process will be computed
788  sequentially if they are given to a single thread within a single process.  sequentially if they are given to a single thread within a single process.
789  Alternatively if the code is invoked with multiple threads per process  Alternatively if the code is invoked with multiple threads per process
790  the two domains in y may be computed on concurrently.  the two domains in y may be computed concurrently.
791  \item  \item
792  \begin{verbatim}  \begin{verbatim}
793        PARAMETER (        PARAMETER (
# Line 929  File: {\em eesupp/inc/MAIN\_PDIRECTIVES1 Line 935  File: {\em eesupp/inc/MAIN\_PDIRECTIVES1
935  File: {\em eesupp/inc/MAIN\_PDIRECTIVES2.h}\\  File: {\em eesupp/inc/MAIN\_PDIRECTIVES2.h}\\
936  File: {\em model/src/THE\_MODEL\_MAIN.F}\\  File: {\em model/src/THE\_MODEL\_MAIN.F}\\
937  File: {\em eesupp/src/MAIN.F}\\  File: {\em eesupp/src/MAIN.F}\\
938  File: {\em tools/genmake}\\  File: {\em tools/genmake2}\\
939  File: {\em eedata}\\  File: {\em eedata}\\
940  CPP:  {\em TARGET\_SUN}\\  CPP:  {\em TARGET\_SUN}\\
941  CPP:  {\em TARGET\_DEC}\\  CPP:  {\em TARGET\_DEC}\\
# Line 968  critical communication. However, in orde Line 974  critical communication. However, in orde
974  of controlling and coordinating the start up of a large number  of controlling and coordinating the start up of a large number
975  (hundreds and possibly even thousands) of copies of the same  (hundreds and possibly even thousands) of copies of the same
976  program, MPI is used. The calls to the MPI multi-process startup  program, MPI is used. The calls to the MPI multi-process startup
977  routines must be activated at compile time. This is done  routines must be activated at compile time.  Currently MPI libraries are
978  by setting the {\em ALLOW\_USE\_MPI} and {\em ALWAYS\_USE\_MPI}  invoked by
979  flags in the {\em CPP\_EEOPTIONS.h} file.\\  specifying the appropriate options file with the
980    {\tt-of} flag when running the {\em genmake2}
981  \fbox{  script, which generates the Makefile for compiling and linking MITgcm.
982  \begin{minipage}{4.75in}  (Previously this was done by setting the {\em ALLOW\_USE\_MPI} and
983  File: {\em eesupp/inc/CPP\_EEOPTIONS.h}\\  {\em ALWAYS\_USE\_MPI} flags in the {\em CPP\_EEOPTIONS.h} file.)  More
984  CPP:  {\em ALLOW\_USE\_MPI}\\  detailed information about the use of {\em genmake2} for specifying
985  CPP:  {\em ALWAYS\_USE\_MPI}\\  local compiler flags is located in section \ref{sect:genmake}.\\  
 Parameter:  {\em nPx}\\  
 Parameter:  {\em nPy}  
 \end{minipage}  
 } \\  
986    
 Additionally, compile time options are required to link in the  
 MPI libraries and header files. Examples of these options  
 can be found in the {\em genmake} script that creates makefiles  
 for compilation. When this script is executed with the {bf -mpi}  
 flag it will generate a makefile that includes  
 paths for search for MPI head files and for linking in  
 MPI libraries. For example the {\bf -mpi} flag on a  
  Silicon Graphics IRIX system causes a  
 Makefile with the compilation command  
 Graphics IRIX system \begin{verbatim}  
 mpif77 -I/usr/local/mpi/include -DALLOW_USE_MPI -DALWAYS_USE_MPI  
 \end{verbatim}  
 to be generated.  
 This is the correct set of options for using the MPICH open-source  
 version of MPI, when it has been installed under the subdirectory  
 /usr/local/mpi.  
 However, on many systems there may be several  
 versions of MPI installed. For example many systems have both  
 the open source MPICH set of libraries and a vendor specific native form  
 of the MPI libraries. The correct setup to use will depend on the  
 local configuration of your system.\\  
987    
988  \fbox{  \fbox{
989  \begin{minipage}{4.75in}  \begin{minipage}{4.75in}
990  File: {\em tools/genmake}  Directory: {\em tools/build\_options}\\
991    File: {\em tools/genmake2}
992  \end{minipage}  \end{minipage}
993  } \\  } \\
994  \paragraph{\bf Execution} The mechanics of starting a program in  \paragraph{\bf Execution} The mechanics of starting a program in
# Line 1024  product of the processor grid settings o Line 1006  product of the processor grid settings o
1006  in the file {\em SIZE.h}. The parameter {\em mf} specifies that a text file  in the file {\em SIZE.h}. The parameter {\em mf} specifies that a text file
1007  called ``mf'' will be read to get a list of processor names on  called ``mf'' will be read to get a list of processor names on
1008  which the sixty-four processes will execute. The syntax of this file  which the sixty-four processes will execute. The syntax of this file
1009  is specified by the MPI distribution  is specified by the MPI distribution.
1010  \\  \\
1011    
1012  \fbox{  \fbox{
# Line 1075  to processor identification are are used Line 1057  to processor identification are are used
1057  Allocation of processes to tiles is controlled by the routine  Allocation of processes to tiles is controlled by the routine
1058  {\em INI\_PROCS()}. For each process this routine sets  {\em INI\_PROCS()}. For each process this routine sets
1059  the variables {\em myXGlobalLo} and {\em myYGlobalLo}.  the variables {\em myXGlobalLo} and {\em myYGlobalLo}.
1060  These variables specify (in index space) the coordinate  These variables specify in index space the coordinates
1061  of the southern most and western most corner of the  of the southernmost and westernmost corner of the
1062  southern most and western most tile owned by this process.  southernmost and westernmost tile owned by this process.
1063  The variables {\em pidW}, {\em pidE}, {\em pidS} and {\em pidN}  The variables {\em pidW}, {\em pidE}, {\em pidS} and {\em pidN}
1064  are also set in this routine. These are used to identify  are also set in this routine. These are used to identify
1065  processes holding tiles to the west, east, south and north  processes holding tiles to the west, east, south and north
1066  of this process. These values are stored in global storage  of this process. These values are stored in global storage
1067  in the header file {\em EESUPPORT.h} for use by  in the header file {\em EESUPPORT.h} for use by
1068  communication routines.  communication routines.  The above does not hold when the
1069    exch2 package is used -- exch2 sets its own parameters to
1070    specify the global indices of tiles and their relationships
1071    to each other.  See the documentation on the exch2 package
1072    (\ref{sec:exch2})  for
1073    details.
1074  \\  \\
1075    
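As a rough illustration of what an index-space origin means for a regular decomposition (not the exch2 case), the sketch below computes the global index of a process's southwest corner from its position in the processor grid. The formula, the zero-based process coordinates, and the parameter names `s_nx`, `n_sx`, `s_ny` and `n_sy` (tile size and tiles per process in each direction) are assumptions made for this example. The actual assignment is performed in {\em INI\_PROCS()} and may differ in detail.

```python
# Illustrative sketch only; the formula and names are assumptions.

def global_lo(proc_coord, tile_size, tiles_per_proc):
    """1-based global index of the first point owned by a process.

    proc_coord is the zero-based position of the process along one
    axis of the processor grid.
    """
    return 1 + proc_coord * tile_size * tiles_per_proc


def process_origin(px, py, s_nx, n_sx, s_ny, n_sy):
    """Return (x, y) global indices of the southwest corner of the
    southwestern-most tile owned by the process at grid position (px, py)."""
    return (global_lo(px, s_nx, n_sx), global_lo(py, s_ny, n_sy))
```

For example, with tiles of 45 by 10 points, one tile per process in x and two in y, `process_origin(1, 1, 45, 1, 10, 2)` evaluates to `(46, 21)`.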
1076  \fbox{  \fbox{
# Line 1109  operations and that can be customized fo Line 1096  operations and that can be customized fo
1096  describes the information that is held and used.  describes the information that is held and used.
1097    
1098  \begin{enumerate}  \begin{enumerate}
1099  \item {\bf Tile-tile connectivity information} For each tile the WRAPPER  \item {\bf Tile-tile connectivity information}
1100  sets a flag that sets the tile number to the north, south, east and  For each tile the WRAPPER
1101    sets a flag that sets the tile number to the north,
1102    south, east and
1103  west of that tile. This number is unique over all tiles in a  west of that tile. This number is unique over all tiles in a
1104  configuration. The number is held in the variables {\em tileNo}  configuration. Except when using the cubed sphere and the exch2 package,
1105    the number is held in the variables {\em tileNo}
1106  (this holds the tile's own number), {\em tileNoN}, {\em tileNoS},  (this holds the tile's own number), {\em tileNoN}, {\em tileNoS},
1107  {\em tileNoE} and {\em tileNoW}. A parameter is also stored with each tile  {\em tileNoE} and {\em tileNoW}. A parameter is also stored with each tile
1108  that specifies the type of communication that is used between tiles.  that specifies the type of communication that is used between tiles.
# Line 1135  of the WRAPPER exchange primitive Line 1125  of the WRAPPER exchange primitive
1125  (see figure \ref{fig:communication_primitives}). The routine  (see figure \ref{fig:communication_primitives}). The routine
1126  {\em ini\_communication\_patterns()} is responsible for setting the  {\em ini\_communication\_patterns()} is responsible for setting the
1127  communication mode values for each tile.  communication mode values for each tile.
1128  \\  
1129    When using the cubed sphere configuration with the exch2 package, the
1130    relationships between tiles and their communication methods are set
1131    by the package in other variables.  See the exch2 package documentation
1132    (\ref{sec:exch2}) for details.
1133    
1134    
1135    
1136  \fbox{  \fbox{
1137  \begin{minipage}{4.75in}  \begin{minipage}{4.75in}
# Line 1271  The \_EXCH operation is also central to Line 1267  The \_EXCH operation is also central to
1267  the cube-sphere grid. In this class of grid a rotation may be required  the cube-sphere grid. In this class of grid a rotation may be required
1268  between tiles. Aligning the coordinate requiring rotation with the  between tiles. Aligning the coordinate requiring rotation with the
1269  tile decomposition, allows the coordinate transformation to  tile decomposition, allows the coordinate transformation to
1270  be embedded within a custom form of the \_EXCH primitive.  be embedded within a custom form of the \_EXCH primitive.  In these
1271    cases \_EXCH is mapped to exch2 routines, as detailed in the exch2
1272    package documentation  \ref{sec:exch2}.
1273    
1274  \item {\bf Reverse Mode}  \item {\bf Reverse Mode}
1275  The communication primitives \_EXCH and \_GSUM both employ  The communication primitives \_EXCH and \_GSUM both employ
# Line 1288  operations. However, the routine argumen Line 1286  operations. However, the routine argumen
1286  is set to the value {\em REVERSE\_SIMULATION}. This signifies  is set to the value {\em REVERSE\_SIMULATION}. This signifies
1287  to the low-level routines that the adjoint forms of the  to the low-level routines that the adjoint forms of the
1288  appropriate communication operation should be performed.  appropriate communication operation should be performed.
1289    
1290  \item {\bf MAX\_NO\_THREADS}  \item {\bf MAX\_NO\_THREADS}
1291  The variable {\em MAX\_NO\_THREADS} is used to indicate the  The variable {\em MAX\_NO\_THREADS} is used to indicate the
1292  maximum number of OS threads that a code will use. This  maximum number of OS threads that a code will use. This
# Line 1392  a series of template files, for example Line 1391  a series of template files, for example
1391  This is done to allow a large number of variations on the exchange  This is done to allow a large number of variations on the exchange
1392  process to be maintained. One set of variations supports the  process to be maintained. One set of variations supports the
1393  cube sphere grid. Support for a cube sphere grid in MITgcm is based  cube sphere grid. Support for a cube sphere grid in MITgcm is based
1394  on having each face of the cube as a separate tile (or tiles).  on having each face of the cube as a separate tile or tiles.
1395  The exchange routines are then able to absorb much of the  The exchange routines are then able to absorb much of the
1396  detailed rotation and reorientation required when moving around the  detailed rotation and reorientation required when moving around the
1397  cube grid. The set of {\em \_EXCH} routines that contain the  cube grid. The set of {\em \_EXCH} routines that contain the
