/[MITgcm]/MITgcm/lsopt/lsopt_doc.txt

Diff of /MITgcm/lsopt/lsopt_doc.txt

Parent Directory | Revision Log | View Revision Graph Revision Graph | View Patch Patch

-revision 1.1 by heimbach,
Tue Feb  5 20:34:34 2002 UTC
+revision 1.1.2.1 by heimbach,
Tue Feb  5 20:34:34 2002 UTC
-Line 0
+Line 1
+ Description of large scale optimization package, Version 2.1.0
+ ##############################################################
+ Patrick Heimbach, MIT/EAPS, 02-Mar-2000
+ reference:
+ #########
+ J.C. Gilbert & C. Lemarechal
+ Some numerical experiments with variable-storage quasi-Newton algorithms
+ Mathematical Programming 45 (1989), pp. 407-435
+ flow chart
+ ##########
+       lsopt_top
+           |
+           |---- check arguments
+           |---- CALL INSTORE
+           |       |
+           |       |---- determine whether OPWARMI available:
+           |                * if no:  cold start: create OPWARMI
+           |                * if yes: warm start: read from OPWARMI
+           |             create or open OPWARMD
+           |
+           |---- check consistency between OPWARMI and model parameters
+           |
+           |---- >>> if COLD start: <<<
+           |      |  first simulation with f.g. xx_0; output: first ff_0, gg_0
+           |      |  set first preconditioner value xdiff_0 to 1
+           |      |  store xx(0), gg(0), xdiff(0) to OPWARMD (first 3 entries)
+           |      |
+           |     >>> else: WARM start: <<<
+           |         read xx(i), gg(i) from OPWARMD (first 2 entries)
+           |         for first warm start after cold start, i=0
+           |
+           |
+           |
+           |---- /// if ITMAX > 0: perform optimization (increment loop index i)
+           |      (
+           |      )---- save current values of gg(i-1) -> gold(i-1), ff -> fold(i-1)
+           |      (---- CALL LSUPDXX
+           |      )       |
+           |      (       |---- >>> if jmax=0 <<<
+           |      )       |      |  first optimization after cold start:
+           |      (       |      |  preconditioner estimated via ff_0 - ff_(first guess)
+           |      )       |      |  dd(i-1) = -gg(i-1)*preco
+           |      (       |      |
+           |      )       |     >>> if jmax > 0 <<<
+           |      (       |         dd(i-1) = -gg(i-1)
+           |      )       |         CALL HESSUPD
+           |      (       |           |
+           |      )       |           |---- dd(i-1) modified via Hessian approx.
+           |      (       |
+           |      )       |---- >>> if <dd,gg> >= 0 <<<
+           |      (       |         ifail = 4
+           |      )       |
+           |      (       |---- compute step size: tact(i-1)
+           |      )       |---- compute update: xdiff(i) = xx(i-1) + tact(i-1)*dd(i-1)
+           |      (
+           |      )---- >>> if ifail = 4 <<<
+           |      (         goto 1000
+           |      )
+           |      (---- CALL OPTLINE / LSLINE
+           |      )       |
+           |      (       |
+           |      )       |
+           |      (       |---- /// loop over simulations
+           |      )              (
+           |      (              )---- CALL SIMUL
+           |      )              (       |
+           |      (              )       |----  input: xdiff(i)
+           |      )              (       |---- output: ff(i), gg(i)
+           |      (              )       |---- >>> if ONLINE <<<
+           |      )              (                 runs model and adjoint
+           |      (              )             >>> if OFFLINE <<<
+           |      )              (                 reads those values from file
+           |      (              )
+           |      )              (---- 1st Wolfe test:
+           |      (              )     ff(i) <= tact*xpara1*<gg(i-1),dd(i-1)>
+           |      )              (
+           |      (              )---- 2nd Wolfe test:
+           |      )              (     <gg(i),dd(i-1)> >= xpara2*<gg(i-1),dd(i-1)>
+           |      (              )
+           |      )              (---- >>> if 1st and 2nd Wolfe tests ok <<<
+           |      (              )      |  320: update xx: xx(i) = xdiff(i)
+           |      )              (      |
+           |      (              )     >>> else if 1st Wolfe test not ok <<<
+           |      )              (      |  500: INTERpolate new tact:
+           |      (              )      |  barr*tact < tact < (1-barr)*tact
+           |      )              (      |  CALL CUBIC
+           |      (              )      |
+           |      )              (     >>> else if 2nd Wolfe test not ok <<<
+           |      (              )         350: EXTRApolate new tact:
+           |      )              (         (1+barmin)*tact < tact < 10*tact
+           |      (              )         CALL CUBIC
+           |      )              (
+           |      (              )---- >>> if new tact > tmax <<<
+           |      )              (      |  ifail = 7
+           |      (              )      |
+           |      )              (---- >>> if new tact < tmin OR tact*dd < machine precision <<<
+           |      (              )      |  ifail = 8
+           |      )              (      |
+           |      (              )---- >>> else <<<
+           |      )              (         update xdiff for new simulation
+           |      (              )
+           |      )             \\\ if nfunc > 1: use inter-/extrapolated tact and xdiff
+           |      (                               for new simulation
+           |      )                               N.B.: new xx is thus not based on new gg, but
+           |      (                                     rather on new step size tact
+           |      )
+           |      (
+           |      )
+           |      (---- store new values xx(i), gg(i) to OPWARMD (first 2 entries)
+           |      )---- >>> if ifail = 7,8,9 <<<
+           |      (         goto 1000
+           |      )
+           |      (---- compute new pointers jmin, jmax to include latest values
+           |      )     gg(i)-gg(i-1), xx(i)-xx(i-1) to Hessian matrix estimate
+           |      (---- store gg(i)-gg(i-1), xx(i)-xx(i-1) to OPWARMD
+           |      )     (entries 2*jmax+2, 2*jmax+3)
+           |      (
+           |      )---- CALL DGSCALE
+           |      (       |
+           |      )       |---- call dostore
+           |      (       |       |
+           |      )       |       |---- read preconditioner of previous iteration diag(i-1)
+           |      (       |             from OPWARMD (3rd entry)
+           |      )       |
+           |      (       |---- compute new preconditioner diag(i), based upon diag(i-1),
+           |      )       |     gg(i)-gg(i-1), xx(i)-xx(i-1)
+           |      (       |
+           |      )       |---- call dostore
+           |      (               |
+           |      )               |---- write new preconditioner diag(i) to OPWARMD (3rd entry)
+           |      (
+           |---- \\\ end of optimization iteration loop
+           |
+           |
+           |
+           |---- CALL OUTSTORE
+           |       |
+           |       |---- store gnorm0, ff(i), current pointers jmin, jmax, iterabs to OPWARMI
+           |
+           |---- >>> if OFFLINE version <<<
+           |         xx(i+1) needs to be computed as input for offline optimization
+           |          |
+           |          |---- CALL LSUPDXX
+           |          |       |
+           |          |       |---- compute dd(i), tact(i) -> xdiff(i+1) = x(i) + tact(i)*dd(i)
+           |          |
+           |          |---- CALL WRITE_CONTROL
+           |          |       |
+           |          |       |---- write xdiff(i+1) to special file for offline optim.
+           |
+           |---- print final information
+           |
+           O
+ Remarks:
+ #######
+. Difference between offline/online version
+ --------------------------------------------
+ - Offline version: Every call to simul refers to a read procedure which
+                    reads the result of an offline forward and adjoint run
+                    Therefore, only one call to simul is allowed,
+                      itmax = 0, for cold start
+                      itmax = 1, for warm start
+                    Also, at the end, x(i+1) needs to be computed and saved
+                    to be available for the offline model and adjoint run
+ -  Online version: Every call to simul refers to an execution of the forward and adjoint model.
+                    Several iterations of optimization may thus be performed within
+                    a single run of the main program (main_lsopt).
+                    The following cases may occur:
+                      - cold start only (no optimization)
+                      - cold start & one or several iterations of optimization
+                      - warm start from previous cold start with one or several iterations
+                      - warm start from previous warm start with one or several iterations
+ In order to achieve minimum difference between the online and offline code
+ xdiff(i+1) is stored to file at the end of an (offline) iteration,
+ but recomputed identically at the beginning of the next iteration.
+. Number of iterations vs. number of simulations
+ -------------------------------------------------
+ - itmax: controls the max. number of iterations
+ - nfunc: controls the max. number of simulations within one iteration
+ Summary: From one iteration to the next the descent direction changes.
+          Within one iteration more than one forward and adjoint run may be performed.
+          The updated control used as input for these simulations uses the same
+          descent direction, but different step sizes.
+ In detail:
+ From one iteration to the next the descent direction dd changes using
+ the result for the adjoint vector gg of the previous iteration.
+ In lsline the updated control xdiff(i,1) = xx(i-1) + tact(i-1,1)*dd(i-1) serves as input for
+ a forward and adjoint model run yielding a new gg(i,1).
+ In general, the new solution passes the 1st and 2nd Wolfe tests
+ so xdiff(i,1) represents the solution sought: xx(i) = xdiff(i,1).
+ If one of the two tests fails, an inter- or extrapolation is invoked to determine
+ a new step size tact(i-1,2).
+ If more than one function call is permitted, the new step size is used together
+ with the "old" descent direction dd(i-1) (i.e. dd is not updated using the new gg(i)),
+ to compute a new xdiff(i,2) = xx(i-1) + tact(i-1,2)*dd(i-1) that serves as input
+ in a new forward and adjoint run, yielding gg(i,2).
+ If now, both Wolfe tests are successfull, the updated solution is given by
+ xx(i) = xdiff(i,2) = xx(i-1) + tact(i-1,2)*dd(i-1).
+. Double-usage of fields dd and xdiff
+ --------------------------------------
+ In order to save memory both the fields dd and xdiff have a double usage.
+ - xdiff: in lsopt_top: used as x(i) - x(i-1) for Hessian update
+          in lsline:    intermediate result for control update x = x + tact*dd
+ - dd   : in lsopt_top, lsline: descent vector, dd = -gg   & hessupd
+          in dgscale:           intermediate result to compute new preconditioner
+. Notice for user of old code
+ ------------------------------
+ Three relevant changes needed to switch to new version:
+   (i): OPWARMI file: two variables added:
+                gnorm0  : norm of first (cold start) gradient
+                iabsiter: total number of iterations with respect to cold start
+  (ii): routine names that are referenced by main_lsopt.f
+        lsoptv1 -> lsopt_top
+        lsline1 -> lsline
+ (iii): parameter list of lsopt_top
+        logical loffline included
+ parameter file lsopt.par
+ ########################
+ The optimization is controlled by a set of parameters
+ provided through the standard input file lsopt.par,
+ which is generated within the job script.
+   NUPDATE  : max. no. of update pairs (gg(i)-gg(i-1), xx(i)-xx(i-1))
+              to be stored in OPWARMD to estimate Hessian
+              [pair of current iter. is stored in (2*jmax+2, 2*jmax+3)
+              jmax must be > 0 to access these entries]
+              Presently NUPDATE must be > 0
+              (i.e. iteration without reference to previous
+               iterations through OPWARMD has not been tested)
+   EPSX     : relative precision on xx bellow which xx should not be improved
+   EPSG     : relative precision on gg below which optimization is considered successful
+   IPRINT   : controls verbose (>=1) or non-verbose output
+   NUMITER  : max. number of iterations of optimisation
+              NUMTER = 0: cold start only, no optimization
+   ITER_NUM : index of new restart file to be created (not necessarily = NUMITER!)
+   NFUNC    : max. no. of simulations per iteration
+              (must be > 0)
+              is used if step size tact is inter-/extrapolated;
+              in this case, if NFUNC > 1, a new simulation is performed with
+              same gradient but "improved" step size
+   FMIN     : first guess cost function value
+              (only used as long as first iteration not completed,
+              i.e. for jmax <= 0)
+ OPWARMI, OPWARMD files
+ ######################
+ Two files retain values of previous iterations which are
+ used in latest iteration to update Hessian.
+ OPWARMI: contains index settings and scalar variables
+ OPWARMD: contains vectors
+ Structure of OPWARMI file:
+ -------------------------
+     n, fc, isize, m, jmin, jmax, gnorm0, iabsiter
+     n  = nn      : no. of control variables
+     fc = ff      : cost value of last iteration
+     isize        : no. of bytes per record in OPWARMD
+     m = nupdate  : max. no. of updates for Hessian
+     jmin, jmax   : pointer indices for OPWARMD file (cf. below)
+     gnorm0       : norm of first (cold start) gradient gg
+     iabsiter     : total number of iterations with respect to cold start
+ Structure of OPWARMD file:
+ -------------------------
+    entry
+   : xx(i)         : control vector of latest iteration
+   : gg(i)         : gradient of latest iteration
+   : xdiff(i), diag: preconditioning vector; (1,...,1) for cold start
+     ---
+*jmax+2 : gold = g(i) - g(i-1) for last update (jmax)
+*jmax+3 : xdiff = tact * d = xx(i) - xx(i-1) for last update (jmax)
+ if jmax = 0: cold start; no Hessian update used to compute dd
+ if jmax > nupdate, old positions are overwritten, starting
+                 with position pair (4,5)
+ Example 1: jmin = 1, jmax = 3, mupd = 5
+  2   3   |   4   5     6   7     8   9     empty     empty
+ |___|___|___| | |___|___| |___|___| |___|___| |___|___| |___|___|
+      |     1         2         3
+ Example 2: jmin = 3, jmax = 7, mupd = 5   ---> jmax = 2
+  2   3   |
+ |___|___|___| | |___|___| |___|___| |___|___| |___|___| |___|___|
+               |     6         7         3         4         5
+ Error handling
+ ##############
+   ifail |   description
+ --------+----------------------------------------------------------
+    < 0  | should not appear (flag indic in simul.F not used)
+ | normal mode during execution
+ | an input argument is wrong
+ | warm start file is corrupted
+ | the initial gradient is too small
+ | the search direction is not a descent one
+ | maximal number of iterations reached
+ | maximal number of simulations reached (handled passively)
+ | the linesearch failed
+ | the function could not be improved
+ | optline parameters wrong
+ | cold start, no optimization done
+ | convergence achieved within precision

 Legend:



Removed from v.1.1
 


changed lines


 
Added in v.1.1.2.1
 Legend:



Removed from v.1.1
 


changed lines


 
Added in v.1.1.2.1
-Removed from v.1.1
+Added in v.1.1.2.1

	ViewVC Help
Powered by ViewVC 1.1.22