/[MITgcm]/MITgcm/lsopt/lsopt_doc.txt
ViewVC logotype

Diff of /MITgcm/lsopt/lsopt_doc.txt

Parent Directory Parent Directory | Revision Log Revision Log | View Revision Graph Revision Graph | View Patch Patch

revision 1.1 by heimbach, Tue Feb 5 20:34:34 2002 UTC revision 1.1.2.1 by heimbach, Tue Feb 5 20:34:34 2002 UTC
# Line 0  Line 1 
1    
2    Description of large scale optimization package, Version 2.1.0
3    ##############################################################
4    Patrick Heimbach, MIT/EAPS, 02-Mar-2000
5    
6    reference:
7    #########
8    
9    J.C. Gilbert & C. Lemarechal
10    Some numerical experiments with variable-storage quasi-Newton algorithms
11    Mathematical Programming 45 (1989), pp. 407-435
12    
13    flow chart
14    ##########
15    
16          lsopt_top
17              |
18              |---- check arguments
19              |---- CALL INSTORE
20              |       |
21              |       |---- determine whether OPWARMI available:
22              |                * if no:  cold start: create OPWARMI
23              |                * if yes: warm start: read from OPWARMI
24              |             create or open OPWARMD
25              |
26              |---- check consistency between OPWARMI and model parameters
27              |
28              |---- >>> if COLD start: <<<
29              |      |  first simulation with f.g. xx_0; output: first ff_0, gg_0
30              |      |  set first preconditioner value xdiff_0 to 1
31              |      |  store xx(0), gg(0), xdiff(0) to OPWARMD (first 3 entries)
32              |      |
33              |     >>> else: WARM start: <<<
34              |         read xx(i), gg(i) from OPWARMD (first 2 entries)
35              |         for first warm start after cold start, i=0
36              |
37              |
38              |
39              |---- /// if ITMAX > 0: perform optimization (increment loop index i)
40              |      (
41              |      )---- save current values of gg(i-1) -> gold(i-1), ff -> fold(i-1)
42              |      (---- CALL LSUPDXX
43              |      )       |
44              |      (       |---- >>> if jmax=0 <<<
45              |      )       |      |  first optimization after cold start:
46              |      (       |      |  preconditioner estimated via ff_0 - ff_(first guess)
47              |      )       |      |  dd(i-1) = -gg(i-1)*preco
48              |      (       |      |  
49              |      )       |     >>> if jmax > 0 <<<
50              |      (       |         dd(i-1) = -gg(i-1)
51              |      )       |         CALL HESSUPD
52              |      (       |           |
53              |      )       |           |---- dd(i-1) modified via Hessian approx.
54              |      (       |
55              |      )       |---- >>> if <dd,gg> >= 0 <<<
56              |      (       |         ifail = 4
57              |      )       |
58              |      (       |---- compute step size: tact(i-1)
59              |      )       |---- compute update: xdiff(i) = xx(i-1) + tact(i-1)*dd(i-1)
60              |      (
61              |      )---- >>> if ifail = 4 <<<
62              |      (         goto 1000
63              |      )
64              |      (---- CALL OPTLINE / LSLINE
65              |      )       |
66              |      (       |
67              |      )       |
68              |      (       |---- /// loop over simulations
69              |      )              (  
70              |      (              )---- CALL SIMUL
71              |      )              (       |
72              |      (              )       |----  input: xdiff(i)
73              |      )              (       |---- output: ff(i), gg(i)
74              |      (              )       |---- >>> if ONLINE <<<
75              |      )              (                 runs model and adjoint
76              |      (              )             >>> if OFFLINE <<<
77              |      )              (                 reads those values from file
78              |      (              )
79              |      )              (---- 1st Wolfe test:
80              |      (              )     ff(i) <= tact*xpara1*<gg(i-1),dd(i-1)>
81              |      )              (
82              |      (              )---- 2nd Wolfe test:
83              |      )              (     <gg(i),dd(i-1)> >= xpara2*<gg(i-1),dd(i-1)>
84              |      (              )
85              |      )              (---- >>> if 1st and 2nd Wolfe tests ok <<<
86              |      (              )      |  320: update xx: xx(i) = xdiff(i)
87              |      )              (      |
88              |      (              )     >>> else if 1st Wolfe test not ok <<<
89              |      )              (      |  500: INTERpolate new tact:
90              |      (              )      |  barr*tact < tact < (1-barr)*tact
91              |      )              (      |  CALL CUBIC
92              |      (              )      |
93              |      )              (     >>> else if 2nd Wolfe test not ok <<<
94              |      (              )         350: EXTRApolate new tact:
95              |      )              (         (1+barmin)*tact < tact < 10*tact
96              |      (              )         CALL CUBIC
97              |      )              (
98              |      (              )---- >>> if new tact > tmax <<<
99              |      )              (      |  ifail = 7
100              |      (              )      |
101              |      )              (---- >>> if new tact < tmin OR tact*dd < machine precision <<<
102              |      (              )      |  ifail = 8
103              |      )              (      |
104              |      (              )---- >>> else <<<
105              |      )              (         update xdiff for new simulation
106              |      (              )
107              |      )             \\\ if nfunc > 1: use inter-/extrapolated tact and xdiff
108              |      (                               for new simulation
109              |      )                               N.B.: new xx is thus not based on new gg, but
110              |      (                                     rather on new step size tact
111              |      )        
112              |      (        
113              |      )        
114              |      (---- store new values xx(i), gg(i) to OPWARMD (first 2 entries)
115              |      )---- >>> if ifail = 7,8,9 <<<
116              |      (         goto 1000
117              |      )
118              |      (---- compute new pointers jmin, jmax to include latest values
119              |      )     gg(i)-gg(i-1), xx(i)-xx(i-1) to Hessian matrix estimate
120              |      (---- store gg(i)-gg(i-1), xx(i)-xx(i-1) to OPWARMD
121              |      )     (entries 2*jmax+2, 2*jmax+3)
122              |      (
123              |      )---- CALL DGSCALE
124              |      (       |
125              |      )       |---- call dostore
126              |      (       |       |
127              |      )       |       |---- read preconditioner of previous iteration diag(i-1)
128              |      (       |             from OPWARMD (3rd entry)
129              |      )       |
130              |      (       |---- compute new preconditioner diag(i), based upon diag(i-1),
131              |      )       |     gg(i)-gg(i-1), xx(i)-xx(i-1)
132              |      (       |
133              |      )       |---- call dostore
134              |      (               |
135              |      )               |---- write new preconditioner diag(i) to OPWARMD (3rd entry)
136              |      (
137              |---- \\\ end of optimization iteration loop
138              |
139              |
140              |
141              |---- CALL OUTSTORE
142              |       |
143              |       |---- store gnorm0, ff(i), current pointers jmin, jmax, iterabs to OPWARMI
144              |
145              |---- >>> if OFFLINE version <<<
146              |         xx(i+1) needs to be computed as input for offline optimization
147              |          |
148              |          |---- CALL LSUPDXX
149              |          |       |
150              |          |       |---- compute dd(i), tact(i) -> xdiff(i+1) = x(i) + tact(i)*dd(i)
151              |          |
152              |          |---- CALL WRITE_CONTROL
153              |          |       |
154              |          |       |---- write xdiff(i+1) to special file for offline optim.
155              |
156              |---- print final information
157              |
158              O
159    
160    
161    
162    Remarks:
163    #######
164    
165    1. Difference between offline/online version
166    --------------------------------------------
167    
168    - Offline version: Every call to simul refers to a read procedure which
169                       reads the result of an offline forward and adjoint run
170                       Therefore, only one call to simul is allowed,
171                         itmax = 0, for cold start
172                         itmax = 1, for warm start
173                       Also, at the end, x(i+1) needs to be computed and saved
174                       to be available for the offline model and adjoint run
175    
176    -  Online version: Every call to simul refers to an execution of the forward and adjoint model.
177                       Several iterations of optimization may thus be performed within
178                       a single run of the main program (main_lsopt).
179                       The following cases may occur:
180                         - cold start only (no optimization)
181                         - cold start & one or several iterations of optimization
182                         - warm start from previous cold start with one or several iterations
183                         - warm start from previous warm start with one or several iterations
184    
185    In order to achieve minimum difference between the online and offline code
186    xdiff(i+1) is stored to file at the end of an (offline) iteration,
187    but recomputed identically at the beginning of the next iteration.
188    
189    2. Number of iterations vs. number of simulations
190    -------------------------------------------------
191    
192    - itmax: controls the max. number of iterations
193    - nfunc: controls the max. number of simulations within one iteration
194    
195    Summary: From one iteration to the next the descent direction changes.
196             Within one iteration more than one forward and adjoint run may be performed.
197             The updated control used as input for these simulations uses the same
198             descent direction, but different step sizes.
199    
200    In detail:
201    From one iteration to the next the descent direction dd changes using
202    the result for the adjoint vector gg of the previous iteration.
203    In lsline the updated control xdiff(i,1) = xx(i-1) + tact(i-1,1)*dd(i-1) serves as input for
204    a forward and adjoint model run yielding a new gg(i,1).
205    In general, the new solution passes the 1st and 2nd Wolfe tests
206    so xdiff(i,1) represents the solution sought: xx(i) = xdiff(i,1).
207    If one of the two tests fails, an inter- or extrapolation is invoked to determine
208    a new step size tact(i-1,2).
209    If more than one function call is permitted, the new step size is used together
210    with the "old" descent direction dd(i-1) (i.e. dd is not updated using the new gg(i)),
211    to compute a new xdiff(i,2) = xx(i-1) + tact(i-1,2)*dd(i-1) that serves as input
212    in a new forward and adjoint run, yielding gg(i,2).
213    If now, both Wolfe tests are successfull, the updated solution is given by
214    xx(i) = xdiff(i,2) = xx(i-1) + tact(i-1,2)*dd(i-1).
215    
216    3. Double-usage of fields dd and xdiff
217    --------------------------------------
218    
219    In order to save memory both the fields dd and xdiff have a double usage.
220    
221    - xdiff: in lsopt_top: used as x(i) - x(i-1) for Hessian update
222             in lsline:    intermediate result for control update x = x + tact*dd
223    
224    - dd   : in lsopt_top, lsline: descent vector, dd = -gg   & hessupd
225             in dgscale:           intermediate result to compute new preconditioner
226    
227    4. Notice for user of old code
228    ------------------------------
229    
230    Three relevant changes needed to switch to new version:
231      
232      (i): OPWARMI file: two variables added:
233                   gnorm0  : norm of first (cold start) gradient
234                   iabsiter: total number of iterations with respect to cold start
235    
236     (ii): routine names that are referenced by main_lsopt.f
237           lsoptv1 -> lsopt_top
238           lsline1 -> lsline
239    
240    (iii): parameter list of lsopt_top
241           logical loffline included
242    
243    parameter file lsopt.par
244    ########################
245    
246    The optimization is controlled by a set of parameters
247    provided through the standard input file lsopt.par,
248    which is generated within the job script.
249    
250      NUPDATE  : max. no. of update pairs (gg(i)-gg(i-1), xx(i)-xx(i-1))
251                 to be stored in OPWARMD to estimate Hessian
252                 [pair of current iter. is stored in (2*jmax+2, 2*jmax+3)
253                 jmax must be > 0 to access these entries]
254                 Presently NUPDATE must be > 0
255                 (i.e. iteration without reference to previous
256                  iterations through OPWARMD has not been tested)
257      EPSX     : relative precision on xx bellow which xx should not be improved
258      EPSG     : relative precision on gg below which optimization is considered successful
259      IPRINT   : controls verbose (>=1) or non-verbose output
260      NUMITER  : max. number of iterations of optimisation
261                 NUMTER = 0: cold start only, no optimization
262      ITER_NUM : index of new restart file to be created (not necessarily = NUMITER!)
263      NFUNC    : max. no. of simulations per iteration
264                 (must be > 0)
265                 is used if step size tact is inter-/extrapolated;
266                 in this case, if NFUNC > 1, a new simulation is performed with
267                 same gradient but "improved" step size
268      FMIN     : first guess cost function value
269                 (only used as long as first iteration not completed,
270                 i.e. for jmax <= 0)
271    
272    OPWARMI, OPWARMD files
273    ######################
274    
275    Two files retain values of previous iterations which are
276    used in latest iteration to update Hessian.
277    OPWARMI: contains index settings and scalar variables
278    OPWARMD: contains vectors
279    
280    Structure of OPWARMI file:
281    -------------------------
282      
283        n, fc, isize, m, jmin, jmax, gnorm0, iabsiter
284    
285        n  = nn      : no. of control variables
286        fc = ff      : cost value of last iteration
287        isize        : no. of bytes per record in OPWARMD
288        m = nupdate  : max. no. of updates for Hessian
289        jmin, jmax   : pointer indices for OPWARMD file (cf. below)
290        gnorm0       : norm of first (cold start) gradient gg
291        iabsiter     : total number of iterations with respect to cold start
292    
293    Structure of OPWARMD file:
294    -------------------------
295       entry
296         1    : xx(i)         : control vector of latest iteration
297         2    : gg(i)         : gradient of latest iteration
298         3    : xdiff(i), diag: preconditioning vector; (1,...,1) for cold start
299        ---
300     2*jmax+2 : gold = g(i) - g(i-1) for last update (jmax)
301     2*jmax+3 : xdiff = tact * d = xx(i) - xx(i-1) for last update (jmax)
302    
303    if jmax = 0: cold start; no Hessian update used to compute dd
304    if jmax > nupdate, old positions are overwritten, starting
305                    with position pair (4,5)
306    
307    Example 1: jmin = 1, jmax = 3, mupd = 5
308    
309      1   2   3   |   4   5     6   7     8   9     empty     empty
310    |___|___|___| | |___|___| |___|___| |___|___| |___|___| |___|___|
311          0       |     1         2         3
312    
313    Example 2: jmin = 3, jmax = 7, mupd = 5   ---> jmax = 2
314    
315      1   2   3   |  
316    |___|___|___| | |___|___| |___|___| |___|___| |___|___| |___|___|
317                  |     6         7         3         4         5
318    
319    
320    
321    Error handling
322    ##############
323    
324      ifail |   description
325    --------+----------------------------------------------------------
326       < 0  | should not appear (flag indic in simul.F not used)
327         0  | normal mode during execution
328         1  | an input argument is wrong
329         2  | warm start file is corrupted
330         3  | the initial gradient is too small
331         4  | the search direction is not a descent one
332         5  | maximal number of iterations reached
333         6  | maximal number of simulations reached (handled passively)
334         7  | the linesearch failed
335         8  | the function could not be improved
336         9  | optline parameters wrong
337        10  | cold start, no optimization done
338        11  | convergence achieved within precision
339    
340    

Legend:
Removed from v.1.1  
changed lines
  Added in v.1.1.2.1

  ViewVC Help
Powered by ViewVC 1.1.22