
History outputs

nuvolet

Toni Viudez
Member
Hi,
I am following this post closely; Tom and I are actually working on the same project, and I am wondering about this because F2000climo does not have hourly outputs.
Is there a compset that allows running daily and getting history (diagnostic) files?
Additionally, are you running CESM on Cheyenne? I ask because of the time needed for those model runs.
Thanks
 

mlevy

Michael Levy
CSEG and Liaisons
Staff member
Hi,
I am following this post closely; Tom and I are actually working on the same project, and I am wondering about this because F2000climo does not have hourly outputs.
Is there a compset that allows running daily and getting history (diagnostic) files?
Additionally, are you running CESM on Cheyenne? I ask because of the time needed for those model runs.
Thanks

You can manually change the frequency at which the various components write history output. For instructions on how to do that, I would look in the component-specific forums on this bulletin board (e.g. questions about CAM should be asked here).
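As a concrete illustration (the exact settings belong in the CAM forum, and you should double-check these names against the namelist documentation for your CESM version): CAM's history output is usually controlled through user_nl_cam in the case directory, where nhtfrq sets the frequency of each history stream (0 means monthly; a negative value means hours) and mfilt sets how many time samples go into each file. A sketch for adding a daily stream might look like:

Code:
  ! user_nl_cam -- a sketch; the fields listed in fincl2 are examples only
  nhtfrq = 0, -24          ! stream h0 monthly, stream h1 every 24 hours (daily)
  mfilt  = 1, 365          ! 1 sample per h0 file, a year of daily samples per h1 file
  fincl2 = 'T', 'Q', 'PS'  ! fields to write to the daily h1 stream

After editing user_nl_cam, ./preview_namelists from the case directory will show you the generated namelist so you can confirm the settings took effect.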

I am running on Cheyenne, and if you are as well, I suggest you look at the CESM 2 timing table to get a rough estimate of run time. I only see entries for the f19_f19_mg17 resolution. (I ran my test with f09_f09_mg17; out of the box at that resolution, F2000climo uses 360 MPI tasks with 3 OpenMP threads per task, for 1080 total PEs.) If you are running at 2-degree resolution, however, the default layout is 72 MPI tasks with a single thread per task. You can run ./pelayout from your case directory to see exactly what you are running on.

If you do look at the timing table, pay attention to the ThruPut yrs/day number; that is the number of model years simulated in a 24-hour run. Since Cheyenne limits jobs to a 12-hour wallclock, you should expect to be able to model roughly half that number of years per submission. Note that writing daily output will decrease throughput, since time spent writing to disk is not used to advance the model.

Lastly, after a successful run, you should notice a directory named timing/ in your case directory; the cesm_timing.${CASENAME}... files will give you information about your specific run. Here is the timing from when I ran for a single month to generate the monthly history files:

Code:
  grid        : a%0.9x1.25_l%0.9x1.25_oi%0.9x1.25_r%r05_g%gland4_w%null_z%null_m%gx1v7
  compset     : 2000_CAM60_CLM50%SP_CICE%PRES_DOCN%DOM_MOSART_CISM2%NOEVOLVE_SWAV_SIAC_SESP
  run type    : startup, continue_run = FALSE (inittype = TRUE)
  stop option : nmonths, stop_n = 1
  run length  : 31 days (30.979166666666668 for ocean)

  component       comp_pes    root_pe   tasks  x threads instances (stride)
  ---------        ------     -------   ------   ------  ---------  ------
  cpl = cpl        1080        0        360    x 3       1      (1     )
  atm = cam        1080        0        360    x 3       1      (1     )
  lnd = clm        1080        0        360    x 3       1      (1     )
  ice = cice       1080        0        360    x 3       1      (1     )
  ocn = docn       1080        0        360    x 3       1      (1     )
  rof = mosart     1080        0        360    x 3       1      (1     )
  glc = cism       1080        0        360    x 3       1      (1     )
  wav = swav       1080        0        360    x 3       1      (1     )
  iac = siac       1           0        1      x 1       1      (1     )
  esp = sesp       1           0        1      x 1       1      (1     )

  total pes active           : 3240
  mpi tasks per node               : 36
  pe count for cost estimate : 1080

  Overall Metrics:
    Model Cost:            1769.92   pe-hrs/simulated_year
    Model Throughput:        14.64   simulated_years/day

The important line is

Code:
Model Throughput:        14.64   simulated_years/day

If I were trying to run 100 years, I would run ./xmlchange STOP_N=7,STOP_OPTION=nyears,RESUBMIT=13 so the model runs for 7 years at a time (14.64 / 2, rounded down), restarting itself 13 times for a total of 14 * 7 = 98 years. I would also run ./xmlchange --subgroup case.run JOB_WALLCLOCK_TIME=12:00:00 to request 12 hours per submission, and then ./case.submit will take care of the rest.
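The arithmetic above can be written out as a short script (a hypothetical helper, not part of CESM; it just reproduces the hand calculation from the throughput number):

```python
import math

def plan_run(target_years, throughput_yrs_per_day, wallclock_hours=12):
    """Split a long run into resubmissions that fit a wallclock limit.

    throughput_yrs_per_day comes from the 'Model Throughput' line in the
    cesm_timing.* file; one submission can simulate roughly
    throughput * (wallclock / 24) model years.
    Returns (STOP_N, RESUBMIT, total years actually simulated).
    """
    # Years that safely fit in one submission, rounded down.
    stop_n = math.floor(throughput_yrs_per_day * wallclock_hours / 24.0)
    # Number of whole chunks needed; RESUBMIT counts restarts, so it is
    # one less than the number of submissions.
    n_submissions = target_years // stop_n
    resubmit = n_submissions - 1
    return stop_n, resubmit, stop_n * n_submissions

stop_n, resubmit, total = plan_run(100, 14.64)
print(stop_n, resubmit, total)  # -> 7 13 98
```

With those numbers you get exactly the ./xmlchange STOP_N=7,STOP_OPTION=nyears,RESUBMIT=13 settings described above, for 98 simulated years.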

I'm getting a little ahead of myself, though. I think the first thing you should do is figure out the namelist settings to get the output you want at the frequency you want. Then run for a year or two (maybe with ./xmlchange --subgroup case.run JOB_WALLCLOCK_TIME=6:00:00) and look at the timing file to get a better sense of throughput before continuing the run in bigger chunks.
 

TCNasa

Tom Caldwell
Member
Thanks for your help. We are getting history files now. Can you suggest any good info on running in parallel? Multiple tasks and threads and such?
 