The number of land gridcells changes over simulations


Yuan Sun

Yuan Sun
Member
Hi all,

I am running regional simulations using CESM5.2.005. In lnd.log, I found that the total number of land gridcells changes from one run segment to the next within the same simulation. The case is set up with:
./xmlchange RUN_STARTDATE=0001-01-01
./xmlchange LND_DOMAIN_MESH="${LNDMESH}"
./xmlchange ATM_DOMAIN_MESH="${LNDMESH}"
./xmlchange MASK_MESH="${MASKMESH}"
./xmlchange DATM_YR_START=${STARTYEAR}
./xmlchange DATM_YR_END=${ENDYEAR}
./xmlchange DATM_YR_ALIGN=${STARTYEAR}
./xmlchange NTASKS=128
./xmlchange ROOTPE_LND=64
./xmlchange STOP_OPTION=nyears
./xmlchange STOP_N=${SPINUP}
./xmlchange RESUBMIT=2

in the first run:
Attempting to read global dimensions from surface dataset
(GETFIL): attempting to find local file
surfdata_005x005_UKwest_hist_1950_78pfts_c240526.nc
(GETFIL): using /mnt/iusers01/fatpou01/sees01/a16404ys/scratch/Projects/inputdata/project4/surfdata_005x005_UKwest_hist_1950_78pfts_c240526.nc
global ni,nj = 172 220
model grid is 2-dimensional

Computing land fraction and land mask by mapping mask from mesh_mask file
decomp precompute numg,nclumps,seglen1,avg_seglen,nsegspc= 29578 256 F 3.30111599 35.0000000
Surface Grid Characteristics
longitude points = 172
latitude points = 220
total number of land gridcells = 29578
Decomposition Characteristics
clumps per process = 2

in the second run (restart):
Attempting to read global dimensions from surface dataset
(GETFIL): attempting to find local file
surfdata_005x005_UKwest_hist_1950_78pfts_c240526.nc
(GETFIL): using /mnt/iusers01/fatpou01/sees01/a16404ys/scratch/Projects/inputdata/project4/surfdata_005x005_UKwest_hist_1950_78pfts_c240526.nc
global ni,nj = 172 220
model grid is 2-dimensional

Computing land fraction and land mask by mapping mask from mesh_mask file
decomp precompute numg,nclumps,seglen1,avg_seglen,nsegspc= 30253 256 F 3.37645078 35.0000000
Surface Grid Characteristics
longitude points = 172
latitude points = 220
total number of land gridcells = 30253
Decomposition Characteristics
clumps per process = 2


I tried several times and each time the total number of land gridcells was different. For example,

Computing land fraction and land mask by mapping mask from mesh_mask file
decomp precompute numg,nclumps,seglen1,avg_seglen,nsegspc= 33766 256 F 3.76852679 35.0000000
Surface Grid Characteristics
longitude points = 172
latitude points = 220
total number of land gridcells = 33766
Decomposition Characteristics
clumps per process = 1

Computing land fraction and land mask by mapping mask from mesh_mask file
decomp precompute numg,nclumps,seglen1,avg_seglen,nsegspc= 34268 256 F 3.82455349 35.0000000
Surface Grid Characteristics
longitude points = 172
latitude points = 220
total number of land gridcells = 34268
Decomposition Characteristics
clumps per process = 1


This resulted in the error:
check_dim_size ERROR: mismatch of input dimension 29578 with expected value 30253 for variable gridcell
Did you mean to set use_init_interp = .true. in user_nl_clm?
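To compare the reported counts across run segments without reading each log by hand, something like the following could be used. It is only a sketch: the function names are made up for illustration, and it assumes each lnd.log contains the "total number of land gridcells = N" line exactly as shown in the excerpts above.

```shell
# Hypothetical helpers (not part of CESM/CIME).
# Print "<file> <count>" for each "total number of land gridcells" line.
land_gridcell_counts() {
    for f in "$@"; do
        case "$f" in
            *.gz) gzip -dc -- "$f" ;;   # archived logs are often gzipped
            *)    cat -- "$f" ;;
        esac | awk -v name="$f" '/total number of land gridcells/ { print name, $NF }'
    done
}

# Succeed (exit status 0) only if all reported counts are identical.
gridcell_counts_consistent() {
    distinct=$(land_gridcell_counts "$@" | awk '{ print $NF }' | sort -u | wc -l)
    [ "$distinct" -le 1 ]
}
```

For example, `land_gridcell_counts lnd.log.*` lists one count per log, and `gridcell_counts_consistent lnd.log.* || echo "counts differ"` flags a mismatch before a restart is attempted.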

Is this normal? Could I use 'use_init_interp = .true.' to solve this error? Thanks for any comments.

Best,
Yuan
 

slevis

Moderator
Staff member
I agree this seems strange. I would not expect this error in a restart, but I may not have enough information about what you're doing.

Regardless, if the error message suggests use_init_interp = .true., then I recommend trying it.
 

slevis

Moderator
Staff member
It may be instructive to try the same test with the default model, i.e. nothing changed. Then you will presumably see how the model is supposed to behave. This may give you insight into what you may have done to get this result.
 

yifanc17

Yifan Cheng
New Member
Hi Yuan,

I also encountered the same error when trying to run a regional case (Compset: 2000_DATM%NLDAS2_CLM50%NWP-SP_SICE_SOCN_MOSART_SGLC_SWAV, Resolution: nldas2_rnldas2_mnldas2). And I already set use_init_interp = .true. in user_nl_clm. Did you try anything different that works? Thanks!
 

Yuan Sun

Yuan Sun
Member
Hi Slevis and Yifan,

Thanks for the insight.

I found that the failing case restarted with nrevsn = 'UKwest_spinup_i2000.clm2.r.0011-01-01-00000.nc' in lnd_in, and there was no finidat entry. Even after I added 'use_init_interp = .true.' to user_nl_clm, it did not work without specifying finidat.

Another issue is that I could not use
./xmlchange RUN_REFDIR=${RESTART}
./xmlchange RUN_REFDATE=${RE_DATE}
./xmlchange RUN_REFTOD=00000
./xmlchange GET_REFCASE=TRUE
./xmlchange RUN_REFCASE=${RE_CASE}
to specify the initial condition for a branch/hybrid simulation. Instead, I need to copy the restart files manually into the run directory and run
echo "finidat = 'UKwest_spinup_i2000.clm2.r.0011-01-01-00000.nc'" >> user_nl_clm
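The manual workaround described above could be wrapped up roughly as follows. This is a sketch only: stage_finidat is a made-up helper, not a CIME tool, and it assumes (as in the steps above) that a bare file name in finidat is resolved against the run directory on this machine.

```shell
# Hypothetical helper: copy a CLM restart file into the run directory and
# point finidat at it in user_nl_clm.
stage_finidat() {
    restart_file=$1   # path to a <case>.clm2.r.<date>.nc restart file
    run_dir=$2        # the case run directory
    case_dir=$3       # the case directory that holds user_nl_clm
    [ -f "$restart_file" ] || { echo "missing restart file: $restart_file" >&2; return 1; }
    cp -- "$restart_file" "$run_dir"/ || return 1
    name=$(basename -- "$restart_file")
    # Drop any earlier finidat line so the namelist does not set it twice.
    if [ -f "$case_dir/user_nl_clm" ]; then
        grep -v '^[[:space:]]*finidat[[:space:]]*=' "$case_dir/user_nl_clm" \
            > "$case_dir/user_nl_clm.tmp" || true
        mv "$case_dir/user_nl_clm.tmp" "$case_dir/user_nl_clm"
    fi
    printf "finidat = '%s'\n" "$name" >> "$case_dir/user_nl_clm"
}
```

Other settings in user_nl_clm, such as use_init_interp, are left untouched; only the finidat line is replaced.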

Could this be a porting issue? On another machine, ARCHER2, I did not need to copy the restart files manually, and finidat was added to lnd_in automatically.

Do you have any suggestions?

Best,
Yuan
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
It could be a porting issue. Settings of finidat should be ignored in a restart run. I would not expect the number of land grid cells to change in a restart run. We have extensive testing that verifies that, including, I believe, for a regional grid. On the other hand, @yifanc17 is reporting a similar problem with what appears to be an out-of-the-box regional case. Since we don't have access to Yuan's machines, we could take a look at @yifanc17's case and see if the problem can be replicated. @yifanc17, can you point me to the case on Derecho that had the land gridcell error?
 

yifanc17

Yifan Cheng
New Member
Hi Keith, I deleted the init_generated_files directory and rebuilt to make the case work, so the previous logs were already archived. Not sure if they would help, but the logs with 'check_dim_size ERROR: mismatch of input dimension 95769 with expected value 222849 for variable landunit' are here:
/glade/derecho/scratch/yifanc17/archive/i2000Clm50Sp.CTSM5.2.ERA5Land.0.125nldas2.CTRL.c240602/logs/cesm.log.4691474.desched1.240603-215825
and the case directory is here:
/glade/work/yifanc17/cases/i2000Clm50Sp.CTSM5.2.ERA5Land.0.125nldas2.CTRL.c240602
Let me know if there's anything else I can help with! Thank you!
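One way to see which file carries the unexpected size is to read the dimension straight from the netCDF header. The sketch below assumes the netCDF `ncdump` utility is available and that the dimensions in the file are named as in the error messages (gridcell, landunit); nc_dim_size is a made-up helper name.

```shell
# Hypothetical helper: read `ncdump -h` output on stdin and print the size
# of the dimension named in $1 (prints nothing if it is not found).
nc_dim_size() {
    awk -v dim="$1" '
        /^dimensions:/ { in_dims = 1; next }   # start of the dimension list
        /^variables:/  { in_dims = 0 }         # end of the dimension list
        in_dims && $1 == dim && $2 == "=" { print $3; exit }
    '
}
```

For example, `ncdump -h some_restart_file.nc | nc_dim_size landunit` prints the landunit count stored in that file, which can then be compared with the "expected value" in the check_dim_size message.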
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
Thanks for that information. I ran that case out of the box for three years with two restarts using ctsm5.2.005 and the number of gridcells, landunits, etc. was the same throughout the simulation. Then I reran the same case from the beginning with some changes I found in the user_nl_clm in that case:

n_dom_landunits = 0
n_dom_pfts = 0

fsurdat = '/glade/work/yifanc17/02_data/cesmdata/surfdata/0.125nldas2/surfdata_0.125nldas2_hist_2005_78pfts_ctrl_c240601.nc'

calc_human_stress_indices = 'ALL'
collapse_urban = .false.
soil_layerstruct_predefined = '20SL_8.5m'

and that gave me an error:

check_dim_size ERROR: mismatch of input dimension 96639
with expected value 199682 for variable landunit

As suggested by @yifanc17 , I deleted the init_generated_files directory in the run directory and the case ran successfully.

So if you change something related to the number of gridcells etc, like the surface dataset change made above, then you either need to create a new case or delete that initial files directory in the original case.
I'm not sure if this helps with Yuan's problem however.
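The cleanup step described above can be sketched as a small shell function. The function name is made up for illustration; the only thing taken from this thread is that the init_generated_files directory sits inside the case run directory.

```shell
# Hypothetical helper: remove <run_dir>/init_generated_files so that it is
# regenerated on the next run. Nothing else in the run directory is touched.
clear_init_generated_files() {
    run_dir=$1
    [ -n "$run_dir" ] && [ -d "$run_dir" ] || { echo "not a directory: $run_dir" >&2; return 1; }
    target="$run_dir/init_generated_files"
    if [ -d "$target" ]; then
        rm -rf -- "$target"
        echo "removed $target"
    else
        echo "nothing to remove in $run_dir"
    fi
}
```

From the case directory this could be called as `clear_init_generated_files "$(./xmlquery RUNDIR --value)"`, assuming xmlquery reports the run directory that way on your installation.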
 

Yuan Sun

Yuan Sun
Member
Hi Keith,

Thanks for your efforts.

'init_generated_files' seems to be a new feature in CTSM5.2 (I am not sure). Could it be introducing errors on restart?

The problem is not fully solved yet, but I set ./xmlchange RESUBMIT=0 for the simulation to avoid this issue.

Best,
Yuan
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
It was implemented in August 2022. It shouldn't affect restarts at all if the model is functioning properly.
 