
Problem in initializing cesm2 with a new grid

thakur.abubakar

Abu Bakar Siddiqui Thakur
Member
Dear All,

I'm trying to run an aquaplanet (QPC6, a compset I have used extensively with f19 resolution) simulation on a custom grid using CESM 2.1.3, but the case fails on execution.

I have set up the custom 2x20 grid following 3.2 of these instructions and with guidance from this thread. I am able to create the case (attached README.case) and build the executable successfully. I have updated cam/bld/config_files/horiz_grid.xml and the config_grids.xml file (both attached). I have specified the CAM namelist parameters using the defaults in the atm_in file of the same compset at f19 resolution (attached user_nl_cam). I have also tried to create the initial condition dataset for this grid (CAM namelist parameter ncdata, attached in zipped format) from the corresponding f19 file (/mnt/lustre/cas2/casabu/Abu/cesm21/cesm_input/atm/cam/inic/fv/aqua_0006-01-01_1.9x2.5_L32_c161020.nc). However, the case fails to execute and the job exits immediately (attached log files). The cpl log file indicates an issue in the startup of CAM. I suspect I am missing something in user_nl_cam, or have made an error in creating the ncdata input file for the CAM6 aquaplanet, but I cannot figure out exactly what.

Any assistance in this regard will be greatly appreciated.

Thanks in advance,
Abu Bakar Siddiqui
 

Attachments

  • version_info.txt
    7.3 KB · Views: 1
  • README.case.txt
    2.8 KB · Views: 2
  • config_grids.xml.txt
    114.7 KB · Views: 2
  • horiz_grid.xml.txt
    3.9 KB · Views: 1
  • atm.log.192455.sdb.210928-235035.txt
    29.2 KB · Views: 2
  • cesm.log.192455.sdb.210928-235035.txt
    15 KB · Views: 5
  • cpl.log.192455.sdb.210928-235035.txt
    42.1 KB · Views: 1
  • ncdata.zip.txt
    803.2 KB · Views: 3
  • user_nl_cam.txt
    10.1 KB · Views: 4
  • machine_config.zip.txt
    24.3 KB · Views: 0

fischer

CSEG and Liaisons
Staff member
I looked at your ncdata.zip file and see that it is in netCDF-4 format. We've run into issues on several different systems trying to read that format.
You can convert the ncdata file to a different format with this command:

nccopy -k cdf5 oldfile newfile
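
To confirm which format a file is in before and after the conversion, `ncdump -k` prints the kind. As an illustration only, the same check can be sketched in Python by reading the file's leading magic bytes. The function name `netcdf_kind` is hypothetical, and the sketch assumes the HDF5 signature sits at byte 0, which is the usual case for netCDF-4 files.

```python
# Magic bytes at the start of a netCDF file identify its on-disk format.
NETCDF_MAGIC = {
    b"CDF\x01": "classic (CDF-1)",
    b"CDF\x02": "64-bit offset (CDF-2)",
    b"CDF\x05": "64-bit data (CDF-5)",
}
# netCDF-4 files are HDF5 files; this is the HDF5 signature.
# Assumption: the signature is at offset 0 (no user block in front of it).
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"


def netcdf_kind(path):
    """Return a short description of the netCDF format of the file at `path`."""
    with open(path, "rb") as f:
        header = f.read(8)
    if header == HDF5_MAGIC:
        return "netCDF-4 (HDF5)"
    # Classic-family files start with 'CDF' followed by a version byte.
    return NETCDF_MAGIC.get(header[:4], "unknown")
```

Calling `netcdf_kind("newfile")` on the output of the nccopy command above should report the CDF-5 format if the conversion worked.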

Chris
 

thakur.abubakar

Abu Bakar Siddiqui Thakur
Member
Thank you for your response, Chris. I made the changes you suggested and tried running the case with a fresh build. However, the model still fails on execution (attached log files).
 

Attachments

  • atm.log.192796.sdb.210929-231826.txt
    29.2 KB · Views: 4
  • cesm.log.192796.sdb.210929-231826.txt
    15 KB · Views: 10
  • cpl.log.192796.sdb.210929-231826.txt
    42.1 KB · Views: 1

fischer

CSEG and Liaisons
Staff member
I think there's something wrong with your ncdata file, but I'm not sure. I'm going to move this thread to the CAM forum. They should be able to help you.

Chris
 

patc

Patrick Callaghan
New Member
Hi Abu,
The model is failing at or near the calls:
ret = pio_inq_varid(fh_ini,cnst_name(m_cnst), varid)
ret = pio_get_att(fh_ini, varid, 'units', trunits)
I have had an error like this before, when I tried to inquire about a variable or get an attribute that was
not present in the file. I think that you are missing a 'units' attribute on your specific humidity.
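
One way to look for this kind of problem before running the model is to list every variable in the initial-condition file that lacks a 'units' attribute. The following is a minimal sketch, not part of CAM: the helper name `variables_missing_attr` is hypothetical, and it assumes the file is opened with the netCDF4 Python package, whose Dataset objects expose a `variables` mapping and whose variables provide `ncattrs()`.

```python
def variables_missing_attr(dataset, attr="units"):
    """Return the sorted names of variables in `dataset` that lack `attr`.

    `dataset` is expected to behave like an open netCDF4.Dataset: it has a
    `.variables` mapping whose values provide an `ncattrs()` method.
    """
    return sorted(
        name
        for name, var in dataset.variables.items()
        if attr not in var.ncattrs()
    )


# Example use (requires the netCDF4 package; the file name is a placeholder):
#
#     from netCDF4 import Dataset
#     with Dataset("ncdata.nc") as ds:
#         print(variables_missing_attr(ds))
```

Coordinate or index variables may legitimately have no units, so the output is a list of candidates to inspect rather than a list of errors.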

--> Patrick
 

thakur.abubakar

Abu Bakar Siddiqui Thakur
Member
Hi Patrick,
Thank you for your response. I had indeed missed specifying the attributes. Thanks for that.
However, I'm getting a new error. Please find the log files attached.
 

Attachments

  • cpl.log.193289.sdb.211001-131325.txt
    53 KB · Views: 2
  • atm.log.193289.sdb.211001-131325.txt
    316.3 KB · Views: 2
  • cesm.log.193289.sdb.211001-131325.txt
    60.5 KB · Views: 3

thakur.abubakar

Abu Bakar Siddiqui Thakur
Member
Hello again,
I've managed to make some progress.
Since the last post, I've switched off DEBUG and found that the model is producing the following error (taken from cesm.log):

Lagrangian levels are crossing 9.999999999999999E-012
655 Run will ABORT!
656 Suggest to increase NSPLTVRM
657 ERROR: te_map: Lagrangian levels are crossing


On increasing fv_nspltvrm from 1 to 2, I'm getting the following error (from atm.log):

ERROR: Because of loop nesting, FV dycore can't use the specified namelist settings for subcycling
The original namelist settings were:
fv_nsplit = 4
fv_nspltrac = 1
fv_nspltvrm = 2

fv_nsplit needs to be a multiple of fv_nspltrac
which in turn needs to be a multiple of fv_nspltvrm.
Suggested settings would be:
fv_nsplit = 4
fv_nspltrac = 2
fv_nspltvrm = 2
ERROR: Bad namelist settings for FV subcycling.


As I understand it, the constraint here is NSPLIT/NSPLTRAC = m and NSPLTRAC/NSPLTVRM = n, where m and n are positive integers. However, I seem to be stuck in a loop, cycling between the two errors. When I increase NSPLIT to 16 and beyond, I get a segmentation fault.
Can anyone tell me the optimum setting that could work here?
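
The divisibility rule quoted in the error message can be checked outside the model. The sketch below is illustrative only: the function names are hypothetical, and it tests namelist consistency as stated in the log, not whether a given combination is numerically stable on a particular grid.

```python
def is_valid_fv_subcycling(nsplit, nspltrac, nspltvrm):
    """True if nsplit is a multiple of nspltrac and nspltrac of nspltvrm."""
    if min(nsplit, nspltrac, nspltvrm) < 1:
        return False
    return nsplit % nspltrac == 0 and nspltrac % nspltvrm == 0


def valid_fv_subcycling(max_nsplit):
    """List every (nsplit, nspltrac, nspltvrm) with nsplit <= max_nsplit
    that satisfies the divisibility rule."""
    return [
        (nsplit, nspltrac, nspltvrm)
        for nsplit in range(1, max_nsplit + 1)
        for nspltrac in range(1, nsplit + 1)
        for nspltvrm in range(1, nspltrac + 1)
        if is_valid_fv_subcycling(nsplit, nspltrac, nspltvrm)
    ]
```

With these definitions, `is_valid_fv_subcycling(4, 1, 2)` is false (the rejected settings in the log) and `is_valid_fv_subcycling(4, 2, 2)` is true (the suggested settings). Passing the check only means the namelist will be accepted; it does not stop the Lagrangian levels from crossing.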

Thanks in advance,
Abu Bakar Siddiqui
 

ahughes

Ann-Casey Hughes
New Member
Hi Abu,

I'm having the same "Lagrangian levels are crossing" problem. I've experimented with many different values of fv_nsplit, fv_nspltrac, and fv_nspltvrm, and I get either the error you got or an immediate crash. Did you ever figure it out?

Thanks!
Ann-Casey
 

thakur.abubakar

Abu Bakar Siddiqui Thakur
Member
Hey Ann,

No, sorry. The grid I had defined was unstable for the FV dycore. This typically happens when the grid cells are far from square.
This is the reply I received from Cheryl from the CAM working group:
"I spoke with someone today and their feeling is that your grid is probably an unstable grid for the FV dycore. The FV dycore was designed to support basically square boxes and your grid is very far from being square. Their feeling was that there is probably no easy workaround for this."

I used the Eulerian dycore for my experiment (I had to use T85 because I got the same error with the T42 grid!).

All the best!
 