
Error due to NetCDF feature during CESM testing

stevenDH

Member
During the CESM pre-alpha tests on our tier-2 HPC system, I got an error related to the NetCDF version for the following tests:
SMS_D_Ln9.f09_f09_mg17.FCHIST.hydra_gnu.cam-outfrq9s
SMS_D_Ln9.f09_f09_mg17.FWHIST.hydra_gnu.cam-reduced_hist3s
SMS_D_Ln9.f09_f09_mg17.FWscHIST.hydra_gnu.cam-outfrq9s

The error was:
pio_support: pio_die:: myrank= -1 : ERROR: ionf_mod.F90: 235 : NetCDF: Attempt to use feature that was not turned on when netCDF was built.

However, our build of NetCDF-Fortran already provides the F03 module, which is newer than and backwards compatible with the F90 interface, so I wonder what the problem could be; it does not look like any of the netCDF features are disabled.

Does anyone have an idea what this might be related to?
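For reference, one quick way to check which optional features the local netCDF build enables (a sketch, assuming the nc-config and nf-config helper scripts that ship with the libraries are on the PATH):

Code:
$ nc-config --has-nc4   # was the C library built with netCDF-4/HDF5 support?
$ nf-config --has-f03   # does NetCDF-Fortran provide the Fortran 2003 (F03) interface?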
 

jedwards

CSEG and Liaisons
Staff member
The problem may be that an input file is in netCDF-4 format, which seems to present problems on some systems. We have updated our inputdata to replace these files with equivalent files in alternate formats. If you can provide the file name, I think I can suggest an alternate file.
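As a quick check of which on-disk format a given input file uses (a sketch using the standard netCDF command-line tools; the file name below is just a placeholder):

Code:
$ ncdump -k some_inputdata_file.nc               # prints "classic", "64-bit offset", "cdf5", or "netCDF-4"
$ nccopy -k cdf5 some_inputdata_file.nc out.nc   # one possible conversion to an alternate format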
 

stevenDH

Member
The first two tests both fail on this file:

atm/cam/chem/emis/CMIP6_emissions_1750_2015/emissions-cmip6_so4_a2_contvolcano_vertical_850-5000_0.9x1.25_c20170724.nc

And the last one on

atm/cam/topo/fv_0.9x1.25_nc3000_Nsw042_Nrs008_Co060_Fi001_ZR_sgh30_24km_GRNL_c170103.nc

I've tried deleting these files and rerunning the model so that a newer version gets downloaded, but I guess that if the file format stays the same this won't help much. Looking forward to your reply!

Cheers
Steven
 

stevenDH

Member
To return to this issue: I've started porting CESM2 to a new machine, but again I get errors similar to the above. An example test that fails is the following:

ERS_D.f09_g17.B1850.breniac_intel.allactive-defaultio

error:
Code:
/projects/climate/cesm/inputdata/atm/cam/chem/emis/CMIP6_emissions_1750_2015/emissions-cmip6_num_a1_so4_contvolcano_vertical_850-5000_0.9x1.25_c20170724.nc     1900544

 NetCDF: Variable not found

 NetCDF: Invalid dimension ID or name

The case and build seem fine, and I'm using netCDF 4.6.0, so that should be fine as well.
Do you have any solutions for this? I've tried automatically regenerating these files before, but that doesn't seem to change anything.

Thanks for the support!
Steven
 

sacks

Bill Sacks
CSEG and Liaisons
Staff member
Those diagnostic messages typically appear in log files even when there is no issue. Is the run failing? If so, please attach the log files from the run if you can't see the actual error yourself.
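If you want to hunt for it yourself first, something along these lines often surfaces the real failure near the end of the CESM log (a rough sketch; the exact log file name and error strings vary by run):

Code:
$ tail -n 50 cesm.log.*                     # the traceback is usually near the end
$ grep -n -i -E "error|abort" cesm.log.*    # crude filter; adjust the pattern as needed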
 

stevenDH

Member
The run is indeed failing, please find the log file attached to this post.
 

Attachments

  • cesm.log.40642270.moab.tier1.hpc.kuleuven.be.txt (105.7 KB)

mlevy

Michael Levy
CSEG and Liaisons
Staff member
Can you please run the following in your case directory and provide the output?

Code:
$ ./xmlquery NTASKS_OCN,NTHRDS_OCN

The space curve decomposition assumes the number of ocean blocks contains factors of 2, 3, and 5 (perhaps ONLY factors of 2, 3, and 5?). What grid are you running on? There is an XML file $CESMROOT/components/pop/bld/generate_pop_decomp.xml that lists pre-defined decompositions (all using cartesian instead of spacecurve) depending on your grid. For example, the default configuration on cheyenne for running POP on the gx1v7 grid without biogeochemistry uses NTASKS_OCN=144, NTHRDS_OCN=1 which corresponds to

Code:
  <decomp nproc="144" res="gx1v[67]" >
    <maxblocks >1</maxblocks>
    <bsize_x   >27</bsize_x>
    <bsize_y   >32</bsize_y>
    <nx_blocks >12</nx_blocks>
    <ny_blocks >12</ny_blocks>
    <decomptype>cartesian</decomptype>
  </decomp>

If it lines up nicely with your computer hardware (core counts, node counts, etc), we would recommend running POP with a task count that corresponds to a decomposition in that file.
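For example, if you end up matching the 144-task gx1v7 decomposition above, changing the ocean PE layout from the case directory might look like this (a sketch; substitute whichever task/thread counts you settle on):

Code:
$ ./xmlchange NTASKS_OCN=144,NTHRDS_OCN=1
$ ./case.setup --reset   # re-run setup after changing task counts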
 

stevenDH

Member
Thank you very much for the feedback; this does indeed appear to be the case. The machine we use has 28 cores per node, and when we change the machine config files to use only 24 cores per node the error goes away (presumably because 28 brings in a factor of 7, whereas 24 contains only factors of 2 and 3).
So just to be clear: this means that when I run the ocean component at a different resolution I might need to change the number of cores per node, right?
 

mlevy

Michael Levy
CSEG and Liaisons
Staff member
What we typically do is update the generate_pop_decomp.xml file I mentioned previously with a reasonable decomposition for the number of tasks (nodes * cores) that we want to run with. It's probably not worth going into the math behind the breakdown, but I'm happy to provide you with reasonable decompositions if you let me know how many tasks you are running and which grid(s) you need decompositions for.
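To give a flavor of what such an entry looks like, here is a hypothetical decomposition (not taken from the shipped file) for 96 tasks, e.g. 4 nodes x 24 cores, on the gx1v7 grid (320 x 384 points): 8 x 12 blocks of 40 x 32 points, since 40*8 = 320 and 32*12 = 384.

Code:
  <decomp nproc="96" res="gx1v[67]" >
    <maxblocks >1</maxblocks>
    <bsize_x   >40</bsize_x>
    <bsize_y   >32</bsize_y>
    <nx_blocks >8</nx_blocks>
    <ny_blocks >12</ny_blocks>
    <decomptype>cartesian</decomptype>
  </decomp>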
 