
FHIST simulation: "NetCDF: Filter error: undefined filter encountered ERROR: HANDLE_NCERR"

xiaoxiaokuishu

Ru Xu
Member
Hi everyone,
When I run an FHIST simulation for 1990 (one year) on a supercomputer, the following error appears:

"using /scratch/snx3000/rxu/CCLM2_inputdata/cesm_inputdata/atm/cam/physprops/iceoptics_c080917.nc
reading ice cloud optics from file /scratch/snx3000/rxu/CCLM2_inputdata/cesm_inputdata/atm/cam/physprops/iceoptics_c080917.nc
checking dimensions of ext_sw_ice
NetCDF: Filter error: undefined filter encountered
ERROR: HANDLE_NCERR "

It seems there is something wrong with the inputdata, but I have checked the "iceoptics_c080917.nc" file and it seems fine.
My netcdf versions are netcdf-c-4.9.2 and netcdf-fortran-4.6.0.
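
The wording of the message suggests that the library met a compression filter it does not recognise while reading a variable. A minimal sketch for inspecting which storage and filter settings the file declares, assuming `ncdump` from the same netCDF installation is on the PATH. The helper name `list_filter_attrs` is hypothetical and only trims the `ncdump -hs` output down to the relevant special attributes:

```shell
# Keep only the storage/filter special attributes from `ncdump -hs` output.
# Reads the ncdump output on stdin so it can sit at the end of a pipeline.
list_filter_attrs() {
    grep -E ':_(Filter|DeflateLevel|Shuffle|Storage) *=' | sed -e 's/^[[:space:]]*//'
}

# Usage (requires ncdump; path copied from the log above):
#   ncdump -hs /scratch/snx3000/rxu/CCLM2_inputdata/cesm_inputdata/atm/cam/physprops/iceoptics_c080917.nc | list_filter_attrs
#   nc-config --version    # shows which netCDF-C installation the shell resolves
```

If no `_Filter` line shows up for `ext_sw_ice`, the file itself is probably not asking for an unusual filter, and the library or source-code side is the more likely place to look.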
 

Attachments

  • cscs_atm.PNG
    21 KB · Views: 9

peverley

Courtney Peverley
Moderator
Hi,

A couple things to try/questions for you:

1. Can you run ./check_input_data in your case directory? This will check whether your iceoptics file has changed since it was retrieved from the inputdata server. If it has changed, you'll need to re-download it from the server.

2. If the file looks ok, can you give me the following information so I can attempt to recreate your issue: CESM version, compiler, resolution

Courtney
 

xiaoxiaokuishu

Ru Xu
Member
Thanks for your response. I forgot to upload my log file, so I have attached it now. The detailed settings are:

COMPSET=FHIST
RES=f09_f09_mg17
gcc (SUSE Linux) 7.5.0
GNU Fortran (SUSE Linux) 7.5.0
CESM2.2.1
do_transient_crops = .true.
do_transient_pfts = .true.
!flanduse_timeseries = ''
!fsurdat = "$CESMDATAROOT/cesm_inputdata/CTSM_hcru_inputdata/surfdata_360x720cru_16pfts_Irrig_CMIP6_simyr2000_c170824.nc"
 

Attachments

  • cesm.log.49585184.zip
    249.1 KB · Views: 3

peverley

Courtney Peverley
Moderator
Thanks for the info,

I was able to run a test case past the point of reading in the iceoptics file. The main difference is that my version of GNU is 9.3.0 and my version of netcdf is 4.7.4.

Do you have access to any other compilers (or newer versions of gnu) that you could try? My guess is that the problem is with your specific combination of compiler and netcdf library versions.

Courtney
 

xiaoxiaokuishu

Ru Xu
Member
Thanks! I will try your combination.
 

xiaoxiaokuishu

Ru Xu
Member
Hi,
I have rechecked my settings: my simulation is built with gcc 9.3.0 and netcdf 4.7.4, and I installed ESMF 8.4.1 (compatible with netcdf 4.9).
Besides, I can run a land-only simulation (I2000Clm51Sp compset) without problems, but the error appears when I change to FHIST, so it is likely related to CAM.

These are the modules I have loaded when running the model
<
1) modules/3.2.11.4 15) dmapp/7.1.1-7.0.3.1_3.17__g93a7e9f.ari
2) craype-network-aries 16) gni-headers/5.0.12.0-7.0.3.1_3.7__gd0d73fe.ari
3) cray-mpich/7.7.18 17) xpmem/2.2.27-7.0.3.1_3.9__gada73ac.ari
4) slurm/20.11.8-4 18) job/2.2.4-7.0.3.1_3.14__g36b56f4.ari
5) craype-haswell 19) dvs/2.12_2.2.224-7.0.3.1_3.12__gc77db2af
6) daint-gpu/21.09 20) alps/6.6.67-7.0.3.1_3.18__gb91cd181.ari
7) spack-config/1.7 21) rca/2.2.20-7.0.3.1_3.15__g8e3fb5b.ari
8) cray-python/3.9.4.1 22) atp/3.14.5
9) gcc/9.3.0 23) perftools-base/21.09.0
10) craype/2.7.10 24) PrgEnv-gnu/6.0.10
11) cray-libsci/20.09.1 25) cray-netcdf-hdf5parallel/4.7.4.4
12) udreg/2.3.2-7.0.3.1_3.13__g5f0d670.ari 26) cray-hdf5-parallel/1.12.0.4
13) ugni/6.0.14.0-7.0.3.1_6.2__g8101a58.ari 27) cray-parallel-netcdf/1.12.1.4
>


Do you have any idea what might be causing this?
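
Since the first post mentions netcdf-c-4.9.2 while the module list shows cray-netcdf-hdf5parallel/4.7.4.4, it may help to confirm which libraries the executable actually resolves at run time. A rough sketch, assuming the executable is dynamically linked and lives at `$EXEROOT/cesm.exe` (an assumption; `./xmlquery EXEROOT` should report the build directory). The helper name `linked_io_libs` is hypothetical:

```shell
# Reduce `ldd` output (read from stdin) to the netCDF/PnetCDF/HDF5 libraries.
# Prints "<soname> => <resolved path>" for each match.
linked_io_libs() {
    grep -i -E 'lib(netcdf|netcdff|pnetcdf|hdf5)[^ ]*\.so' | awk '{print $1, $2, $3}'
}

# Usage (requires ldd and a built executable):
#   ldd "$EXEROOT/cesm.exe" | linked_io_libs
```

If nothing is printed, the executable may be statically linked, in which case the link line in the build log is the place to check instead.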
 

peverley

Courtney Peverley
Moderator
Hi,

I'm sorry you're still getting the error!

Can you try running:

./check_input_data --chksum

In your case directory?

I had previously thought that just ./check_input_data would compare the chksums for the files, but you need to include the "--chksum" argument to get that feature. If there's a mismatch for the iceoptics file, then that is your problem and you need to re-download it. If the file is OK, I'm running out of ideas! You could try a completely different compiler if you have access to that (intel?). Also, do you have DEBUG turned on? It's unlikely that it will help, but worth a shot.

To turn on debug, in your case directory, run:

./xmlchange DEBUG=TRUE
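
To double-check a single input file by hand, its checksum can also be compared against a known value. This is only a sketch: it assumes the reference checksums are MD5 digests, that `md5sum` (GNU coreutils) is available, and that you look up the expected digest for the file yourself in the downloaded `inputdata_checksum.dat` (its exact layout is not shown in this thread). `verify_md5` is a hypothetical helper name:

```shell
# Compare a file's MD5 digest with an expected value.
#   $1 = path to the file
#   $2 = expected MD5 hex digest
verify_md5() {
    actual=$(md5sum "$1" | awk '{print $1}')
    if [ "$actual" = "$2" ]; then
        echo "OK $1"
    else
        echo "MISMATCH $1 (got $actual)"
        return 1
    fi
}

# Usage (replace <expected-digest> with the value you looked up):
#   verify_md5 /path/to/iceoptics_c080917.nc <expected-digest>
```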
 

xiaoxiaokuishu

Ru Xu
Member
Hi,
The previously attached log file is from a run with DEBUG=TRUE.

After I run ./check_input_data --chksum, the following information is displayed:

Client protocol gftp not enabled
Using protocol wget with user anonymous and passwd user@example.edu
Trying to download file: '../inputdata_checksum.dat' to path '/scratch/snx3000/rxu/cases/CESMdev.gnu.FHIST.f09_f09_mg17.glob.test_20231018-1112/run/inputdata_checksum.dat.raw' using WGET protocol.
SUCCESS

Using protocol ftp with user anonymous and passwd user@example.edu
server address ftp.cgd.ucar.edu root path cesm/inputdata
Trying to download file: '../inputdata_checksum.dat' to path '/scratch/snx3000/rxu/cases/CESMdev.gnu.FHIST.f09_f09_mg17.glob.test_20231018-1112/run/inputdata_checksum.dat.raw' using FTP protocol.
Using protocol svn with user and passwd
Client protocol None not enabled
ERROR: Undefined env var 'CESMDATAROOT'


Please ignore the "ERROR: Undefined env var 'CESMDATAROOT'" message, because I have already defined it.
 

xiaoxiaokuishu

Ru Xu
Member
Hi,

I have uploaded the log file from ./check_input_data --chksum (check.log) and the default check_input_data.log file.
 

Attachments

  • check_input_data.zip
    19.4 KB · Views: 2

peverley

Courtney Peverley
Moderator
Hi again,

Since your chksum passed, it's not the file that is causing the problem.

I took a look at my checkout of cesm2.2.1, and my version of the file that is generating the error (components/cam/src/physics/rrtmg/cloud_rad_props.F90) does not line up with the error message you are getting. The error message says the problem is at line 246, which is consistent with a more recent version of CESM than 2.2.1. Are you sure you have cesm2.2.1 checked out? If so, have you run ./manage_externals/checkout_externals recently?

Courtney
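
One way to answer the version question is to compare the tag git reports with the release you expect. A small sketch, assuming `release-cesm2.2.1` is a tag in the CESM repository and that the commands are run from the top of the CESM source tree. The helper name `check_cesm_tag` is hypothetical:

```shell
# Compare the expected release tag with what git reports.
#   $1 = expected tag
#   $2 = tag reported by `git describe --tags`
check_cesm_tag() {
    if [ "$1" = "$2" ]; then
        echo "OK: checkout is at $1"
    else
        echo "MISMATCH: expected $1 but found $2"
        return 1
    fi
}

# Usage (requires git and a CESM checkout):
#   check_cesm_tag release-cesm2.2.1 "$(git describe --tags)"
#   ./manage_externals/checkout_externals -S   # status of the component checkouts
```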
 

xiaoxiaokuishu

Ru Xu
Member
Hi,

Thanks for the reminder about the CESM version. I actually downloaded CESM with a plain git clone of the ESCOMP/CESM GitHub repository.
I assumed the version was CESM2.2.1, but after checking, it is actually cesm2.2.1-exp17.
I have now switched to release-cesm2.2.1 (using git clone https://github.com/ESCOMP/CESM.git --branch release-cesm2.2.1).
The error we discussed has disappeared, but a new error now appears (I include only part of it from the cesm.log file below). I can, however, run IHIST successfully.

Experiment data directory written:
/scratch/snx3000/rxu/cases/CESMdev.gnu.FHIST.f09_f09_mg17.glob.test_20231026-1321/run/cesm.exe+16103-5668s
srun: error: nid05677: tasks 108-113,115-119: Floating point exception
srun: launch/slurm: _step_signal: Terminating StepId=49679555.0
srun: error: nid05671: tasks 36-44,46-47: Floating point exception
srun: error: nid05669: tasks 12-21,23: Floating point exception
slurmstepd: error: *** STEP 49679555.0 ON nid05668 CANCELLED AT 2023-10

I have attached all the log files for your reference.
 

peverley

Courtney Peverley
Moderator
I can't tell much from that error. Did you create a whole new case after you checked out release-cesm2.2.1?

Also, in your $CASE/run directory, are there any files of the format "PETXXXX.ESMF_LogFile"? Those files can sometimes contain more information.
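
To narrow down which PET log files are worth opening first, the srun messages can be condensed to a list of nodes and task ranges. A minimal sketch that works on the srun lines quoted in the previous post; the helper name `fpe_nodes` is hypothetical, and the log file names in the usage comment are assumptions about what the run directory contains:

```shell
# Condense srun "Floating point exception" lines (read from stdin)
# into "<node> <task list>" pairs.
fpe_nodes() {
    sed -n 's/^srun: error: \([^:]*\): tasks\{0,1\} \([^:]*\): Floating point exception.*$/\1 \2/p'
}

# Usage, from the run directory:
#   cat cesm.log.* | fpe_nodes
#   ls PET*.ESMF_LogFile 2>/dev/null | head
```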
 

xiaoxiaokuishu

Ru Xu
Member
Hi,

Yes, I created a new case for the simulation. I think I have temporarily solved the problem I mentioned with release-cesm2.2.1:
my previous simulation was run with DEBUG=TRUE, and when I set it to FALSE, the error disappeared.

Thank you for your support!
 