Hello, everyone.
I want to run a pre-industrial (PI) experiment with the B1850 compset. The case builds and starts running without errors. However, it crashed after 3 model years with the following error in cesm.log (ERROR: component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global). I don't know how to solve it. Could you give me some suggestions? Thank you.
The 'RUN_TYPE' is hybrid and 'RUN_REFCASE' is 'b.e20.B1850.f19_g17.release_cesm2_1_0.020'. Both are the default settings. I don't know whether they could cause the error.
1. These are all the steps I took:
./create_newcase --case $CASEROOT --compset B1850 --res f19_g17 --mach NJU
cd $CASEROOT
./xmlchange --file env_run.xml --id DIN_LOC_ROOT --val $INPUTDIR
./xmlchange --file env_run.xml --id RUNDIR --val $RUNDIR
./xmlchange NTASKS_ATM=168,NTHRDS_ATM=1,ROOTPE_ATM=0
./xmlchange NTASKS_ICE=168,NTHRDS_ICE=1,ROOTPE_ICE=0
./xmlchange NTASKS_LND=168,NTHRDS_LND=1,ROOTPE_LND=0
./xmlchange NTASKS_CPL=168,NTHRDS_CPL=1,ROOTPE_CPL=0
./xmlchange NTASKS_ROF=168,NTHRDS_ROF=1,ROOTPE_ROF=0
./xmlchange NTASKS_OCN=168,NTHRDS_OCN=1,ROOTPE_OCN=0
./xmlchange NTASKS_GLC=168,NTHRDS_GLC=1,ROOTPE_GLC=0
./xmlchange NTASKS_WAV=168,NTHRDS_WAV=1,ROOTPE_WAV=0
./xmlchange NTASKS_ESP=168,NTHRDS_ESP=1,ROOTPE_ESP=0
./case.setup
./case.build
./xmlchange --file env_run.xml --id RESUBMIT --val '0'
./xmlchange --file env_run.xml --id CONTINUE_RUN --val 'FALSE'
./xmlchange --file env_run.xml --id STOP_N --val '10'
./xmlchange --file env_run.xml --id STOP_OPTION --val 'nyears'
./xmlchange --file env_run.xml --id REST_N --val '6'
./xmlchange --file env_run.xml --id REST_OPTION --val 'nmonth'
./xmlchange --file env_run.xml --id DOUT_S --val 'FALSE'
./case.submit
2. The cesm.log file is too large to upload, so I have pasted only the relevant error messages:
xm_wpxp band solver: singular matrix
wp2_wp3 band solver: singular matrix
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 4619
ERROR:
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Faxa_bcphiwet
1d global index: 4333
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 4763
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 4331
index: 4188
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 4621
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 4189
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 4907
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 4765
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 4477
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 4909
xm_wpxp band solver: singular matrix
wp2_wp3 band solver: singular matrix
xm_wpxp band solver: singular matrix
wp2_wp3 band solver: singular matrix
xm_wpxp band solver: singular matrix
wp2_wp3 band solver: singular matrix
xm_wpxp band solver: singular matrix
wp2_wp3 band solver: singular matrix
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global
index: 4473
Image PC Routine Line Source
cesm.exe 0000000002F89744 Unknown Unknown Unknown
cesm.exe 0000000002C1655E shr_abort_mod_mp_ 114 shr_abort_mod.F90
cesm.exe 0000000000435EB0 component_type_mo 257 component_type_mod.F90
cesm.exe 0000000000431C87 component_mod_mp_ 731 component_mod.F90
cesm.exe 000000000041885D cime_comp_mod_mp_ 3465 cime_comp_mod.F90
cesm.exe 0000000000431557 MAIN__ 125 cime_driver.F90
cesm.exe 0000000000414BDE Unknown Unknown Unknown
libc-2.17.so 00002B2557148B35 __libc_start_main Unknown Unknown
cesm.exe 0000000000414AE9 Unknown Unknown Unknown
application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 129
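In case it helps to narrow this down, here is a rough way to tally which coupler fields report NaNs in the full log. It is only a sketch: the function name count_nan_fields and the log file name cesm.log are placeholders of mine, and it assumes every message follows the "NaN found in ATM instance: 1 field <name>" format shown above.

```shell
# Sketch: count how often each field name appears in "NaN found" messages.
# Assumes the message format shown in the log excerpt above.
count_nan_fields() {
  # Extract the field name that follows "field", then count occurrences,
  # printing the most frequent field first.
  sed -n 's/.*NaN found in [A-Z]* instance: *[0-9]* *field *\([A-Za-z0-9_]*\).*/\1/p' "$1" \
    | sort | uniq -c | sort -rn
}

# Placeholder path: point this at the actual log file in the run directory.
if [ -f cesm.log ]; then
  count_nan_fields cesm.log
fi
```

Each output line is a count followed by a field name, so it is easy to see whether Sa_z is the only affected field or whether others such as Faxa_bcphiwet show up as well.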
I am looking forward to your replies! Thanks.
Gaya