Hello everyone, I want to run a pre-industrial (PI) experiment with the B1850 compset. I set RUNTYPE to 'startup', and the model ran successfully for more than 60 model years but then crashed in year 63. The main error messages are listed below. I don't know how to solve this, so any advice is welcome. Thanks in advance.
1. Errors in cesm.log file:
xm_wpxp band solver: singular matrix
wp2_wp3 band solver: singular matrix
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global index: 4765
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Faxa_dstwet3 1d global index: 4621
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global index: 4331
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global index: 4188
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global index: 4620
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global index: 4189
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global index: 4762
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Faxa_dstwet3 1d global index: 4477
ERROR:
component_mod:check_fields NaN found in ATM instance: 1 field Sa_z 1d global index: 4474
Image PC Routine Line Source
cesm.exe 0000000002F89744 Unknown Unknown Unknown
cesm.exe 0000000002C1655E shr_abort_mod_mp_ 114 shr_abort_mod.F90
cesm.exe 0000000000435EB0 component_type_mo 257 component_type_mod.F90
cesm.exe 0000000000431C87 component_mod_mp_ 731 component_mod.F90
cesm.exe 000000000041885D cime_comp_mod_mp_ 3465 cime_comp_mod.F90
cesm.exe 0000000000431557 MAIN__ 125 cime_driver.F90
cesm.exe 0000000000414BDE Unknown Unknown Unknown
libc-2.17.so 00002B0655ABAB35 __libc_start_main Unknown Unknown
cesm.exe 0000000000414AE9 Unknown Unknown Unknown
application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 129
Image PC Routine Line Source
cesm.exe 0000000002F89744 Unknown Unknown Unknown
cesm.exe 0000000002C1655E shr_abort_mod_mp_ 114 shr_abort_mod.F90
cesm.exe 0000000000435EB0 component_type_mo 257 component_type_mod.F90
cesm.exe 0000000000431C87 component_mod_mp_ 731 component_mod.F90
cesm.exe 000000000041885D cime_comp_mod_mp_ 3465 cime_comp_mod.F90
cesm.exe 0000000000431557 MAIN__ 125 cime_driver.F90
cesm.exe 0000000000414BDE Unknown Unknown Unknown
libc-2.17.so 00002B060C8FBB35 __libc_start_main Unknown Unknown
cesm.exe 0000000000414AE9 Unknown Unknown Unknown
Image PC Routine Line Source
cesm.exe 0000000002F89744 Unknown Unknown Unknown
cesm.exe 0000000002C1655E shr_abort_mod_mp_ 114 shr_abort_mod.F90
cesm.exe 0000000000435EB0 component_type_mo 257 component_type_mod.F90
cesm.exe 0000000000431C87 component_mod_mp_ 731 component_mod.F90
cesm.exe 000000000041885D cime_comp_mod_mp_ 3465 cime_comp_mod.F90
cesm.exe 0000000000431557 MAIN__ 125 cime_driver.F90
cesm.exe 0000000000414BDE Unknown Unknown Unknown
libc-2.17.so 00002AD925E4DB35 __libc_start_main Unknown Unknown
cesm.exe 0000000000414AE9 Unknown Unknown Unknown
Image PC Routine Line Source
cesm.exe 0000000002F89744 Unknown Unknown Unknown
cesm.exe 0000000002C1655E shr_abort_mod_mp_ 114 shr_abort_mod.F90
cesm.exe 0000000000435EB0 component_type_mo 257 component_type_mod.F90
cesm.exe 0000000000431C87 component_mod_mp_ 731 component_mod.F90
cesm.exe 000000000041885D cime_comp_mod_mp_ 3465 cime_comp_mod.F90
cesm.exe 0000000000431557 MAIN__ 125 cime_driver.F90
cesm.exe 0000000000414BDE Unknown Unknown Unknown
libc-2.17.so 00002B0B8C5D9B35 __libc_start_main Unknown Unknown
cesm.exe 0000000000414AE9 Unknown Unknown Unknown
Image PC Routine Line Source
cesm.exe 0000000002F89744 Unknown Unknown Unknown
cesm.exe 0000000002C1655E shr_abort_mod_mp_ 114 shr_abort_mod.F90
cesm.exe 0000000000435EB0 component_type_mo 257 component_type_mod.F90
cesm.exe 0000000000431C87 component_mod_mp_ 731 component_mod.F90
cesm.exe 000000000041885D cime_comp_mod_mp_ 3465 cime_comp_mod.F90
cesm.exe 0000000000431557 MAIN__ 125 cime_driver.F90
cesm.exe 0000000000414BDE Unknown Unknown Unknown
libc-2.17.so 00002B5C4954EB35 __libc_start_main Unknown Unknown
cesm.exe 0000000000414AE9 Unknown Unknown Unknown
Image PC Routine Line Source
cesm.exe 0000000002F89744 Unknown Unknown Unknown
cesm.exe 0000000002C1655E shr_abort_mod_mp_ 114 shr_abort_mod.F90
cesm.exe 0000000000435EB0 component_type_mo 257 component_type_mod.F90
cesm.exe 0000000000431C87 component_mod_mp_ 731 component_mod.F90
cesm.exe 000000000041885D cime_comp_mod_mp_ 3465 cime_comp_mod.F90
cesm.exe 0000000000431557 MAIN__ 125 cime_driver.F90
cesm.exe 0000000000414BDE Unknown Unknown Unknown
libc-2.17.so 00002B7E56EB3B35 __libc_start_main Unknown Unknown
cesm.exe 0000000000414AE9 Unknown Unknown Unknown
application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 29
application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 60
Image PC Routine Line Source
cesm.exe 0000000002F89744 Unknown Unknown Unknown
cesm.exe 0000000002C1655E shr_abort_mod_mp_ 114 shr_abort_mod.F90
cesm.exe 0000000000435EB0 component_type_mo 257 component_type_mod.F90
cesm.exe 0000000000431C87 component_mod_mp_ 731 component_mod.F90
cesm.exe 000000000041885D cime_comp_mod_mp_ 3465 cime_comp_mod.F90
cesm.exe 0000000000431557 MAIN__ 125 cime_driver.F90
cesm.exe 0000000000414BDE Unknown Unknown Unknown
libc-2.17.so 00002ABDD3A94B35 __libc_start_main Unknown Unknown
cesm.exe 0000000000414AE9 Unknown Unknown Unknown
application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 36
application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 64
application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 98
application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 104
application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 105
application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 111
Image PC Routine Line Source
cesm.exe 0000000002F89744 Unknown Unknown Unknown
cesm.exe 0000000002C1655E shr_abort_mod_mp_ 114 shr_abort_mod.F90
cesm.exe 0000000000435EB0 component_type_mo 257 component_type_mod.F90
cesm.exe 0000000000431C87 component_mod_mp_ 731 component_mod.F90
cesm.exe 000000000041885D cime_comp_mod_mp_ 3465 cime_comp_mod.F90
cesm.exe 0000000000431557 MAIN__ 125 cime_driver.F90
cesm.exe 0000000000414BDE Unknown Unknown Unknown
libc-2.17.so 00002B70B858DB35 __libc_start_main Unknown Unknown
cesm.exe 0000000000414AE9 Unknown Unknown Unknown
application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 75
application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 71
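To get an overview of which coupler fields the NaN reports mention, something like the following could be used. This is only a rough sketch: the function name count_nan_fields is made up for illustration, and it assumes the log keeps the "check_fields NaN found ... field <name>" wording shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of CESM/CIME): count NaN reports per
# coupler field in a log file whose path is given as the first argument.
# Only the field name is matched, so the index values are ignored.
count_nan_fields() {
    grep 'check_fields NaN found' "$1" \
        | sed 's/.* field \([A-Za-z0-9_]*\).*/\1/' \
        | sort | uniq -c | sort -rn \
        | awk '{print $2 "=" $1}'
}

# Example call (the path is an assumption; use the actual log file name):
#   count_nan_fields cesm.log
```

Each output line has the form name=count, with the most frequently reported field first.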
2. The following is every step I took:
./create_newcase --case $CASEROOT --compset B1850 --res f19_g17 --mach NJU
cd $CASEROOT
./xmlchange --file env_run.xml --id DIN_LOC_ROOT --val $INPUTDIR
./xmlchange --file env_run.xml --id RUNDIR --val $RUNDIR
./xmlchange --file env_run.xml --id RUNTYPE --val 'startup'
./xmlchange NTASKS_ATM=168,NTHRDS_ATM=1,ROOTPE_ATM=0
./xmlchange NTASKS_ICE=168,NTHRDS_ICE=1,ROOTPE_ICE=0
./xmlchange NTASKS_LND=168,NTHRDS_LND=1,ROOTPE_LND=0
./xmlchange NTASKS_CPL=168,NTHRDS_CPL=1,ROOTPE_CPL=0
./xmlchange NTASKS_ROF=168,NTHRDS_ROF=1,ROOTPE_ROF=0
./xmlchange NTASKS_OCN=168,NTHRDS_OCN=1,ROOTPE_OCN=0
./xmlchange NTASKS_GLC=168,NTHRDS_GLC=1,ROOTPE_GLC=0
./xmlchange NTASKS_WAV=168,NTHRDS_WAV=1,ROOTPE_WAV=0
./xmlchange NTASKS_ESP=168,NTHRDS_ESP=1,ROOTPE_ESP=0
./case.setup
./case.build
./xmlchange --file env_run.xml --id RESUBMIT --val '9'
./xmlchange --file env_run.xml --id CONTINUE_RUN --val 'FALSE'
./xmlchange --file env_run.xml --id STOP_N --val '10'
./xmlchange --file env_run.xml --id STOP_OPTION --val 'nyears'
./xmlchange --file env_run.xml --id REST_N --val '5'
./xmlchange --file env_run.xml --id REST_OPTION --val 'nyears'
./xmlchange --file env_run.xml --id DOUT_S --val 'FALSE'
./case.submit
No other changes.
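For anyone who wants to reproduce the PE layout, the nine NTASKS/NTHRDS/ROOTPE commands above all follow the same pattern, so they could be generated in a loop. The sketch below is a dry run that only prints the commands; pe_layout_args is a hypothetical helper name, and the component list is simply the one used in the steps above.

```shell
#!/bin/sh
# Hypothetical helper: print one xmlchange argument string per component,
# with the given task count, 1 thread, and root PE 0 for every component.
pe_layout_args() {
    ntasks=$1
    for comp in ATM ICE LND CPL ROF OCN GLC WAV ESP; do
        echo "NTASKS_${comp}=${ntasks},NTHRDS_${comp}=1,ROOTPE_${comp}=0"
    done
}

# Dry run: print the commands instead of executing them.
pe_layout_args 168 | while read -r args; do
    echo ./xmlchange "$args"
done
```

Removing the echo in the last loop would run the commands, which only makes sense inside the case directory.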