
ERROR - Model did not complete

Dear all, I am running CLM 4.0 and I get an error after the first 2 months of the run. The error in poe.stdout.94354.0:
-------------------------------------------------------------------------
 CCSM BUILDNML SCRIPT STARTING
 - To prestage restarts, untar a restart.tar file into /gpfs/scratch/userexternal/igonzale/EXES/TEST_OZONE_2/run
 - Create modelio namelist input files
 CCSM BUILDNML SCRIPT HAS FINISHED SUCCESSFULLY
-------------------------------------------------------------------------
 CCSM PRESTAGE SCRIPT STARTING
 - CCSM input data directory, DIN_LOC_ROOT_CSMDATA, is /gpfs/scratch/userexternal/igonzale/CESM_FORCING/Inputdata
 - Case input data directory, DIN_LOC_ROOT, is /gpfs/scratch/userexternal/igonzale/CESM_FORCING/Inputdata
 - Checking the existence of input datasets in DIN_LOC_ROOT
 CCSM PRESTAGE SCRIPT HAS FINISHED SUCCESSFULLY
Mon Jun 17 13:34:02 CEST 2013 -- CSM EXECUTION BEGINS HERE
Mon Jun 17 13:39:25 CEST 2013 -- CSM EXECUTION HAS FINISHED
Model did not complete - see /gpfs/scratch/userexternal/igonzale/EXES/TEST_OZONE_2/run/cpl.log.130617-133400

When I check the run directory, the following files are empty: cpl.log, lnd.log, atm.log. The ccsm.log file contains the following:
 column cbalance error =  -0.312500000000000000 7570
 BalanceCheck: soil balance error nstep =       560 point =   452 imbalance =   -0.000003 W/m2
 begcb       =  5972.42725145265285
 endcb       =  -49139579381009344.0
 delta store =  -49139579381015320.0
 input mass  =  -49210455589249256.0
 output mass =  -70876208233942.0938
 net flux    =  -49139579381015320.0
 nee         =  49139579381015320.0
 gpp         =  -49210455589249256.0
 er          =  -70876208233942.0938
 col_fire_closs         =  0.000000000000000000E+00
 col_hrv_xsmrpool_to_atm =  0.000000000000000000E+00
 dwt_closs         =  0.000000000000000000E+00
 product_closs         =  0.596358312833793116E-03
 ENDRUN: called without a message string
Abort(1) on node 141 (rank 141 in comm -2080374782): application called MPI_Abort(comm=0x84000002, 1) - process 141
2013-06-17 15:42:53.280 (WARN ) [0x40001058b00] :317776:ibm.runjob.client.Job: terminated by signal 6
2013-06-17 15:42:53.281 (WARN ) [0x40001058b00] :317776:ibm.runjob.client.Job: abnormal termination by signal 6 from rank 141

 Please, could anybody help? Thank you very much in advance, Iratxe  
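
For readers comparing their own logs: a minimal Python sketch that re-does the arithmetic on the numbers printed in the ccsm.log excerpt in this post. It assumes, based only on the printed labels, that "net flux" is input mass minus output mass and that "delta store" is endcb minus begcb; the threshold used to flag gpp is an arbitrary illustrative value, not something taken from the model.

```python
# Values copied from the ccsm.log excerpt in this post.
begcb = 5972.42725145265285
endcb = -49139579381009344.0
input_mass = -49210455589249256.0    # the log prints the same value for "gpp"
output_mass = -70876208233942.0938   # the log prints the same value for "er"

# Assumption from the printed labels: storage change and net flux are
# simple differences of the values above.
delta_store = endcb - begcb
net_flux = input_mass - output_mass

# At magnitudes near 5e16 a double cannot resolve fractions, so only the
# relative agreement between the two quantities is meaningful here.
relative_gap = abs(net_flux - delta_store) / abs(delta_store)

# Gross primary production should never be negative; the cutoff of 1e6 is
# a hypothetical sanity limit chosen only for this illustration.
gpp_is_suspect = input_mass < 0 and abs(input_mass) > 1e6

print(f"delta store  = {delta_store:.1f}")
print(f"net flux     = {net_flux:.1f}")
print(f"relative gap = {relative_gap:.2e}")
print("gpp looks corrupted:", gpp_is_suspect)
```

Read this way, the bookkeeping itself is nearly consistent, but the magnitudes (a hugely negative gpp) suggest that a corrupted or uninitialized value entered the carbon state before the check ran.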
 

slevis

Moderator
Staff member
I suspect that you have made code changes that result in this carbon balance error. You will have to revisit your code changes and debug. Sam Levis
 
slevis, thank you for your answer. However, I tried to run the model without any modifications and the problem is still there. I changed:
> ./xmlchange -file env_build.xml -id DEBUG -val TRUE

and now the message in the ccsm log file is different:

(seq_frac_check) [atm init] sum ncnt/maxerr = 0 0.00000000000000000

Signal received: SIGFPE - Floating-point exception
Signal generated for floating-point exception:
FP invalid operation

Instruction that generated the exception:

fmadd fr01,fr00,fr06,fr06
Signal received: SIGFPE - Floating-point exception
Source Operand values:

fr00 = -1.52584665028013e-02
Signal generated for floating-point exception:
fr06 = nan
Signal received: SIGFPE - Floating-point exception
fr06 = nan
FP invalid operation

Signal generated for floating-point exception:
Traceback:

FP invalid operation

Instruction that generated the exception:
Signal received: SIGFPE - Floating-point exception

Signal generated for floating-point exception:
fmadd fr01,fr00,fr06,fr06
FP invalid operation
Instruction that generated the exception:

Source Operand values:
Instruction that generated the exception:
fmadd fr01,fr00,fr06,fr06
Location 0x0000000001919c10
fr00 = 3.46547550751238e-02
fmadd fr01,fr00,fr06,fr06
Source Operand values:
Source Operand values:
fr06 = -nan
fr00 = -2.52377928500718e-03
fr00 = -2.44399455932834e-02
fr06 = nan
fr06 = nan
fr06 = -nan

fr06 = -nan
Traceback:

open: No such file or directory
fr06 = -nan
Location 0x0000000001919c10
Traceback:

Traceback:
open: No such file or directory
Location 0x0000000001919c10
open: No such file or directory
Location 0x0000000001919c10
Offset 0x0000140c in procedure __dustmod_NMOD_dustdrydep, near line 522 in file DUSTMod.F90
Offset 0x00000a48 in procedure __clm_driver_NMOD_clm_drv$$OL$$5, near line 533 in file clm_driver.F90
Location 0x00000000021fb040
Location 0x00000000022128f0
Location 0x000000000210159c
--- End of call chain ---

Offset 0x0000140c in procedure __dustmod_NMOD_dustdrydep, near line 522 in file DUSTMod.F90
Offset 0x00000a48 in procedure __clm_driver_NMOD_clm_drv$$OL$$5, near line 533 in file clm_driver.F90
Location 0x00000000021fb040
Location 0x00000000021fe504
Offset 0x00000910 in procedure __clm_driver_NMOD_clm_drv, near line 410 in file clm_driver.F90
Offset 0x00001674 in procedure __lnd_comp_mct_NMOD_lnd_run_mct, near line 693 in file lnd_comp_mct.F90
2013-06-25 11:57:28.489 (WARN ) [0x40001058b00] :329786:ibm.runjob.client.Job: terminated by signal 8
2013-06-25 11:57:28.489 (WARN ) [0x40001058b00] :329786:ibm.runjob.client.Job: abnormal termination by signal 8 from rank 32

Any idea what is happening?
Thank you again ...
Iratxe
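
Because several MPI ranks write to ccsm.log at the same time, the traceback in this post is interleaved and hard to read. A minimal Python sketch for pulling out the distinct stack frames is given here; the regular expression is written only against the "Offset ... in procedure ..., near line N in file F" lines quoted in this post, and the function name `unique_frames` is a hypothetical helper, not part of CESM.

```python
import re
from collections import Counter

# Matches frame lines of the form seen in this thread's ccsm.log.
FRAME = re.compile(
    r"Offset\s+0x[0-9a-fA-F]+\s+in procedure\s+(\S+),"
    r"\s+near line\s+(\d+)\s+in file\s+(\S+)"
)

def unique_frames(log_text):
    """Return [((procedure, line, file), count), ...] in first-seen order."""
    counts = Counter()
    for match in FRAME.finditer(log_text):
        procedure, line, filename = match.groups()
        counts[(procedure, int(line), filename)] += 1
    return list(counts.items())

# Frame lines copied from the log quoted in this post.
SAMPLE = """\
Offset 0x0000140c in procedure __dustmod_NMOD_dustdrydep, near line 522 in file DUSTMod.F90
Offset 0x00000a48 in procedure __clm_driver_NMOD_clm_drv$$OL$$5, near line 533 in file clm_driver.F90
Offset 0x0000140c in procedure __dustmod_NMOD_dustdrydep, near line 522 in file DUSTMod.F90
Offset 0x00000a48 in procedure __clm_driver_NMOD_clm_drv$$OL$$5, near line 533 in file clm_driver.F90
Offset 0x00000910 in procedure __clm_driver_NMOD_clm_drv, near line 410 in file clm_driver.F90
Offset 0x00001674 in procedure __lnd_comp_mct_NMOD_lnd_run_mct, near line 693 in file lnd_comp_mct.F90
"""

frames = unique_frames(SAMPLE)
for (procedure, line, filename), count in frames:
    print(f"{filename}:{line}  {procedure}  (seen {count}x)")
```

On the quoted log, the innermost frame is the dust dry deposition routine near line 522 of DUSTMod.F90, which is where the NaN operand in `fr06` was used.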
 

slevis

Moderator
Staff member
If you're using the model with no changes and failing, then I wonder whether you're trying to run an unsupported configuration or on an unsupported platform. Sam Levis
 

jedwards

CSEG and Liaisons
Staff member
After switching to debug mode, did you run *clean_build and *.build? If not, you should try that. Also, if you are using threads, you should try turning them off.
 
Hi, I'm getting a similar error. I was following the Quick User Guide given in the CLM documentation and was using the same compset used there, i.e. I1850CRUCLM45BGCI for 1 degree resolution.
On submitting the job I run into the following problem:
ERROR: (shr_stream_getCalendar)  ERROR: nf90_open file USERDEFINED_optional_build/atm_forcing.datm7.cruncep_qianFill.0.5d.V4.c130305/Solar6Hrly/clmforc.cruncep.V4.c2011.0.5d.Solr.1901-01.nc
(shr_sys_abort) WARNING: calling shr_mpi_abort() and stopping
If you are able to solve yours and share the solution with me, it might help me restore my run.
Thanks
 

jedwards

CSEG and Liaisons
Staff member
> USERDEFINED_optional_build
This indicates that you haven't defined an input data path in your env_run.xml.
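
A quick way to spot such placeholders before submitting a job is to scan the case XML for values that still begin with USERDEFINED. The Python sketch here assumes the file stores settings as `<entry id="..." value="..." />` elements; that layout, the sample text, and the helper name `undefined_entries` are assumptions for illustration, so check them against your own env_run.xml.

```python
import re

# Assumed layout: <entry id="NAME" value="VALUE" ... />
ENTRY = re.compile(r'<entry\s+id="([^"]+)"\s+value="([^"]*)"')

def undefined_entries(xml_text):
    """Return (id, value) pairs whose value is still a USERDEFINED placeholder."""
    return [
        (name, value)
        for name, value in ENTRY.findall(xml_text)
        if value.startswith("USERDEFINED")
    ]

# Hypothetical sample; in practice read the text of env_run.xml instead.
SAMPLE = """
<entry id="DIN_LOC_ROOT" value="USERDEFINED_optional_build" />
<entry id="DIN_LOC_ROOT_CSMDATA" value="/gpfs/scratch/userexternal/igonzale/CESM_FORCING/Inputdata" />
"""

missing = undefined_entries(SAMPLE)
for name, value in missing:
    print(f"{name} is not set (value: {value})")
```

Any entry reported this way needs a real path, set for example with xmlchange in the same form used earlier in this thread.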
 
Dear all,
I am running CLM 4.5 and I have a similar problem: there is an error after 175 timesteps of the run. The error in lnd.log:
------------------------------------------------------------------------------------------------------------------------
 BalanceCheck: soil balance error nstep =       175 point =   853 imbalance =************ W/m2
 nstep =          175  indexc=          853  errsoi_col= -2.866175917345276E+018
 lwrad :   0.000000000000000E+000
 solar :   0.000000000000000E+000
 t     :    253.595138549805
 wind  :    7.03126025199890
 q     :   4.701063880929723E-004
 pbot  :    58765.9238281250
 rain  :   0.000000000000000E+000
 snow  :   0.000000000000000E+000
 pgridcell :          146
 clm model is stopping
 ENDRUN: called without a message string
----------------------------------------------------------------------------------------------------------------------------
I didn't change any model code except BalanceCheck.F90, which I modified to print the variables shown above. The longwave and shortwave radiation are zero, but I checked the atmospheric forcing input data and the original data, and I'm sure they are not zero. I don't know what happened or how to deal with the problem.
PS: I ran CLM 3 times with different data and got the same error every time.
Please, could anybody help?
Thank you very much in advance,
Sincerely, Luna
 