No direct error in atm.log/lnd.log for single-point simulation, but cesm.log reports a dimension error

Status
Not open for further replies.

xiaoxiaokuishu

Ru Xu
Member
Hi, all

I am running a single-point simulation at lon 103.3, lat -1.83 with CLM5. The surfdata file is set as follows:
PCT_CROP=0, PCT_NATVEG=100, PCT_NAT_PFT[4]=100, so it is a pure (100%) tropical forest site.

When I do a cold start with accelerated spinup,

./xmlchange CLM_FORCE_COLDSTART=on,CLM_ACCELERATED_SPINUP=on

I get the following error in cesm.log, but I did not find any clue about the error in atm.log or lnd.log:
NetCDF: Invalid dimension ID or name
NetCDF: Invalid dimension ID or name
NetCDF: Invalid dimension ID or name

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0 0x15137a2dcd4f in ???
#1 0xb237a5 in ???
#2 0xaa3e29 in ???
#3 0x535c3e in ???
#4 0x53e8d4 in ???
#5 0x57f287 in ???
#6 0x4addc3 in ???
#7 0x49fad1 in ???
#8 0x422894 in ???
#9 0x413c03 in ???
#10 0x407be7 in ???
#11 0x15137a2c729c in ???
#12 0x407d79 in ???
at ../sysdeps/x86_64/start.S:120
#13 0xffffffffffffffff in ???
srun: error: nid006517: task 0: Segmentation fault (core dumped)
srun: launch/slurm: _step_signal: Terminating StepId=8557614.0

Can you help debug the error? I have attached the surfdata and log files for your reference.

Best
 


slevis

Moderator
Staff member
@xiaoxiaokuishu unfortunately you have to troubleshoot this yourself because it is a custom case rather than a supported case.

I will share some troubleshooting ideas:
1) I looked at the lnd and cesm log files briefly and the model appears to stop when reading the domain file. This may be a clue, although sometimes lacking an explicit error can lead us in the wrong direction. If you do not see a problem with your domain file, then try (2) or (3).
2) If you have a different single-point case that ran successfully, then I suggest comparing to it until you discover the difference that explains the problem.
3) If you do not, then I suggest setting up one of the supported single-point cases (examples may be found in /cime_config/testdefs/testlist_clm.xml, and more information is in the System Testing Guide) and using that for the comparison until you discover the problem.
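
The comparison suggested in (2) and (3) can be partly automated. The following is only a minimal sketch, assuming the netCDF4 Python package is available; the helper names `read_dims` and `diff_dims` and the toy dictionaries are hypothetical and not part of CLM or CIME. It lists the variables whose dimensions differ between a file from a working case and a file from the failing case.

```python
def read_dims(path):
    """Map each variable name in a NetCDF file to its tuple of dimension names.

    Assumes the third-party netCDF4 package is installed; it is imported
    here so that diff_dims below can be used without it.
    """
    from netCDF4 import Dataset
    with Dataset(path) as ds:
        return {name: var.dimensions for name, var in ds.variables.items()}


def diff_dims(reference, candidate):
    """Return {name: (reference_dims, candidate_dims)} for every variable
    in `reference` whose dimensions differ in `candidate`.

    candidate_dims is None when the variable is missing from `candidate`.
    """
    mismatches = {}
    for name, ref_dims in reference.items():
        cand_dims = candidate.get(name)
        ref_t = tuple(ref_dims)
        cand_t = None if cand_dims is None else tuple(cand_dims)
        if cand_t != ref_t:
            mismatches[name] = (ref_t, cand_t)
    return mismatches


if __name__ == "__main__":
    # Toy data for illustration only; real input would come from
    # read_dims("<working file>") and read_dims("<failing file>").
    working = {"PCT_NATVEG": ("lsmlat", "lsmlon"), "PCT_CROP": ("lsmlat", "lsmlon")}
    failing = {"PCT_NATVEG": (), "PCT_CROP": ("lsmlat", "lsmlon")}
    print(diff_dims(working, failing))
```

The dimension order in the toy data is illustrative; what matters is that both files are read the same way, so that any difference reported comes from the files themselves.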

You may also think of other troubleshooting approaches. I want to acknowledge that troubleshooting is difficult and becomes easier with experience.
 

xiaoxiaokuishu

Ru Xu
Member
Hi, Slevis,

Thanks so much for the reply. It turned out to be a really silly problem: the variable PCT_NATVEG should be declared as
PCT_NATVEG(lsmlon,lsmlat) in the surfdata file,
but mine had only PCT_NATVEG, with no dimensions.
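
For anyone who hits the same error, a check like the one below might catch it before the run. It is only a sketch, assuming the netCDF4 Python package is installed; the function names `missing_dims` and `check_surfdata` are made up for illustration, and the list of variables to check is whatever you pass in.

```python
def missing_dims(var_dims, required=("lsmlon", "lsmlat")):
    """Return the required dimension names that a variable does not carry.

    The check ignores dimension order, so it works whichever way the
    tool you use lists them.
    """
    return [d for d in required if d not in var_dims]


def check_surfdata(path, names=("PCT_NATVEG",)):
    """Report, for each named variable in the file, which required
    dimensions are absent. Variables missing from the file map to None.

    Assumes the third-party netCDF4 package; imported here so that
    missing_dims can be used on its own.
    """
    from netCDF4 import Dataset
    report = {}
    with Dataset(path) as ds:
        for name in names:
            if name not in ds.variables:
                report[name] = None
            else:
                report[name] = missing_dims(ds.variables[name].dimensions)
    return report


if __name__ == "__main__":
    # A variable written without dimensions, as in my broken file:
    print(missing_dims(()))
    # A variable carrying both dimensions:
    print(missing_dims(("lsmlat", "lsmlon")))
```

An empty list means the variable has both dimensions; anything else names what is missing.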


Best
Ru
 

slevis

Moderator
Staff member
I'm glad you solved it, and I hope that my suggestions give you ideas for how to troubleshoot next time that you encounter a problem.
 