
Error when executing ./case.submit in CESM2.1.3, ERROR: Segmentation fault (core dumped)

Lei
New Member
Dear all,
I am trying to run a case with --res f19_g17 --compset BSSP585cmip6 on my machine. The case was created and built successfully, but when I execute ./case.submit, the job dies immediately. The cesm.log contains only one line: srun: error: c8n1: task 2: Segmentation fault (core dumped)

I don't know why a case with --res f09_g17 --compset BSSP126cmip6 runs successfully on my machine while this one fails. I would really appreciate it if anyone could help me with this. Thanks a lot!
 

Attachments

  • config_compiler.txt (1.9 KB)
  • config_machine.txt (3.7 KB)
  • cesm.log.29898.230925-144820.txt (60 bytes)

jedwards

CSEG and Liaisons
Staff member
I don't know either - but it's dying before the model starts. Can you compare
./preview_run for the two cases and see if anything is apparent there?
Also is it possible that the f19_g17 run just landed on a bad node? (c8n1 might be suspect)
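
One way to make that comparison is sketched below. It assumes the ./preview_run output of each case has first been saved to a text file; the helper name compare_preview_runs and the file names in the usage comments are placeholders, not part of CESM.

```shell
# Compare two saved ./preview_run outputs and report whether they differ.
# compare_preview_runs is a hypothetical helper, not a CESM/CIME tool.
compare_preview_runs() {
    diff -u "$1" "$2"
    case $? in
        0) echo "no differences" ;;
        1) echo "outputs differ" ;;
        *) echo "diff failed (missing file?)" >&2; return 2 ;;
    esac
}

# Usage (paths and file names are placeholders):
#   (cd /path/to/case1 && ./preview_run) > preview_run_case1.txt
#   (cd /path/to/case2 && ./preview_run) > preview_run_case2.txt
#   compare_preview_runs preview_run_case1.txt preview_run_case2.txt
```

Any difference in the launcher line, task counts, or environment settings would show up in the unified diff that is printed before the summary line.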
 

Lei
New Member
Thanks for your reply! The ./preview_run output is the same for both cases. When I submitted the job while avoiding the c8n1 node, a new error appeared in cesm.log: /share/home/intel/compilers_and_libraries_2020.0.139/linux/mpi/intel64/bin/mpirun: line 103: 155770 Segmentation fault (core dumped) mpiexec.hydra "$@" 0<&0. Even worse, the same error now occurs when I resubmit the case that previously ran successfully.
How can this be resolved? Thank you in advance!
 

Attachments

  • preview_run_case2.txt (2.6 KB)
  • preview_run_case1.txt (2.6 KB)
  • cesm.log.29950.230926-150437.txt (191 bytes)

jedwards

CSEG and Liaisons
Staff member
I'm sorry, but this is a system error and not a CESM error. You will need to discuss it with your local system administrator.
 

Lei
New Member
Thank you for your suggestion. As you said, this is not a model error. I found a solution on the Intel Community forum: export I_MPI_HYDRA_TOPOLIB=ipl.
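
For anyone hitting the same crash, a minimal sketch of applying the workaround follows. It assumes the variable is set in the shell from which the job is submitted and that the batch system forwards that environment to the job; on other setups it may need to go into the job script or the machine configuration instead.

```shell
# Workaround reported above: select the "ipl" topology library
# for the Intel MPI Hydra process launcher.
export I_MPI_HYDRA_TOPOLIB=ipl

# Confirm the variable is exported to child processes.
env | grep '^I_MPI_HYDRA_TOPOLIB='

# Then resubmit from the case directory:
#   ./case.submit
```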
 