
CESM2.0.1 fails to finish FC2000climo and FC2010climo cases after producing all outputs; rpointer.atm is missing

Hi, I have been trying to get CESM2.0.1 running on our own machine. It works fine with I compsets, but there are problems when running F compsets (FC2000climo and FC2010climo, at resolution f19_f19). After the simulation, CESM writes its output normally, but it does not produce the proper pointer file for CAM (rpointer.atm) and then hangs indefinitely without a single error message. I tried stopping the process manually and forcing a CONTINUE_RUN, but it failed to start.

Has anyone encountered a similar problem? I would much appreciate any hints on tackling this issue.

I reran my simulations with the DEBUG option ON. Please check the attached logs for your reference.

Some additional info:

Our machine is using netcdf-4.6.1 and mpich-3.2.intel.
 

jedwards

CSEG and Liaisons
Staff member
If you have a parallel file system, I would recommend pnetcdf over netcdf. Otherwise, try different values of PIO_STRIDE.
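
A minimal sketch of how these settings might be applied from the case directory. It only builds the `./xmlchange` command lines rather than running them. `PIO_TYPENAME` and `PIO_STRIDE` are CIME case variables; the helper name `pio_xmlchange_commands` and the stride values in the loop are illustrative assumptions, not recommended settings.

```python
def pio_xmlchange_commands(use_pnetcdf, stride=None):
    """Return xmlchange command lines (as argv lists) for the PIO settings.

    Hypothetical helper: it only builds the commands. Run them from the
    case directory before resubmitting the case.
    """
    commands = []
    if use_pnetcdf:
        # Switch the I/O library from netcdf to parallel-netcdf (pnetcdf).
        commands.append(["./xmlchange", "PIO_TYPENAME=pnetcdf"])
    if stride is not None:
        if stride < 1:
            raise ValueError("stride must be a positive integer")
        # The stride is the spacing in MPI rank between I/O tasks.
        commands.append(["./xmlchange", f"PIO_STRIDE={stride}"])
    return commands


if __name__ == "__main__":
    # Illustrative stride values only; suitable values depend on the machine.
    for stride in (4, 8, 16):
        for argv in pio_xmlchange_commands(use_pnetcdf=False, stride=stride):
            print(" ".join(argv))
```

Trying one stride value per run makes it easier to tell which setting, if any, lets the case finish.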
 

Thanks, @jedwards. I tried again with parallel-netcdf-1.9.0, but the model still does not finish. I then tried using fewer CPUs on only 1 node (previous tests were on 2 nodes), and the model works just fine. It is probably a configuration problem related to our cluster and the F compset. I will investigate further later.
 

jedwards

CSEG and Liaisons
Staff member
If it works on one node but not on more than one, it's possible that the filesystem is not mounted on one of the nodes. Make sure that you can see the run directory and that it's writable from all of the compute tasks.
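
The check described above could be sketched roughly as follows, assuming Python is available on the compute nodes. The function name `check_run_dir` and the script name in the usage comment are hypothetical. The idea is to launch one copy per MPI task so that every node reports whether it can see and write to the run directory.

```python
import os
import socket
import tempfile


def check_run_dir(run_dir):
    """Report whether run_dir is visible and writable from this host."""
    result = {
        "host": socket.gethostname(),
        "exists": os.path.isdir(run_dir),
        "writable": False,
    }
    if result["exists"]:
        try:
            # Actually create (and auto-delete) a file: permission bits
            # alone can be misleading on network filesystems.
            with tempfile.NamedTemporaryFile(dir=run_dir, prefix="wtest_"):
                result["writable"] = True
        except OSError:
            pass
    return result


if __name__ == "__main__":
    import sys

    # Hypothetical usage, one copy per task across all nodes:
    #   mpiexec -n <ntasks> python check_run_dir.py /path/to/run
    target = sys.argv[1] if len(sys.argv) > 1 else os.getcwd()
    print(check_run_dir(target))
```

Any host that prints `'exists': False` or `'writable': False` would be a candidate for the mount problem.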
 

ohmpawat

ohmpawat chen
Member
Hi, do you know the difference between FC2000climo and FC2010climo? Also, as far as I know, the supported resolution for FC2000climo and FC2010climo is f09_f09_mg17. What should I do to use the f19_f19_mg17 resolution? Any tips would be appreciated! Thanks a lot!
 