Large restart file output error in a high-resolution configuration of the CAM5-EUL dynamical core

I configured the CAM5-EUL dynamical core at an experimental high resolution, 400 (lat) x 800 (lon) x 60 (lev), for a research project. The model runs fine, but when it writes the CAM restart file (e.g., xxxx.cam2.r.xxxx) it hangs indefinitely, until I kill it. The generated restart file is about 17 GB.

The problem appears to be associated with the size of a single field. In restart_dynamics.F90, Q and Q_fcst seem to be too large to write: the model hangs while writing these two variables. I split each of them along the pcnst dimension into 5 smaller variables (Q1 to Q5 and Q1_fcst to Q5_fcst), and the model then gets past this part of the code. However, it still hangs in restart_physics.F90. If I comment out call pbuf_write_restart(File) in restart_physics.F90, the model writes a restart file successfully; otherwise it hangs while writing the physics buffer variables.

I also tested a 400 (lat) x 800 (lon) x 30 (lev) configuration, and all restart files are generated successfully. Any suggestions would be highly appreciated!
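For a sense of scale, here is a rough back-of-the-envelope sketch of how large one restart field is on this grid, and how many pieces a split along the constituent dimension would need to stay under a size ceiling. The grid dimensions come from the post; the constituent count PCNST, the 8-byte value size, and the 2 GiB ceiling are assumptions chosen only for illustration (2 GiB is simply where a signed 32-bit byte count tops out), not values confirmed in this thread.

```python
# Rough size estimate for one restart field on the 400 x 800 x 60 grid.
# PCNST, BYTES_PER_VALUE and LIMIT_BYTES are illustrative assumptions.

NLAT, NLON, NLEV = 400, 800, 60   # grid dimensions from the post
BYTES_PER_VALUE = 8               # assumes double-precision output
PCNST = 25                        # hypothetical number of constituents
LIMIT_BYTES = 2 * 1024**3         # assumed per-variable ceiling (2 GiB)


def field_bytes(nlat, nlon, nlev, ncnst=1, bytes_per_value=BYTES_PER_VALUE):
    """Bytes needed to store one field holding ncnst constituents."""
    return nlat * nlon * nlev * ncnst * bytes_per_value


def pieces_by_constituent(ncnst, bytes_per_cnst, limit_bytes):
    """Number of variables needed if whole constituents are grouped so
    that each group stays at or under limit_bytes."""
    per_piece = limit_bytes // bytes_per_cnst
    if per_piece == 0:
        raise ValueError("a single constituent already exceeds the limit")
    return -(-ncnst // per_piece)  # ceiling division


if __name__ == "__main__":
    one = field_bytes(NLAT, NLON, NLEV)
    full = field_bytes(NLAT, NLON, NLEV, PCNST)
    print(f"one constituent: {one / 1e6:.1f} MB")
    print(f"all {PCNST} constituents: {full / 1e9:.2f} GB")
    print(f"pieces needed: {pieces_by_constituent(PCNST, one, LIMIT_BYTES)}")
```

Under these assumed numbers a single constituent takes 153.6 MB, so the size of the combined field is driven almost entirely by how many constituents are stored in one variable. Substituting the actual pcnst and whatever limit applies to the I/O layer in use gives the real figures.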

 
 

eaton

CSEG and Liaisons
The problem seems to be a memory limitation. Since the 30-level grid works, the 60-level grid should work if you double the number of nodes used to run the simulation. Is that possible?
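The scaling argument can be checked with a small sketch. It assumes a 3-D field is spread evenly across nodes and ignores any extra buffering on the tasks that perform the I/O; NNODES is a hypothetical node count, not one taken from this thread.

```python
# Per-node share of one 3-D field, assuming an even split across nodes.
# NNODES is a hypothetical value used only for illustration.

NNODES = 16


def per_node_bytes(nlat, nlon, nlev, nnodes, bytes_per_value=8):
    """Bytes of one 3-D field held on each node under an even split."""
    return nlat * nlon * nlev * bytes_per_value / nnodes


lev30 = per_node_bytes(400, 800, 30, NNODES)             # configuration that works
lev60 = per_node_bytes(400, 800, 60, NNODES)             # same nodes, twice the levels
lev60_double = per_node_bytes(400, 800, 60, 2 * NNODES)  # twice the nodes

print(lev60 / lev30)         # ratio of 60-level to 30-level footprint
print(lev60_double / lev30)  # ratio after doubling the node count
```

Doubling the levels doubles the per-node footprint, and doubling the node count brings it back to the 30-level value, which is the reasoning behind the suggestion. If the hang persists after doubling the nodes, something other than the evenly distributed memory is likely involved.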
 


jedwards

CSEG and Liaisons
Staff member
I think that the pbuf may need to be broken up in the same way that the Q field was. You might also try using a netcdf4/hdf5 output file format if that is available to you: build and link the model against a netcdf4 library built with parallel hdf5 support, and set PIO_TYPENAME = 'netcdf4p' in the env_run.xml file.
 
