random pio error

Dear all,

Running the CMCC coupled model (based on CAM 5.3, CLM4.5, NEMO-3.4, CICE-4.0, RTM, CPL-7) in a present-climate simulation, I get the following random error when writing restart files:

"pio_support::pio_die:: myrank= -1 : ERROR: nf_mod.F90: 1456 : Attribute value is inconsistent among processes."

This error appears randomly: rerunning the same executable from the same initial conditions works. It happens every 10-20 jobs (1 job = 1 month of simulation), with no preference for the time of year. Remarkably, with the previous version of the model (the same as the one above, but with CAM5.2 and CLM4.0) I had no problems in a 100-year run.

Thank you in advance for any feedback.

Enrico
 

jedwards

CSEG and Liaisons
Staff member
This issue is due to writing an attribute with pnetcdf that is not consistent across I/O tasks. In older versions of CICE and POP, an attribute holding the time of file creation was written; if the timestamp differed between tasks, this caused an error. The solution is to find the offending variable, get the timestamp on one node, and broadcast it to all the nodes before calling the put_attribute function.
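As a rough illustration of the fix described above, here is a toy Python sketch. It does not use MPI or PIO. The "tasks" are just list entries, and the names attribute_is_consistent and broadcast_from_root are hypothetical helpers made up for this example. In the real model the broadcast would be an MPI broadcast from one task, done before the attribute is written.

```python
from datetime import datetime, timedelta


def attribute_is_consistent(values):
    """Return True when every task would write the same attribute value."""
    return len(set(values)) == 1


def broadcast_from_root(values, root=0):
    """Mimic a broadcast: every task ends up with the root task's value."""
    return [values[root]] * len(values)


# Hypothetical situation: four I/O tasks each read their own clock, and one
# of them happens to read it just after the second ticks over.
base = datetime(2000, 1, 1, 12, 0, 0)
local_stamps = [
    (base + timedelta(seconds=offset)).strftime("%Y-%m-%d %H:%M:%S")
    for offset in (0, 0, 1, 0)
]

# Each task writing its own timestamp gives inconsistent attribute values.
print(attribute_is_consistent(local_stamps))

# Taking the timestamp from one task and sharing it makes them agree.
shared_stamps = broadcast_from_root(local_stamps)
print(attribute_is_consistent(shared_stamps))
```

If the inconsistent attribute really is a creation timestamp, this would also fit the random behaviour: the values only differ when the tasks happen to read the clock on either side of a tick.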
 
Thanks for replying. I also got this error message when running the coupled simulation with the 0.1 degree ocean:

"pio_support::pio_die:: myrank= -1 : ERROR: nf_mod.F90: 1456 : Overflow when type cast to 4-byte integer."

Any suggestions on the possible causes of this error? Thanks in advance for the input.
 

jedwards

CSEG and Liaisons
Staff member
This error is due to trying to express something as a 4-byte integer that doesn't fit into that format.   
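To make the limit concrete, here is a small Python sketch of the range check. The grid dimensions and the 8-byte element size are illustrative assumptions chosen only to show the arithmetic; they are not taken from the failing run, and fits_int32 is a hypothetical helper, not part of PIO or pnetcdf.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1  # 2147483647, the largest signed 4-byte integer


def fits_int32(value):
    """Return True if value can be stored in a signed 4-byte integer."""
    return INT32_MIN <= value <= INT32_MAX


# Assumed, illustrative sizes (not from the thread): a 3-D field on a
# high-resolution grid, stored as 8-byte reals.
nx, ny, nz = 3600, 2400, 62
n_elements = nx * ny * nz   # number of array elements
n_bytes = n_elements * 8    # size in bytes at 8 bytes per element

print(n_elements, fits_int32(n_elements))
print(n_bytes, fits_int32(n_bytes))
```

With these assumed sizes the element count still fits, but the byte count does not. So a size, offset, or count held in a default 4-byte integer is one place worth checking when only the high-resolution configuration fails.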
 
Hi Jim,

I'm helping Hui with this. Could you please point us in the right direction? I'm having a hard time figuring out exactly where the problem is. At -O2, too much is optimized out to follow exactly what's going on, and at -O0 it takes many hours to reach the error, which makes it hard to debug.

Also, do you happen to know in which version of CESM this error was fixed?

Thanks,
Ryan
 