zhangyi@lasg_iap_ac_cn
Member
On an IBM-flex machine, I am running CAM5 at a customized resolution of 400x800 grid points. The restart files it generates cannot be used to restart the model. Every time I try, CLM stops with a longwave energy imbalance error of about -390 W/m2. Stranger still, if I run the model at a low resolution with a 64x128 grid, the problem disappears.

-------------------------

Note that this problem never occurs on a Linux system with the Intel compiler, for either the 64x128 or the 400x800 grid (I have run it many times). So I moved the restart files generated on the IBM to the Linux/Intel system and tried to restart the model from them. Again, it failed.

I then tested mixed sets of Intel-generated and IBM-generated restart files to find out which IBM restart file is bad. It turned out to be the xxx.clm2.r.xxxx-xx file. Apart from this CLM restart file, all the other IBM-generated restart files can be used for a successful continuation run on Linux/Intel. Conversely, the Linux/Intel-generated restart files cannot restart the model on the IBM; the same error occurs.

So I conclude that there is a read/write error on the xxx.clm2.r.xxxx-xx file when running the model on the IBM-flex cluster.

I am a CAM user and not very familiar with CLM. I know this may not be purely a code problem and may also depend on the platform and compiler. I would appreciate any suggestions on this weird problem and on how to fix it. Thanks in advance.
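For anyone who wants to repeat the swap test, here is a minimal Python sketch of the procedure: start from a restart set known to work, replace exactly one file with its counterpart from the suspect set, and try a restart. The component labels, the file names, and the `try_restart` callback are all hypothetical placeholders; in practice `try_restart` would stage the files and launch a continuation run, which this sketch does not attempt.

```python
from typing import Callable, Dict, List


def find_bad_restart_files(
    good: Dict[str, str],
    suspect: Dict[str, str],
    try_restart: Callable[[Dict[str, str]], bool],
) -> List[str]:
    """Return the components whose suspect file breaks an otherwise good set."""
    bad = []
    for component in sorted(good):
        mixed = dict(good)                     # start from the known-good set
        mixed[component] = suspect[component]  # swap in exactly one suspect file
        if not try_restart(mixed):             # hypothetical: True if the run restarts
            bad.append(component)
    return bad


if __name__ == "__main__":
    # Hypothetical labels and paths, for illustration only.
    good = {"cam": "intel/cam.r", "clm2": "intel/clm2.r", "cpl": "intel/cpl.r"}
    suspect = {"cam": "ibm/cam.r", "clm2": "ibm/clm2.r", "cpl": "ibm/cpl.r"}

    # Stand-in for a real restart attempt: fails only for the IBM clm2 file.
    def fake_restart(files: Dict[str, str]) -> bool:
        return not files["clm2"].startswith("ibm/")

    print(find_bad_restart_files(good, suspect, fake_restart))
```

Swapping one file at a time keeps the number of trial runs equal to the number of restart files, at the cost of missing failures that only appear when two suspect files are combined.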