ESMF: Resolution mismatch between anomaly forcing and simulation gives unhelpful error

samrabin

Sam Rabin
Member
I was trying to run a 10x15 (f10_f10_mg37) case with the rcp45 anomaly forcings, and got the following error:

Code:
dec1134.hsn.de.hpc.ucar.edu 48:  ESMF_Finalize: Error closing trace stream
dec1134.hsn.de.hpc.ucar.edu 48: MPICH ERROR [Rank 48] [job id b7eaa9b1-7f62-4ada-accc-bc63cda70931] [Fri May 10 17:52:33 2024] [dec1134] - Abort(1) (rank 48 in comm 496): application called MPI_Abort(comm=0x84000001, 1) - process 48
dec1134.hsn.de.hpc.ucar.edu 48:
dec1134.hsn.de.hpc.ucar.edu 48: forrtl: severe (174): SIGSEGV, segmentation fault occurred
dec1134.hsn.de.hpc.ucar.edu 48: Image              PC                Routine            Line        Source
dec1134.hsn.de.hpc.ucar.edu 48: libpthread-2.31.s  0000152B82C698C0  Unknown               Unknown  Unknown
dec1134.hsn.de.hpc.ucar.edu 48: libmpi_intel.so.1  0000152B80C28E7E  Unknown               Unknown  Unknown
dec1134.hsn.de.hpc.ucar.edu 48: libmpi_intel.so.1  0000152B80A3722F  Unknown               Unknown  Unknown
dec1134.hsn.de.hpc.ucar.edu 48: libmpi_intel.so.1  0000152B7F0646A8  MPI_Abort             Unknown  Unknown
dec1134.hsn.de.hpc.ucar.edu 48: libesmf.so         0000152B8BD7EE82  abort                     863  ESMCI_VMKernel.C
dec1134.hsn.de.hpc.ucar.edu 48: libesmf.so         0000152B8BD78D03  abort                    3634  ESMCI_VM.C
dec1134.hsn.de.hpc.ucar.edu 48: libesmf.so         0000152B8BDA3431  c_esmc_vmabort_          1252  ESMCI_VM_F.C
dec1134.hsn.de.hpc.ucar.edu 48: libesmf.so         0000152B8D31ED87  esmf_vmmod_mp_esm        9521  ESMF_VM.F90
dec1134.hsn.de.hpc.ucar.edu 48: libesmf.so         0000152B8CE21A58  esmf_initmod_mp_e        1684  ESMF_Init.F90
dec1134.hsn.de.hpc.ucar.edu 48: cesm.exe           00000000004493A3  MAIN__                    132  esmApp.F90
dec1134.hsn.de.hpc.ucar.edu 48: cesm.exe           0000000000421A3D  Unknown               Unknown  Unknown
dec1134.hsn.de.hpc.ucar.edu 48: libc-2.31.so       0000152B7E55B29D  __libc_start_main     Unknown  Unknown
dec1134.hsn.de.hpc.ucar.edu 48: cesm.exe           000000000042196A  Unknown               Unknown  Unknown

It turns out the forcings are at "1-degree" (f09_g17) resolution. If I run at that instead, it works.

So two things:
  1. Would it be possible to add a more helpful error message in this case?
  2. Is there some kind of namelist setting I can add to make this work?
Thanks!
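To illustrate item 1, here is a rough sketch of the kind of up-front check that could produce a clearer message. It is plain Python, not the Fortran/ESMF code CESM actually uses. The function name `check_forcing_grid`, its arguments, and the (nlat, nlon) shapes in the usage example are all hypothetical. It also assumes a grid mismatch really is unsupported for this stream, which this thread does not confirm.

```python
def check_forcing_grid(forcing_shape, model_shape, stream_name="anomaly forcing"):
    """Hypothetical pre-flight check: fail early with a readable message
    when the forcing grid and the model grid have different shapes."""
    forcing_shape = tuple(forcing_shape)
    model_shape = tuple(model_shape)
    if forcing_shape != model_shape:
        raise ValueError(
            f"{stream_name}: forcing grid (nlat, nlon) = {forcing_shape} "
            f"does not match model grid {model_shape}. "
            "Regrid the forcing files or run at the forcing resolution."
        )
    return True


# Illustrative shapes only: a nominal 1-degree grid vs. a coarse 10x15 grid.
try:
    check_forcing_grid((192, 288), (19, 24))
except ValueError as err:
    print(err)
```

Something along these lines would name the offending stream and both grids up front, rather than leaving only an MPI_Abort traceback.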
 

samrabin

Sam Rabin
Member
Well, I guess I spoke too soon about it working once I switched to f09. The run failed with a CLM error at that point, so I assumed the ESMF problem was gone. After fixing the CLM error, I get the following:

Code:
dec0609.hsn.de.hpc.ucar.edu 106:  ESMF_Finalize: Error closing trace stream
dec0609.hsn.de.hpc.ucar.edu 106: MPICH ERROR [Rank 106] [job id 4e2e913a-0f06-4271-9027-b4f368076b92] [Mon May 13 11:28:37 2024] [dec0609] - Abort(1) (rank 106 in comm 496): application called MPI_Abort(comm=0x84000001, 1) - process 106
dec0609.hsn.de.hpc.ucar.edu 106:

There's no traceback this time, even though I have debug on.

Case on Derecho at /glade/derecho/scratch/samrabin/tests_0513-112341de/SMS_D_Ld5.f09_g17.ISSP245Clm50BgcCrop.derecho_intel.clm-datm_rcp45_anom_forc.G.0513-112341de in case anybody can have a look.
 

dbailey

CSEG and Liaisons
Staff member
This might be good to bring up at a CSEG meeting. Ideally, we try to keep all of the forcings on a 1x1 degree grid so they can be interpolated by the data models. I wonder if this is a mask issue. You are using mg37, which is the POP gx3v7 grid. What if you change to mg17? I think the online-generated mapping files also generate the mask?
 