
ESMF: Resolution mismatch between anomaly forcing and simulation gives unhelpful error

samrabin

Sam Rabin
Member
I was trying to run a 10x15 (f10_f10_mg37) case with the rcp45 anomaly forcings, and got the following error:

Code:
dec1134.hsn.de.hpc.ucar.edu 48:  ESMF_Finalize: Error closing trace stream
dec1134.hsn.de.hpc.ucar.edu 48: MPICH ERROR [Rank 48] [job id b7eaa9b1-7f62-4ada-accc-bc63cda70931] [Fri May 10 17:52:33 2024] [dec1134] - Abort(1) (rank 48 in comm 496): application called MPI_Abort(comm=0x84000001, 1) - process 48
dec1134.hsn.de.hpc.ucar.edu 48:
dec1134.hsn.de.hpc.ucar.edu 48: forrtl: severe (174): SIGSEGV, segmentation fault occurred
dec1134.hsn.de.hpc.ucar.edu 48: Image              PC                Routine            Line        Source
dec1134.hsn.de.hpc.ucar.edu 48: libpthread-2.31.s  0000152B82C698C0  Unknown               Unknown  Unknown
dec1134.hsn.de.hpc.ucar.edu 48: libmpi_intel.so.1  0000152B80C28E7E  Unknown               Unknown  Unknown
dec1134.hsn.de.hpc.ucar.edu 48: libmpi_intel.so.1  0000152B80A3722F  Unknown               Unknown  Unknown
dec1134.hsn.de.hpc.ucar.edu 48: libmpi_intel.so.1  0000152B7F0646A8  MPI_Abort             Unknown  Unknown
dec1134.hsn.de.hpc.ucar.edu 48: libesmf.so         0000152B8BD7EE82  abort                     863  ESMCI_VMKernel.C
dec1134.hsn.de.hpc.ucar.edu 48: libesmf.so         0000152B8BD78D03  abort                    3634  ESMCI_VM.C
dec1134.hsn.de.hpc.ucar.edu 48: libesmf.so         0000152B8BDA3431  c_esmc_vmabort_          1252  ESMCI_VM_F.C
dec1134.hsn.de.hpc.ucar.edu 48: libesmf.so         0000152B8D31ED87  esmf_vmmod_mp_esm        9521  ESMF_VM.F90
dec1134.hsn.de.hpc.ucar.edu 48: libesmf.so         0000152B8CE21A58  esmf_initmod_mp_e        1684  ESMF_Init.F90
dec1134.hsn.de.hpc.ucar.edu 48: cesm.exe           00000000004493A3  MAIN__                    132  esmApp.F90
dec1134.hsn.de.hpc.ucar.edu 48: cesm.exe           0000000000421A3D  Unknown               Unknown  Unknown
dec1134.hsn.de.hpc.ucar.edu 48: libc-2.31.so       0000152B7E55B29D  __libc_start_main     Unknown  Unknown
dec1134.hsn.de.hpc.ucar.edu 48: cesm.exe           000000000042196A  Unknown               Unknown  Unknown

It turns out the forcings are at "1-degree" (f09_g17) resolution. If I run at that instead, it works.

So two things:
  1. Would it be possible to add a more helpful error message in this case?
  2. Is there some kind of namelist setting I can add to make this work?
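For illustration only, the kind of check asked for in item 1 might look like the sketch below. It is not CTSM, CDEPS, or ESMF code: the function name `check_grid_match` and the (nlat, nlon) shapes in the usage note are hypothetical, and it assumes a plain shape comparison, whereas a real data model may regrid the forcing rather than reject it.

```python
def check_grid_match(forcing_shape, model_shape, forcing_name="anomaly forcing"):
    """Raise a descriptive error if two (nlat, nlon) grid shapes differ.

    Hypothetical helper for illustration; not part of any CESM component.
    """
    forcing_shape = tuple(forcing_shape)
    model_shape = tuple(model_shape)
    if forcing_shape != model_shape:
        # Name both grids so the user can see the mismatch at a glance.
        raise ValueError(
            f"{forcing_name} grid {forcing_shape} does not match "
            f"model grid {model_shape}; regrid the forcing or run at "
            f"the forcing resolution"
        )
    return True
```

Calling `check_grid_match((192, 288), (19, 24))` with these assumed shapes would raise a `ValueError` naming both grids, instead of surfacing as a bare `MPI_Abort`.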
Thanks!
 

samrabin

Sam Rabin
Member
Well, I guess I spoke too soon about it working once I switched to f09. There was a CLM error, so I thought I was good as far as ESMF goes. It turns out that, after fixing that, I get the following:

Code:
dec0609.hsn.de.hpc.ucar.edu 106:  ESMF_Finalize: Error closing trace stream
dec0609.hsn.de.hpc.ucar.edu 106: MPICH ERROR [Rank 106] [job id 4e2e913a-0f06-4271-9027-b4f368076b92] [Mon May 13 11:28:37 2024] [dec0609] - Abort(1) (rank 106 in comm 496): application called MPI_Abort(comm=0x84000001, 1) - process 106
dec0609.hsn.de.hpc.ucar.edu 106:

There's no traceback this time, even though I have debug on.

The case is on Derecho at /glade/derecho/scratch/samrabin/tests_0513-112341de/SMS_D_Ld5.f09_g17.ISSP245Clm50BgcCrop.derecho_intel.clm-datm_rcp45_anom_forc.G.0513-112341de in case anybody can take a look.
 

dbailey

CSEG and Liaisons
Staff member
This might be good to bring up at a CSEG meeting. Ideally, we try to keep all of the forcings on a 1x1 degree grid so that the data models can interpolate them. I wonder if this is a mask issue. You are using mg37, which is the POP gx3v7 grid. What if you change to mg17? I think the mapping files generated online also generate the mask?
 