Hi all,
I'm trying to create a high-resolution surface dataset (1 km over the contiguous US) for regional simulations, following the instructions in "Setting up (high-res sparse) regional-grid CTSM simulations" #1919. I have successfully created the 1 km masked mesh file (please see the attached CONUS_1km_mesh.png). I then used mksurfdata_esmf to create the surface dataset; the first code block below is the command I ran to generate the namelist.
I was able to generate the .namelist; however, I got MPI errors when generating the .nc file (I didn't make any changes to the default .namelist). The errors are shown in the second code block below.
Code:
./gen_mksurfdata_namelist --start-year 2005 --end-year 2005 --nocrop --model-mesh-nx 6464 --model-mesh-ny 2781 --model-mesh /glade/work/yifanc17/02_data/cesmdata/meshdata/CONUS_1kmx1km/lnd_mesh_CONUS_1km_c240729.nc --res CONUS_1km
Code:
MPICH ERROR [Rank 0] [job id e36be614-3f1c-4b4c-aedb-649bde738d9f] [Tue Jul 30 15:32:47 2024] [dec2401] - Abort(874109199) (rank 0 in comm 0): Fatal error in PMPI_Send: Other MPI error, error stack:
PMPI_Send(163)............: MPI_Send(buf=0x15174dcd8010, count=14803696, MPI_DOUBLE, dest=1, tag=17, comm=0xc4000011) failed
MPID_Send(499)............:
MPIDI_send_unsafe(58).....:
MPIDI_OFI_send_normal(372): OFI tagged senddata failed (ofi_send.h:372:MPIDI_OFI_send_normal:Bad address)
aborting job:
Fatal error in PMPI_Send: Other MPI error, error stack:
PMPI_Send(163)............: MPI_Send(buf=0x15174dcd8010, count=14803696, MPI_DOUBLE, dest=1, tag=17, comm=0xc4000011) failed
MPID_Send(499)............:
MPIDI_send_unsafe(58).....:
MPIDI_OFI_send_normal(372): OFI tagged senddata failed (ofi_send.h:372:MPIDI_OFI_send_normal:Bad address)
dec2401.hsn.de.hpc.ucar.edu: rank 0 exited with code 255
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
libpthread-2.31.s 000014839FD278C0 Unknown Unknown Unknown
libmpi_intel.so.1 000014839DC0B94A Unknown Unknown Unknown
libmpi_intel.so.1 000014839C9CD8C5 Unknown Unknown Unknown
libmpi_intel.so.1 000014839CA35FAF Unknown Unknown Unknown
libmpi_intel.so.1 000014839D796FA6 Unknown Unknown Unknown
libmpi_intel.so.1 000014839D675A83 Unknown Unknown Unknown
libmpi_intel.so.1 000014839BBEA79F Unknown Unknown Unknown
libmpi_intel.so.1 000014839BBEB315 PMPI_Alltoallw Unknown Unknown
libpioc.so 00001483A755445C pio_swapm Unknown Unknown
libpioc.so 00001483A7557D11 rearrange_io2comp Unknown Unknown
libpioc.so 00001483A757167E PIOc_read_darray Unknown Unknown
mksurfdata 0000000000FCCA76 Unknown Unknown Unknown
mksurfdata 0000000000EAE81C Unknown Unknown Unknown
mksurfdata 0000000000EADDEE Unknown Unknown Unknown
mksurfdata 00000000007D55E2 Unknown Unknown Unknown
mksurfdata 00000000007D1CA6 Unknown Unknown Unknown
mksurfdata 00000000004F5833 Unknown Unknown Unknown
mksurfdata 0000000000448D91 Unknown Unknown Unknown
mksurfdata 00000000004B711A Unknown Unknown Unknown
mksurfdata 00000000004292AD Unknown Unknown Unknown
libc-2.31.so 000014839AB8E29D __libc_start_main Unknown Unknown
mksurfdata 00000000004291DA Unknown Unknown Unknown
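For a sense of scale, the sizes implied by the numbers above can be worked out with a few lines of Python. This is only a rough sketch: it assumes MPI_DOUBLE is 8 bytes, the variable names are purely illustrative, and it restates numbers already shown in the command and the error message rather than identifying the cause of the failure.

```python
# Back-of-the-envelope sizes taken from the command and the MPI error above.
NX, NY = 6464, 2781        # --model-mesh-nx and --model-mesh-ny
SEND_COUNT = 14_803_696    # "count" in the failed MPI_Send
BYTES_PER_DOUBLE = 8       # assumption: MPI_DOUBLE is an 8-byte float


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return n_bytes / 2**20


grid_cells = NX * NY                            # cells in the 1 km target grid
field_bytes = grid_cells * BYTES_PER_DOUBLE     # one double field on that grid
message_bytes = SEND_COUNT * BYTES_PER_DOUBLE   # payload of the failed send

print(f"target grid cells:      {grid_cells:,}")
print(f"one double field:       {to_mib(field_bytes):.1f} MiB")
print(f"failed MPI_Send buffer: {to_mib(message_bytes):.1f} MiB")
```

Note that the count in the failed send does not equal the number of target grid cells, so the buffer may belong to one of the raw input datasets rather than to the 1 km grid itself; that is a guess, not something the log confirms.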
I have attached the .log and the batch job error files. I'm not sure whether the failure occurs because this resolution is too computationally intensive. Is creating a surface dataset at this resolution on Derecho feasible? If so, could anyone share their experience or a successful case with me?
Thanks a lot!