What version of the code are you using?
CESM2.2.2
Describe your problem or question:
I am running a regionally refined case using the MUSICA grid and configuration. The simulation ran successfully from 2024-01-01 to 2024-02-06, and all previous resubmissions completed without issues. However, after the 2024-02-06 outputs and restart files were archived and the run for 2024-02-07 started, I got the PIO errors below in cesm.log. There are no errors in any of the other component log files:
Code:
dec2257.hsn.de.hpc.ucar.edu 1797: Obtained 10 stack frames.
dec0800.hsn.de.hpc.ucar.edu 298: /var/run/palsd/87e4d01e-5d2d-4f93-9ea1-ce698c059f0a/files/cesm.exe() [0x41dcf3]
dec0805.hsn.de.hpc.ucar.edu 386: /var/run/palsd/87e4d01e-5d2d-4f93-9ea1-ce698c059f0a/files/cesm.exe() [0x41dcf3]
dec0818.hsn.de.hpc.ucar.edu 521: /var/run/palsd/87e4d01e-5d2d-4f93-9ea1-ce698c059f0a/files/cesm.exe() [0xf6065e]
dec1040.hsn.de.hpc.ucar.edu 642: /var/run/palsd/87e4d01e-5d2d-4f93-9ea1-ce698c059f0a/files/cesm.exe() [0xf6065e]
dec1085.hsn.de.hpc.ucar.edu 770: /var/run/palsd/87e4d01e-5d2d-4f93-9ea1-ce698c059f0a/files/cesm.exe() [0x43a4fe]
dec1123.hsn.de.hpc.ucar.edu 899: /var/run/palsd/87e4d01e-5d2d-4f93-9ea1-ce698c059f0a/files/cesm.exe() [0xf6982d]
dec1124.hsn.de.hpc.ucar.edu 1030: /glade/u/apps/derecho/23.09/spack/opt/spack/parallelio/2.6.2/cray-mpich/8.1.27/oneapi/2023.2.1/zyhu/lib/libpiof.so(piodarray_mp_write_darray_1d_double_+0xe6) [0x14c42702c846]
dec1159.hsn.de.hpc.ucar.edu 1162: /glade/u/apps/derecho/23.09/spack/opt/spack/parallelio/2.6.2/cray-mpich/8.1.27/oneapi/2023.2.1/zyhu/lib/libpiof.so(piodarray_mp_write_darray_1d_double_+0xe6) [0x15468858f846]
dec1500.hsn.de.hpc.ucar.edu 1281: /var/run/palsd/87e4d01e-5d2d-4f93-9ea1-ce698c059f0a/files/cesm.exe() [0x51dae4]
dec1516.hsn.de.hpc.ucar.edu 1416: /var/run/palsd/87e4d01e-5d2d-4f93-9ea1-ce698c059f0a/files/cesm.exe() [0x509b81]
dec1534.hsn.de.hpc.ucar.edu 1561: /var/run/palsd/87e4d01e-5d2d-4f93-9ea1-ce698c059f0a/files/cesm.exe() [0xf6982d]
dec2257.hsn.de.hpc.ucar.edu 1797: /glade/u/apps/derecho/23.09/spack/opt/spack/parallelio/2.6.2/cray-mpich/8.1.27/oneapi/2023.2.1/zyhu/lib/libpioc.so(pio_err+0x7a) [0x153652b7e10a]
dec0594.hsn.de.hpc.ucar.edu 142: Obtained 10 stack frames.
dec0800.hsn.de.hpc.ucar.edu 298: MPICH ERROR [Rank 298] [job id 87e4d01e-5d2d-4f93-9ea1-ce698c059f0a] [Mon Oct 13 15:38:14 2025] [dec0800] - Abort(-1) (rank 298 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, -1) - process 298
dec0805.hsn.de.hpc.ucar.edu 386: MPICH ERROR [Rank 386] [job id 87e4d01e-5d2d-4f93-9ea1-ce698c059f0a] [Mon Oct 13 15:38:14 2025] [dec0805] - Abort(-1) (rank 386 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, -1) - process 386
The cesm.log itself is not very informative (attached below), so I am not sure what went wrong, given that the previous one-month segments all completed successfully. Disk quota is not an issue, and the library path '/glade/u/apps/derecho/23.09/spack/opt/spack/parallelio/2.6.2/cray-mpich/8.1.27/oneapi/2023.2.1/zyhu/lib/libpioc.so' is also valid.
Any suggestions or insights on this PIO failure would be greatly appreciated!