Dear Scientists,
I am trying to run a regional SP spin-up case using CTSM-ctsm5.2.005. The build succeeded, but ./case.submit fails with the following error:
run command is mpirun -np 128 /home/user/cesm/scratch/shanxispinup5/bld/cesm.exe >> cesm.log.$LID 2>&1
Exception from case_run: ERROR: RUN FAIL: Command 'mpirun -np 128 /home/user/cesm/scratch/shanxispinup5/bld/cesm.exe >> cesm.log.$LID 2>&1 ' failed
See log file for details: /home/user/cesm/scratch/shanxispinup5/run/cesm.log.240717-102328
Submit job case.st_archive
Starting job script case.st_archive
st_archive starting
moving /home/user/cesm/scratch/shanxispinup5/run/cesm.log.240717-102328 to /home/user/cesm/archive/shanxispinup5/logs/cesm.log.240717-102328
moving /home/user/cesm/scratch/shanxispinup5/run/drv.log.240717-102328 to /home/user/cesm/archive/shanxispinup5/logs/drv.log.240717-102328
Cannot find a shanxispinup5.cpl*.r.*.nc file in directory /home/user/cesm/scratch/shanxispinup5/run
Archiving history files for datm (atm)
Archiving history files for clm (lnd)
Archiving history files for mosart (rof)
Archiving history files for drv (cpl)
Archiving history files for dart (esp)
st_archive completed
Submitted job case.run with id None
Submitted job case.st_archive with id None
The cesm.log file shows:
--------------------------------------------------------------------------
prterun detected that one or more processes exited with non-zero status,
thus causing the job to be terminated. The first process to do so was:
Process name: [prterun-H3C-AMD0-02-2244577@1,5] Exit code: 1
--------------------------------------------------------------------------
I also found a PET0.ESMF_LogFile in the cesm/scratch/shanxispinup5/run/ directory, which shows:
20240717 102330.411 ERROR PET0 esm.F90:948 Not valid - Invalid NTASKS value specified for component: cpl ntasks: 128 1
20240717 102330.411 ERROR PET0 esm.F90:203 Not valid - Passing error in return code
20240717 102330.411 ERROR PET0 ESM0001:src/addon/NUOPC/src/NUOPC_Driver.F90:794 Not valid - Passing error in return code
20240717 102330.411 ERROR PET0 ensemble:src/addon/NUOPC/src/NUOPC_Driver.F90:2898 Not valid - Phase 'IPDv02p1' Initialize for modelComp 1: ESM0001 did not return ESMF_SUCCESS
20240717 102330.411 ERROR PET0 ensemble:src/addon/NUOPC/src/NUOPC_Driver.F90:1326 Not valid - Passing error in return code
20240717 102330.411 ERROR PET0 ensemble:src/addon/NUOPC/src/NUOPC_Driver.F90:483 Not valid - Passing error in return code
20240717 102330.411 ERROR PET0 esmApp.F90:134 Not valid - Passing error in return code
20240717 102330.411 INFO PET0 Finalizing ESMF
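To separate the originating message from the "Passing error in return code" lines that follow it, here is a minimal sketch that scans ESMF log lines of the form shown above and returns the first ERROR entry that is not a propagated return code. The function name first_root_error and the regular expression are my own illustration, based only on the lines pasted in this post, not on any ESMF tool.

```python
import re

# Matches lines such as:
# 20240717 102330.411 ERROR PET0 esm.F90:948 Not valid - Invalid NTASKS ...
# (layout assumed from the log excerpt in this post)
LINE_RE = re.compile(
    r"^(?P<date>\d{8}) (?P<time>\d{6}\.\d+) (?P<level>[A-Z]+)\s+"
    r"(?P<pet>PET\d+) (?P<rest>.*)$"
)


def first_root_error(lines):
    """Return the first ERROR message that is not a propagated return code."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None or m.group("level") != "ERROR":
            continue  # skip INFO lines and anything that does not parse
        if "Passing error in return code" in m.group("rest"):
            continue  # propagated error, not the original cause
        return m.group("rest")
    return None


sample = [
    "20240717 102330.411 ERROR PET0 esm.F90:948 Not valid - Invalid NTASKS value specified for component: cpl ntasks: 128 1",
    "20240717 102330.411 ERROR PET0 esm.F90:203 Not valid - Passing error in return code",
    "20240717 102330.411 INFO PET0 Finalizing ESMF",
]

print(first_root_error(sample))
```

On the excerpt above this picks out the esm.F90:948 line about the NTASKS value for the cpl component, which appears to be the entry the later errors propagate from.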
1) Following a previous post, I reinstalled ESMF 8.5.0 but got the same error. Could you please give some advice? Note that the ESMF installation and the CTSM-ctsm5.2.005 build and submit steps all worked on my old machine; I am now trying to run on a new machine.
2) Also, what does the message "Cannot find a shanxispinup5.cpl*.r.*.nc file in directory /home/user/cesm/scratch/shanxispinup5/run" mean when running ./case.submit? I am running a cold-start spin-up case, so I don't think an r.*.nc file should be needed.