Issues porting CESM 1.0.4 (run failure) and 1.1 (setup failure)

atp42@...
I'm trying to port two versions of CESM to our local machine and have hit two errors that have stumped me. Both builds use the GNU compilers (4.4.7) and Open MPI.

The first involves version 1.0.4. I applied the same code changes to shr_sys_mod.F90 as for 1.0.3 so that the model compiles with the GNU compilers. The model builds successfully, but when I try to run it I get the following error:

 *** An error occurred in MPI_Waitall
 *** on communicator MPI_COMM_WORLD
 *** MPI_ERR_TRUNCATE: message truncated
 *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
--------------------------------------------------------------------------
mpiexec has exited due to process rank 0 with PID 2161 on
node en-bgc01.eas.cornell.edu exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------

I have tried launching with both mpirun and mpiexec, and changing the number of processes per node, but the error is the same in every case. I have also run a test MPI program to confirm that the Open MPI compilers work.

The second issue involves the EXEROOT variable. In CESM 1.1.1 I set EXEROOT in the env_build.xml file; however, when I attempt to set up/configure the model I get the following error:

Macros script already created ...skipping
Machine/Decomp/Pes configuration has already been done ...skipping
Running preview_namelist script
EXEROOT: Undefined variable.
EXEROOT: Undefined variable.
ERROR: /home/atp42/aarontest01/preview_namelists failed: 256

This differs from the error the porting guide says to expect when EXEROOT is unset: "ERROR: must set xml variable EXEROOT to build the model".
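
One way to rule out a malformed entry is to parse env_build.xml and confirm that EXEROOT is present with a non-empty value, and to list any other variables its value refers to. The sketch below is illustrative only: it assumes the file uses `<entry id="..." value="..." />` elements, and the sample contents, the /scratch path, and the helper names `find_entry` and `referenced_variables` are hypothetical, not part of the CESM scripts.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for env_build.xml; the real file sits in the case directory.
# Assumption: entries look like <entry id="NAME" value="VALUE" />.
SAMPLE_ENV_BUILD = """<config_definition>
<entry id="EXEROOT" value="/scratch/$CCSMUSER/$CASE" />
<entry id="GMAKE_J" value="4" />
</config_definition>
"""


def find_entry(xml_text, name):
    """Return the value attribute of the entry whose id is `name`, or None."""
    root = ET.fromstring(xml_text)
    for entry in root.iter("entry"):
        if entry.get("id") == name:
            return entry.get("value")
    return None


def referenced_variables(value):
    """List the $VAR or ${VAR} names embedded in a value string."""
    return re.findall(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?", value)


exeroot = find_entry(SAMPLE_ENV_BUILD, "EXEROOT")
if not exeroot:
    print("EXEROOT is missing or empty")
else:
    print("EXEROOT =", exeroot)
    # Each of these must itself be defined before EXEROOT can be expanded.
    print("depends on:", referenced_variables(exeroot))
```

To check a real case, the sample string would be replaced by the contents of the case's own env_build.xml. If the entry is present and its dependencies are defined, the "Undefined variable" message is more likely coming from how the script reads the variable than from the XML itself.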

Thank you for any advice you can give. 

 
