
CESM 1.2 on Stampede problem

I am trying to run CESM1.2 with the G compset on Stampede@TACC:

./create_newcase -case $1 -res T62_g37 -compset G -mach stampede

The model compiled and built successfully. However, when I submit it, it shows the following errors. What do they mean? Is it the model's problem or a Stampede configuration problem?

I successfully ran CESM1.1.1 on our school's cluster several weeks ago. Now I'm trying to run CESM1.2 on Stampede for a test drive.

Shane

-----------------------------------
login1$ vi output.933992
TACC: Starting up job 933992
TACC: Setting up parallel environment for MVAPICH2+mpispawn.
TACC: Starting parallel tasks...
rm: cannot remove `env_case': No such file or directory
env_case: No such file or directory.
BUILD_COMPLETE: Undefined variable.
[c559-302.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process (rank: 13, pid: 107015) exited with status 1
rm: cannot remove `env_case': No such file or directory
ccsm_getenv error: problem removing env_case
rm: cannot remove `env_case': No such file or directory
ccsm_getenv error: problem removing env_case
rm: cannot remove `env_case': No such file or directory
ccsm_getenv error: problem removing env_case
[c559-302.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process (rank: 0, pid: 107002) exited with status 254
ccsm_getenv error: problem removing env_case
rm: cannot remove `env_case': No such file or directory
ccsm_getenv error: problem removing env_case
....
 

jedwards

CSEG and Liaisons
Staff member
The Stampede machine is brand new to us and should not have been included in the release. There are a number of issues that we still need to work out with respect to the port. In particular, the file scripts/ccsm_utils/Machines/config_machines.xml has several directories pointing at a particular user's $WORK directory, and you don't have write permission there. If you really want to run on Stampede, you can start by changing those paths to your own. The following seems to work, but then there are problems in the module load command. I apologize; Stampede should not have been included in our list of supported machines.
        TACC DELL, os is Linux, 16 pes/node, batch system is SLURM
        LINUX
        intel,intel-mic
        mvapich2,impi,mpi-serial
        $WORK/$CASE/run
        $WORK/$CASE/bld
        $WORK/inputdata
        $WORK/lmwg
        $WORK/archive/$CASE
        csm/$CASE             
        $WORK/ccsm_baselines
        $WORK/tools/cprnc/cprnc
        squeue
        sbatch
        srinathv -at- ucar.edu
        8
        32
        16
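
Before editing the paths, it may help to confirm which directories you can actually write to. The following is only a minimal bash sketch, not part of the CESM scripts; the function name check_writable and the example paths are hypothetical.

```shell
#!/bin/bash
# Report whether each directory given as an argument exists and is writable.
# Returns non-zero if any of them is missing or not writable.
check_writable() {
    local status=0 dir
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "ok:       $dir"
        else
            echo "no write: $dir"
            status=1
        fi
    done
    return "$status"
}

# Example (placeholder paths; use the ones from your own config_machines.xml):
# check_writable "$WORK" "$WORK/inputdata" "$WORK/archive"
```

Any path reported as "no write" is one that would have to be created or pointed somewhere else before the case can run.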
 
 
I have already noticed the directory problem and fixed it in config_machines.xml. Here's my version.

        TACC DELL, os is Linux, 16 pes/node, batch system is SLURM
        LINUX
        intel,intel-mic
        mvapich2,impi,mpi-serial
        /home1/02489/xm7303/CESM/cesm1_2_0/scripts/$CASE/run
        /home1/02489/xm7303/CESM/cesm1_2_0/scripts/$CASE/bld
        /home1/02489/xm7303/CESM/inputdata
        /home1/02489/xm7303/CESM/lmwg
        /home1/02489/xm7303/CESM/archive/$CASE
        csm/$CASE            
        /home1/02489/xm7303/CESM/ccsm_baselines
        /home1/02489/xm7303/CESM/cesm1_2_0/tools/cprnc/cprnc
        squeue
        sbatch
        srinathv -at- ucar.edu
        8
        32
        16

Also, the TACC team has solved another compile problem that occurred when loading modules. The fix goes into env_mach_specific.stampede; see below:
-------------------------------------
From: Robert McLay
Date: Tue, 11 Jun 2013 11:16:55
Subject: Errors when Compiling CESM model
 Response:
 I did miss the top of the shell script.  It is run with "-f", which means that it ignores the system cshrc as well as ~/.cshrc.  In that case it must do something to define the module command, so I would recommend that they do:
   
Code:
source /etc/profile.d/tacc_modules.csh

 This is much safer as this will always define the module command for the csh and is very unlikely to change as Lmod gets updated.
------------------------------------------------------
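
For scripts written in bash rather than csh, the same idea can be sketched as below. This is an assumption-laden example, not something TACC provided: the function name ensure_module_cmd is made up, and the default path is the bash counterpart of the csh init file, which may differ on other systems.

```shell
#!/bin/bash
# Define the `module` command if the current shell does not have it yet.
# Pass a different init file as $1 if your system keeps it elsewhere.
ensure_module_cmd() {
    local init_file="${1:-/etc/profile.d/tacc_modules.sh}"
    if command -v module >/dev/null 2>&1; then
        return 0                      # already defined, nothing to do
    fi
    if [ -r "$init_file" ]; then
        . "$init_file"                # defines `module` for this shell
        command -v module >/dev/null 2>&1
    else
        echo "module init file not found: $init_file" >&2
        return 1
    fi
}

# Example:
# ensure_module_cmd && module load perl
```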
However, the phenomenon is the same. I still don't understand why the error below happened (in my log):
ccsm_getenv error: problem removing env_case
rm: cannot remove `env_case': No such file or directory
I am wondering what else I should try. Thanks,
Shane
 

jedwards

CSEG and Liaisons
Staff member
Hi Shane,
I think that you are just running out of disk space. The $HOME filesystem on Stampede is very small; you should use $WORK and/or $SCRATCH.
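
One quick way to compare the filesystems is with df. This is just a sketch; the helper name avail_kb is made up, and per-user quotas may not be reflected in what df reports.

```shell
#!/bin/bash
# Print the free space, in kilobytes, on the filesystem holding a directory.
# `df -Pk` gives POSIX-format output; column 4 of line 2 is "Available".
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Example: compare the filesystems before choosing where to build.
# for d in "$HOME" "$WORK" "$SCRATCH"; do
#     echo "$d: $(avail_kb "$d") KB free"
# done
```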
 
I rebuilt the model in my $WORK directory, and it has been successfully compiled. However, the phenomenon is the same. See below. So the disk space is not the major problem. I don't understand why the model wants to remove `env_case' anyway. What is it?

-------------------------------------------
login4$ vi output.942179
TACC: Starting up job 942179
TACC: Setting up parallel environment for MVAPICH2+mpispawn.
TACC: Starting parallel tasks...
rm: cannot remove `env_case': No such file or directory
ccsm_getenv error: problem removing env_case
[c557-604.stampede.tacc.utexas.edu:mpispawn_0][child_handler] MPI process (rank: 8, pid: 6553) exited with status 254
rm: cannot remove `env_case': No such file or directory
ccsm_getenv error: problem removing env_case
rm: cannot remove `env_case': No such file or directory
ccsm_getenv error: problem removing env_case
.....
 

jedwards

CSEG and Liaisons
Staff member
Hi Shane, the Stampede port is incomplete; it should not have been in the 1.2.0 release code. I will work over the next week or so to complete the port and will provide you with instructions for running there as soon as they are ready.
 

jedwards

CSEG and Liaisons
Staff member
We are preparing CESM 1.2.2 for a planned release date of June 1. It will include support for Stampede.
 
Hi, I just downloaded the CESM 1.2.2 svn trunk onto Stampede and tried to list the available configurations before creating a new case:

login2$ ./create_newcase -list

But I get a warning about XML. I have loaded the Perl modules and MPICH2 with the compatible Intel compilers. The warning is:

WARNING: The perl module XML::LibXML is needed for XML parsing in the CESM script system. Please contact your local systems administrators or IT staff and have them install it for you, or install the module locally.

If anyone has successfully compiled and run CESM 1.2.2 on Stampede, could you please share the environment variables that need to be set before compilation? Thanks.
 

jedwards

CSEG and Liaisons
Staff member
. /etc/profile.d/tacc_modules.sh
module load perl
export CESMDATAROOT=/scratch/projects/xsede/CESM/
export PERL_LOCAL_LIB_ROOT="$CESMDATAROOT/perl5";
export PERL_MB_OPT="--install_base $CESMDATAROOT/perl5";
export PERL_MM_OPT="INSTALL_BASE=$CESMDATAROOT/perl5";
export PERL5LIB="$CESMDATAROOT/perl5/lib/perl5/x86_64-linux-thread-multi:$CESMDATAROOT/perl5/lib/perl5";
 
Hi, I have the same problem with Perl. The error is:

WARNING: The perl module XML::LibXML is needed for XML parsing in the CESM script system. Please contact your local systems administrators or IT staff and have them install it for you, or install the module locally.

Could you help me? Thank you very much!
 

jedwards

CSEG and Liaisons
Staff member
If you are on Stampede, follow the instructions in the post above. If you are on another system, get XML::LibXML from CPAN: http://search.cpan.org/~shlomif/XML-LibXML-2.0117/LibXML.pod
 