Hi Cheryl,
Thank you for these detailed instructions!
1. I tried this set of commands with a fully coupled case, BHIST_BPRP. However, after I set ntasks to 1 and tried to build the case, the build failed with:
ERROR: BUILD FAIL: buildexe failed, cat /glade/scratch/xygao/case1_BHIST_BPRP_singlecore/bld/cesm.bldlog.210516-170841
The last several lines of that log file are:
ld: failed to convert GOTPCREL relocation; relink with --no-relax
/glade/u/home/xygao/cases/case1_BHIST_BPRP_singlecore/Tools/Makefile:874: recipe for target '/glade/scratch/xygao/case1_BHIST_BPRP_singlecore/bld/cesm.exe' failed
gmake: *** [/glade/scratch/xygao/case1_BHIST_BPRP_singlecore/bld/cesm.exe] Error 1
Do you have any idea why the build would fail with ntasks=1?
2. Because ntasks=1 makes the build fail, I left the pe-layout unchanged. That is, the debugging run also uses 20 nodes, 720 tasks, and one thread. Here is the command set that I used:
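In case it helps anyone reproduce this, the relevant lines can be pulled out of a long build log with a small filter. This is only a sketch: show_link_errors is a hypothetical helper name, not part of CIME, and the pattern assumes that linker messages contain "ld:" and that make reports failures as "Error N", as in the excerpt above.

```shell
# Hypothetical helper (not part of CIME): print linker-related lines from a build log.
# Usage: show_link_errors /path/to/cesm.bldlog.<timestamp>
show_link_errors() {
    log="$1"
    if [ ! -r "$log" ]; then
        echo "cannot read log: $log" >&2
        return 2
    fi
    # Match lines emitted by ld (bare or with a leading path),
    # plus the "Error N" summary lines printed by make/gmake.
    grep -E '(^|/)ld:|^g?make.*Error [0-9]+' "$log"
}
```

Called on the bldlog path printed in the BUILD FAIL message, it should print only the ld and gmake lines, which is easier to paste into a post than the whole log.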
cd my_cesm_sandbox/cime/scripts
./create_newcase --case ~/cases/case4_BHIST_BPRP --compset BHIST_BPRP --res f09_g17 --project $PROJECT
cd ~/cases/case4_BHIST_BPRP
./case.setup
./xmlchange DEBUG=TRUE
qcmd -- ./case.build
np=720
nthreads=1
source .env_mach_specific.sh
RUNDIR=`./xmlquery RUNDIR --value`
EXEROOT=`./xmlquery EXEROOT --value`
LID=`date '+%y%m%d-%H%M%S'`
cd $RUNDIR
mkdir timing
mkdir timing/checkpoints
echo `pwd`
export OMP_NUM_THREADS=$nthreads
ddt --connect ${EXEROOT}/cesm.exe
However, the debugger stopped at line 58 of the main program (cime_driver.F90): call cime_pre_init1(esmf_logfile_option)
and reported the following error:
MPT ERROR: MPI_COMM_WORLD rank 0 has terminated without calling MPI_Finalize()
aborting job
Do you have any idea what might cause this failure in the debugger? I suspect there are still some issues with the environment settings. I would appreciate any clues you could give me.
Best,
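To rule out the simplest environment problems before launching ddt, a pre-launch check along these lines could be run first. It is a minimal sketch under my own assumptions: check_debug_env is a hypothetical function (not part of CIME or DDT), and it only confirms that the executable, the run directory, and OMP_NUM_THREADS are in place. It cannot detect a problem with the MPI launch itself.

```shell
# Hypothetical pre-launch check; not part of CIME or DDT.
# Usage: check_debug_env "$EXEROOT" "$RUNDIR"
check_debug_env() {
    exeroot="$1"
    rundir="$2"
    status=0
    # The debugger needs an existing, executable cesm.exe.
    if [ ! -x "$exeroot/cesm.exe" ]; then
        echo "missing or non-executable: $exeroot/cesm.exe" >&2
        status=1
    fi
    # The run directory must exist before cd-ing into it.
    if [ ! -d "$rundir" ]; then
        echo "run directory not found: $rundir" >&2
        status=1
    fi
    # The thread count should be exported before launch.
    if [ -z "${OMP_NUM_THREADS:-}" ]; then
        echo "OMP_NUM_THREADS is not set" >&2
        status=1
    fi
    return "$status"
}
```

With the variables from my command set, this would be called as check_debug_env "$EXEROOT" "$RUNDIR" just before the ddt line. A return status of 0 means only that these three basics are present.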
Eric