My CESM2.1 no longer builds; esmf_wrf_timemgr ESMF errors

raeder

Member
I'm trying to build a case using release-cesm2.1.0-2-g976a1e1, which was modified slightly to run (well) with DART.
(/glade/work/raeder/Models/cesm2_1_relsd_m5.6)
The CIME started as
commit 170db73e00130294bac98fde5dbbcdcde06a33d6 (origin/maint-5.6)
Author: jedwards4b <jedwards@ucar.edu>
Date: Tue Mar 12 05:20:53 2019 +0800
and also has small modifications, which worked fine through 2020-6.

CASEROOT = /glade/work/raeder/Exp/PMO_test0
EXEROOT = /glade/scratch/raeder/PMO_test0/bld

Now case.build fails with many errors from ESMF in esmf_wrf_timemgr.
My suspicion is that some change in modules or another environment setting is causing ESMF code
that worked before to no longer compile. I haven't been able to identify the inconsistency.

The first apparent sign of trouble is
/glade/work/raeder/Exp/PMO_test0/Tools/Makefile:906: recipe for target 'ESMF_ClockMod.o' failed
It looks like the inability to find the compiled ESMF modules leads to most of the errors reported in
/glade/scratch/raeder/PMO_test0/bld/csm_share.bldlog.210507-170428
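When a build log contains many messages, it can help to jump straight to the earliest failure rather than the later ones that cascade from it. The following is a minimal sketch, not part of CIME; the function name `first_failure` and the patterns it searches for ("error", "failed") are assumptions and may need tuning for a particular compiler.

```python
import re

# Assumed markers of a failing step; adjust for the compiler and make in use.
FAILURE_RE = re.compile(r"\berror\b|\bfailed\b", re.IGNORECASE)


def first_failure(log_path):
    """Return (line_number, text) for the first matching line, or None."""
    with open(log_path, errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if FAILURE_RE.search(line):
                return number, line.rstrip("\n")
    return None
```

Calling `first_failure` on a file such as the csm_share.bldlog above returns the first line that mentions an error or a failed target, which is usually a better starting point than the last message in the log.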

I searched the CESM Forums, but didn't find anything helpful.

(I tried to look at the Trouble Shooting guide linked from the guidelines page,
but I got a "file not found" error. Maybe it should be

$ ./manage_externals/checkout_externals --status --verbose
Processing externals description file : Externals.cfg
Processing externals description file : Externals_CLM.cfg
Processing externals description file : Externals_POP.cfg
Processing externals description file : Externals_CISM.cfg
Checking status of externals: clm, fates, ptclm, mosart, ww3, ERROR:root:SVN returned invalid XML message

ERROR: SVN returned invalid XML message

Running checkout_externals with -d does not yield any more information.
 

jedwards

CSEG and Liaisons
Staff member
This build does not use ESMF. It uses the esmf_wrf_timemgr, which is distributed with CESM in
cime/src/share/esmf_wrf_timemgr

Looking at your csm_share bld output and your /glade/scratch directory it almost looks as if you have exceeded your quota on /glade/scratch?

Try removing the bld directory and building again. If it fails the same way I'll take a closer look.
 

raeder

Member
Thanks for the quick reply!
I realize that it doesn't use ESMF for the coupling, but all of the error messages come from some form of ESMF
and the missing compiled modules are named ESMF_..., so it's not unfair to say that ESMF is having trouble.

I'm using < 1 TB of the 25 TB available on scratch/raeder.
I've run `case.build --skip-provenance-check`, which gave almost no failure information,
then the same with `-d` (in the original bld directory), which generated the ESMF error messages.

In the attempts described below I've not been able to reproduce the ESMF messages, but I got a different, reproducible error,
which seems to be related to the --skip-provenance-check argument.

Now I've moved the original bld out of the way and interactively run the debugging version in a new bld.
It hung in or after
Calling /glade/work/raeder/Models/cesm2_1_relsd_m5.6/cime/src/build_scripts/buildlib.csm_share
ar: creating libcsm_share.a
> /glade/work/raeder/Models/cesm2_1_relsd_m5.6/cime/scripts/lib/CIME/utils.py(126)expect()
-> try:
(Pdb)
I killed it.

I tried building a new case (PMO_test1) from scratch in a batch job, which called case.build with -d.
It died after the same place, but wrote a traceback:
> /glade/work/raeder/Models/cesm2_1_relsd_m5.6/cime/scripts/lib/CIME/utils.py(126)expect()
-> try:
(Pdb)
Traceback (most recent call last):
File "./case.build", line 147, in <module>
_main_func(__doc__)
File "./case.build", line 142, in _main_func
save_build_provenance=save_build_provenance)
File "/glade/work/raeder/Models/cesm2_1_relsd_m5.6/cime/scripts/Tools/../../scripts/lib/CIME/build.py",
line 570, in case_build
return run_and_log_case_status(functor, "case.build", caseroot=caseroot)
File "/glade/work/raeder/Models/cesm2_1_relsd_m5.6/cime/scripts/Tools/../../scripts/lib/CIME/utils.py",
line 1667, in run_and_log_case_status
rv = func()
File "/glade/work/raeder/Models/cesm2_1_relsd_m5.6/cime/scripts/Tools/../../scripts/lib/CIME/build.py",
line 569, in <lambda>
save_build_provenance)
File "/glade/work/raeder/Models/cesm2_1_relsd_m5.6/cime/scripts/Tools/../../scripts/lib/CIME/build.py",
line 520, in _case_build_impl
cimeroot, libroot, lid, compiler, buildlist, comp_interface)
File "/glade/work/raeder/Models/cesm2_1_relsd_m5.6/cime/scripts/Tools/../../scripts/lib/CIME/build.py",
line 243, in _build_libraries
[full_lib_path, os.path.join(exeroot, sharedpath), caseroot], logfile=file_build)
File "/glade/work/raeder/Models/cesm2_1_relsd_m5.6/cime/scripts/Tools/../../scripts/lib/CIME/utils.py",
line 347, in run_sub_or_cmd
getattr(mod, subname)(*subargs)
File "/glade/work/raeder/Models/cesm2_1_relsd_m5.6/cime/scripts/Tools/../../scripts/lib/CIME/utils.py",
line 126, in expect
try:
File "/glade/work/raeder/Models/cesm2_1_relsd_m5.6/cime/scripts/Tools/../../scripts/lib/CIME/utils.py",
line 126, in expect
try:
File "/usr/lib64/python2.7/bdb.py", line 49, in trace_dispatch
return self.dispatch_line(frame)
File "/usr/lib64/python2.7/bdb.py", line 68, in dispatch_line
if self.quitting: raise BdbQuit
bdb.BdbQuit
 

jedwards

CSEG and Liaisons
Staff member
I cloned your case and tried a build myself. In the csm_share directory you have two copies of shr_stream_mod: shr_stream_mod.F90 and shr_stream_mod_debug.F90. The build system compiles every file with the extension .F90 in the source directory, and because both of these files define the same module name, that causes an error. Rename the one that you don't want to compile: for example, if you call it shr_stream_mod.F90.debug instead of shr_stream_mod_debug.F90, it won't get in the way. But why not just put it into the SourceMods directory? Files in csm_share would go into
src.share/
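
A quick way to check a source directory for this kind of clash is to scan the .F90 files for module statements and report any module name defined in more than one file. This is a minimal sketch under stated assumptions, not a CIME tool: the function name `find_duplicate_modules` is hypothetical, and the regular expression only handles a plain `module name` statement on its own line.

```python
import re
import sys
from collections import defaultdict
from pathlib import Path

# Matches "module <name>" on its own line (optionally followed by a comment).
# "end module ..." and "module procedure ..." lines do not match.
MODULE_RE = re.compile(
    r"^\s*module\s+(?!procedure\b)([a-z]\w*)\s*(?:!.*)?$", re.IGNORECASE
)


def find_duplicate_modules(src_dir, extensions=(".F90", ".f90")):
    """Map each module name defined in more than one file to those files."""
    defined_in = defaultdict(set)
    for path in Path(src_dir).iterdir():
        # Only the final suffix counts, so foo.F90.debug is skipped.
        if not path.is_file() or path.suffix not in extensions:
            continue
        for line in path.read_text(errors="replace").splitlines():
            match = MODULE_RE.match(line)
            if match:
                # Fortran names are case-insensitive, so compare in lowercase.
                defined_in[match.group(1).lower()].add(path.name)
    return {
        name: sorted(files)
        for name, files in defined_in.items()
        if len(files) > 1
    }


if __name__ == "__main__":
    # Usage: python find_dups.py <source directory>
    for name, files in find_duplicate_modules(sys.argv[1]).items():
        print(f"module {name} is defined in: {', '.join(files)}")
```

Run against a directory holding both shr_stream_mod.F90 and shr_stream_mod_debug.F90 (each defining module shr_stream_mod), the script would list the two files; after renaming the debug copy to shr_stream_mod.F90.debug it would report nothing.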
 

raeder

Member
That was it! Thanks, you saved me hours of anguish.
I should have seen that, but I was distracted by all of the explicit "error"s and "warning"s.
 