Porting CESM to a new machine: configuration problem, not using module

Nuanliang

New Member
Hi everyone,

I'm trying to port cesm to a new machine (linux server with no batch system). First, I installed the cesm2.1.5 following the steps described here (Downloading CESM2 (CESM2.1) — CESM CESM2.1 documentation), then created a ~/.cime/config_machines.xml specifically for this machine by referring to the description here (6. Porting and validating CIME on a new platform — CIME master documentation). The configuration template is /data/warmcold/my_cesm/cime/config/xml_schemas/config_machines_template.xml.

The part of the configuration confusing to me is:

<module_system type="module">
  <init_path lang="perl">/glade/u/apps/ch/opt/lmod/7.2.1/lmod/lmod/init/perl</init_path>
  <init_path lang="python">/glade/u/apps/ch/opt/lmod/7.2.1/lmod/lmod/init/env_modules_python.py</init_path>
  <init_path lang="csh">/glade/u/apps/ch/opt/lmod/7.2.1/lmod/lmod/init/csh</init_path>
  <init_path lang="sh">/glade/u/apps/ch/opt/lmod/7.2.1/lmod/lmod/init/sh</init_path>
  <cmd_path lang="perl">/glade/u/apps/ch/opt/lmod/7.2.1/lmod/lmod/libexec/lmod perl</cmd_path>
  <cmd_path lang="python">/glade/u/apps/ch/opt/lmod/7.2.1/lmod/lmod/libexec/lmod python</cmd_path>
  <cmd_path lang="sh">module</cmd_path>
  <cmd_path lang="csh">module</cmd_path>
  ......

The perl (v5.32.1) and python (3.9.18) that I'm using are in my conda environment, and I don't know how to specify them in the configuration file ~/.cime/config_machines.xml. I'm also not sure whether I included the csh and sh entries correctly. When I run scripts_regression_tests.py to test my setup, it shows many failures (the test output is attached below).

It would be great if someone can give me some help on these problems. Thank you in advance!
Best,
Nuanliang

~/.cime/config_machines.xml is shown here below:
<?xml version="1.0"?>
<config_machines version="2.0">
  <machine MACH="whirls">
    <DESC>group server with 104 pes in 2 nodes, no batch system</DESC>
    <NODENAME_REGEX>whirls.uchicago.edu</NODENAME_REGEX>
    <OS>LINUX</OS>
    <COMPILERS>intel</COMPILERS>
    <MPILIBS>intelmpi</MPILIBS>
    <PROJECT>nobatch</PROJECT>
    <SAVE_TIMING_DIR> </SAVE_TIMING_DIR>
    <SAVE_TIMING_DIR_PROJECTS> </SAVE_TIMING_DIR_PROJECTS>
    <CIME_OUTPUT_ROOT>/data/warmcold/project2/dry_held_suarez</CIME_OUTPUT_ROOT>
    <DIN_LOC_ROOT>/data/warmcold/project2/inputdata</DIN_LOC_ROOT>
    <DIN_LOC_ROOT_CLMFORC>/data/warmcold/project2/clmforcing</DIN_LOC_ROOT_CLMFORC>
    <DOUT_S_ROOT>/data/warmcold/project2/archive</DOUT_S_ROOT>
    <BASELINE_ROOT>/data/warmcold/project2/cesm_baselines</BASELINE_ROOT>
    <CCSM_CPRNC>$ENV{CESMDATAROOT}/tools/cime/tools/cprnc/cprnc.cheyenne</CCSM_CPRNC>
    <GMAKE></GMAKE>
    <GMAKE_J>8</GMAKE_J>
    <BATCH_SYSTEM>none</BATCH_SYSTEM>
    <SUPPORTED_BY>nuanliang: warmcold@uchicago.edu</SUPPORTED_BY>
    <MAX_TASKS_PER_NODE>40</MAX_TASKS_PER_NODE>
    <MAX_MPITASKS_PER_NODE>40</MAX_MPITASKS_PER_NODE>
    <PROJECT_REQUIRED>FALSE</PROJECT_REQUIRED>
    <mpirun mpilib="intelmpi">
      <executable>mpirun</executable>
      <arguments>
        <arg name="num_tasks">-n {{ total_tasks }}</arg>
      </arguments>
    </mpirun>
    <module_system type="module">
      <init_path lang="sh">/usr/share/Modules/init/sh</init_path>
      <init_path lang="csh">/usr/share/Modules/init/csh</init_path>
      <cmd_path lang="sh">module</cmd_path>
      <cmd_path lang="csh">module</cmd_path>
      <modules compiler='intel'>
        <command name="purge"/>
        <command name="load">intel/2020</command>
        <command name="load">hdf5/1.12.2</command>
        <command name="load">mkl/2020</command>
        <command name="load">intelmpi/2020</command>
        <command name="load">netcdf/4.9.0+intel-2020</command>
      </modules>
    </module_system>
    <!-- environment variables, a blank entry will unset a variable -->
    <environment_variables>
      <env name="OMP_STACKSIZE">64M</env>
      <env name="MPI_TYPE_DEPTH">16</env>
    </environment_variables>
    <!-- resource settings as defined in https://docs.python.org/2/library/resource.html -->
    <resource_limits>
      <resource name="RLIMIT_STACK">-1</resource>
    </resource_limits>
  </machine>
 

Attachments

  • test_output.txt
    406.3 KB · Views: 1

jedwards

CSEG and Liaisons
Staff member
These are not the paths to the perl and python executables; they are the paths to the lmod init and libexec files needed to use python and perl with lmod modules.
If you are using lmod, you should find these in the lmod install tree. If you are not using lmod, you can set <module_system type="none"> and ignore this section.
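
To make the distinction concrete, here is a minimal sketch (not part of CIME) that reads a <module_system> block and reports whether each init_path and cmd_path points at a file that exists. The XML fragment is a hypothetical example modelled on the entries in this thread, and check_module_system is a helper name made up for illustration.

```python
import os
import xml.etree.ElementTree as ET

# Hypothetical fragment modelled on the <module_system> entries in this thread.
MODULE_SYSTEM_XML = """
<module_system type="module">
  <init_path lang="sh">/usr/share/Modules/init/sh</init_path>
  <init_path lang="python">/usr/share/Modules/init/python.py</init_path>
  <cmd_path lang="sh">module</cmd_path>
  <cmd_path lang="python">/usr/share/Modules/libexec/modulecmd.tcl python</cmd_path>
</module_system>
"""


def check_module_system(xml_text):
    """Return (tag, lang, path, exists) tuples for every init_path/cmd_path entry."""
    root = ET.fromstring(xml_text)
    report = []
    for tag in ("init_path", "cmd_path"):
        for node in root.findall(tag):
            text = (node.text or "").strip()
            # A cmd_path may carry arguments ("modulecmd.tcl python"),
            # so only the first token can be a file.
            first = text.split()[0] if text else ""
            # Bare commands such as "module" are usually shell functions,
            # so only absolute paths are checked on disk.
            exists = os.path.isfile(first) if first.startswith("/") else None
            report.append((tag, node.get("lang"), first, exists))
    return report


if __name__ == "__main__":
    labels = {True: "found", False: "MISSING", None: "not a file path"}
    for tag, lang, path, exists in check_module_system(MODULE_SYSTEM_XML):
        print("{} lang={} {} -> {}".format(tag, lang, path, labels[exists]))
```

None of the entries is an interpreter binary: each one is a module-system support file (or the module command itself) for the given language.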
 

Nuanliang

New Member
Hi jedwards,

Thanks for the clarification. I don't think we use lmod on our server. But I do find similar files inside directory /usr/share/Modules/.
$ ls -l /usr/share/Modules/init/
total 116K
-rw-r--r--. 1 root root 4.1K Apr 25 2023 bash
-rw-r--r--. 1 root root 12K Apr 25 2023 bash_completion
-rw-r--r--. 1 root root 1.9K Apr 25 2023 cmake
-rw-r--r--. 1 root root 3.8K Apr 25 2023 csh
-rw-r--r--. 1 root root 1.7K Apr 25 2023 fish
-rw-r--r--. 1 root root 9.5K Apr 25 2023 fish_completion
-rw-r--r--. 1 root root 4.1K Apr 25 2023 ksh
drwxr-xr-x. 2 root root 46 Jan 20 01:58 ksh-functions
-rw-r--r--. 1 root root 3.2K Apr 25 2023 lisp
-rw-r--r--. 1 root root 1.1K Apr 25 2023 perl.pm
-rw-r--r--. 1 root root 197 Apr 25 2023 profile-compat.csh
-rw-r--r--. 1 root root 309 Apr 25 2023 profile-compat.sh
-rw-r--r--. 1 root root 105 Apr 25 2023 profile.csh
-rw-r--r--. 1 root root 448 Apr 25 2023 profile.sh
-rw-r--r--. 1 root root 1.6K Apr 25 2023 python.py
-rw-r--r--. 1 root root 961 Apr 25 2023 r.R
-rw-r--r--. 1 root root 1.4K Apr 25 2023 ruby.rb
-rw-r--r--. 1 root root 4.0K Apr 25 2023 sh
-rw-r--r--. 1 root root 1.1K Apr 25 2023 tcl
-rw-r--r--. 1 root root 3.8K Apr 25 2023 tcsh
-rw-r--r--. 1 root root 5.4K Apr 25 2023 tcsh_completion
-rw-r--r--. 1 root root 4.1K Apr 25 2023 zsh
drwxr-xr-x. 2 root root 21 Jan 20 01:58 zsh-functions
$ ls -l /usr/share/Modules/libexec/
total 388K
-rwxr-xr-x. 1 root root 385K Apr 25 2023 modulecmd.tcl

So I modified this part of my local configuration file to be like:
<module_system type="module">
  <init_path lang="perl">/usr/share/Modules/init/perl.pm</init_path>
  <init_path lang="python">/usr/share/Modules/init/python.py</init_path>
  <init_path lang="sh">/usr/share/Modules/init/sh</init_path>
  <init_path lang="csh">/usr/share/Modules/init/tcsh</init_path>
  <cmd_path lang="perl">/usr/share/Modules/libexec/modulecmd.tcl perl </cmd_path>
  <cmd_path lang="python">/usr/share/modules/libexec/modulecmd.tcl python</cmd_path>
  <cmd_path lang="sh">/usr/share/modules/libexec/modulecmd.tcl sh</cmd_path>
  <cmd_path lang="csh">/usr/share/modules/libexec/modulecmd.tcl tcsh</cmd_path>
  <!-- <module_system type="none"> -->
  <modules compiler='intel'>
    <command name="purge"/>
    <command name="load">intel/2020</command>
    <command name="load">hdf5/1.12.2</command>
    <command name="load">mkl/2020</command>
    <command name="load">intelmpi/2020</command>
    <command name="load">netcdf/4.9.0+intel-2020</command>
  </modules>
</module_system>

When I rerun the scripts_regression_tests.py script, it shows similar failure messages. If I comment out the init_path and cmd_path entries (for perl, python, sh, tcsh), similar errors appear again. The errors fall into 3 groups:

(1) "SystemExit: ERROR: No machine cori-haswell found" and "NameError: name 'machobj' is not defined"
(2) "NameError: name 'sys' is not defined"
(3) "b'ERROR: module command None purge failed with message:\n/bin/sh: None: command not found'"

For (2), if I run python interactively, I can successfully import the pylint and sys packages. For (3), /bin/sh is soft linked to /usr/bin/bash. I'm not sure why these errors show up and wonder whether they are related to my init_path and cmd_path configuration. Could you give me some help with these problems? Thanks! I have also attached the full test output below.

$ ls -l /bin/sh
lrwxrwxrwx. 1 root root 4 Jun 20 2022 /bin/sh -> bash

Nuanliang
 

Attachments

  • test_out.txt
    503.6 KB · Views: 0

jedwards

CSEG and Liaisons
Staff member
If you have that path on your system you do have lmod. You should read your system documentation on how to properly use it; it will ultimately make your life easier.
I just double checked our 2.1.5 release and I can't find any reference to cori-haswell there. I suspect that somehow you didn't fully update to 2.1.5? The module errors should not otherwise have anything to do with the python errors that you are getting.
 

Nuanliang

New Member
Hi jedwards,

Thanks for the suggestions. Yes, we do have lmod on our server. I modified the part of configuration file ~/.cime/config_machines.xml to:

<init_path lang="perl">/usr/share/Modules/init/perl.pm</init_path>
<init_path lang="python">/usr/share/Modules/init/python.py</init_path>
<init_path lang="sh">/usr/share/Modules/init/sh</init_path>
<init_path lang="csh">/usr/share/Modules/init/tcsh</init_path>
<cmd_path lang="perl">/usr/share/Modules/libexec/modulecmd.tcl perl </cmd_path>
<cmd_path lang="python">/usr/share/Modules/libexec/modulecmd.tcl python</cmd_path>
<cmd_path lang="sh">/usr/share/Modules/libexec/modulecmd.tcl sh autoinit</cmd_path>
<cmd_path lang="csh">/usr/share/Modules/libexec/modulecmd.tcl tcsh</cmd_path>

As for the version of cesm, I followed the exact steps described here (Downloading CESM2 (CESM2.1) — CESM CESM2.1 documentation):
$ git clone -b release-cesm2.1.5 https://github.com/ESCOMP/CESM.git my_cesm
$ cd my_cesm
$ git checkout release-cesm2.1.5
$ ./manage_externals/checkout_externals

I also double-checked all the individual model components:
$ ./manage_externals/checkout_externals -S
Processing externals description file : Externals.cfg (/data/warmcold/my_cesm)
Processing externals description file : Externals_CAM.cfg (/data/warmcold/my_cesm/components/cam)
Processing externals description file : Externals_CISM.cfg (/data/warmcold/my_cesm/components/cism)
Processing externals description file : Externals_CLM.cfg (/data/warmcold/my_cesm/components/clm)
Processing externals description file : Externals_POP.cfg (/data/warmcold/my_cesm/components/pop)
Checking local status of required & optional components: cam, chem_proc, carma, clubb, cosp2, cice, cime, cism, source_cism, clm, fates, mosart, pop, cvmix, marbl, rtm, ww3,
    ./cime
    ./components/cam
    ./components/cam/chem_proc
    ./components/cam/src/physics/carma/base
    ./components/cam/src/physics/clubb
    ./components/cam/src/physics/cosp2/src
    ./components/cice
    ./components/cism
    ./components/cism/source_cism
    ./components/clm
    ./components/clm/src/fates
    ./components/mosart
    ./components/pop
    ./components/pop/externals/CVMix
    ./components/pop/externals/MARBL
    ./components/rtm
    ./components/ww3
So I think this means the checkout is fully updated to 2.1.5? When I rerun the scripts_regression_tests.py script in /data/warmcold/my_cesm/cime/scripts/tests, the same errors show up again (only the 3rd is different):
(1) "SystemExit: ERROR: No machine cori-haswell found" and "NameError: name 'machobj' is not defined"
(2) "NameError: name 'sys' is not defined"
(3) some compilation errors associated with files not found: " ERRPUT: File not found: lnd2rof_fmapname = "lnd/clm2/mappingdata/maps/1.9x2.5/map_1.9x2.5_nomask_to_0.5x0.5_nomask_aave_da_c120522.nc", will attempt to download in check_input_data phase"

For (2), as indicated by this post (python: sys is not defined), it seems that some error occurs before the python program can import the sys package, which is probably related to (1).

I have copied part of the error messages related to (1) below:
FAIL: test_CIMEXML_doctests (__main__.A_RunUnitTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/warmcold/my_cesm/cime/scripts/tests/scripts_regression_tests.py", line 139, in test_CIMEXML_doctests
    run_cmd_assert_result(self, "PYTHONPATH=%s:$PYTHONPATH python -m doctest *.py 2>&1" % LIB_DIR, from_dir=os.path.join(LIB_DIR,"CIME","XML"))
  File "/data/warmcold/my_cesm/cime/scripts/tests/scripts_regression_tests.py", line 76, in run_cmd_assert_result
    test_obj.assertEqual(stat, expected_stat, msg=msg)
AssertionError: 1 != 0 :
    COMMAND: PYTHONPATH=/data/warmcold/my_cesm/cime/scripts/lib:$PYTHONPATH python -m doctest *.py 2>&1
    FROM_DIR: /data/warmcold/my_cesm/cime/scripts/lib/CIME/XML
    SHOULD HAVE WORKED, INSTEAD GOT STAT 1
    OUTPUT:
**********************************************************************
File "/data/warmcold/my_cesm/cime/scripts/lib/CIME/XML/machines.py", line 282, in machines.Machines.has_batch_system
Failed example:
    machobj = Machines(machine="cori-haswell")
Exception raised:
    Traceback (most recent call last):
      File "/data/warmcold/miniconda3/envs/cesm/lib/python3.9/doctest.py", line 1334, in __run
        exec(compile(example.source, filename, "single",
      File "<doctest machines.Machines.has_batch_system[0]>", line 1, in <module>
        machobj = Machines(machine="cori-haswell")
      File "/data/warmcold/my_cesm/cime/scripts/lib/CIME/XML/machines.py", line 56, in __init__
        self.set_machine(machine)
      File "/data/warmcold/my_cesm/cime/scripts/lib/CIME/XML/machines.py", line 170, in set_machine
        self.machine_node = super(Machines,self).get_child("machine", {"MACH" : machine}, err_msg="No machine {} found".format(machine))
      File "/data/warmcold/my_cesm/cime/scripts/lib/CIME/XML/generic_xml.py", line 255, in get_child
        expect(len(children) == 1, err_msg if err_msg else "Expected one child")
      File "/data/warmcold/my_cesm/cime/scripts/lib/CIME/utils.py", line 131, in expect
        raise exc_type(msg)
    SystemExit: ERROR: No machine cori-haswell found
**********************************************************************
File "/data/warmcold/my_cesm/cime/scripts/lib/CIME/XML/machines.py", line 283, in machines.Machines.has_batch_system
Failed example:
    machobj.has_batch_system()
Exception raised:
    Traceback (most recent call last):
      File "/data/warmcold/miniconda3/envs/cesm/lib/python3.9/doctest.py", line 1334, in __run
        exec(compile(example.source, filename, "single",
      File "<doctest machines.Machines.has_batch_system[1]>", line 1, in <module>
        machobj.has_batch_system()
    NameError: name 'machobj' is not defined
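
The two failures in (1) look like one problem seen twice: the first doctest example never binds machobj because the constructor exits, so the second example fails with a NameError. Below is a minimal sketch of that cascade, not CIME code. make_machine and run_examples are hypothetical stand-ins, and the shared-namespace loop only imitates how doctest runs consecutive examples.

```python
def make_machine(name):
    """Hypothetical stand-in for a constructor that rejects unknown machines."""
    known = {"whirls"}
    if name not in known:
        raise SystemExit("ERROR: No machine {} found".format(name))
    return name


def run_examples(examples, namespace):
    """Run statements one by one in a shared namespace and record each outcome."""
    outcomes = []
    for source in examples:
        try:
            exec(source, namespace)
            outcomes.append("ok")
        except BaseException as exc:  # SystemExit is not an Exception subclass
            outcomes.append(type(exc).__name__)
    return outcomes


FAILING = ["machobj = make_machine('cori-haswell')", "machobj.upper()"]
PASSING = ["machobj = make_machine('whirls')", "machobj.upper()"]

if __name__ == "__main__":
    print(run_examples(FAILING, {"make_machine": make_machine}))
    print(run_examples(PASSING, {"make_machine": make_machine}))
```

Once the first example succeeds, the NameError goes away without any further change, which fits the idea that only the missing machine entry needs fixing.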

I wonder if I miss something, or I didn't set up my python environment correctly? Thanks for your help!

Nuanliang

PS, the total tests output is attached.
 

Attachments

  • test_out.txt
    422.3 KB · Views: 0

jedwards

CSEG and Liaisons
Staff member
Try updating cime to the head of the 2.1 branch as follows:
cd cesm/cime
git pull origin maint-5.6
git checkout maint-5.6

This will remove the reference to cori-haswell.
Issue 3 is a warning that you should be able to safely ignore.
Issue 2 may be a side effect of issue 1.
 

Nuanliang

New Member
Hi jedwards,

Thanks for your prompt response! I updated my_cesm/cime using the 3 commands you listed and reran scripts_regression_tests.py. Of the three errors from before:
(1) "SystemExit: ERROR: No machine cori-haswell found" and "NameError: name 'machobj' is not defined"
(2) "NameError: name 'sys' is not defined"
(3) some compilation errors associated with files not found: " ERRPUT: File not found: lnd2rof_fmapname = "lnd/clm2/mappingdata/maps/1.9x2.5/map_1.9x2.5_nomask_to_0.5x0.5_nomask_aave_da_c120522.nc", will attempt to download in check_input_data phase"
Error (1) is completely solved, yet errors (2) and (3) remain. Error (2) seems to be entirely related to the pylint package: all the tests associated with pylint failed.
test_pylint_config_e3sm_tests_py (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools___init___py (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_archive_metadata (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_bless_test_results (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_case_build (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_case_cmpgen_namelists (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_case_diff (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_case_qstatus (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_case_setup (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_case_submit (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_check_case (__main__.B_CheckCode) ... FAIL
......

Under my conda environment, python is 3.9.18 and pylint is 3.0.3.
$ pylint --version
pylint 3.0.3
astroid 3.0.3
Python 3.9.18 | packaged by conda-forge | (main, Dec 23 2023, 16:33:10) [GCC 12.3.0]

The detailed error messages read like:
FAIL: test_pylint_config_e3sm_tests_py (__main__.B_CheckCode)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/warmcold/my_cesm/cime/scripts/tests/scripts_regression_tests.py", line 2758, in test
    self.assertTrue(result == "", msg=result)
AssertionError: False is not true : Traceback (most recent call last):
  File "/data/warmcold/miniconda3/envs/cesm/bin/pylint", line 10, in <module>
    sys.exit(run_pylint())
  File "/data/warmcold/miniconda3/envs/cesm/lib/python3.9/site-packages/pylint/__init__.py", line 34, in run_pylint
    PylintRun(argv or sys.argv[1:])
  File "/data/warmcold/miniconda3/envs/cesm/lib/python3.9/site-packages/pylint/lint/run.py", line 136, in __init__
    args = _preprocess_options(self, args)
  File "/data/warmcold/miniconda3/envs/cesm/lib/python3.9/site-packages/pylint/config/utils.py", line 256, in _preprocess_options
    cb(run, value)
  File "/data/warmcold/miniconda3/envs/cesm/lib/python3.9/site-packages/pylint/config/utils.py", line 152, in _init_hook
    exec(value)  # pylint: disable=exec-used
  File "<string>", line 1, in <module>
NameError: name 'sys' is not defined
======================================================================
FAIL: test_pylint_scripts_Tools___init___py (__main__.B_CheckCode)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/warmcold/my_cesm/cime/scripts/tests/scripts_regression_tests.py", line 2758, in test
    self.assertTrue(result == "", msg=result)
AssertionError: False is not true : Traceback (most recent call last):
  File "/data/warmcold/miniconda3/envs/cesm/bin/pylint", line 10, in <module>
    sys.exit(run_pylint())
  File "/data/warmcold/miniconda3/envs/cesm/lib/python3.9/site-packages/pylint/__init__.py", line 34, in run_pylint
    PylintRun(argv or sys.argv[1:])
  File "/data/warmcold/miniconda3/envs/cesm/lib/python3.9/site-packages/pylint/lint/run.py", line 136, in __init__
    args = _preprocess_options(self, args)
  File "/data/warmcold/miniconda3/envs/cesm/lib/python3.9/site-packages/pylint/config/utils.py", line 256, in _preprocess_options
    cb(run, value)
  File "/data/warmcold/miniconda3/envs/cesm/lib/python3.9/site-packages/pylint/config/utils.py", line 152, in _init_hook
    exec(value)  # pylint: disable=exec-used
  File "<string>", line 1, in <module>
NameError: name 'sys' is not defined
......

I wonder if there is a compatibility issue, or whether the pylint package that I'm using has some problem? I also tried removing pylint and skipping the tests associated with pylint, but some tests still failed. Can these failures be safely ignored?

Some of the failed tests are:
......
test_a_createnewcase (__main__.J_TestCreateNewcase) ... FAIL
test_aa_no_flush_on_instantiate (__main__.J_TestCreateNewcase) ... ok
test_b_user_mods (__main__.J_TestCreateNewcase) ... ok
test_c_create_clone_keepexe (__main__.J_TestCreateNewcase) ... ok
test_d_create_clone_new_user (__main__.J_TestCreateNewcase) ... ok
test_e_xmlquery (__main__.J_TestCreateNewcase) ... FAIL
test_f_createnewcase_with_user_compset (__main__.J_TestCreateNewcase) ... FAIL
......
test_save_timings (__main__.L_TestSaveTimings) ... FAIL
test_save_timings_manual (__main__.L_TestSaveTimings) ... FAIL
......
test_b_full (__main__.O_TestTestScheduler) ... FAIL
test_c_use_existing (__main__.O_TestTestScheduler) ... FAIL
test_d_retry (__main__.O_TestTestScheduler) ... FAIL
test_jenkins_generic_job (__main__.P_TestJenkinsGenericJob) ... skipped 'Skipping Jenkins tests. E3SM feature'
test_jenkins_generic_job_kill (__main__.P_TestJenkinsGenericJob) ... skipped 'Skipping Jenkins tests. E3SM feature'
test_bless_test_results (__main__.Q_TestBlessTestResults) ... FAIL
......
test_run_restart (__main__.T_TestRunRestart) ... FAIL
test_run_restart_too_many_fails (__main__.T_TestRunRestart) ... FAIL
test_query_components (__main__.X_TestQueryConfig) ... ok
test_query_compsets (__main__.X_TestQueryConfig) ... ok
test_query_grids (__main__.X_TestQueryConfig) ... ok
test_query_machines (__main__.X_TestQueryConfig) ... ok
test_single_submit (__main__.X_TestSingleSubmit) ... skipped 'Skipping single submit. Not valid without batch'
test_full_system (__main__.Z_FullSystemTest) ... FAIL
......

Thanks,
Nuanliang
 

jedwards

CSEG and Liaisons
Staff member
Ah okay - try creating a virtual environment using
pylint 2.17.4
astroid 2.15.5
Python 3.8.18 | packaged by conda-forge | (default, Dec 23 2023, 17:21:28)
 

Nuanliang

New Member
Hi jedwards,

Thanks for the suggestions. I tried creating a conda environment with the specific versions of pylint.

$ pylint --version
pylint 2.17.4
astroid 2.15.5
Python 3.8.18 | packaged by conda-forge | (default, Dec 23 2023, 17:21:28) [GCC 12.3.0]

Yet the same errors (2) and (3) still show up.
(2) "NameError: name 'sys' is not defined"
(3) some compilation errors associated with files not found: " ERRPUT: File not found: lnd2rof_fmapname = "lnd/clm2/mappingdata/maps/1.9x2.5/map_1.9x2.5_nomask_to_0.5x0.5_nomask_aave_da_c120522.nc", will attempt to download in check_input_data phase"

test_pylint_config_e3sm_tests_py (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools___init___py (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_archive_metadata (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_bless_test_results (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_case_build (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_case_cmpgen_namelists (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_case_diff (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_case_qstatus (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_case_setup (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_case_submit (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_check_case (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_check_input_data (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_check_lockedfiles (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_cime_bisect (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_cimeteststatus (__main__.B_CheckCode) ... FAIL
test_pylint_scripts_Tools_code_checker (__main__.B_CheckCode) ... FAIL
......

...
==============================================================
FAIL: test_pylint_scripts_Tools___init___py (__main__.B_CheckCode)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "scripts_regression_tests.py", line 2758, in test
    self.assertTrue(result == "", msg=result)
AssertionError: False is not true : Traceback (most recent call last):
  File "/data/warmcold/miniconda3/envs/cesm1/bin/pylint", line 10, in <module>
    sys.exit(run_pylint())
  File "/data/warmcold/miniconda3/envs/cesm1/lib/python3.8/site-packages/pylint/__init__.py", line 36, in run_pylint
    PylintRun(argv or sys.argv[1:])
  File "/data/warmcold/miniconda3/envs/cesm1/lib/python3.8/site-packages/pylint/lint/run.py", line 140, in __init__
    args = _preprocess_options(self, args)
  File "/data/warmcold/miniconda3/envs/cesm1/lib/python3.8/site-packages/pylint/config/utils.py", line 273, in _preprocess_options
    cb(run, value)
  File "/data/warmcold/miniconda3/envs/cesm1/lib/python3.8/site-packages/pylint/config/utils.py", line 169, in _init_hook
    exec(value)  # pylint: disable=exec-used
  File "<string>", line 1, in <module>
NameError: name 'sys' is not defined
......

Weirdly, they all seem to display the same python error message. Looking a little deeper into the source code, the error traces back to

File "scripts_regression_tests.py", line 2758
......
def make_pylint_test(pyfile, all_files):
    def test(self):
        if B_CheckCode.all_results is None:
            B_CheckCode.all_results = check_code(all_files)
        #pylint: disable=unsubscriptable-object
        result = B_CheckCode.all_results[pyfile]
        self.assertTrue(result == "", msg=result)
    return test
......

File "/data/warmcold/miniconda3/envs/cesm1/lib/python3.8/site-packages/pylint/config/utils.py", line 169
......
# pylint: disable-next=unused-argument
def _init_hook(run: Run, value: str | None) -> None:
    """Execute arbitrary code from the init_hook.

    This can be used to set the 'sys.path' for example.
    """
    assert value is not None
    exec(value)  # pylint: disable=exec-used
......
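
Since _init_hook simply calls exec(value), a hook string that uses sys without importing it will raise exactly this NameError whenever the executing scope has no sys name bound. The sketch below reproduces that mechanism in isolation. Both hook strings are hypothetical, because the thread does not show the hook that the test script actually passes, and run_init_hook is an illustrative stand-in rather than pylint's function.

```python
def run_init_hook(value):
    """Execute a hook string in a fresh namespace with no 'sys' pre-imported."""
    namespace = {}
    exec(value, namespace)
    return namespace


# Hypothetical hook that assumes 'sys' is already available in the caller.
BARE_HOOK = "path_len = len(sys.path)"
# The same hook made self-contained by importing what it uses.
SELF_CONTAINED_HOOK = "import sys; path_len = len(sys.path)"

if __name__ == "__main__":
    try:
        run_init_hook(BARE_HOOK)
    except NameError as exc:
        print("bare hook failed:", exc)
    result = run_init_hook(SELF_CONTAINED_HOOK)
    print("self-contained hook ran, path_len =", result["path_len"])
```

If this is what happens here, the failure depends on the hook string and on what the pylint module has in scope, not on whether sys can be imported in the conda environment. That would explain why importing sys interactively works fine.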

Where do you think the problem could be? I really don't have a good idea. Thank you!

Nuanliang
 

Nuanliang

New Member
Hi everyone,

Now that I've managed to install cesm2.1.5 and build a Held-Suarez configuration (Held Suarez | Community Earth System Model) successfully on the local server, I'd like to write down the problems that I ran into after the last post in this thread. Special thanks to jedwards for helping me with the porting problems!


First of all, the remaining problems from running the test script scripts_regression_tests.py were not critical and could be safely ignored.
(2) "NameError: name 'sys' is not defined"
(3) some compilation errors associated with files not found: " ERRPUT: File not found: lnd2rof_fmapname = "lnd/clm2/mappingdata/maps/1.9x2.5/map_1.9x2.5_nomask_to_0.5x0.5_nomask_aave_da_c120522.nc", will attempt to download in check_input_data phase"


Then, after creating a Held-Suarez case by running the command
./create_newcase --case $CASEDIR --compset FHS94 --res T42z30_T42_mg17

there were 2 build errors when trying to build the executable.

The first one is related to the configuration of the Intel compiler. The error message looks something like this:
Code:
gcc: error: precise: No such file or directory
gcc: error: minimal: No such file or directory
gcc: error: unrecognized command line option ‘-qno-opt-dynamic-align’
gcc: error: unrecognized command line option ‘-fp-model’
The solution to this problem is given by this post ("case.build" error). I modified the compiler configuration file /data/warmcold/my_cesm_sandbox/cime/config/cesm/machines/config_compilers.xml, and changed 3 lines of the section for intel compiler, namely
XML:
  <MPICC> mpiicc  </MPICC>
  <MPICXX> mpiicpc </MPICXX>
  <MPIFC> mpiifort </MPIFC>
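
The gcc messages above suggest that Intel-specific flags were being handed to a wrapper that called gcc. With Intel MPI, mpiicc, mpiicpc and mpiifort are the wrappers around the Intel compilers, while the generic names typically wrap the GNU ones. A quick way to see which wrappers are on PATH after loading the modules is sketched below; locate is a hypothetical helper, and the generic wrapper names are listed only for comparison.

```python
import shutil

# Wrapper names from the config_compilers.xml change above.
INTEL_WRAPPERS = ("mpiicc", "mpiicpc", "mpiifort")
# Generic wrapper names, listed for comparison (an assumption, not from the thread).
GENERIC_WRAPPERS = ("mpicc", "mpicxx", "mpif90")


def locate(commands):
    """Map each command name to its resolved path on PATH, or None if absent."""
    return {name: shutil.which(name) for name in commands}


if __name__ == "__main__":
    for group in (INTEL_WRAPPERS, GENERIC_WRAPPERS):
        for name, path in locate(group).items():
            print("{:10s} {}".format(name, path or "not found on PATH"))
```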

The second is related to setting the path to the netcdf library. The problem and corresponding solutions have been discussed in this post ([case.build error]:NETCDF not found). In my case, I simply added a line to my bashrc file.
Bash:
export NETCDF_PATH='/software/netcdf-4.9.0+intel-2020'
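
Build scripts run as child processes, so they only see NETCDF_PATH if it is part of the environment they inherit. The following sketch, with a hypothetical child_sees helper and a placeholder path, checks what a freshly spawned process actually receives.

```python
import os
import subprocess
import sys


def child_sees(name, env=None):
    """Return the value of `name` as seen by a freshly spawned Python child."""
    code = "import os; print(os.environ.get({!r}, ''))".format(name)
    out = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return out.stdout.strip()


if __name__ == "__main__":
    # Placeholder value for illustration only.
    with_var = dict(os.environ, NETCDF_PATH="/hypothetical/netcdf")
    without_var = {k: v for k, v in os.environ.items() if k != "NETCDF_PATH"}
    print("inherited:", repr(child_sees("NETCDF_PATH", with_var)))
    print("missing:  ", repr(child_sees("NETCDF_PATH", without_var)))
    print("current:  ", repr(child_sees("NETCDF_PATH")))
```

Running it from the shell that launches case.build shows whether the variable set in bashrc reaches child processes.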

After fixing these 2 problems, I can run the Held-Suarez setup and get test output similar to that described on the webpage (Held Suarez | Community Earth System Model).

One additional error line shows up after the model finishes a run,
Code:
ERROR: No result from jobs [('case.run', None), ('case.st_archive', 'case.run or case.test')]
which doesn't seem to be a problem and can be ignored, according to the post here (ERROR: No result from jobs [('case.run', None), ('case.st_archive', 'case.run or case.test')]).

Nice! Looking forward to more discussion on using the cesm model :)

Nuanliang
 