Lib needed for Py 2.7.15

wvsi3w

wvsi3w
Member
Hi.
I am going to run the ensemble consistency test (ECT) on Beluga and Narval.
It needs Python 2.7.15, which I installed manually on Beluga, but the test also needs some libraries to be loaded before it will run.
Do you know which specific libraries are needed for this version of Python?

Let me know if there is another way to do the test.

Thanks in advance
 

erik

Erik Kluzek
CSEG and Liaisons
Staff member
Hmmm. I'm not sure about the answer to this one. I'll try to track someone down.
 

jedwards

CSEG and Liaisons
Staff member
Please be more specific about the test you are running and the error you are encountering.
The latest version of the ECT does not require python2.
 

wvsi3w

wvsi3w
Member
Hello again,
Sorry for the delay; I was trying several things.
I tried the ECT with the latest version of Python before (and again recently), but it doesn't work. So I tried it with 2.7.18, and the following is what happened on the Narval and Beluga clusters in Canada:

Last time it failed on Beluga because of some missing libraries or modules (e.g. python 2.7.15), and it failed even when using a virtual environment with the manually loaded Python 2.7.15.
This time I used Python 2.7.18 and didn't hit the previous issue, but I am facing another problem on Beluga that looks like other missing libraries:

ERROR: /lustre03/project/6001010/my_cesm_sandbox/cime/src/build_scripts/buildlib.mct FAILED, cat /home/meisam/projects/def-hbeltram/cesm2_1_3_OUT/ensemble.cesm_tag.000/bld/mct.bldlog.231115-164556
Error building...

I tried the ECT on Narval too. Surprisingly, it got through the ensemble.py steps and submitted all of the ECT runs, but unfortunately they all failed (the 4 ECT runs ran for about 2 minutes) with the following message:

case.run error
'ascii' codec can't encode character u'\xe9' in position 852: ordinal not in range(128)
case.setup error
'ascii' codec can't encode character u'\xe9' in position 787: ordinal not in range(128)
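
The "'ascii' codec can't encode character u'\xe9'" message is what Python 2 reports when a unicode string containing é (U+00E9) is converted with the ASCII codec. The thread does not show where that character comes from on Narval; a path, an account or site description, or an environment variable are only guesses. A minimal sketch for locating such characters, written for Python 3, with hypothetical helper names and sample values:

```python
def find_non_ascii(text):
    """Return (position, character) pairs for every non-ASCII character in text."""
    return [(pos, ch) for pos, ch in enumerate(text) if ord(ch) > 127]


def non_ascii_entries(mapping):
    """Return {key: [(position, character), ...]} for values containing non-ASCII text."""
    hits = {}
    for key, value in mapping.items():
        found = find_non_ascii(value)
        if found:
            hits[key] = found
    return hits


# Hypothetical sample; on a real system dict(os.environ) could be passed instead.
sample_env = {"HOME": "/home/user", "SITE_NAME": "Calcul Qu\u00e9bec"}
print(non_ascii_entries(sample_env))

# Encoding with ASCII fails on the same character, while UTF-8 handles it.
try:
    "Qu\u00e9bec".encode("ascii")
except UnicodeEncodeError as err:
    print("ascii failed at position", err.start)
print("Qu\u00e9bec".encode("utf-8"))
```

If a scan like this finds the character in an environment variable or a file that the case scripts read, that would be a candidate for the failing string, though it does not by itself confirm the cause.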

P.S. 1. I am using different configurations on Narval and Beluga. The one I use on Narval doesn't require adding the Python version to the bash_profile file, but the one I use on Beluga does.

P.S. 2. I am going to try this on another cluster (Cedar) and will let you know the results of the ECT there.
 

jedwards

CSEG and Liaisons
Staff member
The latest version of PyCECT, which is used in the ECT, is available on GitHub at NCAR/PyCECT. The Community Earth System Model Ensemble Consistency Test (CESM-ECT) suite is an alternative to requiring bitwise identical output for quality assurance. This objective test provides a statistical measurement of consistency between an accepted ensemble and a test set of CESM simulations.
I recommend that you update to that version and try again. Also, it looks like you are failing to build CESM as part of that test. Have you confirmed that you can build the requested CESM compset outside of the ECT test?
 

wvsi3w

wvsi3w
Member
Thanks for the link. I will try it, but first may I ask whether I can simply download it into my CESM sandbox (the original path for the ECT), or should I remove the contents of my current ECT path first?

About the build failure: yes, I tried several compsets on Beluga (historical ones like IHIST...) and they ran after submission.
 

jedwards

CSEG and Liaisons
Staff member
You selected a compset to run in the ECT - did you test that same compset outside of the ECT framework?
 

wvsi3w

wvsi3w
Member
Sorry, I don't think I understood your question.
This is the command line I used for the ECT:
python ensemble.py --case /home/meisam/scratch/cases/ensemble.cesm_tag.000 --mach beluga --ensemble 4 --ect cam --project P99999999

This does the job and uses its own compsets (I guess), and I think this ECT uses land and atmosphere compsets. I personally did try some other compsets for my own simulations (CLM5).

Did I understand and respond to your question correctly?
 

jedwards

CSEG and Liaisons
Staff member
For cesm 2.1 the defaults are:
print ' --compiler <name> Compiler to use (default = same as Machine default) '
print ' --compset <name> Compset to use (default = F2000climo (CAM-ECT) or G (POP-ECT))'
print ' --res <name> Resolution to run (default = f19_f19 (CAM-ECT) or T62_g17 (POP-ECT))'

So you should try running res=f19_f19 compset=F2000climo
 

wvsi3w

wvsi3w
Member
Hello,
I tried it and was able to build --compset F2000climo --res f19_f19 on Beluga:
"
MODEL BUILD HAS FINISHED SUCCESSFULLY
"

I haven't updated PyCECT yet because I was not sure how to do it, and the installation instructions were not available when I searched: the installation guide on PyCECT's documentation page is listed as "Coming Soon".

PyCECT is in this path on my cluster: projects/def-hbeltram/my_cesm_sandbox/cime/tools/statistical_ensemble_test/pyCECT/. How should I update it? Should I remove it first and then download the updated version into this path? Or should I back up the existing one in case of any issues?
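
Whichever update route is chosen, keeping a copy of the existing directory first is cheap. A minimal sketch, assuming a plain timestamped copy next to the original is acceptable; the function name backup_directory and the ".bak-" suffix are hypothetical, and this is not an official PyCECT update procedure:

```python
import os
import shutil
import tempfile
import time


def backup_directory(src_dir):
    """Copy src_dir to a timestamped sibling directory and return the new path."""
    src_dir = os.path.abspath(src_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest_dir = "{}.bak-{}".format(src_dir, stamp)
    # copytree raises an error if dest_dir already exists, so nothing is overwritten.
    shutil.copytree(src_dir, dest_dir)
    return dest_dir


# Demonstration on a throwaway directory rather than a real sandbox.
demo_root = tempfile.mkdtemp()
demo_src = os.path.join(demo_root, "pyCECT")
os.mkdir(demo_src)
with open(os.path.join(demo_src, "pyCECT.py"), "w") as handle:
    handle.write("# placeholder\n")
demo_backup = backup_directory(demo_src)
print(sorted(os.listdir(demo_backup)))
```

On the cluster, the argument would be the pyCECT path quoted above; the updated version can then be placed in the original location while the copy stays available for comparison.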
 

wvsi3w

wvsi3w
Member
Dear @jedwards

May I have your opinion on the previous question (how to update pyCECT)?
Your guidance is much appreciated.

P.S. I tried porting the model to Cedar (another cluster) and it failed in the build process. I will try to fix it so that I can use Cedar for the ECT later.
 

wvsi3w

wvsi3w
Member
Thanks for the guidance. I did this and tried the ECT again on Beluga. It fails again. I believe it is related to the StdEnv and the loaded modules.

These are the loaded modules for the model right now:
Currently Loaded Modules:
1) nixpkgs/16.09 (S)       4) icc/.2018.3.222 (H)    7) StdEnv/2018.3 (S)  10) python/3.7.4 (t)         13) hdf5-mpi/1.10.3 (io)
2) imkl/2018.3.222 (math)  5) ifort/.2018.3.222 (H)  8) mii/1.1.1          11) cmake/3.16.3 (t)         14) netcdf-mpi/4.4.1.1 (io)
3) gcccore/.7.3.0 (H)      6) intel/2018.3 (t)       9) perl/5.22.4 (t)    12) intelmpi/2018.3.222 (m)  15) netcdf-fortran-mpi/4.4.4 (io)

After I updated PyCECT, I ran "python ensemble.py --case /home/meisam/scratch/cases/ensemble.cesm_tag.000 --mach beluga --ensemble 4 --ect cam --project P99999999" on Beluga and this showed up:

File "ensemble.py", line 64
print 'Error: cannot have an ensemble size greater than 999.'
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print('Error: cannot have an ensemble size greater than 999.')?
[meisam@beluga3 statistical_ensemble_test]$ module load python/2.7.18
Lmod has detected the following error: These module(s) or extension(s) exist but cannot be loaded as requested: "python/2.7.18"
Try: "module spider python/2.7.18" to see how to load the module(s).
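
The SyntaxError above is what a Python 3 interpreter reports for a Python 2 style print statement, and the module list above shows python/3.7.4 loaded. A small sketch for confirming which interpreter the "python" command resolves to and whether a given snippet parses under it; the function name is hypothetical:

```python
import sys


def parses_on_this_interpreter(source):
    """Return True if source compiles under the interpreter running this script."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


print("running Python %d.%d" % sys.version_info[:2])
print(parses_on_this_interpreter("print 'hello'"))   # Python 2 style statement
print(parses_on_this_interpreter("print('hello')"))  # parses on both Python 2 and 3
```

Running this with the same "python" command used for ensemble.py shows directly whether that interpreter accepts the Python 2 form.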

I tried unloading StdEnv/2018.3 (with --force) and loading the 2020 version, but then python ensemble.py shows a different error:
STATUS: stat_dir = /lustre03/project/6001010/my_cesm_sandbox/cime/tools/statistical_ensemble_test
Error: Need a valid full path with the case name (--case).

I am guessing this is because of the StdEnv and the current modules, because these are listed as "Inactive Modules:
1) perl/5.22.4 2) cmake/3.16.3 3) intelmpi/2018.3.222 4) hdf5-mpi/1.10.3 5) netcdf-mpi/4.4.1.1 6) netcdf-fortran-mpi/4.4.4"

And to load each one of them I have to load a bunch of other modules too. For example:
You will need to load all module(s) on any one of the lines below before the "hdf5-mpi/1.10.3" module is available to load.

nixpkgs/16.09 gcc/7.3.0 cuda/9.2.148 openmpi/3.1.2
nixpkgs/16.09 gcc/7.3.0 openmpi/3.1.2
nixpkgs/16.09 gcc/7.3.0 openmpi/3.1.4
nixpkgs/16.09 gcc/8.3.0 openmpi/4.0.1
nixpkgs/16.09 intel/2018.3 cuda/10.0.130 openmpi/3.1.2
nixpkgs/16.09 intel/2018.3 impi/2018.3.222
nixpkgs/16.09 intel/2018.3 intelmpi/2018.3.222
nixpkgs/16.09 intel/2018.3 openmpi/3.1.2
nixpkgs/16.09 intel/2018.3 openmpi/3.1.4
nixpkgs/16.09 intel/2019.3 openmpi/4.0.1

or for python: You will need to load all module(s) on any one of the lines below before the "python/3.7.4" module is available to load.

nixpkgs/16.09

I loaded this one (nixpkgs/16.09) to check whether it works, but it didn't; the first error showed up again.


P.S. I will update PyCECT on Narval and let you know how that goes. On Narval the ECT job was submitted before and ran for 2 minutes before failing, so I will try the updated PyCECT there to check whether it still fails after submission.
 

wvsi3w

wvsi3w
Member
Narval ECT failed after submission again with the same error message:
case.run error
'ascii' codec can't encode character u'\xe9' in position 852: ordinal not in range(128)
case.setup error
'ascii' codec can't encode character u'\xe9' in position 787: ordinal not in range(128)
 

wvsi3w

wvsi3w
Member
Updates:
I tried the ECT (after updating PyCECT) on Beluga again and it fails every time. I tried several StdEnv versions (2023, 2020, 2018) with lots of different dependent modules and loaded several versions of Python (2.7.18, 2.7.14, 3.7.4, 3.8.0, 3.10.13), and none of them worked. It keeps showing the same error when I run the "python ensemble.py --case ..." command:

File "/lustre03/project/6001010/my_cesm_sandbox/cime/tools/statistical_ensemble_test/ensemble.py", line 64
print 'Error: cannot have an ensemble size greater than 999.'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print(...)?

==============================================================
File "ensemble.py", line 64
print 'Error: cannot have an ensemble size greater than 999.'
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print('Error: cannot have an ensemble size greater than 999.')?

==============================================================
.
.
.
 

jedwards

CSEG and Liaisons
Staff member
On line 64 of ensemble.py, add parentheses:

diff --git a/tools/statistical_ensemble_test/ensemble.py b/tools/statistical_ensemble_test/ensemble.py
index 042d335c4..0b15d0604 100644
--- a/tools/statistical_ensemble_test/ensemble.py
+++ b/tools/statistical_ensemble_test/ensemble.py
@@ -61,7 +61,7 @@ def main(argv):
run_type = 'ensemble'
clone_count = ens_size - 1
if ens_size > 999:
- print 'Error: cannot have an ensemble size greater than 999.'
+ print('Error: cannot have an ensemble size greater than 999.')
sys.exit()
print('STATUS: ensemble size = ' + str(ens_size))
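
The same kind of print statement may appear on other lines or in other scripts. A rough sketch for listing candidates in one pass, assuming a simple line-based regular expression is good enough; the names PY2_PRINT and find_py2_prints are hypothetical, and the check is a heuristic that ignores multi-line statements and text inside strings:

```python
import re

# Matches a line that starts with `print` not followed by an opening parenthesis.
PY2_PRINT = re.compile(r"^\s*print\b(?!\s*\()")


def find_py2_prints(source):
    """Return (line_number, stripped_line) for each Python 2 style print statement."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if PY2_PRINT.match(line)
    ]


# Sample text based on the lines shown in the diff above.
sample = (
    "if ens_size > 999:\n"
    "    print 'Error: cannot have an ensemble size greater than 999.'\n"
    "    sys.exit()\n"
    "print('STATUS: ensemble size = ' + str(ens_size))\n"
)
for number, line in find_py2_prints(sample):
    print(number, line)
```

To scan a real file, its contents can be read into a string and passed to find_py2_prints. A bare "print" on its own line is also reported, since it behaves differently between Python 2 and 3.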
 

wvsi3w

wvsi3w
Member
I initially changed them all (a month ago) and added all the needed parentheses, and I did this again just now. It keeps asking for multiple missing parentheses, which I edited in both ensemble.py and single_run.py.

But it fails with this error:
python ensemble.py --case /home/meisam/scratch/cases/ensemble.cesm_tag.000 --mach beluga --ensemble 4 --ect cam --project P99999999
STATUS: stat_dir = /lustre03/project/6001010/my_cesm_sandbox/cime/tools/statistical_ensemble_test
Error: Need a valid full path with the case name (--case).
 

jedwards

CSEG and Liaisons
Staff member
Try updating cime to maint-5.6 as follows:

cd cime
git checkout maint-5.6
git pull origin maint-5.6
 

wvsi3w

wvsi3w
Member
The checkout passed, but the git pull showed this error (should I proceed anyway?):

git pull origin maint-5.6
remote: Enumerating objects: 122, done.
remote: Counting objects: 100% (122/122), done.
remote: Compressing objects: 100% (63/63), done.
remote: Total 122 (delta 68), reused 106 (delta 56), pack-reused 0
Receiving objects: 100% (122/122), 90.05 KiB | 3.60 MiB/s, done.
Resolving deltas: 100% (68/68), completed with 17 local objects.
From GitHub - ESMCI/cime: Common Infrastructure for Modeling the Earth
* branch maint-5.6 -> FETCH_HEAD
38dfe3211..5696f1b87 maint-5.6 -> origin/maint-5.6
Updating 38dfe3211..5696f1b87
error: Your local changes to the following files would be overwritten by merge:
tools/statistical_ensemble_test/ensemble.py
tools/statistical_ensemble_test/single_run.py
Please commit your changes or stash them before you merge.
Aborting
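
Git's message names two ways forward: commit the local changes or stash them. A minimal sketch of the stash route, assuming the hand edits to ensemble.py and single_run.py only need to be kept as a fallback. The wrapper stash_then_pull and its "run" parameter are hypothetical; the git commands it issues are "git stash" and the same "git pull origin maint-5.6" shown above:

```python
import subprocess


def stash_then_pull(repo_dir, remote="origin", branch="maint-5.6", run=subprocess.run):
    """Stash local changes in repo_dir, then pull branch from remote.

    `run` is injectable so the command sequence can be checked without a repository.
    """
    commands = [
        ["git", "stash"],                 # set local edits aside (recoverable with `git stash pop`)
        ["git", "pull", remote, branch],  # repeat the pull that was aborted
    ]
    for command in commands:
        run(command, cwd=repo_dir, check=True)
    return commands


# Dry run: record the commands instead of executing them.
recorded = []
stash_then_pull("cime", run=lambda command, **kwargs: recorded.append((command, kwargs["cwd"])))
print(recorded)
```

Calling stash_then_pull("cime") without the "run" argument would execute the commands for real. The stashed edits remain listed under "git stash list" afterwards.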
 