Thanks for the guidance. I did this and tried ECT again on beluga, and it fails again. I believe it is related to StdEnv and the loaded modules.
These are the modules currently loaded for the model:
Currently Loaded Modules:
1) nixpkgs/16.09 (S) 4) icc/.2018.3.222 (H) 7) StdEnv/2018.3 (S) 10) python/3.7.4 (t) 13) hdf5-mpi/1.10.3 (io)
2) imkl/2018.3.222 (math) 5) ifort/.2018.3.222 (H) 8) mii/1.1.1 11) cmake/3.16.3 (t) 14) netcdf-mpi/4.4.1.1 (io)
3) gcccore/.7.3.0 (H) 6) intel/2018.3 (t) 9) perl/5.22.4 (t) 12) intelmpi/2018.3.222 (m) 15) netcdf-fortran-mpi/4.4.4 (io)
After updating PyCECT, I ran "python ensemble.py --case /home/meisam/scratch/cases/ensemble.cesm_tag.000 --mach beluga --ensemble 4 --ect cam --project P99999999" on beluga, and this showed up:
File "ensemble.py", line 64
print 'Error: cannot have an ensemble size greater than 999.'
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print('Error: cannot have an ensemble size greater than 999.')?
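For context, the SyntaxError above is what Python 3 reports when it meets a Python 2 style print statement, so this copy of ensemble.py appears to be written for Python 2 while python/3.7.4 is the loaded interpreter. Below is a minimal sketch of the parenthesized form, which parses under both versions. The names check_ensemble_size and MAX_ENSEMBLE_SIZE are hypothetical and used only for illustration; this is not the actual code in ensemble.py.

```python
import sys

# Limit taken from the error message in the traceback; the name is hypothetical.
MAX_ENSEMBLE_SIZE = 999


def check_ensemble_size(size):
    """Return True if size is allowed; otherwise print the error and return False."""
    if size > MAX_ENSEMBLE_SIZE:
        # print(...) with parentheses parses under both Python 2.7 and Python 3,
        # unlike the bare "print '...'" statement on line 64 of ensemble.py.
        print('Error: cannot have an ensemble size greater than 999.')
        return False
    return True


if __name__ == '__main__':
    # Show which interpreter is actually running the script.
    print('Running under Python %d.%d' % sys.version_info[:2])
    print(check_ensemble_size(4))
```

This only illustrates the syntax difference; the real script may contain other Python 2 constructs that would also need changing.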
I then tried to load Python 2, but that failed as well:
[meisam@beluga3 statistical_ensemble_test]$ module load python/2.7.18
Lmod has detected the following error: These module(s) or extension(s) exist but cannot be loaded as requested: "python/2.7.18"
Try: "module spider python/2.7.18" to see how to load the module(s).
I also tried unloading StdEnv/2018.3 (with --force) and loading the 2020 version, but then python ensemble.py fails with a different error:
STATUS: stat_dir = /lustre03/project/6001010/my_cesm_sandbox/cime/tools/statistical_ensemble_test
Error: Need a valid full path with the case name (--case).
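What ensemble.py actually validates for --case is not shown in this output, so the following is only a sketch under the assumption that "valid full path" means an absolute path whose parent directory exists. The helper describe_case_path is hypothetical and is not part of CIME or PyCECT; it just reports basic facts about the value passed to --case so they can be compared across the two environments.

```python
import os


def describe_case_path(case_path):
    """Split a --case value and report basic facts about it (hypothetical helper)."""
    case_dir = os.path.dirname(case_path)    # e.g. /home/meisam/scratch/cases
    case_name = os.path.basename(case_path)  # e.g. ensemble.cesm_tag.000
    return {
        'is_absolute': os.path.isabs(case_path),
        'case_dir': case_dir,
        'case_name': case_name,
        # Assumption: the parent directory must already exist on this cluster.
        'case_dir_exists': os.path.isdir(case_dir),
    }


if __name__ == '__main__':
    info = describe_case_path('/home/meisam/scratch/cases/ensemble.cesm_tag.000')
    for key in sorted(info):
        print('%s = %s' % (key, info[key]))
```

If the path is absolute and the parent directory exists, the cause is more likely elsewhere, for example in how the script parses its arguments under the interpreter that is active after the StdEnv switch.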
I am guessing this is caused by the StdEnv change and the current modules, because the following are now listed as inactive:
Inactive Modules:
1) perl/5.22.4 2) cmake/3.16.3 3) intelmpi/2018.3.222 4) hdf5-mpi/1.10.3 5) netcdf-mpi/4.4.1.1 6) netcdf-fortran-mpi/4.4.4
To load each one of them, I have to load a bunch of other modules too. For example:
You will need to load all module(s) on any one of the lines below before the "hdf5-mpi/1.10.3" module is available to load.
nixpkgs/16.09 gcc/7.3.0 cuda/9.2.148 openmpi/3.1.2
nixpkgs/16.09 gcc/7.3.0 openmpi/3.1.2
nixpkgs/16.09 gcc/7.3.0 openmpi/3.1.4
nixpkgs/16.09 gcc/8.3.0 openmpi/4.0.1
nixpkgs/16.09 intel/2018.3 cuda/10.0.130 openmpi/3.1.2
nixpkgs/16.09 intel/2018.3 impi/2018.3.222
nixpkgs/16.09 intel/2018.3 intelmpi/2018.3.222
nixpkgs/16.09 intel/2018.3 openmpi/3.1.2
nixpkgs/16.09 intel/2018.3 openmpi/3.1.4
nixpkgs/16.09 intel/2019.3 openmpi/4.0.1
Or, for python:
You will need to load all module(s) on any one of the lines below before the "python/3.7.4" module is available to load.
nixpkgs/16.09
I did load this one (nixpkgs/16.09) to check whether it works, but it didn't, and the first error showed up again.
P.S. I will update PyCECT on narval and let you know how that goes. Previously on narval, the ECT job was submitted and ran for 2 min before failing, so I will try the updated PyCECT there to check whether it still fails after submission.