CESM2.3.beta08 porting issue

James King

James King
Member
That's why you should really try to avoid hardcoded paths. All of these ESMF_ variables
are set by ESMFMKFILE and they should not need to be set individually. Just make sure
that ESMFMKFILE is in the environment.

Yes agreed! I will add ESMFMKFILE to my path and see if that makes any difference.
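For anyone following along, here is a minimal sketch of what "having ESMFMKFILE in the environment" could look like. The install path is a placeholder (an assumption, not a real ARCHER2 location), and `esmf_settings` is a hypothetical helper that prints the ESMF_ variables defined in the makefile fragment so you can see what the build will pick up.

```shell
# Placeholder path (assumption): point this at the esmf.mk of your ESMF install.
export ESMFMKFILE="${ESMFMKFILE:-/path/to/esmf/lib/esmf.mk}"

# Hypothetical helper: print the ESMF_* assignments found in an esmf.mk file.
esmf_settings() {
    grep -E '^ESMF_[A-Z0-9_]+ *=' "$1"
}

if [ -r "$ESMFMKFILE" ]; then
    esmf_settings "$ESMFMKFILE"
else
    echo "ESMFMKFILE is not readable: $ESMFMKFILE" >&2
fi
```

If the printed paths point at a compiler or NetCDF version that is no longer installed, that may be a hint that the ESMF build itself is stale.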

I've submitted a support ticket to ARCHER2's help desk summarising this discussion - thanks a lot for your advice. If it's an issue around compiler installations and/or the content of centrally located module files then it'll be beyond my power to fix but hopefully we've been able to identify some of the reasons why the case is failing.
 

viswanathvelamuri

Viswanath
New Member
EDIT - fixed this, ignore. Previous question about cmake macros still stands.
Hi James,
I am facing a similar issue when I try to create a new case for the local HPC. Can you please share how you rectified the mistake?
I am attaching the error below.

Using project from config_machines.xml: USER_REQUESTED_PROJECT
No charge_account info available, using value from PROJECT
cesm model version found: release-cesm2.2.2-2-g488ecf9
Batch_system_type is pbs
job is case.run USER_REQUESTED_WALLTIME None USER_REQUESTED_QUEUE None WALLTIME_FORMAT %H:%M:%S
WARNING: No queue on this system met the requirements for this job. Falling back to defaults

ERROR: No queues found

Thanks in advance
 

James King

James King
Member
Hi Viswanath,

This error is probably because you need to provide details of your HPC's batch system, including the names and properties of the various job queues, in cime/config/cesm/machines/config_batch.xml.
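As a rough way to check this, the sketch below lists the queue entries that config_batch.xml defines for one machine. `list_queues` is a hypothetical helper and `mymachine` is a placeholder name. It assumes each `<batch_system>` and `<queue>` tag sits on its own line, which may not hold for every file.

```shell
# Hypothetical helper: print the <queue> lines defined for one machine.
# Assumes one tag per line; it is a quick check, not a real XML parser.
list_queues() {
    awk -v mach="$1" '
        /<batch_system/        { inside = ($0 ~ ("MACH=\"" mach "\"")) }
        /<\/batch_system>/     { inside = 0 }
        inside && /<queue[ >]/ { print }
    ' "$2"
}

# Usage (machine name is a placeholder):
#   list_queues mymachine cime/config/cesm/machines/config_batch.xml
```

If this prints nothing for your machine name, there is probably no queue definition for it, which would be consistent with the "No queues found" error.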

Hope that helps,

James
 

viswanathvelamuri

Viswanath
New Member
Hi James,

Thank you for the answer.

Do you know why the following error is occurring while building the case:
[cez238111@login02 ~/scratch/new_case/test1]
$ ./case.build
Building case in directory /scratch/civil/phd/cez238111/new_case/test1
sharedlib_only is False
model_only is False
Setting Environment NETCDF=/home/soft/centOS/lib/intel/2020/netcdf/c-4.8.0-cxx-4.3.1-f-4.5.3
Setting Environment PNETCDF=/home/soft/centOS/lib/intel/2020/pnetcdf/1.12.2
Setting Environment HDF5=/home/soft/centOS/lib/intel/2020/hdf5/1.12.1
Setting Environment PHDF5=/home/soft/centOS/lib/intel/2020/hdf5/1.12.1
Setting Environment MPIROOT=/home/soft/intel2020u4/impi/2019.9.304/intel64
Setting Environment CIMEROOT=/home/soft/centOS/apps/cesm/2.2.0/my_cesm_sandbox/cime
Generating component namelists as part of build
Creating component namelists
2024-05-20 19:10:42 atm
Calling /scratch/civil/phd/cez238111/{my_cesm_sandbox}/components/cam//cime_config/buildnml
...calling cam buildcpp to set build time options
ERROR: Command /scratch/civil/phd/cez238111/{my_cesm_sandbox}/components/cam/bld/build-namelist -ntasks 160 -csmdata /home/civil/phd/cez238111/scratch/inputdata -infile /home/civil/phd/cez238111/scratch/new_case/test1/Buildconf/camconf/namelist -start_ymd 20100101 -ignore_ic_year -use_case hist_trop_strat_nudged_cam6 -inputdata /home/civil/phd/cez238111/scratch/new_case/test1/Buildconf/cam.input_data_list -namelist " &atmexp /" failed rc=2
out=
err=CAM build-namelist - ERROR: invalid value of use_case (hist_trop_strat_nudged_cam6) specified in commandline
expected one of:
 

jedwards

CSEG and Liaisons
Staff member
This error appears to be due to specifying a cam use_case "hist_trop_strat_nudged_cam6" which is not defined in the version of cam that you have checked out.
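One way to see which use cases a given checkout provides is to list the use_cases directory under the CAM source tree. This is a minimal sketch: `list_use_cases` is a hypothetical helper, and it assumes the use cases are stored as one XML file each under bld/namelist_files/use_cases in your CAM version.

```shell
# Hypothetical helper: list use_case names available in a CAM checkout.
# $1 is the CAM root, e.g. <sandbox>/components/cam (assumed layout).
list_use_cases() {
    for f in "$1"/bld/namelist_files/use_cases/*.xml; do
        [ -e "$f" ] || continue      # skip when the glob matches nothing
        basename "$f" .xml
    done
}

# Usage (sandbox path is a placeholder):
#   list_use_cases /path/to/my_cesm_sandbox/components/cam | grep nudged
```

If the name passed via -use_case is not in that list, build-namelist is likely to reject it in the way shown above.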
 

James King

James King
Member
Yes, I would advise using a defined compset for testing purposes first (e.g. F2000climo) before attempting something more complicated. You may be able to define this particular use_case in one of the namelist files in the CAM src but that's not something I've done before.
 

inos@bas_ac_uk

Ingrid Cnossen
Member
OK thanks. With this line in gnu_archer2.cmake, I can build and submit a case. However we're still having fun with compilers as the run fails with this line in the cesm.log:

/work/n02/n02/jking/cesm/CESM2.3.beta08/cesm_sims/runs/F2000climo_test/bld/cesm.exe: error while loading shared libraries: libnetcdf_parallel_gnu_91.so.18: cannot open shared object file: No such file or directory

I have the module

cray-parallel-netcdf/1.12.3.1

loaded in my environment.
Hi James,

I'm also using ARCHER2 and getting the exact same error message about the netcdf library when trying to re-run a case I ran previously without any problem back in Oct 2022 (I just wanted to re-run to add a variable to my output that I missed that time - not as easy as I thought it'd be!). I'm guessing this is due to a module change on ARCHER2 in the meantime, but I haven't been able to pinpoint the issue or fix it. I followed the rest of this thread, but that hasn't led me to the solution either. Could you share how you fixed it in the end (assuming you did)?

Thanks!
Ingrid
 

jedwards

CSEG and Liaisons
Staff member
You may need to set an rpath in the executable so that it can find the library. Add this flag:

string(APPEND LDFLAGS " -Wl,-rpath,${NETCDF_PATH}/lib")
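In CESM2.3-style setups, a line like this would go in the machine's cmake macros file (gnu_archer2.cmake is the one mentioned earlier in this thread). To check whether it had any effect, the sketch below filters ldd output down to the libraries the loader still cannot find. `unresolved_libs` is a hypothetical helper and the executable path is a placeholder.

```shell
# Hypothetical helper: read `ldd` output on stdin and print the names of
# shared libraries that the dynamic loader could not resolve.
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage (executable path is a placeholder):
#   ldd bld/cesm.exe | unresolved_libs
#   readelf -d bld/cesm.exe | grep -E 'RPATH|RUNPATH'
```

An empty result from the first command, together with the expected directory showing up in the second, would suggest the rpath was picked up. Note that an rpath only helps if the library file actually exists under that directory.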
 

Yuan Sun

Yuan Sun
Member
Thank you so much for your response! And sorry for being incredibly slow to follow this up. But where exactly would I need to add this flag?
Hi Inos,

I have ported CESM3 to ARCHER2. Please go to ccs_config/machines/cmake_macros and create a new file named gnu_archer2.cmake. My settings are below for your reference.

set(HDF5_DIR "/opt/cray/pe/hdf5-parallel/1.12.2.1/gnu/9.1")
set(NETCDF_C_PATH "/opt/cray/pe/netcdf-hdf5parallel/4.9.0.1/gnu/9.1")
set(NETCDF_FORTRAN_PATH "/opt/cray/pe/netcdf-hdf5parallel/4.9.0.1/gnu/9.1")
set(ESMF_LIBDIR "/mnt/lustre/a2fs-work2/work/n02/n02/yuansun/privatemodules_packages/archer2/apps/gcc/esmf/8.6/lib")
set(PIO_LIBDIR "/mnt/lustre/a2fs-work2/work/n02/n02/yuansun/privatemodules_packages/archer2/apps/gcc/pio2/2.6.2/lib")
string(APPEND SLIBS " -L$ENV{HDF5_PATH}/lib -lhdf5_fortran -lhdf5 -lhdf5_hl -lhdf5hl_fortran -Wl,-rpath,${HDF5_PATH}/lib")
string(APPEND SLIBS " -L${NETCDF_C_PATH}/lib -L${NETCDF_FORTRAN_PATH}/lib -lnetcdff -lnetcdf -lm")
string(APPEND SLIBS " -L${PIO_LIBDIR} -Wl,-rpath,${PIO_LIBDIR} -ldl")
string(APPEND SLIBS " -L${ESMF_LIBDIR} -Wl,-rpath,${ESMF_LIBDIR} -ldl")

Best,
Yuan
 

inos@bas_ac_uk

Ingrid Cnossen
Member
Hi Yuan,

Thanks for getting in touch! I see that the ESMF library you have listed is different from what my code is pointing to, and I think that might be where the problem has been coming from. It looks to me like the ESMF library my code is pointing to was compiled with an older version of the netcdf-hdf5parallel library. So I am hopeful that your solution might work. However, I can't find the (sub)directory you are pointing me to. I don't seem to have one called ccs_config. I do have: my_cesm_sandbox/cime/config/machines. Is that where I should create the file gnu_archer2.cmake? And just to make sure that I understand you correctly: should I put all the commands you listed above as contents of the file called gnu_archer2.cmake?

Thanks for your help!
Ingrid
 

inos@bas_ac_uk

Ingrid Cnossen
Member
Hi Yuan,

I really do not have ccs_config in my_cesm_sandbox. Maybe this is a new directory in a more recent version. I am running CESM version 2.1.3 - sorry I was not clear about this. I'm hoping not to have to switch at this point, as I just want to do a re-run of a short simulation I did 2 years ago with an extra output variable that I had initially missed. If I have to update I'll need to re-do the initial spin-up simulation, as well as a set of simulations I want to compare with... Is your solution still relevant for version 2.1.3?

Thanks,
Ingrid
 

Yuan Sun

Yuan Sun
Member
Hi Ingrid,

Sorry, if you are using 2.1.3 there is no ccs_config, so please ignore my previous post. That applies to CESM2.3 and above.

CESM2.1.3 should work without any private software if you follow the "First-Time setup of CESM 2.1.3" page in the ARCHER2 User Documentation.

We do not need to change anything except $CESM_LOC/Externals.cfg, which selects the component version.

Best,
Yuan
 

inos@bas_ac_uk

Ingrid Cnossen
Member
Hi Yuan,

Ok. I did follow the steps in the Archer documentation, but it does not work for me. There is an issue where the code is trying to use a NetCDF library that no longer exists (an older version), which must be due to an update on Archer. The same case ran successfully about 2 years ago. I'm just not sure a) which bit of the code is trying to use the old library, and b) how to fix it. I think the shared ESMF library (in /work/n02/shared/ESMF) was compiled with the previous library, but I don't know if that's what's causing the problem. I've tried compiling without the ESMF library, but that fails at the build stage. This is the error message I get when trying to run the code (with ESMF enabled):

/work/n02n02/inos/cesm/CESM2.1.3/runs/f.e213.FXHIST.f19_f19.WXAug2018/bld/cesm.exe: error while loading shared libraries: libnetcdf_parallel_gnu_91.so.18: cannot open shared object file: No such file or directory

And I can confirm that that library file indeed doesn't exist. I also know what it should be instead. I just don't know how I can tell the code! If you have any insights, it would be much appreciated.

Thanks,
Ingrid
 

Yuan Sun

Yuan Sun
Member
Hi Ingrid,

If you did follow the steps, your configuration files should be the same as mine.

Please check.

Best,
Yuan
 

Attachments

  • config_batch.xml.txt
    28.5 KB · Views: 1
  • config_compilers.xml.txt
    45.6 KB · Views: 1
  • config_machines.xml.txt
    127 KB · Views: 2

jedwards

CSEG and Liaisons
Staff member
I think that there is some confusion between versions. Yuan has ported cesm3 which includes the ccs_config directory.
Ingrid wants to use cesm2.1.3. You should update to the latest release in this series, 2.1.5, which may solve at least part of the issue that you are having.

2.1.5 is scientifically equivalent to 2.1.3 but contains a number of updates necessary to run correctly.
 

inos@bas_ac_uk

Ingrid Cnossen
Member
Hi Yuan,

I am checking. There are some differences, but as far as I can see not related to the Archer2 setup. I'll double-check later as I am running out of time now today, and I am off tomorrow and Wednesday. I'll come back to it on Thursday. But I'll attach my config files in case it's useful.

Thanks,
Ingrid
 

Attachments

  • config_batch.txt
    27.5 KB · Views: 0
  • config_compilers.txt
    48 KB · Views: 0
  • config_machines.txt
    126.3 KB · Views: 0

inos@bas_ac_uk

Ingrid Cnossen
Member
Ok - I'll look into this as well on Thursday. Thanks!
 

inos@bas_ac_uk

Ingrid Cnossen
Member
I noticed some minor differences in the config files now, but I doubt they will fix the problem. I tried to check, but have run out of computing resource, so now I need to sort that out first. I may be a while...
 