
Fail to run WRF-CTSM on Perlmutter

Status
Not open for further replies.

HuilinHuang

Huilin Huang
New Member
Hello,
I want to run CTSM with WRF on a new machine, Perlmutter. After installing new PIO and ESMF libraries, I could compile the code successfully and even run real.exe. However, the model stopped when running CTSM (see rsl.error.0321.txt). I am sure the errors come from CTSM, because wrf.exe runs without a land surface model (sf_surface_physics = 0).
At first, I thought the cause might be that I had used different ESMF libraries when compiling WRF and CTSM, so I changed the ESMF library used when compiling WRF. Yet the same error occurred.
I have attached my config_compilers.xml, config_machines.xml, and config_batch.xml, as well as the rsl.out.* and rsl.error.* files with the error message. Could you please let me know what might be the cause?

Best,
Huilin
 

Attachments

  • config_batch.xml.txt
    30.9 KB · Views: 1
  • config_compilers.xml.txt
    54.3 KB · Views: 1
  • config_machines.xml.txt
    166 KB · Views: 2
  • rsl.error.0000.txt
    82.4 KB · Views: 3
  • rsl.error.0321.txt
    82.3 KB · Views: 3
  • rsl.out.0000.txt
    106.7 KB · Views: 2
  • rsl.out.0321.txt
    82.1 KB · Views: 2

HuilinHuang

Huilin Huang
New Member
Hello,
My question is marked "Awaiting approval before being displayed publicly." Is there a problem on my side, or should I wait patiently for the admin to take action?
Regards,
Huilin
 

slevis

Moderator
Staff member
@HuilinHuang
I'm afraid I am unable to help. However, I will share some questions/comments that come to mind:

Based on your message I get the sense that you have run WRF-CTSM successfully elsewhere, but you get this error on Perlmutter. If you did everything the same on another machine and it worked, then you are most likely dealing with a porting issue on Perlmutter. This then makes me wonder whether you have tried running the models separately. Can you run WRF on Perlmutter and can you run the CTSM on Perlmutter? If so, I think you should also be able to get WRF-CTSM to work...
 

erik

Erik Kluzek
CSEG and Liaisons
Staff member
Hi Huilin

I took a quick look at your files and see that you ran in production mode rather than with DEBUG=TRUE. Set that, clean the build, rebuild, and rerun to see whether the DEBUG compiler settings give you more information.

The other thing that normally helps us with this kind of problem is the exact version of the model you are working with, information on any changes you made to the system, and how you configured CTSM and WRF to work together. Did you use LILAC for this, for example? I do see Perlmutter listed as a supported machine in at least recent versions of cesm_ccs_config, so I wonder if you just need to update the version...

Sam has a great suggestion as well. If you do have it working somewhere, it's great to build on that experience and then figure out what's different in the case that is failing.
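
After a DEBUG rerun like the one suggested above, it can help to find which MPI ranks reported a problem first, since WRF writes one rsl.error.* file per rank. The following is only a minimal sketch in Python: the function name `first_error_lines` and the search pattern are assumptions made for illustration, not part of WRF or CTSM, and the pattern will likely need adjusting to whatever your compiler and libraries actually print.

```python
import re
from pathlib import Path

# Assumed, illustrative pattern; extend it to match what your logs really contain.
ERROR_PATTERN = re.compile(
    r"error|fatal|abort|segmentation fault|sigsegv", re.IGNORECASE
)


def first_error_lines(run_dir, glob="rsl.error.*"):
    """Return {file name: (line number, text)} for the first matching line per log."""
    hits = {}
    for path in sorted(Path(run_dir).glob(glob)):
        # errors="replace" keeps the scan going if a log holds odd bytes.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if ERROR_PATTERN.search(line):
                    hits[path.name] = (number, line.rstrip())
                    break
    return hits


if __name__ == "__main__":
    # Run from the WRF run directory that holds the rsl.error.* files.
    for name, (number, text) in first_error_lines(".").items():
        print(f"{name}:{number}: {text}")
```

Ranks that never matched are left out of the result, so a short list points to where the failure started. A match is only a hint: some log lines contain the word "error" without indicating a real failure.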
 