
NTASKS configuration for CESM2.1.3 pacemaker experiment

chunlienchiang0113

蔣濬濂
New Member
Hello,

I tried to create a pacemaker experiment in CESM2.1.3 following the instructions here (CESM2 Pacific Pacemaker Ensemble Instructions | Community Earth System Model), but the simulation fails under certain CPU configurations.

I ran the experiment on my computer and tried the following configurations:
  1. 320 cores for all components: successful
  2. 480 cores for atm, cpl, ocn, glc and 320 cores for other components: successful
  3. 480 cores for atm, cpl, ocn, glc, wav and 320 cores for other components: failed (idled for several hours without updating the log files)
  4. 320 cores for wav and 560 cores for other components: failed (raised an error during mapper_Sw2o initialization)
  5. Default BHIST with any normal configuration (320 or 1024 cores for all components): successful
Has anyone else found that the pacemaker experiment in CESM2.1.3 requires specific CPU configurations?

The attached zip file contains the log files and the env_mach_pes.xml file for the failed run (4) mentioned above.

Thanks!
 

Attachments

  • log_and_xml.zip
    250.7 KB

jedwards

CSEG and Liaisons
Staff member
I'm surprised that the wav model works on as many as 320 cores - I would limit it to use only one node.
You can do that with ./xmlchange NTASKS_WAV=-1
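
A minimal sketch of applying that change, assuming the commands are run from the case directory and that the case has already been set up once. In env_mach_pes.xml a negative NTASKS value is read as a number of whole nodes rather than a task count, so -1 asks for one node. The reset and rebuild steps are the usual follow-up after changing the PE layout and are an assumption here, not something confirmed for this particular case.

```shell
# Run from the case directory (the one containing env_mach_pes.xml).

# Limit the wave component to one full node; a negative value means "nodes".
./xmlchange NTASKS_WAV=-1

# Check what the case now reports for the wave component.
./xmlquery NTASKS_WAV

# Regenerate the setup for the new PE layout, then rebuild and resubmit.
./case.setup --reset
./case.build
./case.submit
```

If the build reports stale objects after the layout change, a clean rebuild may be needed before resubmitting.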
 