WACCM hangs whilst writing output.

marcinkupilas

Marcin Kupilas
New Member
I am running cam6_4_028 on Archer2, using a WACCM case with regional refinement to 1/8 degree. I have attached my build script, case settings, machine port details (config compilers and machines) and logs from a typical setup. The problem I have is the following.

I am trying to run the model for 1 day and write high frequency output every timestep, which is 3 min 45 s.

The model runs for 15 timesteps and then hangs. The history file keeps growing, but is never closed.

I have tried:

- Running on more/fewer nodes (tried from 8 to 14)
- Changing the nsplit parameter (8 to 24)
- Changing the maximum number of entries per history file (mfilt = 384 (1 day) or 16 (1 hour))
- Changing the PIO type from netcdf to netcdf4p (model doesn't run)
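
For reference, the two mfilt values follow directly from the 3 min 45 s timestep. A minimal shell sketch of the arithmetic (the variable names are only illustrative, not CAM namelist settings):

```shell
# Timestep from the post: 3 min 45 s, expressed in seconds.
dt_seconds=$((3 * 60 + 45))

# Number of model timesteps (and so history entries at every-timestep output)
# in one day and in one hour.
steps_per_day=$((86400 / dt_seconds))
steps_per_hour=$((3600 / dt_seconds))

echo "steps_per_day=${steps_per_day} steps_per_hour=${steps_per_hour}"
```

With output every timestep, 384 entries fill one file per day and 16 entries fill one file per hour, so the hang after 15 timesteps occurs before even the smaller file would be full.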

I would appreciate any help.

Best wishes
Marcin
 

Attachments

  • logs_and_settings.tar.gz
    206.4 KB · Views: 1
  • build.FWmaHIST.ne0np4.tromso01.ne30x8.L135.no_tromso_gw.chem_output_1day.001.txt
    6.2 KB · Views: 2

jedwards

CSEG and Liaisons
Staff member
Actually, your build log shows that you already have pnetcdf installed. Have you tried using it?
./xmlchange PIO_TYPENAME='pnetcdf'
 

marcinkupilas

Marcin Kupilas
New Member
Thanks Jim

I get the following:
./xmlchange PIO_TYPENAME=pnetcdf
ERROR: Did not find pnetcdf in valid values for PIO_TYPENAME: ['netcdf', 'netcdf4p', 'netcdf4c', 'nothing']

I changed PIO_TYPENAME to netcdf4p, then built and submitted the model.

The model failed, and the following error appears many times in my cesm.log:

Abort with message NetCDF: Invalid argument in file /mnt/lustre/a2fs-work2/work/n02/n02/mmkupilas/tag_cam6_4_028/libraries/parallelio/src/clib/pioc_support.c at line 2176

I have attached the full logs.

Thanks
Marcin
 

Attachments

  • logs.12612428.tar.gz
    208.5 KB · Views: 1

marcinkupilas

Marcin Kupilas
New Member
@jedwards @pel

Could you point me to a tag that is known to have a functional Parallel IO that also does WACCM-RR? Do you know if the newest development tag cam6_4_152 has both things working?

Best
Marcin
 