
iCESM 1.2: segmentation fault with F1850C5 compset

Hi there,

I am trying to get iCESM1.2 up and running on a new HPC system called "hypatia". The machine config files are attached and my compiler is gnu/14.2.0.

I followed the instructions here and cloned the source code using the following command:


I am trying to run a tracer simulation with the F1850C5 compset:

./create_newcase -case lnd-trcr-f19-1850-fsst_cam5_1node_mpiexec_v6 -res f19_g16 -compset F1850C5 -compiler gnu -mach hypatia

I then change some XML settings to (i) enable water tracing and (ii) set the case up as a hybrid run using restart files from here:

./xmlchange CAM_CONFIG_OPTS="-phys cam5 -water_tracer h2o_h216o_hdo_h218o -water_tag_num 1"
./xmlchange RUN_TYPE=hybrid
./xmlchange RUN_REFCASE=b.ie12.B1850C5CN.f19_g16.LME.002
./xmlchange RUN_STARTDATE=0001-01-01
./xmlchange START_TOD=0
./xmlchange RUN_REFDATE=1850-01-01
./xmlchange GET_REFCASE=FALSE

Before building, I make some changes to user_nl_cam, user_nl_cice, user_nl_rtm and user_nl_clm to enable water tracing (user_nl_* files attached) and copy atm_comp_mct.F90 to the SourceMods directory (as per the iCESM1.2 instructions). I then build the case and submit to the queue.

The error occurs during the run and shows up as a segmentation fault in cesm.log (all log files attached):

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

Interestingly, when I repeat all the steps above except copying atm_comp_mct.F90 to SourceMods, the case runs without error.

Any ideas on what might be the issue here? Has anyone else running iCESM1.2 encountered a similar issue? Given the case runs fine without including the modified atm_comp_mct.F90, it seems to be the water tracer functionality that is causing my problem.

Many thanks,

Mike
 

Attachments

  • atm_in.txt (10.4 KB)
  • atm_comp_mct.F90.txt (53.7 KB)
  • user_nl_rtm.txt (594 bytes)
  • user_nl_clm.txt (1.2 KB)
  • user_nl_cice.txt (383 bytes)
  • user_nl_cam.txt (1.5 KB)
  • mkbatch.hypatia.txt (3.4 KB)
  • config_machines.xml.txt (32 KB)
  • cesm.log.250901-092357.txt (433 KB)
  • atm.log.250901-092357.txt (380.9 KB)
Update on this issue running iCESM1.2:

I was able to trace the segfault to water_tracers.F90 (attached), at the point where the tracers interact with the ZM convection scheme (see attached cesm.log):

Error termination. Backtrace:

At line 4491 of file /home/mpb20/iCESM1.2/models/atm/cam/src/physics/cam/water_tracers.F90

Fortran runtime error: Index '0' of dimension 1 of array 'q' outside of expected range (1:16)

Error termination. Backtrace:

At line 4491 of file /home/mpb20/iCESM1.2/models/atm/cam/src/physics/cam/water_tracers.F90

Fortran runtime error: Index '-2037427000' of dimension 1 of array 'q' outside of expected range (1:16)

The water_tracers.F90 source line cited in the error is in the subroutine for calculating water tracer/isotope tendencies due to the ZM convection scheme:

qu(1:lengath,:,m) = q(ideep(:),:,wtrc_iatype(m,iwtvap))

To further probe where the issue arises, I added two write statements to water_tracers.F90:

write(*,*) 'WTRC DEBUG OUTPUT', m, iwtvap, wtrc_iatype(m, iwtvap), lengath, shape(q), shape(qu)

write(*,*) 'IDEEP DEBUG OUTPUT', ideep(:)

These write statements returned the following:

WTRC DEBUG OUTPUT 1 1 10 6 16 30 57 16 30 4
IDEEP DEBUG OUTPUT 5 10 11 12 14 15 934855210 1084360962 -712594165 1085347604 526323892 1086286750 1264340450 1087403250 -761714186 1088112591

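The variable ideep has size pcols (16 here, judging by the first dimension of q) and holds the locations of the gridpoints undergoing deep convection. The first 6 values of ideep, which matches lengath = 6, look OK (see above), but the remaining 10 values are nonsense and look like uninitialized memory.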
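
To make the mismatch concrete, here is a small sketch in plain Python (not Fortran) that uses only the numbers printed in the debug output above. The helper name out_of_range is made up for this illustration and is not part of CAM. The left-hand side qu(1:lengath,:,m) has lengath = 6 rows, whereas q(ideep(:),:,...) gathers all 16 entries of ideep, including the ones beyond lengath. My assumption, which I have not confirmed against the iCESM source history, is that the index vector was meant to be restricted to ideep(1:lengath).

```python
# Values copied from the WTRC/IDEEP debug output in this post.
PCOLS = 16      # first dimension of q, from shape(q) = 16 30 57
lengath = 6     # number of columns undergoing deep convection
ideep = [5, 10, 11, 12, 14, 15,
         934855210, 1084360962, -712594165, 1085347604,
         526323892, 1086286750, 1264340450, 1087403250,
         -761714186, 1088112591]


def out_of_range(indices, lower=1, upper=PCOLS):
    """Return (position, value) pairs whose value is outside lower..upper.

    Positions are 1-based to match Fortran indexing.
    """
    return [(pos, val) for pos, val in enumerate(indices, start=1)
            if not lower <= val <= upper]


# Equivalent of ideep(:): every entry is used, set or not.
bad_full = out_of_range(ideep)

# Equivalent of ideep(1:lengath): only the entries that were filled in.
bad_trimmed = out_of_range(ideep[:lengath])

print("ideep(:)         entries:", len(ideep), " out of range:", len(bad_full))
print("ideep(1:lengath) entries:", lengath, " out of range:", len(bad_trimmed))
```

With these values, the full vector has 16 entries against the 6 rows on the left-hand side, and every entry past position 6 would index q outside 1:16, which is consistent with the bounds-check error quoted above.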

Any ideas on what might be going wrong here? And how I might go about fixing the issue? Thanks!
 

Attachments

  • cesm.log.250911-141107.txt (439.4 KB)
  • water_tracers.F90.txt (228.5 KB)