
Floating point exception error in running iCESM1.2

yzLiu

Yizhang Liu
New Member
Dear all:
About two minutes after I submit my <case>.run script, the job is killed. I am trying to run an oxygen-isotope-enabled startup PI case on our server, following the iCESM1.2 instructions on GitHub. The compset is 'B1850C5' and the resolution is T31_g37. Confusingly, the same case ran successfully on another server (I tried it before), but it fails on ours. I don't think the problem lies in CAM, POP, or the other component models, because there are no errors in their *.log.* files. In cesm.log, after lines like 'calcsize j,iq,jac, lsfrm,lstoo ............', there are many 'QNEG3 from ...... mixing ratio violated at ...... points' messages, followed by some 'BalanceCheck: soil balance error' and 'ERROR: Isotopic deep-conv precip error' messages. At the end, the log says 'BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES', 'slurmstepd: error: Detected 1 oom_kill event in StepId = ......', and 'srun: error: l06c41n2: task 0: Out Of Memory'.

Later I set 'Debug' in env_build.xml to 'True'. At the same place in cesm.log, right after 'calcsize .......', the node I used reports 'Caught signal 8 (Floating point exception: floating-point invalid operation)', followed by some backtrace information and 'forrtl: error: floating point exception'. By the way, there are many 'NetCDF: Invalid dimension ID or name' and 'NetCDF: Variable or Attribute not found' messages before all of these. Could something be wrong with the netCDF module? It worked fine when running standard CESM1.2, though.
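
To make the order of events in the two logs easier to compare, here is a minimal sketch of how the messages quoted above could be tallied. It only assumes the logs are plain text; the function names `summarize_log` and `print_summary` are my own and are not part of CESM or iCESM. Matching is case-insensitive because the debug log contains both 'Floating point exception' and 'floating point exception'.

```python
# Message fragments quoted in this post; the list can be extended as needed.
PATTERNS = [
    "QNEG3 from",
    "BalanceCheck: soil balance error",
    "Isotopic deep-conv precip error",
    "Floating point exception",
    "NetCDF: Invalid dimension ID or name",
    "NetCDF: Variable or Attribute not found",
    "BAD TERMINATION",
    "oom_kill",
    "Out Of Memory",
]


def summarize_log(lines):
    """Return {pattern: (first_line_number, count)} for patterns that occur.

    `lines` is any iterable of strings, e.g. an open file handle.
    """
    summary = {}
    for lineno, line in enumerate(lines, start=1):
        lowered = line.lower()
        for pat in PATTERNS:
            if pat.lower() in lowered:
                first, count = summary.get(pat, (lineno, 0))
                summary[pat] = (first, count + 1)
    return summary


def print_summary(path):
    """Print the matched patterns ordered by where they first appear."""
    # errors="replace" avoids a crash on stray non-UTF-8 bytes in the log.
    with open(path, errors="replace") as fh:
        summary = summarize_log(fh)
    for pat, (first, count) in sorted(summary.items(), key=lambda kv: kv[1][0]):
        print(f"first at line {first}, {count} occurrence(s): {pat}")
```

For example, calling `print_summary("cesm.log.231112-141854.txt")` on the attached non-debug log would show whether the QNEG3 warnings start before or after the NetCDF messages, and how far ahead of the out-of-memory kill they appear.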

Below are some screenshots from the debug cesm.log. I have also attached both the debug and the normal cesm.log files.

I would be grateful for any suggestions or solutions.
[Screenshots: 屏幕截图 2023-11-12 192907.png, 屏幕截图 2023-11-12 193133.png]
 

Attachments

  • cesm.log(debug).231110-150631.txt
    427 KB · Views: 1
  • cesm.log.231112-141854.txt
    626.4 KB · Views: 1