QNEG4 WARNING from TPHYSAC

Hi,

I am new to CAM3. I am working on an Altix (Itanium) machine and built CAM with the Intel compilers, version 8.0. I get the messages below when running CAM3.

QNEG4 WARNING from TPHYSAC , lchnk = 5; Max possible LH flx exceeded at 1 points. Worst excess = -2.9597E-06 at i = 121
QNEG4 WARNING from TPHYSAC , lchnk = 5; Max possible LH flx exceeded at 1 points. Worst excess = -5.4597E-08 at i = 122
.
.
.
QNEG4 WARNING from TPHYSAC , lchnk = 4; Max possible LH flx exceeded at 1 points. Worst excess = -1.4211E-06 at i = 125
QNEG4 WARNING from TPHYSAC , lchnk = 4; Max possible LH flx exceeded at 1 points. Worst excess = -4.2685E-09 at i = 126

MPI: MPI_COMM_WORLD rank 0 has terminated without calling MPI_Finalize()
MPI: aborting job
MPI: Received signal 11

Any pointers in this regard will be of great help.

Thanks,
rpb
 

pjr

The QNEG4 warnings are usually generated at the beginning of a model run, when there are inconsistencies between the model's initial conditions for land and atmosphere. I wouldn't worry about them unless they persist.
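To judge whether the warnings persist, it can help to tally them from the model log. The following is a minimal sketch, not part of CAM: the regular expression is written against the warning lines quoted in this thread, and the function name `summarize_qneg4` is hypothetical.

```python
import re
from collections import Counter

# Pattern written against the warning lines quoted in this thread;
# other CAM versions may format the message differently (assumption).
QNEG4_RE = re.compile(
    r"QNEG4 WARNING from (\w+)\s*,\s*lchnk\s*=\s*(\d+);.*?"
    r"Worst excess\s*=\s*([-+]?\d+\.\d+E[-+]?\d+)\s+at i\s*=\s*(\d+)"
)


def summarize_qneg4(lines):
    """Return (warnings per chunk, largest absolute excess) for a log."""
    per_chunk = Counter()
    worst = 0.0
    for line in lines:
        match = QNEG4_RE.search(line)
        if match is None:
            continue  # not a QNEG4 warning line
        per_chunk[int(match.group(2))] += 1
        worst = max(worst, abs(float(match.group(3))))
    return per_chunk, worst


# The four warning lines from the original post.
sample = [
    "QNEG4 WARNING from TPHYSAC , lchnk = 5; Max possible LH flx exceeded at 1 points. Worst excess = -2.9597E-06 at i = 121",
    "QNEG4 WARNING from TPHYSAC , lchnk = 5; Max possible LH flx exceeded at 1 points. Worst excess = -5.4597E-08 at i = 122",
    "QNEG4 WARNING from TPHYSAC , lchnk = 4; Max possible LH flx exceeded at 1 points. Worst excess = -1.4211E-06 at i = 125",
    "QNEG4 WARNING from TPHYSAC , lchnk = 4; Max possible LH flx exceeded at 1 points. Worst excess = -4.2685E-09 at i = 126",
]

if __name__ == "__main__":
    counts, worst_excess = summarize_qneg4(sample)
    print(dict(counts), worst_excess)
```

Running this over successive portions of a log would show whether the counts drop off after the first few steps.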

The MPI error is obviously more serious. I would first try running the
model single-tasked and single-threaded, and see whether the problem still occurs.

One of the fundamental tests we do to see whether the model is
correctly implemented is to run it in a variety of configurations and assure ourselves that the simulations differ only at an acceptable level. So, for
example, I would check:
1) Does the model run for an extended length (at least a few days)
without MPI or OpenMP turned on?
2) Does the model give similar answers when MPI and OpenMP are turned on?
3) Does the model give exactly the same answer when run with a given
set of compiler options (e.g. OpenMP and MPI choice) but a different number of processors?
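Checks 2 and 3 call for two different comparisons: "similar" within some tolerance, and "exactly the same" bit for bit. A minimal sketch of both, assuming the fields from two runs have already been read into flat lists of floats (reading the history files is not shown, and the function names are hypothetical):

```python
import struct


def bit_for_bit(a, b):
    """True if two sequences of floats have identical 64-bit patterns."""
    if len(a) != len(b):
        return False
    # Compare the packed IEEE doubles rather than using ==,
    # so that -0.0 vs 0.0 and NaN payloads are treated as bit patterns.
    return struct.pack(f"{len(a)}d", *a) == struct.pack(f"{len(b)}d", *b)


def max_relative_difference(a, b):
    """Largest |x - y| / max(|x|, |y|) over paired values (0.0 if identical)."""
    if len(a) != len(b):
        raise ValueError("fields must have the same length")
    worst = 0.0
    for x, y in zip(a, b):
        scale = max(abs(x), abs(y))
        if scale > 0.0:
            worst = max(worst, abs(x - y) / scale)
    return worst


if __name__ == "__main__":
    run_a = [288.15, 101325.0, 0.0]
    run_b = [288.15, 101325.0, 0.0]
    print(bit_for_bit(run_a, run_b), max_relative_difference(run_a, run_b))
```

What counts as an acceptable relative difference for check 2 is a judgment for the person running the model; the sketch only reports the number.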

There are other posts on the forum that discuss these issues, and the appropriate testing procedures, in more detail, but it is worth noting that
just getting the model to compile is no guarantee that you are going to
get the right answer.

Phil
 
Thank you both for your replies!

Apparently, the QNEG4 warnings were just warnings; the program abort was not connected to them. I had initially set a stack size of 100 M, and the model failed with that. It ran fine once I set the stack size to unlimited. (It was just a sequential run, which is all I need for the time being, so no MPI was used.)
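For anyone hitting the same signal 11, the stack limit can be inspected before launching the model. The sketch below uses Python's standard `resource` module on a Unix-like system; the helper names are hypothetical, and raising the limit this way only affects this process and the children it starts.

```python
import resource


def describe_limit(value):
    """Format a limit in bytes, or 'unlimited' for RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // (1024 * 1024)} MiB"


def raise_stack_soft_limit():
    """Raise the soft stack limit to the hard limit; return the new limits."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_STACK)


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    print("soft:", describe_limit(soft), "hard:", describe_limit(hard))
```

In an interactive shell the usual equivalent is `ulimit -s unlimited` (bash) or `limit stacksize unlimited` (csh/tcsh), subject to the hard limit set by the system.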

Thanks,
rpb
 