
CAM4 SIGSEGV error when reading in netcdf file

Hi

I am running CAM4. In these runs, I read in a netcdf file of my own in the routine physpkg.F90. This file contains
heating values which I subsequently apply to the model. I have recently switched the machine I run CAM4 on from an
IBM p5-575 cluster running AIX 5.3 to an IBM iDataplex cluster running RedHat Linux. On the old machine, the file could
be read in without problem. On the current machine, I get the error shown below when I try to read this file. The
error occurs in the function PIO_openfile in the module piolib_mod.F90. It fails on the line that says:

file%iosystem => iosystem

Here is the error that I'm getting:

***********************************************************************
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image      PC                Routine            Line     Source
ccsm.exe   0000000000D9E35E  Unknown            Unknown  Unknown
ccsm.exe   0000000000521249  cam_pio_utils_mp_  602      cam_pio_utils.F90
ccsm.exe   0000000000582C9E  physpkg_mp_phys_i  568      physpkg.F90
ccsm.exe   00000000004EF80F  cam_comp_mp_cam_i  164      cam_comp.F90
ccsm.exe   00000000004EC14B  atm_comp_mct_mp_a  271      atm_comp_mct.F90
ccsm.exe   000000000049580B  MAIN__             630      ccsm_driver.F90
ccsm.exe   000000000048D46C  Unknown            Unknown  Unknown
libc.so.6  0000003568E1D994  Unknown            Unknown  Unknown
ccsm.exe   000000000048D379  Unknown            Unknown  Unknown
--------------------------------------------------------------------------
mpiexec has exited due to process rank 2 with PID 15456 on
node login002 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------
***********************************************************************

I have no idea what is causing this error and I have spent a lot of time trying to figure it out. Any help you could offer would
be much appreciated. Thanks,

Cara-Lyn
 
I have found a fix for this problem.

I'm not sure exactly why this worked, but it did. It came down to how I declared the variable that holds the file ID for the netcdf file I
was trying to open. On my old machine, it had to be declared with the pointer attribute. The new machine did not accept the variable being declared
as a pointer. So, on the old IBM machine running AIX 5.3, the commands were:

type(file_desc_t), pointer :: ncid
call cam_pio_openfile(ncid, lhfile, PIO_NOWRITE)

On the new IBM machine running RedHat Linux, the commands were:

type(file_desc_t) :: ncid
call cam_pio_openfile(ncid, lhfile, PIO_NOWRITE)

In the routine cam_pio_openfile, the variable that takes the value of ncid is declared as a target, so I have no idea
why declaring it as a pointer didn't work. I don't understand it, but this is how I fixed it.

Hope this helps anyone else who may have this problem.
 

eaton

CSEG and Liaisons
The cam_pio_openfile subroutine is expecting ncid to be an object of type file_desc_t. When you add the pointer attribute to the declaration of the actual argument, you need an allocate statement for ncid so that memory is allocated for the file_desc_t object (I can't explain why the code you were using on the IBM p5 platform was working). When you leave the pointer attribute out of the declaration, the file_desc_t object is allocated on the stack. So your second code example is correct.
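
To illustrate the two declarations, here is a minimal sketch that compiles without PIO. Note that file_desc_t below is a hypothetical stand-in with a single made-up component, demo_openfile is a stand-in for cam_pio_openfile, and the file name is invented; none of these are the real PIO or CAM definitions.

```fortran
module demo_pio
  implicit none
  ! Hypothetical stand-in for PIO's file_desc_t (the real type differs).
  type :: file_desc_t
     integer :: fh = -1
  end type file_desc_t
contains
  ! Stand-in for cam_pio_openfile: the dummy argument is an object
  ! with the target attribute, not a pointer.
  subroutine demo_openfile(file, fname)
    type(file_desc_t), target, intent(inout) :: file
    character(len=*), intent(in) :: fname
    file%fh = len_trim(fname)   ! pretend file handle, for illustration only
  end subroutine demo_openfile
end module demo_pio

program pointer_vs_object
  use demo_pio
  implicit none
  type(file_desc_t)          :: ncid_obj
  type(file_desc_t), pointer :: ncid_ptr => null()

  ! Case 1: plain object. Its storage already exists, so it can be passed directly.
  call demo_openfile(ncid_obj, 'heating.nc')
  print *, 'object handle:  ', ncid_obj%fh

  ! Case 2: pointer. It must be allocated (or pointed at a target) before
  ! the call; passing it unassociated is invalid and may crash.
  allocate(ncid_ptr)
  call demo_openfile(ncid_ptr, 'heating.nc')
  print *, 'pointer handle: ', ncid_ptr%fh
  deallocate(ncid_ptr)
end program pointer_vs_object
```

Removing the allocate statement in case 2 reproduces the situation in the original code: the subroutine would write through a pointer that refers to no object.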

The dummy argument "file" of subroutine cam_pio_openfile is declared with the target attribute, which means that the object being passed is allowed to be associated with a pointer in the subroutine (although in this instance the object is just being passed down the call tree). It doesn't mean that the object being passed can be an unassociated pointer. If that were the case then the dummy argument would need a pointer attribute rather than a target attribute.
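
The difference between the two attributes on a dummy argument can be sketched as follows. This is an illustration only: file_desc_t is again a hypothetical stand-in type, and open_target and open_pointer are invented names, not routines in CAM or PIO.

```fortran
module demo_attrs
  implicit none
  ! Hypothetical stand-in for PIO's file_desc_t.
  type :: file_desc_t
     integer :: fh = -1
  end type file_desc_t
contains
  ! Target dummy: the caller must supply an existing object. The target
  ! attribute only permits a pointer inside the routine to refer to it.
  subroutine open_target(file)
    type(file_desc_t), target, intent(inout) :: file
    type(file_desc_t), pointer :: p
    p => file
    p%fh = 1
  end subroutine open_target

  ! Pointer dummy: the caller may pass an unassociated pointer, and the
  ! routine itself can check the association and allocate the object.
  subroutine open_pointer(file)
    type(file_desc_t), pointer, intent(inout) :: file
    if (.not. associated(file)) allocate(file)
    file%fh = 2
  end subroutine open_pointer
end module demo_attrs

program target_vs_pointer
  use demo_attrs
  implicit none
  type(file_desc_t)          :: obj
  type(file_desc_t), pointer :: ptr => null()

  call open_target(obj)    ! object exists before the call
  call open_pointer(ptr)   ! unassociated on entry, allocated inside
  print *, obj%fh, ptr%fh
  deallocate(ptr)
end program target_vs_pointer
```

Both routines live in a module so that the compiler sees an explicit interface, which Fortran requires for a pointer dummy argument.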
 