Understanding a vague error message

jhollowed

New Member
I'm trying to run from initial conditions read from a file specified by NCDATA. I'd like to verify that I've constructed the initial conditions properly by running a 30-day simulation with FV3 (--res C24_C24_mg17) and --compset FADIAB. However, the run crashes immediately with a vague error message in atm.log that I can neither interpret nor find much documentation for:

Advected constituent list:
1 Q Specific humidity wet
Thermodynamic active species Q 1 1 1810.00000000000 1384.51391946048
Creating field_table file to load tracer fields into fv3
Creating new decomp: 0!1!!3456!!d6!i1!
ERROR: cam_filemap_get_filemap: Map size (96) too large for array dims (0, 1, )

What does this mean? What is the Map, and where are the quoted sizes coming from? The source code at the location of this error message is pretty cryptic.
 

jet

Member
The subroutine is creating a mapping between a field on the initial file and the array segments for that field, which are distributed among a number of tasks. The error is telling you that there are more elements in the map than in an array segment. Either something is wrong with the way the model is being decomposed among processors, or there is a problem with your initial file.
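To make the message concrete, here is a minimal Python sketch of the kind of check that could produce it. It is not the CAM source: the function name `check_map_fits` is hypothetical, and it assumes the check compares the number of map elements with the product of the local array dimensions.

```python
from math import prod


def check_map_fits(map_size, array_dims):
    """Hypothetical stand-in for the size check; NOT the actual CAM routine.

    Assumption: the local array can hold prod(array_dims) elements, and the
    map must not contain more entries than that.
    """
    capacity = prod(array_dims)
    if map_size > capacity:
        dims = ", ".join(str(d) for d in array_dims)
        raise ValueError(
            f"Map size ({map_size}) too large for array dims ({dims})"
        )
    return capacity


# With a zero-length dimension the array holds no elements,
# so any non-empty map fails the check.
try:
    check_map_fits(96, (0, 1))
except ValueError as err:
    print(err)
```

Under that assumption, the interesting part of the message is the 0 in the array dims rather than the map size itself: a local array with a zero-length dimension cannot hold any map entries.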

FV3 requires a minimum of 6 processors in order to run. You should specify the number of processors via the --pecount parameter to create_newcase. The CESM/cime scripts will determine how the processors are distributed across each face of the FV3 cubed sphere and set the atm_in namelist parameter fv3_layout accordingly. There could be a bug in the way fv3_layout is being calculated. Could you try running on 6 processors and see if that works?
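As a rough way to sanity-check a layout by hand, the sketch below assumes that fv3_layout gives the number of tasks along each direction of a single cube face, so that the total task count is the product of the two layout values times the 6 faces. The helper names are hypothetical, and this is not the algorithm the CIME scripts use to pick a layout.

```python
def tasks_for_layout(layout, ntiles=6):
    """Total MPI tasks implied by a per-face layout (assumed relationship)."""
    nx, ny = layout
    return nx * ny * ntiles


def candidate_layouts(ntasks, ntiles=6):
    """All per-face layouts (nx, ny) that use exactly ntasks tasks.

    Returns an empty list when ntasks is not a multiple of ntiles.
    """
    if ntasks % ntiles != 0:
        return []
    per_tile = ntasks // ntiles
    return [(a, per_tile // a)
            for a in range(1, per_tile + 1)
            if per_tile % a == 0]


print(tasks_for_layout((1, 1)))   # one task per face
print(candidate_layouts(6))
print(candidate_layouts(4))       # fewer tasks than faces
```

If the layout written to atm_in does not multiply out to the task count requested with --pecount, that would point to the layout calculation rather than the initial file.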

If you are still having issues with 6 processors, you could also try running with one of the C24 files in our repository, provided it has the same number of levels that you are using. You can also compare your file to one of the default C24 initial files (inputdata/atm/cam/inic/fv3/cami-mam3_0000-01-01_C24_L30_c200625.nc) to see if there is an obvious difference (ncols different, time/ncol coordinates reversed, lat/lon different between your file and the standard one).
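One simple way to do that comparison is to list the dimension sizes from each file header (for example from `ncdump -h`) and diff them. The sketch below works on plain dictionaries so it needs no NetCDF library; the function name `diff_dims`, the dimension names, and the sizes shown are illustrative assumptions, not values read from the actual files.

```python
def diff_dims(mine, reference):
    """Return (name, my_size, reference_size) for each dimension that differs.

    A size of None means the dimension is missing from that file.
    """
    names = sorted(set(mine) | set(reference))
    return [(name, mine.get(name), reference.get(name))
            for name in names
            if mine.get(name) != reference.get(name)]


# Illustrative values only, typed in by hand from the file headers.
reference = {"ncol": 3456, "lev": 30, "time": 1}
mine = {"ncol": 3456, "lev": 32, "time": 1}  # hypothetical mismatch in lev

print(diff_dims(mine, reference))
```

An empty list means the dimension sizes agree. It does not check dimension order within a variable or coordinate values, so those still need a look by eye.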


jt
 

jhollowed

New Member
In atm_in, I see

fv3_layout = 2,3
fv3_ntiles=6

At model build, I requested 36 processes, so the MPI topology seems potentially correct. If the map has 96 elements on each task, then 96*36 = 3456, which is the number of grid points for C24 on a single level. Is that to be expected? If so, the error remains confusing.
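The arithmetic can be checked with a short Python sketch. It assumes a CN cubed-sphere grid has N x N columns on each of its 6 faces and that the columns are split evenly across tasks; the function names are made up for illustration.

```python
def cubed_sphere_columns(n, ntiles=6):
    """Columns on a single level of a C<n> grid, assuming n x n per face."""
    return ntiles * n * n


def columns_per_task(n, ntasks, ntiles=6):
    """Columns owned by each task, assuming an even split across tasks."""
    total = cubed_sphere_columns(n, ntiles)
    if total % ntasks != 0:
        raise ValueError(f"{total} columns do not divide evenly over {ntasks} tasks")
    return total // ntasks


print(cubed_sphere_columns(24))   # 3456
print(columns_per_task(24, 36))   # 96
```

So a map size of 96 is what an even split of C24 over 36 tasks would give, which suggests the map itself is reasonable and the array dims (0, 1, ) are the part to question.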

Per your advice, I've compared my input file to the one you cite in the repo, both with L30. I see no obvious differences; the shapes of lat, lon, and hybm (just to check a few) were consistent. So I've tried ignoring my initial file completely, and replaced it in user_nl_cam with

NCDATA = '/my/path/to/cami-mam3_0000-01-01_C24_L30_c200625.nc'

With NTASKS=36, this crashes with the same error as mentioned in my original post. Trying again with NTASKS=6, I see the same issue:


ERROR:
cam_filemap_get_filemap: Map size (576) too large for array dims (0, 1, )

where again, 576*6 = 3456, the number of grid points for C24. The CAM version I am using is jtruesdal/CAM-1, as per our recent email correspondence (if I am right in recognizing your signature).
 

jet

Member
Yep. I should have asked which repo you were using. I just assumed it was a release. jtruesdal/CAM-1 is a personal repository and is for development only. The official releases go through extensive testing and are the ones you should be using. Glad it's working for you now though and good luck with your research.

jt
 

jhollowed

New Member
Thank you for the assistance!
 