cice6 sandboxing

dpath2o

Daniel Atwater
New Member
Hello,

Notwithstanding my intellectual shortcomings, I'm doing my best to get CICE6 up and running at a higher resolution than the default 1 or 3 degree. At first, on my local machine, I'm looking at stepping CICE6 up in a systematic fashion to a higher resolution that is in line with the strategic goals of the modelling framework I will be aligning to in the future. At present I'm *not* coupling to an ocean, as I would like to understand CICE6 at short time steps/frames using ERA5 forcing in standalone mode.

1. I'm struggling to get past the first step of creating a grid
2. I have noticed that 'tx1', 'out-of-the-box', appears not to be reading in its forcing files

I have begun documenting my efforts here ( AFIM/cice.org at main · dpath2o/AFIM ) with an accompanying jupyter notebook ( AFIM/cice_analysis.ipynb at main · dpath2o/AFIM ).

If you have any comments and/or suggestions I'd be most receptive.

Kind Regards,
Dan
 

dpath2o

Daniel Atwater
New Member
Updating my own thread.

Using Getting started with xgcm for MOM6 — Pangeo Gallery documentation as an example I created a 1/2 degree grid and saved it as a NetCDF to my local run directory:
~/cice-dirs/input/CICE_data/grids/gx0p5/grid_gx05.nc
Then, as per normal, I attempted to 'setup' CICE with:
./cice.setup -m conda -e macos -c sandbox_gx0p5 -g gx0p5
which results in:
./cice.setup:
cice.setup: ERROR unknown grid gx0p5
./cice.setup: ERROR, cice_decomp.csh aborted
That error is something that I expected, or at least expected 'cice.setup' to likely not understand my boutique grid. As a sidebar, I did create a new namelist file 'set_nml.gx0p5' and placed it in my CICE source git clone directory 'configuration/scripts/options'. For your situational awareness that file looks like this:
Bash:
dt                     = 3600.0
runtype                = 'initial'
year_init              = 2005
use_leap_years         = .true.
use_restart_time       = .false.
ice_ic                 = ''
grid_format            = 'nc'
grid_type              = 'displaced_pole'
grid_file              = 'ICE_MACHINE_INPUTDATA/CICE_data/grid/gx0p5/grid_gx0p5.nc'
kmt_file               = 'kmt'
bathymetry_file        = ''
maskhalo_dyn           = .true.
maskhalo_remap         = .true.
maskhalo_bound         = .true.
fyear_init             = 2005
atm_data_format        = 'nc'
atm_data_type          = 'ERA5_gx0p5'
atm_data_dir           = ''
precip_units           = 'mks'
ocn_data_dir           = ''
bgc_data_dir           = ''
distribution_wght_file = ''

I do realise that the above namelist file is pretty much garbage as it contains no initial conditions nor atmospheric forcing, but my goal at the moment is just to get a new grid file to be understood by CICE.
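Note that the grid was saved above as grids/gx0p5/grid_gx05.nc while the namelist points at grid/gx0p5/grid_gx0p5.nc, so it may be worth confirming that the namelist path resolves to a real file. A minimal Python sketch, assuming the ICE_MACHINE_INPUTDATA placeholder ends up replaced by the input-data root (taken here to be ~/cice-dirs/input); the helper names are made up for illustration:

```python
import re
from pathlib import Path

def namelist_value(text, key):
    """Return the quoted string assigned to `key` in namelist text, or None."""
    match = re.search(rf"^\s*{re.escape(key)}\s*=\s*'([^']*)'", text, re.MULTILINE)
    return match.group(1) if match else None

def resolve_input_path(value, inputdata_root):
    """Swap the ICE_MACHINE_INPUTDATA placeholder for a real directory (assumed behaviour)."""
    return Path(value.replace("ICE_MACHINE_INPUTDATA", str(inputdata_root)))

# A few lines copied from set_nml.gx0p5 above.
nml = """
grid_format = 'nc'
grid_file   = 'ICE_MACHINE_INPUTDATA/CICE_data/grid/gx0p5/grid_gx0p5.nc'
kmt_file    = 'kmt'
"""

grid_path = resolve_input_path(namelist_value(nml, "grid_file"),
                               Path.home() / "cice-dirs" / "input")
print(grid_path, "exists" if grid_path.exists() else "is missing")
```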

Hence I continued on: running the above 'cice.setup', even though it exits on error, still produces the 'sandbox_gx0p5' directory with what appear to be the crucial scripts to sally forth. So I edited 'sandbox_gx0p5/cice.settings' to look like this:
Bash:
#!/bin/csh -f

setenv ICE_CASENAME   sandbox_gx0p5
setenv ICE_SANDBOX    /Users/dpath2o/PHD/MODELS/src/CICE
setenv ICE_MACHINE    conda
setenv ICE_ENVNAME    macos
setenv ICE_MACHCOMP   conda_macos
setenv ICE_SCRIPTS    /Users/dpath2o/PHD/MODELS/src/CICE/configuration/scripts
setenv ICE_CASEDIR    /Users/dpath2o/PHD/MODELS/src/CICE/sandbox_gx0p5
setenv ICE_RUNDIR     /Users/dpath2o/cice-dirs/runs/sandbox_gx0p5
setenv ICE_OBJDIR     ${ICE_RUNDIR}/compile
setenv ICE_RSTDIR     ${ICE_RUNDIR}/restart
setenv ICE_HSTDIR     ${ICE_RUNDIR}/history
setenv ICE_LOGDIR     ${ICE_CASEDIR}/logs
setenv ICE_DRVOPT     standalone/cice
setenv ICE_TARGET     cice
setenv ICE_IOTYPE     netcdf    # binary, netcdf, pio1, pio2
setenv ICE_CLEANBUILD true
setenv ICE_CPPDEFS    ""
setenv ICE_QUIETMODE  false
setenv ICE_GRID       gx0p5
setenv ICE_NTASKS     8
setenv ICE_NTHRDS     1
setenv ICE_OMPSCHED   "static,1"
setenv ICE_TEST       undefined
setenv ICE_TESTNAME   undefined
setenv ICE_TESTID     undefined
setenv ICE_BASELINE   /Users/dpath2o/cice-dirs/baseline
setenv ICE_BASEGEN    undefined
setenv ICE_BASECOM    undefined
setenv ICE_BFBCOMP    undefined
setenv ICE_BFBTYPE    restart
setenv ICE_SPVAL      undefined
setenv ICE_RUNLENGTH  -1
setenv ICE_ACCOUNT    P0000000
setenv ICE_QUEUE      debug

#======================================================

setenv ICE_THREADED   false
if (${ICE_NTHRDS} > 1) setenv ICE_THREADED  true
setenv ICE_COMMDIR mpi
if (${ICE_NTASKS} == 1) setenv ICE_COMMDIR serial

### Specialty code
setenv ICE_BLDDEBUG  false  # build debug flags
setenv ICE_COVERAGE  false  # build coverage flags

I then ran './cice.build' and, to my surprise, it ends with:
./cice.build: COMPILE SUCCESSFUL, /Users/dpath2o/PHD/MODELS/src/CICE/sandbox_gx0p5/logs/cice.bldlog.220607-093444

Alas, the build comes with NIL 'cice.run' or 'cice.submit' ... hmmmm ... possibly need to dig deeper but thought this might be a good place to pause and update here in case anyone home here smells something burning in the kitchen!

Thanks!
 

Philippe Blain

New Member
Hi Dan,

The error you get from ./cice.setup says "cice_decomp.csh aborted". The script cice_decomp.csh is run automatically by cice.setup and is used to guess a good block size and block decomposition for the chosen grid, based on the number of grid points in X and Y (these are currently hardcoded in the script) and the chosen number of MPI processes and OpenMP threads (the -p argument to cice.setup, which defaults to 4x1, i.e. 4 MPI ranks and no OpenMP).

Since your new grid is not recognized by cice_decomp.csh, the script aborts and then cice.setup also aborts, which is why you get no cice.run script in your case directory.

The values chosen by cice_decomp.csh, however, are just starting guesses and can be changed afterwards in the namelist (ice_in).

What I would recommend to get you started quickly is to use an invocation like this:

Bash:
./cice.setup -m conda -e macos -c sandbox_gx0p5 -g gx1 -s gx0p5

This will make the cice_decomp.csh script stop complaining as it will use its hardcoded values for the gx1 grid, but then cice.setup will apply the settings from your set_nml.gx0p5 file. This will allow cice.setup to complete successfully (I tested it) and you can then adjust the namelist to continue iterating on your setup.
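For a sense of what a starting guess for the decomposition involves, here is a rough Python sketch. It is not the logic in cice_decomp.csh; it simply picks the process layout that gives the most square blocks, with one block per MPI task. The 720 x 360 size for a half-degree grid and the function name are assumptions for illustration; the keys mirror the block_size_x, block_size_y and max_blocks entries you would then adjust in ice_in.

```python
import math

def guess_blocks(nx, ny, ntasks):
    """Pick the px*py == ntasks layout with the most square blocks, one block per task."""
    best = None
    for px in range(1, ntasks + 1):
        if ntasks % px:
            continue  # px must divide the task count evenly
        py = ntasks // px
        bx, by = math.ceil(nx / px), math.ceil(ny / py)
        score = abs(bx - by)  # 0 means perfectly square blocks
        if best is None or score < best[0]:
            best = (score, px, py, bx, by)
    _, px, py, bx, by = best
    return {"px": px, "py": py, "block_size_x": bx, "block_size_y": by, "max_blocks": 1}

# Assumed half-degree grid size on 8 MPI tasks.
print(guess_blocks(720, 360, 8))
```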

Hope this helps,

Philippe.
 

dpath2o

Daniel Atwater
New Member
Forcing Data Setup

Hi, I have a new query regarding forcing data. I have a quarter-degree global tripole grid, essentially from here (COSIMA), onto which I have re-gridded (using CDO) most of the required stand-alone ocean fields (uocn, vocn, ss_tltx, ss_tlty, sss, sst, and hmix) from a dataset called Bluelink ReANalysis (BRAN). I have also re-gridded ERA5 (and derived what I could) for most of the required stand-alone atmosphere fields (uatm, vatm, strax, stray, potT, Tair, Qa, rhoa, flw, fsw, swvdr, swvdf, frain, and fsnow).

BRAN data is on a daily time step and ERA5 data is on an hourly time step. I do not see the need to alter (i.e. filter) ERA5 to a lower temporal resolution. At present I have only re-gridded BRAN and ERA5 for one month's worth of ocean and atmosphere data, January 2010, before I expand to the larger decadal duration. I have placed these two NetCDF files (one for BRAN and one for ERA5) in the appropriate location (for BRAN, /Users/dpath2o/cice-dirs/input/CICE_data/forcing/0p25/daily/BRAN_0p25_forCICE_2010.nc), which is where my ice_in file directs cice.run to find its forcing. Attached is my ice_in file.

For the moment, I'd like to draw attention to just the ocean forcing file (NetCDF), the header of which is also attached.

My problem is this, when I run CICE the following error occurs:
SSS climatology computed from:
/Users/dpath2o/cice-dirs/input/CICE_data/forcing/0p25/daily/sss.mm.100x116.da
At line 275 of file /Users/dpath2o/PHD/MODELS/src/CICE/cicecore/cicedynB/infrastructure/ice_read_write.F90 (unit = 13, file = '/Users/dpath2o/cice-dirs/input/CICE_data/forcing/0p25/daily/sss.mm.100x116.da')
Fortran runtime error: Non-existing record number

Error termination. Backtrace:
#0 0x10c0a84dc
#1 0x10c0a9395
#2 0x10c0a9f55
#3 0x10c626c1c
#4 0x10244db79
#5 0x102340ae3
#6 0x10227628a
#7 0x10254cf98
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[65114,1],0]
Exit code: 2
--------------------------------------------------------------------------
(this log file is also attached).
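A note on the Fortran message: "Non-existing record number" is what gfortran reports when a direct-access read asks for a record that lies beyond the end of the file. The file name suggests a 100 x 116 grid, so if it is being read with the record length of a larger grid, even the first record may not fit. A back-of-envelope Python check, assuming 8-byte reals, 12 monthly records in the file, and a 1440 x 1080 quarter-degree grid (all three are assumptions, not taken from the log):

```python
def records_available(file_bytes, nx, ny, bytes_per_value=8):
    """Number of whole direct-access records of nx*ny values that fit in the file."""
    return file_bytes // (nx * ny * bytes_per_value)

# Assumed layout of sss.mm.100x116.da: 12 monthly records of 100 x 116 real*8 values.
file_bytes = 12 * 100 * 116 * 8

print(records_available(file_bytes, 100, 116))    # read with the grid it was written for
print(records_available(file_bytes, 1440, 1080))  # read with an assumed 0.25-degree grid
```

Under those assumptions the first call gives 12 records and the second gives 0, which would be consistent with the error at the very first read.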

My immediate question is, does anyone have any idea what I'm doing wrong?

One of my less immediate questions is, what technical (non-scientific) problems will arise (if any), from not having the complete list of ocean and atmosphere fields as listed here? The fields that I'm currently missing from my forcing dataset are: swidr, swidf, frzmlt, frzmlt_init, qdp and Tf. As token knowledge, if anyone has any insights on how to derive swidr & swidf from ERA5 I'd be most intrigued as I have the full solar atmospheric flux dataset from ERA5, but just have not worked out how to derive those two fields. Likewise for ocean heat content/flux fields?

As always, thanks in advance for any help/insights.

Kind Regards,
Dan
 

Attachments

  • BRAN_0p25_forCICE_2010_header.txt
    6.8 KB · Views: 3
  • cice.runlog.220716-213122.txt
    24.2 KB · Views: 1
  • ice_in.txt
    18.1 KB · Views: 6

dpath2o

Daniel Atwater
New Member
After reviewing cicecore/cicedynB/general/ice_forcing.F90 , a worthwhile test that I'll pursue now is to set (in my ice_in)
Code:
&setup_forcing.ocn_data_type='ncar'
and ensure my ocean forcing dataset (i.e. my /Users/dpath2o/cice-dirs/input/CICE_data/forcing/0p25/daily/BRAN_0p25_forCICE_2010.nc) adheres to this format:
Code:
!=======================================================================
! NCAR CESM M-configuration (AIO) ocean forcing
!=======================================================================

      subroutine ocn_data_ncar_init

! Reads NCAR pop ocean forcing data set 'pop_frc_gx1v3_010815.nc'
! 
! List of ocean forcing fields: Note that order is important!
! (order is determined by field list in vname).
! 
! For ocean mixed layer-----------------------------units 
! 
! 1  sst------temperature---------------------------(C)   
! 2  sss------salinity------------------------------(ppt) 
! 3  hbl------depth---------------------------------(m)   
! 4  u--------surface u current---------------------(m/s) 
! 5  v--------surface v current---------------------(m/s) 
! 6  dhdx-----surface tilt x direction--------------(m/m) 
! 7  dhdy-----surface tilt y direction--------------(m/m) 
! 8  qdp------ocean sub-mixed layer heat flux-------(W/m2)
!
! Fields 4, 5, 6, 7 are on the U-grid; 1, 2, 3, and 8 are
! on the T-grid.
 

dpath2o

Daniel Atwater
New Member
Update:

Attached is the header file of the BRAN data that now conforms structurally to subroutine ocn_data_ncar_init in cicecore/cicedynB/general/ice_forcing.F90, as evidenced by the attached log file. I did edit cicecore/cicedynB/general/ice_forcing.F90, though only slightly, here (line 3810)

Code:
      character(char_len) :: &
        vname(nfld) ! variable names to search for in file
      data vname /  &
           !           'T',      'S',      'hblt',  'U',     'V', &
           !           'dhdx',   'dhdy',   'qdp' /
           'sst',      'sss',      'hbl',  'u',     'v', &
           'dhdx',   'dhdy',   'qdp' /

and here (line 3859)
Code:
!          status = nf90_inq_dimid(fid,'nlon',dimid)
!          status = nf90_inq_dimid(fid,'ni',dimid)
          status = nf90_inq_dimid(fid,'nx',dimid)
          status = nf90_inquire_dimension(fid,dimid,len=nlon)
  
!          status = nf90_inq_dimid(fid,'nlat',dimid)
!          status = nf90_inq_dimid(fid,'nj',dimid)
          status = nf90_inq_dimid(fid,'ny',dimid)
          status = nf90_inquire_dimension(fid,dimid,len=nlat)
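Before rebuilding, it can be quicker to check the header of the forcing file against the names the edited routine looks for. A small Python sketch that parses `ncdump -h` style text; the sample header and its 1440 x 1080 dimensions are invented for illustration, while the variable and dimension names are the ones in the edited code above:

```python
import re

REQUIRED_VARS = ["sst", "sss", "hbl", "u", "v", "dhdx", "dhdy", "qdp"]
REQUIRED_DIMS = ["nx", "ny"]

def parse_header(text):
    """Return (dimension names, variable names) found in ncdump -h style text."""
    dims = set(re.findall(r"^\s*(\w+)\s*=\s*(?:\d+|UNLIMITED)", text, re.MULTILINE))
    names = set(re.findall(r"^\s*(?:double|float|int|short)\s+(\w+)\s*\(", text, re.MULTILINE))
    return dims, names

def missing_names(text):
    """List the required dimensions and variables that the header lacks."""
    dims, names = parse_header(text)
    return ([d for d in REQUIRED_DIMS if d not in dims],
            [v for v in REQUIRED_VARS if v not in names])

# Invented header, deliberately missing four of the eight fields.
sample = """
netcdf example {
dimensions:
    time = UNLIMITED ; // (31 currently)
    ny = 1080 ;
    nx = 1440 ;
variables:
    double sst(time, ny, nx) ;
    double sss(time, ny, nx) ;
    double u(time, ny, nx) ;
    double v(time, ny, nx) ;
}
"""
print(missing_names(sample))
```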

The attached log file is now crashing on the CICE-CESM provided initial condition file /Users/dpath2o/cice-dirs/input/CICE_data/ic/gx1/iced_gx1_v6.2005-01-01.nc .

I realise the grid of iced_gx1_v6.2005-01-01.nc (i.e. the 'gx1' grid) is not the same as the one that this particular case is run on, so I acknowledge it may be crashing because of a grid discrepancy; however, the run log does not appear to indicate this as the failure point.

Current questions:
1.) Why is it crashing on not being able to find variable 'aicen' when 'aicen' is in that file?
2.) Where is the reading of initial conditions performed?
3.) Can CICE have initial conditions on a different grid? I suspect not, as I recall reading that CICE does not perform any interpolation. So before I invest the time in gathering the initial conditions for January 2010 and re-gridding them onto the grid that I'm using for this case, it would be really good to have the previous question answered. Anyone's insight here is most appreciated. I want to emphasise this because the documentation has a dearth of information on the initial condition file structure required to run CICE. After a bit of searching, following the breadcrumbs from the attached log file, I'm coming up bupkis on where the initial condition file is being read in.
 

Attachments

  • BRAN_g0p25_2020.nc.header.txt
    1.1 KB · Views: 2
  • cice.runlog.220718-210210.txt
    42.5 KB · Views: 3

dpath2o

Daniel Atwater
New Member
New question.

Can someone tell me where in the documentation, or within the code itself, I can find the units and plain names of the variables that ice_restart_driver.F90 expects in the initial conditions netcdf file?

I have a comprehensive list of variables from the initial conditions that are provided by this community (see below), and while I can guess at what some, if not most, of these variables are, it would be helpful to have a list of their plain/long names and units.

netcdf iced_tx1_v5 {
dimensions:
ni = 360 ;
nj = 240 ;
ncat = 5 ;
variables:
double uvel(nj, ni) ;
double vvel(nj, ni) ;
double scale_factor(nj, ni) ;
double swvdr(nj, ni) ;
double swvdf(nj, ni) ;
double swidr(nj, ni) ;
double swidf(nj, ni) ;
double strocnxT(nj, ni) ;
double strocnyT(nj, ni) ;
double stressp_1(nj, ni) ;
double stressp_2(nj, ni) ;
double stressp_3(nj, ni) ;
double stressp_4(nj, ni) ;
double stressm_1(nj, ni) ;
double stressm_2(nj, ni) ;
double stressm_3(nj, ni) ;
double stressm_4(nj, ni) ;
double stress12_1(nj, ni) ;
double stress12_2(nj, ni) ;
double stress12_3(nj, ni) ;
double stress12_4(nj, ni) ;
double iceumask(nj, ni) ;
double sst(nj, ni) ;
double frzmlt(nj, ni) ;
double frz_onset(nj, ni) ;
double fsnow(nj, ni) ;
double ulon(nj, ni) ;
double ulat(nj, ni) ;
double tlon(nj, ni) ;
double tlat(nj, ni) ;
double aicen(ncat, nj, ni) ;
double vicen(ncat, nj, ni) ;
double vsnon(ncat, nj, ni) ;
double Tsfcn(ncat, nj, ni) ;
double iage(ncat, nj, ni) ;
double FY(ncat, nj, ni) ;
double alvl(ncat, nj, ni) ;
double vlvl(ncat, nj, ni) ;
double apnd(ncat, nj, ni) ;
double hpnd(ncat, nj, ni) ;
double ipnd(ncat, nj, ni) ;
double dhs(ncat, nj, ni) ;
double ffrac(ncat, nj, ni) ;
double fbrn(ncat, nj, ni) ;
double first_ice(ncat, nj, ni) ;
double sice001(ncat, nj, ni) ;
double qice001(ncat, nj, ni) ;
double sice002(ncat, nj, ni) ;
double qice002(ncat, nj, ni) ;
double sice003(ncat, nj, ni) ;
double qice003(ncat, nj, ni) ;
double sice004(ncat, nj, ni) ;
double qice004(ncat, nj, ni) ;
double sice005(ncat, nj, ni) ;
double qice005(ncat, nj, ni) ;
double sice006(ncat, nj, ni) ;
double qice006(ncat, nj, ni) ;
double sice007(ncat, nj, ni) ;
double qice007(ncat, nj, ni) ;
double qsno001(ncat, nj, ni) ;

// global attributes:
:istep1 = 175200 ;
:time = 630720000. ;
:time_forc = 630720000. ;
:nyr = 21 ;
:month = 1 ;
:mday = 1 ;
:sec = 0 ;
}
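The global attributes in this header are at least internally consistent, which gives a hint about how the file was written. A quick arithmetic check in Python, assuming a 365-day (no leap) calendar, which the header does not state:

```python
istep1 = 175200        # from the header
time_s = 630720000.0   # from the header, in seconds
nyr = 21               # from the header

dt = time_s / istep1                     # implied time step in seconds
years_elapsed = time_s / (365 * 86400)   # assuming 365-day years

print(dt, years_elapsed)
```

This gives a 3600 s time step and exactly 20 elapsed years, which would place the restart at the start of model year 21 (nyr = 21, month = 1, mday = 1).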
 

dpath2o

Daniel Atwater
New Member
I'm attempting to stand up CICE6 in stand-alone mode to conduct fast ice sensitivity studies in Antarctica and I'm hoping to get some help. Presently I'm trying to find a good reference for the 2D/3D restart fields, i.e. what their long names are, maybe a brief description, and their units, as I look to initialise CICE6 with SOSE ( Ocean State Estimation at Scripps Institution of Oceanography ). In ice_restart.F90 I found what appears to be the laundry list of fields. I also observe that your name appears as the author of those routines, which is why I'm seeking your input. Can you help me?
 

dbailey

CSEG and Liaisons
Staff member
I don't think we have documentation as such on the restart or initial files. We don't typically document the meta data for restart files, but these are the main prognostic variables in the model. There is a master list of the variables and parameters in the model here:


The names of the variables in the restart / initial files should be in this list. However, here is a shorter description:

uvel, vvel -> The sea ice velocity components (m/s)
scale_factor -> used for timestep offsets in the incoming shortwave field (non-dimensional)
swvdr, swvdf, swidr, swidf -> incoming shortwave components from the atmosphere (visible/infrared direct/diffuse) W/m^2
strocnxT, strocnyT -> The ice-ocean stress components (N/m?)
stressp_{1,2,3,4}, stressm_{1,2,3,4}, stress12_{1,2,3,4} -> The internal stress strain tensor components (N/m?)
iceumask -> The mask where there is ice in a gridcell (dimensionless)
sst -> sea surface temperature in degrees C
frzmlt -> the ice ocean heat exchange in W/m^2
frz_onset -> the Julian day of the onset of freezing
fsnow -> snow fall in kg / m^2 s
ulon, ulat, tlon, tlat -> the 2D lat / lon points at U and T points with B-grid staggering
aicen -> ice fraction in each of the subgridscale categories where n = 5
vicen -> the ice volume per unit area in each of the categories (m)
vsnon -> the snow volume per unit area in each of the categories (m)
Tsfcn -> surface snow/ice temperature in each of the categories
iage -> sea ice age
FY -> first year ice area
alvl, vlvl -> The fraction and volume of the level ice area
apnd, hpnd -> the fraction and depth of the ponds in a cell
ipnd -> other pond characteristics
dhs, ffrac, fbrn ... I'm actually not sure here. Think these are related to ponds.
first_ice - ?
sice??? -> This is the salinity profile in the sea ice (psu)
qice??? -> Internal sea ice enthalpy profile
qsno??? -> Internal snow enthalpy profile
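As a rough consistency check on a hand-built initial condition file, the category fields can be tested cell by cell. This Python sketch assumes aicen is a fractional area (so each category lies in [0, 1] and the categories sum to at most 1), that vicen and vsnon are non-negative, and that a category with no area should carry no volume. The function name and the sample numbers are illustrative only.

```python
def check_cell(aicen, vicen, vsnon, tol=1e-12):
    """Return a list of problems for one grid cell's per-category values."""
    problems = []
    if any(a < -tol or a > 1 + tol for a in aicen):
        problems.append("aicen outside [0, 1]")
    if sum(aicen) > 1 + tol:
        problems.append("sum(aicen) > 1")
    if any(v < -tol for v in vicen):
        problems.append("negative vicen")
    if any(v < -tol for v in vsnon):
        problems.append("negative vsnon")
    if any(a <= tol and (vi > tol or vs > tol)
           for a, vi, vs in zip(aicen, vicen, vsnon)):
        problems.append("ice or snow volume in a category with no area")
    return problems

# Five categories (ncat = 5), made-up values.
good = check_cell([0.1, 0.2, 0.3, 0.1, 0.0],
                  [0.05, 0.3, 0.9, 0.5, 0.0],
                  [0.01, 0.02, 0.03, 0.01, 0.0])
bad = check_cell([0.5, 0.4, 0.3, 0.0, 0.0],
                 [0.2, 0.5, 0.9, 0.1, 0.0],
                 [0.0] * 5)
print(good)
print(bad)
```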
 

dpath2o

Daniel Atwater
New Member
Thank you Sir.

Do you know how someone should deal with missing parameters? What I mean is that if I do not have any gridded information on 'alvl/vlvl', for example, do I omit this from the initial condition netcdf file that I provide in my 'ice_in', or do I need to create empty arrays/matrices for each field/parameter that I do not have information for? If I do need to create empty arrays/matrices, then do I fill them with NaN or is there a value that CICE6 flags as rubbish data (i.e. '99999' or something like that)?
 

dpath2o

Daniel Atwater
New Member
Hello! Is there anybody in there? I think I really would benefit from some help from others if there's anybody home.

Summary: It appears I have been able to get CICE6 reading my alternate forcing and initial condition datasets (described above in this thread), but just to be clear, attached are the headers of each of those. Note that the ERA5_01hr_forcing_tx0p25_2010.nc.txt is in fact named with the full path /Users/dpath2o/cice-dirs/input/CICE_data/forcing/0p25/hourly/8XDAILY/JRA55_03hr_forcing_tx1_2010.nc. The full path name is given to indicate that I could find no other mechanism to easily 'fool' CICE6 into reading an atmospheric forcing file that is NetCDF. Essentially I re-gridded the ERA5 fields and put them into this NetCDF file with a name that CICE6 requires (please see my attached ice_in file). Here is the python wrapper script ( AFIM/cice_era5_forcing.py at main · dpath2o/AFIM ) and module that I'm using ( AFIM/afim.py at main · dpath2o/AFIM ). Also, it may be important to note that there are a number of empty fields in the initial condition file, which was compiled using this python script ( AFIM/cice_initial_conditions.py at main · dpath2o/AFIM )

Problem: CICE6 aborts in ice_step_mod.F90 and I would like some help with way(s) forward past this abort. The `tail -1000` of the log file and the ice_in file are also attached.

Can someone please help me?

Thanks!
 

Attachments

  • ice_in.txt
    18.1 KB · Views: 9
  • ice.runlog.220809-164505.txt
    52.6 KB · Views: 5
  • iceh_ic.2010-01-01-03600.nc.txt
    3.1 KB · Views: 5
  • BRAN_g0p25_2010.nc.txt
    1.1 KB · Views: 1
  • ERA5_01hr_forcing_tx0p25_2010.nc.txt
    1.2 KB · Views: 4
  • cice6_ic_2010-01.nc.txt
    3.4 KB · Views: 3

david_hebert@nrlssc_navy_mil

David Hebert
New Member
At this point, the only piece of info we have is there is a problem with icepack_step_therm1. That is located in columnphysics/icepack_therm_vertical.F90. Toward the top of your runlog there might be info on the min/max of your forcing files. Can you see if those look reasonable? I also notice you have NaN for your FillValue, might want to change that to some large negative number like -9999.0.
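To illustrate both suggestions, here is a small Python sketch that swaps NaN for a large negative fill value and then reports the min/max of the remaining data. The fill value -9999.0 is the one suggested above; the function names and the sample values are made up.

```python
import math

FILL = -9999.0

def replace_nan(values, fill=FILL):
    """Return a copy of values with every NaN replaced by the fill value."""
    return [fill if math.isnan(v) else v for v in values]

def valid_range(values, fill=FILL):
    """Return (min, max) over entries that are not the fill value, or None."""
    valid = [v for v in values if v != fill]
    return (min(valid), max(valid)) if valid else None

# Made-up air temperatures (K) with two missing points.
tair = [271.3, float("nan"), 268.9, 273.1, float("nan")]
cleaned = replace_nan(tair)
print(cleaned)
print(valid_range(cleaned))
```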
 

dpath2o

Daniel Atwater
New Member
Thanks for the comments David. If you're curious to help further I've included the `head -1200` of the runlog that shows the min/max results for the initial conditions and forcing files (the headers of those *.nc files were provided in the previous post). Thank you for pointing out that NaN is not the preferred FillValue. I had asked this in a previous post in this thread. I'll re-compile the input files with that as the FillValue instead of NaN in the forcing files, as well as zero arrays in the initial condition file for fields for which I do not have information.
 

Attachments

  • cice.runlog.220810-072911.txt
    70.7 KB · Views: 6

dbailey

CSEG and Liaisons
Staff member
This is the key error:

(picard_nonconvergence)-------------------------------------
(picard_solver) picard_solver: Picard solver non-convergence
(icepack_warnings_setabort) T :file /Users/dpath2o/PHD/MODELS/src/CICE/icepack/columnphysics/icepack_therm_mushy.F90 :line 1382

So, the internal thermodynamic solution is not converging. This could be a timestep issue, or actually I see:

swvdr = 0.41607406735420227
swvdf = 0.94060909748077393
swidr = -25.805578231811523
swidf = 0.95200002193450928

You have negative incoming shortwave from the atmosphere. This should not be happening. Carefully check your forcing files, but I am guessing this is the root of the problem.
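A quick way to scan for this is to flag any shortwave component whose minimum is below zero. A Python sketch using the four values quoted above as a stand-in for full fields (the function name is hypothetical):

```python
def negative_fields(fields):
    """Return the names of fields whose minimum value is negative, sorted by name."""
    return sorted(name for name, values in fields.items() if min(values) < 0.0)

# The single values printed in the run log, one per shortwave component.
shortwave = {
    "swvdr": [0.41607406735420227],
    "swvdf": [0.94060909748077393],
    "swidr": [-25.805578231811523],
    "swidf": [0.95200002193450928],
}
print(negative_fields(shortwave))
```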
 

dpath2o

Daniel Atwater
New Member
Hello again,

I'm attempting to build CICE (and Icepack) on a machine that is not in the list of machines in configuration/scripts/machines ... So I created the attached files in the hope of 'defining' my new machine, but setup fails in the batch.csh script.
/home/581/da1339/src/CICE/Icepack/configuration/scripts/icepack.batch.csh ERROR: gadi unknown
/home/581/da1339/src/CICE/Icepack/afim_0p1/casescripts/icepack.run.setup.csh: ERROR icepack.batch.csh aborted
./icepack.setup: ERROR, icepack.run.setup.csh aborted

Does anyone see how I can get past this error?

Cheers,
Dan
 

Attachments

  • env.gadi_intel.txt
    565 bytes · Views: 2
  • Macros.gadi_intel.txt
    1.2 KB · Views: 3

david_hebert@nrlssc_navy_mil

David Hebert
New Member
Hi Dan,

When adding a new machine you'll need to add it to the icepack.batch.csh file. In that file are examples you can start with if you are running with a queuing system, and at the bottom are examples where no queuing system is needed. Depending on how you plan on running, I suggest starting with one of those examples and adding it to the else if statement. So, for example, if you are not using a queuing system, you can add this to the bottom:

else if (${ICE_MACHINE} =~ gadi*) then
cat >> ${jobfile} << EOFB
# nothing to do
EOFB


Hope that helps!
Thanks,
David
 

dpath2o

Daniel Atwater
New Member
Hi,

I'm curious if anyone has seen this error message (see below) and has any idea how to solve it.

Thanks in advance.

Cheers,
Dan

Atmospheric data files:
/scratch/jk72/da1339/cice-dirs/input/AFIM/forcing/0p1/JRA55/8XDAILY/JRA55_gx3_0
3hr_forcing_2005.nc
Set current forcing data year = 2005
(JRA55_data) reading forcing file 1st ts =
/scratch/jk72/da1339/cice-dirs/input/AFIM/forcing/0p1/JRA55/8XDAILY/JRA55_gx3_0
3hr_forcing_2005.nc

Finished writing ./history/iceh_ic.2005-01-01-03600.nc
[gadi-hmem-clx-0005:370771:0:371338] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[gadi-hmem-clx-0005:370773:0:371342] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[gadi-hmem-clx-0005:370775:0:371347] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14c14ad53000)
[gadi-hmem-clx-0005:370768:0:371335] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14582b43c000)
[gadi-hmem-clx-0005:370772:0:371341] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x1513b6415000)
[gadi-hmem-clx-0005:370769:0:371336] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14af7bef2000)
[gadi-hmem-clx-0005:370765:0:371328] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x149bddcbe000)
[gadi-hmem-clx-0005:370766:0:371330] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[gadi-hmem-clx-0005:370767:0:371332] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14d66fd1e000)
[gadi-hmem-clx-0005:370761:0:371329] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14656f59f000)
[gadi-hmem-clx-0005:370770:0:371337] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x1533110cd000)
[gadi-hmem-clx-0005:370762:0:371331] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14b154bcc000)
[gadi-hmem-clx-0005:370764:0:371334] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[gadi-hmem-clx-0005:370763:0:371333] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14f643d97000)
[gadi-hmem-clx-0005:370753:0:371321] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x15212a8b1000)
[gadi-hmem-clx-0005:370754:0:371320] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14720ea8c000)
[gadi-hmem-clx-0005:370755:0:371322] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x7ffdc0bd1bb8)
[gadi-hmem-clx-0005:370757:0:371324] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x7ffd484abd38)
[gadi-hmem-clx-0005:370758:0:371325] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x15279dfda000)
[gadi-hmem-clx-0005:370759:0:371326] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x150c44bf5000)
[gadi-hmem-clx-0005:370760:0:371327] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[gadi-hmem-clx-0005:370742:0:371309] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14e989366000)
[gadi-hmem-clx-0005:370745:0:371312] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[gadi-hmem-clx-0005:370752:0:371318] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[gadi-hmem-clx-0005:370743:0:371310] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x1462f5d87000)
[gadi-hmem-clx-0005:370751:0:371317] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14d14fa8e000)
[gadi-hmem-clx-0005:370750:0:371319] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14cae7c6d000)
[gadi-hmem-clx-0005:370744:0:371311] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14f9eaa51000)
[gadi-hmem-clx-0005:370733:0:371345] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14fdef40c000)
[gadi-hmem-clx-0005:370747:0:371313] Caught signal 0 ((null): (null)(null))
[gadi-hmem-clx-0005:370746:0:371314] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[gadi-hmem-clx-0005:370748:0:371315] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14572b81d000)
[gadi-hmem-clx-0005:370731:0:371343] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14e079da5000)
[gadi-hmem-clx-0005:370732:0:371339] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14dd17eca000)
[gadi-hmem-clx-0005:370734:0:371346] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x1455853b5000)
[gadi-hmem-clx-0005:370740:0:371355] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[gadi-hmem-clx-0005:370736:0:371351] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14e7f1c75000)
[gadi-hmem-clx-0005:370729:0:371340] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x149c45e0a000)
[gadi-hmem-clx-0005:370738:0:371353] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14f6491af000)
==== backtrace (tid: 371330) ====
0 0x0000000000012ce0 __funlockfile() :0
=================================
==== backtrace (tid: 371342) ====
0 0x0000000000012ce0 __funlockfile() :0
=================================
==== backtrace (tid: 371318) ====
0 0x0000000000012ce0 __funlockfile() :0
=================================
==== backtrace (tid: 371312) ====
0 0x0000000000012ce0 __funlockfile() :0
=================================
[gadi-hmem-clx-0005:370749:0:371316] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x14fddd45f000)
*** stack smashing detected ***: <unknown> terminated
forrtl: error (76): Abort trap signal
Image PC Routine Line Source
libpthread-2.28.s 000014FEFBC8BCE0 Unknown Unknown Unknown
libc-2.28.so 000014FEFB902A9F gsignal Unknown Unknown
libc-2.28.so 000014FEFB8D5E05 abort Unknown Unknown
libc-2.28.so 000014FEFB945037 Unknown Unknown Unknown
libc-2.28.so 000014FEFB9F32F5 Unknown Unknown Unknown
libc-2.28.so 000014FEFB9F32A8 Unknown Unknown Unknown
libucs.so.0.0.0 000014FEF4A6DB44 Unknown Unknown Unknown
libucs.so.0.0.0 000014FEF4A17A25 Unknown Unknown Unknown
libucs.so.0.0.0 000014FEF4A1C121 Unknown Unknown Unknown
libucs.so.0.0.0 000014FEF4A1C268 Unknown Unknown Unknown
libucs.so.0.0.0 000014FEF49F477B Unknown Unknown Unknown
libucs.so.0.0.0 000014FEF49BE041 Unknown Unknown Unknown
libucs.so.0.0.0 000014FEF49D1DA7 Unknown Unknown Unknown
libucs.so.0.0.0 000014FEF49C28EE Unknown Unknown Unknown
libucs.so.0.0.0 000014FEF49C31FF ucs_debug_backtra Unknown Unknown
libucs.so.0.0.0 000014FEF49C3717 ucs_handle_error Unknown Unknown
libucs.so.0.0.0 000014FEF49C39A1 Unknown Unknown Unknown
libucs.so.0.0.0 000014FEF49C3C58 Unknown Unknown Unknown
libpthread-2.28.s 000014FEFBC8BCE0 Unknown Unknown Unknown
cice 0000000000A60BEA Unknown Unknown Unknown
cice 00000000008B5251 ice_transport_rem 989 ice_transport_remap.F90
cice 00000000008A66AF ice_transport_rem 488 ice_transport_remap.F90
libiomp5.so 000014FEFBFF3B13 __kmp_invoke_micr Unknown Unknown
libiomp5.so 000014FEFBF63473 Unknown Unknown Unknown
libiomp5.so 000014FEFBF623B2 Unknown Unknown Unknown
libiomp5.so 000014FEFBFF4883 Unknown Unknown Unknown
libpthread-2.28.s 000014FEFBC811CF Unknown Unknown Unknown
libc-2.28.so 000014FEFB8EDDD3 clone Unknown Unknown
==== backtrace (tid: 371337) ====
0 0x0000000000012ce0 __funlockfile() :0
1 0x0000000000a60bea __intel_avx_rep_memset() ???:0
2 0x00000000008b5251 ice_transport_remap_mp_make_masks_() /home/581/da1339/src/CICE/cicecore/cicedynB/dynamics/ice_transport_remap.F90:989
3 0x00000000008a66af L_ice_transport_remap_mp_horizontal_remap__467__par_loop0_2_0() /home/581/da1339/src/CICE/cicecore/cicedynB/dynamics/ice_transport_remap.F90:488
4 0x000000000015ab13 __kmp_invoke_microtask() ???:0
5 0x00000000000ca473 __kmp_invoke_task_func() /nfs/site/proj/openmp/promo/20220623-fix-specACCEL/tmp/lin_32e-rtl_int_5_nor_dyn.rel.c0.s0.t1..h1.w1-fxilab153/../../src/kmp_runtime.cpp:7894
6 0x00000000000c93b2 __kmp_launch_thread() /nfs/site/proj/openmp/promo/20220623-fix-specACCEL/tmp/lin_32e-rtl_int_5_nor_dyn.rel.c0.s0.t1..h1.w1-fxilab153/../../src/kmp_runtime.cpp:6300
7 0x000000000015b883 _INTERNALd40edf92::__kmp_launch_worker() /nfs/site/proj/openmp/promo/20220623-fix-specACCEL/tmp/lin_32e-rtl_int_5_nor_dyn.rel.c0.s0.t1..h1.w1-fxilab153/../../src/z_Linux_util.cpp:532
8 0x00000000000081cf start_thread() ???:0
9 0x0000000000039dd3 __GI___clone() :0
=================================
 

dpath2o

Daniel Atwater
New Member
Hello,

Can someone help me get past a hurdle here? All my attempts are failing.

Specifically, the run aborts at line 710 of ice_history_write.F90, so I added a print statement in that file:
Code:
status = nf90_enddef(ncid)
print *,"STATUS", status

And cice.runlog reports:
(JRA55_data) reading forcing file 1st ts = /Volumes/ioa03/cice-dirs/input/AFIM/forcing/0p1//8XDAILY/JRA55_03hr_forcing_tx1_2005.nc
STATUS -62

(abort_ice)ABORTED:
(abort_ice) error = (ice_write_hist)ERROR in nf90_enddef

I think this is a size error.
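
For reference, status -62 in netcdf.h is NC_EVARSIZE ("One or more variable sizes violate format constraints"), which would be consistent with a size problem. A minimal sketch of the arithmetic follows, assuming the history file is written in the classic NetCDF format and that each history field is a single-precision 2D slab on the 2700 x 3600 grid used by the forcing file. The offset limit of roughly 2 GiB per record is taken from the NetCDF classic format limitations, and the helper names are hypothetical:

```python
# Rough size check against the NetCDF classic-format offset limit.
# Assumption: each history field is one single-precision 2D slab per time record.
# The helper names below are illustrative and not part of CICE or NetCDF.

CLASSIC_RECORD_OFFSET_LIMIT = 2**31 - 4  # bytes; approximate classic-format limit


def slab_bytes(nj, ni, itemsize=4):
    """Bytes needed for one 2D field of nj x ni values."""
    return nj * ni * itemsize


def slabs_below_limit(nj, ni, itemsize=4, limit=CLASSIC_RECORD_OFFSET_LIMIT):
    """Number of whole 2D slabs that fit below the offset limit in one record."""
    return limit // slab_bytes(nj, ni, itemsize)


if __name__ == "__main__":
    nj, ni = 2700, 3600  # grid size from the forcing file header
    print("bytes per 2D field:", slab_bytes(nj, ni))
    print("2D fields per record below the limit:", slabs_below_limit(nj, ni))
```

If the number of fields requested in the history namelist is larger than this count, the classic format cannot hold one record, and a large-file format (64-bit offset, CDF-5, or netCDF-4/HDF5) would be the usual way around it. How that is selected depends on the CICE version and is not shown here.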

In my cice.run file I have:
setenv OMP_STACKSIZE 64M
ulimit -s unlimited

I'm currently testing with a serial run, but the same happens with MPI.

I'm using NetCDF version 4.9.0 and ifort (or mpifort when threaded), and I'm also testing on macOS with conda-cice.

This is run on an HPC system with 2990 GB of memory allocated and on my Mac with 32 GB.

This file
JRA55_03hr_forcing_tx1_2005.nc
is *not* JRA55 data on a 1-degree tripole grid but rather hourly ERA5 data on a 1/10-degree grid.

While I do have the full year's worth in one file, I have currently hyperslabbed the first month of data for testing purposes.

That file's structure is:
netcdf JRA55_03hr_forcing_tx1_2005 {
dimensions:
    time = UNLIMITED ; // (721 currently)
    nj = 2700 ;
    ni = 3600 ;
variables:
    float airtmp(time, nj, ni) ;
        airtmp:_FillValue = -2.e+08f ;
        airtmp:long_name = "2 metre temperature" ;
        airtmp:units = "Kelvin" ;
        airtmp:coordinates = "LON LAT" ;
    float dlwsfc(time, nj, ni) ;
        dlwsfc:_FillValue = -2.e+08f ;
        dlwsfc:long_name = "Mean surface downward long-wave radiation flux" ;
        dlwsfc:units = "W m**-2" ;
        dlwsfc:coordinates = "LON LAT" ;
    float glbrad(time, nj, ni) ;
        glbrad:_FillValue = -2.e+08f ;
        glbrad:long_name = "Mean surface downward short-wave radiation flux" ;
        glbrad:units = "W m**-2" ;
        glbrad:coordinates = "LON LAT" ;
    float spchmd(time, nj, ni) ;
        spchmd:_FillValue = -2.e+08f ;
        spchmd:long_name = "specific humidity" ;
        spchmd:units = "kg/kg" ;
        spchmd:coordinates = "LON LAT" ;
    float ttlpcp(time, nj, ni) ;
        ttlpcp:_FillValue = -2.e+08f ;
        ttlpcp:long_name = "Mean total precipitation rate" ;
        ttlpcp:units = "kg m**-2 s**-1" ;
        ttlpcp:coordinates = "LON LAT" ;
    float wndewd(time, nj, ni) ;
        wndewd:_FillValue = -2.e+08f ;
        wndewd:long_name = "10 metre meridional wind component" ;
        wndewd:units = "m s**-1" ;
        wndewd:coordinates = "LON LAT" ;
    float wndnwd(time, nj, ni) ;
        wndnwd:_FillValue = -2.e+08f ;
        wndnwd:long_name = "10 metre zonal wind component" ;
        wndnwd:units = "m s**-1" ;
        wndnwd:coordinates = "LON LAT" ;
    double LON(nj, ni) ;
        LON:_FillValue = NaN ;
        LON:units = "degrees_east" ;
    double LAT(nj, ni) ;
        LAT:_FillValue = NaN ;
        LAT:units = "degrees_north" ;
    int64 time(time) ;
        time:units = "hours since 2005-01-01 00:00:00" ;
        time:calendar = "proleptic_gregorian" ;

// global attributes:
        :creation_date = "2022-11-30 13" ;
        :conventions = "CCSM data model domain description -- for CICE6 standalone \'JRA55\' atmosphere option" ;
        :title = "re-gridded ERA5 for CICE6 standalone ocean forcing" ;
        :source = "ERA5, https://doi.org/10.1002/qj.3803, " ;
        :comment = "source files found on gadi, /g/data/rt52/era5/single-levels/reanalysis" ;
        :note1 = "ERA5 documentation, ERA5: data documentation - Copernicus Knowledge Base - ECMWF Confluence Wiki" ;
        :note2 = "regridding weight file, /g/data/jk72/da1339/grids/weights/map_ERA5_access-om2_cice_0p1_bilinear.nc" ;
        :note3 = "re-gridded using ESMF_RegridGenWeights" ;
        :author = "Daniel P Atwater" ;
        :email = "daniel.atwater@utas.edu.au" ;
}
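
To put the header above in perspective, the uncompressed data volume can be worked out from the dimensions and types alone. This is only a sketch: it ignores the file header, chunking, and any compression, and the dictionary and function names are hypothetical.

```python
# Uncompressed data size implied by the ncdump header.
# Assumptions: no compression, header and chunking overhead ignored.
# Names below are illustrative only.

DIMS = {"time": 721, "nj": 2700, "ni": 3600}
ITEMSIZE = {"float": 4, "double": 8, "int64": 8}  # bytes per value

# (type, dimensions) for each variable listed in the header
VARIABLES = {
    "airtmp": ("float", ("time", "nj", "ni")),
    "dlwsfc": ("float", ("time", "nj", "ni")),
    "glbrad": ("float", ("time", "nj", "ni")),
    "spchmd": ("float", ("time", "nj", "ni")),
    "ttlpcp": ("float", ("time", "nj", "ni")),
    "wndewd": ("float", ("time", "nj", "ni")),
    "wndnwd": ("float", ("time", "nj", "ni")),
    "LON": ("double", ("nj", "ni")),
    "LAT": ("double", ("nj", "ni")),
    "time": ("int64", ("time",)),
}


def var_bytes(dtype, dims):
    """Bytes for one variable: item size times the product of its dimensions."""
    n = 1
    for d in dims:
        n *= DIMS[d]
    return n * ITEMSIZE[dtype]


def total_bytes(variables=VARIABLES):
    """Sum of uncompressed bytes over all variables."""
    return sum(var_bytes(t, d) for t, d in variables.values())


if __name__ == "__main__":
    for name, (dtype, dims) in VARIABLES.items():
        print(f"{name:8s} {var_bytes(dtype, dims) / 2**30:10.3f} GiB")
    print(f"total    {total_bytes() / 2**30:10.3f} GiB")
```

Each forcing field takes the same number of bytes per time step as a single-precision 2D history field on this grid, so the same per-field figure applies when estimating the size of one history record.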

Any advice or help is most greatly appreciated.

Thank you.
 