What version of the code are you using?
CESM2.1.3
Have you made any changes to files in the source tree?
No code changes. I changed the stop condition to 5 days, although judging by the output the error occurs before the first timestep. I set NTASKS_WAV=600, following advice in another thread on the forum. The run uses 9 nodes with 128 tasks per node.
Describe every step you took leading up to the problem:
$CIMEROOT/scripts/create_newcase --case $CESM_ROOT/runs/b.e21.B1850G.f09_g17_gl4.CMIP6-historical-withism.001.profile_io_2 --compset B1850G --res f09_g17_gl4 --project d446 --run-unsupported
./check_input_data --download
./xmlchange NTASKS_WAV=600
./xmlchange DEBUG=TRUE
./case.setup
./case.build
./case.submit
If this is a port to a new machine: Please attach any files you added or changed for the machine port (e.g., config_compilers.xml, config_machines.xml, and config_batch.xml) and tell us the compiler version you are using on this machine.
Please attach any log files showing error messages or other useful information.
The run is on ARCHER2. I gather CESM isn't usually run there, but it has been ported previously (see "Quick Start: CESM Model Workflow (CESM 2.1.3)" in the ARCHER2 User Documentation). I turned on debugging symbols and loaded the Linaro Forge module for profiling, although this run wasn't profiled. I'm using the GCC compiler and Cray MPI.
Describe your problem or question:
Hi,
I'm new to CESM and am having trouble running a B1850G compset (I have also tried B1850) with the f09_g17_gl4 grid. The f19_g17 grid worked fine, but with f09_g17_gl4 I get either segfaults or array out-of-bounds errors in w3initmd.f90, depending on whether DEBUG=TRUE is set. I haven't modified the code at all. With DEBUG=TRUE, the most proximal error is:
At line 1902 of file /mnt/lustre/a2fs-work4/work/d446/d446/ab_continents/cesm/CESM2.1.3/my_cesm_sandbox/components/ww3/src/source/w3initmd.f90
Fortran runtime error: Index '1201' of dimension 1 of array 'irqrs' above upper bound of 1200
There is also a lot of output suggesting "no dedicated output process, any file system".
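Because the log is dominated by the repeated "no dedicated output process" lines, it can help to filter it down to the bounds-check failures only. The following is a minimal sketch, assuming the message format matches the gfortran runtime error quoted above; the function name `summarise_bounds_errors` is made up for illustration, and any log file you pipe into it is up to you.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and print one line per
# distinct gfortran bounds-check failure (array, dimension, index, bound).
# Assumes the message format shown in the error quoted above.
summarise_bounds_errors() {
    sed -nE "s/.*Index '(-?[0-9]+)' of dimension ([0-9]+) of array '([^']+)' (above upper|below lower) bound of (-?[0-9]+).*/array=\3 dim=\2 index=\1 bound=\5/p" | sort -u
}

# Demo using the exact error line from this post, mixed with an unrelated line.
printf '%s\n' \
    "some unrelated log line" \
    "Fortran runtime error: Index '1201' of dimension 1 of array 'irqrs' above upper bound of 1200" \
    | summarise_bounds_errors
```

Piping a real log through the function (for example with `cat <your log file> | summarise_bounds_errors`) would show whether every failing rank reports the same array and bound or whether several arrays are involved.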
On a slightly unrelated note, this work is part of a grant to investigate data movement in climate codes. We figured a good starting point would be to investigate IO performance in CESM. A profile of a short run (100 days) on the f19_g17 grid, with output snapshots every 10 days, showed that IO was a very small portion of the runtime, which is why I'm trying a finer grid. Any suggestions for other setups that might be interesting to investigate, or for where IO/data movement might be an issue?
Thank you
Alexei