very slow CICE running on Derecho

Pengfei Zhang
New Member
Hello,

I'm re-running a previous stand-alone CICE experiment, forced by JRA input data, on Derecho. The original experiment was run on Cheyenne, and the setup is unchanged. However, the experiment runs very slowly on Derecho: one model year can take up to 6 hours, and the run time does not change regardless of the number of nodes used. I would appreciate any suggestions.

Thanks,

Phil
 

Pengfei Zhang
New Member
jedwards wrote: Hi Phil - can you point me to a case directory please?

Hi Jedwards,

Thank you for your reply. Please see the case running on Derecho at: /glade/u/home/zpengfei/cases/cice5.alone/
This case is based on CESM2.1.5. I have also tried CESM2.1.3, and the run speed is no different.

The old experiment was run on Cheyenne with CESM2.1.3: /glade/u/home/zpengfei/cases/cice5.ctrl.chn/

Thanks,
 

jedwards

CSEG and Liaisons
Staff member
cd: /glade/u/home/zpengfei/cases/cice5.alone/: Permission denied

Please change permissions.
 

Pengfei Zhang
New Member
dbailey wrote: Are you running the consortium version of the model or a DTEST case in CESM?

Hi Dave,
Thank you for your reply. I used the publicly released version of CESM2.1.3 and CESM2.1.5 to create the CICE cases:

--compset 2000_DATM%JRA_SLND_CICE_DOCN%SOM_DROF%IAF_SGLC_SWAV

Please see more details in my case folders in the above reply to Jedwards.
 

jedwards

CSEG and Liaisons
Staff member
The pelayout that you have is not ideal.
Comp  NTASKS  NTHRDS  ROOTPE
CPL :     64/     2;      0
ATM :     64/     2;      0
LND :     64/     2;      0
ICE :     64/     2;      0
OCN :     64/     2;      0
ROF :     64/     2;      0
GLC :     64/     2;      0
WAV :     64/     2;      0
ESP :     64/     2;      0

Unless there is a compelling reason to use threading, I would set NTHRDS=1 and NTASKS=256.
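
As a rough check on what the two layouts request, the sketch below multiplies tasks by threads and converts the result to a node count. It assumes 128 cores per Derecho node and one core per thread; the function names `cores_needed` and `nodes_needed` are illustrative and not part of CESM or CIME.

```shell
#!/bin/sh
# Assumption: 128 cores per Derecho node, one core per thread of each MPI task.
CORES_PER_NODE=128

# cores_needed NTASKS NTHRDS -> total cores requested
cores_needed() {
    echo $(( $1 * $2 ))
}

# nodes_needed NTASKS NTHRDS -> whole nodes, rounded up
nodes_needed() {
    cores=$(cores_needed "$1" "$2")
    echo $(( (cores + CORES_PER_NODE - 1) / CORES_PER_NODE ))
}

echo "64 tasks x 2 threads: $(cores_needed 64 2) cores, $(nodes_needed 64 2) node(s)"
echo "256 tasks x 1 thread: $(cores_needed 256 1) cores, $(nodes_needed 256 1) node(s)"
```

In a case directory, the layout itself is typically changed with `./xmlchange NTASKS=256,NTHRDS=1`, followed by `./case.setup --reset` and a rebuild.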
 

Pengfei Zhang
New Member
Over the last two days I have tried NTHRDS=2 and NTASKS=512, and it made no difference. I will try NTHRDS=1 and NTASKS=256 and let you know.
 

jedwards

CSEG and Liaisons
Staff member
ls: cannot access '/glade/work/zpengfei/bakup/bitz/cesm2_free_ens30.pop_frc.bc.15.1x1d.090130.nc': Permission denied
 

Pengfei Zhang
New Member
I have tested NTHRDS=1 and NTASKS=256, and it didn't work. The run failed with this error:
ice: no. blocks exceed max: increase max to 2
The testing case directory is /glade/u/home/zpengfei/cases/cice5.test

Please see details in the run directory at: /glade/derecho/scratch/zpengfei/cesm_run/cice5.test/


NTHRDS=2 and NTASKS=256 works, but it runs as slowly as NTHRDS=2 and NTASKS=64.


Regarding the "Permission denied" error on /glade/work/zpengfei/bakup/bitz/cesm2_free_ens30.pop_frc.bc.15.1x1d.090130.nc: this file sets the slab ocean conditions. I did not see any error messages related to it. I have put a copy at /glade/derecho/scratch/zpengfei/cesm_run/cice5.test/ for your testing.

Many thanks,
 

dbailey

CSEG and Liaisons
Staff member
You will also need to change the CICE block layout as we don't have good block / processor layouts in CESM2.1. Assuming this is the gx1v6 grid:

Check the values for CICE_BLCKX, CICE_BLCKY, CICE_MXBLCKS, and CICE_DECOMPTYPE. I see you are getting the "tall skinny" blocks. These don't work so well on Derecho.

Edit env_build.xml and set CICE_AUTO_DECOMP=false

I am having trouble doing this with xmlchange, so it is likely better to edit env_build.xml.

./xmlchange CICE_BLCKX=16
./xmlchange CICE_BLCKY=16
./xmlchange CICE_MXBLCKS=2
./xmlchange CICE_DECOMPTYPE=sectrobin

./case.build --clean-all
./case.build

This should work much better.

Dave
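
The CICE_MXBLCKS value above can be sanity-checked with a little arithmetic. The sketch below assumes the gx1 grid is 320 x 384 points and that blocks are spread evenly over the ICE tasks; `blocks_total` and `max_blocks_per_task` are illustrative helper names, not CICE or CIME tools.

```shell
#!/bin/sh
# Assumption: gx1 global grid dimensions (points in x and y).
NX=320
NY=384

# blocks_total BLCKX BLCKY -> number of blocks covering the grid
# (ceiling division in each direction)
blocks_total() {
    nbx=$(( (NX + $1 - 1) / $1 ))
    nby=$(( (NY + $2 - 1) / $2 ))
    echo $(( nbx * nby ))
}

# max_blocks_per_task BLCKX BLCKY NTASKS -> blocks on the busiest task,
# assuming the blocks are divided as evenly as possible
max_blocks_per_task() {
    total=$(blocks_total "$1" "$2")
    echo $(( (total + $3 - 1) / $3 ))
}

echo "16x16 blocks: $(blocks_total 16 16) total"
echo "on 256 tasks: at most $(max_blocks_per_task 16 16 256) per task"
```

Under these assumptions, 16 x 16 blocks give 480 blocks in total, or at most 2 per task on 256 tasks, which is consistent with CICE_MXBLCKS=2.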
 

Pengfei Zhang
New Member
Hi Dave,

Thank you very much. Your suggestions solved this issue. I appreciate your and Jedwards' help.

Best,
Pengfei
 