
Run is too slow and does not produce any error messages

Renji2021

RENJI
New Member
I built the model successfully with CESM 2.1.3; the compset is BHIST and the grid is f09_g17. I then went to my 'bld' directory and copied cesm.exe to the 'run' directory. Finally, I submitted the job to my HPC system with Slurm using sbatch. I did not change any settings; stop_n is 5. After waiting twelve hours, the job had produced no output files and no error messages. Here is my Slurm output:
Invalid PIO rearranger comm max pend req (comp2io), 0
Resetting PIO rearranger comm max pend req (comp2io) to 64
PIO rearranger options:
comm type = p2p
comm fcd = 2denable
max pend req (comp2io) = 0
enable_hs (comp2io) = T
enable_isend (comp2io) = F
max pend req (io2comp) = 64
enable_hs (io2comp) = F
enable_isend (io2comp) = T
(seq_comm_setcomm) init ID ( 1 GLOBAL ) pelist = 0 359 1 ( npes = 360) ( nthreads = 1)( suffix =)
(seq_comm_setcomm) init ID ( 2 CPL ) pelist = 0 319 1 ( npes = 320) ( nthreads = 1)( suffix =)
(seq_comm_setcomm) init ID ( 5 ATM ) pelist = 0 319 1 ( npes = 320) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 6 CPLATM ) join IDs = 2 5 ( npes = 320) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 3 ALLATMID ) join multiple comp IDs ( npes = 320) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 4 CPLALLATMID ) join IDs = 2 3 ( npes = 320) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 9 LND ) pelist = 0 159 1 ( npes = 160) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 10 CPLLND ) join IDs = 2 9 ( npes = 320) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 7 ALLLNDID ) join multiple comp IDs ( npes = 160) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 8 CPLALLLNDID ) join IDs = 2 7 ( npes = 320) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 13 ICE ) pelist = 160 319 1 ( npes = 160) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 14 CPLICE ) join IDs = 2 13 ( npes = 320) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 11 ALLICEID ) join multiple comp IDs ( npes = 160) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 12 CPLALLICEID ) join IDs = 2 11 ( npes = 320) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 17 OCN ) pelist = 320 359 1 ( npes = 40) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 18 CPLOCN ) join IDs = 2 17 ( npes = 360) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 15 ALLOCNID ) join multiple comp IDs ( npes = 40) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 16 CPLALLOCNID ) join IDs = 2 15 ( npes = 360) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 21 ROF ) pelist = 0 159 1 ( npes = 160) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 22 CPLROF ) join IDs = 2 21 ( npes = 320) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 19 ALLROFID ) join multiple comp IDs ( npes = 160) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 20 CPLALLROFID ) join IDs = 2 19 ( npes = 320) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 25 GLC ) pelist = 0 319 1 ( npes = 320) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 26 CPLGLC ) join IDs = 2 25 ( npes = 320) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 23 ALLGLCID ) join multiple comp IDs ( npes = 320) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 24 CPLALLGLCID ) join IDs = 2 23 ( npes = 320) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 29 WAV ) pelist = 0 319 1 ( npes = 320) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 30 CPLWAV ) join IDs = 2 29 ( npes = 320) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 27 ALLWAVID ) join multiple comp IDs ( npes = 320) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 28 CPLALLWAVID ) join IDs = 2 27 ( npes = 320) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 33 ESP ) pelist = 0 0 1 ( npes = 1) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 34 CPLESP ) join IDs = 2 33 ( npes = 320) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 31 ALLESPID ) join multiple comp IDs ( npes = 1) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 32 CPLALLESPID ) join IDs = 2 31 ( npes = 320) ( nthreads = 1)
(seq_comm_printcomms) 1 0 360 1 GLOBAL:
(seq_comm_printcomms) 2 0 320 1 CPL:
(seq_comm_printcomms) 3 0 320 1 ALLATMID:
(seq_comm_printcomms) 4 0 320 1 CPLALLATMID:
(seq_comm_printcomms) 5 0 320 1 ATM:
(seq_comm_printcomms) 6 0 320 1 CPLATM:
(seq_comm_printcomms) 7 0 160 1 ALLLNDID:
(seq_comm_printcomms) 8 0 320 1 CPLALLLNDID:
(seq_comm_printcomms) 9 0 160 1 LND:
(seq_comm_printcomms) 10 0 320 1 CPLLND:
(seq_comm_printcomms) 11 160 -1 1 ALLICEID:
(seq_comm_printcomms) 12 0 320 1 CPLALLICEID:
(seq_comm_printcomms) 13 160 -1 1 ICE:
(seq_comm_printcomms) 14 0 320 1 CPLICE:
(seq_comm_printcomms) 15 320 -1 1 ALLOCNID:
(seq_comm_printcomms) 16 0 360 1 CPLALLOCNID:
(seq_comm_printcomms) 17 320 -1 1 OCN:
(seq_comm_printcomms) 18 0 360 1 CPLOCN:
(seq_comm_printcomms) 19 0 160 1 ALLROFID:
(seq_comm_printcomms) 20 0 320 1 CPLALLROFID:
(seq_comm_printcomms) 21 0 160 1 ROF:
(seq_comm_printcomms) 22 0 320 1 CPLROF:
(seq_comm_printcomms) 23 0 320 1 ALLGLCID:
(seq_comm_printcomms) 24 0 320 1 CPLALLGLCID:
(seq_comm_printcomms) 25 0 320 1 GLC:
(seq_comm_printcomms) 26 0 320 1 CPLGLC:
(seq_comm_printcomms) 27 0 320 1 ALLWAVID:
(seq_comm_printcomms) 28 0 320 1 CPLALLWAVID:
(seq_comm_printcomms) 29 0 320 1 WAV:
(seq_comm_printcomms) 30 0 320 1 CPLWAV:
(seq_comm_printcomms) 31 0 1 1 ALLESPID:
(seq_comm_printcomms) 32 0 320 1 CPLALLESPID:
(seq_comm_printcomms) 33 0 1 1 ESP:
(seq_comm_printcomms) 34 0 320 1 CPLESP:
OCN : pio_numiotasks = 2
OCN : pio_stride = 20
OCN : pio_root = 1
OCN : pio_iotype = 5
OCN : pio_numiotasks = 2
OCN : pio_stride = 20
OCN : pio_rearranger = 1
OCN : pio_root = 1
OCN : pio_iotype = 5
ICE : pio_numiotasks = 4
ICE : pio_stride = 40
ICE : pio_root = 1
ICE : pio_iotype = 5
ICE : pio_numiotasks = 4
ICE : pio_stride = 40
ICE : pio_rearranger = 1
ICE : pio_root = 1
ICE : pio_iotype = 5
(t_initf) Read in prof_inparm namelist from: drv_in
(t_initf) Using profile_disable= F
(t_initf) profile_timer= 4
(t_initf) profile_depth_limit= 4
(t_initf) profile_detail_limit= 2
(t_initf) profile_barrier= F
(t_initf) profile_outpe_num= 1
(t_initf) profile_outpe_stride= 0
(t_initf) profile_single_file= F
(t_initf) profile_global_stats= T
(t_initf) profile_ovhd_measurement= F
(t_initf) profile_add_detail= F
(t_initf) profile_papi_enable= F
Any help is appreciated.
Thank you all for your help!
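When reading a log like the one above, it can help to summarize the task layout that the `seq_comm_setcomm` lines report and compare it with what the batch script requested. The following is a minimal sketch, not part of CESM or CIME: the function name `parse_layout` and the regular expression are assumptions based only on the line format shown in this post, and the sample text is three lines copied from the log.

```python
import re

# Hypothetical pattern, written against lines such as:
# (seq_comm_setcomm) init ID ( 5 ATM ) pelist = 0 319 1 ( npes = 320) ( nthreads = 1)( suffix =)
SETCOMM = re.compile(
    r"\(seq_comm_setcomm\)\s+init ID\s+\(\s*\d+\s+(\w+)\s*\)\s+"
    r"pelist\s*=\s*(\d+)\s+(\d+)\s+(\d+)\s+\(\s*npes\s*=\s*(\d+)\)"
)


def parse_layout(log_text):
    """Return {component: {first, last, stride, npes}} from seq_comm_setcomm lines."""
    layout = {}
    for line in log_text.splitlines():
        match = SETCOMM.search(line)
        if match:
            name, first, last, stride, npes = match.groups()
            layout[name] = {
                "first": int(first),
                "last": int(last),
                "stride": int(stride),
                "npes": int(npes),
            }
    return layout


# Three lines copied from the log in this post.
sample = """\
(seq_comm_setcomm) init ID ( 1 GLOBAL ) pelist = 0 359 1 ( npes = 360) ( nthreads = 1)( suffix =)
(seq_comm_setcomm) init ID ( 5 ATM ) pelist = 0 319 1 ( npes = 320) ( nthreads = 1)( suffix =)
(seq_comm_setcomm) init ID ( 17 OCN ) pelist = 320 359 1 ( npes = 40) ( nthreads = 1)( suffix =)
"""

layout = parse_layout(sample)
for name, info in layout.items():
    print(f"{name}: tasks {info['first']}-{info['last']} (npes = {info['npes']})")
```

In this log the GLOBAL communicator spans 360 tasks, with the ocean on tasks 320-359 and the other components sharing tasks 0-319. A hand-written submission script would have to reproduce that layout itself, which the provided case scripts are meant to handle.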
 

jedwards

CSEG and Liaisons
Staff member
Are you copying the cesm.exe to the run directory and then writing your own submission script instead of using the provided tools? We do not support that. Did you follow the porting instructions from the CIME users guide?
 

Renji2021

RENJI
New Member
Yes, I copied cesm.exe to the run directory and wrote my own sbatch script. I will follow the users guide and try again. Thanks for your reply!
 

ohmpawat

ohmpawat chen
Member
Hi, have you solved the problem? I have run into the same problem but don't know the solution. Could you tell me how you solved it? Thanks a lot!
 

Renji2021

RENJI
New Member
Hello, Chen. I solved the problem by changing my Slurm scripts. As I said, I went to my 'bld' directory and copied cesm.exe to the 'run' directory; that was the wrong approach. I suggest using case.submit and not copying cesm.exe. Also, I now rebuild my model every time I change anything in env_run.xml or any other setting.
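The sequence described here (set up, build, and submit with the case scripts rather than copying cesm.exe) can be sketched as a small helper. This is only an illustration under stated assumptions: `workflow_commands` and `run_workflow` are hypothetical names, the case directory path is a placeholder, and the script assumes it is pointed at a case directory already created with `create_newcase` on a ported machine. The commands it issues (`./case.setup`, `./xmlchange`, `./case.build`, `./case.submit`) are the standard case scripts found in a CESM 2.1 case directory.

```python
import subprocess
from pathlib import Path


def workflow_commands(stop_n=None):
    """Build the list of case-script commands, in the order they should run."""
    commands = [["./case.setup"]]
    if stop_n is not None:
        # STOP_N is the run-length variable the original post refers to as stop_n.
        commands.append(["./xmlchange", f"STOP_N={stop_n}"])
    commands.append(["./case.build"])
    commands.append(["./case.submit"])
    return commands


def run_workflow(case_dir, stop_n=None, dry_run=True):
    """Print each command; execute it inside case_dir only when dry_run is False."""
    for command in workflow_commands(stop_n):
        print(" ".join(command))
        if not dry_run:
            # check=True stops at the first failing step instead of continuing.
            subprocess.run(command, cwd=Path(case_dir), check=True)


# Dry run: only prints the commands. "/path/to/case" is a placeholder.
run_workflow("/path/to/case", stop_n=5, dry_run=True)
```

With `dry_run=False` the helper runs the same steps one would type by hand in the case directory. `case.submit` then generates and submits the batch job itself, so no hand-written sbatch script is involved.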
 

ohmpawat

ohmpawat chen
Member
Thanks for your reply!
 