Variable performance issue

Hi,
We are trying to get CCSM3 up and running on a new SGI Altix 3700 Bx2 cluster. All seems well, and the model appears to pass all the pre-designed tests. After some time spent load balancing (using many short 5-day runs), we decided on the best configuration. Now, on running a longer case, we find that the performance changes dramatically after the first few days. Below are the dt times from a sample cpl.log. We've run this case a second time with almost identical results.
Any ideas why this might be happening? Would you expect a more consistent dt? I suspect that this is some problem on our system, but it would be good to know if anyone has seen this kind of behaviour before.

Thanks
Alex
(tStamp_write) cpl model date 0001-01-01 00000s wall clock 2005-06-08 18:29:51 avg dt 0s dt 0s
(tStamp_write) cpl model date 0001-01-02 00000s wall clock 2005-06-08 18:32:01 avg dt 129s dt 129s
(tStamp_write) cpl model date 0001-01-03 00000s wall clock 2005-06-08 18:34:14 avg dt 131s dt 133s
(tStamp_write) cpl model date 0001-01-04 00000s wall clock 2005-06-08 18:36:27 avg dt 132s dt 133s
(tStamp_write) cpl model date 0001-01-05 00000s wall clock 2005-06-08 18:38:41 avg dt 132s dt 133s
(tStamp_write) cpl model date 0001-01-06 00000s wall clock 2005-06-08 18:40:53 avg dt 132s dt 133s
(tStamp_write) cpl model date 0001-01-07 00000s wall clock 2005-06-08 18:43:06 avg dt 132s dt 133s
(tStamp_write) cpl model date 0001-01-08 00000s wall clock 2005-06-08 18:45:31 avg dt 134s dt 145s
(tStamp_write) cpl model date 0001-01-09 00000s wall clock 2005-06-08 18:48:12 avg dt 138s dt 161s
(tStamp_write) cpl model date 0001-01-10 00000s wall clock 2005-06-08 18:51:33 avg dt 145s dt 201s
(tStamp_write) cpl model date 0001-01-11 00000s wall clock 2005-06-08 18:55:43 avg dt 155s dt 250s
(tStamp_write) cpl model date 0001-01-12 00000s wall clock 2005-06-08 19:00:27 avg dt 167s dt 284s
(tStamp_write) cpl model date 0001-01-13 00000s wall clock 2005-06-08 19:05:27 avg dt 178s dt 301s
(tStamp_write) cpl model date 0001-01-14 00000s wall clock 2005-06-08 19:10:51 avg dt 189s dt 323s
(tStamp_write) cpl model date 0001-01-15 00000s wall clock 2005-06-08 19:16:27 avg dt 200s dt 336s
(tStamp_write) cpl model date 0001-01-16 00000s wall clock 2005-06-08 19:22:09 avg dt 209s dt 342s
(tStamp_write) cpl model date 0001-01-17 00000s wall clock 2005-06-08 19:28:05 avg dt 218s dt 356s
(tStamp_write) cpl model date 0001-01-18 00000s wall clock 2005-06-08 19:34:11 avg dt 227s dt 366s
(tStamp_write) cpl model date 0001-01-19 00000s wall clock 2005-06-08 19:40:18 avg dt 235s dt 368s
(tStamp_write) cpl model date 0001-01-20 00000s wall clock 2005-06-08 19:46:29 avg dt 242s dt 371s
(tStamp_write) cpl model date 0001-01-21 00000s wall clock 2005-06-08 19:52:41 avg dt 248s dt 372s
(tStamp_write) cpl model date 0001-01-22 00000s wall clock 2005-06-08 19:58:47 avg dt 254s dt 367s
(tStamp_write) cpl model date 0001-01-23 00000s wall clock 2005-06-08 20:04:48 avg dt 259s dt 361s
(tStamp_write) cpl model date 0001-01-24 00000s wall clock 2005-06-08 20:10:59 avg dt 264s dt 370s
(tStamp_write) cpl model date 0001-01-25 00000s wall clock 2005-06-08 20:17:23 avg dt 269s dt 385s
(tStamp_write) cpl model date 0001-01-26 00000s wall clock 2005-06-08 20:24:12 avg dt 274s dt 409s
(tStamp_write) cpl model date 0001-01-27 00000s wall clock 2005-06-08 20:31:27 avg dt 281s dt 435s
(tStamp_write) cpl model date 0001-01-28 00000s wall clock 2005-06-08 20:38:51 avg dt 287s dt 444s
(tStamp_write) cpl model date 0001-01-29 00000s wall clock 2005-06-08 20:46:12 avg dt 292s dt 441s
(tStamp_write) cpl model date 0001-01-30 00000s wall clock 2005-06-08 20:53:32 avg dt 297s dt 440s
(tStamp_write) cpl model date 0001-01-31 00000s wall clock 2005-06-08 21:00:48 avg dt 302s dt 435s
(tStamp_write) cpl model date 0001-02-01 00000s wall clock 2005-06-08 21:07:56 avg dt 306s dt 428s
(tStamp_write) cpl model date 0001-02-02 00000s wall clock 2005-06-08 21:14:58 avg dt 310s dt 423s
(tStamp_write) cpl model date 0001-02-03 00000s wall clock 2005-06-08 21:22:01 avg dt 313s dt 423s
(tStamp_write) cpl model date 0001-02-04 00000s wall clock 2005-06-08 21:29:05 avg dt 316s dt 424s
(tStamp_write) cpl model date 0001-02-05 00000s wall clock 2005-06-08 21:36:04 avg dt 319s dt 419s
(tStamp_write) cpl model date 0001-02-06 00000s wall clock 2005-06-08 21:43:00 avg dt 322s dt 416s
(tStamp_write) cpl model date 0001-02-07 00000s wall clock 2005-06-08 21:49:53 avg dt 324s dt 413s
(tStamp_write) cpl model date 0001-02-08 00000s wall clock 2005-06-08 21:56:35 avg dt 326s dt 402s
(tStamp_write) cpl model date 0001-02-09 00000s wall clock 2005-06-08 22:03:16 avg dt 328s dt 401s
(tStamp_write) cpl model date 0001-02-10 00000s wall clock 2005-06-08 22:10:01 avg dt 330s dt 405s
(tStamp_write) cpl model date 0001-02-11 00000s wall clock 2005-06-08 22:16:45 avg dt 332s dt 404s
(tStamp_write) cpl model date 0001-02-12 00000s wall clock 2005-06-08 22:23:24 avg dt 334s dt 399s
(tStamp_write) cpl model date 0001-02-13 00000s wall clock 2005-06-08 22:29:59 avg dt 335s dt 395s
(tStamp_write) cpl model date 0001-02-14 00000s wall clock 2005-06-08 22:36:30 avg dt 336s dt 391s
(tStamp_write) cpl model date 0001-02-15 00000s wall clock 2005-06-08 22:42:59 avg dt 338s dt 389s
(tStamp_write) cpl model date 0001-02-16 00000s wall clock 2005-06-08 22:49:27 avg dt 339s dt 388s
(tStamp_write) cpl model date 0001-02-17 00000s wall clock 2005-06-08 22:55:54 avg dt 340s dt 387s
(tStamp_write) cpl model date 0001-02-18 00000s wall clock 2005-06-08 23:02:21 avg dt 341s dt 387s
(tStamp_write) cpl model date 0001-02-19 00000s wall clock 2005-06-08 23:08:49 avg dt 342s dt 388s
(tStamp_write) cpl model date 0001-02-20 00000s wall clock 2005-06-08 23:15:18 avg dt 343s dt 389s
(tStamp_write) cpl model date 0001-02-21 00000s wall clock 2005-06-08 23:21:47 avg dt 343s dt 389s
(tStamp_write) cpl model date 0001-02-22 00000s wall clock 2005-06-08 23:28:17 avg dt 344s dt 390s
(tStamp_write) cpl model date 0001-02-23 00000s wall clock 2005-06-08 23:34:44 avg dt 345s dt 387s
(tStamp_write) cpl model date 0001-02-24 00000s wall clock 2005-06-08 23:41:07 avg dt 346s dt 384s
(tStamp_write) cpl model date 0001-02-25 00000s wall clock 2005-06-08 23:47:28 avg dt 346s dt 381s
(tStamp_write) cpl model date 0001-02-26 00000s wall clock 2005-06-08 23:53:47 avg dt 347s dt 380s
(tStamp_write) cpl model date 0001-02-27 00000s wall clock 2005-06-09 00:00:02 avg dt 348s dt 375s
(tStamp_write) cpl model date 0001-02-28 00000s wall clock 2005-06-09 00:06:13 avg dt 348s dt 371s
(tStamp_write) cpl model date 0001-03-01 00000s wall clock 2005-06-09 00:12:27 avg dt 348s dt 374s
(tStamp_write) cpl model date 0001-03-02 00000s wall clock 2005-06-09 00:18:46 avg dt 349s dt 379s
(tStamp_write) cpl model date 0001-03-03 00000s wall clock 2005-06-09 00:25:06 avg dt 349s dt 380s
(tStamp_write) cpl model date 0001-03-04 00000s wall clock 2005-06-09 00:31:25 avg dt 350s dt 379s
(tStamp_write) cpl model date 0001-03-05 00000s wall clock 2005-06-09 00:37:42 avg dt 350s dt 377s
(tStamp_write) cpl model date 0001-03-06 00000s wall clock 2005-06-09 00:43:58 avg dt 351s dt 376s
(tStamp_write) cpl model date 0001-03-07 00000s wall clock 2005-06-09 00:50:12 avg dt 351s dt 374s
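
For anyone wanting to compare their own runs against the log above, here is a minimal Python sketch that pulls the per-day dt out of the (tStamp_write) lines and flags the model days that run much slower than the first few. The function names, the 5-day baseline, and the 1.5x threshold are my own assumptions for illustration; they are not part of CCSM.

```python
import re
from datetime import datetime

# Matches the (tStamp_write) lines in the format shown in the log above.
STAMP = re.compile(
    r"\(tStamp_write\)\s+cpl\s+model\s+date\s+(\d{4}-\d{2}-\d{2})\s+(\d+)s\s+"
    r"wall\s+clock\s+(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2})\s+"
    r"avg\s+dt\s+(\d+)s\s+dt\s+(\d+)s"
)


def parse_stamps(text):
    """Return a list of (model_date, wall_clock, dt_seconds) tuples."""
    rows = []
    for m in STAMP.finditer(text):
        wall = datetime.strptime(m.group(3) + " " + m.group(4), "%Y-%m-%d %H:%M:%S")
        rows.append((m.group(1), wall, int(m.group(6))))
    return rows


def slow_days(rows, baseline_days=5, factor=1.5):
    """Flag model days whose dt exceeds factor * the early-run mean dt.

    The first row is skipped because its dt is 0 (start of the run).
    baseline_days and factor are arbitrary choices, not CCSM settings.
    """
    base = [dt for _, _, dt in rows[1:1 + baseline_days]]
    baseline = sum(base) / len(base)
    flagged = [(day, dt) for day, _, dt in rows[1 + baseline_days:]
               if dt > factor * baseline]
    return baseline, flagged
```

To use it, read the whole cpl.log into a string, pass it to parse_stamps, and hand the result to slow_days. It only reports where the slowdown starts; it says nothing about the cause.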
 

njn01

Alex,

You might try looking at the thread "Experiences porting CCSM to LLNL IA64 'Thunder' cluster" posted by Art Mirim. You'll find it in the Archives, in the category "CCSM porting to unsupported machines". Art gave a detailed summary of his experiences, including a problem with increasingly slow timesteps, which was traced to an underflow in the ice model. I'm not sure this is the same problem you're experiencing, but I think it's worth looking at Art's post.
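
If you want a quick way to test the underflow idea, here is a rough Python sketch that counts subnormal (denormalized) values in a list of numbers. Subnormals are what gradual underflow produces, and some processors handle them far more slowly than normal floating-point values. Whether that is what is happening on your Altix is only a guess on my part, and the sample values below are made up; you would need to dump real field values from the model yourself.

```python
import sys


def is_subnormal(x):
    """True if x is nonzero but smaller in magnitude than the smallest normal double."""
    return 0.0 < abs(x) < sys.float_info.min


def count_subnormals(values):
    """Count how many entries in an iterable of floats are subnormal."""
    return sum(1 for v in values if is_subnormal(v))


# Hypothetical values standing in for a dumped model field.
sample = [1.0, 0.0, 1e-310, -3e-320, 2.5e-308]
print(count_subnormals(sample))
```

A count that grows as the run progresses would be consistent with the kind of underflow problem Art described, but it would not prove it.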