An issue with modifying radiation.F90 (to modify cloud radiative effects)

Hi,

I am making some simple changes in radiation.F90 to modify the cloud radiative effects. The changes basically override one variable with another (after both are computed), for example, "fsns = fsnsc". However, the model fails with these changes (it runs successfully without them). The model configuration is CAM4 coupled with an aqua-planet slab ocean, with all other components in stub mode. The code base is CESM1.2.0 on Yellowstone.

I found some error messages (below) in the cesm log file (no errors in other log files).
INFO: 0031-251 task 3 exited: rc=-11
INFO: 0031-251 task 6 exited: rc=-11
INFO: 0031-251 task 64 exited: rc=-11
INFO: 0031-251 task 85 exited: rc=-11
112:forrtl: error (78): process killed (SIGTERM)
112:Image PC Routine Line Source
112:libpthread.so.0 00002B89C25252A5 Unknown Unknown Unknown
112:libpoe.so 00002B89C750EAE2 Unknown Unknown Unknown
112:libpthread.so.0 00002B89C251D851 Unknown Unknown Unknown
112:libc.so.6 00002B89C4AD190D Unknown Unknown Unknown
112:INFO: 0031-306 pm_atexit: pm_exit_value is 1.
......
INFO: 0031-251 task 106 exited: rc=1
INFO: 0031-251 task 113 exited: rc=1
INFO: 0031-251 task 111 exited: rc=1
INFO: 0031-251 task 107 exited: rc=1
156:forrtl: error (78): process killed (SIGTERM)
156:Image PC Routine Line Source
156:libpthread.so.0 00002B5AFCDF62A5 Unknown Unknown Unknown
156:libpoe.so 00002B5B01DDFAE2 Unknown Unknown Unknown
156:libpthread.so.0 00002B5AFCDEE851 Unknown Unknown Unknown
156:libc.so.6 00002B5AFF3A290D Unknown Unknown Unknown
156:INFO: 0031-306 pm_atexit: pm_exit_value is 1.


I also found some core_lite files in the run directory, which all seem to be similar. For example:
Thread 13 (Thread 0x2aae1ab56700 (LWP 11551)):
#0 0x00002aae1778cf03 in epoll_wait () from /lib64/libc.so.6
#1 0x00002aae1a1caea6 in poe_exiting_thread () from /opt/ibmhpc/pe1307/base/intel/lib64/libpoe.so
#2 0x00002aae151d8851 in start_thread () from /lib64/libpthread.so.0
#3 0x00002aae1778c90d in clone () from /lib64/libc.so.6
Thread 11 (Thread 0x2b3bdd283700 (LWP 18438)):
#0 0x00002b3bd9280f74 in tbk_get_pc_info () from /ncar/opt/intel/2013/composer_xe_2013.1.117/compiler/lib/intel64/libirc.so
#1 0x00002b3bd92806eb in stackwalk_cb () from /ncar/opt/intel/2013/composer_xe_2013.1.117/compiler/lib/intel64/libirc.so
#2 0x00002b3bd9281a65 in tbk_trace_stack () from /ncar/opt/intel/2013/composer_xe_2013.1.117/compiler/lib/intel64/libirc.so
#3 0x00002b3bd92804b6 in tbk_string_stack_signal () from /ncar/opt/intel/2013/composer_xe_2013.1.117/compiler/lib/intel64/libirc.so
#4 0x000000000127eb52 in tbk_stack_trace ()
#5 0x00000000011ff48c in for__issue_diagnostic ()
#6 0x0000000001208e03 in for__signal_handler ()


Can anyone help with this issue?
Many thanks.
Honghai
 

santos

Member
On Yellowstone, the useful part of a core_lite file is actually the very bottom. Can you attach the whole thing?
 
Below is the whole information from one of the 12 core_lite files generated. Others are not the same, but similar. Please let me know if you need to see others or go to the run directory. Thanks a lot!

Thread 13 (Thread 0x2aae1ab56700 (LWP 11551)):
#0  0x00002aae1778cf03 in epoll_wait () from /lib64/libc.so.6
#1  0x00002aae1a1caea6 in poe_exiting_thread () from /opt/ibmhpc/pe1307/base/intel/lib64/libpoe.so
#2  0x00002aae151d8851 in start_thread () from /lib64/libpthread.so.0
#3  0x00002aae1778c90d in clone () from /lib64/libc.so.6
Thread 12 (Thread 0x2aae1ad57700 (LWP 11552)):
#0  0x00002aae1778cf03 in epoll_wait () from /lib64/libc.so.6
#1  0x00002aae1a1c8af3 in pm_child_sig_thread () from /opt/ibmhpc/pe1307/base/intel/lib64/libpoe.so
#2  0x00002aae151d8851 in start_thread () from /lib64/libpthread.so.0
#3  0x00002aae1778c90d in clone () from /lib64/libc.so.6
Thread 11 (Thread 0x2aae1af58700 (LWP 11553)):
#0  0x00002aae151d90ad in pthread_join () from /lib64/libpthread.so.0
#1  0x00002aae1741f02b in __kmp_reap_monitor (th=0x2aae80cd09d0) at ../../src/z_Linux_util.c:1478
#2  0x00002aae173ff0b1 in __kmp_internal_end () at ../../src/kmp_runtime.c:7293
#3  0x00002aae17401ee8 in __kmp_internal_end_library (gtid_req=-2134046256) at ../../src/kmp_runtime.c:7464
#4  0x00002aae17403cbe in __kmp_internal_end_atexit () at ../../src/kmp_runtime.c:7078
#5  0x00002aae173ff3c9 in __kmp_internal_end_fini () at ../../src/kmp_runtime.c:7045
#6  0x00002aae144d2b8c in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#7  0x00002aae176d9da2 in exit () from /lib64/libc.so.6
#8  0x0000000001209a98 in for__signal_handler ()
#9  <signal handler called>
#10 0x00002aae151e02a5 in sigwait () from /lib64/libpthread.so.0
#11 0x00002aae1a1c9ae2 in pm_async_thread () from /opt/ibmhpc/pe1307/base/intel/lib64/libpoe.so
#12 0x00002aae151d8851 in start_thread () from /lib64/libpthread.so.0
#13 0x00002aae1778c90d in clone () from /lib64/libc.so.6
Thread 10 (Thread 0x2aae1cf6a700 (LWP 11554)):
#0  0x00002aae151dc43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002aae1c5b3ecb in hal_ibl_user_intr_hndlr () from /opt/ibmhpc/pe1307/base/intel/lib64/libhal64_ibm.so
#2  0x00002aae151d8851 in start_thread () from /lib64/libpthread.so.0
#3  0x00002aae1778c90d in clone () from /lib64/libc.so.6
Thread 9 (Thread 0x2aae1d16b700 (LWP 11555)):
#0  0x00002aae1778cf03 in epoll_wait () from /lib64/libc.so.6
#1  0x00002aae1c5b3ac3 in hal_ibl_async_intr_hndlr () from /opt/ibmhpc/pe1307/base/intel/lib64/libhal64_ibm.so
#2  0x00002aae151d8851 in start_thread () from /lib64/libpthread.so.0
#3  0x00002aae1778c90d in clone () from /lib64/libc.so.6
Thread 8 (Thread 0x2aae3c482700 (LWP 11583)):
#0  0x00002aae151dc43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002aae1995671a in shm_dispatcher_thread (arg=0x2aae1d4723ac) at /project/sprelcot/build/rcots007a/src/ppe/lapi/lapi_shm.c:2827
#2  0x00002aae151d8851 in start_thread () from /lib64/libpthread.so.0
#3  0x00002aae1778c90d in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x2aae3cb29700 (LWP 11584)):
#0  0x00002aae151dc43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002aae1996db6f in rc_ibl_intr_hndlr (param=0x2aae19f64d7c) at /project/sprelcot/build/rcots007a/src/ppe/lapi/lapi_rc_rdma_verbs_wrappers.c:1139
#2  0x00002aae151d8851 in start_thread () from /lib64/libpthread.so.0
#3  0x00002aae1778c90d in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x2aae3cd2a700 (LWP 11585)):
#0  0x00002aae1778cf03 in epoll_wait () from /lib64/libc.so.6
#1  0x00002aae1996dfb1 in rc_ibl_async_intr_hndlr (param=0x17) at /project/sprelcot/build/rcots007a/src/ppe/lapi/lapi_rc_rdma_verbs_wrappers.c:1333
#2  0x00002aae151d8851 in start_thread () from /lib64/libpthread.so.0
#3  0x00002aae1778c90d in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x2aae3cf2b700 (LWP 11586)):
#0  0x00002aae151dc7bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002aae199361ff in _timer_arm (timer=0x665c7dc) at /project/sprelcot/build/rcots007a/src/ppe/lapi/intrhndlrs.c:533
#2  0x00002aae1993612f in _lapi_tmr_thrd (param=0x665c7dc) at /project/sprelcot/build/rcots007a/src/ppe/lapi/intrhndlrs.c:796
#3  0x00002aae151d8851 in start_thread () from /lib64/libpthread.so.0
#4  0x00002aae1778c90d in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x2aae3d12c700 (LWP 11682)):
#0  0x00002aae17783253 in poll () from /lib64/libc.so.6
#1  0x00002aae19f94572 in Connection::Wait (this=0x666cc30) at /project/sprelcot/build/rcots007a/src/ppe/pnsd/connection.cpp:169
#2  0x00002aae19f8035e in internal_pnsd_api_wait_for_updates(int, uint *, char *, *, nrt_window_id_t *, char **, int *, char **) (handle=, wakeup_event=0x2aae3d12ba74, device_name=, adapter_type=, win_id=, cmd_string=0x2aae3d12ba60, opt_length=0x2aae3d12ba70, opt=0x2aae3d12ba68) at /project/sprelcot/build/rcots007a/src/ppe/pnsd/pnsd_api.cpp:337
#3  0x00002aae19f8063c in pnsd_api_wait_for_updates (handle=1024637104, wakeup_event_OUT=0x2, cmd_string_OUT=, opt_length=, opt_OUT=) at /project/sprelcot/build/rcots007a/src/ppe/pnsd/pnsd_api.cpp:379
#4  0x00002aae1995d2fb in preempt_monitor_thread (param=0x2aae3d12b8b0) at /project/sprelcot/build/rcots007a/src/ppe/lapi/lapi_preempt.c:575
#5  0x00002aae151d8851 in start_thread () from /lib64/libpthread.so.0
#6  0x00002aae1778c90d in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x2aae80cd0700 (LWP 11731)):
#0  0x00002aae151dc7bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00002aae1741d1cb in __kmp_launch_monitor (thr=0x2aae17685284) at ../../src/z_Linux_util.c:1014
#2  0x00002aae151d8851 in start_thread () from /lib64/libpthread.so.0
#3  0x00002aae1778c90d in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x2aae98cd1700 (LWP 11732)):
#0  0x00002aae177507ad in waitpid () from /lib64/libc.so.6
#1  0x00002aae176e2889 in do_system () from /lib64/libc.so.6
#2  0x00002aae176e2bc0 in system () from /lib64/libc.so.6
#3  0x00002aae1a1cb1af in pm_linux_print_coredump () from /opt/ibmhpc/pe1307/base/intel/lib64/libpoe.so
#4  0x00002aae1a1c9f1a in pm_lwcf_signal_handler () from /opt/ibmhpc/pe1307/base/intel/lib64/libpoe.so
#5  <signal handler called>
#6  0x0000000000cbc4e0 in tp_core_mp_xtpv_ ()
#7  0x0000000000cb4ca1 in tp_core_mp_tp2c_ ()
#8  0x0000000000ca319d in sw_core_mp_d_sw_ ()
#9  0x0000000000ae5083 in cd_core_ ()
#10 0x00002aae1741c003 in L_kmp_invoke_pass_parms () from /ncar/opt/intel/2013/composer_xe_2013.1.117/compiler/lib/intel64/libiomp5.so
#11 0x00007fff416e5438 in ?? ()
#12 0x00007fff416e4f80 in ?? ()
#13 0x00007fff416e500c in ?? ()
#14 0x00007fff416e5004 in ?? ()
#15 0x00007fff416e5008 in ?? ()
#16 0x00007fff416e4c74 in ?? ()
#17 0x00007fff416e3a30 in ?? ()
#18 0x00007fff416e3aa8 in ?? ()
#19 0x00007fff416e3ee0 in ?? ()
#20 0x00007fff416e3f58 in ?? ()
#21 0x00007fff416e3b20 in ?? ()
#22 0x00007fff416e3b98 in ?? ()
#23 0x00007fff416e3e68 in ?? ()
#24 0x00007fff416e3c88 in ?? ()
#25 0x00007fff416e3d00 in ?? ()
#26 0x00007fff416e3d78 in ?? ()
#27 0x00007fff416e3df0 in ?? ()
#28 0x00007fff416e50e8 in ?? ()
#29 0x00007fff416e50ec in ?? ()
#30 0x00007fff416e50f0 in ?? ()
#31 0x00007fff416e50f4 in ?? ()
#32 0x00007fff416e50f8 in ?? ()
#33 0x00007fff416e50fc in ?? ()
#34 0x00007fff416e5100 in ?? ()
#35 0x00007fff416e5104 in ?? ()
#36 0x00007fff416e5108 in ?? ()
#37 0x00007fff416e510c in ?? ()
#38 0x00007fff416e519c in ?? ()
#39 0x00007fff416e51a0 in ?? ()
#40 0x00007fff416e51a4 in ?? ()
#41 0x00007fff416e51a8 in ?? ()
#42 0x00007fff416e51ac in ?? ()
#43 0x00007fff416e51b0 in ?? ()
#44 0x00007fff416e51b4 in ?? ()
#45 0x00007fff416e51b8 in ?? ()
#46 0x00007fff416e51bc in ?? ()
#47 0x00007fff416e51c0 in ?? ()
#48 0x00007fff416e5124 in ?? ()
#49 0x00007fff416e5128 in ?? ()
#50 0x00007fff416e512c in ?? ()
#51 0x00007fff416e5130 in ?? ()
#52 0x00007fff416e5134 in ?? ()
#53 0x00007fff416e5110 in ?? ()
#54 0x00007fff416e5114 in ?? ()
#55 0x00007fff416e5118 in ?? ()
#56 0x00007fff416e511c in ?? ()
#57 0x00007fff416e5120 in ?? ()
#58 0x00007fff416e5188 in ?? ()
#59 0x00007fff416e518c in ?? ()
#60 0x00007fff416e5190 in ?? ()
#61 0x00007fff416e5194 in ?? ()
#62 0x00007fff416e5198 in ?? ()
#63 0x00007fff416e5138 in ?? ()
#64 0x00007fff416e513c in ?? ()
#65 0x00007fff416e5140 in ?? ()
#66 0x00007fff416e5144 in ?? ()
#67 0x00007fff416e5148 in ?? ()
#68 0x00007fff416e514c in ?? ()
#69 0x00007fff416e5150 in ?? ()
#70 0x00007fff416e5154 in ?? ()
#71 0x00007fff416e5158 in ?? ()
#72 0x00007fff416e515c in ?? ()
#73 0x00007fff416e5160 in ?? ()
#74 0x00007fff416e5164 in ?? ()
#75 0x00007fff416e5168 in ?? ()
#76 0x00007fff416e516c in ?? ()
#77 0x00007fff416e5170 in ?? ()
#78 0x00007fff416e5174 in ?? ()
#79 0x00007fff416e5178 in ?? ()
#80 0x00007fff416e517c in ?? ()
#81 0x00007fff416e5180 in ?? ()
#82 0x00007fff416e5184 in ?? ()
#83 0x00002aae80c5ab00 in ?? ()
#84 0x0000000000000001 in ?? ()
#85 0x00000000fffffffe in ?? ()
#86 0x0000000000000000 in ?? ()
Thread 1 (Thread 0x2aae19f6e800 (LWP 11531)):
#0  0x00002aae177507ad in waitpid () from /lib64/libc.so.6
#1  0x00002aae176e2889 in do_system () from /lib64/libc.so.6
#2  0x00002aae176e2bc0 in system () from /lib64/libc.so.6
#3  0x00002aae1a1cb1af in pm_linux_print_coredump () from /opt/ibmhpc/pe1307/base/intel/lib64/libpoe.so
#4  0x00002aae1a1c9f1a in pm_lwcf_signal_handler () from /opt/ibmhpc/pe1307/base/intel/lib64/libpoe.so
#5  <signal handler called>
#6  0x0000000000cbc4e0 in tp_core_mp_xtpv_ ()
#7  0x0000000000cb4ca1 in tp_core_mp_tp2c_ ()
#8  0x0000000000ca319d in sw_core_mp_d_sw_ ()
#9  0x0000000000ae5083 in cd_core_ ()
#10 0x00002aae1741c003 in L_kmp_invoke_pass_parms () from /ncar/opt/intel/2013/composer_xe_2013.1.117/compiler/lib/intel64/libiomp5.so
#11 0x00007fff416e5438 in ?? ()
#12 0x00007fff416e4f80 in ?? ()
#13 0x00007fff416e500c in ?? ()
#14 0x00007fff416e5004 in ?? ()
#15 0x00007fff416e5008 in ?? ()
#16 0x00007fff416e4c74 in ?? ()
#17 0x00007fff416e3a30 in ?? ()
#18 0x00007fff416e3aa8 in ?? ()
#19 0x00007fff416e3ee0 in ?? ()
#20 0x00007fff416e3f58 in ?? ()
#21 0x00007fff416e3b20 in ?? ()
#22 0x00007fff416e3b98 in ?? ()
#23 0x00007fff416e3e68 in ?? ()
#24 0x00007fff416e3c88 in ?? ()
#25 0x00007fff416e3d00 in ?? ()
#26 0x00007fff416e3d78 in ?? ()
#27 0x00007fff416e3df0 in ?? ()
#28 0x00007fff416e50e8 in ?? ()
#29 0x00007fff416e50ec in ?? ()
#30 0x00007fff416e50f0 in ?? ()
#31 0x00007fff416e50f4 in ?? ()
#32 0x00007fff416e50f8 in ?? ()
#33 0x00007fff416e50fc in ?? ()
#34 0x00007fff416e5100 in ?? ()
#35 0x00007fff416e5104 in ?? ()
#36 0x00007fff416e5108 in ?? ()
#37 0x00007fff416e510c in ?? ()
#38 0x00007fff416e519c in ?? ()
#39 0x00007fff416e51a0 in ?? ()
#40 0x00007fff416e51a4 in ?? ()
#41 0x00007fff416e51a8 in ?? ()
#42 0x00007fff416e51ac in ?? ()
#43 0x00007fff416e51b0 in ?? ()
#44 0x00007fff416e51b4 in ?? ()
#45 0x00007fff416e51b8 in ?? ()
#46 0x00007fff416e51bc in ?? ()
#47 0x00007fff416e51c0 in ?? ()
#48 0x00007fff416e5124 in ?? ()
#49 0x00007fff416e5128 in ?? ()
#50 0x00007fff416e512c in ?? ()
#51 0x00007fff416e5130 in ?? ()
#52 0x00007fff416e5134 in ?? ()
#53 0x00007fff416e5110 in ?? ()
#54 0x00007fff416e5114 in ?? ()
#55 0x00007fff416e5118 in ?? ()
#56 0x00007fff416e511c in ?? ()
#57 0x00007fff416e5120 in ?? ()
#58 0x00007fff416e5188 in ?? ()
#59 0x00007fff416e518c in ?? ()
#60 0x00007fff416e5190 in ?? ()
#61 0x00007fff416e5194 in ?? ()
#62 0x00007fff416e5198 in ?? ()
#63 0x00007fff416e5138 in ?? ()
#64 0x00007fff416e513c in ?? ()
#65 0x00007fff416e5140 in ?? ()
#66 0x00007fff416e5144 in ?? ()
#67 0x00007fff416e5148 in ?? ()
#68 0x00007fff416e514c in ?? ()
#69 0x00007fff416e5150 in ?? ()
#70 0x00007fff416e5154 in ?? ()
#71 0x00007fff416e5158 in ?? ()
#72 0x00007fff416e515c in ?? ()
#73 0x00007fff416e5160 in ?? ()
#74 0x00007fff416e5164 in ?? ()
#75 0x00007fff416e5168 in ?? ()
#76 0x00007fff416e516c in ?? ()
#77 0x00007fff416e5170 in ?? ()
#78 0x00007fff416e5174 in ?? ()
#79 0x00007fff416e5178 in ?? ()
#80 0x00007fff416e517c in ?? ()
#81 0x00007fff416e5180 in ?? ()
#82 0x00007fff416e5184 in ?? ()
#83 0x00002aae00000001 in ?? ()
#84 0x0000000000000003 in ?? ()
#85 0x0000000000000000 in ?? ()
 

santos

Member
The crash is in tp_core, part of the FV dycore, so your changes to radiation.F90 probably have a bug, producing bad data which causes the dycore to crash. I would examine the variables you are using more closely to make sure that they are reasonable, and in particular that you are not accidentally using data that is not initialized (or is initialized on only some grid points instead of the whole grid).

To get some extra checks on the data produced by radiation, you can set "state_debug_checks = .true." in the namelist (this will detect severe problems such as negative or infinite temperatures), and/or run in DEBUG mode to get information from compiler checks. But these only detect very obvious problems; it's likely that you'll have to output values like max and min heating rates in each grid cell (or column) in order to see what is wrong.
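As a rough illustration of the kind of sanity check described above, here is a minimal Python sketch. It is not CAM code: the function names, the list-of-columns layout, and the sample values are all assumptions made up for this example. It flags values that are not finite and positive, and prints per-column minima and maxima so that an absurd entry stands out.

```python
import math

def find_invalid_positive(field):
    """Return (column, level, value) for every entry that is not finite and > 0.

    `field` is a list of columns, each a list of level values. This layout is
    hypothetical and only mimics the spirit of a domain check on temperature.
    """
    bad = []
    for icol, column in enumerate(field):
        for ilev, value in enumerate(column):
            if not math.isfinite(value) or value <= 0.0:
                bad.append((icol, ilev, value))
    return bad

def column_min_max(field):
    """Return a (min, max) pair per column so outliers are easy to spot."""
    return [(min(column), max(column)) for column in field]

# Made-up temperatures in K; one entry imitates the sort of absurd number
# that uninitialized memory can produce.
temperature = [
    [210.0, 250.0, 288.0],
    [215.0, -2.5e63, 290.0],
]

bad_points = find_invalid_positive(temperature)
for icol, ilev, value in bad_points:
    print(f"invalid value {value!r} at column {icol}, level {ilev}")
print(column_min_max(temperature))
```

In a real run the equivalent diagnostics would have to be written inside the model itself, on the heating rates and fluxes that radiation hands on to the rest of the physics.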
 
Sean, thank you for the suggestion! I tried another case with "state_debug_checks = .true." and got the following ERROR messages, which confirm what you suspected. The model produced invalid negative temperatures at a single location (2, 17).

These invalid temperatures must have been generated at the very beginning of the run, since the cpl log file stopped right before this line (taken from the cpl log file of a successful case): "tStamp_write: model date =    10102       0 wall clock = 2014-07-06 18:08:06 avg dt =     3.91 dt =     3.91".

You mentioned in your previous post that this may have something to do with initialization. Can you give a little more information on the initialization? Does this bug have anything to do with my case being a startup run?

My changes to radiation.F90 don't involve using external data. I simply added the following code after line 994 ("end if   !  if (dosw .or. dolw) then"):
qrl  = qrlc
qrs  = qrsc
fsns = fsnsc
fsnt = fsntc
flns = flnsc
flnt = flntc
fsds = fsdsc
The errors I found in the cesm log file:
-------------------------------------------------
57: ERROR: shr_assert_in_domain: state%t has invalid value  -2.523176289233014E+063
12: ERROR: shr_assert_in_domain: state%t has invalid value  -2.526503365774719E+063
13: ERROR: shr_assert_in_domain: state%t has invalid value  -2.528148892255106E+063
13:  at location:            2          17
13: Expected value to be greater than   0.000000000000000E+000
13:(shr_sys_abort) ERROR: Invalid value produced in physics_state by package radheat.
13:(shr_sys_abort) WARNING: calling shr_mpi_abort() and stopping
14: ERROR: shr_assert_in_domain: state%t has invalid value  -2.520966444599702E+063
14:  at location:            2          17
14: Expected value to be greater than   0.000000000000000E+000
14:(shr_sys_abort) ERROR: Invalid value produced in physics_state by package radheat.
14:(shr_sys_abort) WARNING: calling shr_mpi_abort() and stopping
42: ERROR: shr_assert_in_domain: state%t has invalid value  -2.527459347043085E+063
72: ERROR: shr_assert_in_domain: state%t has invalid value  -2.526111108440685E+063
43: ERROR: shr_assert_in_domain: state%t has invalid value  -2.512263362032814E+063
43:  at location:            2          17
43: Expected value to be greater than   0.000000000000000E+000
43:(shr_sys_abort) ERROR: Invalid value produced in physics_state by package radheat.
43:(shr_sys_abort) WARNING: calling shr_mpi_abort() and stopping
44: ERROR: shr_assert_in_domain: state%t has invalid value  -2.511911566595041E+063
44:  at location:            2          17
44: Expected value to be greater than   0.000000000000000E+000
44: ERROR: shr_assert_in_domain: state%t has invalid value  -2.516525442616982E+063
44:  at location:            2          17
44: Expected value to be greater than   0.000000000000000E+000
44:(shr_sys_abort) ERROR: Invalid value produced in physics_state by package radheat.
73: ERROR: shr_assert_in_domain: state%t has invalid value  -2.516079133185552E+063
73:  at location:            2          17
73: Expected value to be greater than   0.000000000000000E+000
73: ERROR: shr_assert_in_domain: state%t has invalid value  -2.518966136792934E+063
84: ERROR: shr_assert_in_domain: state%t has invalid value  -2.505608099793679E+063
84:  at location:            2          17
27: ERROR: shr_assert_in_domain: state%t has invalid value  -2.528356949852081E+063
42:  at location:            2          17
42: Expected value to be greater than   0.000000000000000E+000
42:(shr_sys_abort) ERROR: Invalid value produced in physics_state by package radheat.
42:(shr_sys_abort) WARNING: calling shr_mpi_abort() and stopping
42: ERROR: shr_assert_in_domain: state%t has invalid value  -2.528244766643989E+063
42:  at location:            2          17
42: Expected value to be greater than   0.000000000000000E+000
42:(shr_sys_abort) ERROR: Invalid value produced in physics_state by package radheat.
42:(shr_sys_abort) WARNING: calling shr_mpi_abort() and stopping
12:  at location:            2          17
12: Expected value to be greater than   0.000000000000000E+000
12: ERROR: shr_assert_in_domain: state%t has invalid value  -2.527759932500048E+063
12:  at location:            2          17
12: Expected value to be greater than   0.000000000000000E+000
12:(shr_sys_abort) ERROR: Invalid value produced in physics_state by package radheat.
12:(shr_sys_abort) WARNING: calling shr_mpi_abort() and stopping
12:(shr_sys_abort) ERROR: Invalid value produced in physics_state by package radheat.
12:(shr_sys_abort) WARNING: calling shr_mpi_abort() and stopping
42:Abort(1001) on node 42 (rank 42 in comm 1140850688): application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 42
42:Abort(1001) on node 42 (rank 42 in comm 1140850688): application called MPI_Abort(MPI_COMM_WORLD, 1001) - process 42
42:INFO: 0031-306  pm_atexit: pm_exit_value is 1.
......
-------------------------------------------------
 

santos

Member
What I mean is, these variables (e.g. qrsc) may not always be calculated at every grid point, or they may not be calculated in every time step. If they are not set, then you will get random (often very large) numbers as output. I know very little about the radiation code; you'll have to look through it yourself to see how these variables are being set.
 

eaton

CSEG and Liaisons
Sean is correct about the problem.  Since the radiation calcs don't happen every timestep you need to overwrite the whole sky with the clear sky values *inside* the conditionals where the calculations are done, not outside.  Probably the locations right after the loops doing unit conversions and in front of the outfld calls.  That way you can also look at the history output to verify that the whole sky and clear sky values are equal.
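To make the point about placement concrete, here is a toy Python sketch, not the actual radiation.F90 logic. The class, the `rad_every` interval, the `GARBAGE` stand-in for uninitialized memory, and the numbers are all assumptions for illustration; only the names `dosw`, `qrs`, and `qrsc` are taken from this thread. The whole-sky value persists between steps, while the clear-sky value is a local that is only meaningful on steps where radiation is actually computed.

```python
GARBAGE = -2.5e63  # stand-in for uninitialized memory; not a real CAM value

class ToyRadiation:
    """Toy model in which radiation is recomputed only every `rad_every` steps."""

    def __init__(self, rad_every=3):
        self.rad_every = rad_every  # hypothetical interval between radiation calls
        self.qrs = 0.0              # whole-sky heating rate, persists across steps

    def step(self, nstep, override_inside):
        dosw = nstep % self.rad_every == 0
        qrsc = GARBAGE              # clear-sky local: unset unless computed below
        if dosw:
            self.qrs = 1.0          # pretend whole-sky result
            qrsc = 2.0              # pretend clear-sky result
            if override_inside:
                self.qrs = qrsc     # safe: qrsc was just computed
        if not override_inside:
            self.qrs = qrsc         # unsafe: garbage whenever dosw is False
        return self.qrs

outside = ToyRadiation()
inside = ToyRadiation()
outside_values = [outside.step(n, override_inside=False) for n in range(4)]
inside_values = [inside.step(n, override_inside=True) for n in range(4)]
print(outside_values)  # garbage appears on the steps without a radiation call
print(inside_values)   # the last computed clear-sky value is carried forward
```

With the override outside the conditional, the persistent whole-sky value is overwritten with garbage on every step that skips radiation; with the override inside, the skipped steps simply reuse the last valid value.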
 
Sean and Brian, thank you both for the help. I appreciate it!

When the radiation calcs don't happen at a timestep, does the whole sky radiation (e.g., qrl and qrs) use its values from the previous step? The fact that the radiation calcs don't happen at every timestep means that the clear sky radiation components are not computed at every timestep. Why does the model NOT need to compute them at every timestep? What is the physical or technical consideration behind the need to compute the clear sky radiation?
 

eaton

CSEG and Liaisons
During timesteps when radiation calculations don't happen, the whole sky heating rates and fluxes from the previous step are used. The heating rates are stored in the physics buffer and the fluxes are stored in the comsrf module, so they all persist across timesteps. The clear sky calculations are diagnostic only; they have no impact on the climate simulation, so there is no need for them to persist across timesteps. You'll notice in the radiation_tend subroutine that they are stored in local variables.
 
There are quite a few clear-sky radiation components inside the subroutine "radiation_tend". In order to turn off the cloud radiative effects, does one need to override all the corresponding whole sky components, or just those variables used by the subroutine "radheat_tend" (qrl, qrs, fsns, fsnt, flns, flnt) plus 'fsds', which is used downstream in the code?

How about those radiation components (listed below) that have a clear-sky counterpart but are not used either by "radheat_tend" or downstream in the radiation code?
Shortwave: fsntoa (fsntoac), fsnirt (fsnrtc), fns (fcns), fsn200 (fsn200c);
Longwave: flut (flutc), flds (fldsc), fnl (fcnl), fln200 (fln200c);

Are there any other things that one needs to change in order to completely turn off the cloud radiative effects? Thanks.
 