Waccmx runtime error

dengchuangwu

chuangwu deng
New Member
My environment:
  1. Linux 3.10.0-693.el7.x86_64 (CentOS distribution)
  2. Intel compiler 2019.0.117
  3. Intel MPI, same version as the Intel compiler
  4. Python 2.7.5
  5. Perl 5.16.3
  6. NetCDF 4.7.4
  7. PNetCDF 1.12.1
  8. ESMF 8.1.0
When I run the WACCM-X case, it is automatically canceled. I checked the log, and the key error information is shown below:
Abort(469989892) on node 132 (rank 132 in comm 0): Fatal error in PMPI_Ibsend: Invalid tag, error stack:
PMPI_Ibsend(207): Invalid tag, value is 1080132
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
libpnetcdf.so.4.0 00002B51DBD5DD64 for__signal_handl Unknown Unknown
libpthread-2.17.s 00002B51EA8745E0 Unknown Unknown Unknown
libpthread-2.17.s 00002B51EA871483 pthread_spin_lock Unknown Unknown
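
For readers trying to interpret the "Invalid tag" message above: MPI rejects a send whose tag is larger than the implementation's MPI_TAG_UB attribute, which the MPI standard only guarantees to be at least 32767. The sketch below pulls the rejected tag out of a log excerpt and compares it with a limit. The value `ASSUMED_TAG_UB` (2**20 - 1) and the helper names are assumptions made for illustration, not values taken from these logs; the real limit should be queried from the MPI library on the target machine.

```python
import re

# Smallest MPI_TAG_UB value the MPI standard allows an implementation to report.
MPI_STANDARD_MIN_TAG_UB = 32767

# ASSUMPTION: hypothetical tag limit for this MPI build (2**20 - 1).
# Replace it with the MPI_TAG_UB value reported on your own system.
ASSUMED_TAG_UB = 2**20 - 1

# Matches the line that reports the rejected tag value.
TAG_PATTERN = re.compile(r"Invalid tag, value is (\d+)")


def find_invalid_tags(log_text):
    """Return every tag value the log reports as invalid, in order."""
    return [int(value) for value in TAG_PATTERN.findall(log_text)]


def exceeds(tag, tag_ub):
    """True if the tag is larger than the given upper bound."""
    return tag > tag_ub


# Two lines copied from the cesm log excerpt above.
log_excerpt = """\
Abort(469989892) on node 132 (rank 132 in comm 0): Fatal error in PMPI_Ibsend: Invalid tag, error stack:
PMPI_Ibsend(207): Invalid tag, value is 1080132
"""

for tag in find_invalid_tags(log_excerpt):
    print(tag, "exceeds assumed limit:", exceeds(tag, ASSUMED_TAG_UB))
```

If the tag turns out to be above the real limit, the failure comes from how the tags are generated for this task count rather than from the data being sent.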

My CESM version is 2.1.3; the details are in the attached file version_info.txt.
The changes I made to the XML files are in the bash file (cesm_2.1.3_intel_2019.0.117_impi.sh), which is the script I use to run the case.

I have reproduced the problem to make sure it still exists.
Finally, the log files are in the attachments. Thank you!

I tried the suggestion in https://bb.cgd.ucar.edu/cesm/threads/mpi-tag-exceeds-limit-when-using-240-mpi-tasks.4659/ using the command below, but the case then fails to build.
./xmlchange --file env_build.xml --id USE_ESMF_LIB --val 'FALSE'

Some of the build errors are listed below:
/public/home/sjchang/CESM/source_code/2.1.3_intel_2019.0.117_impi/components/cam/src/ionosphere/waccmx/edyn_esmf.F90(4): error #6580: Name in only-list does not exist or is not accessible. [ESMF_FIELD]
/public/home/sjchang/CESM/source_code/2.1.3_intel_2019.0.117_impi/components/cam/src/ionosphere/waccmx/edyn_esmf.F90(4): error #6580: Name in only-list does not exist or is not accessible. [ESMF_ROUTEHANDLE]
/public/home/sjchang/CESM/source_code/2.1.3_intel_2019.0.117_impi/components/cam/src/ionosphere/waccmx/edyn_esmf.F90(6): error #6580: Name in only-list does not exist or is not accessible. [ESMF_FIELDGET]
/public/home/sjchang/CESM/source_code/2.1.3_intel_2019.0.117_impi/components/cam/src/ionosphere/waccmx/edyn_esmf.F90(6): error #6580: Name in only-list does not exist or is not accessible. [ESMF_STAGGERLOC_CENTER]
/public/home/sjchang/CESM/source_code/2.1.3_intel_2019.0.117_impi/components/cam/src/ionosphere/waccmx/edyn_esmf.F90(6): error #6580: Name in only-list does not exist or is not accessible. [ESMF_FIELDREGRIDSTORE]
/public/home/sjchang/CESM/source_code/2.1.3_intel_2019.0.117_impi/components/cam/src/ionosphere/waccmx/edyn_esmf.F90(7): error #6580: Name in only-list does not exist or is not accessible. [ESMF_REGRIDMETHOD_BILINEAR]
/public/home/sjchang/CESM/source_code/2.1.3_intel_2019.0.117_impi/components/cam/src/ionosphere/waccmx/edyn_esmf.F90(7): error #6580: Name in only-list does not exist or is not accessible. [ESMF_POLEMETHOD_ALLAVG]
/public/home/sjchang/CESM/source_code/2.1.3_intel_2019.0.117_impi/components/cam/src/ionosphere/waccmx/edyn_esmf.F90(7): error #6580: Name in only-list does not exist or is not accessible. [ESMF_FIELDSMMSTORE]
 

Attachments

  • version_info.txt (4.9 KB)
  • atm.log.txt (432.1 KB)
  • cesm.log.txt (132.2 KB)
  • cesm_2.1.3_intel_2019.0.117_impi.sh.txt (3.3 KB)
  • config_compilers.xml.txt (1.9 KB)
  • config_machines.xml.txt (117.4 KB)
  • cpl.log.txt (67.1 KB)
  • ice.log.txt (11.2 KB)
  • lnd.log.txt (37.6 KB)
  • ocn.log.txt (3.3 KB)

jackma

jack
New Member
Hello,
I have also run into problems building the WACCM-X case. If your build succeeded, could I see your working config_compilers.xml? Thank you very much.
 