mhrzmomeni@yahoo_com
Member
Hi guys,

I am confused. I want to run CLM regionally, and I have created the required files (domain and surface data files). I can now run the model on 1 node successfully (after adding “limit coredumpsize unlimited” and “limit stacksize unlimited” to my run script), but I get the following message:
--------------------------------------------------------------------------
The usNIC BTL failed to initialize while trying to register some
memory. This typically can indicate that the "memlock" limits are set
too low. For most HPC installations, the memlock limits should be set
to "unlimited". The failure occurred here:
Local host: compute-2-8-ib
Memlock limit: 65536
You may need to consult with your system administrator to get this
problem fixed. This FAQ entry on the Open MPI web site may also be
helpful:
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
--------------------------------------------------------------------------
--------------------------------------------------------------------------
The OpenFabrics (openib) BTL failed to initialize while trying to
allocate some locked memory. This typically can indicate that the
memlock limits are set too low. For most HPC installations, the
memlock limits should be set to "unlimited". The failure occured
here:
Local host: compute-2-8-ib
OMPI source: ../../../../../openmpi-1.8.6/ompi/mca/btl/openib/btl_openib.c:794
Function: ompi_free_list_init_ex_new()
Device: mlx4_0
Memlock limit: 65536
You may need to consult with your system administrator to get this
problem fixed. This FAQ entry on the Open MPI web site may also be
helpful:
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
--------------------------------------------------------------------------
[compute-2-8-ib][[54635,1],112][../../../../../openmpi-1.8.6/ompi/mca/btl/openib/btl_openib.c:873:mca_btl_openib_add_procs] could not prepare openib device for use
[compute-2-8-ib.local:46482] 127 more processes have sent help message help-mpi-btl-usnic.txt / check_reg_mem_basics fail
[compute-2-8-ib.local:46482] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[compute-2-8-ib.local:46482] 127 more processes have sent help message help-mpi-btl-openib.txt / init-fail-no-mem

But when I try to run the model with more than 2 nodes, I first get the following error:
Host key verification failed.
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
--------------------------------------------------------------------------

To solve it, I created a file named “config” in the “.ssh” folder of my home directory (/home/MyHome/.ssh) and added “StrictHostKeyChecking no” to it. This solved the “Host key verification failed” problem. Now I get the following error:
/share/apps/openmpi-1.8.6-intel/bin/orted: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
--------------------------------------------------------------------------

I am so confused. What is the problem?
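
For reference, here is a minimal sketch of the ssh change described above, written as a small sh/bash function (not csh). The function name `write_ssh_config` and its directory argument are placeholders for illustration only; the one setting involved is the “StrictHostKeyChecking no” line, and in my case the directory is /home/MyHome/.ssh.

```shell
# Hypothetical helper: the name and the directory argument are illustrative.
# It appends the single option mentioned above to <dir>/config.
write_ssh_config() {
    ssh_dir="$1"
    # Create the .ssh directory if it does not exist yet.
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    # Append the option; existing entries in the file are left untouched.
    echo "StrictHostKeyChecking no" >> "$ssh_dir/config"
    # Keep the config file private to the owner.
    chmod 600 "$ssh_dir/config"
}

# Example call (commented out so nothing changes by accident):
# write_ssh_config "$HOME/.ssh"
```

Without a `Host` line in front of it, the option applies to every host that ssh connects to from this account.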