
New NERSC computer - Cori and the modules

Hello All,
I am trying to build CESM 1.2.2 on Cori, a new DOE supercomputer. While editing "env_mach_specific", I found the line "module use /global/project/projectdirs/ccsm1/modulefiles/edison" (I borrowed the Edison settings for Cori). I have no idea what those modulefiles are or how to generate a new set for Cori. Can anyone help me out here? Thanks a lot.
 

jedwards

CSEG and Liaisons
Staff member
I have ported the latest CESM (cesm1_5_beta03) code to Cori. Because Cori uses Slurm instead of PBS, porting is more complicated than just modifying the Edison settings. I will work on cesm1.2.x and should have an official port in the coming weeks.
 

jedwards

CSEG and Liaisons
Staff member
Here is a new Machines directory that contains the corip1 (Cori Phase 1) port:
https://svn-ccsm-models.cgd.ucar.edu/Machines/release_tags/cesm1_2_x_n16_Machines_140528
To use it, go to the Machines directory in your source tree and run
svn switch https://svn-ccsm-models.cgd.ucar.edu/Machines/release_tags/cesm1_2_x_n16_Machines_140528
Note that this is a functional port only; it has not been tuned for optimal PE layouts or performance.
 
Hello,
Thank you for making the machine settings for Cori available! While I can now build, the model does not complete when running the executable. I am testing CESM1.2.2 out of the box, and I have tested the same case, xmlchange parameters, and user_nl_cam on Edison, where it runs fine. The cesm.log file is uploaded, if that helps.
Thanks,
Kuo
 
Hello Kuo,
I can't see your log file. Do you mind attaching it again? I haven't run CESM successfully on Cori yet because of a PE layout issue. I am trying to figure it out.
 

jedwards

CSEG and Liaisons
Staff member
There is now a port to Cori in the source at /project/projectdirs/ccsm1/collections/cesm1_2_2/
 
Thank you Jim!
The file /project/projectdirs/ccsm1/collections/cesm1_2_2/env_mach_specific.corip1 currently has owner and group access only. Is it possible for you to change the permissions on the file so that it can be read and copied by everyone?
Thank you,
Kuo
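One way to confirm whether a file is readable by users outside its owner and group is to test the "other" read bit. The sketch below is illustrative only: the function name `is_world_readable` is hypothetical, and it assumes a `find` that supports `-maxdepth` (GNU and BSD versions do). Every parent directory also has to be searchable by others for the file to be reachable, which this check does not cover.

```shell
# Hypothetical helper: succeeds if the "other" read bit (octal 004)
# is set on the given path.
is_world_readable() {
    # find prints the path only when the permission test matches,
    # so a non-empty result means the bit is set.
    [ -n "$(find "$1" -maxdepth 0 -perm -004 2>/dev/null)" ]
}
```

For instance, `is_world_readable env_mach_specific.corip1 && echo readable` would print "readable" once the owner has run something like `chmod o+r` on the file.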
 

jedwards

CSEG and Liaisons
Staff member
You mean .../project/projectdirs/ccsm1/collections/cesm1_2_2/scripts/ccsm_utils/Machines/env_mach_specific.corip1
It should be okay now. I have a few more changes to add that I hope to get to today.
 
Hi Jim,
I have been trying to use more than one thread per task (e.g., NTHRDS_ATM=2), but the model run keeps crashing with a segmentation fault. Does the configuration currently work only for a single thread? Can I do anything to fix it?
Thank you very much.
 
We have been trying this version on Cori as well but are still dead in the water. It runs for a bit, then crashes while writing an output file. Has anybody actually run successfully on this machine yet?
See /global/cscratch1/sd/mwehner/seCAM5v2_2_prescribed_c0101/run/cesm.log.160126-214554


0001:  Setting mpi info: striping_factor=16
0001:  Setting mpi info: striping_unit=1048576
0001:  Setting mpi info: striping_factor=16
0001:  Setting mpi info: striping_unit=1048576
0001:  Opened file ./seCAM5v2_2_prescribed_c0101.clm2.r.2000-01-06-00000.nc to write
0001:       196608
0001:  Setting mpi info: striping_factor=16
0001:  Setting mpi info: striping_unit=1048576
0001:  Opened file ./seCAM5v2_2_prescribed_c0101.clm2.rh0.2000-01-06-00000.nc to write
0001:       262144
0001:  Setting mpi info: striping_factor=16
0001:  Setting mpi info: striping_unit=1048576
0001:  Opened file seCAM5v2_2_prescribed_c0101.cam.r.2000-01-06-00000.nc to write
0001:            7
0001:  Setting mpi info: striping_factor=16
0001:  Setting mpi info: striping_unit=1048576
0001:  Opened file seCAM5v2_2_prescribed_c0101.cam.rh0.2000-01-06-00000.nc to write
0001:            8
0001:  pio_support::pio_die:: myrank=          -1 : ERROR: pionfatt_mod.F90:
0001:          170 : NetCDF: Not a valid data type or _FillValue type mismatch
0001: Rank 1 [Tue Jan 26 22:00:45 2016] [c1-0c0s11n1] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1
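In a long cesm.log, the first fatal message is usually more informative than the tail of the file. As a rough sketch, the hypothetical helper below (the name `first_fatal_line` is not part of CESM) prints the first line that mentions `pio_die` or `MPI_Abort`, the two markers visible in the excerpt above. It assumes a `grep` that supports `-m`, as GNU and BSD versions do.

```shell
# Hypothetical helper: print the first line of a log, prefixed with its
# line number, that mentions a PIO fatal error or an MPI abort.
first_fatal_line() {
    # -n adds line numbers, -m 1 stops after the first match,
    # -E enables the alternation in the pattern.
    grep -n -m 1 -E 'pio_die|MPI_Abort' "$1"
}
```

Run against the log above, this should point at the `pio_support::pio_die` line. Adding `-A 1` to the `grep` call would also show the continuation line carrying the NetCDF message.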

 

jedwards

CSEG and Liaisons
Staff member
Hi Michael,
Are you using the port that I provided in /project/projectdirs/ccsm1/collections/cesm1_2_2?
 
Hi Dr. Wehner,
I put an "env_mach_specific.corip1" in my home directory (~/jihwang). Please feel free to use it. By the way, Cori is fairly unstable. It goes down every once in a while for various reasons. I ran CESM on corip1 several times, but I don't like its unstable environment.
 
I see this in my code:
seCAM5v2_2_prescribed_c0101/env_build.xml:
seCAM5v2_2_prescribed_c0101/env_build.xml:

But this in env_mach_specific.corip1:
> if ( $MPILIB == "mpi-serial" ) then
>   module load cray-hdf5/1.8.14
>   module load cray-netcdf/4.3.3.1
> endif
> if ( $MPILIB != "mpi-serial" ) then
>   module load cray-netcdf-hdf5parallel/4.3.3.1
>   module load cray-hdf5-parallel/1.8.14
>   module load cray-parallel-netcdf/1.6.1
> endif

What should I change with xmlchange, if anything?
Thanks,
Michael
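The branch logic in the quoted env_mach_specific.corip1 fragment can be restated as a small shell function, which may help when comparing against what `module list` reports in a session. This is only a sketch: the function name `expected_modules` is hypothetical, the module names and versions are copied from the fragment above, and any value other than "mpi-serial" is treated as a parallel MPI library, as in the original csh test.

```shell
# Hypothetical helper: print the netCDF/HDF5 modules that the quoted
# env_mach_specific.corip1 fragment would load for a given MPILIB value.
expected_modules() {
    if [ "$1" = "mpi-serial" ]; then
        # Serial build: non-parallel HDF5 and netCDF.
        printf '%s\n' cray-hdf5/1.8.14 cray-netcdf/4.3.3.1
    else
        # Any other MPILIB value: parallel netCDF/HDF5 plus PnetCDF.
        printf '%s\n' \
            cray-netcdf-hdf5parallel/4.3.3.1 \
            cray-hdf5-parallel/1.8.14 \
            cray-parallel-netcdf/1.6.1
    fi
}
```

Calling `expected_modules mpi-serial` lists the two serial modules; any other argument lists the three parallel ones.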
 
Dear Dr. Wehner,
I don't think you need to xmlchange anything. I am trying the same configuration again. I believe it will work, because it did before. You are welcome to use "env_mach_specific.corip1" or other files in my jihwang/temp/ directory. There is also a file called "make_CTRLtest.csh". That's the one that I am testing now.
 