
porting ctsm5.3.009 on our system

xgao304

Member
Dear Sir:

I am trying to port ctsm 5.3.009 on our system (I previously ported cesm 2.1.3 successfully), but I am a bit lost.

For porting cesm2.1.3, I modified three files under ~/cime/config/cesm/machines by adding the relevant system specifications:
1. config_batch.xml

<batch_system MACH="svante" type="slurm">
  <batch_submit>sbatch</batch_submit>
  <submit_args>
    <arg flag="--time" name="$JOB_WALLCLOCK_TIME"/>
  </submit_args>
  <directives>
    <directive> --partition=edr</directive>
    <directive> --mem=0</directive>
  </directives>
  <queues>
    <queue walltimemax="24:00:00" nodemin="1" nodemax="128" default="true">edr</queue>
  </queues>
</batch_system>

2. config_compilers.xml
<compiler MACH="svante" COMPILER="intel">
  <NETCDF_PATH> $(NETCDF)</NETCDF_PATH>
  <SLIBS>
    <append> -L${NETCDF_PATH}/lib -lnetcdf -lnetcdff </append>
  </SLIBS>
  <MPI_LIB_NAME>mpi</MPI_LIB_NAME>
  <MPI_PATH> $(INC_MPI)/..</MPI_PATH>
</compiler>

3. config_machines.xml
... a lot of system specifications (a rough skeleton is sketched below).
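For illustration, a machine entry in that file is a full block along these lines (the values here are generic placeholders, not my actual svante settings):

<machine MACH="svante">
  <DESC>Description of the machine</DESC>
  <OS>LINUX</OS>
  <COMPILERS>intel,gnu</COMPILERS>
  <MPILIBS>impi</MPILIBS>
  <CIME_OUTPUT_ROOT>/net/fs12/d2/$USER/cime_output</CIME_OUTPUT_ROOT>
  <DIN_LOC_ROOT>/net/fs12/d2/$USER/inputdata</DIN_LOC_ROOT>
  <GMAKE_J>8</GMAKE_J>
  <BATCH_SYSTEM>slurm</BATCH_SYSTEM>
  <MAX_TASKS_PER_NODE>48</MAX_TASKS_PER_NODE>
  <MAX_MPITASKS_PER_NODE>48</MAX_MPITASKS_PER_NODE>
  <module_system type="module">
    ... module commands for the compiler, MPI, and netCDF ...
  </module_system>
</machine>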

For ctsm5.3.009, under the directory ccs_config/machines,

I only changed "config_batch.xml", in the same way as I did for cesm2.1.3. There is no corresponding "config_compilers.xml", and "config_machines.xml" has very different contents (only machine names) from the one in cesm2.1.3 (which holds the full system specifications).
The README file only says "Please refer to the documentation in the config_machines.xml and config_compilers.xml files." I am not sure which documentation that refers to.

My questions are:
1) In order to port ctsm5.3.009 (without any other cesm components), which files should I modify?
2) Once I change those files, can I follow the same steps as for building cesm:

create_newcase
case.setup
case.build
case.submit

Or should I follow the link below to build ctsm (including building the various prerequisites)?
3.2.1. Obtaining and building CTSM and LILAC — ctsm CTSM master documentation
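(For question 2, a rough sketch of the workflow I have in mind is below; the compset alias I2000Clm50BgcCrop and the resolution f09_g17 are only illustrative placeholders, and <ctsm_checkout> stands for my CTSM directory:)

cd <ctsm_checkout>/cime/scripts
./create_newcase --case testctsm --compset I2000Clm50BgcCrop --res f09_g17 --machine svante --run-unsupported
cd testctsm
./case.setup
./case.build
./case.submit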

Thanks,

Xiang
 

jedwards

CSEG and Liaisons
Staff member
What was in config_compilers.xml is now in cmake macros, and the contents are divided into subdirectories based on machine name.
So add your machine name to the top-level config_machines.xml and create a subdirectory of that same name. That subdirectory will contain the machine-specific contents of config_machines.xml. It looks like the tag you are using, ctsm 5.3.009, has the first iteration of this change in ccs_config; in future versions you will see the cmake_macros and config_batch files also separated according to machine name. I hope that helps.
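As a rough sketch (file names and values below are only illustrative, not a tested svante setup), the layout under ccs_config/machines would look like:

config_machines.xml            <- top-level file; add an entry naming "svante" here
svante/config_machines.xml     <- the full <machine MACH="svante"> block, i.e. what used to be the machine entry in cesm2.1.3
cmake_macros/gnu_svante.cmake  <- compiler settings that used to live in config_compilers.xml

where the top-level entry is something like:

<NODENAME_REGEX>
  <value MACH="svante">svante.*</value>
</NODENAME_REGEX>

and the cmake macro file carries lines such as:

set(NETCDF_PATH "$ENV{NETCDF}")
string(APPEND SLIBS " -L${NETCDF_PATH}/lib -lnetcdf -lnetcdff")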
 

xgao304

Member
Dear Sir:

I have followed your instructions and did what you suggested. When I try to create a new case using the command below, I get the same error message with both the intel and gcc compilers (see the attached file). I can successfully run the same command with cesm2.1.3 and the intel compiler. I have also attached my config_machines.xml and config_batch.xml (under the machine name "svante").

module load python/3.9.1
/net/fs12/d2/xgao/ctsm5.3/CTSM/cime/scripts/create_newcase --case testctsm \
--compset 2000_DATM%GSWP3v1_CLM50%BGC-CROP_SICE_SOCN_SROF_SGLC_SWAV \
--res CLM_USRDAT --user-mods-dir $MYDATA_DIR --machine svante --compiler gcc --run-unsupported

I discussed this with our system administrator, but he is not certain about the correct procedure to solve the issue. He proposed the steps below (between the dashed lines), which seem quite complicated to me. I am wondering if there is a simpler way, as with cesm2.1.3, where once all the required configurations are set up it is ready to go. Could you provide some feedback?

--------
I think we have to actually build/compile it first. Based on the docs (https://escomp.github.io/ctsm-docs/versions/master/html/lilac/obtaining-building-and-running/obtaining-and-building-ctsm.html) it looks like that is maybe done with a process like this:

1. I think we do this from the head node, so if building for HDR, be on svante.mit.edu
2. Load all of the modules I have specified in the config_machines.xml for Svante
3. cd into `/net/fs12/d2/xgao/ctsm5.3/CTSM`
4. Run the build command, which I think would be something like:
`./lilac/build_ctsm /net/fs12/d2/$USER/ctsm_build_hdr --os linux --machine svante --compiler gcc --netcdf-path '/home/software/rhel/8/gcc/11.3.0/pkg/netcdf/4_shared_libs/' --pnetcdf-path '/home/software/rhel/8/gcc/11.3.0/pkg/netcdf/4_shared_libs/' --esmf-mkfile-path '$ENV{ESMFMKFILE}' --max-mpitasks-per-node 48`

Ultimately, that should build/compile CTSM into
`/net/fs12/d2/xgao/ctsm_build_hdr`.

I have a feeling it might error on pnetcdf; if so, I can go and build that separately. What is confusing is that there is pnetcdf and also plain netcdf-4 with parallel support. I know I have the latter built in that netcdf/4_shared_libs module, but they might specifically need pnetcdf.

It is very possible we are missing something else about the config_machines setup.
---------

Thanks,

Xiang
 

Attachments

  • config_batch.xml.txt (27.5 KB)
  • config_machines.xml.txt (2.8 KB)
  • error.txt (2.5 KB)