
Need help running CAM4

Hi,
I'm very new to modeling and to using parallel computers. For several days I've been trying to get a default CAM4_9_02 run working on the blufile.ucar.edu machine. I entered the account number on the line
BSUB -P # account number
in the file run-ibm.csh and tried to run it by typing run-ibm.csh &
but it gives me a report like

[1] 692956
be1105en$ Missing name for redirect.

Do you have any idea which part I did wrong?
 

eaton

CSEG and Liaisons
That script is designed to be submitted to the batch queue, i.e.:

% bsub < run-ibm.csh

But note that in addition to the project number you'll also need to supply the root of your source tree. The script currently contains:

set camroot = /fis/cgd/...

That path needs to be supplied.
 
Thanks a lot! Now I have another problem after changing camroot.
It says

bash-3.1$ LSB_HOSTS: Undefined variable.
ERROR: unrecognized arguments: 4


followed by a long list of configure options that look like they come from the configure file,

and at the end it says
configure failed

Do I need to make some changes to configure as well?
 

eaton

CSEG and Liaisons
LSB_HOSTS isn't defined unless you submit the job to the batch queue using bsub.

You haven't provided enough info for me to guess why the configure command isn't working.
 

eaton

CSEG and Liaisons
I did some testing of the script with cam4_9_02 and found the following:

1) The script is meant to be used from the batch queue. If you execute it interactively, LSB_HOSTS will not be defined and the ntasks shell variable won't be set. Since that variable is used in the arguments to configure, configure gets bad arguments, which is why you saw the "configure failed" message.

2) The script will work without modification in the release code. But cam4_9_02 is development code, which does not have the same default chemistry and physics packages as the released CAM4. To get the script to work you'll need to add arguments to the configure line to get the configuration that you really want to run. At a minimum you'll need to add an argument such as "-hgrid 1.9x2.5" to run a 2 degree model. configure is currently defaulting to a 10x15 grid, which isn't a high enough resolution for running with 16 tasks and 4 threads per task.
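
The failure mode in point 1 can be detected before configure is ever called. The line below is a minimal sketch and is not part of run-ibm.csh. It assumes that `printenv NAME` returns a nonzero status when NAME is unset (documented for GNU coreutils, and expected to hold for the tcsh builtin), and it uses only constructs that csh and sh treat the same way, so it could sit near the top of the script. The wording of the message is made up for illustration.

```shell
# Warn when LSB_HOSTS is missing, which means the script is not running under LSF.
printenv LSB_HOSTS > /dev/null || echo "LSB_HOSTS is not set; submit with: bsub < run-ibm.csh"
```

Inside a job started with bsub, where LSB_HOSTS is defined, the line prints nothing.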
 
Thanks a lot!
I solved the ntasks problem by setting it to 1. I don't know if that makes sense.

Now I have another problem. I'm trying to link a namelist file into run-ibm.csh

by using the command

cd $blddir || echo "cd $blddir failed" && exit 1
$cfgdir/build-namelist -s -case $case -runtype $runtype
-infile $namelistdir/namelist || echo "build-namelist failed" && exit 1 #ADD PHL

It gave me
-infile: Command not found.
build-namelist failed

Any idea how to solve this?
Thank you!
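
The "-infile: Command not found." message is what a shell prints when the second physical line is executed as a command of its own: the build-namelist call ends at the line break, and -infile is then looked up as a program. A minimal sketch of the usual fix follows, a trailing backslash that joins the two lines. It is a dry run under stated assumptions: echo stands in for $cfgdir/build-namelist, and CASE, RUNTYPE, and NAMELISTDIR are placeholders for the script's own variables, not real values.

```shell
# The backslash must be the last character on the line (no trailing spaces).
echo build-namelist -s -case CASE -runtype RUNTYPE \
     -infile NAMELISTDIR/namelist
```

Both csh and sh print the arguments as a single command line here. Applied to the script, the same backslash would go at the end of the line that ends in $runtype.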
 
I am trying to run a test case to familiarize myself with CAM, but every time I try to configure I get the message

-----------------
WORKDIR: Undefined variable.
-----------------

and none of the files are created. I am trying to configure it with the machine midnight, resolution 10x15_10x15, and component set FCN. I can't seem to find what WORKDIR is in the user's guide.
 
To WLee:

I think there is a description of the working directory inside the run file:
## $wrkdir is a working directory where the model will be built and run.

It depends on which machine you're going to run your model on.

Not sure if this can solve your problem.
 

eaton

CSEG and Liaisons
This is a CCSM scripts issue. The config_machines.xml file has an entry for midnight which references the variable WORKDIR, but I don't see where it is set either. For most machines this info is hardcoded in the config_machines.xml file. I assume you just need to set this environment variable yourself before running create_newcase. But I'm not familiar with the machine midnight, so I don't know the best place to set up your case directory.
 
Thanks for the info about WORKDIR. Unfortunately I couldn't find a way to set the environment variable, so I thought it would be easier to just try a different machine. I switched over to the atlas machine and it also refused to configure, but for a different reason. The error message was

/usr/global/tools/dotkit/init.csh: No such file or directory.

I don't know why it would try to find a file in /usr/global unless that was the default for the atlas system.
 

eaton

CSEG and Liaisons
It looks like the /usr/global/tools/dotkit/init.csh file is being referenced from the scripts file scripts/ccsm_utils/Machines/env_machopts.prototype_atlas. It is apparently being used to set up the pgi compiler version. You'll need to modify this file to do something appropriate. You may not still have access to the pgi_7.2.5 compiler version that is requested in this script, and so updating to a later compiler version would be the right thing to do.
 
Thanks for responding, I'm just a little confused. You said I need to modify the file. Do you mean the init.csh file or the machopts scripts file? Because I don't think the init.csh file exists on my computer, and I would have to reference a new source for the updated pgi compiler.

Hope this isn't a dumb question
 

eaton

CSEG and Liaisons
I meant modify the env_machopts.prototype_atlas file. Since it refers to something that's not on your system you need to change that to either 1) refer to something that is on your system (you may need to ask your sys-admins for help), or 2) remove both the source and the "use pgi-7.2.5" lines entirely. If you remove those lines then presumably you'll use the default pgi compiler for your system rather than trying to force the system to use an old compiler.

I guess atlas is not a supported CCSM machine. I just looked in the latest trunk code and the env_machopts.prototype_atlas file hasn't been updated.
 
I tried removing both lines but the system still reads them as being there and the configure gives the same error message. I assume the file is referenced somewhere else as well, but I can't find where. I decided to see if I could bypass the error by running a default version of CAM using

% setenv INC_NETCDF /usr/local/include
% setenv LIB_NETCDF /usr/local/lib
% $camcfg/configure -dyn fv -hgrid 10x15 -nospmd -nosmp

instead of
% configure -case

while at the same time running -test to show me any errors I might have.
The configure worked, but the test picked up the following:

Looking for a valid GNU make... using gmake
Test linking to NetCDF library... **** FAILED ****
Issued the command:
gmake -f /Users/Will/ccsm4_working_copy/models/atm/cam/bld/Makefile test_nc 2>&1

The output was:
/Users/Will/ccsm4_working_copy/models/atm/cam/bld/mkSrcfiles > /Users/Will/ccsm4_working_copy/models/atm/cam/bld/configure-tests/Srcfiles
/Users/Will/ccsm4_working_copy/models/atm/cam/bld/mkDepends Filepath Srcfiles > /Users/Will/ccsm4_working_copy/models/atm/cam/bld/configure-tests/Depends
xlf90_r -c -qsuffix=f=f90:cpp=F90 -I. -I/Users/Will/ccsm4_working_copy/models/atm/cam/bld/configure-tests -I/usr/local/include -I/usr/local/include -WF,-DNO_SHR_VMATH -WF,-DSEQ_MCT -WF,-DFORTRAN_SAME -WF,-DCO2A -WF,-DMAXPATCH_PFT=numpft+1 -WF,-DLSMLAT=1 -WF,-DLSMLON=1 -WF,-DCOUP_DOM -WF,-DPLON=24 -WF,-DPLAT=19 -WF,-DPLEV=26 -WF,-DPCNST=3 -WF,-DPCOLS=16 -WF,-DPTRM=1 -WF,-DPTRN=1 -WF,-DPTRK=1 -WF,-DSTAGGERED -WF,-DCCSMCOUPLED -WF,-Dcoupled -WF,-Dncdf -WF,-DNCAT=1 -WF,-DNXGLOB=24 -WF,-DNYGLOB=19 -WF,-DNTR_AERO=0 -WF,-DBLCKX=24 -WF,-DBLCKY=19 -WF,-DMXBLCKS=1 -WF,-D_USEBOX -WF,-D_NETCDF -WF,-DAIX -WF,-DDarwin -qspillsize=2500 -O3 -qstrict -WF,-DHIDE_MPI,-D_MPISERIAL test_nc.F90
gmake: xlf90_r: Command not found
gmake: *** [test_nc.o] Error 127

Then, if I try to issue the make command, I get the same error on the last two lines. I assumed the error would continue when I attempted to build the namelist using

% cd /work/user/cam_test
% setenv CSMDATA /fs/cgd/csm/inputdata
% $camcfg/build-namelist -test -config /work/user/cam_test/bld/config_cache.xml

but the error is

** build-namelist - CCSM inputdata root is not a directory: "./Desktop/CSMDATA" **

Are these two completely separate issues?

Thanks
 