problem running CVDP in parallel

milena@lanl_gov

New Member
We are trying to run the CVDP on a Mac, processing only atmospheric fields (PSL, TS, TREFHT, PREC*) that have been regridded to T85. To keep things simple, I am only running psl.sam_psa.ncl for now, with *no* comparison to obs.

If I don't use the parallel option, I get a segmentation fault at the first data_read_in statement in psl.sam_psa.ncl:

    arr = data_read_in(paths(ee),"PSL",syear(ee),eyear(ee))

If we do use the parallel option (I tried ntasks=2 and ntasks=5), nothing happens: execution gets past the line above with no printed errors or messages, but no output whatsoever is generated.

We followed all the instructions on how to set the filenames and have something like:

    runname.varname.YYYYMM-YYYYMM.nc

for each monthly file, covering 90 years (90*12 files in total). Any idea what could be wrong?

Thanks!
Milena
 

asphilli

Adam Phillips
CVCWG Liaison
Staff member
Hi Milena,

My first guess is that you are hitting the limit on the number of files that can be open at once on your machine. The limit on my machine is currently set to 1024. See this ncl-talk email for more information:
https://www.ncl.ucar.edu/Support/talk_archives/2013/3495.html

You can test whether that is the problem by creating a new directory, using soft links to link in, say, 20 years of data, and seeing whether the same error occurs.

Regardless of whether that is the problem, I would use the NetCDF operator ncrcat to concatenate those files into fewer files:

    ncrcat mymodelrun.PSL.*.nc mymodelrun.PSL.000101-009012.nc

At the very least that will speed things up.

So you know: the CVDP uses task parallelism, so turning the parallel option on when running the package for a single script will not result in any speedup.

If the above does not help, please send back a file containing the output, along with the namelist and the results of an ls of the model directory.

Adam
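The soft-link test suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name link_years, the run name and all paths are placeholders (none of them are part of the CVDP), and the monthly files are assumed to follow the runname.varname.YYYYMM-YYYYMM.nc pattern with four-digit years.

```shell
#!/bin/sh
# Sketch: link a subset of years into a scratch directory so the CVDP
# can be pointed at fewer files. Not part of the CVDP itself.

# link_years SRC_DIR DEST_DIR RUN VAR FIRST_YEAR LAST_YEAR   (hypothetical helper)
#   SRC_DIR should be an absolute path so the links resolve correctly.
#   FIRST_YEAR and LAST_YEAR are plain integers without leading zeros (1, 20).
link_years() {
    src=$1; dest=$2; run=$3; var=$4; first=$5; last=$6
    mkdir -p "$dest"
    year=$first
    while [ "$year" -le "$last" ]; do
        yyyy=$(printf '%04d' "$year")
        for f in "$src/$run.$var.$yyyy"??-*.nc; do
            # Skip the unexpanded pattern when no file matches this year.
            [ -e "$f" ] || continue
            ln -sf "$f" "$dest/"
        done
        year=$(($year + 1))
    done
}

# Report the open-file limit of the current shell for comparison
# with the number of files being read.
echo "open-file limit: $(ulimit -n)"

# Example with placeholder paths: link years 1-20 of PSL.
# link_years /path/to/model/data "$PWD/cvdp_test" mymodelrun PSL 1 20
```

If the run succeeds on the linked subset but fails on the full set, that points toward the file-count limit rather than a problem with the files themselves.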
 

milena@lanl_gov

New Member
Hi Adam,

Thanks for your response. Indeed, that was exactly my problem. The package works now: we have tested it for the psl.sam.. script, and also with the comparison to obs turned on, and things work as expected.

In case it helps others, here are the things I learned:

1) Model filenames (including their path) cannot be too long. Suggestion: set up a link to the model data files in the directory where driver.ncl is run.

2) The list of files also cannot be too long, so one needs to ncrcat the monthly files together, for example into 10-year files.

3) This is fairly trivial, but one needs to make sure that the record dimension is preserved in the monthly files. I had regridded data where the time dimension had previously been removed, so I had to add the time dimension back in for each file (easy with "ncecat -u time -O $infile $infile").

Thanks again,
Milena
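Point 2 above (concatenating monthly files into 10-year files) could be scripted along these lines. This is a rough sketch, not a tested recipe: the helper name concat_decades, the NCRCAT override variable and all paths are assumptions introduced for illustration, and the monthly files are assumed to be named runname.varname.YYYYMM-YYYYMM.nc with the same year on both sides of the dash and with the time record dimension already in place (point 3).

```shell
#!/bin/sh
# Sketch: concatenate monthly files into blocks of up to 10 years with ncrcat.
# Set NCRCAT="echo ncrcat" to print the commands instead of running them.
NCRCAT=${NCRCAT:-ncrcat}

# concat_decades SRC_DIR OUT_DIR RUN VAR FIRST_YEAR LAST_YEAR   (hypothetical helper)
#   FIRST_YEAR and LAST_YEAR are plain integers without leading zeros (1, 90).
concat_decades() {
    src=$1; outdir=$2; run=$3; var=$4; first=$5; last=$6
    mkdir -p "$outdir"
    start=$first
    while [ "$start" -le "$last" ]; do
        end=$(($start + 9))
        if [ "$end" -gt "$last" ]; then end=$last; fi
        # Collect this block's monthly files; glob order is chronological
        # because of the zero-padded YYYYMM names.
        set --
        year=$start
        while [ "$year" -le "$end" ]; do
            yyyy=$(printf '%04d' "$year")
            for f in "$src/$run.$var.$yyyy"??-"$yyyy"??.nc; do
                if [ -e "$f" ]; then set -- "$@" "$f"; fi
            done
            year=$(($year + 1))
        done
        out=$(printf '%s/%s.%s.%04d01-%04d12.nc' "$outdir" "$run" "$var" "$start" "$end")
        # Run ncrcat only if at least one input file was found.
        if [ "$#" -gt 0 ]; then $NCRCAT "$@" "$out"; fi
        start=$(($end + 1))
    done
}

# Example with placeholder paths: 90 years of PSL become nine 10-year files.
# concat_decades /path/to/monthly /path/to/decadal mymodelrun PSL 1 90
```

Writing the output into a separate directory keeps the concatenated files from being picked up as inputs on a later run, and that directory can then be linked from wherever driver.ncl is run to keep the paths short (point 1).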
 