
Several errors when running scripts_regression_test.py

cdevaneprugh

Cooper
New Member
Update: I built the CAM and POP runs after loading python/2.7.14

Now I am trying to add the metadata and am getting a file-not-found error. Is the documentation on Python Tools | Community Earth System Model outdated? There isn't even a "run" directory in the case directory that was created.

This is the command I am using to try to add the metadata:
$ ./addmetadata.sh --caseroot /blue/gerber/cdevaneprugh/earth_model_output/cime_output_root/ect_runs/case.cesm_tag.uf.000 --histfile /blue/gerber/cdevaneprugh/earth_model_output/cime_output_root/ect_runs/case.cesm_tag.uf.000/run/case.cesm_tag.uf.000.cam.h0.0001-01-01-00000.nc
 

jedwards

CSEG and Liaisons
Staff member
If there is no run directory under that case directory go to the case and do
./xmlquery RUNDIR
that should provide the path to the run directory.
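As a rough sketch of the suggestion above, the run directory can be resolved from the case and searched for CAM history files in one step. `find_cam_histfiles` is a hypothetical helper name, not part of CIME, and the sketch assumes that `xmlquery` accepts `--value` to print only the variable's value and that history files follow the `*.cam.h0.*.nc` pattern seen in the earlier post.

```shell
# Sketch: resolve a case's run directory and list its CAM h0 history files.
# find_cam_histfiles is a hypothetical helper, not a CIME command.
# Assumes ./xmlquery supports --value (print only the value of RUNDIR).
find_cam_histfiles() {
    caseroot="$1"
    # xmlquery must be run from inside the case directory.
    rundir=$(cd "$caseroot" && ./xmlquery --value RUNDIR) || return 1
    echo "RUNDIR=$rundir"
    for f in "$rundir"/*.cam.h0.*.nc; do
        # Skip the unexpanded glob when no history file exists yet.
        [ -e "$f" ] && echo "$f"
    done
    return 0
}
```

Any file it lists can then be passed to addmetadata.sh via --histfile, with --caseroot set to the case directory as in the command quoted earlier.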
 

cdevaneprugh

Cooper
New Member
Okay I found the run directory and .nc files. Using the same command as above I tried adding metadata to the files but got the following errors.

The first was an error with mv stating that /blue/gerber/cdevaneprugh/earth_model_output/cime_output_root/ect_runs/case.cesm_tag.uf.000/run/case.cesm_tag.uf.000.cam.h0.0001-01-01-00000.nc.tmp did not exist

I fixed this by creating an empty file with that name.

The other error I am getting seems to be related to ncks saying:
ncks: unrecognized option '--glb'

We have nco versions 4.2.1 and 4.4.3 on our machine.

Do I need to update nco to a newer version?
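One way to approach the version question is to compare the installed NCO version against one known to work. This thread does not say which NCO release introduced `--glb`; it only reports the error on a machine with 4.2.1 and 4.4.3 and, in the follow-up post, success with 5.2.2. The sketch below therefore uses 5.2.2 as a conservative reference rather than a true minimum. `version_ge` and `KNOWN_GOOD_NCO` are hypothetical names, and the comparison relies on GNU `sort -V`.

```shell
# Sketch: compare dotted version strings with GNU sort -V.
# version_ge A B succeeds when A >= B.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Assumption: 5.2.2 is the version reported to work later in this thread,
# not necessarily the oldest release that understands --glb.
KNOWN_GOOD_NCO="5.2.2"

# Example use, with the installed version typed in by hand
# (ncks --version reports it):
#   version_ge "4.4.3" "$KNOWN_GOOD_NCO" || echo "older than the known-good NCO"
```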
 

cdevaneprugh

Cooper
New Member
I created a conda environment with Python 3.8 and a newer version of nco (5.2.2) and was able to add the metadata without an issue. However, when I uploaded the .nc files to be verified, I got an error telling me my version of CESM is not supported (see the screenshot). I don't understand what went wrong, as I am using CESM 2.1.5, which the website says is supported. Did I upload the wrong files or add the metadata incorrectly? Has anyone else run into this?

Thanks
 

Attachments

  • ect_scrnsht.png

jedwards

CSEG and Liaisons
Staff member
I think that there is a problem on our end. I'll report the issue and hopefully have it solved soon.
 

cdevaneprugh

Cooper
New Member
Thanks Jim.

While the three UF-CAM-ECT tests all appear to have built and run successfully, I noticed that the POP-ECT test did not. I got a notification from our scheduler that it timed out after 2 hours. Additionally, when I look at the CaseStatus file in $CIME_OUTPUT_ROOT/popcase.cesm_tag.000/, it shows that case.build and case.submit were successful, and that the model execution started but did not finish. What error logs should I be looking at to diagnose this?
 

jedwards

CSEG and Liaisons
Staff member
I would first try to determine whether it just needs more time or whether it was deadlocked. Usually you can tell from the timestamps on the log files: if the logs other than cesm.log are much older, then it was deadlocked; if all of the logs are within a minute or two of the timeout time, then you probably just need to increase the wallclock time.
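The timestamp check described above could be scripted roughly as follows. This is only a sketch: `log_staleness` is a hypothetical helper, the `*.log.*` glob assumes the usual component log naming in the run directory, `stat -c %Y` assumes GNU coreutils, and the 120-second default is an illustrative reading of "a minute or two". The reference time is the epoch second at which the scheduler killed the job.

```shell
# Sketch: report how long before a reference time each log was last written.
# log_staleness RUNDIR REF_EPOCH [THRESHOLD_SECONDS]
# Assumes GNU stat; the default threshold of 120 s is an illustrative choice.
log_staleness() {
    rundir="$1"
    ref_epoch="$2"
    threshold="${3:-120}"
    verdict="timeout"
    for f in "$rundir"/*.log.*; do
        [ -e "$f" ] || continue
        mtime=$(stat -c %Y "$f")
        age=$(( ref_epoch - mtime ))
        printf '%s\t%ss before reference\n' "$(basename "$f")" "$age"
        # Any log that stopped updating long before the kill suggests a hang.
        if [ "$age" -gt "$threshold" ]; then
            verdict="possible-deadlock"
        fi
    done
    echo "verdict: $verdict"
}
```

A "timeout" verdict means every log was still being written near the reference time, which points toward raising the wallclock limit; "possible-deadlock" means at least one log went quiet well before it.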
 

cdevaneprugh

Cooper
New Member
All the logs were last modified within seconds of the timeout time reported by our scheduler so I'll increase the wallclock time.

Considering that the case was built successfully and it just timed out, would something like the following be fine?
$ ./xmlchange JOB_WALLCLOCK_TIME=06:00:00
$ ./case.submit

Additionally, I poked around the cesm log in /popcase.cesm_tag.000/run and saw many lines towards the end of the file saying:

NetCDF: Invalid dimension ID or name
NetCDF: Variable not found
NETCDF: Attribute not found

Does this mean my netcdf linking or install is still not successful?
 

jedwards

CSEG and Liaisons
Staff member
Yes, that wallclock should do it. NetCDF prints messages like this when you inquire for a variable, dimension or attribute in a file. It may be that the model is just inquiring about optional variables and these messages can be ignored.
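To see whether the log contains only these inquiry-style messages, it can help to tally the distinct NetCDF lines rather than scroll through them. A minimal sketch, where `count_netcdf_msgs` is a hypothetical helper and the match is case-insensitive because the log excerpt above shows both "NetCDF:" and "NETCDF:":

```shell
# Sketch: count each distinct "NetCDF:" message in a log file.
# count_netcdf_msgs is a hypothetical helper name.
count_netcdf_msgs() {
    grep -i 'NetCDF:' "$1" |
        sed 's/^[[:space:]]*//' |   # normalise leading whitespace
        sort | uniq -c | sort -rn   # most frequent message first
}
```

If the tally shows only the three "not found" / "invalid dimension" messages quoted above, that is consistent with optional inquiries; anything else is worth reading in context.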
 

cdevaneprugh

Cooper
New Member
I was able to successfully upload and test the files but am unsure of how to interpret the results and where to go from here. This is the output:

CESM Version Tested: CESM 2.1.5
Metadata retrieved from: case.cesm_tag.uf.000.cam.h0.0001-01-01-00000.nc

***********************************************
PCA Test Results
***********************************************

Summary: 1 PC scores failed at least 2 runs: [8]

These runs PASSED according to our testing criterion.
PC 2: failed 1 runs [3]
PC 5: failed 1 runs [3]
PC 8: failed 2 runs [2, 3]
PC 30: failed 1 runs [1]
PC 43: failed 1 runs [3]
PC 47: failed 1 runs [1]

Run 1: 2 PC scores failed [30, 47]
Run 2: 1 PC scores failed [8]
Run 3: 4 PC scores failed [2, 5, 8, 43]

Testing complete.
 

jedwards

CSEG and Liaisons
Staff member
These runs PASSED according to our testing criterion. There were a couple of principal components out of spec in each run, but the same PC is not out of spec in all three runs and the number out of spec is within the tolerance of the test. I'm not sure what the difficulty in interpreting the results could be?
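For readers unsure how the per-run lines map to the verdict, the tally can be reproduced from the report text. The sketch below counts how many PCs failed in at least a given number of runs and compares that count with a threshold. The defaults (3 PCs, 2 runs) are an assumption based on what I understand the pyCECT defaults to be, not something stated in this thread, and `ect_verdict` is a hypothetical helper, not the tool's own code.

```shell
# Sketch: recompute an ECT-style verdict from the "Run N: ..." report lines.
# ect_verdict FILE [MIN_PCS] [MIN_RUNS]
# Assumed defaults: FAIL when at least 3 PCs each fail in at least 2 runs.
ect_verdict() {
    awk -v min_pc="${2:-3}" -v min_runs="${3:-2}" '
        /^ *Run [0-9]+:/ {
            # Keep only the text between the square brackets.
            s = $0
            sub(/^[^[]*\[/, "", s)
            sub(/\].*$/, "", s)
            n = split(s, pcs, /, */)
            for (i = 1; i <= n; i++)
                if (pcs[i] != "") fails[pcs[i]]++
        }
        END {
            bad = 0
            for (pc in fails)
                if (fails[pc] >= min_runs) bad++
            printf "%d PC(s) failed in at least %d runs: %s\n",
                   bad, min_runs, (bad >= min_pc ? "FAIL" : "PASS")
        }
    ' "$1"
}
```

Applied to the report quoted above, only PC 8 fails in two runs, so the count is 1 and the result is PASS under the assumed thresholds, matching the summary line in the report.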
 

cdevaneprugh

Cooper
New Member
Thank you for the clarification. It's not clear from the results that the principal component failures were within an acceptable tolerance. I wasn't sure if I needed to get the tests to pass with zero PC failures.
 

cdevaneprugh

Cooper
New Member
Hey Jim, is there a way to download the summary files and validate my ECT runs locally? I am having an issue with the CESM website timing out before anything can validate now.
 

jedwards

CSEG and Liaisons
Staff member
Not easily. Do you have a poor network connection, or does the timeout error appear to be on our end?
 

cdevaneprugh

Cooper
New Member
I think the timeout error is on your end. I've tried doing this from several reliable internet connections and the results are always the same. The files upload just fine, but the verification step times out.
 

jedwards

CSEG and Liaisons
Staff member
Have you tried today? The systems people worked on it yesterday and think that they've solved the issue.
 