
CLM5 checks input data directory for GSWP and CRUNCEP before path specified in user_datm.streams.txt.

MrIgnition

Member
Dear all,
In the past I have successfully driven CLM5 with atmospheric forcing other than GSWP and CRUNCEP. During that work, I observed that CLM5 checks the input data directory for GSWP before checking the path I specified in user_datm.streams.txt. I considered this understandable since I am using an I2000Clm50SpGs compset. My workaround was to make sure I had datasets for my reference period in the GSWP input data directories, and that worked without triggering "Model datm missing file file1 =" errors.

Now I would like to run simulations with climate projections for a reference period that is not available in GSWP, but I get the error "Model datm missing file file1 =.......".
My question is this: how do I override or control the generation of the list of paths for my dataset, so that CLM5 will only look in the directory specified in my user_datm.streams.txt?
Best regards.
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
I'm not sure at what point in your workflow you are getting that error, but you can change the path of the forcing data in env_run.xml by setting:

DIN_LOC_ROOT_CLMFORC

The subdirectory would still have to be, e.g., atm_forcing.datm7.GSWP3.0.5d.v1.c200929/Precip, but you could put your own data in there.
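For example, something like this from the case directory (the root path here is illustrative; only the subdirectory layout under it needs to match the GSWP3 convention):

```shell
# Point datm at your own forcing root (the path is an assumption; adjust to your machine).
./xmlchange DIN_LOC_ROOT_CLMFORC=/scratch/$USER/my_forcing
# Your data would then live under, e.g.:
#   /scratch/$USER/my_forcing/atm_forcing.datm7.GSWP3.0.5d.v1.c200929/Precip
```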
 

MrIgnition

Member
Thanks. The error comes immediately after submitting, when data availability is checked, so the submission fails.
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
Can you post the full output of your case.submit?
And attach your datm.streams.txt.CLMGSWP3v1.* files and your datm_in file.
I'll try it here.
 

MrIgnition

Member
Here are the requested files. Thanks.
 

Attachments

  • datm_in.txt (1.3 KB)
  • submit_error.txt (3 KB)
  • user_datm.streams.txt.CLMGSWP3v1.Precip.txt (12.7 KB)
  • user_datm.streams.txt.CLMGSWP3v1.Solar.txt (12.7 KB)
  • user_datm.streams.txt.CLMGSWP3v1.TPQW.txt (12.8 KB)

oleson

Keith Oleson
CSEG and Liaisons
Staff member
Thanks. I don't seem to have any problem changing paths to my datm data using user_* files.
Do you get any errors if you run preview_namelists in your case directory?
This should be generating a list of datm files from your user_* files.
The model is checking for files in that list at Buildconf/datm.input_data_list in your case directory.
Maybe the user_* files aren't being read in correctly?
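You can reproduce that check by hand from the case directory, along these lines (the contents of the list depend on whatever your case generates):

```shell
# From the case directory: regenerate the namelists, then inspect the list of
# files datm will be checked against.
./preview_namelists
cat Buildconf/datm.input_data_list
```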
 

MrIgnition

Member
It has also worked for me when running simulations over periods for which GSWP forcings are available; even if I do not have the data locally, it is downloaded upon case submission.
I would like to share my .xml settings for you to look at, in case I am getting a setting wrong. I run CLM5 with a prepared bash script.

./xmlchange CLM_USRDAT_NAME=$CASENAME
./xmlchange JOB_WALLCLOCK_TIME=$JOB_WALLCLOCK_TIME
./xmlchange ATM_DOMAIN_FILE=$ATM_DOMAIN_FILE
./xmlchange LND_DOMAIN_FILE=$LND_DOMAIN_FILE
./xmlchange NTASKS=$NTASKS
./xmlchange RESUBMIT=$RESUBMIT,STOP_N=$STOP_N,STOP_OPTION=$STOP_OPTION,RUN_STARTDATE=$RUN_STARTDATE,STOP_DATE=$STOP_DATE
./xmlchange DATM_CLMNCEP_YR_ALIGN=$DATM_CLMNCEP_YR_ALIGN,DATM_CLMNCEP_YR_START=$DATM_CLMNCEP_YR_START,DATM_CLMNCEP_YR_END=$DATM_CLMNCEP_YR_END
 

oleson

Keith Oleson
CSEG and Liaisons
Staff member
I don't know what the actual values of the environment variables you are using are (e.g., $DATM_CLMNCEP_YR_ALIGN), so I can't judge whether they are correct.
The only other thing I can think of is that your user_* files aren't being read because they aren't in the correct directory. They should be in your case directory (not in CaseDocs). Are they?
 

MrIgnition

Member
Thanks. They are in my case directory and not in CaseDocs. I found a workaround; it may not be the cleanest, but it works. As advised above, I specified a different path to my dataset in env_run.xml and created the specified directory (to avoid overwriting the original GSWP datasets). I made sure the subdirectories were the same and named all my files like the GSWP files. Since my files were previously named like "2021-01.nc", I used a bash script to add a prefix, e.g., "clmforc.GSWP3.c2011.0.5x0.5.TPQWL." for the TPQWL stream. I also renamed my atm domain file and put it where datm expects to find it.
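The renaming step was essentially a loop like this (a minimal sketch using stand-in files; the prefix and the YYYY-MM date pattern are just the ones from my case):

```shell
#!/bin/sh
# Sketch: prefix raw forcing files (e.g. 2021-01.nc) with the GSWP3-style
# name that datm expects. Directory and file names here are stand-ins.
dir=$(mktemp -d)
touch "$dir/2021-01.nc" "$dir/2021-02.nc"   # stand-ins for real forcing files
prefix="clmforc.GSWP3.c2011.0.5x0.5.TPQWL."

# The glob expands once, before the loop, so already-renamed files are not revisited.
for f in "$dir"/????-??.nc; do
  mv "$f" "$dir/$prefix$(basename "$f")"
done
```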
 

erik

Erik Kluzek
CSEG and Liaisons
Staff member
Ahh, OK, I'm glad you got something that is working for you.

I think the bottom-line problem is that when you have a user_datm.streams.txt.* file overriding the default one, datm doesn't update datm.input_data_list with the files in the user-defined stream file. So it still thinks you need the other files, and doesn't ask about the files it's actually going to use. The input_data_list files get regenerated each time you run preview_namelists (which is also run at least once by most of the case.* scripts). Since it's constantly rerun, you can't just fix it once.

One workaround, though, would be to make sure there are files with the names specified in datm.input_data_list (which it won't actually use). The files in the user_datm.streams.txt.* file then do need to exist and contain valid data for the actual model run. But the files listed in datm.input_data_list don't need to contain valid data (at least the ones that are replaced by the user_datm.streams.txt.* files), since datm only checks that they exist.
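A concrete sketch of that workaround might look like the following; note that the "name = path" line format and the file names in the demo list are assumptions for illustration, not copied from a real datm.input_data_list:

```shell
#!/bin/sh
# Create an empty placeholder for every path in a datm.input_data_list-style
# file, so that the existence check passes. Demo list and paths are made up.
demo=$(mktemp -d)
list="$demo/datm.input_data_list"
cat > "$list" <<EOF
file1 = $demo/forcing/clmforc.GSWP3.Precip.1901-01.nc
file2 = $demo/forcing/clmforc.GSWP3.Solar.1901-01.nc
EOF

while IFS='=' read -r name path; do
  path=$(echo "$path" | tr -d ' ')   # strip surrounding spaces
  [ -n "$path" ] || continue
  mkdir -p "$(dirname "$path")"
  touch "$path"                      # empty file: it exists, but holds no valid data
done < "$list"
```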
 

Min-Seok Kim

Min-Seok Kim
New Member
I created an issue for this in cime


I think I'm experiencing the same problem.
Even though I put 'user_datm.streams.txt' files in my case directory and then ran 'preview_run', no change was made.
Can I ask how to solve the problem of switching the atmospheric forcing data from the original GSWP3 to my own data?
Should I change 'DIN_LOC_ROOT_CLMFORC' in 'env_run.xml' and rename the files as mentioned above?

Thanks in advance for your help.
 

MrIgnition

Member
This was what I did (see my earlier post above). After reading Erik's suggestion, however, I agree that his approach is smarter.
All the best!
 