CESM2/2.1 run fail on NERSC Cori

Hi all,

Since the new allocation year started on NERSC Cori, I've been getting endless run failures for CESM2 (and now 2.1). Bear in mind this exact configuration ran without issues before the new AY, which leads me to suspect some important files got moved around somehow.

It seems to have something to do with GPTL PAPI library version changes, and some incompatibility there (?). I've already reset and rebuilt (and cleaned) the case. I am trying to run a restart of an existing case, and I have all restart files for the date, so I do not think it is due to any missing input files.

I also get this same error for new cases. I am at a loss for what else to do. I've attached the cesm.log output. Let me know if you need anything else, or if I posted this in the wrong place!
 

jedwards

CSEG and Liaisons
Staff member
You can remove the dependence on PAPI: in config_compilers.xml, in the cori section(s), remove the -DHAVE_PAPI flag.
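For readers who prefer to script the edit, here is a minimal Python sketch of removing the flag from a block of text such as the contents of config_compilers.xml. The helper name `strip_flag` is hypothetical and is not part of CIME or CESM; it is only one way to make the change described above, and editing the file by hand works just as well.

```python
import re

FLAG = "-DHAVE_PAPI"  # the preprocessor define named in the reply above


def strip_flag(text: str, flag: str = FLAG) -> str:
    """Return `text` with every whole-token occurrence of `flag` removed.

    Hypothetical helper for illustration only; not a CIME function.
    """
    # Match the flag only when it is not part of a longer identifier,
    # so a define such as -DHAVE_PAPI_EXTRA would be left alone.
    pattern = re.compile(r"(?<![\w-])" + re.escape(flag) + r"(?![\w-])[ \t]*")
    return pattern.sub("", text)
```

To apply it, read the file into a string, pass it through `strip_flag`, and write the result back, keeping a backup copy of the original file first.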
 

Hello,

Thank you for your prompt help! I removed that from the XML file, then reset, cleaned, and rebuilt the case, but I still had a failure. I've attached the log file for the latest run failure. Note that it now says "profile_papi_enable=      F", so I think I did as you suggested. Any thoughts on where to go from here?
 

jedwards

CSEG and Liaisons
Staff member
I think that you removed the module, but you also need to remove the -DHAVE_PAPI flag from the build.
 

jedwards

CSEG and Liaisons
Staff member
I think that you removed the module but you also need to remove the -DHAVE_PAPI from the build. 
 

jedwards

CSEG and Liaisons
Staff member
I think that you removed the module but you also need to remove the -DHAVE_PAPI from the build. 
 
I apologize, but I am not sure I follow what you mean when you suggest I remove it "from the build". I edited $CIME_ROOT/config/cesm/machines/config_machines.xml (attached) to remove the -DHAVE_PAPI flag, then reset, cleaned, and rebuilt the case. I assumed that changing the files in $CIME_ROOT and then rebuilding would also change the build, but I suppose it doesn't? In that case, could you point me to where the -DHAVE_PAPI flag would be in the case directory? Or is there something else I'm missing here? Thank you for your help and patience!
 
Haha, I also removed it from config_compilers.xml, but I must have forgotten I had! Okay, I have removed it from Macros.make (you were correct), reset, cleaned, and rebuilt successfully. I am submitting a test run now and will let you know how it goes. FYI, I have just heard back from help at NERSC. They gave me a much more extensive list of things to do to fix this. I will try your (very simple) fixes first, then will move on to their guidance. Would you like me to keep you in the loop if I continue working with them instead? Again, thank you!
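Since the flag turned out to be lingering in Macros.make inside the case directory, a quick scan can confirm whether any copy remains before rebuilding. The following is a minimal Python sketch; the function name `find_flag` and its parameters are hypothetical, and the default file name is simply the one mentioned in this thread.

```python
import os


def find_flag(root, flag="-DHAVE_PAPI", names=("Macros.make",)):
    """List (path, line number, line) for each line under `root` containing `flag`.

    Hypothetical helper for illustration only; not a CIME function.
    Only files whose names appear in `names` are inspected.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name not in names:
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on unexpected bytes
            with open(path, errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    if flag in line:
                        hits.append((path, lineno, line.rstrip("\n")))
    return hits
```

Calling `find_flag` with the path to the case directory returns an empty list once the flag is gone; any remaining entries show which file and line still need editing.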
 

jedwards

CSEG and Liaisons
Staff member
I'm sure that this change is going to work, but I do want to hear about it if it doesn't.
 