I am modifying the POP source code in CESM2.1.3 and having trouble optimizing the performance.
In forcing_coupled.F90, which receives surface forcing fields from the coupler, I overwrite the CESM prediction of wind stress with my own statistical prediction based on SST from CAM. The statistical prediction uses coefficients/matrices (stored in NetCDF files), which I load as 1D/2D/3D arrays in forcing_fields.F90.
In an earlier version of this "hybrid" model (a CESM simulation with a statistical prediction for wind stress in some regions), things ran quite efficiently. I loaded global geospatial fields in forcing_fields.F90 and then called scatter_global to distribute each field among ranks. Now I am trying to use a single-point time series in my statistical model (i.e., a NetCDF file with one dimension: time = 31390). Because it is just a 1D array (not a geospatial field), my understanding is that I can no longer use scatter_global.
I have tried many methods to do this efficiently. It is not possible to broadcast the data as a global spatial field because the time series is very long, and it seems CESM will not allow an array with dimensions (nx_global, ny_global, time) where time > ~800. My other approaches, including reading one time step's data at a time and using broadcast_array to make a copy of the full time series on each rank once, are quite costly.
I know this is a bit of a niche problem, but I am hoping someone can help me understand how to efficiently load and distribute a single-point time series (a real array) among ranks in POP, so that the load happens only once per simulation (at initialization).