Hi Marshall and Russ,
This is a continuation of our previous discussion in the Google MOM6 forum. To help others, I have appended our previous conversation below my main message.
I am still unable to figure out the problem. Initially, LAYOUT was not set in SIS_input. As you suggested, I added the line below to SIS_input and SIS_layout:
LAYOUT = 32,18 !
SIS_layout and MOM_layout are now identical, as given below:
LAYOUT = 32,18
IO_LAYOUT = 2,2
MASKTABLE = "mask_table.96.32x18" ! 32*18-96 = 480 PEs
Now, when I run the model, I first get a warning:
WARNING: MOM_file_parser : LAYOUT occurs more times than is permitted. Line: 'LAYOUT = 32,18' in file SIS_layout is being ignored.
and then the model aborts with the same error:
---------
NOTE: MOM_domains_init: reading maskmap information from INPUT/mask_table.96.32x18
parse_mask_table: Number of domain regions masked in ice model = 96
FATAL: fms_io(parse_mask_table_2d): mpp_npes() .NE. layout(1)*layout(2) - nmask for ice model
----------
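The LAYOUT warning above points to the same parameter being set in more than one file. A minimal sketch for spotting such repeats is shown below. It assumes the `NAME = value ! comment` line format seen in this thread, and `find_repeated` is a hypothetical helper, not part of MOM6 or SIS2.

```python
from collections import defaultdict


def find_repeated(files):
    """Map each parameter name to the files that set it, keeping only repeats.

    `files` is a dict of {file_name: file_text}.
    """
    seen = defaultdict(list)
    for file_name, text in files.items():
        for line in text.splitlines():
            line = line.split("!", 1)[0]  # drop trailing comment
            if "=" in line:
                name = line.split("=", 1)[0].strip()
                seen[name].append(file_name)
    return {name: where for name, where in seen.items() if len(where) > 1}


if __name__ == "__main__":
    # Contents copied from this thread; in practice, read the real files.
    files = {
        "SIS_input": "LAYOUT = 32,18 !\n",
        "SIS_layout": 'LAYOUT = 32,18\nIO_LAYOUT = 2,2\n'
                      'MASKTABLE = "mask_table.96.32x18"\n',
    }
    print(find_repeated(files))
```

Keeping each parameter in only one of the files should avoid the warning, although the warning by itself does not explain the FATAL error.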
So I have now modified fms_io.F90 (as Russ suggested) at lines 8444-8445, as given below:
-----
! Print the values used in the check before aborting, so the mismatch shows up in the log
if( mpp_npes() .NE. layout(1)*layout(2) - nmask )then
   write(stdoutunit,*) "npes=", mpp_npes(), ", layout(1)=", layout(1), &
                       ", layout(2)=", layout(2), ", nmask=", nmask
   call mpp_error(FATAL, &
      "fms_io(parse_mask_table_2d): mpp_npes() .NE. layout(1)*layout(2) - nmask for "//trim(modelname))
endif
-----
I hope this is correct. It is now compiling. I will update you with the outcome.
The fms.out file is attached to this message for reference (in case it helps in diagnosing the problem).
Thanks,
Abhisek Chatterjee
INCOIS
**************************************************OLD Messages*********************************************
Hi Marshall,
The big problem with this FMS error message is that it tells you only which check failed, even though the code knows exactly what went wrong and why, and has all the information at hand but would rather keep it a secret!
In fms_io.F90 (line 8451) we have
!--- make sure mpp_npes() == layout(1)*layout(2) - nmask
if( mpp_npes() .NE. layout(1)*layout(2) - nmask ) call mpp_error(FATAL, &
"fms_io(parse_mask_table_2d): mpp_npes() .NE. layout(1)*layout(2) - nmask for "//trim(modelname))
The problem is detected so why not print out mpp_npes(), layout(1), layout(2) and nmask so the user can quickly see the problem rather than hunting down things all over the place? It's done for the check just above it.
Cheers,
Russ
marsha...@noaa.gov
to MOM Users Mailing List
480 CPUs is correct (32*18 - 96) when the mask table is present. Based on your error, I don't think anything else is required. It may be that your LAYOUT is set in MOM_input but not SIS_input. It could also be an issue with the MPI launcher flags, e.g. maybe you are specifying nodes and it is implicitly assigning more than 480 CPUs.
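As a quick sanity check of this arithmetic, the sketch below derives the expected rank count from a mask table file name. It assumes the name follows the pattern `mask_table.<nmask>.<nx>x<ny>`, which is inferred from the file used in this thread rather than from FMS documentation. The function name `expected_npes` is hypothetical.

```python
import re


def expected_npes(mask_table_name):
    """Return (nx, ny, nmask, npes) inferred from a mask table file name.

    Assumes the naming pattern mask_table.<nmask>.<nx>x<ny>; this is a
    convention inferred from this thread, not something FMS enforces.
    """
    match = re.fullmatch(r"mask_table\.(\d+)\.(\d+)x(\d+)", mask_table_name)
    if match is None:
        raise ValueError(f"unrecognised mask table name: {mask_table_name!r}")
    nmask, nx, ny = (int(group) for group in match.groups())
    # The FMS check requires mpp_npes() == layout(1)*layout(2) - nmask
    return nx, ny, nmask, nx * ny - nmask


if __name__ == "__main__":
    nx, ny, nmask, npes = expected_npes("mask_table.96.32x18")
    print(f"layout {nx}x{ny}, {nmask} masked domains: launch with {npes} MPI ranks")
```

If the rank count actually handed to the MPI launcher differs from this value, the check in parse_mask_table_2d will fail regardless of how LAYOUT is set.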
First I would suggest looking in `MOM_parameter_doc.layout` and `SIS_parameter_doc.layout` and confirming that LAYOUT is 32, 18 and that MASKTABLE points to the correct file in both MOM and SIS. (You could also explicitly set these in MOM_input and SIS_input).
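This comparison can also be scripted. The sketch below is an illustration only: it assumes the parameter files use the `NAME = value ! comment` form seen in this thread, and the helper names `read_params` and `compare_layouts` are hypothetical.

```python
def read_params(text):
    """Parse 'NAME = value ! comment' lines into a dict (first occurrence wins)."""
    params = {}
    for line in text.splitlines():
        line = line.split("!", 1)[0].strip()  # drop trailing comment
        if "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        # Normalise spacing and quotes so '32, 18' and '32,18' compare equal
        params.setdefault(name, value.replace(" ", "").strip('"'))
    return params


def compare_layouts(mom_text, sis_text, keys=("LAYOUT", "MASKTABLE")):
    """Return {key: (mom_value, sis_value)} for keys whose values differ."""
    mom, sis = read_params(mom_text), read_params(sis_text)
    return {k: (mom.get(k), sis.get(k)) for k in keys if mom.get(k) != sis.get(k)}


if __name__ == "__main__":
    # Stand-in text; in practice, read MOM_parameter_doc.layout and
    # SIS_parameter_doc.layout from the run directory.
    mom_doc = 'LAYOUT = 32, 18\nMASKTABLE = "mask_table.96.32x18" ! 480 PEs\n'
    sis_doc = "LAYOUT = 32,18\n"
    print(compare_layouts(mom_doc, sis_doc))
```

An empty result means both components report the same LAYOUT and MASKTABLE. A value of `None` on one side means that file does not set the parameter at all.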
Next, try to confirm that you are launching MPI with 480 ranks.
If none of that works, then we may need more information.
We are currently trying to migrate MOM6 support to the CESM forums, so you may want to ask your question over there: MOM6
On Monday, June 7, 2021 at 3:13:11 AM UTC-4 chatterj...@gmail.com wrote:
Dear MOM community,
I am new to MOM6 and currently exploring test cases to understand the process of setting up a regional model for my applications.
I have now encountered an error while running the OM4_025 test case under the ice_ocean_SIS2 experiments. It seems the error comes from a mismatch between the layout and the number of processors allocated.
I have tried npes=576 (18 cores with 32 processors each), as the layout suggests. I also tried npes=480 with the same number of cores allocated. Both times the run stopped with the same error.
---
NOTE: MOM_domains_init: reading maskmap information from INPUT/mask_table.96.32x18
parse_mask_table: Number of domain regions masked in ice model = 96
FATAL: fms_io(parse_mask_table_2d): mpp_npes() .NE. layout(1)*layout(2) - nmask for ice model
---
Can someone please suggest if I have to make any other modification somewhere to run this test case?
Thank you,
With best regards,
Abhisek