Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OBC fails if not at the outer edge of the model domain #753

Open
ThomasNeumann2 opened this issue Nov 2, 2024 · 25 comments
Open

OBC fails if not at the outer edge of the model domain #753

ThomasNeumann2 opened this issue Nov 2, 2024 · 25 comments

Comments

@ThomasNeumann2
Copy link

I am running a regional MOM6 setup with open boundaries. As long as the OBCs are located at the outer edge of the model (I,J = N or 0) domain, everthing runs fine. But moving the OBC inwards, the model fails after a couple of time steps. Error messages are different.
Then I made a simple test. I set up a rectangle model with flat bottom and defined 4 OBCs at each model edge, W, N, E, and S. Everything works as expected:
=============================================>>>>>>
OBC_NUMBER_OF_SEGMENTS = 4

OBC_SEGMENT_001 = "I=0,J=N:0,FLATHER,ORLANSKI"
OBC_SEGMENT_001_DATA = "SSH=value:0.1,U=value:0.0,V=value:0,TEMP=value:7.0,SALT=value:33.0"

OBC_SEGMENT_002 = "I=N,J=0:N,FLATHER,ORLANSKI"
OBC_SEGMENT_002_DATA = "SSH=value:0.1,U=value:0.0,V=value:0,TEMP=value:7.0,SALT=value:33.0"

OBC_SEGMENT_003 = "J=N,I=N:0,FLATHER,ORLANSKI"
OBC_SEGMENT_003_DATA = "SSH=value:0.1,U=value:0.0,V=value:0,TEMP=value:7.0,SALT=value:33.0"

OBC_SEGMENT_004 = "J=0,I=0:N,FLATHER,ORLANSKI"
OBC_SEGMENT_004_DATA = "SSH=value:0.1,U=value:0.0,V=value:0,TEMP=value:7.0,SALT=value:33.0"
=============================================<<<<<<<

Moving e.g. the West OBC one grid point inwards:
=============================================>>>>>>>
OBC_NUMBER_OF_SEGMENTS = 4

OBC_SEGMENT_001 = "I=1,J=N:0,FLATHER,ORLANSKI"
OBC_SEGMENT_001_DATA = "SSH=value:0.1,U=value:0.0,V=value:0,TEMP=value:7.0,SALT=value:33.0"

OBC_SEGMENT_002 = "I=N,J=0:N,FLATHER,ORLANSKI"
OBC_SEGMENT_002_DATA = "SSH=value:0.1,U=value:0.0,V=value:0,TEMP=value:7.0,SALT=value:33.0"

OBC_SEGMENT_003 = "J=N,I=N:1,FLATHER,ORLANSKI"
OBC_SEGMENT_003_DATA = "SSH=value:0.1,U=value:0.0,V=value:0,TEMP=value:7.0,SALT=value:33.0"

OBC_SEGMENT_004 = "J=0,I=1:N,FLATHER,ORLANSKI"
OBC_SEGMENT_004_DATA = "SSH=value:0.1,U=value:0.0,V=value:0,TEMP=value:7.0,SALT=value:33.0"

MASK_OUTSIDE_OBCS = True
=================================================<<<<<<

MOM6 fails with: FATAL from PE 0: NaN detected: u radiation_OBCs: OBC%tres_[xy]_001

The error message is independent on which OBC I move and independent on the number of grid points.

For the realistic setup, the error message was different and depends on the concrete setting.

Has anybody else detected this behavior or has running a regional model with OBCs inside the model domain?

@kshedstrom
Copy link

I tried moving the "west" boundary in my Bering domain two grids east and everything was fine. Moving the "east" boundary two grids west also works (the ice is in the east and doesn't even know about those two grids being masked out). This is a short test of three hours.

When does yours blow up? Can you put the MOM_input of the idealized case here or does it depend on other files?

@ThomasNeumann2
Copy link
Author

Thanks for trying out. The model blows up almost immediately, a few time steps. Now I am a bit puzzled.
(1) I manipulated my coastal model by using a flat bathymetry in both MOM and SIS.
(2) I started from scratch and made a new mosaic with flat bathy and less grid points.
In (1) all grid points outside the OBC are masked out (wet in ocean_geometry), but not in SIS
In (2) only one row/column outside the OBC is masked out and the other points are wet. Similar as you discovered.
Up to now I did not found the difference which caused the different masking.
However, both models fail with FATAL from PE 0: NaN detected: u radiation_OBCs: OBC%tres_[xy]_001
after a few time steps.

I append MOM*all and MOM_layout.

MOM_parameter_doc.layout.txt
MOM_parameter_doc.all.txt

@kshedstrom
Copy link

You have to have land outside all OBCs, all the way to the edge. Mine was wrong because I was using a MASKING_DEPTH, but hadn't taught MASK_OUTSIDE_OBCS about it. I suggest you get rid of your MASKING_DEPTH line.

@ThomasNeumann2
Copy link
Author

Yes, this is the difference between my cses (1) and (2). No MASKING_DEPTH or MASKING_DEPTH = 0.0 makes all points outside the OBCs to land. But still it blows up . Maybe you can provide me with tar ball of your MOM source? The I can compare whether I have a different development stage in use.

@kshedstrom
Copy link

My sources are here: https://github.com/ESMG/MOM6/tree/fix_obc_maskingdepth

My Bering Sea ascii files are in: https://github.com/ESMG/Bering/tree/main/test_short

@ThomasNeumann2
Copy link
Author

Thanks for the code, but with your MOM6 the bug remains.
I wonder which SIS2 version you are unsing. I am trying out step by step your settings. But some options are not in the code I am using, e.g.:
DO_DYNAMICS, USE_TRIPOLAR_GEOLONB_BUG
I use this repository: https://github.com/NOAA-GFDL/SIS2.git

@ThomasNeumann2
Copy link
Author

I don't think that SIS is the problem. I switched SIS and flux of in input.nml
&coupler_nml
months = 0,
days = 0,
hours = 1,
current_date = 1990,1,1,0,0,0,
calendar = 'gregorian',
dt_cpld = 3600,
dt_atmos = 3600,
do_atmos = .false.,
do_land = .false.,
do_ice = .false.,
do_ocean = .true.,
do_flux = .false.,
atmos_npes = 0,
concurrent = .false.
use_lag_fluxes=.false.
check_stocks = 0
do_chksum = .false.
/

The model blows up after 1 to 2 hours.

@ThomasNeumann2
Copy link
Author

The run time depends on time step setting, but only on DT_THERM. The longer DT_THERM the longer the model runs. For a combination DT=600, and DT_THERM=7200, it runs about 18 hours, for DT_THERM=3600 9 hours. That is, always 8 DT_THERM time steps. I tested several other DT_THERMs.
The barotropic waves develop nicely until NaNs develop at the OBC and spread into the model domain.

If I got it right, you run your model for three hours, that is for 3 DT_THERM time steps?

@ThomasNeumann2
Copy link
Author

Last message for today. I made an ocean_only build and it is the same problem. OBCs at the model's edge run fine but inside the model domain it blows up. Thus, we can exclude SIS and some masking problems, I think.

@kshedstrom
Copy link

OK, yes, I didn't run it long enough. I can go 14 hours, then get:

WARNING from PE    23: Bad ice data End set_ice_surface_state ;  at  172.6  72.9 or i,j,k =   62  14   0; nbad =      2 on pe   23 ; T_sfc =  -1.9034E+09, ps =   9.9999E-01, S_sfc =   3.8475E+09

[n1:1957273:0:1957273] Caught signal 8 (Floating point exception: floating-point invalid operation)

@kshedstrom
Copy link

Does your ocean_only case require any netcdf files at all? Can you make a tarball of the inputs?

@ThomasNeumann2
Copy link
Author

No nc files are needed. Here are my INPUT , input.nml and *_table files. Sorry, the files are not very clean due to the various testing. (I renamed the tars to *.tar.gz to get git accepting it)
INPUT.tar.gz
tables.tar.gz

I discovered the the problem my start in the barotropic solver. I diagnosed each barotropic time step and it seems that "High Frequency Predictor Barotropic SSH" will have at some time an invalid value at the obc.

The first fig shows the Predictor Barotropic SSH in time step 109 with a strong negative value. It happens just once. Fig 2 shows the time development of the predictor ssh at lat=54 for the last time steps until the model blows up. One can also see that the NaNs (blanc values in the fig) propagate inwards one grid cell each time step. After the barotropic mode is completed the corrupted ssh and barotropic velocities enter the baroclinic mode and bring the model to a fatal error.
The West OBC is at I=2, the others at I,J=0,N

Maybe this is a hint where to look at?

eta_pred_hifreq
eta_pred_hifreq100-119

@ThomasNeumann2
Copy link
Author

Ups, I forgot the color bar. the negative value is -0.9.
eta_pred_hifreq

@kshedstrom
Copy link

Yes, I can run your test and see what you are seeing. The T and S values get insane inside the land mask next to the OBC. Then the equation of state can't handle it. I will see if I can track it down.

@ThomasNeumann2
Copy link
Author

Could it be that in MOM_dynamics_split_RK2.F90 is a missmatch between bathyT and h(:,:,:) when eta is initialized at about code line 1546. bathyT(J=OBS) = 10m, while h(i,J=OBC,k)=0.001m. This yields an eta(J=OBC) = -9.99 m, which then enters the BT solver.
This is not the case if the OBC is at J=N. (maybe it is somehow outside the domain?) It is in my teste case so with the OBC now in the North.

@kshedstrom
Copy link

Nothing needs to be valid outside of the OBC. Each OBC is on a grid cell edge and the cell next to it is out of bounds for any algorithm. However, what is going on for me is that the tracer values are getting updated on the outside. I will see about submitting a fix for it.

Tracer values never get updated outside of the computational tile so the edge OBCs are safe from this problem.

@kshedstrom
Copy link

OK, checkout this version: https://github.com/NOAA-GFDL/MOM6/pull/752/files

@kshedstrom
Copy link

It is possible that eta is also getting updated outside the OBCs in a similar manner. Those values get masked out with the special value on output, so I can't see it from outside the model. I don't know how much trouble that would generate, but it can be fixed in the same way - tomorrow.

@ThomasNeumann2
Copy link
Author

Unfortunately the fix does not work for me. I cloned dev/gfdl and copied your updated 2 files in. However, it seems to be the same bug. Have you tried out it in the flat bathymetry ocean-only setup? It is working for you? If so, I would be keen to try your entire code,

@ThomasNeumann2
Copy link
Author

Oh sorry, I have to apologize. It works fine for E and W boundaries, but unfortunately not for N and S.

@kshedstrom
Copy link

OK, sorry about that. Try again.

If the eta outside changing is a problem, I alsohave a eta_outside branch.

@ThomasNeumann2
Copy link
Author

Thanks, now it works fine. I also downloaded your eta_outside patch.
Please tell me if I am too much a pain in the neck. However, there is another issue with MASK_OUTSIDE_OBCS. It seems that it works not correctly if the OBC do not span the entire model edge. It is ok if I,J=N,0 but not inside. I tried the following test cases:

=====================================>>>>

OBC_NUMBER_OF_SEGMENTS = 2
! test case 1
!=========================
!OBC_SEGMENT_001 = "I=N-2,J=70:N-2,FLATHER,ORLANSKI"
!OBC_SEGMENT_002 = "J=N-2,I=N-2:70,FLATHER,ORLANSKI"
!
!=========================
! test case 2
!OBC_SEGMENT_001 = "I=0,J=N:70,FLATHER,ORLANSKI"
!OBC_SEGMENT_002 = "J=N,I=30:0,FLATHER,ORLANSKI"
!
!=========================
! test case 3
!OBC_SEGMENT_001 = "I=2,J=N-2:70,FLATHER,ORLANSKI"
!OBC_SEGMENT_002 = "J=N-2,I=30:2,FLATHER,ORLANSKI"
!
!========================
! test case 4
!OBC_SEGMENT_001 = "I=2,J=30:2,FLATHER,ORLANSKI"
!OBC_SEGMENT_002 = "J=2,I=2:30,FLATHER,ORLANSKI"
!
!========================
! test case 5
!OBC_SEGMENT_001 = "I=0,J=30:0,FLATHER,ORLANSKI"
!OBC_SEGMENT_002 = "J=0,I=0:30,FLATHER,ORLANSKI"
!
!========================
! test case 6
!OBC_SEGMENT_001 = "J=2,I=70:N-2,FLATHER,ORLANSKI"
!OBC_SEGMENT_002 = "I=N-2,J=2:30,FLATHER,ORLANSKI"
!
!========================
! test case 7
OBC_SEGMENT_001 = "J=0,I=70:N,FLATHER,ORLANSKI"
OBC_SEGMENT_002 = "I=N,J=0:30,FLATHER,ORLANSKI"
!
!###############
OBC_SEGMENT_001_DATA = "SSH=value:0.1,U=value:0.0,V=value:0,TEMP=value:7.0,SALT=value:33.0"
OBC_SEGMENT_002_DATA = "SSH=value:0.1,U=value:0.0,V=value:0,TEMP=value:7.0,SALT=value:33.0"

<<<===========================================

test cases 2 and 7 look ok. When the western boundary is involved and J>0 , the mask looks really strange.
Sorry for approaching you again.

@kshedstrom
Copy link

I don't mind this at all, but I think your failing cases are not well formed. Having an open boundary out in the middle of the ocean doesn't work unless you build a staircase of OBCs with connecting corners from domain edge to domain edge or from land to land. The MASK_OUTSIDE routine does a flood fill of turning water to land until it hits more land or the edge. If it fills the entire domain, that is not following the rules. I will try something like this:

OBC_SEGMENT_001 = "J=70,I=N:70,FLATHER,ORLANSKI"
OBC_SEGMENT_002 = "I=70,J=70:N,FLATHER,ORLANSKI"

This runs for me.

I thought of a better way to do the outside eta. If that runs, I'll add it to the main PR.

@ThomasNeumann2
Copy link
Author

You are right, I got it now. My regional model is running fine with all 6 OBCs. Thank you very much for your help and assistance.

@kshedstrom
Copy link

Good to hear! I will close this assuming PR #752 will somehow pass the automated testing and be accepted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants