Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VODE failure in Detonation with self-consistent NSE #2768

Open
yut23 opened this issue Mar 9, 2024 · 7 comments
Open

VODE failure in Detonation with self-consistent NSE #2768

yut23 opened this issue Mar 9, 2024 · 7 comments

Comments

@yut23
Copy link
Collaborator

yut23 commented Mar 9, 2024

I was testing stuff on my workstation, and ran into this error (which Zhi was able to reproduce on groot, but not on bender).

This change in Microphysics fixes the integration failure, but I have no idea why. I was getting an infinity when dividing by k_B, so I moved it after dividing by T_in (>1) and multiplying by MeV2erg (<1) to try and avoid the overflow. That shouldn't matter though, since it would be clamped to 500 anyway.

diff --git a/nse_solver/nse_solver.H b/nse_solver/nse_solver.H
index 3b6578f4..84cfdbe4 100644
--- a/nse_solver/nse_solver.H
+++ b/nse_solver/nse_solver.H
@@ -135,7 +135,7 @@ void apply_nse_exponent(T& nse_state) {
       exponent = amrex::min(500.0_rt,
                             (zion[n] * nse_state.mu_p + (aion[n] - zion[n]) *
                              nse_state.mu_n - u_c + network::bion(n+1)) /
-                            C::k_B / T_in * C::Legacy::MeV2erg);
+                            T_in * C::Legacy::MeV2erg / C::k_B);
 
       nse_state.xn[n] *= std::exp(exponent);
   }

I built with make USE_MPI=FALSE USE_NSE_NET=TRUE USE_SIMPLIFIED_SDC=TRUE and ran ./Castro1d.gnu.SMPLSDC.ex inputs-det-x.nse_net.

Output from last step

...estimated hydro-limited timestep at level 0: 3.222580504e-07
...hydro-limited CFL timestep constrained at (i) = (193)
Castro::estTimeStep (hydro-limited) at level 0:  estdt = 3.222580504e-07

[Level 0 step 56] ADVANCE at time 9.951155886e-06 with dt = 3.222580504e-07

  Beginning subcycle 1 starting at time 9.951155886e-06 with dt = 3.222580504e-07
  Estimated number of subcycles remaining: 1

Beginning SDC iteration 1 of 2.

... Entering construct_ctu_hydro_source() on level 0

... Leaving construct_ctu_hydro_source() on level 0

Castro::construct_ctu_hydro_source() time = 0.001451384 on level 0

... Entering burner on level 0 and doing full timestep of burning.

burn entered NSE during integration (after 60 steps), zone = (184, 0, 0)
recovering burn failure in NSE, zone = (184, 0, 0)
burn entered NSE during integration (after 60 steps), zone = (185, 0, 0)
recovering burn failure in NSE, zone = (185, 0, 0)
burn entered NSE during integration (after 59 steps), zone = (186, 0, 0)
recovering burn failure in NSE, zone = (186, 0, 0)
burn entered NSE during integration (after 19 steps), zone = (187, 0, 0)
recovering burn failure in NSE, zone = (187, 0, 0)
DVODE: error test failed repeatedly or with abs(H) = HMIN
[ERROR] integration failed in net
istate = -2
zone = (194, 0, 0)
time = 2.025443356e-07
dt = 3.22258050406865e-07
dens start = 212399378.6385347
temp start = 3981719770.328372
rhoe start = 1.904100715097615e+26
xn start = 0.0007153127934977045 0.812844273308981 0.002962516354105738 9.128545590383544e-08 0.05272940826057532 0.002210605450790213 0.001337929768998882 0.001078099952268411 0.006333973696701896 0.01574582073890216 0.01309927771736488 0.020138575954406 0.01535275490213217 0.04506819421271176 0.001062676621846327 0.0006935265398149839 0.001921922549013335 0.006705039892433533 
dens current = 216440008.8108908
temp current = 7511468610.831602
xn current = 0.0006959642445108953 0.5907176390225434 -0.001561313237238796 -1.444856999715345e-09 -0.009999999986059249 1.226987861599856e-07 1.299721875453768e-06 1.530013282592945e-05 0.0004690583581444161 0.002557895007678554 0.02103677770907929 0.003968387633176645 0.00451799109766617 0.007302822611138544 0.001761289015035479 0.01192144675640181 0.07894543844077183 0.2876514946522588 
A(rho) = 19949361507225.72
A(rho e) = 6.987402724044161e+31
A(rho X_k) = 13585002171.45476 -5463760884135.337 -169546081756.1912 -2280381.087711663 -3832671099895.636 -160642110030.9785 -97049770872.25642 -77427775252.36699 -416523621923.3657 -913454101324.0404 1029506938445.772 -1102818950874.122 -702520593397.8518 -2607860536593.861 85364403657.76393 1052176817164.868 7168432931476.591 26144573220746.41 
rho = 216440008.811
T =   7511468610.83
xn = 0.000695964244511 0.590717639023 -0.00156131323724 -1.44485699972e-09 -0.00999999998606 1.2269878616e-07 1.29972187545e-06 1.53001328259e-05 0.000469058358144 0.00255789500768 0.0210367777091 0.00396838763318 0.00451799109767 0.00730282261114 0.00176128901504 0.0119214467564 0.0789454384408 0.287651494652 
(i, j, k) = 194 0 0
y[SRHO] = 216440008.811
y[SEINT] = 2.81045443531e+26
y[SFS:] = 148912.641199 126393452.715 0 0 0 26.253360662 278.096207995 3273.70724542 100362.510774 547302.400243 4501153.83899 849099.775484 966696.19535 1562555.27753 376855.853175 2550783.51713 16891634.6516 61547621.3777 
ydot_a[SRHO] = 1.99493615072e+13
ydot_a[SEINT] = 6.98740272404e+31
ydot_a[SFS:] = 13585002171.5 -5.46376088414e+12 -169546081756 -2280381.08771 -3.8326710999e+12 -160642110031 -97049770872.3 -77427775252.4 -416523621923 -913454101324 1.02950693845e+12 -1.10281895087e+12 -702520593398 -2.60786053659e+12 85364403657.8 1.05217681716e+12 7.16843293148e+12 2.61445732207e+13 

0
amrex::Error::0::unsuccessful burn !!!
SIGABRT
See Backtrace.0 file for details

Castro --describe

==============================================================================
 Castro Build Information
==============================================================================
build date:    2024-03-08 18:59:30.544102
build machine: Linux xrb 6.2.8-200.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Mar 22 19:11:02 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
build dir:     /home/eric/dev/Castro/Exec/science/Detonation
AMReX dir:     /home/eric/dev/amrex

COMP:          gnu
COMP version:  12.2.1

C++ compiler:  g++
C++ flags:     -Werror=return-type -gdwarf-4 -O3 -std=c++17  -pthread   -DBL_NO_FORT -DAMREX_GPU_MAX_THREADS=0 -DBL_SPACEDIM=1 -DAMREX_SPACEDIM=1 -DBL_FORT_USE_UNDERSCORE -DAMREX_FORT_USE_UNDERSCORE -DBL_Linux -DAMREX_Linux -DAMREX_DIMENSION_AGNOSTIC -DNDEBUG -DAMREX_NO_PROBINIT -DSPONGE -DREACTIONS -DSDC_EVOLVE_ENERGY -DSIMPLIFIED_SDC -DSDC -DSCREEN_METHOD=SCREEN_METHOD_chabrier1998 -DNSE_NET -DNSE -DNETWORK_HAS_CXX_IMPLEMENTATION -DALLOW_JACOBIAN_CACHING -DSCREENING -DNEUTRINOS -DNAUX_NET=0 -Itmp_build_dir/s/1d.gnu.SMPLSDC.EXE -I. -I/home/eric/dev/amrex/Src/Base -I/home/eric/dev/amrex/Src/Base/Parser -I/home/eric/dev/amrex/Src/AmrCore -I/home/eric/dev/amrex/Src/Amr -I/home/eric/dev/amrex/Src/Boundary -I/home/eric/dev/Microphysics/util -I/home/eric/dev/Microphysics/util/gcem/include -I/home/eric/dev/Microphysics/integration/VODE -I/home/eric/dev/Microphysics/integration/utils -I/home/eric/dev/Microphysics/integration -I/home/eric/dev/Microphysics/screening -I/home/eric/dev/Microphysics/neutrinos -I./ -I/home/eric/dev/Castro/Source/driver -I/home/eric/dev/Castro/Source/hydro -I/home/eric/dev/Castro/Source/problems -I/home/eric/dev/Castro/Source/sources -I/home/eric/dev/Castro/Source/reactions -I/home/eric/dev/Microphysics/EOS -I/home/eric/dev/Microphysics/EOS/helmholtz -I/home/eric/dev/Microphysics/networks/subch_base -I/home/eric/dev/Microphysics/EOS -I/home/eric/dev/Microphysics/networks -I/home/eric/dev/Microphysics/interfaces -I/home/eric/dev/Microphysics/constants -I/home/eric/dev/Microphysics/nse_solver -I/home/eric/dev/Microphysics/util/hybrj -Itmp_build_dir/castro_sources/1d.gnu.SMPLSDC.EXE -I/home/eric/dev/amrex/Tools/C_scripts 

Fortran comp:  gfortran
Fortran flags: -g -O3 -ffree-line-length-none -fno-range-check -fno-second-underscore -fimplicit-none 

Link flags:    -L. 
Libraries:       

EOS: /home/eric/dev/Microphysics/EOS/helmholtz
NETWORK: /home/eric/dev/Microphysics/networks/subch_base
INTEGRATOR: VODE
SCREENING: chabrier1998

Castro       git describe: 24.03-1-g25cc6d8f8
AMReX        git describe: 24.03-5-gba95d4cd8
Microphysics git describe: 24.03-2-g9dd9dbb6

@zhichen3
Copy link
Collaborator

zhichen3 commented Mar 9, 2024

what was the value you're getting after moving k_B to the end?

@yut23
Copy link
Collaborator Author

yut23 commented Mar 9, 2024

I get 1.0326211552283207e+297, with mu_p = 2.2538475573742366e+296. I also tried avoiding the overflow altogether by replacing the amrex::min call with a comparison against 500.0_rt * C::k_B * T_in / C::Legacy::MeV2erg, but the integration still failed:

      // prevent an overflow on exp by capping the exponent -- we hope that a subsequent
      // iteration will make it happy again

      Real numer = (zion[n] * nse_state.mu_p + (aion[n] - zion[n]) *
                    nse_state.mu_n - u_c + network::bion(n+1));
      if (numer > 500.0_rt * T_in * C::k_B / C::Legacy::MeV2erg) {
          exponent = 500.0_rt;
      } else {
          exponent = numer / C::k_B / T_in * C::Legacy::MeV2erg;
          // uncommenting this fixes the integration failure
          //exponent = numer / T_in * C::Legacy::MeV2erg / C::k_B;
      }

@zhichen3
Copy link
Collaborator

zhichen3 commented Mar 9, 2024

I think the problem is with nse state mass fraction in the previous timestep. If it overflows we get e500, but we get e297, which means a different NSE composition, and its the wrong one and could be an unrealistic one. So vode in the next timestep is having hard time integrating.

So I think we need to be careful about the overflowing issue, and most importantly mu_p and mu_n need to be capped during hybrj because it certainly cannot be e296, which I think is also why it thinks e500 is a valid exponent that satisfies the constraint equations due to the absurd chemical potentials.

@zhichen3
Copy link
Collaborator

zhichen3 commented Mar 9, 2024

hmm, never mind. I think it never really returns those answers as the solution.

@zingale
Copy link
Member

zingale commented Jun 24, 2024

is this still an issue?

@yut23
Copy link
Collaborator Author

yut23 commented Jun 25, 2024

I managed to reproduce it with burn_cell_sdc on the same checkout of Microphysics, but I had to manually set burn_state.y[9] = 3344402.5410965709 in burn_cell.H. The crash only shows up with that exact value, which is not accessible by doing rho * xn[9]: rho * xn[9] is 1 ULP too low, and rho * nextafter(xn[9], xn[9]+1) is 1 ULP too high.

It no longer crashes on development, and git bisect says it was fixed in AMReX-Astro/Microphysics@f361dc3. I'm pretty sure whatever is causing this is still present, it's just harder to find.

@yut23
Copy link
Collaborator Author

yut23 commented Jun 25, 2024

Here's the inputs file I used:

unit_test.density = 212399378.63853467
unit_test.temperature = 3981719770.3283715
unit_test.rhoe = 1.904100715097615e+26
unit_test.tmax = 3.22258050406865e-07
unit_test.nsteps = 1

unit_test.X1 = 0.0007153127934977045
unit_test.X2 = 0.812844273308981
unit_test.X3 = 0.0029625163541057383
unit_test.X4 = 9.128545590383544e-08
unit_test.X5 = 0.052729408260575324
unit_test.X6 = 0.0022106054507902134
unit_test.X7 = 0.0013379297689988816
unit_test.X8 = 0.0010780999522684105
unit_test.X9 = 0.0063339736967018955
unit_test.X10 = 0.015745820738902152  # this is the problematic value
unit_test.X11 = 0.013099277717364884
unit_test.X12 = 0.020138575954406004
unit_test.X13 = 0.015352754902132172
unit_test.X14 = 0.04506819421271176
unit_test.X15 = 0.0010626766218463275  # this one is also different from Castro
unit_test.X16 = 0.0006935265398149839
unit_test.X17 = 0.0019219225490133352
unit_test.X18 = 0.006705039892433533


unit_test.mu_p = -7.732686016863321
unit_test.mu_n = -9.806752357425662

unit_test.Adv_rho = 19949361507225.723
unit_test.Adv_rhoe = 6.987402724044161e+31

unit_test.Adv_X1 = 13585002171.454762
unit_test.Adv_X2 = -5463760884135.337
unit_test.Adv_X3 = -169546081756.1912
unit_test.Adv_X4 = -2280381.087711663
unit_test.Adv_X5 = -3832671099895.6357
unit_test.Adv_X6 = -160642110030.9785
unit_test.Adv_X7 = -97049770872.25642
unit_test.Adv_X8 = -77427775252.36699
unit_test.Adv_X9 = -416523621923.3657
unit_test.Adv_X10 = -913454101324.0404
unit_test.Adv_X11 = 1029506938445.7719
unit_test.Adv_X12 = -1102818950874.1218
unit_test.Adv_X13 = -702520593397.8518
unit_test.Adv_X14 = -2607860536593.861
unit_test.Adv_X15 = 85364403657.76393
unit_test.Adv_X16 = 1052176817164.8684
unit_test.Adv_X17 = 7168432931476.591
unit_test.Adv_X18 = 26144573220746.41


integrator.call_eos_in_rhs = 1

integrator.rtol_spec = 1.e-6
integrator.atol_spec = 1.e-6
integrator.rtol_enuc = 1.e-6

integrator.jacobian = 2
integrator.renormalize_abundances = 0
integrator.ode_max_steps = 1500000

nse.nse_molar_independent = 0
nse.nse_dx_independent = 1
nse.ase_tol = 0.1
nse.nse_skip_molar = 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants