Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Questionable parametrization of V_pulse in Spice3 library leads to numerically unreliable computations #4478

Open
casella opened this issue Oct 15, 2024 · 22 comments · May be fixed by #4481
Open
Assignees
Labels
L: Electrical.Spice3 Issue addresses Modelica.Electrical.Spice3 V: 4.1.0-dev Issue originates in MSL v4.1.0-dev (and is not present in earlier releases)
Milestone

Comments

@casella
Copy link
Contributor

casella commented Oct 15, 2024

While analyzing possible regressions from MSL 4.0.0 to 4.1.0 in OpenModelica, we found issues with several Spice3 model:

Modelica.Electrical.Spice3.Examples.Inverter
Modelica.Electrical.Spice3.Examples.FourInverters
Modelica.Electrical.Spice3.Examples.InvertersApartRecord
Modelica.Electrical.Spice3.Examples.InvertersExtendedModel
Modelica.Electrical.Spice3.Examples.Nor
Modelica.Electrical.Spice3.Examples.Spice3BenchmarkFourBitBinaryAdder

all failing at the beginning of the simulation with this runtime error:

division by zero at time 1.000000000300011, (a=inf) / (b=0), where divisor b expression is: v.Tfalling - v.Twidth

I checked the Inverter model more in depth. It contains an instance v of the Modelica.Electrical.Spice3.Sources.V_pulse model:

model V_pulse "Pulse voltage source"
extends Modelica.Electrical.Analog.Interfaces.OnePort;
parameter SI.Voltage V1 = 0 "Initial value";
parameter SI.Voltage V2 = 0 "Pulsed value";
parameter SI.Time TD = 0.0 "Delay time";
parameter SI.Time TR(start=1) "Rise time";
parameter SI.Time TF = TR "Fall time";
parameter SI.Time PW = Modelica.Constants.inf "Pulse width";
parameter SI.Time PER= Modelica.Constants.inf "Period";
protected
parameter SI.Time Trising=TR "End time of rising phase within one period";
parameter SI.Time Twidth=Trising + PW
"End time of width phase within one period";
parameter SI.Time Tfalling=Twidth + TF
"End time of falling phase within one period";
SI.Time T0(final start=TD, fixed=true) "Start time of current period";
Integer counter(start=-1, fixed=true) "Period counter";
Integer counter2(start=-1, fixed=true);
equation
when pre(counter2) <> 0 and sample(TD, PER) then
T0 = time;
counter2 = pre(counter);
counter = pre(counter) - (if pre(counter) > 0 then 1 else 0);
end when;
v = V1 + (if (time < TD or counter2 == 0 or time >= T0 +
Tfalling) then 0 else if (time < T0 + Trising) then (time - T0)*
(V2-V1)/Trising else if (time < T0 + Twidth) then V2-V1 else
(T0 + Tfalling - time)*(V2-V1)/(Tfalling - Twidth));

The component v is instantiated as:

Sources.V_pulse v(V2=5, TR=0.1e-12) annotation (Placement(

so both v.PW and v.PER retain their default value of Modelica.Constants.inf. The resulting chain of parameter assignments is thus:

v.Trising := v.TR = 1e-13
v.Twidth := v.Trising + v.PW = 1e-13 + 1.7e308
v.Tfalling := v.Twidth + v.TF + v.TR = 1e-13 + 1.7e308 + 1e-13

it is then obvious that the computation of the denominator v.Tfalling - v.Twidth = 1e-13 + 1.7e308 + 1e-13 - (1e-13 + 1.7e308) in the expression for v.v is numerically ill-defined. In fact, it looks quite odd to me that this issue hasn't been raised before.

I also checked the documentation of the component, which is not very clear as it states "All parameters of sources should be set explicitly"; if "set explicitly" meant "have explicit modifiers", then there should be no defaults.

In any case, if a MSL component has default values, it should work properly when those values are not changed, and this is obviously not the case with this component, so we should do something about it.

@HansOlsson
Copy link
Contributor

The (T0 + Tfalling - time)*(V2-V1)/(Tfalling - Twidth) is found inside else-part of if (time < T0 + Twidth) then V2-V1 else (T0 + Tfalling - time)*(V2-V1)/(Tfalling - Twidth).

Thus it should only be activated when time>=T0+Twidth, and as previously discussed Twidth is infinite, i.e., the else-branch is not active.

Trying to evaluate an else-branch in such cases is a clear tool-error.

@HansOlsson HansOlsson added tool-issue Issue targets a specific Modelica tool only invalid Invalid issue labels Oct 15, 2024
@casella
Copy link
Contributor Author

casella commented Oct 15, 2024

@HansOlsson I'm not sure about that.

It is well-known that writing something like

v = if x>=0 then sqrt(x) else 0;

does not prevent a tool from computing sqrt(x) with negative x. Why should this case be different?

@casella
Copy link
Contributor Author

casella commented Oct 15, 2024

I checked the structure of those failing models. They all have one feature in common: on the left side, there is a V_pulse component which is obviously meant to generate some kind of input signal to the circuit. On the right side, there is another V_pulse component that apparently provides the voltage supply to the circuit. I'm not 100% sure why the right-side component was not implemented with a V_constant source, but I guess the intent was to start the simulation with the source switched off and then ramp it up gradually, to avoid initialization problem. It would be good to have some feedback from @kristinmajetta about this, as she actually wrote those models in 2009.

If that is the intent, basically using V_pulse as a truncated ramp source, then we should modify the defaults so that this ramp is computed reliably. What we need to avoid is the division by zero when computing those differences, so we need the default for PER to be larger than the default for PW, e.g.

  parameter SI.Time PW = 1e10 "Pulse width";
  parameter SI.Time PER= 1e11 "Period";

I guess those two values (approximately 300 and 3000 years) are large enough to represent an infinite time horizon for an electronic circuit for all practical purposes, but they will avoid the division-by-zero issue.

EDIT: unfortunately this won't solve the Tfalling-Twidth = 0 issue, since 1e-13 + 1e10 + 1e-13 - (1e-13 + 1e10) is still not very numerically reliable.

@casella
Copy link
Contributor Author

casella commented Oct 15, 2024

Comments from other tool developers are welcome 😃

@HansOlsson
Copy link
Contributor

HansOlsson commented Oct 15, 2024

@HansOlsson I'm not sure about that.

It is well-known that writing something like

v = if x>=0 then sqrt(x) else 0;

does not prevent a tool from computing sqrt(x) with negative x. Why should this case be different?

Because that is due to event-epsilon, and as stated in https://specification.modelica.org/maint/3.6/operators-and-expressions.html#evaluation-order "relational operators generating state or time events will during continuous integration have the value from the most recent event".

In x>=0 the variable x may during continuous integration drop below 0, whereas time < T0 + Twidth with right-hand-side infinity would require that time magically increased to infinity (and beyond) during integration to a finite point.

Added:
Oh, and even if time magically did increase to infinity it wouldn't change anything, as the boolean expression keeps the value from the most recent event; so it would require time to be at infinity at an event (when integrating between finite points in time), and then decrease to be inside the finite interval during the continuous integration.

@casella
Copy link
Contributor Author

casella commented Oct 15, 2024

I checked more in depth what happens in OpenModelica. If we define ModelicaServices.Machine.inf with any value up to half than the largest representable IEEE 754 double precision number (0.89885e308) this works as expected, see @HansOlsson's comment. With larger values of ModelicaServices.Machine.inf, I get the division by zero. Apparently, there is some subtle overflow issue going on in the run-time event detection mechanism.

Problem solved.

EDIT: the problem is not solved, see below.

@casella casella closed this as not planned Won't fix, can't repro, duplicate, stale Oct 15, 2024
@casella
Copy link
Contributor Author

casella commented Oct 15, 2024

In fact, it turns out the issue with OpenModelica was triggered by #4042, which changed

final constant Real inf=1e60
"Biggest Real number such that inf and -inf are representable on the machine";

into

final constant Real inf=1.7976931348623157E+308
"Maximum representable finite floating-point number";

so in some sense it is indeed a regression caused by changes of the MSL. I'm not 100% sure how this should be handled, but I guess since ModelicaService can be patched by each tool vendor, this is really a tool issue, so the easiest way for us is just to patch it locally.

However, I'm still wondering, we are actually using Modelica.Constants.inf in models to represent very large values (e.g. as defaults for event times that are never triggered) that can still be represented. Which means, people do write (as in the V_pulse model) Modelica code such as

v = if time < Modelica.Constants.inf then 0 else 1;

Having the maximum representable value there seems to me a bit borderline (in the precise meaning of the word). Shouldn't we allow for some safety margin? Mabye 1e60 was really too small, but what about 1e300?

@HansOlsson, @maltelenz, @eshmoylova, @hubertus65 what do you think?

@HansOlsson
Copy link
Contributor

v = if time < Modelica.Constants.inf then 0 else 1;

Having the maximum representable value there seems to me a bit borderline (in the precise meaning of the word). Shouldn't we allow for some safety margin? Mabye 1e60 was really too small, but what about 1e300?

@HansOlsson, @maltelenz, @eshmoylova, @hubertus65 what do you think?

There are three parts to this:

  • Should Modelica.Constants.inf have a value corresponding to the floating point numbers (technically DBL_MAX - not infinity)? I think the answer is clearly yes.
  • Should the line above work? I would say so.
  • Should we use Modelica.Constants.inf in this way in models or some other number? That isn't clear, but other models have shown that 1e300 wouldn't be safe either - it depends on the operations:
    • if add a normal number it should work I think
    • if you multiply it by another number it depends on the size of that number
    • if you cube it (as was done in some models) you need something around 1e100 or less.

For cubing and multiplying I believe it would be better to rewrite the model.

@casella
Copy link
Contributor Author

casella commented Oct 15, 2024

I moved the discussion to #4479 to avoid spillover.

@beutlich beutlich added the V: 4.1.0-dev Issue originates in MSL v4.1.0-dev (and is not present in earlier releases) label Oct 16, 2024
@beutlich beutlich added this to the never milestone Oct 16, 2024
@casella
Copy link
Contributor Author

casella commented Oct 17, 2024

Given how the discussion in #4479 goes, maybe I should re-open this ticket.

We do have a regression in OpenModelica, due to #4042, which changed from MSL 4.0.0 to 4.1.0. The reason is that PW and PER went from 1e60 to 1.7e308.

Since there is no reason whatsoever to have such huge default values in this component, which is meant for circuit simulations that typically last few seconds, I think the best solution is to change them to more reasonable values like 1e10 (300 years). That achieves the same purpose (obtaining a ramp instead of a periodic pulse) and avoids the regression by steering clear of finite precision representation limits.

@casella casella reopened this Oct 17, 2024
@casella
Copy link
Contributor Author

casella commented Oct 17, 2024

I'd like to hear opinions outside the set of Dymola developers, if possible 😃

@casella casella linked a pull request Oct 17, 2024 that will close this issue
@casella
Copy link
Contributor Author

casella commented Oct 17, 2024

I thought a bit more about this issue, and I think I can articulate a more compelling argument in favour of changing the defaults of this model than "it doesn't work in OpenModelica", which per se is not a valid argument to change a model in the MSL.

These are the parameter definitions of the V_pulse model

parameter SI.Voltage V1 = 0 "Initial value";
parameter SI.Voltage V2 = 0 "Pulsed value";
parameter SI.Time TD = 0.0 "Delay time";
parameter SI.Time TR(start=1) "Rise time";
parameter SI.Time TF = TR "Fall time";
parameter SI.Time PW = Modelica.Constants.inf "Pulse width";
parameter SI.Time PER= Modelica.Constants.inf "Period";
protected
parameter SI.Time Trising=TR "End time of rising phase within one period";
parameter SI.Time Twidth=Trising + PW
"End time of width phase within one period";
parameter SI.Time Tfalling=Twidth + TF
"End time of falling phase within one period";

and this is the equation that determines the output voltage

v = V1 + (if (time < TD or counter2 == 0 or time >= T0 +
Tfalling) then 0 else if (time < T0 + Trising) then (time - T0)*
(V2-V1)/Trising else if (time < T0 + Twidth) then V2-V1 else
(T0 + Tfalling - time)*(V2-V1)/(Tfalling - Twidth));

The convention in the MSL is that if a parameter has a default value, this should correspond to some reasonable default behaviour. In this case, the periodic pulse degenerates into a truncated ramp with only a rising front.

The problem is that after #4042, the definition of Modelica.Constants.inf became:

final constant Real inf=ModelicaServices.Machine.inf
"Maximum representable finite floating-point number";

Now, consider the binding equation for Twidth, which is used in the equation that determines the output voltage v:

Twidth = Trising + PW

if the value of PW is the maximum representable finite floating-point number and Trising is a positive number, then Twidth is obviously undefined, because you are adding something to a number which is by definition the largest possible one that can be represented in Modelica. In fact, trying to compute such a value should probably throw an overflow error, in the same way as computing 1/0 throws a division-by-zero error in all Modelica tools. Let me remind that IEEE 754 defines Inf and NaN and the associated semantics, but Modelica 3.6, which we must use for MSL 4.1.0, doesn't.

Hence, the default behaviour of this model is highly questionable and totally implementation-dependent; as such, it shouldn't belong to the Modelica Standard library. The fact that some Modelica tools can produce meaningful results out of such undefined expressions is irrelevant and doesn't make it a valid standard Modelica model.

Summing up, if Modelica.Constants.inf is not just a very large number but rather the largest one that can be represented in Modelica, it should not be used as a default for PW and PER; a smaller value should be used that ensures all expressions determining events in the equation for v, i.e., T0 + Tfalling, T0 + Trising, T0 + Twidth are valid expressions with well-defined values.

In #4479 I proposed to define a one-fits-all value for this kind of problem in Modelica.Constants, as I believe that the previously used value of 1e60 (possibly with a name different from inf) could do a good job in most cases, but I can agree that fulfilling that requirement in a completely general way may be problematic and bear unintended side effects.

Hence, the only solution I see to solve this issue is to define a domain-specific large value that makes sense in the context of the Spice3 package and doesn't lead to undefined expressions, e.g.:

constant SI.Time infiniteTime= 1e20 "Numerically safe very large time horizon (3e12 years)"

and use it to set default values for PW and PER in both V_pulse and I_pulse. PR #4481 implements this proposal.

I hope we can get some consensus on this 😅

@casella casella modified the milestones: never, MSL4.1.0 Oct 17, 2024
@casella casella removed invalid Invalid issue tool-issue Issue targets a specific Modelica tool only labels Oct 17, 2024
@HansOlsson
Copy link
Contributor

HansOlsson commented Oct 18, 2024

I thought a bit more about this issue, and I think I can articulate a more compelling argument in favour of changing the defaults of this model than "it doesn't work in OpenModelica", which per se is not a valid argument to change a model in the MSL.
...
if the value of PW is the maximum representable finite floating-point number and Trising is a positive number, then Twidth is obviously undefined, because you are adding something to a number which is by definition the largest possible one that can be represented in Modelica. In fact, trying to compute such a value should probably throw an overflow error, in the same way as computing 1/0 throws a division-by-zero error in all Modelica tools. Let me remind that IEEE 754 defines Inf and NaN and the associated semantics, but Modelica 3.6, which we must use for MSL 4.1.0, doesn't.

That is not correct reasoning according to IEEE-754; it depends on how much you add, in this case the addition is insignificant and doesn't impact the result.

However, if OpenModelica (not MAP-Lib) wants to change the OpenModelica version of ModelicaServices to work around a weird bug in the OMC compiler (instead of fixing it), that would be one thing, but I still don't see why changing the Spice models is needed at all.

As far as I previously understood the bug in the OpenModelica compiler is that if time<1.7e308 then ... else a/0 incorrectly generates a division by zero, when time has a normal value.

Discussing this overflow is a red herring - as previously indicated the problem occurred if Modelica.Constants.inf was above something like 0.8e308 - and adding a small finite value to that will surely not overflow.

Similarly asking why people would want to integrate that far is a distraction, as the problem occurs for normal times - it's just triggered by these large values in unknown ways.

@DagBruck
Copy link
Contributor

I think the problem is identified, "solved". #4478 (comment)

@casella
Copy link
Contributor Author

casella commented Oct 18, 2024

That is not correct reasoning according to IEEE-754; it depends on how much you add, in this case the addition is insignificant and doesn't impact the result.

IEEE-754 is totally irrelevant here. The Modelica Specification, upon which the MSL builds, does not mention IEEE-754 semantics, except for external fuctions, which are not involved at all in this regression issue.

Let me summarize again my point: the current code of V_pulse and I_pulse in Spice3 is invalid, because it requires to add some positive offsets to a number which is defined as the maximum representable one. This is a bad idea, irrespective of tool implementation details. The fix is simply to use a much smaller number that perfectly serves the modelling purpose. It's as simple as that.

@casella
Copy link
Contributor Author

casella commented Oct 18, 2024

I think the problem is identified, "solved". #4478 (comment)

That's what I thought, until I went through a more thorough analysis later on. I hope I am entitled to change my mind 😄

I added a comment to make it clearer, I can delete that post if you wish.

@casella
Copy link
Contributor Author

casella commented Oct 18, 2024

As far as I previously understood the bug in the OpenModelica compiler is that if time<1.7e308 then ... else a/0 incorrectly generates a division by zero, when time has a normal value.

That's not correct, it also took me a while to figure out.

The problem is that the conditions time < T0 + Trising and time < T0 + Twidth in the conditional equations contain expressions with undefined values, so they are not computed correctly (I guess there's some overflow involved), causing the else clause to be computed when it actually shouldn't.

As I wrote, the problem is upstream: in Modelica (not in IEEE 754 or other irrelevant contexts) the values of T0 + Trising and T0 + Twidth is not well defined, because you are adding something to the maximum possible representable number. From that point onwards, everything can happen, undefined stuff leads to non-deterministic behaviour. IMHO this makes this model invalid, period.

The solution is to completely avoid this issue, by using an appropriate default value for PW and PER. No big deal. No need to change Modelica.Constants.inf for that. See PR #4481.

@beutlich beutlich added the L: Electrical.Spice3 Issue addresses Modelica.Electrical.Spice3 label Oct 18, 2024
@dzimmer
Copy link
Contributor

dzimmer commented Oct 19, 2024

commented on it in #4479

@casella
Copy link
Contributor Author

casella commented Oct 20, 2024

@dzimmer I agree with your comments in #4479 for the long term, regarding how we could handle infinity better than we do now, both in Modelica 3.7 and MSL 4.2.0, and beyond. I understand that is a far from trivial matter that woud require a bit more time for a proper discussion, involving all tool vendors, not just DS and OSMC.

At the moment, however, we need to address this single regression in the short time availble before the release of MSL 4.1.0, without unwanted side effects on other libraries (also beyond the MSL) and other tools.

In #4479 you pointed out that as of MLS 3.6 and after #4042, the V_pulse model has an undefined behaviour.

Verdict: The model with its default parameterization is invalid as soon as T0 > 0.

so it has no place in the MSL as currently formulated.

PR #4481 fixes this issue for MSL 4.1.0 locally in a simple, effective way, without any side effects for other models and tools. I don't think anybody can seriously argue that 1e20 (250 times the currently estimated age of the universe) is not an appropriate default value for an infinite time constant to be used for the periods and pulse durations of the V_pulse and I_pulse models. There's no need to be an expert in electronic circuit simulation to agree with that.

This would allow us to close this issue now and proceed happily with the MSL 4.1.0 release.

We can then reopen the discussion in #4479 with a broader scope, trying to figure out a better way to handle these issues in general, which will possibly require coordinated changes in the MSL (version >= 3.7) and MLS (version >= 4.2.0), as well as FMI and eFMI, to the maximum possible extent possible.

I don't think there is any reason to hasten the discussion in #4479, which has potentially a much broader scope than this specific issue, to the very close deadline of MSL 4.1.0 release, nor to delay the release of MSL 4.1.0 because of this extremely niche issue.

Do you agree with that?

@dzimmer
Copy link
Contributor

dzimmer commented Oct 21, 2024

Seems reasonable to me.

People take the MSL often as an example how to model stuff. This use of Modelica.Constants.inf is not a good example. So it is an improvment when we remove this.

@casella
Copy link
Contributor Author

casella commented Oct 24, 2024

I wish some of the summoned reviewers (@AHaumer, @christiankral, @kristinmajetta, @christophclauss) of #4481 also agrees, so we can move forward with the 4.1.0 release.

@casella
Copy link
Contributor Author

casella commented Nov 12, 2024

Maybe after all #4481 is not necessary, see discussion in #4503

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
L: Electrical.Spice3 Issue addresses Modelica.Electrical.Spice3 V: 4.1.0-dev Issue originates in MSL v4.1.0-dev (and is not present in earlier releases)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants