Datetime Extension #80

apg · 2024-08-13T22:33:52Z

Proposal for adding a datetime extension, co-authored with @khieta.

emina · 2024-08-13T23:38:40Z

Thank you for the carefully thought out RFC!

I agree that having some support for dates/times would be useful, but unfortunately, this proposal is out of bounds of what's analyzable.

Specifically, we cannot support the proposed time representation and operators for two reasons.

First, if we keep everything else the same, then it's not possible to enforce the well-formedness constraints on time components (year, month, day, hours, minutes, seconds) without using quantifiers, which would put the encoding into the undecidable fragment of SMT. This is because Cedar lets you put arbitrary types into sets, and to ensure that every time value in a set is well-formed, we would have to use quantifiers.

Second, even if we were able to ground the quantifiers by imposing restrictions on how times can be used (e.g., time values can't be included in sets, either directly or transitively as part of another value), enforcing and solving these ground time constraints would still be expensive. For each occurrence of a datetime value, we would need to emit an assumption that says if the year field corresponds to a leap year, and if the month is February, then the month has 29 days, otherwise it has 28 days. We would also need to emit an if-then-else condition that covers all the other months too. Note that determining if a given year is a leap year itself requires checking multiple conditions. Specifically, it is not enough to check if the year is divisible by four (see here).

To retain analyzability, we can go the IAM route (see here) and support date comparisons, perhaps with some duration operations as well. The IAM operators are analyzable because we can internally represent time values as Unix epochs, which are unconstrained 64-bit integers, and the IAM operations on this representation reduce to long comparisons. We are free to make the string representation of times richer than what IAM supports as long as the internal representation is longs, and the operators on times are expressible as simple operations on longs.

apg · 2024-08-14T07:56:04Z

Thanks, @emina, for taking a look! A few ideas / questions in what follows...

Specifically, we cannot support the proposed time representation and operators for two reasons.

First, if we keep everything else the same, then it's not possible to enforce the well-formedness constraints on time components (year, month, day, hours, minutes, seconds) without using quantifiers, which would put the encoding into the undecidable fragment of SMT. This is because Cedar lets you put arbitrary types into sets, and to ensure that every time value in a set is well-formed, we would have to use quantifiers.

I'm not tied to the proposed representation. I am hopeful we can define an abstract data type with useful, and practical time operations. If using a long instead of a "record like" is the way to get there, than so be it.

Please forgive the ridiculous questions and thoughts I'm replying with, as I'm more or less ignorant in Automated Theorem Proving, SAT, and SMT.

As specified in this document, the datetime constructor is the only way to set the fields of a datetime object. The offset() method does not make an in-place modification, but rather, instantiates a new datetime with an updated set of component values. If the constructor only produces well formed datetime objects, or "errors out" at runtime, then I don't see how this is any different than a long?

Second, even if we were able to ground the quantifiers by imposing restrictions on how times can be used (e.g., time values can't be included in sets, either directly or transitively as part of another value), enforcing and solving these ground time constraints would still be expensive. For each occurrence of a datetime value, we would need to emit an assumption that says if the year field corresponds to a leap year, and if the month is February, then the month has 29 days, otherwise it has 28 days. We would also need to emit an if-then-else condition that covers all the other months too. Note that determining if a given year is a leap year itself requires checking multiple conditions. Specifically, it is not enough to check if the year is divisible by four (see here).

Can we prove these operations under a simpler calendar model, and then let implementations choose the most appropriate calendar model (cough Gregorian) for their implementation? Most general purpose programming languages have robust date/time facilities, so it stands to reason that different implementations of Cedar will adopt them, rather than start from scratch.

I believe that the operations requested in the this API spec are valid under any calendar that has years, months, days, hours, minutes, seconds, and whose measurements (instants in time) form a partial order.

To retain analyzability, we can go the IAM route (see here) and support date comparisons, perhaps with some duration operations as well. The IAM operators are analyzable because we can internally represent time values as Unix epochs, which are unconstrained 64-bit integers, and the IAM operations on this representation reduce to long comparisons. We are free to make the string representation of times richer than what IAM supports as long as the internal representation is longs, and the operators on times are expressible as simple operations on longs.

The current IAM operations are insufficient to express many of the examples in this RFC. Adding additional functions and a duration type could improve the situation, but I think we'll still fall short. There wouldn't be a facility for policies that have periodic decision making abilities (e.g. "everyday between $x and $y"), which we've identified as important for our customers.

emina · 2024-08-14T11:47:49Z

I'll try to clarify some of the concerns around the proposed API and explain better (hopefully :) why the current API is not analyzable. To a first approximation, the best we can do is IAM plus a few duration operations (TBD). Let's dig into why this is the case.

Please forgive the ridiculous questions and thoughts I'm replying with, as I'm more or less ignorant in Automated Theorem Proving, SAT, and SMT.

As specified in this document, the datetime constructor is the only way to set the fields of a datetime object. The offset() method does not make an in-place modification, but rather, instantiates a new datetime with an updated set of component values. If the constructor only produces well formed datetime objects, or "errors out" at runtime, then I don't see how this is any different than a long?

These are great questions :). And your point is exactly right for the runtime implementation of the datetime constructor that we'd have in Lean, Rust, or Go. The issue is the analysis-time encoding of various operators.

To understand the issues here, it helps to have a mental model of how the analysis works. The analyzer is a compiler: it takes as input a set of Cedar policies, a property we want to check about them (e.g., are they equivalent?), and compiles these policies into the input language of the SMT solver. The solver then checks if the property holds and, if not, it returns a concrete input (a request and entities) that violate the property. It helps to think of the solver as testing the encoded policies on the (infinite) set of all possible inputs.

Now, the language that the solvers support is very limited compared to a general purpose language. And the practical subset of this language is even more limited than what the SMT standard suggests, because some operators in the SMT language are either undecidable (the solver can run forever trying to test the policies) or extremely inefficient to reason about (the solver will come back but not in any reasonable time).

So, what's expressible in Cedar is limited to what we can efficiently compile to this core functional language: think, no loops or recursion, machine integers (aka bitvectors), some string operators, some set operators, some map operators (uninterpreted functions), structs, conditionals, and boolean operators.

If you're curious, here are some more details about the compilation process (Section 4).

Given this limited compilation target, we can't actually translate the datetime constructor itself, because this would require writing a parser for datetime strings in the SMT language. This parser is hugely complicated, because it also needs to validate its input, and establish that the input string is a correct date time, enforcing the constraints on leap years, numbers of days in months etc. Even if we could fully unroll all loops in this parser to express it in SMT, the resulting SMT expression would be analyzable in name only :). In practice, there is a limit to how complicated and big of a formula a solver can solve in a reasonable amount of time.

This is why we don't actually encode the parsing of Decimals or IPAddresses either. Note that the strict validator creates a loophole for us to avoid parsing in the solver: it rejects attempts to call the decimal constructor on a non-literal, e.g., decimal(context.someStringRepresentation), and instead asks that arguments to these constructor are string literals. We use the regular Rust/Lean parser to parse these strings, and then turn them into concrete values (bitvectors) representable in SMT. Input (e.g., context) values of these types are represented as bitvector variables (unknowns), and the solver tries to find values for them that violate the desired property.

Ok, so if we can't translate the datetime parser to SMT, how would we enforce the desired constraints on date values? The only way to do is by emitting extra formulas that say the date components are within the right ranges. Think of it as emitting a giant conditional expression that says "if all dates occurrences are well-formed, then the property we care about holds."
Now, we can't emit this expression for a general Cedar policy because of sets: checking that every element of a set is well-formed conceptually requires iteration, which isn't supported by SMT. And even if prohibit putting dates in sets, the resulting well-formedness constraints would still be too complex in practice. This is why we can't support the proposed representation while remaining in the decidable / practical fragment of the SMT language.

Can we prove these operations under a simpler calendar model, and then let implementations choose the most appropriate calendar model (cough Gregorian) for their implementation? Most general purpose programming languages have robust date/time facilities, so it stands to reason that different implementations of Cedar will adopt them, rather than start from scratch.

Unfortunately, not for analysis. We can definitely do this in general (in Lean), but for the analysis to work, we must be able to express each operation in the very limited SMT language.

The current IAM operations are insufficient to express many of the examples in this RFC. Adding additional functions and a duration type could improve the situation, but I think we'll still fall short. There wouldn't be a facility for policies that have periodic decision making abilities (e.g. "everyday between $x and $y"), which we've identified as important for our customers.

I totally understand that a richer set of operators would be desirable. But if a datetime operation cannot be expressed (roughly) as a loop-free expression over integers, it's out of bounds of what's analyzable. The IAM operators have the nice property of being expressible as < and <= over longs. We can push this a bit further than IAM, but not to the point where we can support very complex calculations, like extracting the year or month from an epoch.

I'd be happy to chat more offline (to save us some typing :) about whether there is a middle ground that we can design. I'm hoping we can find a reasonable design that combines IAM++ and some of the alternatives discussed in the RFC.

text/0080-datetime-extension.md

apg · 2024-08-15T17:13:15Z

Based on all the feedback so far (Thank you!!!!!), it seems like I should reposition this RFC to be Alt C, with the API discussed in the comments, and millisecond precision.

I'll try to post a rewrite by the end of the week.

text/0080-datetime-extension.md

philhassey · 2024-08-20T17:17:54Z

I agree that now() or currentTime() should be removed. These are better supplied by the caller as part of the context.

text/0080-datetime-extension.md

mwhicks1 · 2024-08-20T18:40:05Z

I think I like this proposal, but I am looking for more details. Spelling them out may reveal gotchas that are relevant to approval.

For example, I think what's going on is that a datetime and duration are represented as long integers under the covers, like Unix time. This means that evaluation of various operations is pretty simple, and encoding for SMT-based policy analysis is simple, too. But some details are important, e.g., if duration cannot be negative then x-y is really |x-y|. So I'd like to see those spelled out.

I'd also like to see the basic approach to validation spelled out. If you look at the recent proposals for function macros or embedded attribute maps, etc. you'll see these features written out carefully.

andrewmwells-amazon · 2024-08-21T17:22:01Z

text/0080-datetime-extension.md

+To construct and manipulate these types we will provide the functions listed below.
+All of this functionality will be hidden behind a `datetime` feature flag (analogous to the current decimal and IP extensions), allowing users to opt-out.
+
+- `datetime(string)` constructs a datetime value. Like with other extension function constructors, strict validation requires `string` to be a string literal, although evaluation/authorization support any string-typed expression. The string must be of one of the forms, and regardless of the timezone offset is always normalized to UTC:


Maybe we should also include datetime("epoch_start") and datetime("epoch_end")?
A potential usecase would be half-open intervals. E.g., the same shape of policy could be used to express buying and renting. For buying it could be:

permit(principal = User::"Bob", action = Action::"view", resource = Movie::"UnReleasedFilm") when { context.curr_time >= resource.release_date && context.curr_time <= datetime("epoch_end") };

While for renting you'd use the end of the rental period, e.g., context.curr_time <= datetime("2024-08-21T20:07:23").

This seems fine to me. Perhaps this is mostly useful when policies are templated, possibly in the context of #61, but why would a policy author include context.curr_time <= datetime("epoch_end") otherwise?

Yeah, I'm envisioning templated / programmatically generated policies where it's easier to just give every policy a start and end time. In my example the RHS condition would always true so it would hopefully be dropped if the policies were written by hand.

I don't think I understand the utility here: if you want to time-box context.curr_time, wouldn't it make more sense to give specific values for the bounds rather than using epoch_start and/or epoch_end?

I'm not sure that epoch_start and epoch_end are useful abstractions here: one is in 1970, and the other is the largest representable 64-bit value, the date of which will vary with our supported precision level.

I think the usefulness is only in generated policies, but also agree with @khieta that the user could provide a date far into the future, or far into the past just as easily. I'm 👎 on adding epoch_start and epoch_end.

apg · 2024-08-21T21:18:29Z

I'd also like to see the basic approach to validation spelled out. If you look at the recent proposals for function macros or embedded attribute maps, etc. you'll see these features written out carefully.

@khieta can you help with the validation approach?

apg · 2024-08-21T21:24:28Z

For example, I think what's going on is that a datetime and duration are represented as long integers under the covers, like Unix time.

Yes, this is detailed in the Detailed Design.

This means that evaluation of various operations is pretty simple, and encoding for SMT-based policy analysis is simple, too. But some details are important, e.g., if duration cannot be negative then x-y is really |x-y|. So I'd like to see those spelled out.

I can explicitly state that durations should be negatable, because I believe they should be. It does seems like that stance is not yet shared amongst everyone.

apg · 2024-08-23T00:35:18Z

OK. I think most feedback that I'm aware of has been addressed by either @khieta or myself, and it should be "safe" to reevaluate.

I've also implemented this API in Python to validate that the behavior makes sense. It caused me to realize that we'd have a bug with -, so that's helpful!

philhassey · 2024-08-23T19:10:52Z

How does everyone feel about this signature for duration?

duration(string) where string is something like

"5d 3h 2m 1s"

This gives us the ability to define more precise durations in policy without having to add a bunch of them together. It also keeps a single arg (which is convenient.).

It's worth noting that with validation turned on, arguments to extension constructors are meant to be literals (see https://docs.cedarpolicy.com/policies/validation.html#:~:text=Errors%20due%20to%20incorrect,validate%20(regardless%20of%20mode). )

However, by default the validator runs in a strict mode that forbids passing non-literal strings to extension function constructors; in this mode, the expression above will fail to validate with the error extension constructors may not be called with non-literal expressions. An expression like ip("XYZ") in a policy will fail to validate (regardless of mode).

This suggests that there's no room for doing things like this in validated Cedar:
duration(context.expireDays,"days")

And really the better way would be to have context.expireDays just be the exact duration already. Or if you are writing it as policy text, then duration("5d") would be more terse anyways.

mwhicks1

I'm still not sold on the use of - for DTs, and I suggested an alternative. I also suggested some small wording/structure adjustments. Overall I think this is looking really good. Well done @khieta and @apg !

text/0080-datetime-extension.md

cdisselkoen · 2024-08-23T19:23:12Z

To @philhassey's toplevel comment (sorry, I can't thread, because it wasn't made as a comment on any particular part of the RFC text)

How does everyone feel about this signature for duration?

I like this direction for all of the reasons given, and wonder if there is some standard for specifying durations as a string just like there are ISO standards for specifying datetimes as a string?

emina · 2024-08-24T01:40:39Z

text/0080-datetime-extension.md

+
+### Durations of Time (`duration`)
+
+The `duration(long, string)` function constructs a duration value. The string argument must be one of `"days", "hours", "minutes", "seconds", "milliseconds"`. Strict validation requires `long` and `string` to be literals, although evaluation/authorization support any appropriately-typed expressions. Values of type `duration` have the following methods:


Technically, strict validation doesn’t need to require the long and string in this case to be literals. And that’s because it’s possible to encode the meaning of this constructor to SMT for arbitrary (i.e., unknown) values.

We require arguments to ip and decimal to be literals only because it would be too expensive / infeasible to encode their behavior (parsing) on arbitrary (unknown) strings.

That said, we may want to make these literals anyway for consistency with other operators. And, admittedly, making them literals for the purpose of strict validation would help make the analysis more efficient.

text/0080-datetime-extension.md

mwhicks1

A few suggestions for final changes. Looks great!

mwhicks1 · 2024-08-27T05:36:54Z

text/0080-datetime-extension.md

+
+The current proposal supports milliseconds. Do we want nanoseconds too? Or is just seconds sufficient? The ISO 8601 format does not specify a maximum precision, so we can technically allow any number of `S`s after the `.` in `YYYY-MM-DDThh:mm:ss.SSSZ`. Based on [this blog post](https://nickb.dev/blog/iso8601-and-nanosecond-precision-across-languages/), it appears that Javascript supports milliseconds (3 digits), Python supports microseconds (6 digits), and Rust and Go support nanoseconds (9 digits). Assuming nanosecond accuracy, the maximum (signed) 64-bit number (2^63 - 1) represents April 11, 2262. This date seems far enough out that any of these choices (milliseconds, microseconds, or nanoseconds) seems reasonable.
+
+Note that the backwards compatible option is to stick with milliseconds (or seconds). We can add precision later, but not remove it without a breaking change.


In what sense is it backward compatible? The representation (as a long) would change its meaning, which means that data stored in the app would have to change. Are we expecting never to have to store a long (for duration or time) in the app? I was thinking "current time" or various deadlines would be stored that way.

Current time will need to be passed as a ISO date string with the current proposal. Users should not use the Unix long value directly.

Increasing precision is backwards compatible in the sense that all ISO date strings that were previously accepted will continue to be accepted in the future. But you are right that this isn't truly backwards-compatible: more precision will affect the underlying representation, which will impact when certain operators will overflow. My comment was sweeping this under the rug 🙂

Note that there isn't a (nice) way to make it backwards compatible for analysis because extending the bitvector representation won't correspond to exactly to N extra decimal digits.

text/0080-datetime-extension.md

cdisselkoen

Thanks for the team effort on this RFC.

text/0080-datetime-extension.md

apg · 2024-08-28T17:54:34Z

@philhassey was questioning the special way to compute durationSince that's listed in the RFC, and it turns out that's completely wrong. I believe I convinced myself that that was necessary by having a bug in my reference implementation. Updating the RFC with that mistake removed.

shaobo-he-aws

LGTM.

Co-developed-by: Kesha Hietala <[email protected]> Signed-off-by: Andrew Gwozdziewycz <[email protected]>

khieta · 2024-09-03T12:08:31Z

The final comment period (FCP) for this RFC is starting now, with intent to accept the RFC. The FCP will end 2024-09-10 at noon PT / 3pm ET / 7pm UTC. Please add comments, and especially any objections, if you have any. For more on the RFC process, see https://github.com/cedar-policy/rfcs.

apg force-pushed the datetime-rfc branch from d4c12bd to 2c24d2d Compare August 13, 2024 23:24

andrewmwells-amazon reviewed Aug 14, 2024

View reviewed changes

text/0080-datetime-extension.md Outdated Show resolved Hide resolved

text/0080-datetime-extension.md Outdated Show resolved Hide resolved

andrewmwells-amazon reviewed Aug 15, 2024

View reviewed changes

text/0080-datetime-extension.md Outdated Show resolved Hide resolved

khieta added the pending This RFC is pending; for definitions see the README label Aug 15, 2024

emina reviewed Aug 20, 2024

View reviewed changes

text/0080-datetime-extension.md Show resolved Hide resolved

emina reviewed Aug 20, 2024

View reviewed changes

text/0080-datetime-extension.md Outdated Show resolved Hide resolved

emina reviewed Aug 20, 2024

View reviewed changes

text/0080-datetime-extension.md Outdated Show resolved Hide resolved

mwhicks1 reviewed Aug 20, 2024

View reviewed changes

andrewmwells-amazon reviewed Aug 21, 2024

View reviewed changes

mwhicks1 reviewed Aug 23, 2024

View reviewed changes

text/0080-datetime-extension.md Show resolved Hide resolved

text/0080-datetime-extension.md Outdated Show resolved Hide resolved

text/0080-datetime-extension.md Outdated Show resolved Hide resolved

text/0080-datetime-extension.md Show resolved Hide resolved

emina reviewed Aug 24, 2024

View reviewed changes

khieta reviewed Aug 26, 2024

View reviewed changes

text/0080-datetime-extension.md Outdated Show resolved Hide resolved

apg force-pushed the datetime-rfc branch 5 times, most recently from 6398ba4 to 20a9376 Compare August 27, 2024 02:07

mwhicks1 approved these changes Aug 27, 2024

View reviewed changes

andrewmwells-amazon approved these changes Aug 27, 2024

View reviewed changes

cdisselkoen approved these changes Aug 27, 2024

View reviewed changes

text/0080-datetime-extension.md Outdated Show resolved Hide resolved

text/0080-datetime-extension.md Outdated Show resolved Hide resolved

mwhicks1 approved these changes Aug 28, 2024

View reviewed changes

emina approved these changes Aug 29, 2024

View reviewed changes

shaobo-he-aws approved these changes Aug 29, 2024

View reviewed changes

aaronjeline approved these changes Aug 29, 2024

View reviewed changes

Add support for handling datetimes and durations to Cedar

c4b1021

Co-developed-by: Kesha Hietala <[email protected]> Signed-off-by: Andrew Gwozdziewycz <[email protected]>

apg force-pushed the datetime-rfc branch from af50589 to c4b1021 Compare August 30, 2024 16:13

khieta added final-comment-period This RFC is in its final comment period; for definitions see the README and removed pending This RFC is pending; for definitions see the README labels Sep 3, 2024

khieta merged commit 1046090 into cedar-policy:main Sep 11, 2024
1 check passed

khieta added active This RFC is active; for definitions see the README and removed final-comment-period This RFC is in its final comment period; for definitions see the README labels Sep 11, 2024

apg mentioned this pull request Sep 16, 2024

Add the extension types, datetime and duration cedar-policy/cedar-go#35

Merged

adpaco-aws mentioned this pull request Nov 25, 2024

Lean: Add parsing for duration in the datetime extension cedar-policy/cedar-spec#471

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Datetime Extension #80

Datetime Extension #80

apg commented Aug 13, 2024 •

edited

Loading

emina commented Aug 13, 2024

apg commented Aug 14, 2024

emina commented Aug 14, 2024

apg commented Aug 15, 2024

philhassey commented Aug 20, 2024

mwhicks1 commented Aug 20, 2024

andrewmwells-amazon Aug 21, 2024

apg Aug 21, 2024

andrewmwells-amazon Aug 21, 2024

khieta Aug 23, 2024

apg Aug 26, 2024

apg commented Aug 21, 2024

apg commented Aug 21, 2024

apg commented Aug 23, 2024

philhassey commented Aug 23, 2024

mwhicks1 left a comment

cdisselkoen commented Aug 23, 2024

emina Aug 24, 2024

mwhicks1 left a comment

mwhicks1 Aug 27, 2024

khieta Aug 27, 2024

andrewmwells-amazon Aug 27, 2024

cdisselkoen left a comment

apg commented Aug 28, 2024

shaobo-he-aws left a comment

khieta commented Sep 3, 2024 •

edited

Loading


		### Durations of Time (`duration`)

		The `duration(long, string)` function constructs a duration value. The string argument must be one of `"days", "hours", "minutes", "seconds", "milliseconds"`. Strict validation requires `long` and `string` to be literals, although evaluation/authorization support any appropriately-typed expressions. Values of type `duration` have the following methods:


		The current proposal supports milliseconds. Do we want nanoseconds too? Or is just seconds sufficient? The ISO 8601 format does not specify a maximum precision, so we can technically allow any number of `S`s after the `.` in `YYYY-MM-DDThh:mm:ss.SSSZ`. Based on [this blog post](https://nickb.dev/blog/iso8601-and-nanosecond-precision-across-languages/), it appears that Javascript supports milliseconds (3 digits), Python supports microseconds (6 digits), and Rust and Go support nanoseconds (9 digits). Assuming nanosecond accuracy, the maximum (signed) 64-bit number (2^63 - 1) represents April 11, 2262. This date seems far enough out that any of these choices (milliseconds, microseconds, or nanoseconds) seems reasonable.

		Note that the backwards compatible option is to stick with milliseconds (or seconds). We can add precision later, but not remove it without a breaking change.

Datetime Extension #80

Datetime Extension #80

Conversation

apg commented Aug 13, 2024 • edited Loading

emina commented Aug 13, 2024

apg commented Aug 14, 2024

emina commented Aug 14, 2024

apg commented Aug 15, 2024

philhassey commented Aug 20, 2024

mwhicks1 commented Aug 20, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

apg commented Aug 21, 2024

apg commented Aug 21, 2024

apg commented Aug 23, 2024

philhassey commented Aug 23, 2024

mwhicks1 left a comment

Choose a reason for hiding this comment

cdisselkoen commented Aug 23, 2024

Choose a reason for hiding this comment

mwhicks1 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cdisselkoen left a comment

Choose a reason for hiding this comment

apg commented Aug 28, 2024

shaobo-he-aws left a comment

Choose a reason for hiding this comment

khieta commented Sep 3, 2024 • edited Loading

apg commented Aug 13, 2024 •

edited

Loading

khieta commented Sep 3, 2024 •

edited

Loading