
Behavior of subsecond extract in functions document is inconsistent amongst engines #600

Open
westonpace opened this issue Feb 9, 2024 · 1 comment

westonpace (Member) commented Feb 9, 2024

The extract function pulls out a part of a timestamp. In particular, this issue is about these components (see the worked example after the list):

* `MILLISECOND`: Return number of milliseconds since the last full second.
* `MICROSECOND`: Return number of microseconds since the last full millisecond.
* `NANOSECOND`: Return number of nanoseconds since the last full microsecond.
* `SUBSECOND`: Return number of microseconds since the last full second of the given timestamp.
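
For concreteness, here is what those definitions imply for the timestamp used in the query below. The values in the comments are derived purely from the fractional part `.100259`, not from any particular engine, and engine support for these field names varies.

```sql
-- Expected per the definitions above (not what any given engine returns):
SELECT
  EXTRACT(MILLISECOND FROM TIMESTAMP '2016-12-31 13:30:15.100259'),  -- 100    (ms since the last full second)
  EXTRACT(MICROSECOND FROM TIMESTAMP '2016-12-31 13:30:15.100259'),  -- 259    (µs since the last full millisecond)
  EXTRACT(NANOSECOND  FROM TIMESTAMP '2016-12-31 13:30:15.100259'),  -- 0      (ns since the last full microsecond)
  EXTRACT(SUBSECOND   FROM TIMESTAMP '2016-12-31 13:30:15.100259');  -- 100259 (µs since the last full second)
```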

Consider the query `SELECT EXTRACT(MICROSECONDS FROM TIMESTAMP '2016-12-31 13:30:15.100259');`

According to that definition, I would expect to get 259. However, I get the following:

| Engine | Result |
| --- | --- |
| Postgres | 15100259 |
| DuckDB | 15100259 |
| DataFusion | Not supported |
| PyArrow Compute | 259 |
| SQL Server (`DATEPART`) | 100259 |
| Velox | Unknown |
| Spark | Can only extract seconds, which returns 15.100259 |

So we have at least three different behaviors. Maybe we should just drop these components from the functions document?
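
For reference only, not a proposal: a minimal sketch of how the document's values relate to the Postgres/DuckDB result reported in the table above, where `EXTRACT(MICROSECONDS ...)` folds the whole seconds in (15100259 here).

```sql
-- Assumes the Postgres/DuckDB behavior from the table above (result 15100259);
-- the document's SUBSECOND and MICROSECOND values differ from it only by a modulus.
SELECT
  CAST(EXTRACT(MICROSECONDS FROM TIMESTAMP '2016-12-31 13:30:15.100259') AS BIGINT) % 1000000,  -- 100259 = document's SUBSECOND
  CAST(EXTRACT(MICROSECONDS FROM TIMESTAMP '2016-12-31 13:30:15.100259') AS BIGINT) % 1000;     -- 259    = document's MICROSECOND
```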

CurtHagenlocher commented:

At least with Spark 3.4, it doesn't appear to be possible to extract anything smaller than a second from a timestamp, and extracting seconds gives fractional seconds. That is, `SELECT EXTRACT(SECONDS FROM TIMESTAMP '2016-12-31 13:30:15.100259')` returns 15.100259.
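
A similar sketch for Spark, assuming only the behavior described above (fractional seconds from `EXTRACT(SECONDS ...)`); the exact result type may differ.

```sql
-- Assumes Spark's EXTRACT(SECONDS ...) returns 15.100259 as described above;
-- scaling to microseconds and taking a modulus recovers the document's SUBSECOND value.
SELECT CAST(EXTRACT(SECONDS FROM TIMESTAMP '2016-12-31 13:30:15.100259') * 1000000 AS BIGINT) % 1000000;  -- 100259
```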
