allow measurement units via units package? #201

Open
mpadge opened this issue Jun 25, 2020 · 6 comments

Comments

@mpadge

mpadge commented Jun 25, 2020

This reprex illustrates the problem:

library (tsibble)
library (units)
#> udunits system database from /usr/share/udunits
daily <- set_units (1:100, "day")
class (daily)
#> [1] "units"
x <- tsibble (day = daily,
              index = daily)
#> Error: Must extract column with a single valid subscript.
#> ✖ Subscript `var` has the wrong type `units`.
#> ℹ It must be numeric or character.

Created on 2020-06-25 by the reprex package (v0.3.0)

With due acknowledgement of your statement in #134 that

With respect to modelling, there's no difference between 1 year and 1 unit

there is nevertheless a difference in internal representations within software. In this case, it may be considered important to retain explicit specifications of measurement units, here via the units package. This is arguably the only way to put an absolute scale on interval data which have no fixed time scale, and that is surely an important thing to be able to do?

@mitchelloharawild
Member

Minor MRE fix: to create a tsibble, the index argument should match a column name, not a value:

library (tsibble)
library (units)
#> udunits system database from /usr/share/xml/udunits
daily <- set_units (1:100, "day")
class(daily)
#> [1] "units"
x <- tsibble (day = daily, index = day)
#> Error: Unsupported index type: units

Created on 2020-06-25 by the reprex package (v0.3.0)

@earowang
Member

Are you looking for relative days as the index instead of absolute dates? I'd suggest using hms::hms() or lubridate::period(), both of which tsibble supports natively.

library(tsibble)
daily <- hms::hms(days = 1:100)
tsibble (day = daily, index = day)
#> # A tsibble: 100 x 1 [24h]
#>    day   
#>    <time>
#>  1  24:00
#>  2  48:00
#>  3  72:00
#>  4  96:00
#>  5 120:00
#>  6 144:00
#>  7 168:00
#>  8 192:00
#>  9 216:00
#> 10 240:00
#> # … with 90 more rows

Created on 2020-06-26 by the reprex package (v0.3.0)

tsibble supports commonly used time classes. For any other class, it is up to the developer of the package that defines it to implement index support, and the associated interval methods, for tsibble.

@mpadge
Author

mpadge commented Jun 26, 2020

Thanks @earowang, but the problem is that the straightforward ways to implement intervals do not work:

library (tsibble)
library (units)
#> udunits system database from /usr/share/udunits
daily <- set_units (1:100, "day")
x <- tsibble (day = daily, index = day)
#> Error: Unsupported index type: units

library (lubridate)
daily <- days (1:100)
x <- tsibble (day = daily, index = day)
#> Error in vec_proxy_period(x): trying to get slot "year" from an object (class "Period") that is not an S4 object

Created on 2020-06-26 by the reprex package (v0.3.0)

The only standard units that seem acceptable are absolute ones (lubridate::hms and the like); no relative units seem to work at all, including none of the lubridate::Period-class ones. The most straightforward way to specify intervals is to use either the units package or these Period-class objects, yet neither of these works. I would also suggest that simple specification of intervals shouldn't be relegated to "custom classes": this is a very general task that I think would find very general use, and code like that above should simply work.

@earowang
Member

Support for lubridate::period() as an index is very recent, and is currently available only in the development version on GitHub.

library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
library(tsibble)
#> 
#> Attaching package: 'tsibble'
#> The following object is masked from 'package:lubridate':
#> 
#>     interval
daily <- days(1:100)
tsibble(day = daily, index = day)
#> # A tsibble: 100 x 1 [1D]
#>    day         
#>    <Period>    
#>  1 1d 0H 0M 0S 
#>  2 2d 0H 0M 0S 
#>  3 3d 0H 0M 0S 
#>  4 4d 0H 0M 0S 
#>  5 5d 0H 0M 0S 
#>  6 6d 0H 0M 0S 
#>  7 7d 0H 0M 0S 
#>  8 8d 0H 0M 0S 
#>  9 9d 0H 0M 0S 
#> 10 10d 0H 0M 0S
#> # … with 90 more rows

Created on 2020-06-26 by the reprex package (v0.3.0)

@mpadge
Author

mpadge commented Jun 26, 2020

Awesome! Any chance of similar integration of units? It is the interface in R to the udunits2 library, so should be considered the definitive implementation of units in R, including units for time series. Ping @edzer

@edzer

edzer commented Jun 26, 2020

That would definitely ease the adoption by people from the modelling communities who use udunits2.

Note, however, that units and calendars have a difficult relationship. The udunits documentation cautions: "CAUTION: The timestamp-unit was created to be analogous to, for example, the degree celsius—but for the time dimension. I've come to believe, however, that creating such a unit was a mistake, primarily because users try to use the unit in ways for which it was not designed (such as converting dates in a calendar whose year is exactly 365 days long). Such activities are much better handled by a dedicated calendar package. Please be careful about using timestamp-units." This is illustrated by:

library(units)
#> udunits system database from /usr/share/xml/udunits
set_units(set_units(1, "year"), "days")
#> 365.2422 [d]

For that reason, R's native date/time classes (Date, POSIXt) don't produce difftime objects with units of "months" or "years".
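
A minimal base-R sketch of why a "year" cannot be a fixed-length unit, assuming only base R; the dates are illustrative and do not come from the thread. It compares the length in days of a leap year and a common year, and reads the units that difftime() accepts from its own argument list.

```r
# Base R only; the dates below are illustrative, not taken from the thread.
leap   <- difftime(as.Date("2021-01-01"), as.Date("2020-01-01"), units = "days")
common <- difftime(as.Date("2020-01-01"), as.Date("2019-01-01"), units = "days")

as.numeric(leap)    # 366, because 2020 is a leap year
as.numeric(common)  # 365

# The units difftime() accepts, read from its default argument:
# only fixed-length units appear, with no "months" or "years".
units_allowed <- eval(formals(difftime)$units)
units_allowed
```

Neither calendar year matches the 365.2422 days that udunits uses for "year", which is the mismatch the caution above warns about.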


4 participants