declare() type hints #169

Open
t-kalinowski opened this issue Sep 9, 2024 · 3 comments

t-kalinowski commented Sep 9, 2024

It would be nice to use declare() for type hints, following the style of a Fortran subroutine type manifest. These type declarations could then:

  • Be parsed by roxygen to autogenerate or augment parameter documentation
  • Be used by a compiler to compile the R function to machine code
  • Be used by a runtime checker to validate arguments

Example syntax:

fun <- function(a, b, c, d, e) {
  declare(
    a = integer(1),      # Vector of a specific length
    b = integer(NA),     # Vector of any length
    c = integer(c(NA, 3)), # Matrix with 3 columns and any number of rows
    d = integer(c(NA, NA, NA)), # 3D array of any size
    
    # Data frame with columns `name` and `age`
    # and optionally other columns `...` that are ignored
    e = data.frame(
      name = character(NA),
      age = integer(NA),
      ...
    ),
    
    # Declare return type
    return = logical(1)
  )
  
  TRUE
}
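
As a rough illustration of the runtime-checker idea, the sketch below reads the unevaluated declare() call out of a function body and checks supplied arguments against it. The helper names (get_declarations, check_vector, validate_args) are hypothetical, and the sketch only understands atomic vectors declared with a single length, such as integer(1) or integer(NA); anything else (matrices, data frames, the return type) is reported as NA or skipped. The function is only inspected, never called, so nothing here depends on how declare() itself behaves.

```r
# Hypothetical helper: find the declare() call in a function body and
# return its arguments as a named list of unevaluated expressions.
get_declarations <- function(fn) {
  for (expr in as.list(body(fn))) {
    if (is.call(expr) && identical(expr[[1]], quote(declare))) {
      return(as.list(expr)[-1])
    }
  }
  list()
}

# Hypothetical helper: check one value against a declaration such as
# integer(1) or character(NA). Returns NA for declarations this sketch
# does not understand.
check_vector <- function(value, decl) {
  if (!is.call(decl) || length(decl) != 2 || !is.name(decl[[1]])) {
    return(NA)
  }
  type <- as.character(decl[[1]])      # e.g. "integer"
  len <- eval(decl[[2]], baseenv())    # e.g. 1 or NA
  if (length(len) != 1) {
    return(NA)
  }
  type_ok <- typeof(value) == type
  len_ok <- is.na(len) || length(value) == len
  type_ok && len_ok
}

# Check every declared argument except the `return` entry.
validate_args <- function(fn, args) {
  decls <- get_declarations(fn)
  decls <- decls[names(decls) != "return"]
  vapply(
    names(decls),
    function(nm) check_vector(args[[nm]], decls[[nm]]),
    logical(1)
  )
}

fun <- function(a, b) {
  declare(a = integer(1), b = integer(NA), return = logical(1))
  TRUE
}

ok <- validate_args(fun, list(a = 1L, b = 1:3))           # a and b both pass
bad <- validate_args(fun, list(a = c(1L, 2L), b = "x"))   # a and b both fail
```

Comparing typeof() against the constructor name is a simplification: a declaration written as numeric(1) would need to be mapped to "double" first.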

Some additional, more experimental syntax could also be supported:

  • Length constraints:
declare(
  a = integer(.>3),           # Vector with a length constraint
  b = integer(10 <= . <= 20)  # Vector with a more complex length constraint
)
  • Union types, which could also be a way to specify optional (NULLable) args:
declare(x = union(integer(1), character(1)))
# or
declare(x = integer(1) || character(1))

# optional arg
declare(x = integer(1) || NULL)
  • Function types:
    Include a way to specify function parameters and their expected signatures:
declare(
  f = function(x = numeric(1)) -> logical(1)
)
  • Named dimensions:
    For multidimensional arrays, allow naming dimensions for clarity:
declare(
  matrix = numeric(c(rows = NA, cols = 3))
)
  • Value constraints:
declare(
  age = integer(1, 0 <= . <= 120),
  color = character(1, . %in% c("red", "green", "blue"))
)
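
One way a checker might evaluate such constraints is to bind `.` to the argument value and evaluate the constraint expression. The sketch below assumes that approach; the helper names satisfies and parses are hypothetical. It also suggests that the chained form `0 <= . <= 120` is rejected by R's current parser, since comparison operators do not chain, so the constraint is written with && here.

```r
# Hypothetical helper: evaluate a quoted constraint with `.` bound to the value.
satisfies <- function(value, constraint) {
  isTRUE(eval(constraint, list(. = value), baseenv()))
}

age_ok <- quote(0 <= . && . <= 120)
color_ok <- quote(. %in% c("red", "green", "blue"))

satisfies(42L, age_ok)        # within range
satisfies(130L, age_ok)       # out of range
satisfies("red", color_ok)    # allowed value
satisfies("pink", color_ok)   # not an allowed value

# Hypothetical helper: does a string parse as R code?
parses <- function(txt) {
  !inherits(try(parse(text = txt), silent = TRUE), "try-error")
}

parses("0 <= . && . <= 120")  # explicit conjunction parses
parses("0 <= . <= 120")       # chained comparison does not
```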

georgestagg commented Sep 9, 2024

A few of my own thoughts are below, from when I was thinking about this myself:

Possibly the declarations should be one level deeper, to make clear what we're actually declaring, though I guess this depends on whether declare() is intended for other things too (e.g. evaluation semantics).

 declare(type(a = integer(1)))

Allowing for literals would be good:

 declare(type(a = "abc" | 4 | 5 | FALSE))

Creating new named types from e.g. type unions would also be good. In this example I use <- rather than = to define a new type instead of asserting a type for a variable:

declare(type(my_type <- integer(5) | FALSE ))
declare(type(a = my_type))
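
Since declare() arguments would be read as unevaluated expressions, a union written with | (or ||) is just a nested call that a tool could flatten into its alternatives. A minimal sketch, with flatten_union as a hypothetical helper name:

```r
# Hypothetical helper: flatten nested `|` / `||` calls into a list of
# alternative type expressions.
flatten_union <- function(expr) {
  is_union <- is.call(expr) &&
    (identical(expr[[1]], as.name("|")) || identical(expr[[1]], as.name("||")))
  if (is_union) {
    c(flatten_union(expr[[2]]), flatten_union(expr[[3]]))
  } else {
    list(expr)
  }
}

literals <- flatten_union(quote("abc" | 4 | 5 | FALSE))
length(literals)   # four alternatives

optional <- flatten_union(quote(integer(1) || character(1) || NULL))
vapply(optional, deparse, character(1))
```

A named type such as my_type could then be stored as a flattened list and substituted wherever the name appears in a later declaration.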

Also, type generics would be really nice. Here, I use -> to indicate a type parameter U.

declare(type(maybe <- (U -> U | FALSE)))
# declare(type(bar = integer(1) | FALSE))
declare(type(bar = maybe(integer(1))))

One can imagine a combination of type generics and function definitions:

declare(type(
  wrapper <- (U -> function(U) { list(U) })
))

declare(type(
  fn = wrapper(integer(1))
))

fn <- function(x) {
  list(x + 1L)
}

Also, should other attributes be possible to declare?

declare(type(list(3), names = c("abc", "def", "ghi"), class = "myclass"))

t-kalinowski commented Sep 10, 2024

Do you imagine one call like declare(type(...)) per symbol? That seems like a lot of syntax.

What do you think of:

declare(
  name1 = type(...),
  name2 = type(...),
  ...,
  return = type(...)
)

Regarding the last question about attributes, I think it's a good idea. This could be supported with this approach too, like:

declare(
  time = type(double(), class = "POSIXct", tz = NULL | character(1)),
  name2 = type(...),
  name3 = type(...)
)

Ideally, all this would work nicely with S7, so one could do:

declare(
  time = type(S7::class_POSIXct)
)

@t-kalinowski

The class attribute does raise some interesting questions. Would it behave like inherits() and check for the existence of that string in the class vector, ignoring other classes, or would it do a strict check using identical()? 🤔
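
To make the difference concrete, here is how the two checks behave on a date-time value, whose class vector has two entries:

```r
x <- Sys.time()

class(x)                                        # "POSIXct" "POSIXt"
inherits(x, "POSIXct")                          # TRUE: the string is in the class vector
identical(class(x), "POSIXct")                  # FALSE: the class vector has a second entry
identical(class(x), c("POSIXct", "POSIXt"))     # TRUE: exact match on the whole vector
```

So a declaration like class = "POSIXct" would accept this value under inherits() semantics but reject it under a strict identical() comparison.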
