RStan is passing the wrong type for the code argument to stanc.js on Windows #1145

Open
WardBrian opened this issue Dec 4, 2024 · 15 comments

Comments

@WardBrian
Member

This has led to stan-dev/stanc3#1446 and the seven duplicates therein.

I think the issue is the code that reads from a file is returning a vector of strings, rather than just one string. No idea why this is OS-specific.

This might be related to my PR #1124, which touches some of the same code.

Description:

This model consistently raises an error on Windows, but not on other platforms:

data {
  int<lower=0> T; # rows
  int<lower=0> N; # cols
  
  matrix[T, N] x;
  matrix[T, N] y;
}

parameters {
  real alpha;
  real beta;
  
  real mu_alpha;
  real<lower=0> sigma_alpha;
  
  real mu_beta;
  real<lower=0> sigma_beta;
}

model {
  
  
  for(i in 1:T){
    for(j in 1:N){
      y[i, j] ~ poisson(exp(alpha + beta*x[i, j]))
    }
  }
  
  alpha ~ normal(mu_alpha, sigma_alpha);
  beta ~ normal(mu_beta, sigma_beta);
  
  mu_alpha ~ normal(0, 1)
  sigma_alpha ~ gamma(2, 2)
  
  mu_beta ~ normal(0, 1)
  sigma_beta ~ gamma(2, 2)
  
}

> rstan::stanc("./m2.stan") yields

Error in rstan::stanc("./m2.stan") : 0
Internal compiler error:
TypeError: b.charCodeAt is not a function

By replacing the stanc.js with the non-minified version, and adding some extra printing code to it, this became clear:

  1. b above is the code string
  2. In this case, the code is not actually a string, but an array of strings.

That is why this code does not lead to a crash:

code <- paste(readLines('./m2.stan'), collapse="\n")  # join the lines with newlines into a single string
rstan::stanc(model_code=code)

(though, it does raise a genuine syntax error, which is good!)
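
The TypeError itself is easy to reproduce in isolation. Below is a minimal JavaScript sketch, assuming only that the compiler calls `charCodeAt` on its input somewhere; `firstCharCode` is a hypothetical stand-in, not a function from stanc.js.

```javascript
// Hypothetical stand-in for a lexer step that reads characters from the input.
function firstCharCode(code) {
  return code.charCodeAt(0);
}

const asString = "data {\n  int<lower=0> T;\n}";
const asArray = ["data {", "  int<lower=0> T;", "}"];

// Strings have charCodeAt, so this works.
console.log(firstCharCode(asString)); // 100, the character code of "d"

// Arrays do not have charCodeAt, so the same call throws a TypeError.
let message = null;
try {
  firstCharCode(asArray);
} catch (err) {
  message = err.message;
}
console.log(message);
```

In the minified stanc.js the argument happens to be named `b`, which would explain the `b.charCodeAt is not a function` wording in the report.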

RStan Version:

‘2.36.0.9000’

R Version:

"R version 4.4.2 (2024-10-31 ucrt)"

Operating System:

Windows 11

@WardBrian
Member Author

WardBrian commented Dec 4, 2024

Note that directly using rstan:::stanc_ctx$call("stanc", ...) also never seems to raise this error, so something that is munging model_code is returning something that isn't a string.

The array being passed seems to be split at newlines:
["data {"," int<lower=0> T; # rows"," int<lower=0> N; # cols",""," matrix[T, N] x;"," matrix[T, N] y;","}","parameters {"," real alpha;"," real beta;",""," real mu_alpha;"," real<lower=0> sigma_alpha;",""," real mu_beta;"," real<lower=0> sigma_beta;","}","model {","",""," for(i in 1:T){"," for(j in 1:N){"," y[i, j] ~ poisson(exp(alpha + beta*x[i, j]))"," }"," }",""," alpha ~ normal(mu_alpha, sigma_alpha);"," beta ~ normal(mu_beta, sigma_beta);",""," mu_alpha ~ normal(0, 1)"," sigma_alpha ~ gamma(2, 2)",""," mu_beta ~ normal(0, 1)"," sigma_beta ~ gamma(2, 2)","","}"]
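
If a caller cannot guarantee a single string, joining the pieces with newlines gives back one source string with its line structure intact. A sketch, with `normalizeModelCode` as a hypothetical helper name (it is not known to exist in RStan or stanc.js):

```javascript
// Hypothetical helper: accept a string or an array of lines and always
// return one newline-separated string.
function normalizeModelCode(code) {
  return Array.isArray(code) ? code.join("\n") : code;
}

// First few elements of the array shown above.
const lines = ["data {", " int<lower=0> T; # rows", " int<lower=0> N; # cols"];
const source = normalizeModelCode(lines);

console.log(typeof source);             // "string"
console.log(source.split("\n").length); // 3
```

Joining with `"\n"` rather than an empty string matters for any language with line comments, since otherwise a comment would swallow the rest of the program.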

@hsbadr could you look into this? It's generating a lot of noise in the stanc3 repo, and it looks like most of the users are using the experimental branch.

@WardBrian
Member Author

Note: setting my line endings to CRLF on Ubuntu still didn't trigger the same behavior as Windows.

@andrjohns
Contributor

@WardBrian something might have changed recently, but I'm not able to reproduce this error across either Windows 10 or Windows 11 (with a fresh install of rstan from the r-universe repo).

All attempts return:

> rstan::stanc("test.stan")
Error in rstan::stanc("test.stan") : 0
Syntax error in 'string', line 2, column 17, lexing error:
   -------------------------------------------------
     1:  data {
     2:    int<lower=0> T; # rows
                           ^
     3:    int<lower=0> N; # cols
     4:    matrix[T, N] x;
   -------------------------------------------------

Invalid character found.

@WardBrian
Member Author

That is the correct error for the specific model I posted, since you can’t use # for comments any more.

Did you also try CRAN rstan?

@andrjohns
Contributor

Similar for CRAN rstan, just an error about the missing semicolons instead:

> packageVersion("rstan")
[1] ‘2.32.6’

> rstan::stanc("test.stan")
Error in rstan::stanc("test.stan") : 0
Syntax error in 'string', line 19, column 4 to column 5, parsing error:
   -------------------------------------------------
    17:      for(j in 1:N){
    18:        y[i, j] ~ poisson(exp(alpha + beta*x[i, j]))
    19:      }
             ^
    20:    }
    21:    alpha ~ normal(mu_alpha, sigma_alpha);
   -------------------------------------------------

Ill-formed "~"-statement. Expected ";" or "T[" optional expression "," optional expression "];".

@WardBrian
Member Author

I was just able to reproduce on a completely clean install of

Win 11 24H2 (26100.2605) (using Windows Sandbox)
R 4.4.2
RTools 44 (6335-6327)
RStan/StanHeaders from R-universe

with the same model as in the OP.

I created the file and ran the code in the RGui, if that matters for some reason?

@WardBrian
Member Author

It also seems RStan on R-universe hasn’t been rebuilt since I added type checking to the compiler inputs, because it’s still just throwing a “should never happen” error.
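
For illustration, an input check of this kind might look like the sketch below. This is an assumption about the general shape, not the actual check added to stanc3; `assertIsString` and its error message are hypothetical.

```javascript
// Hypothetical guard: fail early with a descriptive error instead of letting
// a non-string input reach code that calls string methods on it.
function assertIsString(value, name) {
  if (typeof value !== "string") {
    const actual = Array.isArray(value) ? "array" : typeof value;
    throw new TypeError(`${name} must be a string, got ${actual}`);
  }
  return value;
}

let guardMessage = null;
try {
  assertIsString(["data {", "}"], "model code");
} catch (err) {
  guardMessage = err.message;
}
console.log(guardMessage); // "model code must be a string, got array"
```

A check like this turns an opaque internal error into a message that points at the caller, which is useful when the bad input originates in a wrapper such as RStan.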

@andrjohns
Contributor

What does the locale section in sessionInfo() return for you? I wonder if there's an encoding difference. For me, it's:

locale:
[1] LC_COLLATE=English_United States.utf8
[2] LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8

Could you upload/attach the model text file? So we can rule out the editor/file saving

@andrjohns
Contributor

It also seems RStan on R-universe hasn’t been rebuilt since I added type checking to the compiler inputs, because it’s still just throwing a “should never happen” error.

Ah, looks like the experimental branch hasn't been synchronised in a couple of months. I'll have a look at setting up the workflows to automate the sync

@andrjohns
Contributor

I was just able to reproduce on a completely clean install of

Win 11 24H2 (26100.2605) (using Windows Sandbox)
R 4.4.2
RTools 44 (6335-6327)
RStan/StanHeaders from R-universe

with the same model as in the OP.

I created the file and ran the code in the RGui, if that matters for some reason?

Ah, I managed to reproduce it in the Windows Sandbox. It's odd that it doesn't happen on the native system, even if I use the file created in the Sandbox. How bizarre. I'll do a bit more debugging over the next few days and update the experimental branch.

@WardBrian
Member Author

Oh, super weird! I assume the users who have been reporting this aren’t all running in a VM, so there must be some difference between that environment and your real install. Maybe it is an encoding thing? I won’t have access to my Windows machine until the end of the day.

@andrjohns
Contributor

The errors seem to come from this block, which appears to process '#' characters using the C pre-processor. I have no idea why it's necessary, and removing it fixes things.

Can you try installing this branch of rstan to see if it works for you:

remotes::install_github("stan-dev/rstan@no-pound-process", subdir = "rstan/rstan")

(Or just edit a local source to remove the block)

@WardBrian
Member Author

At least on the experimental branch, that code can definitely be removed and replaced with stan-dev/stanc3#1433

I’ll give it a try later today!

@WardBrian
Member Author

The no-pound-process branch did fix it for me, so that block does seem to be the culprit
