CausalImpact() causes R to crash if not all dates exist in the defined period #32

Open
k3jiang opened this issue Jun 12, 2019 · 22 comments

@k3jiang

k3jiang commented Jun 12, 2019

I am trying to exclude a single date in my post period, and model the impact with the rest of the dates. If I don't filter out the single date (see grey_date below) the code runs fine and returns results. If I do filter out the grey_date then it causes R to crash.

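library(CausalImpact)  # also attaches bsts and zoo
library(dplyr)         # attached after CausalImpact so dplyr::select is not masked by MASS::select
library(tidyr)

# `dat` (not shown) is a long-format data frame with columns ds (Date), country, and the metric column
pre_start = '2019-02-17'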
pre_end = '2019-04-13'
post_start = '2019-04-28'
post_end = '2019-05-25'

pre.period <- as.Date(c(pre_start, pre_end))
post.period <- as.Date(c(post_start, post_end))

grey_date = '2019-05-17'
metric = 'xyz'

data_comp <- dat %>%
  filter(ds >= as.Date(pre_start) & ds <= as.Date(post_end)) %>%
  filter(ds != as.Date(grey_date)) %>% # this line causes crash, works fine without it
  select(ds, country, metric) %>%
  spread(country, metric)

timeseries <- zoo(data_comp %>% select(NO, NL), data_comp$ds)

# CausalImpact
impact <- CausalImpact(
  timeseries, 
  pre.period, 
  post.period,
  model.args = list(prior.level.sd = 0.01, niter = 5000, nseasons = 7, season.duration = 1)
)
@steve-the-bayesian

steve-the-bayesian commented Jun 12, 2019 via email

@k3jiang
Author

k3jiang commented Jun 12, 2019

bsts_0.9.0. Ah, I guess it's not actually CausalImpact causing the problem but bsts?

@steve-the-bayesian

steve-the-bayesian commented Jun 12, 2019 via email

@steve-the-bayesian

Can you provide a full reproducible example, including sharing your data and loading all the packages you use?

@k3jiang
Author

k3jiang commented Jun 13, 2019

ah, probably can't share the data, but I will try to reproduce with some random data and then give you the full code.

@k3jiang
Author

k3jiang commented Jun 13, 2019

here's my repro:

library(CausalImpact)
library(plyr)
library(tibble)
library(stringr)
library(dplyr)
library(ggplot2)
theme_set(theme_bw())
library(forcats)
library(knitr)
library(scales)
library(viridis)
library(leaflet)
library(tidyr)
library(jsonlite)
library(lubridate)
library(bigrquery)
options(repr.plot.width=6, repr.plot.height=4)

set.seed(1)
NO <- 100 + arima.sim(model = list(ar = 0.999), n = 100)
NL <- 1.2 * NO + rnorm(100)
NL[71:100] <- NL[71:100] + 10
ds <- seq.Date(as.Date("2019-02-17"), by = 1, length.out = 100)

dat = data.frame(ds, NL, NO, stringsAsFactors = FALSE)

pre_start = '2019-02-17'
pre_end = '2019-04-13'
post_start = '2019-04-28'
post_end = '2019-05-25'

pre.period <- as.Date(c(pre_start, pre_end))
post.period <- as.Date(c(post_start, post_end))

grey_date = '2019-05-17'

data_comp = dat %>%
  filter(ds >= as.Date(pre_start) & ds <= as.Date(post_end)) %>%
  filter(ds != as.Date(grey_date)) # this will cause crash, without it will be fine

timeseries <- zoo(data_comp %>% select(NO, NL), data_comp$ds)

# CausalImpact
impact <- CausalImpact(
  timeseries, 
  pre.period, 
  post.period,
  model.args = list(prior.level.sd = 0.01, niter = 5000, nseasons = 7, season.duration = 1)
)

plot(impact)
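
For anyone reproducing this, here is a quick sketch for confirming that the filtered series really has a hole in its daily index before it reaches CausalImpact(). `find_missing_dates` is a hypothetical helper, not part of CausalImpact or zoo. It uses only base R and assumes the index is meant to be one observation per calendar day.

```r
# Hypothetical helper: list the calendar days missing from a daily Date index.
find_missing_dates <- function(dates) {
  dates <- sort(unique(as.Date(dates)))
  full <- seq(min(dates), max(dates), by = "day")
  # Compare the underlying day counts, then convert back to Date
  as.Date(setdiff(as.numeric(full), as.numeric(dates)), origin = "1970-01-01")
}

# Same index as in the repro above, with the grey date filtered out
ds <- seq.Date(as.Date("2019-02-17"), by = 1, length.out = 100)
ds_filtered <- ds[ds != as.Date("2019-05-17")]

missing <- find_missing_dates(ds_filtered)
print(missing)  # [1] "2019-05-17"
```

An empty result means the index is regular. A non-empty one pinpoints the dates that distinguish the crashing input from the working one.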

@steve-the-bayesian

steve-the-bayesian commented Jun 18, 2019 via email

@k3jiang
Author

k3jiang commented Jun 18, 2019

hmm ok it still crashes for me in RStudio. I will try updating the packages to see if it helps.

@steve-the-bayesian

steve-the-bayesian commented Jun 18, 2019 via email

@DeFilippis

DeFilippis commented Apr 7, 2020

Running the above repro gives me the following error when CausalImpact() is called:

Error in seq.int(trunc((start - xtsp[1L]) * xfreq + 1.5), trunc((end - : 'from' must be of length 1

I get the same error with several other repros, so I assume the problem lies in one of the packages. I've installed the most recent packages from source using the GDrive copies. Any ideas?

EDIT: Solved the problem by re-installing CausalImpact and restarting the R session.

@DeFilippis

After more than 2 hours of investigating, it appears you get this error when both tsibble and the tidyverse are loaded. I have no idea why.

@steve-the-bayesian

steve-the-bayesian commented Apr 7, 2020 via email

@DeFilippis

I was just using tidyverse to clean up the data prior to using bsts, but cannot run anything while the library is loaded. I have to restart my R session, load only CausalImpact and run the impact command. Strange behavior.

@DeFilippis

FYI for future readers: I was able to get the CausalImpact package to play nicely by using the withr package, which attaches a package only for the duration of a code block. Use it like so:


library(withr)
with_package("ggplot2", {
  ggplot(mtcars) + geom_point(aes(wt, hp))
})
# Calling geom_point outside withr context 
exists("geom_point")
# [1] FALSE


@darynaiva

The issue mentioned by @DeFilippis was not present in January 2021 but returned in February 2021. Detaching tsibble and tidyverse doesn't help. I managed to run the CausalImpact command only in a clean R session with no other packages loaded.

@kevin-m-kent

I came across this issue that @DeFilippis mentioned and got around it by not using date sequences. Numeric indexes work fine for me, even with tsibble and the tidyverse loaded.
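
A minimal sketch of that workaround, assuming the observations are sorted by date and the periods are given as dates. `period_to_index` is a hypothetical helper (not part of CausalImpact) that converts a date range into row positions, so the series can be passed as a plain vector with an integer time index.

```r
# Hypothetical helper: map a c(start, end) Date period to row positions
# in a sorted vector of observation dates.
period_to_index <- function(period, dates) {
  idx <- which(dates >= period[1] & dates <= period[2])
  stopifnot(length(idx) > 0)
  range(idx)
}

set.seed(1)
dates <- seq.Date(as.Date("2022-01-01"), as.Date("2022-03-01"), by = "day")
y <- c(rnorm(40), rnorm(20, 2, 1))

pre.idx  <- period_to_index(as.Date(c("2022-01-01", "2022-02-10")), dates)
post.idx <- period_to_index(as.Date(c("2022-02-11", "2022-03-01")), dates)

# Only runs if CausalImpact is installed. The data carry no dates at all,
# so the time index is simply 1..length(y).
if (requireNamespace("CausalImpact", quietly = TRUE)) {
  impact <- CausalImpact::CausalImpact(y, pre.idx, post.idx,
                                       model.args = list(niter = 1000))
}

# Positions can be translated back to calendar dates for reporting
dates[post.idx]
```

The trade-off is that the plot and summary are indexed by position rather than by date, so any date labels have to be re-attached afterwards.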

@AndrewKostandy

AndrewKostandy commented Feb 13, 2022

Hi,

Thank you for this great package!

I'm experiencing the same issue now (Feb 2022) as discussed towards the bottom of the thread above. Running CausalImpact() works fine unless the {tsibble} package is loaded, in which case CausalImpact() throws the error below. The error also appears if {tsibble} is never attached and only a single call such as tsibble::as_tsibble(some_data) is made. Restarting R and RStudio does not get CausalImpact() working again; it only recovers once I uninstall {tsibble}. Reinstalling {tsibble} afterwards causes no issues as long as I don't load it or use a function from it.

Error in seq.int(trunc((start - xtsp[1L]) * xfreq + 1.5), trunc((end - : 
'from' must be of length 1

@kaybrodersen
Collaborator

Could you share a minimal reproducible example?

@AndrewKostandy

AndrewKostandy commented Apr 9, 2022

Sure

library(tidyverse)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union
library(CausalImpact)
#> Loading required package: bsts
#> Loading required package: BoomSpikeSlab
#> Loading required package: Boom
#> Loading required package: MASS
#> 
#> Attaching package: 'MASS'
#> The following object is masked from 'package:dplyr':
#> 
#>     select
#> 
#> Attaching package: 'Boom'
#> The following object is masked from 'package:stats':
#> 
#>     rWishart
#> 
#> Attaching package: 'BoomSpikeSlab'
#> The following object is masked from 'package:stats':
#> 
#>     knots
#> Loading required package: zoo
#> 
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#> 
#>     as.Date, as.Date.numeric
#> Loading required package: xts
#> 
#> Attaching package: 'xts'
#> The following objects are masked from 'package:dplyr':
#> 
#>     first, last
#> 
#> Attaching package: 'bsts'
#> The following object is masked from 'package:BoomSpikeSlab':
#> 
#>     SuggestBurn

df <- tibble(date = seq.Date(ymd("2022-01-01"), ymd("2022-03-01"), "day"),
             value = c(rnorm(40), rnorm(20, 2, 1)))

pre.period <- as.Date(c("2022-01-01", "2022-02-10"))
post.period <- as.Date(c("2022-02-11", "2022-03-01"))

impact <- CausalImpact(df, pre.period, post.period, model.args = list(niter = 5000))

plot(impact) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") 

install.packages("tsibble")
#> 
#> The downloaded binary packages are in
#>  /var/folders/ky/6cwx2wgd63z02kzsr7mfv35h0000gn/T//RtmpVrKALn/downloaded_packages

# Just to use a tsibble function
another_name_df <- tsibble::as_tsibble(df, index = date)

# The below statement which is exactly the same as the one used earlier, now fails
impact <- CausalImpact(df, pre.period, post.period, model.args = list(niter = 5000))
#> Error in seq.int(trunc((start - xtsp[1L]) * xfreq + 1.5), trunc((end - : 'from' must be of length 1

Created on 2022-04-09 by the reprex package (v2.0.1)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.1.2 (2021-11-01)
#>  os       macOS Big Sur 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_CA.UTF-8
#>  ctype    en_CA.UTF-8
#>  tz       America/Toronto
#>  date     2022-04-09
#>  pandoc   2.17.1.1 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package       * version date (UTC) lib source
#>  anytime         0.3.9   2020-08-27 [1] CRAN (R 4.1.0)
#>  assertthat      0.2.1   2019-03-21 [1] CRAN (R 4.1.0)
#>  backports       1.4.1   2021-12-13 [1] CRAN (R 4.1.0)
#>  Boom          * 0.9.7   2021-02-23 [1] CRAN (R 4.1.0)
#>  BoomSpikeSlab * 1.2.4   2021-04-06 [1] CRAN (R 4.1.0)
#>  broom           0.7.12  2022-01-28 [1] CRAN (R 4.1.2)
#>  bsts          * 0.9.7   2021-07-02 [1] CRAN (R 4.1.0)
#>  CausalImpact  * 1.2.7   2021-06-07 [1] CRAN (R 4.1.0)
#>  cellranger      1.1.0   2016-07-27 [1] CRAN (R 4.1.0)
#>  cli             3.2.0   2022-02-14 [1] CRAN (R 4.1.2)
#>  colorspace      2.0-3   2022-02-21 [1] CRAN (R 4.1.2)
#>  crayon          1.5.1   2022-03-26 [1] CRAN (R 4.1.2)
#>  curl            4.3.2   2021-06-23 [1] CRAN (R 4.1.0)
#>  DBI             1.1.2   2021-12-20 [1] CRAN (R 4.1.0)
#>  dbplyr          2.1.1   2021-04-06 [1] CRAN (R 4.1.0)
#>  digest          0.6.29  2021-12-01 [1] CRAN (R 4.1.1)
#>  dplyr         * 1.0.8   2022-02-08 [1] CRAN (R 4.1.2)
#>  ellipsis        0.3.2   2021-04-29 [1] CRAN (R 4.1.0)
#>  evaluate        0.15    2022-02-18 [1] CRAN (R 4.1.2)
#>  fansi           1.0.3   2022-03-24 [1] CRAN (R 4.1.2)
#>  farver          2.1.0   2021-02-28 [1] CRAN (R 4.1.0)
#>  fastmap         1.1.0   2021-01-25 [1] CRAN (R 4.1.0)
#>  forcats       * 0.5.1   2021-01-27 [1] CRAN (R 4.1.0)
#>  fs              1.5.2   2021-12-08 [1] CRAN (R 4.1.0)
#>  generics        0.1.2   2022-01-31 [1] CRAN (R 4.1.2)
#>  ggplot2       * 3.3.5   2021-06-25 [1] CRAN (R 4.1.0)
#>  glue            1.6.2   2022-02-24 [1] CRAN (R 4.1.2)
#>  gtable          0.3.0   2019-03-25 [1] CRAN (R 4.1.0)
#>  haven           2.4.3   2021-08-04 [1] CRAN (R 4.1.0)
#>  highr           0.9     2021-04-16 [1] CRAN (R 4.1.0)
#>  hms             1.1.1   2021-09-26 [1] CRAN (R 4.1.0)
#>  htmltools       0.5.2   2021-08-25 [1] CRAN (R 4.1.0)
#>  httr            1.4.2   2020-07-20 [1] CRAN (R 4.1.0)
#>  jsonlite        1.8.0   2022-02-22 [1] CRAN (R 4.1.2)
#>  knitr           1.38    2022-03-25 [1] CRAN (R 4.1.2)
#>  labeling        0.4.2   2020-10-20 [1] CRAN (R 4.1.0)
#>  lattice         0.20-45 2021-09-22 [1] CRAN (R 4.1.2)
#>  lifecycle       1.0.1   2021-09-24 [1] CRAN (R 4.1.0)
#>  lubridate     * 1.8.0   2021-10-07 [1] CRAN (R 4.1.0)
#>  magrittr        2.0.3   2022-03-30 [1] CRAN (R 4.1.2)
#>  MASS          * 7.3-56  2022-03-23 [1] CRAN (R 4.1.2)
#>  mime            0.12    2021-09-28 [1] CRAN (R 4.1.0)
#>  modelr          0.1.8   2020-05-19 [1] CRAN (R 4.1.0)
#>  munsell         0.5.0   2018-06-12 [1] CRAN (R 4.1.0)
#>  pillar          1.7.0   2022-02-01 [1] CRAN (R 4.1.2)
#>  pkgconfig       2.0.3   2019-09-22 [1] CRAN (R 4.1.0)
#>  purrr         * 0.3.4   2020-04-17 [1] CRAN (R 4.1.0)
#>  R.cache         0.15.0  2021-04-30 [1] CRAN (R 4.1.0)
#>  R.methodsS3     1.8.1   2020-08-26 [1] CRAN (R 4.1.0)
#>  R.oo            1.24.0  2020-08-26 [1] CRAN (R 4.1.0)
#>  R.utils         2.11.0  2021-09-26 [1] CRAN (R 4.1.0)
#>  R6              2.5.1   2021-08-19 [1] CRAN (R 4.1.0)
#>  Rcpp            1.0.8.3 2022-03-17 [1] CRAN (R 4.1.2)
#>  readr         * 2.1.2   2022-01-30 [1] CRAN (R 4.1.2)
#>  readxl          1.4.0   2022-03-28 [1] CRAN (R 4.1.2)
#>  reprex          2.0.1   2021-08-05 [1] CRAN (R 4.1.0)
#>  rlang           1.0.2   2022-03-04 [1] CRAN (R 4.1.2)
#>  rmarkdown       2.13    2022-03-10 [1] CRAN (R 4.1.2)
#>  rstudioapi      0.13    2020-11-12 [1] CRAN (R 4.1.0)
#>  rvest           1.0.2   2021-10-16 [1] CRAN (R 4.1.0)
#>  scales          1.1.1   2020-05-11 [1] CRAN (R 4.1.0)
#>  sessioninfo     1.2.2   2021-12-06 [1] CRAN (R 4.1.0)
#>  stringi         1.7.6   2021-11-29 [1] CRAN (R 4.1.0)
#>  stringr       * 1.4.0   2019-02-10 [1] CRAN (R 4.1.0)
#>  styler          1.7.0   2022-03-13 [1] CRAN (R 4.1.2)
#>  tibble        * 3.1.6   2021-11-07 [1] CRAN (R 4.1.0)
#>  tidyr         * 1.2.0   2022-02-01 [1] CRAN (R 4.1.2)
#>  tidyselect      1.1.2   2022-02-21 [1] CRAN (R 4.1.2)
#>  tidyverse     * 1.3.1   2021-04-15 [1] CRAN (R 4.1.0)
#>  tsibble         1.1.1   2021-12-03 [1] CRAN (R 4.1.0)
#>  tzdb            0.3.0   2022-03-28 [1] CRAN (R 4.1.2)
#>  utf8            1.2.2   2021-07-24 [1] CRAN (R 4.1.0)
#>  vctrs           0.4.0   2022-03-30 [1] CRAN (R 4.1.2)
#>  withr           2.5.0   2022-03-03 [1] CRAN (R 4.1.2)
#>  xfun            0.30    2022-03-02 [1] CRAN (R 4.1.2)
#>  xml2            1.3.3   2021-11-30 [1] CRAN (R 4.1.0)
#>  xts           * 0.12.1  2020-09-09 [1] CRAN (R 4.1.0)
#>  yaml            2.3.5   2022-02-21 [1] CRAN (R 4.1.2)
#>  zoo           * 1.8-9   2021-03-09 [1] CRAN (R 4.1.0)
#> 
#>  [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────



@kaybrodersen
Collaborator

It seems this is an issue in {stats}.

This works:

dates <- as.Date(c("2022-01-01", "2022-01-08"))
stats::window(dates, 1)

This fails:

library(tsibble)
stats::window(dates, 1)

Error in seq.int(trunc((start - xtsp[1L]) * xfreq + 1.5), trunc((end - :
'from' must be of length 1
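
One way to narrow this down further is sketched below. It assumes the change in behaviour comes from S3 methods registered when a package namespace is loaded, which this thread does not confirm. The idea is to record the visible `window` methods and the outcome of the call, load the suspect package, and compare. `probe_window` is a hypothetical helper.

```r
dates <- as.Date(c("2022-01-01", "2022-01-08"))

# Hypothetical helper: record which window() methods are visible and whether
# stats::window() currently succeeds on the given vector.
probe_window <- function(x) {
  list(
    methods = as.character(utils::methods("window")),
    result = tryCatch(
      {
        stats::window(x, 1)
        "ok"
      },
      error = function(e) conditionMessage(e)
    )
  )
}

before <- probe_window(dates)

# requireNamespace() loads the namespace (registering its S3 methods) without
# attaching the package, the same thing a tsibble::as_tsibble() call triggers.
if (requireNamespace("tsibble", quietly = TRUE)) {
  after <- probe_window(dates)
  print(setdiff(after$methods, before$methods))  # window methods added by loading
  print(c(before = before$result, after = after$result))
}
```

If no new `window.*` method shows up, the same before/after comparison can be repeated for other generics that `window.default` calls internally, such as `time`.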

@AndrewKostandy

Would you suggest I open an issue at the {tsibble} repo?
I'm not sure where to open one for the {stats} package.

@kaybrodersen
Collaborator

Yes, that makes sense to me.

8 participants
@DeFilippis @steve-the-bayesian @kaybrodersen @AndrewKostandy @k3jiang @kevin-m-kent @darynaiva and others