diff --git a/_posts/2018-06-25-sunspots-lstm/sunspots-lstm.Rmd b/_posts/2018-06-25-sunspots-lstm/sunspots-lstm.Rmd
index 4a16541e5..b719a0855 100644
--- a/_posts/2018-06-25-sunspots-lstm/sunspots-lstm.Rmd
+++ b/_posts/2018-06-25-sunspots-lstm/sunspots-lstm.Rmd
@@ -191,11 +191,11 @@ include_graphics("images/cowplot.png")
When doing cross validation on sequential data, the time dependencies on preceding samples must be preserved. We can create a cross validation sampling plan by offsetting the window used to select sequential sub-samples. In essence, we're creatively dealing with the fact that there's no future test data available by creating multiple synthetic "futures" - a process often, esp. in finance, called "backtesting".
-As mentioned in the introduction, the [rsample](https://cran.r-project.org/package=rsample) package includes facitlities for backtesting on time series. The vignette, ["Time Series Analysis Example"](https://topepo.github.io/rsample/articles/Applications/Time_Series.html), describes a procedure that uses the `rolling_origin()` function to create samples designed for time series cross validation. We'll use this approach.
+As mentioned in the introduction, the [rsample](https://cran.r-project.org/package=rsample) package includes facilities for backtesting on time series. The vignette, ["Time Series Analysis Example"](https://tidymodels.github.io/rsample/articles/Applications/Time_Series.html), describes a procedure that uses the `rolling_origin()` function to create samples designed for time series cross validation. We'll use this approach.
#### Developing a backtesting strategy
-The sampling plan we create uses 50 years (`initial` = 12 x 50 samples) for the training set and ten years (`assess` = 12 x 10) for the testing (validation) set. We select a `skip` span of about twenty years (`skip` = 12 x 20 - 1) to approximately evenly distribute the samples into 6 sets that span the entire 265 years of sunspots history. Last, we select `cumulative = FALSE` to allow the origin to shift which ensures that models on more recent data are not given an unfair advantage (more observations) over those operating on less recent data. The tibble return contains the `rolling_origin_resamples`.
+The sampling plan we create uses 100 years (`initial` = 12 x 100 samples) for the training set and 50 years (`assess` = 12 x 50) for the testing (validation) set. We select a `skip` span of about 22 years (`skip` = 12 x 22 - 1) to approximately evenly distribute the samples into 6 sets that span the entire 265 years of sunspots history. Last, we select `cumulative = FALSE` to allow the origin to shift, which ensures that models on more recent data are not given an unfair advantage (more observations) over those operating on less recent data. The returned tibble contains the `rolling_origin_resamples`.
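To make the window arithmetic in this paragraph concrete, here is a minimal sketch in plain Python. It is not rsample's implementation: the helper `rolling_origin_windows()` is hypothetical, and the series length of 12 x 265 months is an assumption taken from the "265 years" mentioned above. It only illustrates how `initial`, `assess`, `skip` and `cumulative = FALSE` could interact to yield six sliding train/test windows.

```python
def rolling_origin_windows(n, initial, assess, skip=0, cumulative=True):
    """Return (train, test) index ranges for a rolling-origin plan.

    Hypothetical helper for illustration; indices are 0-based and
    each range is half-open, i.e. [start, stop).
    """
    windows = []
    # `stop` is the end of the training window; it advances by skip + 1
    # and must leave room for a full assessment window after it.
    for stop in range(initial, n - assess + 1, skip + 1):
        # With cumulative=False the training window slides forward;
        # otherwise it always starts at the first observation and grows.
        start = 0 if cumulative else stop - initial
        windows.append((range(start, stop), range(stop, stop + assess)))
    return windows


n_months = 12 * 265  # assumption: 265 years of monthly observations

windows = rolling_origin_windows(
    n_months,
    initial=12 * 100,    # 100 years of training data
    assess=12 * 50,      # 50 years of testing (validation) data
    skip=12 * 22 - 1,    # move the origin forward by about 22 years
    cumulative=False,    # keep every training window the same length
)

for i, (train, test) in enumerate(windows, start=1):
    print(f"slice {i}: train [{train.start}, {train.stop}) "
          f"test [{test.start}, {test.stop})")
```

Under these assumptions every slice trains on the same number of months, so a model fitted on a later slice sees no more observations than one fitted on an earlier slice.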
```{r}
periods_train <- 12 * 100
diff --git a/_posts/2018-06-25-sunspots-lstm/sunspots-lstm.html b/_posts/2018-06-25-sunspots-lstm/sunspots-lstm.html
index f9b4be051..b257eb138 100644
--- a/_posts/2018-06-25-sunspots-lstm/sunspots-lstm.html
+++ b/_posts/2018-06-25-sunspots-lstm/sunspots-lstm.html
@@ -66,6 +66,10 @@
width: 100%;
}
+ .pandoc-table>caption {
+ margin-bottom: 10px;
+ }
+
.pandoc-table th:not([align]) {
text-align: left;
}
@@ -82,6 +86,10 @@
padding-right: 16px;
}
+ .l-screen .caption {
+ margin-left: 10px;
+ }
+
.shaded {
background: rgb(247, 247, 247);
padding-top: 20px;
@@ -601,9 +609,9 @@
var fn = $('#' + id);
var fn_p = $('#' + id + '>p');
fn_p.find('.footnote-back').remove();
- var text = fn_p.text();
+ var text = fn_p.html();
var dtfn = $('
When doing cross validation on sequential data, the time dependencies on preceding samples must be preserved. We can create a cross validation sampling plan by offsetting the window used to select sequential sub-samples. In essence, we’re creatively dealing with the fact that there’s no future test data available by creating multiple synthetic “futures” - a process often, esp. in finance, called “backtesting”.
-As mentioned in the introduction, the rsample package includes facitlities for backtesting on time series. The vignette, “Time Series Analysis Example”, describes a procedure that uses the rolling_origin() function to create samples designed for time series cross validation. We’ll use this approach.
+As mentioned in the introduction, the rsample package includes facilities for backtesting on time series. The vignette, “Time Series Analysis Example”, describes a procedure that uses the rolling_origin() function to create samples designed for time series cross validation. We’ll use this approach.
-The sampling plan we create uses 50 years (initial = 12 x 50 samples) for the training set and ten years (assess = 12 x 10) for the testing (validation) set. We select a skip span of about twenty years (skip = 12 x 20 - 1) to approximately evenly distribute the samples into 6 sets that span the entire 265 years of sunspots history. Last, we select cumulative = FALSE to allow the origin to shift which ensures that models on more recent data are not given an unfair advantage (more observations) over those operating on less recent data. The tibble return contains the rolling_origin_resamples.
+The sampling plan we create uses 100 years (initial = 12 x 100 samples) for the training set and 50 years (assess = 12 x 50) for the testing (validation) set. We select a skip span of about 22 years (skip = 12 x 22 - 1) to approximately evenly distribute the samples into 6 sets that span the entire 265 years of sunspots history. Last, we select cumulative = FALSE to allow the origin to shift, which ensures that models on more recent data are not given an unfair advantage (more observations) over those operating on less recent data. The returned tibble contains the rolling_origin_resamples.
periods_train <- 12 * 100
diff --git a/docs/posts/2018-06-25-sunspots-lstm/index.html b/docs/posts/2018-06-25-sunspots-lstm/index.html
index 0b136f3fb..45cc43c12 100644
--- a/docs/posts/2018-06-25-sunspots-lstm/index.html
+++ b/docs/posts/2018-06-25-sunspots-lstm/index.html
@@ -1470,9 +1470,9 @@ Visualizing sunspot data with cow
When doing cross validation on sequential data, the time dependencies on preceding samples must be preserved. We can create a cross validation sampling plan by offsetting the window used to select sequential sub-samples. In essence, we’re creatively dealing with the fact that there’s no future test data available by creating multiple synthetic “futures” - a process often, esp. in finance, called “backtesting”.
-As mentioned in the introduction, the rsample package includes facitlities for backtesting on time series. The vignette, “Time Series Analysis Example”, describes a procedure that uses the rolling_origin() function to create samples designed for time series cross validation. We’ll use this approach.
+As mentioned in the introduction, the rsample package includes facilities for backtesting on time series. The vignette, “Time Series Analysis Example”, describes a procedure that uses the rolling_origin() function to create samples designed for time series cross validation. We’ll use this approach.
-The sampling plan we create uses 50 years (initial = 12 x 50 samples) for the training set and ten years (assess = 12 x 10) for the testing (validation) set. We select a skip span of about twenty years (skip = 12 x 20 - 1) to approximately evenly distribute the samples into 6 sets that span the entire 265 years of sunspots history. Last, we select cumulative = FALSE to allow the origin to shift which ensures that models on more recent data are not given an unfair advantage (more observations) over those operating on less recent data. The tibble return contains the rolling_origin_resamples.
+The sampling plan we create uses 100 years (initial = 12 x 100 samples) for the training set and 50 years (assess = 12 x 50) for the testing (validation) set. We select a skip span of about 22 years (skip = 12 x 22 - 1) to approximately evenly distribute the samples into 6 sets that span the entire 265 years of sunspots history. Last, we select cumulative = FALSE to allow the origin to shift, which ensures that models on more recent data are not given an unfair advantage (more observations) over those operating on less recent data. The returned tibble contains the rolling_origin_resamples.
periods_train <- 12 * 100
diff --git a/docs/posts/posts.json b/docs/posts/posts.json
index c53da9063..f0de7b7ff 100644
--- a/docs/posts/posts.json
+++ b/docs/posts/posts.json
@@ -323,7 +323,7 @@
"Time Series"
],
"preview": "posts/2018-06-25-sunspots-lstm/images/backtested_test.png",
- "last_modified": "2018-09-12T12:45:46-04:00",
+ "last_modified": "2019-01-07T09:09:45-05:00",
"preview_width": 800,
"preview_height": 416
},
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 584a1d0d2..2c7731b82 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -78,7 +78,7 @@
https://blogs.rstudio.com/tensorflow/posts/2018-06-25-sunspots-lstm/
- 2018-09-12T12:45:46-04:00
+ 2019-01-07T09:09:45-05:00
https://blogs.rstudio.com/tensorflow/posts/2018-06-06-simple-audio-classification-keras/