---
output:
  md_document:
    variant: markdown_github
---
```{r, echo = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```
# resamplr
[![lifecycle](https://img.shields.io/badge/lifecycle-retired-orange.svg)](https://www.tidyverse.org/lifecycle/#retired)
[![Travis-CI Build Status](https://travis-ci.org/jrnold/resamplr.svg?branch=master)](https://travis-ci.org/jrnold/resamplr)
[![codecov](https://codecov.io/gh/jrnold/resamplr/branch/master/graph/badge.svg)](https://codecov.io/gh/jrnold/resamplr)
## Status
I have stopped active development on the **resamplr** package.
I suggest using the [rsample](https://topepo.github.io/rsample/) package,
which implements similar resampling methods.
I am also working on a lower-level package, [mlspearer](https://jrnold.github.com/mlspearer).
That package will implement the same resampling, cross-validation, and permutation methods, but will work with a wider variety of objects.
## Introduction
The **resamplr** package provides functions that implement resampling methods, including the bootstrap, jackknife, random test/train sets, k-fold cross-validation, leave-one-out and leave-p-out cross-validation, time-series cross-validation, time-series k-fold cross-validation, permutations, and rolling windows.
These functions generate data frames with `resample` objects that work with the modelling pipeline of [modelr](https://github.com/hadley/modelr) and the [tidyverse](http://tidyverse.org/).
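
As a rough illustration of that pipeline, the sketch below uses only **modelr** and **purrr** (not **resamplr** itself) to fit a model on each bootstrap replicate. The choice of `mtcars`, the formula `mpg ~ wt`, and the number of replicates are assumptions made for the example.

```r
library(modelr)
library(purrr)

set.seed(42)  # make the resampling reproducible

# 50 bootstrap replicates of mtcars; `strap` is a list-column of resample objects
boot <- modelr::bootstrap(mtcars, 50)

# Fit one regression per replicate; lm() coerces each resample to a data frame
models <- map(boot$strap, ~ lm(mpg ~ wt, data = .x))

# Collect the slope estimates and summarize their spread
slopes <- map_dbl(models, ~ coef(.x)[["wt"]])
quantile(slopes, c(0.025, 0.975))
```

Because each row of `boot` holds a lazy resample, the data are only copied when a model is fitted.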
## Installation
**resamplr** is not on CRAN. You can install the
development version from GitHub with **devtools**:
``` r
# install.packages("devtools")
devtools::install_github("jrnold/resamplr")
```
## Main Features
The **resamplr** package includes functions to generate data frames of
lazy resample objects, as introduced in the [tidyverse](http://tidyverse.org/) [modelr](https://github.com/hadley/modelr) package. The `resample` class
stores a "pointer" to the original dataset and a vector of row indices.
A `resample` object can be coerced to a data frame with `as.data.frame`, and its row indices can be extracted with `as.integer`.
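```{r}
library("modelr")
library("resamplr")
# Lazy resample of the first ten rows of mtcars
rs <- resample(mtcars, 1:10)
# Materialize the selected rows as a data frame
as.data.frame(rs)
# Extract the stored row indices
as.integer(rs)
```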
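
Since a `resample` stores only indices, the same row can appear more than once, which is what makes bootstrap samples cheap to represent. A minimal sketch using **modelr** alone, with an arbitrary sample size of five chosen for the example:

```r
library(modelr)

set.seed(1)
# Draw row indices with replacement; sample.int() returns the integer
# vector that modelr::resample() expects
idx <- sample.int(nrow(mtcars), size = 5, replace = TRUE)
rs <- resample(mtcars, idx)

as.integer(rs)           # the stored indices, unchanged
nrow(as.data.frame(rs))  # rows are materialized only on coercion
```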
While the **modelr** package contains a few functions with resampling methods (`crossv_kfold`, `crossv_mc`, and `bootstrap`), the **resamplr** package implements many more resampling methods including the following:
- bootstrap
    - bootstrap (weighted, Bayesian): `bootstrap`
    - balanced bootstrap: `balanced_bootstrap`
    - time-series bootstrap: `tsbootstrap`
- cross-validation
    - test-training pairs: `holdout_n`, `holdout_frac`
    - k-fold cross-validation: `crossv_kfold`
    - time-series cross-validation: `crossv_ts`
    - leave-one-out and leave-p-out cross-validation: `crossv_loo`, `crossv_lpo`
    - time-series k-fold cross-validation: `crossv_tskfold`
- jackknife: `jackknife`
- permutations: `permute`
- rolling windows: `roll`
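
To give a sense of how such a resampling data frame is typically consumed, here is a hedged sketch of 5-fold cross-validation. It calls the **modelr** version of `crossv_kfold` explicitly, because the argument names of the **resamplr** methods are not documented here; the model and the number of folds are assumptions for the example.

```r
library(modelr)
library(purrr)

set.seed(123)

# One row per fold, with list-columns `train` and `test` of resample objects
cv <- modelr::crossv_kfold(mtcars, k = 5)

# Fit on each training set, then score on the matching test set
models <- map(cv$train, ~ lm(mpg ~ wt, data = .x))
errs <- map2_dbl(models, cv$test, modelr::rmse)

mean(errs)  # cross-validated RMSE averaged over folds
```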
All resampling functions are implemented as generic functions with methods for data frames (`data.frame`) and grouped data frames (`grouped_df`).
When used with grouped data frames, these functions can resample groups instead of rows, resample rows within each group (stratification), or both, depending on what is appropriate for the method.
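
The difference between the two grouped strategies can be sketched in base R. This is not the **resamplr** implementation; the variable names (`groups`, `idx_groups`, `idx_strat`) and the use of `cyl` as the grouping variable are assumptions for illustration only.

```r
set.seed(1)

# Row indices of mtcars, split by the grouping variable
groups <- split(seq_len(nrow(mtcars)), mtcars$cyl)

# (a) Resample groups: draw whole groups with replacement, so all rows
#     of a drawn group enter the sample together
drawn <- sample(names(groups), length(groups), replace = TRUE)
idx_groups <- unlist(groups[drawn], use.names = FALSE)

# (b) Stratified: resample rows within each group, preserving group sizes.
#     sample.int() on the group length avoids the sample() pitfall for
#     groups that contain a single row
idx_strat <- unlist(
  lapply(groups, function(i) i[sample.int(length(i), replace = TRUE)]),
  use.names = FALSE
)

table(mtcars$cyl[idx_strat])  # same group counts as the original data
```

Either index vector could then be wrapped in a lazy resample, for example with `modelr::resample(mtcars, idx_strat)`.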