Move combine_words() and write_bib() to xfun #2382

Draft
wants to merge 1 commit into
base: master
3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -163,10 +163,10 @@ SystemRequirements: Package vignettes based on R Markdown v2 or reStructuredText
 Collate:
     'block.R'
     'cache.R'
-    'utils.R'
     'citation.R'
     'hooks-html.R'
     'plot.R'
+    'utils.R'
     'defaults.R'
     'concordance.R'
     'engine.R'
@@ -199,3 +199,4 @@ Collate:
     'utils-vignettes.R'
     'zzz.R'
 RoxygenNote: 7.3.2
+Remotes: yihui/xfun
4 changes: 4 additions & 0 deletions NEWS.md
@@ -1,5 +1,9 @@
 # CHANGES IN knitr VERSION 1.50

+## MINOR CHANGES
+
+- Moved implementations of `combine_words()` and `write_bib()` to the **xfun** package as `xfun::join_words()` and `xfun::pkg_bib()`, respectively, since they are not directly relevant to **knitr**. The functions `combine_words()` and `write_bib()` are still kept in **knitr**, and can continue to be used in the future.
+
 # CHANGES IN knitr VERSION 1.49

 ## NEW FEATURES
172 changes: 4 additions & 168 deletions R/citation.R
@@ -1,172 +1,8 @@
 #' Generate BibTeX bibliography databases for R packages
 #'
-#' This function uses \code{utils::\link{citation}()} and
-#' \code{utils::\link{toBibtex}()} to create bib entries for R packages and
-#' write them in a file. It can facilitate the auto-generation of bibliography
-#' databases for R packages, and it is easy to regenerate all the citations
-#' after updating R packages.
-#'
-#' For a package, the keyword \samp{R-pkgname} is used for its bib item, where
-#' \samp{pkgname} is the name of the package. Citation entries specified in the
-#' \file{CITATION} file of the package are also included. The main purpose of
-#' this function is to automate the generation of the package citation
-#' information because it often changes (e.g. author, year, package version,
-#' ...).
-#'
-#' There are at least two different uses for the URL in a reference list. You
-#' might want to tell users where to go for more information; in that case, use
-#' the default \code{packageURL = TRUE}, and the first URL listed in the
-#' \file{DESCRIPTION} file will be used. Be careful: some authors don't put the
-#' most relevant URL first. Alternatively, you might want to identify exactly
-#' which version of the package was used in the document. If it was installed
-#' from CRAN or some other repositories, the version number identifies it, and
-#' \code{packageURL = FALSE} will use the repository URL (as used by
-#' \code{utils::\link{citation}()}).
-#'
-#' @param x Package names. Packages which are not installed are ignored.
-#' @param file The (\file{.bib}) file to write. By default, or if \code{NULL},
-#' output is written to the R console.
-#' @param tweak Whether to fix some known problems in the citations, especially
-#' non-standard format of author names.
-#' @param width Width of lines in bibliography entries. If \code{NULL}, lines
-#' will not be wrapped.
-#' @param prefix Prefix string for keys in BibTeX entries; by default, it is
-#' \samp{R-} unless \code{\link{option}('knitr.bib.prefix')} has been set to
-#' another string.
-#' @param lib.loc A vector of path names of R libraries.
-#' @param packageURL Use the \code{URL} field from the \file{DESCRIPTION} file.
-#' See Details below.
-#' @return A list containing the citations. Citations are also written to the
-#' \code{file} as a side effect.
-#' @note Some packages on CRAN do not have standard bib entries, which was once
-#' reported by Michael Friendly at
-#' \url{https://stat.ethz.ch/pipermail/r-devel/2010-November/058977.html}. I
-#' find this a real pain, and there are no easy solutions except contacting
-#' package authors to modify their DESCRIPTION files. Anyway, the argument
-#' \code{tweak} has provided ugly hacks to deal with packages which are known
-#' to be non-standard in terms of the format of citations; \code{tweak = TRUE}
-#' is by no means intended to hide or modify the original citation
-#' information. It is just due to the loose requirements on package authors
-#' for the DESCRIPTION file. On one hand, I apologize if it really mangles the
-#' information about certain packages; on the other, I strongly recommend
-#' package authors to consider the \samp{Authors@@R} field (see the manual
-#' \emph{Writing R Extensions}) to make it easier for other people to cite R
-#' packages. See \code{knitr:::.tweak.bib} for details of tweaks. Also note
-#' this is subject to future changes since R packages are being updated. If
-#' you want to contribute more tweaks, please edit the file
-#' \file{inst/misc/tweak_bib.csv} in the source package.
+#' A wrapper function of \code{xfun::pkg_bib()}.
+#' @param ...,prefix Arguments passed to \code{xfun::\link[xfun]{pkg_bib}()}.
 #' @export
-#' @author Yihui Xie and Michael Friendly
-#' @examplesIf interactive()
-#' write_bib(c('RGtk2', 'gWidgets'), file = 'R-GUI-pkgs.bib')
-#' unlink('R-GUI-pkgs.bib')
-#'
-#' write_bib(c('animation', 'rgl', 'knitr', 'ggplot2'))
-#' write_bib(c('base', 'parallel', 'MASS')) # base and parallel are identical
-#' write_bib('cluster', prefix = '') # a empty prefix
-#' write_bib('digest', prefix = 'R-pkg-') # a new prefix
-#' write_bib('digest', tweak = FALSE) # original version
-#'
-#' # what tweak=TRUE does
-#' str(knitr:::.tweak.bib)
-write_bib = function(
-  x = .packages(), file = '', tweak = TRUE, width = NULL,
-  prefix = getOption('knitr.bib.prefix', 'R-'), lib.loc = NULL,
-  packageURL = TRUE
-) {
-  system.file = function(...) base::system.file(..., lib.loc = lib.loc)
-  citation = function(...) utils::citation(..., lib.loc = lib.loc)
-  x = x[nzchar(x)] # remove possible empty string
-  idx = mapply(system.file, package = x) == ''
-  if (any(idx)) {
-    warning('package(s) ', paste(x[idx], collapse = ', '), ' not found')
-    x = x[!idx]
-  }
-  # no need to write bib for packages in base R other than `base` itself
-  x = setdiff(x, setdiff(xfun::base_pkgs(), 'base'))
-  x = sort(x)
-  bib = sapply(x, function(pkg) {
-    meta = packageDescription(pkg, lib.loc = lib.loc)
-    # don't use the citation() URL if the package has provided its own URL
-    cite = citation(pkg, auto = if (is.null(meta$URL)) meta else {
-      if (packageURL) meta$Repository = meta$RemoteType = NULL
-      # use the first URL in case the package provided multiple URLs
-      meta$URL = sub('[, \t\n].*', '', meta$URL)
-      meta
-    })
-    if (tweak) {
-      # e.g. gpairs has "gpairs: " in the title
-      cite$title = gsub(sprintf('^(%s: )(\\1)', pkg), '\\1', cite$title)
-    }
-    entry = toBibtex(cite)
-    entry[1] = sub('\\{,$', sprintf('{%s%s,', prefix, pkg), entry[1])
-    entry
-  }, simplify = FALSE)
-  if (tweak) {
-    for (i in intersect(names(.tweak.bib), x)) {
-      message('tweaking ', i)
-      bib[[i]] = merge_list(bib[[i]], .tweak.bib[[i]])
-    }
-    bib = lapply(bib, function(b) {
-      b['author'] = sub('Duncan Temple Lang', 'Duncan {Temple Lang}', b['author'])
-      # remove the ugly single quotes required by CRAN policy
-      b['title'] = gsub("(^|\\W)'([^']+)'(\\W|$)", '\\1\\2\\3', b['title'])
-      # keep the first URL if multiple are provided
-      if (!is.na(b['note'])) b['note'] = gsub(
-        '(^.*?https?://.*?),\\s+https?://.*?(},\\s*)$', '\\1\\2', b['note']
-      )
-      if (!('year' %in% names(b))) b['year'] = .this.year
-      b
-    })
-  }
-  # also read citation entries from the CITATION file if provided
-  bib2 = lapply(x, function(pkg) {
-    if (pkg == 'base') return()
-    if (system.file('CITATION', package = pkg) == '') return()
-    cites = citation(pkg, auto = FALSE)
-    cites = Filter(x = cites, function(cite) {
-      # exclude entries identical to citation(pkg, auto = TRUE)
-      !isTRUE(grepl('R package version', cite$note))
-    })
-    s = make_unique(unlist(lapply(cites, function(cite) {
-      if (is.null(cite$year)) format(Sys.Date(), '%Y') else cite$year
-    })))
-    mapply(cites, s, FUN = function(cite, suffix) {
-      # the entry is likely to be the same as citation(pkg, auto = TRUE)
-      if (isTRUE(grepl('R package version', cite$note))) return()
-      entry = toBibtex(cite)
-      entry[1] = sub('\\{,$', sprintf('{%s%s,', pkg, suffix), entry[1])
-      entry
-    }, SIMPLIFY = FALSE)
-  })
-  bib = c(bib, unlist(bib2, recursive = FALSE))
-  bib = lapply(bib, function(b) {
-    idx = which(names(b) == '')
-    if (!is.null(width)) b[-idx] = str_wrap(b[-idx], width, 2, 4)
-    lines = c(b[idx[1L]], b[-idx], b[idx[2L]], '')
-    if (tweak) {
-      # e.g. KernSmooth and spam has & in the title and the journal, respectively
-      lines = gsub('(?<!\\\\)&', '\\\\&', lines, perl = TRUE)
-    }
-    structure(lines, class = 'Bibtex')
-  })
-  if (!is.null(file) && length(x)) write_utf8(unlist(bib), file)
-  invisible(bib)
+write_bib = function(..., prefix = getOption('knitr.bib.prefix', 'R-')) {
+  xfun::pkg_bib(..., prefix = prefix)
 }
-
-.this.year = sprintf(' year = {%s},', format(Sys.Date(), '%Y'))
-
-#' @include utils.R
-
-# hack non-standard author fields
-.tweak.bib = local({
-  x = read.csv(inst_dir('misc/tweak_bib.csv'), stringsAsFactors = FALSE)
-  if (Sys.getlocale('LC_COLLATE') == 'en_US.UTF-8') {
-    x = x[order(xtfrm(x$package)), , drop = FALSE] # reorder entries by package names
-    try_silent(write.csv(x, inst_dir('misc/tweak_bib.csv'), row.names = FALSE))
-  }
-  setNames(
-    lapply(x$author, function(a) c(author = sprintf(' author = {%s},', a))),
-    x$package
-  )
-})
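The new `write_bib()` keeps `prefix` as an explicit argument instead of folding it into `...`, presumably so that the `knitr.bib.prefix` option still supplies the default on the knitr side. The following is a minimal base-R sketch of that forwarding pattern. `pkg_bib_stub()` and `write_bib_sketch()` are hypothetical names used only for illustration; the stub merely builds citation keys and does not reflect what `xfun::pkg_bib()` actually does.

```r
# Hypothetical stand-in for xfun::pkg_bib(): it only builds BibTeX keys.
pkg_bib_stub = function(x, prefix = 'R-') paste0(prefix, x)

# Same shape as the new write_bib(): forward everything through `...`, but keep
# `prefix` explicit so that its default is read from the knitr option.
write_bib_sketch = function(..., prefix = getOption('knitr.bib.prefix', 'R-')) {
  pkg_bib_stub(..., prefix = prefix)
}

old = options(knitr.bib.prefix = NULL)  # start from the unset state
key_default = write_bib_sketch('digest')                # option unset: fallback 'R-'
options(knitr.bib.prefix = 'pkg-')
key_option = write_bib_sketch('digest')                 # option set: it becomes the default
key_explicit = write_bib_sketch('digest', prefix = '')  # an explicit argument wins
options(old)  # restore the caller's option
```

Because default arguments are evaluated lazily at call time, changing the option between calls changes the default without redefining the wrapper.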
2 changes: 1 addition & 1 deletion R/output.R
@@ -529,7 +529,7 @@ sew.source = function(x, options, ...) {
 msg_wrap = function(message, type, options) {
   # when the output format is LaTeX, do not wrap messages (let LaTeX deal with wrapping)
   if (!length(grep('\n', message)) && !out_format(c('latex', 'listings', 'sweave')))
-    message = str_wrap(message, width = getOption('width'))
+    message = xfun::str_wrap(message, width = getOption('width'))
   knit_log$set(setNames(
     list(c(knit_log$get(type), paste0('Chunk ', options$label, ':\n ', message))),
     type
8 changes: 0 additions & 8 deletions R/utils-string.R
@@ -17,14 +17,6 @@ str_insert = function(x, i, value) {
   paste0(substr(x, 1, i), value, substr(x, i + 1, n))
 }

-# a wrapper function to make strwrap() return a character vector of the same
-# length as the input vector; each element of the output vector is a string
-# formed by concatenating wrapped strings by \n
-str_wrap = function(...) {
-  res = strwrap(..., simplify = FALSE)
-  unlist(lapply(res, one_string))
-}
-
 # a simplified replacement for stringr::str_locate_all() that returns a list
 # having an element for every element of 'string'; every list element is an
 # integer matrix having a row per match, and two columns: 'start' and 'end'.
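The removed `str_wrap()` helper (which `R/output.R` now calls as `xfun::str_wrap()`) exists because plain `strwrap()` flattens all wrapped pieces into one vector, losing the one-to-one mapping to the input. A small base-R sketch of the behavior described in the removed comment, with `str_wrap_sketch()` as a hypothetical name and `paste(collapse = '\n')` standing in for knitr's internal `one_string()`:

```r
# Wrap each element separately, then join the wrapped pieces of each element
# with "\n" so that the output has the same length as the input.
str_wrap_sketch = function(x, ...) {
  res = strwrap(x, ..., simplify = FALSE)  # a list: one character vector per element
  unlist(lapply(res, paste, collapse = '\n'))
}

x = c('The quick brown fox jumps over the lazy dog', 'short')
wrapped = str_wrap_sketch(x, width = 20)
plain = strwrap(x, width = 20)  # default simplify = TRUE flattens all pieces

length(wrapped)                  # one string per input element
length(plain) > length(wrapped)  # the flattened result has more elements
```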
55 changes: 3 additions & 52 deletions R/utils.R
@@ -909,47 +909,10 @@ create_label = function(..., latex = FALSE) {

 #' Combine multiple words into a single string
 #'
-#' When a value from an inline R expression is a character vector of multiple
-#' elements, we may want to combine them into a phrase like \samp{a and b}, or
-#' \code{a, b, and c}. That is what this a helper function does.
-#'
-#' If the length of the input \code{words} is smaller than or equal to 1,
-#' \code{words} is returned. When \code{words} is of length 2, the first word
-#' and second word are combined using the \code{and} string, or if blank,
-#' \code{sep} if is used. When the length is greater than 2, \code{sep} is used
-#' to separate all words, and the \code{and} string is prepended to the last
-#' word.
-#' @param words A character vector.
-#' @param sep Separator to be inserted between words.
-#' @param and Character string to be prepended to the last word.
-#' @param before,after A character string to be added before/after each word.
-#' @param oxford_comma Whether to insert the separator between the last two
-#' elements in the list.
-#' @return A character string marked by \code{xfun::\link[xfun]{raw_string}()}.
+#' This is a wrapper function of \code{xfun::join_words()}.
+#' @param ... Arguments passed to \code{xfun::\link[xfun]{join_words}()}.
 #' @export
-#' @examples combine_words('a'); combine_words(c('a', 'b'))
-#' combine_words(c('a', 'b', 'c'))
-#' combine_words(c('a', 'b', 'c'), sep = ' / ', and = '')
-#' combine_words(c('a', 'b', 'c'), and = '')
-#' combine_words(c('a', 'b', 'c'), before = '"', after = '"')
-#' combine_words(c('a', 'b', 'c'), before = '"', after = '"', oxford_comma=FALSE)
-combine_words = function(
-  words, sep = ', ', and = ' and ', before = '', after = before, oxford_comma = TRUE
-) {
-  n = length(words); rs = xfun::raw_string
-  if (n == 0) return(words)
-  words = paste0(before, words, after)
-  if (n == 1) return(rs(words))
-  if (n == 2) return(rs(paste(words, collapse = if (is_blank(and)) sep else and)))
-  if (oxford_comma && grepl('^ ', and) && grepl(' $', sep)) and = gsub('^ ', '', and)
-  words[n] = paste0(and, words[n])
-  # combine the last two words directly without the comma
-  if (!oxford_comma) {
-    words[n - 1] = paste0(words[n - 1:0], collapse = '')
-    words = words[-n]
-  }
-  rs(paste(words, collapse = sep))
-}
+combine_words = function(...) xfun::join_words(...)

 warning2 = function(...) warning(..., call. = FALSE)
 stop2 = function(...) stop(..., call. = FALSE)
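Since the detailed documentation of `combine_words()` is removed from knitr in this hunk, it may help to see the documented rules with concrete results. The sketch below re-states the removed logic in base R under two stated simplifications: it returns a plain string rather than one marked by `xfun::raw_string()`, and it uses a regular expression in place of knitr's internal `is_blank()`. `join_words_sketch()` is a hypothetical name, not the xfun function.

```r
join_words_sketch = function(
  words, sep = ', ', and = ' and ', before = '', after = before, oxford_comma = TRUE
) {
  n = length(words)
  if (n == 0) return(words)
  words = paste0(before, words, after)
  if (n == 1) return(words)
  # two words: join with `and`, or with `sep` if `and` is blank
  if (n == 2) return(paste(words, collapse = if (grepl('^\\s*$', and)) sep else and))
  # avoid a double space when `sep` ends with a space and `and` starts with one
  if (oxford_comma && grepl('^ ', and) && grepl(' $', sep)) and = sub('^ ', '', and)
  words[n] = paste0(and, words[n])
  if (!oxford_comma) {
    # glue the last two words together so that no separator falls between them
    words[n - 1] = paste0(words[n - 1:0], collapse = '')
    words = words[-n]
  }
  paste(words, collapse = sep)
}

join_words_sketch(c('a', 'b'))                              # "a and b"
join_words_sketch(c('a', 'b', 'c'))                         # "a, b, and c"
join_words_sketch(c('a', 'b', 'c'), oxford_comma = FALSE)   # "a, b and c"
join_words_sketch(c('a', 'b', 'c'), sep = ' / ', and = '')  # "a / b / c"
```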
@@ -1100,18 +1063,6 @@ one_string = function(x, ...) paste(x, ..., collapse = '\n')
 # double quote a vector and combine by "; "
 quote_vec = function(x, sep = '; ') paste0(sprintf('"%s"', x), collapse = sep)

-# c(1, 1, 1, 2, 3, 3) -> c(1a, 1b, 1c, 2a, 3a, 3b)
-make_unique = function(x) {
-  if (length(x) == 0) return(x)
-  x2 = make.unique(x)
-  if (all(i <- x2 == x)) return(x)
-  x2[i] = paste0(x2[i], '.0')
-  i = as.numeric(sub('.*[.]([0-9]+)$', '\\1', x2)) + 1
-  s = letters[i]
-  s = ifelse(is.na(s), i, s)
-  paste0(x, s)
-}
-
 #' Encode an image file to a data URI
 #'
 #' This function is the same as \code{xfun::\link[xfun]{base64_uri}()} (only with a
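In the removed `write_bib()`, `make_unique()` was applied to the years of a package's `CITATION` entries, and each result was appended to the package name to form a BibTeX key. A worked example of the removed helper's logic, copied under the hypothetical name `make_unique_sketch()`, shows one subtlety: as soon as any value is duplicated, every value receives a letter suffix, including the ones that were already unique.

```r
make_unique_sketch = function(x) {
  if (length(x) == 0) return(x)
  x2 = make.unique(x)               # duplicates become "x.1", "x.2", ...
  if (all(i <- x2 == x)) return(x)  # nothing duplicated: return the input unchanged
  x2[i] = paste0(x2[i], '.0')       # first occurrences get index 0
  i = as.numeric(sub('.*[.]([0-9]+)$', '\\1', x2)) + 1
  s = letters[i]                    # 1 -> "a", 2 -> "b", ...
  s = ifelse(is.na(s), i, s)        # beyond "z", fall back to the number
  paste0(x, s)
}

years = c('2019', '2020', '2020', '2020', '2021')
make_unique_sketch(years)              # "2019a" "2020a" "2020b" "2020c" "2021a"
make_unique_sketch(c('2019', '2021'))  # unchanged: "2019" "2021"
```

Note that the input must be a character vector, since `make.unique()` does not accept numbers; the `c(1, 1, 1, 2, 3, 3)` in the original comment is shorthand.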
44 changes: 0 additions & 44 deletions inst/misc/tweak_bib.csv

This file was deleted.

44 changes: 3 additions & 41 deletions man/combine_words.Rd

Some generated files are not rendered by default.