Commit

move combine_words() and write_bib() to xfun
yihui committed Nov 13, 2024
1 parent 2e36c36 commit d35f2f1
Showing 11 changed files with 20 additions and 431 deletions.
3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -163,10 +163,10 @@ SystemRequirements: Package vignettes based on R Markdown v2 or reStructuredText
Collate:
'block.R'
'cache.R'
'utils.R'
'citation.R'
'hooks-html.R'
'plot.R'
'utils.R'
'defaults.R'
'concordance.R'
'engine.R'
@@ -199,3 +199,4 @@ Collate:
'utils-vignettes.R'
'zzz.R'
RoxygenNote: 7.3.2
Remotes: yihui/xfun
4 changes: 4 additions & 0 deletions NEWS.md
@@ -1,5 +1,9 @@
# CHANGES IN knitr VERSION 1.50

## MINOR CHANGES

- Moved implementations of `combine_words()` and `write_bib()` to the **xfun** package as `xfun::join_words()` and `xfun::pkg_bib()`, respectively, since they are not directly relevant to **knitr**. The functions `combine_words()` and `write_bib()` are still kept in **knitr**, and can continue to be used in the future.
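
As a rough illustration of the pattern this entry describes (the old name stays exported and simply forwards its arguments to the new implementation), here is a self-contained sketch in base R. `join_words_sketch()` and `combine_words_sketch()` are hypothetical stand-ins written for this example; they are not the actual code of `xfun::join_words()` or `knitr::combine_words()`, and the joining logic is deliberately simplified.

```r
# Hypothetical, simplified stand-in for the relocated implementation
# (an assumption for illustration, not the real xfun::join_words()).
join_words_sketch = function(words, sep = ', ', and = ' and ') {
  n = length(words)
  if (n <= 1) return(words)
  if (n == 2) return(paste(words, collapse = and))
  # for 3+ words, drop the leading space of `and` since `sep` already ends with one
  words[n] = paste0(sub('^ ', '', and), words[n])
  paste(words, collapse = sep)
}

# The old name is kept as a thin wrapper that forwards every argument unchanged.
combine_words_sketch = function(...) join_words_sketch(...)

two   = combine_words_sketch(c('a', 'b'))
three = combine_words_sketch(c('a', 'b', 'c'))
print(two)    # "a and b"
print(three)  # "a, b, and c"
```

Because the wrapper takes only `...`, any argument the underlying function accepts can be passed through the old name without the wrapper having to list it.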

# CHANGES IN knitr VERSION 1.49

## NEW FEATURES
172 changes: 4 additions & 168 deletions R/citation.R
@@ -1,172 +1,8 @@
#' Generate BibTeX bibliography databases for R packages
#'
#' This function uses \code{utils::\link{citation}()} and
#' \code{utils::\link{toBibtex}()} to create bib entries for R packages and
#' write them in a file. It can facilitate the auto-generation of bibliography
#' databases for R packages, and it is easy to regenerate all the citations
#' after updating R packages.
#'
#' For a package, the keyword \samp{R-pkgname} is used for its bib item, where
#' \samp{pkgname} is the name of the package. Citation entries specified in the
#' \file{CITATION} file of the package are also included. The main purpose of
#' this function is to automate the generation of the package citation
#' information because it often changes (e.g. author, year, package version,
#' ...).
#'
#' There are at least two different uses for the URL in a reference list. You
#' might want to tell users where to go for more information; in that case, use
#' the default \code{packageURL = TRUE}, and the first URL listed in the
#' \file{DESCRIPTION} file will be used. Be careful: some authors don't put the
#' most relevant URL first. Alternatively, you might want to identify exactly
#' which version of the package was used in the document. If it was installed
#' from CRAN or some other repositories, the version number identifies it, and
#' \code{packageURL = FALSE} will use the repository URL (as used by
#' \code{utils::\link{citation}()}).
#'
#' @param x Package names. Packages which are not installed are ignored.
#' @param file The (\file{.bib}) file to write. By default, or if \code{NULL},
#' output is written to the R console.
#' @param tweak Whether to fix some known problems in the citations, especially
#' non-standard format of author names.
#' @param width Width of lines in bibliography entries. If \code{NULL}, lines
#' will not be wrapped.
#' @param prefix Prefix string for keys in BibTeX entries; by default, it is
#' \samp{R-} unless \code{\link{option}('knitr.bib.prefix')} has been set to
#' another string.
#' @param lib.loc A vector of path names of R libraries.
#' @param packageURL Use the \code{URL} field from the \file{DESCRIPTION} file.
#' See Details below.
#' @return A list containing the citations. Citations are also written to the
#' \code{file} as a side effect.
#' @note Some packages on CRAN do not have standard bib entries, which was once
#' reported by Michael Friendly at
#' \url{https://stat.ethz.ch/pipermail/r-devel/2010-November/058977.html}. I
#' find this a real pain, and there are no easy solutions except contacting
#' package authors to modify their DESCRIPTION files. Anyway, the argument
#' \code{tweak} has provided ugly hacks to deal with packages which are known
#' to be non-standard in terms of the format of citations; \code{tweak = TRUE}
#' is by no means intended to hide or modify the original citation
#' information. It is just due to the loose requirements on package authors
#' for the DESCRIPTION file. On one hand, I apologize if it really mangles the
#' information about certain packages; on the other, I strongly recommend
#' package authors to consider the \samp{Authors@@R} field (see the manual
#' \emph{Writing R Extensions}) to make it easier for other people to cite R
#' packages. See \code{knitr:::.tweak.bib} for details of tweaks. Also note
#' this is subject to future changes since R packages are being updated. If
#' you want to contribute more tweaks, please edit the file
#' \file{inst/misc/tweak_bib.csv} in the source package.
#' A wrapper function of \code{xfun::pkg_bib()}.
#' @param ...,prefix Arguments passed to \code{xfun::\link[xfun]{pkg_bib}()}.
#' @export
#' @author Yihui Xie and Michael Friendly
#' @examplesIf interactive()
#' write_bib(c('RGtk2', 'gWidgets'), file = 'R-GUI-pkgs.bib')
#' unlink('R-GUI-pkgs.bib')
#'
#' write_bib(c('animation', 'rgl', 'knitr', 'ggplot2'))
#' write_bib(c('base', 'parallel', 'MASS')) # base and parallel are identical
#' write_bib('cluster', prefix = '') # an empty prefix
#' write_bib('digest', prefix = 'R-pkg-') # a new prefix
#' write_bib('digest', tweak = FALSE) # original version
#'
#' # what tweak=TRUE does
#' str(knitr:::.tweak.bib)
write_bib = function(
x = .packages(), file = '', tweak = TRUE, width = NULL,
prefix = getOption('knitr.bib.prefix', 'R-'), lib.loc = NULL,
packageURL = TRUE
) {
system.file = function(...) base::system.file(..., lib.loc = lib.loc)
citation = function(...) utils::citation(..., lib.loc = lib.loc)
x = x[nzchar(x)] # remove possible empty string
idx = mapply(system.file, package = x) == ''
if (any(idx)) {
warning('package(s) ', paste(x[idx], collapse = ', '), ' not found')
x = x[!idx]
}
# no need to write bib for packages in base R other than `base` itself
x = setdiff(x, setdiff(xfun::base_pkgs(), 'base'))
x = sort(x)
bib = sapply(x, function(pkg) {
meta = packageDescription(pkg, lib.loc = lib.loc)
# don't use the citation() URL if the package has provided its own URL
cite = citation(pkg, auto = if (is.null(meta$URL)) meta else {
if (packageURL) meta$Repository = meta$RemoteType = NULL
# use the first URL in case the package provided multiple URLs
meta$URL = sub('[, \t\n].*', '', meta$URL)
meta
})
if (tweak) {
# e.g. gpairs has "gpairs: " in the title
cite$title = gsub(sprintf('^(%s: )(\\1)', pkg), '\\1', cite$title)
}
entry = toBibtex(cite)
entry[1] = sub('\\{,$', sprintf('{%s%s,', prefix, pkg), entry[1])
entry
}, simplify = FALSE)
if (tweak) {
for (i in intersect(names(.tweak.bib), x)) {
message('tweaking ', i)
bib[[i]] = merge_list(bib[[i]], .tweak.bib[[i]])
}
bib = lapply(bib, function(b) {
b['author'] = sub('Duncan Temple Lang', 'Duncan {Temple Lang}', b['author'])
# remove the ugly single quotes required by CRAN policy
b['title'] = gsub("(^|\\W)'([^']+)'(\\W|$)", '\\1\\2\\3', b['title'])
# keep the first URL if multiple are provided
if (!is.na(b['note'])) b['note'] = gsub(
'(^.*?https?://.*?),\\s+https?://.*?(},\\s*)$', '\\1\\2', b['note']
)
if (!('year' %in% names(b))) b['year'] = .this.year
b
})
}
# also read citation entries from the CITATION file if provided
bib2 = lapply(x, function(pkg) {
if (pkg == 'base') return()
if (system.file('CITATION', package = pkg) == '') return()
cites = citation(pkg, auto = FALSE)
cites = Filter(x = cites, function(cite) {
# exclude entries identical to citation(pkg, auto = TRUE)
!isTRUE(grepl('R package version', cite$note))
})
s = make_unique(unlist(lapply(cites, function(cite) {
if (is.null(cite$year)) format(Sys.Date(), '%Y') else cite$year
})))
mapply(cites, s, FUN = function(cite, suffix) {
# the entry is likely to be the same as citation(pkg, auto = TRUE)
if (isTRUE(grepl('R package version', cite$note))) return()
entry = toBibtex(cite)
entry[1] = sub('\\{,$', sprintf('{%s%s,', pkg, suffix), entry[1])
entry
}, SIMPLIFY = FALSE)
})
bib = c(bib, unlist(bib2, recursive = FALSE))
bib = lapply(bib, function(b) {
idx = which(names(b) == '')
if (!is.null(width)) b[-idx] = str_wrap(b[-idx], width, 2, 4)
lines = c(b[idx[1L]], b[-idx], b[idx[2L]], '')
if (tweak) {
# e.g. KernSmooth and spam has & in the title and the journal, respectively
lines = gsub('(?<!\\\\)&', '\\\\&', lines, perl = TRUE)
}
structure(lines, class = 'Bibtex')
})
if (!is.null(file) && length(x)) write_utf8(unlist(bib), file)
invisible(bib)
write_bib = function(..., prefix = getOption('knitr.bib.prefix', 'R-')) {
xfun::pkg_bib(..., prefix = prefix)
}
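
The new wrapper above keeps `prefix` as an explicit argument whose default is read from the `knitr.bib.prefix` option, presumably so that the existing knitr option continues to work after the move. The following base-R sketch shows how such a default behaves; `pkg_bib_sketch()` and `write_bib_sketch()` are hypothetical stand-ins for this example only and do not reproduce what `xfun::pkg_bib()` actually does.

```r
# Hypothetical stand-in for the relocated function: it only builds a citation key.
pkg_bib_sketch = function(x, prefix = 'R-') paste0(prefix, x)

# Wrapper in the same shape as the new write_bib(): the default for `prefix`
# is evaluated at call time, so it reflects the current value of the option.
write_bib_sketch = function(..., prefix = getOption('knitr.bib.prefix', 'R-')) {
  pkg_bib_sketch(..., prefix = prefix)
}

old = options(knitr.bib.prefix = NULL)            # make sure the option is unset
key_default = write_bib_sketch('knitr')           # falls back to 'R-'

options(knitr.bib.prefix = 'pkg-')
key_option = write_bib_sketch('knitr')            # picks up the option

key_explicit = write_bib_sketch('knitr', prefix = '')  # explicit argument wins

options(old)                                      # restore the previous setting
print(c(key_default, key_option, key_explicit))
```

An explicitly supplied `prefix` always takes precedence; the option is consulted only when the caller omits the argument.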

.this.year = sprintf(' year = {%s},', format(Sys.Date(), '%Y'))

#' @include utils.R

# hack non-standard author fields
.tweak.bib = local({
x = read.csv(inst_dir('misc/tweak_bib.csv'), stringsAsFactors = FALSE)
if (Sys.getlocale('LC_COLLATE') == 'en_US.UTF-8') {
x = x[order(xtfrm(x$package)), , drop = FALSE] # reorder entries by package names
try_silent(write.csv(x, inst_dir('misc/tweak_bib.csv'), row.names = FALSE))
}
setNames(
lapply(x$author, function(a) c(author = sprintf(' author = {%s},', a))),
x$package
)
})
2 changes: 1 addition & 1 deletion R/output.R
@@ -529,7 +529,7 @@ sew.source = function(x, options, ...) {
msg_wrap = function(message, type, options) {
# when the output format is LaTeX, do not wrap messages (let LaTeX deal with wrapping)
if (!length(grep('\n', message)) && !out_format(c('latex', 'listings', 'sweave')))
message = str_wrap(message, width = getOption('width'))
message = xfun::str_wrap(message, width = getOption('width'))
knit_log$set(setNames(
list(c(knit_log$get(type), paste0('Chunk ', options$label, ':\n ', message))),
type
8 changes: 0 additions & 8 deletions R/utils-string.R
@@ -17,14 +17,6 @@ str_insert = function(x, i, value) {
paste0(substr(x, 1, i), value, substr(x, i + 1, n))
}

# a wrapper function to make strwrap() return a character vector of the same
# length as the input vector; each element of the output vector is a string
# formed by concatenating wrapped strings by \n
str_wrap = function(...) {
res = strwrap(..., simplify = FALSE)
unlist(lapply(res, one_string))
}

# a simplified replacement for stringr::str_locate_all() that returns a list
# having an element for every element of 'string'; every list element is an
# integer matrix having a row per match, and two columns: 'start' and 'end'.
55 changes: 3 additions & 52 deletions R/utils.R
@@ -909,47 +909,10 @@ create_label = function(..., latex = FALSE) {

#' Combine multiple words into a single string
#'
#' When a value from an inline R expression is a character vector of multiple
#' elements, we may want to combine them into a phrase like \samp{a and b}, or
#' \code{a, b, and c}. That is what this helper function does.
#'
#' If the length of the input \code{words} is smaller than or equal to 1,
#' \code{words} is returned. When \code{words} is of length 2, the first word
#' and second word are combined using the \code{and} string, or if blank,
#' \code{sep} is used. When the length is greater than 2, \code{sep} is used
#' to separate all words, and the \code{and} string is prepended to the last
#' word.
#' @param words A character vector.
#' @param sep Separator to be inserted between words.
#' @param and Character string to be prepended to the last word.
#' @param before,after A character string to be added before/after each word.
#' @param oxford_comma Whether to insert the separator between the last two
#' elements in the list.
#' @return A character string marked by \code{xfun::\link[xfun]{raw_string}()}.
#' This is a wrapper function of \code{xfun::join_words()}.
#' @param ... Arguments passed to \code{xfun::\link[xfun]{join_words}()}.
#' @export
#' @examples combine_words('a'); combine_words(c('a', 'b'))
#' combine_words(c('a', 'b', 'c'))
#' combine_words(c('a', 'b', 'c'), sep = ' / ', and = '')
#' combine_words(c('a', 'b', 'c'), and = '')
#' combine_words(c('a', 'b', 'c'), before = '"', after = '"')
#' combine_words(c('a', 'b', 'c'), before = '"', after = '"', oxford_comma=FALSE)
combine_words = function(
words, sep = ', ', and = ' and ', before = '', after = before, oxford_comma = TRUE
) {
n = length(words); rs = xfun::raw_string
if (n == 0) return(words)
words = paste0(before, words, after)
if (n == 1) return(rs(words))
if (n == 2) return(rs(paste(words, collapse = if (is_blank(and)) sep else and)))
if (oxford_comma && grepl('^ ', and) && grepl(' $', sep)) and = gsub('^ ', '', and)
words[n] = paste0(and, words[n])
# combine the last two words directly without the comma
if (!oxford_comma) {
words[n - 1] = paste0(words[n - 1:0], collapse = '')
words = words[-n]
}
rs(paste(words, collapse = sep))
}
combine_words = function(...) xfun::join_words(...)

warning2 = function(...) warning(..., call. = FALSE)
stop2 = function(...) stop(..., call. = FALSE)
@@ -1100,18 +1063,6 @@ one_string = function(x, ...) paste(x, ..., collapse = '\n')
# double quote a vector and combine by "; "
quote_vec = function(x, sep = '; ') paste0(sprintf('"%s"', x), collapse = sep)

# c(1, 1, 1, 2, 3, 3) -> c(1a, 1b, 1c, 2a, 3a, 3b)
make_unique = function(x) {
if (length(x) == 0) return(x)
x2 = make.unique(x)
if (all(i <- x2 == x)) return(x)
x2[i] = paste0(x2[i], '.0')
i = as.numeric(sub('.*[.]([0-9]+)$', '\\1', x2)) + 1
s = letters[i]
s = ifelse(is.na(s), i, s)
paste0(x, s)
}

#' Encode an image file to a data URI
#'
#' This function is the same as \code{xfun::\link[xfun]{base64_uri}()} (only with a
44 changes: 0 additions & 44 deletions inst/misc/tweak_bib.csv

This file was deleted.

44 changes: 3 additions & 41 deletions man/combine_words.Rd

