Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R] CRAN packaging checklist for version 17.0.0 #43317

Closed
32 of 38 tasks
jonkeane opened this issue Jul 18, 2024 · 57 comments
Closed
32 of 38 tasks

[R] CRAN packaging checklist for version 17.0.0 #43317

jonkeane opened this issue Jul 18, 2024 · 57 comments

Comments

@jonkeane
Copy link
Member

jonkeane commented Jul 18, 2024

Describe the enhancement requested

Packaging checklist for CRAN release

For a high-level overview of the release process see the
Apache Arrow Release Management Guide.

cc @jonkeane @amoeba @nealrichardson @assignUser @paleolimbot for coordination on the various tasks to release

We will want to cherry pick:

Before the release candidate is cut

  • Create a GitHub issue entitled [R] CRAN packaging checklist for version X.X.X and copy this checklist to the issue.
  • Review deprecated functions to advance their deprecation status, including removing preprocessor directives that no longer apply (search for ARROW_VERSION_MAJOR in r/src).
  • Evaluate the status of any failing nightly tests and nightly packaging builds. These checks replicate most of the checks that CRAN runs, so we need them all to be passing or to understand that the failures may (though won't necessarily) result in a rejection from CRAN.
  • Check current CRAN check results
  • Ensure the contents of the README are accurate and up to date.
  • Run urlchecker::url_check() on the R directory at the release candidate.
    commit. Ignore any errors with badges as they will be removed in the CRAN release branch.
  • Polish NEWS but do not update version numbers (this is done automatically later). You can find commits by, for example, git log --oneline aa057d0..HEAD | grep "\[R\]"
  • Run preliminary reverse dependency checks using archery docker run r-revdepcheck.
  • For major releases, prepare tweet thread highlighting new features.

Wait for the release candidate to be cut:

After release candidate has been cut

  • Create a CRAN-release branch from the release candidate commit

Prepare and check the .tar.gz that will be released to CRAN.

  • git fetch upstream && git checkout release-X.X.X-rcXX && git clean -f -d
  • Run make build. This copies Arrow C++ into tools/cpp, prunes some
    unnecessary components, and runs R CMD build to generate the source tarball.
    Because this will install the package, you will need to ensure that the version
    of Arrow C++ available to the configure script is the same as the version
    that is vendored into the R package (e.g., you may need to unset ARROW_HOME).
  • devtools::check_built("arrow_X.X.X.tar.gz") locally
  • Run reverse dependency checks using archery docker run r-revdepcheck.

Release vote

  • Release vote passed!

Generate R package to submit to CRAN

  • If the release candidate commit updated, rebase the CRAN release branch
    on that commit.
  • Pick any commits that were made to main since the release commit that
    were needed to fix CRAN-related submission issues identified in the above
    steps.
  • Remove badges from README.md
  • Run urlchecker::url_check() on the R directory
  • Create a PR entitled WIP: [R] Verify CRAN release-10.0.1-rc0. Add
    a comment @github-actions crossbow submit --group r to run all R crossbow
    jobs against the CRAN-specific release branch.
  • Run Rscript tools/update-checksums.R <libarrow version> to download the checksums for the pre-compiled binaries from the ASF artifactory into the tools directory.
  • Regenerate arrow_X.X.X.tar.gz (i.e., make build)

Ensure linux binary packages are available:

Check binary Arrow C++ distributions specific to the R package

  • Upload the .tar.gz to win-builder (r-devel only)
    and confirm (with Nic, who will automatically receive an email about the results) that the check is clean.
    This step cannot be completed before Jeroen has put the binaries in the MinGW repository, i.e. here, here, and here.
  • Upload the .tar.gz to MacBuilder
    and confirm that the check is clean
  • Check install.packages("arrow_X.X.X.tar.gz") on Ubuntu and ensure that the
    hosted binaries are used
  • devtools::check_built("arrow_X.X.X.tar.gz") locally one more time (for luck)

CRAN submission

  • Upload arrow_X.X.X.tar.gz to the
    CRAN submit page
  • Confirm the submission email

Wait for CRAN...

  • Accepted!
  • Tag the tip of the CRAN-specific release branch
  • Add a new line to the matrix in the backwards compatability job
  • (patch releases only) Update the package version in ci/scripts/PKGBUILD, dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb, r/DESCRIPTION, and r/NEWS.md
  • (CRAN-only releases) Rebuild news page with pkgdown::build_news() and submit a PR to the asf-site branch of the docs site with the contents of arrow/r/docs/news/index.html replacing the current contents of arrow-site/docs/r/news/index.html
  • (CRAN-only releases) Bump the version number in r/pkgdown/assets/versions.json, and update this on the the asf-site branch of the docs site too.
  • Update the packaging checklist template to reflect any new realities of the
    packaging process.
  • Wait for CRAN-hosted binaries on the
    CRAN package page to reflect the
    new version
  • Tweet!

Component(s)

R

Component(s)

R

@jonkeane
Copy link
Member Author

@assignUser Did you use Rscript tools/update-checksums.R <libarrow version> in the past to generate the contents of 12ec842 ?

@assignUser
Copy link
Member

Yes :)
Let me know if I can help with anything!

@amoeba
Copy link
Member

amoeba commented Jul 18, 2024

I added the NEWS.md entry from #43189 to the release blog post PR.

@assignUser
Copy link
Member

@jonkeane mac builder: https://mac.r-project.org/macbuilder/results/1721447405-da61deb72185bfbc/
win builder should reach you by mail, if that's clean I think only a revdep check is missing. @paleolimbot didn't you have a super quick script that doesn't take a million hours to run?

@jonkeane
Copy link
Member Author

Thanks for sending those, Jacob.

Here's the winbuilder output: https://win-builder.r-project.org/SrOKYg8yT1lm/

https://win-builder.r-project.org/SrOKYg8yT1lm/00check.log

The only thing in there that looks worrying is

* checking R code for possible problems ... [33s] NOTE
mutate.ArrowTabular: no visible global function definition for
  'left_join'
mutate.Dataset: no visible global function definition for 'left_join'
mutate.RecordBatchReader: no visible global function definition for
  'left_join'
mutate.arrow_dplyr_query: no visible global function definition for
  'left_join'
Undefined global functions or variables:
  left_join
* checking Rd files ... OK

Which I suspect is from https://github.com/apache/arrow/pull/41576/files and can be dealt with by adding dplyr:: onto those calls again

@amoeba
Copy link
Member

amoeba commented Jul 20, 2024

I brought in the changes from maint-17.0.0-r (see my main branch) and kicked off two recheck jobs (1, 2).

Edit: I need to tweak recheck a bit to run so the linked jobs failed right away. I'll update them once recheck is running.
Edit 2: Updated URLs.

@nealrichardson
Copy link
Member

Which I suspect is from https://github.com/apache/arrow/pull/41576/files and can be dealt with by adding dplyr:: onto those calls again

Sorry! I thought I fixed those but I guess I added more. LMK if you need me to make a PR.

@jonkeane
Copy link
Member Author

jonkeane commented Jul 20, 2024

No problem, I've got one at #43348

I'll merge that if CI shows it gets rid of the note (and there are no other bad consequences) then cherrypick that back to the maintenance branch.

This isn't a here-and-now problem, but boy do I wish we could say "these notes we don't care about but all the others we do" for things like this 🤦

@jonkeane
Copy link
Member Author

@amoeba am I reading these logs from recheck correctly that with new arrow parquetize is failing in this example:

> ### Name: csv_to_parquet
> ### Title: Convert a csv or a txt file to parquet format
> ### Aliases: csv_to_parquet
> 
> ### ** Examples
> 
> 
> # Conversion from a local csv file to a single parquet file :
> 
> csv_to_parquet(
+   path_to_file = parquetize_example("region_2022.csv"),
+   path_to_parquet = tempfile(fileext=".parquet")
+ )
Reading data...
Writing data...
Error in Table__from_dots(dots, schema, option_use_threads()) : 
  No Set_elt found for ALTSTRING class [class: vroom_chr, pkg: vroom]
Calls: csv_to_parquet ... as_arrow_table.data.frame -> <Anonymous> -> Table__from_dots
Reading data...
Execution halted

@amoeba
Copy link
Member

amoeba commented Jul 20, 2024

Hey @jonkeane, I think that's the right interpretation.

@jonkeane
Copy link
Member Author

Ah, that is a regression on us: #43349 I need to step away for a bit, in case someone wants to investigate what's up there. I don't think much has changed with our altrep code, but this could be from our non-api avoidance changes (or possibly the change we made to avoid executing functions while reading data).

@jonkeane
Copy link
Member Author

Ok, I've merged the fix for the altrep thing that was hitting parquetize. @amoeba would you mind triggering the revdep checks again?

@jonkeane
Copy link
Member Author

I just got a note about errors from lz4 on CRAN's devel debian runners.

With current Debian testing, installation now fails, see the install log. Apparently this is from yesterday's system upgrade of

liblz4-dev:amd64 (1.9.4-2, 1.9.4-3), liblz4-1:amd64 (1.9.4-2, 1.9.4-3)

I can confirm that ARROW_WITH_LZ4=OFF ./configure works where ./configure now fails.

We will need to fix this as will in our 17 submission (and get it in within two weeks too).

We are linking against the shared, system lz4 which is fine in this case because CRAN does not distribute these binaries. But one possibility is to force non-shared lz4 and it'll grab source + build.

cc @assignUser @nealrichardson

@nealrichardson
Copy link
Member

We can work around that, sure, but if that's failing, then it sounds like there's something off in our cmake routine in this case, and that should get addressed. For example, I see the message from this line in the logs: https://github.com/apache/arrow/blob/main/cpp/cmake_modules/ThirdpartyToolchain.cmake#L352 But I don't see either of the messages that should follow, about whether lz4 was found or not.

@jonkeane
Copy link
Member Author

For example, I see the message from this line in the logs: https://github.com/apache/arrow/blob/main/cpp/cmake_modules/ThirdpartyToolchain.cmake#L352 But I don't see either of the messages that should follow, about whether lz4 was found or not.

When you say you don't see either of the messages that follow are you talking about

message(STATUS "Using ${RESOLVE_DEPENDENCY_PC_PACKAGE}")
or
message(STATUS "${RESOLVE_DEPENDENCY_PC_PACKAGE} isn't found")
? If so, that first one (line 354) is actually where the log line that you're seeing is emitted, I believe. The line you linked 352 is adding to the string, but the part that starts with Using ... is emitted at line 354 (so we do get there!)

@jonkeane
Copy link
Member Author

@kou might have some ideas + help here too (sorry I should have tagged you on that original comment)

@nealrichardson
Copy link
Member

Ah my bad, you are right.

@kou
Copy link
Member

kou commented Jul 31, 2024

#43468 will solve the problem. I'll complete it soon.

@assignUser
Copy link
Member

Ah thanks @kou! I was just exploring the lz4 that comes with debian:testing you are right: foreach(_cmake_expected_target IN ITEMS LZ4::lz4_shared LZ4::lz4_static)

@jonkeane
Copy link
Member Author

Thank you (again) @kou !

@amoeba
Copy link
Member

amoeba commented Jul 31, 2024

Revdepchecks are running at https://github.com/amoeba/arrow/actions. It looks like one already failed but I don't have enough signal to check right now (will tomorrow).

@jonkeane
Copy link
Member Author

FYI, I got another email from CRAN preponing the deadline to ~today since CRAN shuts down for two weeks tomorrow 😬

@jonkeane
Copy link
Member Author

jonkeane commented Jul 31, 2024

Weird, I see ^^^ in the logs https://github.com/amoeba/arrow/actions/runs/10172857322/job/28137640795 but when I download the archives I don't see any issues with either of those in the actual logs.

Ugh, nevermind, I was looking at the wrong thing! HMMM I see the same issue for parquetize now that I'm looking at the right thing.

Is it possible you haven't pulled the latest updates from https://github.com/apache/arrow/tree/maint-17.0.0-r ?

As for the tidyquery failure, maybe @ianmcook could help figure out what's up with that??

Package: tidyquery 0.2.4
Check: tests, Result: ERROR
    Running ‘testthat.R’
  Running the tests in ‘tests/testthat.R’ failed.
  Last 13 lines of output:
      6. ├─tidyquery::query("SELECT origin, dest,\n          COUNT(flight) AS num_flts,\n          round(SUM(seats)) AS num_seats,\n          round(AVG(arr_delay)) AS avg_delay\n      FROM flights_at f LEFT OUTER JOIN planes_at p\n        ON f.tailnum = p.tailnum\n      WHERE distance BETWEEN 200 AND 300\n      AND air_time IS NOT NULL\n      GROUP BY origin, dest\n      -- HAVING num_flts > 3000\n      ORDER BY num_seats DESC, avg_delay ASC\n      LIMIT 100;")
      7. │ └─tidyquery:::query_(data, sql, TRUE)
      8. │   └─out %>% verb(filter, !!(tree$where[[1]]))
      9. ├─tidyquery:::verb(., filter, !!(tree$where[[1]]))
     10. │ └─input$data %>% fun(...)
     11. ├─dplyr (local) fun(., ...)
     12. └─arrow:::filter.arrow_dplyr_query(., ...)
     13.   ├─arrow:::try_arrow_dplyr(...)
     14.   │ └─base::evalq(call <- match.call(), parent)
     15.   │   └─base::evalq(call <- match.call(), parent)
     16.   └─base::match.call()
    
    [ FAIL 9 | WARN 0 | SKIP 4 | PASS 226 ]
    Error: Test failures
    Execution halted

@jonkeane
Copy link
Member Author

I've also tried to replicate the CRAN devel image locally (I'll push a PR up with that for CI after this fire drill). After Kou's change, we get past LZ4. But then it seems to fail with the following. This isn't related directly to the LZ4 change I don't think, and as far as I could tell on github that abseil code hasn't changed in years. This could be a false positive, since I don't have an exact replica of the CRAN runner, but wonder if anyone has any ideas about what this might be.

[ 55%] Building CXX object absl/strings/CMakeFiles/str_format_internal.dir/internal/str_format/extension.cc.o

-- stderr output is:
...skipping to end...
230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:44:13: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   44 |   constexpr FormatConversionCharSet FormatConversionCharSetInternal::c;
      |             ^~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.h:167:3: note: in expansion of macro ‘ABSL_INTERNAL_CHAR_SET_CASE’
  167 |   X_VAL(g) X_SEP X_VAL(G) X_SEP X_VAL(a) X_SEP X_VAL(A) X_SEP \
      |   ^~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:44:13: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   44 |   constexpr FormatConversionCharSet FormatConversionCharSetInternal::c;
      |             ^~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.h:167:18: note: in expansion of macro ‘ABSL_INTERNAL_CHAR_SET_CASE’
  167 |   X_VAL(g) X_SEP X_VAL(G) X_SEP X_VAL(a) X_SEP X_VAL(A) X_SEP \
      |                  ^~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:44:13: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   44 |   constexpr FormatConversionCharSet FormatConversionCharSetInternal::c;
      |             ^~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.h:167:33: note: in expansion of macro ‘ABSL_INTERNAL_CHAR_SET_CASE’
  167 |   X_VAL(g) X_SEP X_VAL(G) X_SEP X_VAL(a) X_SEP X_VAL(A) X_SEP \
      |                                 ^~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:44:13: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   44 |   constexpr FormatConversionCharSet FormatConversionCharSetInternal::c;
      |             ^~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.h:167:48: note: in expansion of macro ‘ABSL_INTERNAL_CHAR_SET_CASE’
  167 |   X_VAL(g) X_SEP X_VAL(G) X_SEP X_VAL(a) X_SEP X_VAL(A) X_SEP \
      |                                                ^~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:44:13: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   44 |   constexpr FormatConversionCharSet FormatConversionCharSetInternal::c;
      |             ^~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.h:169:3: note: in expansion of macro ‘ABSL_INTERNAL_CHAR_SET_CASE’
  169 |   X_VAL(n) X_SEP X_VAL(p)
      |   ^~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:44:13: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   44 |   constexpr FormatConversionCharSet FormatConversionCharSetInternal::c;
      |             ^~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.h:169:18: note: in expansion of macro ‘ABSL_INTERNAL_CHAR_SET_CASE’
  169 |   X_VAL(n) X_SEP X_VAL(p)
      |                  ^~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:45:1: note: in expansion of macro ‘ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_’
   45 | ABSL_INTERNAL_CONVERSION_CHARS_EXPAND_(ABSL_INTERNAL_CHAR_SET_CASE, )
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:49:11: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   49 | constexpr FormatConversionCharSet FormatConversionCharSetInternal::kStar;
      |           ^~~~~~~~~~~~~~~~~~~~~~~
      |           FormatConversionCharInternal
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:51:11: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   51 | constexpr FormatConversionCharSet FormatConversionCharSetInternal::kIntegral;
      |           ^~~~~~~~~~~~~~~~~~~~~~~
      |           FormatConversionCharInternal
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:53:11: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   53 | constexpr FormatConversionCharSet FormatConversionCharSetInternal::kFloating;
      |           ^~~~~~~~~~~~~~~~~~~~~~~
      |           FormatConversionCharInternal
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:55:11: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   55 | constexpr FormatConversionCharSet FormatConversionCharSetInternal::kNumeric;
      |           ^~~~~~~~~~~~~~~~~~~~~~~
      |           FormatConversionCharInternal
/tmp/RtmpabvneB/file7a47e230e44/absl_ep-prefix/src/absl_ep/absl/strings/internal/str_format/extension.cc:57:11: error: ‘FormatConversionCharSet’ does not name a type; did you mean ‘FormatConversionCharInternal’?
   57 | constexpr FormatConversionCharSet FormatConversionCharSetInternal::kPointer;
      |           ^~~~~~~~~~~~~~~~~~~~~~~
      |           FormatConversionCharInternal
make[5]: *** [absl/strings/CMakeFiles/str_format_internal.dir/build.make:104: absl/strings/CMakeFiles/str_format_internal.dir/internal/str_format/extension.cc.o] Error 1
make[5]: *** Waiting for unfinished jobs....
make[4]: *** [CMakeFiles/Makefile2:2382: absl/strings/CMakeFiles/str_format_internal.dir/all] Error 2
make[3]: *** [Makefile:136: all] Error 2

@amoeba
Copy link
Member

amoeba commented Jul 31, 2024

Whoops, sorry about that @jonkeane. New revdep check runs are now here: strong, most. Those are being run from the latest maint-17.0.0-r. They seem to be stuck at the moment so hopefully they'll start soon.

@amoeba
Copy link
Member

amoeba commented Jul 31, 2024

#36969 looks similar to your above output, maybe this is the same cause?

@jonkeane
Copy link
Member Author

@assignUser Do you know if we will need to re-generate the Apache-hosted binaries to include #43468 and #43157 and use those in this CRAN release? Or are we ok without those in the binaries since those are both build tools only?

@jonkeane
Copy link
Member Author

Looks like tidyquery is still an issue (I'm surprised we didn't catch this the first time we ran through those!)

I'm seeing a bunch of errors like this one. Which I suspect are coming from

evalq(call <- match.call(), parent)

✖ | 3    1   1 | arrow-dataset                                                                    
──────────────────────────────────────────────────────────────────────────────────────────────────
Error (test-arrow-dataset.R:3:3): Simple SELECT example query #1 returns expected result on Arrow Dataset
Error in `match.call()`: ... used in a situation where it does not exist
Backtrace:
     ▆
  1. ├─testthat::expect_equal(...) at test-arrow-dataset.R:3:3
  2. │ └─testthat::quasi_label(enquo(object), label, arg = "object")
  3. │   └─rlang::eval_bare(expr, quo_get_env(quo))
  4. ├─... %>% as.data.frame()
  5. ├─base::as.data.frame(.)
  6. ├─tidyquery::query("SELECT Species, COUNT(*) AS n FROM iris_ad GROUP BY Species")
  7. │ └─tidyquery:::query_(data, sql, TRUE) at tidyquery-9546a7d191009cc56c2c11c5a54c33d3330a42f3/R/query.R:77:3
  8. │   └─... %>% verb(ungroup) at tidyquery-9546a7d191009cc56c2c11c5a54c33d3330a42f3/R/query.R:275:7
  9. ├─tidyquery:::verb(., ungroup)
 10. │ └─base::paste(...) at tidyquery-9546a7d191009cc56c2c11c5a54c33d3330a42f3/R/query.R:450:3
 11. ├─tidyquery:::verb(...)
 12. │ └─input$data %>% fun(...) at tidyquery-9546a7d191009cc56c2c11c5a54c33d3330a42f3/R/query.R:450:3
 13. ├─dplyr (local) fun(., ...)
 14. └─arrow:::summarise.arrow_dplyr_query(., ...)
 15.   ├─arrow:::try_arrow_dplyr(...)
 16.   │ └─base::evalq(call <- match.call(), parent)
 17.   │   └─base::evalq(call <- match.call(), parent)
 18.   └─base::match.call()

jonkeane pushed a commit that referenced this issue Jul 31, 2024
### Rationale for this change

See #43317 (comment). `tidyquery` is assembling queries in some way such that when `summarize.arrow_dplyr_query()` is called, the calling environment isn't a call, so `match.call()` fails.

### What changes are included in this PR?

This PR wraps the `match.call()` call in a `try()`. The call is only used to do `abandon_ship()` on in-memory data anyway. So if the call is not available, it treats it like you're making a query on a Dataset and it tells you to `collect()` yourself. 

### Are these changes tested?

I couldn't figure out how to reproduce what was going on inside `tidyquery` to write a reproducer, and I don't think this is worth adding `tidyquery` to Suggests for. I confirmed locally that `tidyquery` tests pass with this change, so our revdeps should be clear.

### Are there any user-facing changes?

🙅 

Authored-by: Neal Richardson <[email protected]>
Signed-off-by: Jonathan Keane <[email protected]>
@jonkeane
Copy link
Member Author

Thanks for all of the help, everyone. I just sent the submission, I'll keep this issue posted if there are other issues.

@jonkeane
Copy link
Member Author

jonkeane commented Aug 1, 2024

Womp womp. I received the not-passing-auto-checks email. The problem is that we are calling XXX on <4.2 and CRAN's checks for these apparently are static and not actually checking if we are truly calling them (on R versions that they run on).

I'm partially tempted to wait and see if they end up accepting it anyway after seeing what's going on. I could also reply and explain (though I have extremely low hopes of that landing, even though it is simply a fact that before 4.2 the approved thing does not exist!). @nealrichardson do I remember you having an idea on possibly obfuscating this call for our compatibility branch?

The only NOTE is:

* checking compiled code ... NOTE
File 'arrow/libs/x64/arrow.dll':
  Found non-API calls to R: 'Rf_findVarInFrame3', 'SETLENGTH',
    'SET_GROWABLE_BIT', 'SET_TRUELENGTH'

Compiled code should not call non-API entry points in R.

See 'Writing portable packages' in the 'Writing R Extensions' manual,
and section 'Moving into C API compliance' for issues with the use of
non-API entry points.

https://win-builder.r-project.org/incoming_pretest/arrow_17.0.0_20240801_011430/Windows/00check.log
https://win-builder.r-project.org/incoming_pretest/arrow_17.0.0_20240801_011430/Debian/00check.log

@nealrichardson
Copy link
Member

You could wait to see if Uwe waves it through, or you could reply and explain that these are ifdef'd for backwards compatibility only.

The in-case-of-emergency hack would be to sed out these references from the .cpp files in the configure script if R is > 4.4 (the checks are only on devel now, right?). But it seems like this is a more general problem CRAN will have to solve, if they're adding new "official" API functions to replace some of these things but they're only in R-devel, how can you be on CRAN if you depend on them?

What's surprising to me is that, at least by the internal function names check is calling, they're looking at the compiled object files, which I would expect not to have these symbols in them because of the #ifdef: https://github.com/wch/r-source/blob/032b8240be6c8cf5e8e85ecde6f3f23e46739716/src/library/tools/R/sotools.R#L721-L768 And that function is wrapping nm.

In fact, I do see these symbols showing up locally, on new enough R, for example:

npr@nealrichardson-3P29 r % nm src/arrow.so | grep Rf_findVarInFrame3
                 U _Rf_findVarInFrame3

@nealrichardson
Copy link
Member

Of course CRAN is closed starting tomorrow (and it's already tomorrow CEST), so this is probably not something we can solve until they're back.

@jonkeane
Copy link
Member Author

jonkeane commented Aug 1, 2024

Hmmmm, yeah something strange is going on here. It's possible that the #IFDEF isn't the right thing for detecting functions (https://stackoverflow.com/questions/6234755/does-ifdef-or-other-preprocessor-directives-work-for-function-declarations-t) and instead we should do something like #if (R_VERSION >= R_Version(4, 2, 0)) buuuuut I actually still see _Rf_findvarInFrame3 even if I remove any calls (conditional or otherwise) from arrow_cpp11.h (which is also the only place that calls it in our code) 🫠 I do see it in cpp11 https://github.com/r-lib/cpp11/blob/1c9dbb65cc9279a404f035f2fca990b338c9f602/inst/include/cpp11/environment.hpp#L37-L38 so maybe that's where it's coming from?

@jonkeane
Copy link
Member Author

jonkeane commented Aug 1, 2024

Ah yes, I think it is, indeed, coming from cpp11's environment.hpp. I just tried removing all of our imports / references to that and only then is nm -v src/arrow.so | grep Rf_findVarInFrame3 clean.

We still need #43504 but will still have an issue on that for automatic checks until cpp11 fixes + releases their fix.

@nealrichardson
Copy link
Member

We may need to vendor (and possibly patch) cpp11 for this release then.

@nealrichardson
Copy link
Member

nealrichardson commented Aug 14, 2024

Need to cherry-pick #43649 (ab432b1) into the CRAN release

@jonkeane
Copy link
Member Author

Ok, I've cherry-picked the latest fix onto this branch (#43735). I'll resubmit later this evening — in case anyone has any objections + suggestions for other things.

@jonkeane
Copy link
Member Author

I have resubmitted 17.0.0.1

@jonkeane
Copy link
Member Author

Aaaand we have been accepted 🎉

Do we already have a draft social media post?

@amoeba
Copy link
Member

amoeba commented Aug 22, 2024

I haven't seen a draft yet. @thisisnic do you want to take this on again?

@thisisnic
Copy link
Member

I was taking this release "off" to focus on my other commitments, though would be happy to review it if someone else wanted to write something up?

@amoeba
Copy link
Member

amoeba commented Aug 22, 2024

I can get it started later today, I'll drop a link in here for reviews when that's done.

@amoeba
Copy link
Member

amoeba commented Aug 23, 2024

Draft social media content is here. I still need to generate some code images for the examples and tweak a bit of the language but please have a look to see if I missed any tweet-worthy highlights. I see binaries are almost all up on CRAN which is great.

@thisisnic
Copy link
Member

Language here looks great, thanks @amoeba!

@jonkeane
Copy link
Member Author

jonkeane commented Aug 23, 2024

Thanks, this is really great! I'll also post it on linked in (though not as a series). I also took the liberty to add emoji in the places to make it engaging 😂

@assignUser
Copy link
Member

assignUser commented Aug 23, 2024

We are missing a checkbox to move the r-univerese tag, I'll move the tag forward in a bit
e: done

@amoeba
Copy link
Member

amoeba commented Aug 26, 2024

Thanks @jonkeane and @thisisnic for taking a look at the social media doc. I accepted suggestions and added code screenshots to the Drive folder where appropriate, see https://drive.google.com/drive/folders/1DzGoZNHbiZXNca2oHWjpBtJrEti3W-zQ?usp=drive_link, so I think this is ready for me and @jonkeane to post when x86_64 binaries go live.

@amoeba
Copy link
Member

amoeba commented Sep 12, 2024

It looks like the macOS x86_64 binary finally got built, see https://cran.r-project.org/web/packages/arrow/index.html. The binary for oldrel still isn't built but I don't think that should prevent any announcement about the release. Thoughts?

@thisisnic
Copy link
Member

Let's do it! Thanks @amoeba!

@jonkeane
Copy link
Member Author

Agreed, let's call it. The oldrelease is one that I've asked about before on mailing lists and they sometimes are super slow, but I've seen periods where they don't build for any packages for weeks and eventually sort themselves.

I can RTFS, but does anyone know how / if we can update the docs website to show the 17.0.0.1 number?

@amoeba
Copy link
Member

amoeba commented Sep 17, 2024

I scheduled tweets/toots to go out tomorrow morning.

Re: The website showing version 17.0.0... I think we might want to find a better workflow but for now I created apache/arrow-site#542 to do the update. My steps are described there.

Edit: Threads up at https://x.com/brycem/status/1836040825752490139 and https://elk.zone/toot.cafe/@amoeba/113153321269054346

@jonkeane
Copy link
Member Author

I've also made an article on linkedin not sure if that's the right level of linked in posting, but I'll keep an eye on if it gets engagement.

Thanks for the help, everyone! I'm going to close this out.

@amoeba
Copy link
Member

amoeba commented Sep 23, 2024

Thanks @jonkeane. Do we need to update the r-universe-release tag still and add a r-17.0.01 tag?

@jonkeane
Copy link
Member Author

Yeah, if we don't have one already we should. Sorry I didn't look at that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants