Merge pull request #419 from r-world-devs/devel
Release 2.0.1
maciekbanas authored May 14, 2024
2 parents 0cbbe43 + 713d510 commit cb1dbf8
Showing 35 changed files with 547 additions and 369 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
 Package: GitStats
 Title: Get Statistics from GitHub and GitLab
-Version: 2.0.0
+Version: 2.0.1
 Authors@R: c(
     person(given = "Maciej", family = "Banas", email = "[email protected]", role = c("aut", "cre")),
     person(given = "Kamil", family = "Koziej", email = "[email protected]", role = "aut"),
16 changes: 16 additions & 0 deletions NEWS.md
@@ -1,3 +1,19 @@
# GitStats 2.0.1

This is a patch release that addresses several pressing issues: it brings the `set_*_host()` functions under `verbose` control, refines the `verbose` feature in general, fixes pulling data for GitLab subgroups, and speeds up the `get_files()` function.

## Features:

- Getting files has been sped up when `GitStats` is set to scan whole hosts, by switching to the `Search API` instead of pulling files via `GraphQL` (which required iterating over organizations and repositories) ([#411](https://github.com/r-world-devs/GitStats/issues/411)).
- When hosts are set to be scanned in whole (without specifying `orgs` or `repos`), GitStats no longer pulls all organizations up front. Pulling all organizations from a host is triggered only when the user decides to pull repositories from organizations. If the user decides, e.g., to pull repositories by code, there is no need to pull all organizations (which may be a time-consuming process), as GitStats then uses the `Search API` ([#393](https://github.com/r-world-devs/GitStats/issues/393)).
- It is now also possible to mute messages from the `set_*_host()` functions with `verbose_off()` or the `verbose` parameter ([#413](https://github.com/r-world-devs/GitStats/issues/413)).
- Setting `verbose` to `FALSE` no longer hides the output of the `get_*()` functions, i.e. a glimpse of the table will always appear after pulling data, even if `verbose` is switched off. The `verbose` parameter now serves only to show or hide messages to the user ([#423](https://github.com/r-world-devs/GitStats/issues/423)).
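
The split between messages and output described in the last item can be illustrated with a minimal base-R sketch. `pull_data()` is a hypothetical stand-in and not a GitStats function; it only shows how a `verbose` flag can gate messages while the table preview is always printed.

```r
# Hypothetical stand-in for a get_*() function; not part of GitStats.
pull_data <- function(verbose = TRUE) {
  # Messages are emitted only when verbose is TRUE.
  if (verbose) message("Pulling data...")
  result <- data.frame(
    repo_name = c("repo_a", "repo_b"),
    stars = c(3L, 5L)
  )
  # The preview of the table is printed regardless of verbose.
  print(utils::head(result))
  invisible(result)
}

# Prints the table preview but emits no message.
quiet <- pull_data(verbose = FALSE)
```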

## Fixes:

- Pulling repositories from GitLab subgroups was fixed. It did not work because the URL of a group (org) was passed to the GraphQL API the same way as to the REST API, i.e. URL-encoded (with "%2F" instead of "/").
- GitStats now returns a proper error when you pass a wrong host URL to a `set_*_host()` function ([#415](https://github.com/r-world-devs/GitStats/issues/415)).
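
The encoding mismatch behind the subgroup fix can be reproduced with base R alone. The group path `mygroup/mysubgroup` below is a made-up example; the sketch only shows the difference between the URL-encoded form and the plain path, not how GitStats builds its requests.

```r
# A made-up GitLab subgroup path, URL-encoded as it would appear in a
# REST endpoint ("/" becomes "%2F").
encoded_org <- utils::URLencode("mygroup/mysubgroup", reserved = TRUE)

# Decoding restores the plain path with "/" before it is used as a
# GraphQL variable.
decoded_org <- utils::URLdecode(encoded_org)

print(encoded_org)
print(decoded_org)
```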

# GitStats 2.0.0

This is a major release with general changes in workflow (simplifying it), changes in setting `GitStats` hosts, deprecation of some not very useful features (like plots, setting parameters separately) and new `get_release_logs()` function.
19 changes: 6 additions & 13 deletions R/EngineGraphQLGitHub.R
@@ -84,19 +84,12 @@ EngineGraphQLGitHub <- R6::R6Class("EngineGraphQLGitHub",
     },

     # Pull all given files from all repositories of an organization.
-    pull_files_from_org = function(org, file_path, pulled_repos = NULL) {
-      if (is.null(pulled_repos)) {
-        repos_list <- self$pull_repos_from_org(
-          org = org
-        )
-        repositories <- purrr::map(repos_list, ~ .$repo_name)
-        def_branches <- purrr::map(repos_list, ~ .$default_branch$name)
-      } else {
-        repos_table <- pulled_repos %>%
-          dplyr::filter(organization == org)
-        repositories <- repos_table$repo_name
-        def_branches <- repos_table$default_branch
-      }
+    pull_files_from_org = function(org, file_path) {
+      repos_list <- self$pull_repos_from_org(
+        org = org
+      )
+      repositories <- purrr::map(repos_list, ~ .$repo_name)
+      def_branches <- purrr::map(repos_list, ~ .$default_branch$name)
       files_list <- purrr::map(file_path, function(file_path) {
         files_list <- purrr::map2(repositories, def_branches, function(repository, def_branch) {
           files_query <- self$gql_query$files_by_repo()
93 changes: 33 additions & 60 deletions R/EngineGraphQLGitLab.R
@@ -72,52 +72,43 @@ EngineGraphQLGitLab <- R6::R6Class("EngineGraphQLGitLab",
     },

     # Pull all given files from all repositories of a group.
-    pull_files_from_org = function(org, file_path, pulled_repos = NULL) {
+    pull_files_from_org = function(org, file_path) {
       org <- URLdecode(org)
-      if (!is.null(pulled_repos)) {
-        repos_table <- pulled_repos %>%
-          dplyr::filter(organization == org)
-        full_files_list <- private$pull_file_from_repos(
-          file_path = file_path,
-          repos_table = repos_table
+      full_files_list <- list()
+      next_page <- TRUE
+      end_cursor <- ""
+      while (next_page) {
+        files_query <- self$gql_query$files_by_org(
+          end_cursor = end_cursor
         )
-      } else {
-        full_files_list <- list()
-        next_page <- TRUE
-        end_cursor <- ""
-        while (next_page) {
-          files_query <- self$gql_query$files_by_org(
-            end_cursor = end_cursor
-          )
-          files_response <- self$gql_response(
-            gql_query = files_query,
-            vars = list(
-              "org" = org,
-              "file_paths" = file_path
-            )
+        files_response <- self$gql_response(
+          gql_query = files_query,
+          vars = list(
+            "org" = org,
+            "file_paths" = file_path
           )
-          if (length(files_response$data$group) == 0) {
-            cli::cli_alert_danger("Empty")
-          }
-          projects <- files_response$data$group$projects
-          files_list <- purrr::map(projects$edges, function(edge) {
-            edge$node
-          }) %>%
-            purrr::discard(~ length(.$repository$blobs$nodes) == 0)
-          if (is.null(files_list)) files_list <- list()
-          if (length(files_list) > 0) {
-            next_page <- files_response$pageInfo$hasNextPage
-          } else {
-            next_page <- FALSE
-          }
-          if (is.null(next_page)) next_page <- FALSE
-          if (next_page) {
-            end_cursor <- files_response$pageInfo$endCursor
-          } else {
-            end_cursor <- ""
-          }
-          full_files_list <- append(full_files_list, files_list)
+        )
+        if (length(files_response$data$group) == 0) {
+          cli::cli_alert_danger("Empty")
+        }
+        projects <- files_response$data$group$projects
+        files_list <- purrr::map(projects$edges, function(edge) {
+          edge$node
+        }) %>%
+          purrr::discard(~ length(.$repository$blobs$nodes) == 0)
+        if (is.null(files_list)) files_list <- list()
+        if (length(files_list) > 0) {
+          next_page <- files_response$pageInfo$hasNextPage
+        } else {
+          next_page <- FALSE
+        }
+        if (is.null(next_page)) next_page <- FALSE
+        if (next_page) {
+          end_cursor <- files_response$pageInfo$endCursor
+        } else {
+          end_cursor <- ""
         }
+        full_files_list <- append(full_files_list, files_list)
       }
       return(full_files_list)
     },
@@ -153,24 +144,6 @@ EngineGraphQLGitLab <- R6::R6Class("EngineGraphQLGitLab",
         vars = list("org" = org)
       )
       return(response)
-    },
-
-    #Pull all given files from given repositories.
-    pull_file_from_repos = function(file_path, repos_table) {
-      files_list <- purrr::map(repos_table$repo_url, function(repo_url) {
-        files_query <- self$gql_query$files_from_repo()
-        files_response <- self$gql_response(
-          gql_query = files_query,
-          vars = list(
-            "file_paths" = file_path,
-            "project_path" = stringr::str_replace(repo_url, ".*(?<=.com/)", "")
-          )
-        )
-        return(files_response)
-      }) %>%
-        purrr::discard(~ length(.$data$project$repository$blobs$nodes) == 0) %>%
-        purrr::map(~ .$data$project)
-      return(files_list)
     }
   )
 )
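
The `while (next_page)` loop in `pull_files_from_org` follows the usual cursor-based pagination pattern. A minimal base-R sketch of that pattern is given below; `fake_pages`, `fetch_page()` and `pull_all_pages()` are hypothetical names that simulate a paged API in memory, and the field names `has_next` and `end_cursor` are assumptions that do not mirror the GitLab response structure.

```r
# Hypothetical in-memory "API": each page is looked up by its cursor.
fake_pages <- list(
  start = list(nodes = list("README.md", "DESCRIPTION"),
               has_next = TRUE, end_cursor = "c1"),
  c1 = list(nodes = list("NEWS.md"),
            has_next = TRUE, end_cursor = "c2"),
  c2 = list(nodes = list(),
            has_next = FALSE, end_cursor = NULL)
)
fetch_page <- function(cursor) fake_pages[[cursor]]

# Keep requesting pages until the response reports no further page.
pull_all_pages <- function(fetch, first_cursor = "start") {
  results <- list()
  cursor <- first_cursor
  has_next <- TRUE
  while (has_next) {
    page <- fetch(cursor)
    results <- append(results, page$nodes)
    # isTRUE() treats a missing (NULL) flag as "no next page".
    has_next <- isTRUE(page$has_next)
    if (has_next) cursor <- page$end_cursor
  }
  results
}

all_files <- pull_all_pages(fetch_page)
```

Treating a missing flag as `FALSE` plays the same role as the `if (is.null(next_page)) next_page <- FALSE` guard in the diff: the loop ends rather than failing on a `NULL` condition.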
6 changes: 3 additions & 3 deletions R/EngineRest.R
@@ -51,13 +51,13 @@ EngineRest <- R6::R6Class("EngineRest",
     },

     # Filtering handler if files are set for scanning scope
-    limit_search_to_files = function(repos_list, files) {
+    limit_search_to_files = function(search_result, files) {
       if (!is.null(files)) {
-        repos_list <- purrr::keep(repos_list, function(repository) {
+        search_result <- purrr::keep(search_result, function(repository) {
           any(repository$path %in% files)
         })
       }
-      return(repos_list)
+      return(search_result)
     },

     # Helper
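
The filtering done by `limit_search_to_files` can be tried out on toy data. The sketch below is a base-R equivalent that uses `Filter()` in place of `purrr::keep()`; the function name `limit_to_files()` and the sample search items are made up for illustration.

```r
# Made-up search items; real search results carry many more fields.
search_result <- list(
  list(path = "DESCRIPTION", repository = "repo_a"),
  list(path = "R/utils.R", repository = "repo_b"),
  list(path = "NEWS.md", repository = "repo_c")
)

# Keep only items whose path is among the requested files.
# With files = NULL the search result is returned unchanged.
limit_to_files <- function(search_result, files) {
  if (!is.null(files)) {
    search_result <- Filter(
      function(item) any(item$path %in% files),
      search_result
    )
  }
  search_result
}

kept <- limit_to_files(search_result, files = c("DESCRIPTION", "NEWS.md"))
kept_paths <- vapply(kept, function(item) item$path, character(1))
```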
49 changes: 36 additions & 13 deletions R/EngineRestGitHub.R
@@ -4,15 +4,30 @@ EngineRestGitHub <- R6::R6Class("EngineRestGitHub",
   inherit = EngineRest,
   public = list(

+    # Pull repositories with files
+    pull_files = function(files) {
+      files_list <- list()
+      for (filename in files) {
+        search_file_endpoint <- paste0(private$endpoints[["search"]], "filename:", filename)
+        total_n <- self$response(search_file_endpoint)[["total_count"]]
+        if (length(total_n) > 0) {
+          search_result <- private$search_response(
+            search_endpoint = search_file_endpoint,
+            total_n = total_n
+          ) %>%
+            purrr::keep(~ .$path == filename)
+          files_content <- private$get_files_content(search_result, filename)
+          files_list <- append(files_list, files_content)
+        }
+      }
+      return(files_list)
+    },
+
     # Pulling repositories where code appears
-    # @param byte_max According to GitHub documentation only files smaller than
-    # 384 KB are searchable. See
-    # \link{https://docs.github.com/en/rest/search?apiVersion=2022-11-28#search-code}
     pull_repos_by_code = function(org = NULL,
                                   code,
                                   verbose,
-                                  settings,
-                                  byte_max = "384000") {
+                                  settings) {
       private$set_verbose(verbose)
       user_query <- if (!is.null(org)) {
         paste0('+user:', org)
@@ -24,17 +39,16 @@ EngineRestGitHub <- R6::R6Class("EngineRestGitHub",
       total_n <- self$response(search_endpoint)[["total_count"]]
       if (verbose) cli::cli_alert_info("Searching for code [{code}]...")
       if (length(total_n) > 0) {
-        repos_list <- private$search_response(
+        search_result <- private$search_response(
           search_endpoint = search_endpoint,
-          total_n = total_n,
-          byte_max = byte_max
+          total_n = total_n
         )
-        repos_list <- private$limit_search_to_files(
-          repos_list = repos_list,
+        search_result <- private$limit_search_to_files(
+          search_result = search_result,
           files = settings$files
         )
         repos_list <- private$map_search_into_repos(
-          search_response = repos_list
+          search_response = search_result
         )
       } else {
         repos_list <- list()
@@ -123,10 +137,12 @@ EngineRestGitHub <- R6::R6Class("EngineRestGitHub",
     # A wrapper for proper pagination of GitHub search REST API
     # @param search_endpoint A character, a search endpoint
     # @param total_n Number of results
-    # @param byte_max Max byte size
+    # @param byte_max According to GitHub documentation only files smaller than
+    # 384 KB are searchable. See
+    # \link{https://docs.github.com/en/rest/search?apiVersion=2022-11-28#search-code}
     search_response = function(search_endpoint,
                                total_n,
-                               byte_max) {
+                               byte_max = "384000") {
       if (total_n >= 0 & total_n < 1e3) {
         resp_list <- list()
         for (page in 1:(total_n %/% 100)) {
@@ -199,6 +215,13 @@ EngineRestGitHub <- R6::R6Class("EngineRestGitHub",
         FALSE
       })
       repos_list
+    },
+
+    # Get files content
+    get_files_content = function(search_result, filename) {
+      purrr::map(search_result, ~ self$response(.$url),
+                 .progress = glue::glue("Adding file [{filename}] info...")) %>%
+        unique()
     }
   )
 )
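
The `search_response` wrapper paginates the GitHub Search API, which serves at most 100 items per page and at most 1,000 results per query. As a rough illustration of the page arithmetic only, the sketch below computes which page numbers a single query can request; `search_pages()` is a hypothetical helper and is not how the package's `search_response()` is implemented.

```r
# Hypothetical helper: page numbers needed to read the results of one
# search query, given the per-page and per-query limits of the API.
search_pages <- function(total_n, per_page = 100L, max_results = 1000L) {
  # Results beyond max_results cannot be reached with a single query.
  reachable <- min(total_n, max_results)
  seq_len(ceiling(reachable / per_page))
}

pages_small <- search_pages(250)   # three pages: 100 + 100 + 50 results
pages_none <- search_pages(0)      # no pages when there are no results
pages_capped <- search_pages(5000) # capped at ten pages (1,000 results)
```

Using `seq_len()` rather than `1:n` keeps the zero-result case safe, since `1:0` would yield two page numbers instead of none.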